I need to run a regression on panel data that has three dimensions (Year * Company * Country). For example:
============================================
year | co
I think you can also do:
df <- transform(df, ID = as.numeric(interaction(comp, count, drop = TRUE)))
And then estimate
result <- plm(value.y ~ value.x, data = df, index = c("ID", "year"))
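As a rough sketch of how the combined ID behaves, the snippet below builds a small made-up panel (the data and the sizes are assumptions, not the asker's data), creates one ID per company-country pair, and checks that ID and year together identify each row. The plm call only runs if the plm package is installed.

```r
# Hypothetical toy panel: 2 companies x 2 countries x 3 years
set.seed(1)
df <- expand.grid(year = 2001:2003,
                  comp = c("A", "B"),
                  count = c("DE", "US"),
                  stringsAsFactors = FALSE)
df$value.x <- rnorm(nrow(df))
df$value.y <- 2 * df$value.x + rnorm(nrow(df))

# One numeric ID per observed company-country combination
df <- transform(df, ID = as.numeric(interaction(comp, count, drop = TRUE)))

# Each (ID, year) pair should now identify exactly one row
stopifnot(!any(duplicated(df[c("ID", "year")])))

# Fixed-effects (within) regression on the two-dimensional index
if (requireNamespace("plm", quietly = TRUE)) {
  result <- plm::plm(value.y ~ value.x, data = df,
                     index = c("ID", "year"), model = "within")
  print(summary(result))
}
```

Note that this treats every company-country pair as its own panel unit, so the fixed effect is at the pair level rather than separately for company and country.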
This question is much like these:
If you would rather not create a new dummy, you can use the group_indices function from the dplyr package. Although it does not support mutate, the following approach is straightforward:
fakedata$id <- fakedata %>% group_indices(comp, count)
The id variable will be your first panel dimension, so you need to set the plm index argument to index = c("id", "year").
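In recent dplyr versions (1.0.0 and later, as far as I know), calling group_indices with grouping columns is deprecated, and cur_group_id() can be used inside mutate instead. A minimal sketch on made-up data (the contents of fakedata here are an assumption):

```r
library(dplyr)

# Hypothetical toy panel: 2 companies x 2 countries x 2 years
fakedata <- expand.grid(year = 2001:2002,
                        comp = c("A", "B"),
                        count = c("DE", "US"),
                        stringsAsFactors = FALSE)

# Assign one integer id per company-country group
fakedata <- fakedata %>%
  group_by(comp, count) %>%
  mutate(id = cur_group_id()) %>%
  ungroup()

# id and year together should identify each row
stopifnot(!any(duplicated(fakedata[c("id", "year")])))
```

The resulting id column can then be passed to plm through index = c("id", "year") as described above.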
For alternatives you can take a look at this question: R create ID within a group.
I think you want to use lm() instead of plm(). This blog post discusses what you're after:
https://www.r-bloggers.com/r-tutorial-series-multiple-linear-regression/
For your example, I'd imagine it would look something like the following:
lm(formula = comp ~ count + year, data = dataname)
If you want to control for another dimension in a within model, simply add a dummy for it:
plm(value.y ~ value.x + count, data = dataname, index = c("comp","year"))
Alternatively (especially for high-dimensional data), look at the lfe package, which can 'absorb' the additional dimension so that the summary output is not cluttered by the dummy variables.
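A hedged sketch of what that could look like with felm from lfe, on made-up data (the toy panel and the true slope of 1.5 are assumptions for illustration). The lm call enters company and country as explicit dummies for comparison; the felm call lists them after the | so they are absorbed instead of reported. It only runs if lfe is installed.

```r
# Hypothetical toy panel: 3 companies x 3 countries x 4 years
set.seed(42)
dataname <- expand.grid(year = 2001:2004,
                        comp = c("A", "B", "C"),
                        count = c("DE", "FR", "US"),
                        stringsAsFactors = TRUE)  # comp and count become factors
dataname$value.x <- rnorm(nrow(dataname))
dataname$value.y <- 1.5 * dataname$value.x + rnorm(nrow(dataname))

# Base-R reference: company and country entered as explicit dummies
fit_lm <- lm(value.y ~ value.x + comp + count, data = dataname)
b_lm <- unname(coef(fit_lm)["value.x"])
print(b_lm)

# Same specification with both sets of fixed effects absorbed
if (requireNamespace("lfe", quietly = TRUE)) {
  fit_felm <- lfe::felm(value.y ~ value.x | comp + count, data = dataname)
  print(summary(fit_felm))  # only value.x appears in the coefficient table
}
```

The slope on value.x should match between the two fits; the difference is only in how many dummy coefficients are printed and, for large panels, how much memory the estimation needs.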