Question
Colleagues, I have panel data:
Company year Beta NI Sales Export Hedge FL QR AT Foreign
1 1 2010 -2.2052800 293000 1881000 78.6816 0 23.5158 1.289 0.6554 3000
2 1 2011 -2.2536069 316000 2647000 81.4885 0 21.7945 1.1787 0.8282 22000
3 1 2012 0.3258693 363000 2987000 82.4908 0 24.5782 1.2428 0.813 -11000
4 1 2013 0.4006030 549000 4546000 79.4325 0 31.4168 0.6038 0.7905 71000
5 1 2014 -0.4508811 348000 5376000 79.2411 0 37.1451 0.6563 0.661 -64000
6 1 2015 0.1494696 355000 5038000 77.1735 0 33.3852 0.9798 0.5483 37000
But R reports an error when I try to run the regression with the plm package:
panel <- read.csv("Panel.csv", header=T, sep=";")
p=plm(data=panel,Beta~NI, model="within",index=c("id","year"))
Error in pdim.default(index[[1]], index[[2]]) :
duplicate couples (id-time)
In addition: Warning messages:
1: In pdata.frame(data, index) :
duplicate couples (id-time) in resulting pdata.frame
to find out which, use e.g. table(index(your_pdataframe), useNA = "ifany")
2: In is.pbalanced.default(index[[1]], index[[2]]) :
duplicate couples (id-time)
3: In is.pbalanced.default(index[[1]], index[[2]]) :
duplicate couples (id-time)
I searched for this error online and read that it is connected to the company id and the year, but I did not find a way to avoid the problem. Also, when I run na.omit(panel), R does not show the error, but it is important to keep the NA values and those companies in the data. Please tell me what to do about this problem. Thank you.
Answer 1:
Let us consider the Produc dataset in the plm package.
data("Produc", package = "plm")
head(Produc)
state year region pcap hwy water util pc gsp emp unemp
1 ALABAMA 1970 6 15032.67 7325.80 1655.68 6051.20 35793.80 28418 1010.5 4.7
2 ALABAMA 1971 6 15501.94 7525.94 1721.02 6254.98 37299.91 29375 1021.9 5.2
3 ALABAMA 1972 6 15972.41 7765.42 1764.75 6442.23 38670.30 31303 1072.3 4.7
4 ALABAMA 1973 6 16406.26 7907.66 1742.41 6756.19 40084.01 33430 1135.5 3.9
5 ALABAMA 1974 6 16762.67 8025.52 1734.85 7002.29 42057.31 33749 1169.8 5.5
6 ALABAMA 1975 6 17316.26 8158.23 1752.27 7405.76 43971.71 33604 1155.4 7.7
In this dataset, information is collected over time (17 years) on the same sample units (48 US states).
table(Produc$state, Produc$year)
1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986
ALABAMA 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
ARIZONA 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
ARKANSAS 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
CALIFORNIA 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
...
plm requires that each (state, year) pair be unique.
any(table(Produc$state, Produc$year)!=1)
[1] FALSE
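The check above only says whether any pair is repeated. To see which pairs and which rows are involved, a minimal base-R sketch follows. The toy data frame and its column names (id, year, y) are made up for illustration and are not taken from the question's file.

```r
# Toy panel (hypothetical data): firm 1 appears twice in 2011.
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2),
  year = c(2010, 2011, 2011, 2010, 2011),
  y    = c(0.5, 0.7, 0.9, 1.1, 1.3)
)

# Count rows per (id, year) pair and keep the pairs that occur more than once.
counts <- as.data.frame(table(id = toy$id, year = toy$year))
dup_pairs <- counts[counts$Freq > 1, ]
print(dup_pairs)

# Flag every row that belongs to a repeated pair (first and later copies).
key <- toy[, c("id", "year")]
is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
print(toy[is_dup, ])
```

Looking at the flagged rows side by side usually shows whether they are exact copies, data-entry errors in the id or year, or genuinely separate observations that need a finer index.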
The plm command works nicely with this dataset:
plmFit1 <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
data = Produc, index = c("state","year"))
summary(plmFit1)
Oneway (individual) effect Within Model
Call:
plm(formula = log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
data = Produc, index = c("state", "year"))
Balanced Panel: n=48, T=17, N=816
Residuals :
Min. 1st Qu. Median 3rd Qu. Max.
-0.12000 -0.02370 -0.00204 0.01810 0.17500
Coefficients :
Estimate Std. Error t-value Pr(>|t|)
log(pcap) -0.02614965 0.02900158 -0.9017 0.3675
log(pc) 0.29200693 0.02511967 11.6246 < 2.2e-16 ***
log(emp) 0.76815947 0.03009174 25.5273 < 2.2e-16 ***
unemp -0.00529774 0.00098873 -5.3582 1.114e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Total Sum of Squares: 18.941
Residual Sum of Squares: 1.1112
R-Squared: 0.94134
Adj. R-Squared: 0.93742
F-statistic: 3064.81 on 4 and 764 DF, p-value: < 2.22e-16
Now we duplicate one of the (state, year) pairs by changing the year in row 2 from 1971 to 1970, so that (ALABAMA, 1970) appears twice:
Produc[2,2] <- 1970
any(table(Produc$state, Produc$year)>1)
[1] TRUE
and plm now generates the same error message that you described above:
zz <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
data = Produc, index = c("state","year"))
Error in pdim.default(index[[1]], index[[2]]) :
duplicate couples (id-time)
In addition: Warning messages:
1: In pdata.frame(data, index) :
duplicate couples (id-time) in resulting pdata.frame
to find out which, use e.g. table(index(your_pdataframe), useNA = "ifany")
2: In is.pbalanced.default(index[[1]], index[[2]]) :
duplicate couples (id-time)
3: In is.pbalanced.default(index[[1]], index[[2]]) :
duplicate couples (id-time)
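Since na.omit(panel) makes the error disappear in your case, one possible cause (not confirmed by the question) is that some rows have a missing id or year, for example blank lines at the end of the CSV file, and those rows collide with each other. The sketch below assumes that situation and removes only the rows with an incomplete index, leaving NA values in the other variables untouched. The data frame is hypothetical and the column names id and year are taken from the index argument in your call.

```r
# Hypothetical panel: the last two rows have no id/year (e.g. blank CSV lines).
panel <- data.frame(
  id   = c(1, 1, 2, 2, NA, NA),
  year = c(2010, 2011, 2010, 2011, NA, NA),
  Beta = c(-2.2, 0.3, NA, 0.4, NA, NA),
  NI   = c(293, 316, 363, NA, NA, NA)
)

# Rows whose index is incomplete cannot be placed in the panel.
bad_index <- is.na(panel$id) | is.na(panel$year)

# Drop only those rows; NAs in Beta or NI are left alone.
panel_clean <- panel[!bad_index, ]

# Confirm every remaining (id, year) pair is unique.
stopifnot(!any(duplicated(panel_clean[, c("id", "year")])))
print(panel_clean)
```

If the duplicates turn out to have complete ids and years instead, this filter will not help, and the repeated rows need to be inspected and corrected or merged by hand.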
Hope this helps.
Source: https://stackoverflow.com/questions/43663594/error-in-plm-regression