Error in `row.names<-.data.frame` using mlogit in R

Submitted by 橙三吉。 on 2020-01-15 19:15:57

Question


Here are the steps I'm following to fit a multinomial logit regression.

> z<-read.table("2008 Racedata.txt", header=TRUE, sep="\t", row.names=NULL)

> head(z)

     datekey raceno horseno place winner draw winodds log_odds jwt  hwt
1 2008091501      1       8     1      1    2    12.0 2.484907 128 1170
2 2008091501      1      11     2      0    3     8.6 2.151762 123 1135
3 2008091501      1       6     3      0    5     7.0 1.945910 127 1114
4 2008091501      1      12     4      0   10    23.0 3.135494 123 1018
5 2008091501      1      14     5      0    4    11.0 2.397895 113 1027
6 2008091501      1       5     6      0   14    50.0 3.912023 131  972

> x<-mlogit.data(z,choice="winner",shape="long",id.var="datekey",alt.var="horseno")

Error in `row.names<-.data.frame`(`*tmp*`, value = c("1.8", "1.11", "1.6",  : 
  duplicate 'row.names' are not allowed

In addition: Warning message:
non-unique values when setting 'row.names': ‘10.2’, ‘10.4’, ‘10.8’,
‘100.7’, ‘101.12’, ‘102.1’, ‘102.3’, ‘103.2’, ‘103.4’, 
‘103.6’, ‘104.12’, ‘104.3’, ‘104.9’, ‘105.1’, ‘105.5’, 
‘105.6’, ‘105.8’, ‘106.11’, ‘106.12’, ‘106.13’, ‘106.7’, 
‘107.10’, ‘107.14’, ‘107.3’, ‘108.12’, ‘108.2’, ‘108.6’, 
‘108.9’, ‘109.1’, ‘109.14’, ‘109.7’, ‘11.12’, ‘11.5’, 
‘11.9’, ‘110.2’, ‘110.3’, ‘110.4’, ‘110.9’, ‘111.1’, 
‘111.7’, ‘112.12’, ‘112.3’, ‘112.6’, ‘112.8’, ‘113.10’, 
‘113.13’, ‘113.7’, ‘114.12’, ‘114.2’, ‘114.9’, ‘115.10’, 
‘115.13’, ‘115.5’, ‘116.11’, ‘116.6’, ‘117.14’, ‘117.3’, 
‘117.7’, ‘118.1’, ‘118.13’, ‘118.2’, ‘118.9’, ‘119.10’, 
‘119.5’, ‘119.6’, ‘119.8’, ‘12.1’, ‘12.10’, ‘12.3’, 
‚Äò12.6‚Äô, ‚Äò120.2‚Äô, ‚Äò120.4‚Äô, ‚Äò120.7‚ [... truncated] 
> 

What step am I missing here? Why the duplicates in row.names?

Thanks, Walt


Answer 1:


Two problems.

You seem to have an encoding problem, since we are seeing lots of umlauts and accent marks in that error message. Furthermore, I am wondering whether that datekey column got converted to a factor.
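One way to check the factor question is sketched below on a toy column. The data frame `toy` and its values are made up for illustration and stand in for `z`; only base R is used.

```r
# Toy column standing in for z$datekey; built as a factor on purpose.
toy <- data.frame(datekey = factor(c("2008091501", "2008091501", "2008091502")))

# TRUE here by construction; on real data this is the thing to inspect.
was_factor <- is.factor(toy$datekey)

# as.character() keeps the labels rather than the internal integer codes.
toy$datekey <- as.character(toy$datekey)

print(was_factor)
print(class(toy$datekey))
```

On the real data, `str(z)` reports the class of every column at once.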

In this case it is referring to an error in constructing the row.names attribute of the new object, x. If you do:

 with( z, table( datekey, horseno) )

... you may see a horse with multiple entries on the same day.
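The same check, and the base R error it is meant to explain, can be tried on a toy data frame. Everything here is illustrative: the values and the names `toy`, `counts`, `dups`, `lab` and `res` are made up, and only base R is used, not mlogit. The label scheme (an index pasted to horseno with ".") is an assumption based on how the values in the error message look ("1.8", "1.11"), not a statement about mlogit's internals.

```r
# Toy stand-in for z; values are invented for illustration only.
toy <- data.frame(
  datekey = c("2008091501", "2008091501", "2008091501", "2008091502"),
  horseno = c(8, 11, 8, 3)
)

# Cross-tabulate the two keys: any cell above 1 is a repeated pair.
counts <- with(toy, table(datekey, horseno))
has_repeats <- any(counts > 1)

# List the repeated rows directly.
dups <- toy[duplicated(toy[c("datekey", "horseno")]), ]

# Assumed label scheme (index "." horseno), modelled on the error text.
lab <- paste(as.integer(factor(toy$datekey)), toy$horseno, sep = ".")

# Base R refuses duplicated row names; capture the error instead of stopping.
res <- suppressWarnings(tryCatch(
  { row.names(toy) <- lab; "ok" },
  error = function(e) conditionMessage(e)
))

print(has_repeats)
print(dups)
print(res)
```

If `has_repeats` is FALSE on the real data, the keys themselves are unique and the duplicated labels must come from somewhere else.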

Actually, there were no duplicate datekey x horseno combos. Converting horseno and datekey with as.character() and then switching the shape argument from "long" to "wide" runs without error and gives this result:

z$datekey <- as.character(z$datekey)
z$horseno <- as.character(z$horseno)
x<-mlogit.data(z,choice="winner",shape="wide",id.var="datekey",alt.var="horseno")
str(x)
#----------
Classes ‘mlogit.data’ and 'data.frame': 18312 obs. of  11 variables:
 $ datekey : Factor w/ 733 levels "2008091501","2008091502",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ raceno  : int  1 1 1 1 1 1 1 1 1 1 ...
 $ horseno : chr  "0" "1" "0" "1" ...
 $ place   : int  1 1 2 2 3 3 4 4 5 5 ...
 $ winner  : logi  FALSE TRUE TRUE FALSE TRUE FALSE ...
 $ draw    : int  2 2 3 3 5 5 10 10 4 4 ...
 $ winodds : num  12 12 8.6 8.6 7 7 23 23 11 11 ...
 $ log_odds: num  2.48 2.48 2.15 2.15 1.95 ...
 $ jwt     : int  128 128 123 123 127 127 123 123 113 113 ...
 $ hwt     : int  1170 1170 1135 1135 1114 1114 1018 1018 1027 1027 ...
 $ chid    : num  1 1 2 2 3 3 4 4 5 5 ...
 - attr(*, "index")='data.frame':   18312 obs. of  3 variables:
  ..$ chid: Factor w/ 9156 levels "1","2","3","4",..: 1 1 2 2 3 3 4 4 5 5 ...
  ..$ alt : Factor w/ 2 levels "0","1": 1 2 1 2 1 2 1 2 1 2 ...
  ..$ id  : Factor w/ 733 levels "2008091501","2008091502",..: 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, "choice")= chr "winner"


Source: https://stackoverflow.com/questions/22441657/error-in-row-names-data-frame-using-mlogit-in-r-language
