Question
I want to calculate the Gini coefficient for each column of a 2090 x 25
data frame. I am using the ineq package (ineq with type="Gini") and the following code:
gini <- sapply(mydata, function(x) ineq(x, type="Gini"))
This produces results that look valid, but also the following warning messages:
Warning messages:
1: In n * sum(x) : NAs produced by integer overflow
2: In sum(x * 1:n) : Integer overflow - use sum(as.numeric(.))
3: In n * sum(x) : NAs produced by integer overflow
To overcome the integer overflow I converted the data frame to a matrix (mymatrix <- as.matrix(mydata)),
but then the results were all zeros or NAs. I think this is because the ineq package requires a vector and a matrix is not a vector.
My questions are:
- How can I convert integer columns to numeric and retain a vector class?
- Are there any other options to work around the integer overflow problem?
Thanks
Nerida
Answer 1:
Without more information, my guess is that you want to convert each column to numeric before passing it to ineq:
sapply(1:25, function(x) ineq(as.numeric(mydata[, x]), type='Gini'))
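To illustrate why the conversion helps, here is a minimal base-R sketch of the two expressions named in the warnings. The vector x is made-up data (not the asker's), chosen so that the integer results exceed .Machine$integer.max while the same arithmetic in double precision does not.

```r
# Hypothetical integer column with as many rows as the data in the question
x <- rep(1500L, 2090)
n <- length(x)                       # length() returns an integer

# Integer arithmetic: both results exceed .Machine$integer.max (2147483647),
# so R returns NA and issues the overflow warnings quoted in the question
int_product  <- n * sum(x)
int_weighted <- sum(x * 1:n)

# The same expressions in double precision are computed without overflow
xd <- as.numeric(x)
dbl_product  <- n * sum(xd)
dbl_weighted <- sum(xd * 1:n)

print(c(int_product, int_weighted))  # both NA
print(c(dbl_product, dbl_weighted))
```

Note that the warnings come from intermediate integer sums, not from the individual values, so columns with modest entries can still trigger them once the number of rows is large.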
Edit: as @James and @Roman pointed out, sapply
iterates over the columns of a data frame, passing each one in turn as a vector, so
sapply(mydata, function(x) ineq(as.numeric(x), type='Gini'))
should produce the same result.
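As a small sketch of the difference between iterating over a data frame and over a matrix, the example below uses a made-up two-column stand-in for mydata. It also shows one common way to convert every column to numeric in place while keeping the data frame (and therefore one vector per column). On a matrix, sapply visits each cell separately, so any function applied that way sees only single values, which is consistent with the zeros and NAs reported in the question.

```r
# Hypothetical stand-in for mydata: two integer columns
mydata <- data.frame(a = c(1L, 2L, 3L), b = c(4L, 5L, 6L))

# A data frame is a list of columns, so sapply() makes one call per column
per_column <- sapply(mydata, length)

# as.matrix() keeps integer storage here, and sapply() on a matrix
# makes one call per cell, each receiving a single value
mymatrix <- as.matrix(mydata)
per_cell <- sapply(mymatrix, length)

# apply() with MARGIN = 2 is the column-wise equivalent for a matrix
per_matrix_column <- apply(mymatrix, 2, length)

# Convert all columns to numeric in place; mydata[] keeps the data frame
mydata[] <- lapply(mydata, as.numeric)
column_classes <- sapply(mydata, class)

print(per_column)
print(length(per_cell))
print(per_matrix_column)
print(column_classes)
```

After the in-place conversion, the original call sapply(mydata, function(x) ineq(x, type="Gini")) receives numeric vectors and no longer needs as.numeric inside the function.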
Source: https://stackoverflow.com/questions/16890686/gini-coefficient-ineq-package-in-r-and-integer-overflow