Question
I have a data frame read from a .csv file that contained both numeric and character values. I want to convert this data frame into a matrix. All the remaining information is numeric (I deleted the non-numeric rows), so it should be possible to convert the data frame into a numeric matrix. However, I get a character matrix.
The only solution I have found is to use as.numeric on each and every column, but this is quite time-consuming. I am quite sure there is a way to do this with some kind of for(i in 1:n) loop, but I cannot figure out how it might work. Or is the only way really to start with numeric values in the first place, as proposed here (Making matrix numeric and name orders)?
Probably this is a very easy thing for most of you :P
The matrix is a lot bigger; this is only the first few rows. Here's the code:
cbind(
as.numeric(SFI.Matrix[ ,1]),
as.numeric(SFI.Matrix[ ,2]),
as.numeric(SFI.Matrix[ ,3]),
as.numeric(SFI.Matrix[ ,4]),
as.numeric(SFI.Matrix[ ,5]),
as.numeric(SFI.Matrix[ ,6]))
# to get something like this again:
Social.Assistance Danger.Poverty GINI S80S20 Low.Edu Unemployment
0.147 0.125 0.34 5.5 0.149 0.135 0.18683691
0.258 0.229 0.27 3.8 0.211 0.175 0.22329362
0.207 0.119 0.22 3.1 0.139 0.163 0.07170422
0.219 0.166 0.25 3.6 0.114 0.163 0.03638525
0.278 0.218 0.29 4.1 0.270 0.198 0.27407825
0.288 0.204 0.26 3.6 0.303 0.211 0.22372633
Thank you for any help!
Answer 1:
Edit 2: See @flodel's answer. Much better.
Try:
# assuming SFI is your data.frame
as.matrix(sapply(SFI, as.numeric))
Edit: or, as @CarlWitthoft suggested in the comments:
matrix(as.numeric(unlist(SFI)),nrow=nrow(SFI))
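As a minimal sketch of how the two variants above behave, assume a small made-up data frame whose columns were read in as character strings (the values and the reduced column set are hypothetical stand-ins for the real SFI):

```r
# Hypothetical stand-in for SFI: numbers stored as character strings
SFI <- data.frame(
  GINI   = c("0.34", "0.27", "0.22"),
  S80S20 = c("5.5", "3.8", "3.1"),
  stringsAsFactors = FALSE
)

# Variant 1: convert column by column, then make sure the result is a matrix
m1 <- as.matrix(sapply(SFI, as.numeric))

# Variant 2: flatten everything, convert, and rebuild the shape
m2 <- matrix(as.numeric(unlist(SFI)), nrow = nrow(SFI))

is.numeric(m1)   # TRUE
colnames(m1)     # "GINI" "S80S20"
colnames(m2)     # NULL
```

One practical difference: the sapply() variant keeps the column names, while the unlist() variant drops them, so they would have to be reassigned with colnames(m2) <- names(SFI).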
Answer 2:
data.matrix(SFI)
From ?data.matrix:
Description:
Return the matrix obtained by converting all the variables in a
data frame to numeric mode and then binding them together as the
columns of a matrix. Factors and ordered factors are replaced by
their internal codes.
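The last sentence of that description matters when a column is not already numeric. According to the help page in current R versions, character columns are first converted to factors, so they also end up as internal codes. A small sketch with a made-up one-column data frame (the name df and its values are assumptions for illustration):

```r
# Hypothetical column: numbers that were read in as strings
df <- data.frame(x = c("10", "2", "33"), stringsAsFactors = FALSE)

# data.matrix() returns factor codes here, not the values 10, 2, 33
codes <- data.matrix(df)[, "x"]

# Converting the strings directly recovers the actual numbers
values <- as.numeric(df$x)               # 10 2 33

# For a factor column, go through as.character() first
f <- factor(c("10", "2", "33"))
from_factor <- as.numeric(as.character(f))   # 10 2 33
```

So data.matrix() is safe when the columns are already numeric (or when the codes are what you want); otherwise it is worth checking str(SFI) first.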
Answer 3:
Here is an alternative way if the data frame just contains numbers.
apply(as.matrix.noquote(SFI),2,as.numeric)
However, the most reliable way of converting a data frame to a matrix is to use the data.matrix() function.
Answer 4:
I had the same problem and solved it like this: I converted the original data frame without its first column (the row names) and added the row names back afterwards.
SFIo <- as.matrix(apply(SFI[,-1],2,as.numeric))
row.names(SFIo) <- SFI[,1]
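To illustrate the idea on a small scale, here is a sketch with a made-up data frame whose first column holds labels (the column name Region and all values are hypothetical):

```r
# Hypothetical data: first column holds labels, the rest are numbers as strings
SFI <- data.frame(
  Region = c("A", "B"),
  GINI   = c("0.34", "0.27"),
  S80S20 = c("5.5", "3.8"),
  stringsAsFactors = FALSE
)

# Convert everything except the label column, column by column
SFIo <- as.matrix(apply(SFI[, -1], 2, as.numeric))

# Reattach the labels as row names
rownames(SFIo) <- SFI[, 1]

SFIo["A", "GINI"]   # 0.34
```

Keeping the labels as row names rather than as a column is what lets the matrix stay numeric, since a matrix can only hold one type.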
Answer 5:
Another way of doing it is to use the read.table() argument colClasses to specify the column types, by setting colClasses=c(*column class types*).
If there are 6 columns whose members you want as numeric, repeat the character string "numeric" six times, separated by commas, import the data frame, and then call as.matrix() on it.
P.S. It looks like you have headers, so I put header=T.
as.matrix(read.table(SFI.matrix,header=T,
colClasses=c("numeric","numeric","numeric","numeric","numeric","numeric"),
sep=","))
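A self-contained sketch of the same approach, using inline text instead of a file so it can be run as is. The three-column CSV content is made up, and rep("numeric", 3) is simply a shorter way of writing the repeated string:

```r
# Hypothetical CSV content, passed through the text= argument of read.table()
csv_text <- "GINI,S80S20,Low.Edu
0.34,5.5,0.149
0.27,3.8,0.211"

m <- as.matrix(read.table(text = csv_text, header = TRUE, sep = ",",
                          colClasses = rep("numeric", 3)))

is.numeric(m)    # TRUE
m[2, "S80S20"]   # 3.8
```

With colClasses fixed to "numeric", a file that still contains non-numeric entries makes the import fail with an error instead of silently producing a character column, which can be a useful check.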
Answer 6:
I filled in the NAs manually by exporting the CSV, editing it, and re-importing it, as shown below.
Perhaps one of you experts can explain why this procedure worked so well: the first file had columns of types char, INT and num (floating-point numbers), which all became char after STEP 1, but at the end of STEP 3 R correctly recognized the data type of each column.
# STEP 1:
MainOptionFile <- read.csv("XLUopt_XLUstk_v3.csv",
header=T, stringsAsFactors=FALSE)
# ... STEP 2:
library(stringr)  # provides str_locate()
TestFrame <- subset(MainOptionFile, str_locate(option_symbol,"120616P00034000") > 0)
write.csv(TestFrame, file = "TestFrame2.csv")
# ...
# STEP 3:
# I made various amendments to `TestFrame2.csv`, including replacing all missing data cells with appropriate numbers. I then read that amended data frame back into R as follows:
XLU_34P_16Jun12 <- read.csv("TestFrame2_v2.csv",
header=T,stringsAsFactors=FALSE)
On arrival back in R, all columns had their correct measurement levels automatically recognized by R!
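One likely explanation, offered as a sketch rather than a diagnosis of the files above (which are not available here): read.csv() guesses each column's type from its contents, and a single entry that cannot be parsed as a number makes the whole column character. Once the missing cells were replaced with numbers, every column could be parsed cleanly. The example below uses made-up data with a hypothetical "n/a" placeholder, and shows that the na.strings argument can achieve the same effect without editing the file by hand:

```r
# Hypothetical CSV content with one unparseable placeholder in the bid column
raw <- "strike,bid
34,0.15
35,n/a
36,0.40"

# One bad cell is enough to turn the whole column into character
as_read <- read.csv(text = raw, stringsAsFactors = FALSE)
class(as_read$bid)   # "character"

# Declaring the placeholder as a missing-value marker keeps the column numeric
fixed <- read.csv(text = raw, na.strings = c("NA", "n/a"))
class(fixed$bid)     # "numeric"
```

The missing entry then shows up as NA in a numeric column, and can be filled in or dropped from within R.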
Source: https://stackoverflow.com/questions/16518428/right-way-to-convert-data-frame-to-a-numeric-matrix-when-df-also-contains-strin