Question
I have a large matrix of gene IDs, each followed by a series of bootstrap values.
For example:
NM_001040105 1.80711736583 0.877742720548 1.0842704195 1.80711736583 0.505992862434 0.877742720548 1.37340919803 0.722846946334 1.0842704195 1.0842704195 2.52996431217 1.80711736583 1.0842704195 2.52996431217 1.80711736583 1.0842704195 1.37340919803 1.37340919803 1.0842704195 1.37340919803 0.877742720548 1.0842704195 2.52996431217 1.80711736583 1.80711736583 0.877742720548 0.877742720548 0.877742720548 1.80711736583 1.0842704195 0.722846946334 0.877742720548 0.722846946334 1.80711736583 0.877742720548 8.31273988284 1.37340919803 0.722846946334 1.0842704195 1.0842704195 1.0842704195 1.37340919803 2.52996431217 1.80711736583 1.37340919803 1.37340919803 8.31273988284 3.97565820484 1.80711736583 ...
The problem is that not every gene has the same number of bootstrap values, so the matrix is not rectangular and read.table() won't work. readLines() on its own won't necessarily work either, as I need the gene IDs to stay associated with their respective bootstrap values. Is there any way to read a table like this into R?
Thanks, Marcus
Answer 1:
What about
#sample data
test<-c("NM_001040105 1.80711736583 0.877742720548 1.0842704195",
"PR_00104145 0.722846946334",
"QQ_001678941 1.37340919803 0.877742720548 1.0842704195 2.52996431217 1.80711736583 1.80711736583 0.877742720548")
Here I use a textConnection to read in the sample data, but you should be able to pass a filename to readLines just as well. I also split each line on spaces right away.
con<-textConnection(test)
nn<-strsplit(readLines(con), " ")
close(con)
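Now I turn it into a named list and make the values numeric. I use the first element of each line as the name, and all the rest as the values.
# Map() takes its names from the first (character) argument, so the IDs become the list names
Map(function(a,b)b, sapply(nn,"[",1),
lapply(nn,function(x) as.numeric(tail(x,-1))))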
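For reference, the same steps can be sketched end to end starting from a file on disk. This is only a minimal sketch: the temporary file, the shortened sample rows, and the variable names `path` and `boot` are assumptions made for illustration, and `setNames()` is swapped in for the `Map()` call as an equivalent way to attach the IDs.

```r
# Hypothetical input: write a few sample rows to a temporary file
test <- c("NM_001040105 1.80711736583 0.877742720548 1.0842704195",
          "PR_00104145 0.722846946334",
          "QQ_001678941 1.37340919803 0.877742720548")
path <- tempfile(fileext = ".txt")
writeLines(test, path)

# Split each line on single spaces: first field is the ID, the rest are values
nn <- strsplit(readLines(path), " ", fixed = TRUE)

# Named list: one numeric vector of bootstrap values per gene ID
boot <- setNames(lapply(nn, function(x) as.numeric(x[-1])),
                 vapply(nn, `[`, character(1), 1))

boot[["PR_00104145"]]  # values for a single gene
lengths(boot)          # number of bootstrap values per gene
```

Because the result is a list rather than a matrix, each gene keeps exactly as many values as it had in the file.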
Answer 2:
A reasonably performant way, assuming test is the result of readLines():
space <- regexpr(" ", test, fixed=TRUE)
id <- substring(test, 1L, space-1L)
setNames(strsplit(substring(test, space+1L), " ", fixed=TRUE), id)
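Note that `strsplit()` returns character vectors, so the list above still holds strings. A minimal sketch of converting them to numbers follows. It also pads the vectors with `NA` to get a rectangular matrix; that padding is an assumption about what might be wanted downstream, not something the answer itself does. The sample rows and the names `chr`, `num`, `n` and `mat` are hypothetical.

```r
# Hypothetical sample standing in for the result of readLines()
test <- c("NM_001040105 1.80711736583 0.877742720548 1.0842704195",
          "PR_00104145 0.722846946334",
          "QQ_001678941 1.37340919803 0.877742720548")

# Same approach as above: find the first space, split ID from values
space <- regexpr(" ", test, fixed = TRUE)
id <- substring(test, 1L, space - 1L)
chr <- setNames(strsplit(substring(test, space + 1L), " ", fixed = TRUE), id)

# Convert each character vector to numeric; lapply() keeps the names
num <- lapply(chr, as.numeric)

# Optional: pad every vector with NA up to the longest one, giving one row per gene
n <- max(lengths(num))
mat <- t(vapply(num, function(v) c(v, rep(NA_real_, n - length(v))), numeric(n)))
```

Keeping the list is usually simpler; the padded matrix is only useful when later code needs a rectangular input and can handle `NA`.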
Source: https://stackoverflow.com/questions/24091275/how-to-read-a-non-rectangular-matrix-into-r