How to read a non-rectangular matrix into R

Submitted by 无人久伴 on 2019-12-12 01:39:57

Question


I have a large matrix of gene IDs, each followed by a series of bootstrap values.

For example:

NM_001040105 1.80711736583 0.877742720548 1.0842704195 1.80711736583 0.505992862434 0.877742720548 1.37340919803 0.722846946334 1.0842704195 1.0842704195 2.52996431217 1.80711736583 1.0842704195 2.52996431217 1.80711736583 1.0842704195 1.37340919803 1.37340919803 1.0842704195 1.37340919803 0.877742720548 1.0842704195 2.52996431217 1.80711736583 1.80711736583 0.877742720548 0.877742720548 0.877742720548 1.80711736583 1.0842704195 0.722846946334 0.877742720548 0.722846946334 1.80711736583 0.877742720548 8.31273988284 1.37340919803 0.722846946334 1.0842704195 1.0842704195 1.0842704195 1.37340919803 2.52996431217 1.80711736583 1.37340919803 1.37340919803 8.31273988284 3.97565820484 1.80711736583 ...

The problem is that not every gene has the same number of bootstrap values, so the matrix is not rectangular and read.table() won't work. readLines() on its own won't necessarily work either, as I need the gene IDs to stay associated with their respective bootstrap values. Is there any way to read a table like this into R?

Thanks, Marcus


Answer 1:


What about

# sample data
test <- c("NM_001040105 1.80711736583 0.877742720548 1.0842704195",
          "PR_00104145 0.722846946334",
          "QQ_001678941 1.37340919803 0.877742720548 1.0842704195 2.52996431217 1.80711736583 1.80711736583 0.877742720548")

Here I use a textConnection() to read in the sample data, but you should be able to pass a filename straight to readLines() as well. I also split each line on spaces right away.

con<-textConnection(test)
nn<-strsplit(readLines(con), " ")
close(con)
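For data that lives on disk, the same step might look like the sketch below. The temporary file is only a stand-in for the real input file, and the names `path`, `lines` and `nn_file` are made up for illustration. Splitting on the regular expression " +" is a small variation on the answer's single-space split; it tolerates runs of spaces between values.

```r
# Hypothetical input file: written to a temp path only so the sketch is self-contained.
path <- tempfile(fileext = ".txt")
writeLines(c("NM_001040105 1.80711736583 0.877742720548 1.0842704195",
             "PR_00104145 0.722846946334"), path)

# readLines() accepts a file name directly; no explicit connection is needed.
lines <- readLines(path)

# Split each line on one or more spaces (regular expression, not a fixed string).
nn_file <- strsplit(lines, " +")

# Remove the temporary file again.
unlink(path)
```

Each element of `nn_file` is a character vector whose first entry is the gene ID.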

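Now I turn it into a named list and make the values numeric. I use the first element of each line as the name and all the rest as the values. Map() names its result after its first argument when that argument is a character vector, which is why the gene IDs end up as the list names.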

Map(function(a,b)b, sapply(nn,"[",1), 
    lapply(nn,function(x) as.numeric(tail(x,-1))))
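An equivalent formulation with setNames() may be easier to read, and it makes it simple to keep the result for later use. This is a sketch rather than the answerer's code; the variable names `ids` and `values` are assumptions chosen for illustration.

```r
# Two of the sample lines from above, already split on single spaces.
nn <- strsplit(c("NM_001040105 1.80711736583 0.877742720548 1.0842704195",
                 "PR_00104145 0.722846946334"), " ", fixed = TRUE)

# The first field of each line is the gene ID; the remaining fields are bootstrap values.
ids <- vapply(nn, `[`, character(1), 1)
values <- setNames(lapply(nn, function(x) as.numeric(x[-1])), ids)

values[["PR_00104145"]]   # bootstrap values for one gene
lengths(values)           # number of bootstrap values per gene
sapply(values, mean)      # per-gene summary despite unequal lengths
```

Because the result is a list rather than a matrix, each gene keeps exactly as many values as its line contained, with no padding.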



Answer 2:


A reasonably performant way, assuming test is the result of readLines():

space <- regexpr(" ", test, fixed=TRUE)
id <- substring(test, 1L, space-1L)
setNames(strsplit(substring(test, space+1L), " ", fixed=TRUE), id)
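Note that strsplit() returns character vectors, so the snippet above leaves the bootstrap values as strings. A possible follow-up, sketched here with the illustrative names `res`, `res_num` and `long`, converts them to numeric and optionally flattens the list into a long data frame, which is rectangular again.

```r
test <- c("NM_001040105 1.80711736583 0.877742720548 1.0842704195",
          "PR_00104145 0.722846946334")

# Same approach as above: locate the first space, then split ID from values.
space <- regexpr(" ", test, fixed = TRUE)
id <- substring(test, 1L, space - 1L)
res <- setNames(strsplit(substring(test, space + 1L), " ", fixed = TRUE), id)

# Convert each character vector to numeric.
res_num <- lapply(res, as.numeric)

# Optional: one row per (gene, value) pair.
long <- data.frame(gene = rep(names(res_num), lengths(res_num)),
                   value = unlist(res_num, use.names = FALSE),
                   stringsAsFactors = FALSE)
```

The long format works with functions that expect a data frame, for example aggregate(value ~ gene, long, mean).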


Source: https://stackoverflow.com/questions/24091275/how-to-read-a-non-rectangular-matrix-into-r
