I would like to delete columns whose names contain a specific string, "Gtype.", from a tab-delimited .txt file. I have already tried this command in R:
df <- df[, -grep("Gtype.", colnames(df))]
However, my matrix is too big (more than 13 GB) and R cannot deal with it (Error: cannot allocate vector of size....).
My input file:
Log.NE122 Gtype.NE122 Log.NE144 Gtype.NE144
-0.33 AA 1.0 AB
My expected output:
Log.NE122 Log.NE144
-0.33 1.0
I am wondering whether this can be done in bash. If anyone has other options....
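Using awk (with tab set as the field separator, since the file is tab-delimited):
awk 'BEGIN { FS = OFS = "\t" }                                            # tab-delimited input and output
NR==1 { for (i=1; i<=NF; i++) if ($i !~ /Gtype\./) keep[++n] = i }        # header: record the columns to keep
{ for (j=1; j<=n; j++) printf "%s%s", $(keep[j]), (j<n ? OFS : ORS) }' file   # print only those columns, no trailing separator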
Log.NE122 Log.NE144
-0.33 1.0
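Another option that might be worth trying, assuming the file really is tab-delimited and the header is on the first line, is to work out the positions of the columns to keep from the header and hand them to cut. This is only a sketch: sample.txt and filtered.txt are made-up names standing in for the real input and output, and the two-row sample is the one from the question.

```shell
# Hypothetical sample file; in practice this would be the 13 GB table.
printf 'Log.NE122\tGtype.NE122\tLog.NE144\tGtype.NE144\n-0.33\tAA\t1.0\tAB\n' > sample.txt

# Put each header field on its own line, number the fields that do NOT
# contain "Gtype.", and join those numbers with commas (here: 1,3).
keep=$(head -n 1 sample.txt | tr '\t' '\n' | grep -n -v 'Gtype\.' | cut -d: -f1 | paste -sd, -)

# cut reads the file line by line, so it never holds the whole table in memory.
cut -f "$keep" sample.txt > filtered.txt
cat filtered.txt
```

Only the header line is inspected to build the column list, so the big file is read once by cut and memory use should stay small regardless of file size.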
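You can also try using the "data.table" package and assigning NULL to the columns you want to drop:
library(data.table)
dt <- data.table(df)                                     # note: this conversion itself copies df
cols <- grep("Gtype\\.", names(dt), value = TRUE)        # names of the columns to delete
dt[, (cols) := NULL]                                     # remove them by reference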
"data.table" tries to do most of its operations by reference, without having to make copies. Subsetting a data.frame the way you are doing it requires a copy to be made.
Source: https://stackoverflow.com/questions/23130502/delete-columns-in-text-files-with-specific-string