Question
I'm importing a table from a fixed-width format .txt file in R. The table has about 100 columns (variables) and 200000 lines (a few lines are shown below).
11111 2008 7 31 21 2008 8 1 21 3 4 6 18 4 7 0 12 0 0 0 0 0 1 0 0 0 0 0 0 0 5 0 0 7 5 0 1 0 2 0 0 0 0 0 0 2 0 0 0.0 5 14.9 0 14.9 0 14.0 0 16.5 0 14.9 0 15.6 0 15.3 0 0 15.6 0 15.6 0 17.6 0 16.1 0 17.10 0 1 97 0 0.60 0 1 15.1 0 986.6 0 1002.9 0 7 0 0.2 0
11111 2008 8 1 0 2008 8 1 0 4 7 6 18 4 98 0 1 9 0 0 0 2 0 1 0 0 0 0 0 0 0 5 0 0 7 0 0 0 1 0 2 0 260 0 1 0 0 2 0 0 0.0 5 14.4 0 14.4 0 13.0 0 14.9 0 14.9 0 15.2 0 14.6 0 0 15.2 0 14.8 0 16.1 0 15.7 0 16.10 0 1 93 0 1.20 0 1 14.1 0 986.1 0 1002.4 0 7 0 0.5 0
11111 2008 8 1 3 2008 8 1 3 5 10 6 18 4 98 0 1 3 0 0 0 1 0 0 0 0 0 0 0 0 0 5 0 0 7 5 0 1 0 2 0 200 0 1 0 0 4 0 0 0.0 5 25.8 0 7 14.4 0 26.0 0 26.0 0 19.8 0 17.0 0 0 19.8 0 15.2 0 20.1 0 20.1 0 17.10 0 1 74 0 6.00 0 1 15.1 0 984.5 0 1000.6 0 8 0 1.6 0
11111 2008 8 1 6 2008 8 1 6 6 13 6 18 4 98 0 1 7 0 6 0 1 0 0 0 1 0 0 0 0 0 1000 0 1 0 7 5 0 1 0 2 0 230 0 2 0 0 8 0 0 0.0 5 36.0 0 5 5 40.0 0 36.0 0 23.7 0 17.4 0 0 23.7 0 19.8 0 24.6 0 24.0 0 14.80 0 1 51 0 14.50 0 1 12.8 0 983.9 0 999.7 0 6 0 0.6 0
11111 2008 8 1 9 2008 8 1 9 7 16 6 18 4 96 0 0 9 0 9 0 0 0 0 0 2 0 0 0 0 0 1200 0 0 0 7 5 0 7 95 0 300 0 3 0 0 13 0 0 0.0 5 23.5 0 5 5 43.8 0 23.6 0 19.6 0 17.3 0 0 19.6 0 19.6 0 26.0 0 19.8 0 17.90 0 1 79 0 4.90 0 1 15.8 0 981.9 0 997.9 0 8 0 2.0 0
Right now I'm using the following code, which takes quite a long time to load (about 1 minute):
col_width <- c(5,5,3,3,3,5,3,3,3,2,
               3,3,3,2,3,2,2,3,2,3,
               2,2,2,2,2,2,2,2,2,2,
               2,5,2,2,2,2,2,2,2,2,
               2,3,2,4,2,3,2,2,3,2,
               2,7,2,6,2,6,2,6,2,6,
               2,6,2,6,2,6,2,2,6,2,
               6,2,6,2,6,2,6,2,2,4,
               2,6,2,2,6,2,7,2,7,2,
               3,2,5,2)
df.h.tomsk <- read.fwf(path,
                       widths=col_width,
                       header=FALSE,
                       sep="\t",
                       nrows=200000,
                       comment.char="",
                       buffersize=5000)
Any suggestions for speeding this up? For example, is there something like fread from data.table that works with fixed-width files?
Answer 1:
Have you tried using fread from library(data.table)? Please copy and paste a few lines of your file so this can be checked.
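As far as I know, fread does not take a vector of column widths, so it may not fit a fixed-width file directly. One alternative, sketched here in base R only, is to read the lines as plain strings and cut them with vectorised substring calls. This should avoid the temporary delimited file that read.fwf builds internally. The helper name read_fwf_fast and the two demo lines are made up for illustration; they are not from the question's file.

```r
# Hypothetical helper (not part of any package): split fixed-width lines
# into typed columns using vectorised substring() calls.
read_fwf_fast <- function(lines, widths) {
  ends   <- cumsum(widths)          # last character position of each field
  starts <- ends - widths + 1L      # first character position of each field
  cols <- lapply(seq_along(widths), function(i) {
    field <- trimws(substring(lines, starts[i], ends[i]))
    # Let R guess integer/numeric/character; blank fields become NA
    type.convert(field, as.is = TRUE)
  })
  names(cols) <- paste0("V", seq_along(widths))
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Tiny made-up example with widths 5, 5, 3, 3
demo_lines  <- c("11111 2008  7 31",
                 "11111 2008  8  1")
demo_widths <- c(5, 5, 3, 3)
demo <- read_fwf_fast(demo_lines, demo_widths)
print(demo)
```

On the real file this would be called as read_fwf_fast(readLines(path), col_width), assuming the widths in col_width add up to the full line length. If installing a package is an option, readr::read_fwf(path, readr::fwf_widths(col_width)) is another route worth timing.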
Source: https://stackoverflow.com/questions/22144325/speed-up-import-of-fixed-width-format-table-in-r