Question
I have 91 files in .log format:
Trajectory Log File
Rock type: 2 (0: Sphere, 1: Cuboid, 2: Rock)
Nr of Trajectories: 91
Trajectory-Mode: ON
Average Slope (Degrees): 28.05 / 51.99 / 64.83
Filename: test_tschamut_Pos1.xml
Z-offset: 1.32000
Rock Position X: 696621.38
Rock Position Y: 167730.02
Rock Position Z: 1679.6400
Friction:
Overall Type: Medium
t (s) x (m) y (m) z (m) p0 () p1 () p2 () p3 () vx (m s-1) vy (m s-1) vz (m s-1) wx (rot s-1) wy (rot s-1) wz (rot s-1) Etot (kJ) Ekin (kJ) Ekintrans (kJ) Ekinrot (kJ) zt (m) Fv (kN) Fh (kN) Slippage (m) mu_s (N s m-1) v_res (m s-1) w_res (rot s-1) JumpH (m) ProjDist (m) Jc () JH_Jc (m) SD (m)
0.000 696621.380 167730.020 1680.960 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1192.526 0.000 0.000 0.000 1677.754 0.000 0.000 0.000 0.350 0.000 0.000 3.206 0.000 0.000 0.000 0.000
0.010 696621.380 167730.020 1680.959 1.000 0.000 -0.000 0.000 0.000 0.000 -0.098 0.000 0.000 0.000 1192.526 0.010 0.010 0.000 1677.754 0.000 0.000 0.000 0.350 0.098 0.000 3.205 0.000 0.000 0.000 0.000
0.020 696621.380 167730.020 1680.958 1.000 0.000 -0.000 0.000 0.000 0.000 -0.196 0.000 0.000 0.000 1192.526 0.039 0.039 0.000 1677.754 0.000 0.000 0.000 0.350 0.196 0.000 3.204 0.000 0.000 0.000 0.000
0.040 696621.380 167730.020 1680.952 1.000 0.000 -0.000 0.000 0.000 0.000 -0.392 0.000 0.000 0.000 1192.526 0.158 0.158 0.000 1677.754 0.000 0.000 0.000 0.350 0.392 0.000 3.198 0.000 0.000 0.000 0.000
0.060 696621.380 167730.020 1680.942 1.000 0.000 -0.000 0.000 0.000 0.000 -0.589 0.000 0.000 0.000 1192.526 0.355 0.355 0.000 1677.754 0.000 0.000 0.000 0.350 0.589 0.000 3.188 0.000 0.000 0.000 0.000
I have managed to import a single file and to retain only the desired variables, which are x, y, z, Etot:
trjct <- read.table('trajectory_test_tschamut_Pos1.log', skip = 23)
trjct <- trjct[,c("V1","V2","V3", "V4", "V15")]
colnames(trjct) <- c("t", "x", "y", "z", "Etot")
> str(trjct)
'data.frame': 1149 obs. of 5 variables:
$ t : num 0 0.01 0.02 0.04 0.06 0.08 0.11 0.13 0.15 0.16 ...
$ x : num 696621 696621 696621 696621 696621 ...
$ y : num 167730 167730 167730 167730 167730 ...
$ z : num 1681 1681 1681 1681 1681 ...
$ Etot: num 1193 1193 1193 1193 1193 ...
However, I have 91 of these files and would like to analyse them simultaneously. Therefore, I want to create one large dataset that distinguishes the data from every file by adding an ID. A similar question has been answered before.
I applied that code to my data and adjusted it here and there, but I keep getting errors.
# importing all files at the same time
files.list <- list.files(pattern = ".log")
trjct <- data.frame(t=numeric(),
x=numeric(),
z=numeric(),
Etot=numeric(),
stringsAsFactors=FALSE)
for (i in 1: length(files.list)) {
df.next <- read.table(files.list[[i]], header=F, skip = 23)
df.next$ID <- paste0('simu', i)
df <- rbind(df, df.next)
}
I am getting an error:
Error in rep(xi, length.out = nvar) :
attempt to replicate an object of type 'closure'
QUESTIONS:
Where is the problem and how can I fix it?
Is there a better solution?
Answer 1:
You could also check out purrr::map_df, which behaves like lapply or a for loop but returns a data.frame:
read_traj <- function(fi) {
df <- read.table(fi, header=F, skip=23)
df <- df[, c(1:4, 15)]
colnames(df) <- c("t", "x", "y", "z", "Etot")
return(df)
}
# pattern is a regular expression: escape the dot and anchor it at the end of the name
files.list <- list.files(pattern = "\\.log$")
library(tidyverse)
map_df has a handy argument, .id=..., that creates a column (here called id) holding the indices 1...N, where N is the number of files. The indices are stored as character strings, not numbers.
map_df(files.list, ~read_traj(.x), .id="id")
If you want to save the file name instead, use the id column to index into files.list:
map_df(files.list, ~read_traj(.x), .id="id") %>%
mutate(id = files.list[as.numeric(id)])
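A possible shortcut, sketched under assumptions: if the input vector carries names, map_df copies those names into the .id column, so the file names can be attached before reading instead of being looked up afterwards. The temporary demo files, read_demo and the object names below are hypothetical stand-ins so the snippet runs without the real logs.

```r
library(purrr)  # map_df() also needs dplyr to be installed

# Hypothetical stand-in files (one header line, five columns).
demo_dir <- tempfile("traj_named_")
dir.create(demo_dir)
for (k in 1:2) {
  writeLines(c("header line", "0.00 10 20 30 100", "0.01 10 20 29 100"),
             file.path(demo_dir, sprintf("demo_%d.log", k)))
}

# Simplified reader for the demo files; the real logs would need skip = 23
# and the column selection used in read_traj.
read_demo <- function(fi) {
  df <- read.table(fi, header = FALSE, skip = 1)
  colnames(df) <- c("t", "x", "y", "z", "Etot")
  df
}

demo_files <- list.files(demo_dir, pattern = "\\.log$", full.names = TRUE)

# Name each path by its file name; .id then stores those names in "id".
named_files <- set_names(demo_files, basename(demo_files))
all_demo <- map_df(named_files, read_demo, .id = "id")
```

Recent purrr releases document map_df as superseded by map() followed by list_rbind(), so this may be worth checking against the version you have installed.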
Answer 2:
First of all, you should encapsulate the reading part in a function:
read_log_file <- function(path) {
trjct <- read.table(path, skip = 23)
trjct <- trjct[,c("V1","V2","V3", "V4", "V15")]
colnames(trjct) <- c("t", "x", "y", "z", "Etot")
return(trjct)
}
Then you can create a list of data frames using mapply (a variant of apply that iterates over several arguments in parallel; see the DataCamp article on the apply family if you want to know more).
files.list <- list.files(pattern = "\\.log$")  # regex: escaped dot, anchored at the end
ids <- seq_along(files.list)  # safe even if no files are found
df_list = mapply(function(path, id) {
df = read_log_file(path)
df$ID = id
return(df)
}, files.list, ids, SIMPLIFY=FALSE)
Note the SIMPLIFY=FALSE argument: it stops mapply from trying to simplify the result into a vector or matrix, so you get back a plain list of data frames.
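To see what SIMPLIFY changes, here is a small sketch on toy inputs. The function tag and the two path strings are hypothetical and no files are read.

```r
# Toy inputs: two made-up paths and their indices.
paths <- c("a.log", "b.log")
ids <- seq_along(paths)

# Stand-in for the reader: returns a one-row data frame per file.
tag <- function(path, id) data.frame(path = path, ID = id)

simplified <- mapply(tag, paths, ids)                    # SIMPLIFY = TRUE is the default
as_list    <- mapply(tag, paths, ids, SIMPLIFY = FALSE)  # list of data frames

is.matrix(simplified)         # the data frames were collapsed into a list-matrix
is.data.frame(as_list[[1]])   # each element is still a data frame

# Map() is documented as a wrapper around mapply that does not simplify.
via_map <- Map(tag, paths, ids)
```

With the default, the per-file data frames are flattened into a matrix whose cells are list elements, which bind_rows cannot use directly.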
Finally, you can concatenate all your data frames into one with bind_rows from the dplyr package:
df = dplyr::bind_rows(df_list)
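If you would rather not depend on dplyr, a base R sketch of the same concatenation step is do.call(rbind, ...). The two small data frames below are made-up toy values, not data from the logs.

```r
# Toy list standing in for df_list (hypothetical values).
df_list_demo <- list(
  data.frame(t = c(0, 0.01), Etot = c(10, 9), ID = 1L),
  data.frame(t = c(0, 0.01), Etot = c(20, 19), ID = 2L)
)

# rbind is called with every list element as a separate argument.
combined <- do.call(rbind, df_list_demo)
```

One difference to keep in mind: rbind expects every data frame to have the same columns, whereas bind_rows fills columns that are missing from some inputs with NA.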
Note: in general, in R, it is better to use the *apply family of functions than an explicit for loop.
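As for the error in the question's loop, the most likely cause is the name mismatch: the accumulator is created as trjct, but the loop calls rbind(df, df.next). Since no object called df was ever defined, R finds the built-in function stats::df (the F distribution density), and rbind fails when it tries to treat that function as data, hence "object of type 'closure'". A minimal sketch of the loop with a consistent accumulator name follows; the temporary demo files are hypothetical stand-ins so it runs without the real logs.

```r
# Hypothetical stand-in files (one header line, four columns).
demo_dir <- tempfile("traj_loop_")
dir.create(demo_dir)
for (k in 1:2) {
  writeLines(c("header line", "0.00 1 2 3", "0.01 1 2 4"),
             file.path(demo_dir, sprintf("demo_%d.log", k)))
}
files.list <- list.files(demo_dir, pattern = "\\.log$", full.names = TRUE)

trjct <- data.frame()  # empty accumulator
for (i in seq_along(files.list)) {
  # The real logs would use skip = 23 as in the question.
  df.next <- read.table(files.list[[i]], header = FALSE, skip = 1)
  df.next$ID <- paste0("simu", i)
  trjct <- rbind(trjct, df.next)  # same object name on both sides
}
```

Growing a data frame with rbind inside a loop copies it on every iteration. That is usually fine for 91 files, but collecting the pieces in a list and binding once scales better.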
Source: https://stackoverflow.com/questions/49132265/importing-many-files-at-the-same-time-and-adding-id-indicator