I'm wondering if there is a good way to delete multiple columns over a few different data sets in R. I have a data set that looks like:
RangeNumber Time
You want to gather those dataframes into a list and then run the extract function "[" over them with lapply. The first argument given to "[" should be TRUE so that all rows are returned, and the second argument should be the column names (I made up three dataframes that varied in their row numbers and column names, but all had 'Time' and 'Value' columns):
> datlist <- list(dat1,dat2,dat3)
> TimVal <- lapply(datlist, "[", TRUE, c("Time","Value") )
> TimVal
[[1]]
Time Value
1 2:00 1
2 2:05 4
[[2]]
Time Value
1 2:00 1
2 2:05 4
[[3]]
Time Value
1 2:00 1
2 2:05 4
2.1 2:05 4
1.1 2:00 1
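For anyone who wants to run this end to end, here is a minimal sketch. The three data frames are made up for illustration: the names dat1, dat2 and dat3 follow the code above, but their contents and extra columns (Range, Number, Extra) are assumptions, not the asker's data.

```r
# Hypothetical data frames: different extra columns and row counts,
# but each one has a Time and a Value column
dat1 <- data.frame(Range = c(10, 20), Time = c("2:00", "2:05"), Value = c(1, 4))
dat2 <- data.frame(Time = c("2:00", "2:05"), Number = c(7, 8), Value = c(1, 4))
dat3 <- data.frame(Value = c(1, 4, 4), Extra = c("a", "b", "c"),
                   Time = c("2:00", "2:05", "2:05"))

datlist <- list(dat1, dat2, dat3)

# For each element x, lapply effectively calls x[TRUE, c("Time", "Value")]
TimVal <- lapply(datlist, "[", TRUE, c("Time", "Value"))

# Check which columns each element kept
sapply(TimVal, names)
```

Selecting by name rather than by position means the column order in the originals does not matter, which is why dat3 can have Value first and still come out as Time, Value.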
This is added in case the goal was to have them all together in the same dataframe:
> do.call(rbind, TimVal)
Time Value
1 2:00 1
2 2:05 4
3 2:00 1
4 2:05 4
11 2:00 1
21 2:05 4
2.1 2:05 4
1.1 2:00 1
If you are very new to R, you may not have figured out that the last code did not change TimVal; it only showed the value that would be returned. To make the effect durable you need to assign the result to a name, perhaps even the same name:
TimVal <- do.call(rbind, TimVal)
Rather than delete, just choose the columns that you want, i.e.
data1 = data1[, c(2, 3)]
The question still remains about your other data sets: data2, etc. I suspect that since your data frames are all "similar", you could combine them into a single data frame with an additional identifier column, id, which tells you the data set number. How you combine your data sets depends on how your data is stored, but typically a for loop over read.csv is the way to go.
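A rough sketch of that idea follows. Since the question does not say how the files are stored, it assumes one CSV file per data set and first writes two small made-up files to a temporary directory so it can be run as is; the file names, column names and values are all hypothetical.

```r
# Write two small CSV files to a temporary directory so the sketch is runnable;
# in practice you would point list.files() at your own folder instead
tmp <- tempfile("datasets")
dir.create(tmp)
write.csv(data.frame(Range = 1:2, Time = c("2:00", "2:05"), Value = c(1, 4)),
          file.path(tmp, "data1.csv"), row.names = FALSE)
write.csv(data.frame(Range = 3:4, Time = c("2:10", "2:15"), Value = c(2, 5)),
          file.path(tmp, "data2.csv"), row.names = FALSE)

files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

pieces <- vector("list", length(files))
for (i in seq_along(files)) {
  d <- read.csv(files[i])
  d <- d[, c("Time", "Value")]  # keep only the wanted columns
  d$id <- i                     # record which data set the rows came from
  pieces[[i]] <- d
}

# Stack the pieces into a single data frame
combined <- do.call(rbind, pieces)
combined
```

Collecting the pieces in a list and calling rbind once at the end avoids growing a data frame inside the loop.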
I'm not sure if I should recommend these since these are pretty "destructive" methods.... Be sure that you have a backup of your original data before trying ;-)
This approach assumes that the datasets are already in your workspace and you just want new versions of them.
Both of these are pretty much the same. One option uses lapply() and the other uses for.
lapply
# "^data[0-9]+$" matches names made of "data" followed by one or more digits
lapply(ls(pattern = "^data[0-9]+$"),
       function(x) { assign(x, get(x)[2:3], envir = .GlobalEnv) })
for
temp <- ls(pattern = "^data[0-9]+$")
for (i in seq_along(temp)) {
  # overwrite each data set with only its 2nd and 3rd columns
  assign(temp[i], get(temp[i])[2:3])
}
Basically, ls(.etc.) will create a character vector of the names of the datasets in your workspace that match the naming pattern you provide. Then, you write a small function to select the columns you want to keep.
A less "destructive" approach would be to create new data.frames instead of overwriting the original ones. Something like this should do the trick:
lapply(ls(pattern = "^data[0-9]+$"),
       function(x) { assign(paste(x, "T", sep = "."),
                            get(x)[2:3], envir = .GlobalEnv) })
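If you want to try the non-destructive version without touching your workspace at all, here is a small sketch that does the same thing inside a scratch environment. The environment e and the two data frames in it are made up for illustration, and the pattern "^data[0-9]+$" assumes your objects are named data followed by digits.

```r
# Scratch environment, so nothing in the global workspace is created or changed
e <- new.env()
assign("data1", data.frame(a = 1:2, Time = c("2:00", "2:05"), Value = c(1, 4)),
       envir = e)
assign("data2", data.frame(b = 3:4, Time = c("2:10", "2:15"), Value = c(2, 5)),
       envir = e)

# Names of objects that are "data" followed by one or more digits
nms <- ls(envir = e, pattern = "^data[0-9]+$")

for (x in nms) {
  # Store a trimmed copy under a new name such as data1.T; the original stays
  assign(paste(x, "T", sep = "."), get(x, envir = e)[2:3], envir = e)
}

# List the originals alongside the new trimmed copies
ls(envir = e)
```

Anchoring the pattern with ^ and $ matters here: on a second run, an unanchored pattern would also pick up data1.T and data2.T and try to trim them again.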