'order' in R seems like 'sort' in Stata. Here's a dataset for example (only variable names listed):
v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13 v14 v15 v16 v17 v
This should give you the same file:
#snip
gtinfo <- rbind(tweetinfo, noretweetinfo)
gtinfo$deleted=""
retweetinfo <- transform(retweetinfo, reTweetId="", reUserId="")
gtinfo <- rbind(gtinfo, retweetinfo)
gtinfo <- gtinfo[, c(1:16, 18, 17)]
#snip
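A numeric index like the one in the snippet above is easy to get wrong if columns are added later. As a minimal sketch, assuming a hypothetical one-row data frame named toy with columns v1 to v18 (a stand-in, not the poster's data), the same swap of the last two columns can be written by name:

```r
# Hypothetical stand-in for gtinfo: 18 columns named v1..v18, one row
toy <- as.data.frame(matrix(1:18, nrow = 1))
names(toy) <- paste0("v", 1:18)

# Reorder by position, as in the snippet above
by_index <- toy[, c(1:16, 18, 17)]

# Reorder by name: every column except the last two, then v18, then v17
cols <- c(setdiff(names(toy), c("v17", "v18")), "v18", "v17")
by_name <- toy[, cols]

# Both approaches give the same column order
identical(names(by_index), names(by_name))
```

Selecting by name keeps working if the column count changes, at the cost of spelling out the names.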
It is possible to implement a function like Stata's order command in R, but I don't think there is much demand for that.
The dplyr package's function dplyr::relocate, a new verb introduced in dplyr 1.0.0, does exactly what you are looking for.
library(dplyr)
data %>% relocate(v17, v18, .before = v13)
data %>% relocate(v6, v16, .after = last_col())
data %>% relocate(age, .after = gender)
Because I'm procrastinating and experimenting with different things, here's a function that I whipped up. Ultimately, it depends on append:
moveme <- function(invec, movecommand) {
  # Split the command into individual moves (";") and then into tokens
  movecommand <- lapply(strsplit(strsplit(movecommand, ";")[[1]], ",|\\s+"),
                        function(x) x[x != ""])
  # For each move, separate the columns to move from the placement keyword
  movelist <- lapply(movecommand, function(x) {
    Where <- x[which(x %in% c("before", "after", "first", "last")):length(x)]
    ToMove <- setdiff(x, Where)
    list(ToMove, Where)
  })
  myVec <- invec
  for (i in seq_along(movelist)) {
    # Names that stay put during this move
    temp <- setdiff(myVec, movelist[[i]][[1]])
    A <- movelist[[i]][[2]][1]
    if (A %in% c("before", "after")) {
      ba <- movelist[[i]][[2]][2]
      if (A == "before") {
        after <- match(ba, temp) - 1
      } else if (A == "after") {
        after <- match(ba, temp)
      }
    } else if (A == "first") {
      after <- 0
    } else if (A == "last") {
      after <- length(myVec)
    }
    myVec <- append(temp, values = movelist[[i]][[1]], after = after)
  }
  myVec
}
Here's some sample data representing the names of your dataset:
x <- paste0("v", 1:18)
Imagine now that we wanted "v17" and "v18" before "v3", "v6" and "v16" at the end, and "v5" at the beginning:
moveme(x, "v17, v18 before v3; v6, v16 last; v5 first")
# [1] "v5" "v1" "v2" "v17" "v18" "v3" "v4" "v7" "v8" "v9" "v10" "v11" "v12"
# [14] "v13" "v14" "v15" "v6" "v16"
So, the obvious usage would be, for a data.frame named "df":
df[moveme(names(df), "how you want to move the columns")]
And, for a data.table named "DT" (which, as @mnel points out, would be more memory efficient):
setcolorder(DT, moveme(names(DT), "how you want to move the columns"))
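For readers unfamiliar with append, which moveme relies on, here is a minimal base-R sketch of a single "before" move done by hand. The vector cols and the helper variables are made up for illustration:

```r
# Hypothetical column names
cols <- paste0("v", 1:6)

# Move "v5" so that it sits directly before "v2"
to_move <- "v5"
rest <- setdiff(cols, to_move)        # names that stay put
pos <- match("v2", rest) - 1          # insert after the element preceding "v2"
moved <- append(rest, to_move, after = pos)
moved
# [1] "v1" "v5" "v2" "v3" "v4" "v6"
```

An "after" move uses match("v2", rest) without subtracting 1, and after = 0 places the values first.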
Note that compound moves are specified by semicolons.
The recognized moves are:
before (move the specified columns to before another named column)
after (move the specified columns to after another named column)
first (move the specified columns to the first position)
last (move the specified columns to the last position)

It is very unclear what you would like to do, but your first sentence makes me assume you would like to sort the dataset.
Actually, there is a built-in order function, which returns the indices that would put a vector in sorted order. Is this what you are looking for?
> x <- c(3,2,1)
> order(x)
[1] 3 2 1
> x[order(x)]
[1] 1 2 3
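If the goal is to sort the rows of a dataset (what Stata's sort does), the same idea extends to a data frame by indexing its rows with order. A minimal sketch, assuming a made-up data frame df with columns id and x:

```r
# Hypothetical data: three rows, to be sorted by column x
df <- data.frame(id = c("a", "b", "c"), x = c(3, 2, 1),
                 stringsAsFactors = FALSE)

# order() returns the row indices in ascending order of x
sorted_df <- df[order(df$x), ]
sorted_df$id
# [1] "c" "b" "a"

# Descending order
desc_df <- df[order(df$x, decreasing = TRUE), ]
```

Note that this sorts rows; it does not change the order of the columns.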
I get your problem. I now have code to offer:
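move <- function(data, variable, before) {
  # Names of all columns except the one being moved
  rest <- names(data)[names(data) != variable]
  # Position of the target column among the remaining columns
  # (looking it up in the full data would be off by one whenever
  # 'variable' originally sits to the left of 'before')
  i <- match(before, rest)
  # Insert the moved column just before the target and reorder
  data[append(rest, variable, after = i - 1)]
}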
# Example.
library(MASS)
data(painters)
str(painters)
# Move 'Expression' variable before 'Drawing' variable.
new <- move(painters,"Expression","Drawing")
View(new)
You could write your own function that does this.
The following will give you the new order for your column names, using syntax similar to Stata's.
where is a named list with 4 possibilities:
list(last = T)
list(first = T)
list(before = x), where x is the variable name in question
list(after = x), where x is the variable name in question
sorted = T will sort var_list lexicographically (a combination of alphabetic and sequential from the Stata command).
The function works on the names only (once you pass a data.frame object as data), and returns a reordered vector of names, e.g.:
stata.order <- function(var_list, where, sorted = F, data) {
  all_names <- names(data)
  # check that all the variable names are in the data
  check <- var_list %in% all_names
  if (any(!check)) {
    stop("Not all variables in var_list exist within data")
  }
  if (names(where) == "before") {
    if (!(where %in% all_names)) {
      stop("before variable not in the data set")
    }
  }
  if (names(where) == "after") {
    if (!(where %in% all_names)) {
      stop("after variable not in the data set")
    }
  }
  if (sorted) {
    var_list <- sort(var_list)
  }
  # positions of the moved variables, kept in the order of var_list
  # (so that sorted = T actually affects the result)
  where_in <- match(var_list, all_names)
  full_list <- seq_along(data)
  others <- full_list[-c(where_in)]
  .nwhere <- names(where)
  if (!(.nwhere %in% c("last", "first", "before", "after"))) {
    stop("where must be a list of a named element first, last, before or after")
  }
  # index in 'others' after which the moved variables are inserted
  do_what <- switch(names(where),
                    last = length(others),
                    first = 0,
                    before = which(all_names[others] == where) - 1,
                    after = which(all_names[others] == where))
  new_order <- append(others, where_in, do_what)
  return(all_names[new_order])
}
tmp <- as.data.frame(matrix(1:100, ncol = 10))
stata.order(var_list = c("V2", "V5"), where = list(last = T), data = tmp)
## [1] "V1" "V3" "V4" "V6" "V7" "V8" "V9" "V10" "V2" "V5"
stata.order(var_list = c("V2", "V5"), where = list(first = T), data = tmp)
## [1] "V2" "V5" "V1" "V3" "V4" "V6" "V7" "V8" "V9" "V10"
stata.order(var_list = c("V2", "V5"), where = list(before = "V6"), data = tmp)
## [1] "V1" "V3" "V4" "V2" "V5" "V6" "V7" "V8" "V9" "V10"
stata.order(var_list = c("V2", "V5"), where = list(after = "V4"), data = tmp)
## [1] "V1" "V3" "V4" "V2" "V5" "V6" "V7" "V8" "V9" "V10"
# throws an error
stata.order(var_list = c("V2", "V5"), where = list(before = "v11"), data = tmp)
## Error: before variable not in the data set
If you want to do the reordering memory-efficiently (by reference, without copying), use data.table:
library(data.table)
DT <- data.table(tmp)
# sets by reference, no copying
setcolorder(DT, stata.order(var_list = c("V2", "V5"), where = list(after = "V4"),
data = DT))
DT
## V1 V3 V4 V2 V5 V6 V7 V8 V9 V10
## 1: 1 21 31 11 41 51 61 71 81 91
## 2: 2 22 32 12 42 52 62 72 82 92
## 3: 3 23 33 13 43 53 63 73 83 93
## 4: 4 24 34 14 44 54 64 74 84 94
## 5: 5 25 35 15 45 55 65 75 85 95
## 6: 6 26 36 16 46 56 66 76 86 96
## 7: 7 27 37 17 47 57 67 77 87 97
## 8: 8 28 38 18 48 58 68 78 88 98
## 9: 9 29 39 19 49 59 69 79 89 99
## 10: 10 30 40 20 50 60 70 80 90 100