I have a data frame with a numerical ID variable which identifies the Primary, Secondary and Ultimate Sampling Units from a multistage sampling scheme. I want to split the origina
Several neat answers were posted years ago, but a solution I find useful, using the outer() function, has not been mentioned. In this age of search engines, I am putting it here in case others find it handy.
I was faced with a slightly simpler problem: turning a column of 6-digit numbers into 6 columns, one per digit. This can be solved using a combination of outer(), integer division (%/%) and modulo (%%).
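# floor() turns the random draws into whole-number IDs
DF <- data.frame("ID" = floor(runif(3) * 10^6), "a" = sample(letters, 3, TRUE))
# Column j of the outer() result is (ID %/% 10^(6 - j)) %% 10, i.e. the j-th digit
DF <- cbind(DF, "ID" = outer(DF$ID, 10^(5:0), function(a, b) a %/% b %% 10))
DF
# Example output (no seed is set, so values differ between runs):
#       ID a ID.1 ID.2 ID.3 ID.4 ID.5 ID.6
# 1 814895 z    8    1    4    8    9    5
# 2 417209 q    4    1    7    2    0    9
# 3 545797 c    5    4    5    7    9    7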
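To see why this works, it can help to trace a single ID by hand. The sketch below uses the first ID from the output above (814895); the variable names id, prefixes and digits are mine, for illustration only. Note that %/% and %% have the same precedence and are evaluated left to right, so a %/% b %% 10 is read as (a %/% b) %% 10.

```r
id <- 814895

# Integer division by 10^5, 10^4, ..., 10^0 keeps the leading 1, 2, ..., 6 digits
prefixes <- id %/% 10^(5:0)
print(prefixes)

# Modulo 10 then keeps only the last digit of each prefix
digits <- prefixes %% 10
print(digits)
```

outer() simply performs this same computation for every combination of ID and power of ten, which is what yields one row per ID and one column per digit.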
The question asked here is slightly more complex: each new column needs its own divisor for the integer division and its own modulus.
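# floor() turns the random draws into whole-number IDs
DF <- data.frame("ID" = floor(runif(3) * 10^6), "a" = sample(letters, 3, TRUE))
# b is the output column index: divide by 10^5, 10^3, 10^0, then keep the last 1, 2, 3 digits
DF <- cbind(DF, "ID" = outer(DF$ID, 1:3, function(a, b) a %/% 10^c(5, 3, 0)[b] %% 10^b))
DF
# Example output (no seed is set, so values differ between runs):
#       ID a ID.1 ID.2 ID.3
# 1 809372 q    8    9  372
# 2 954789 g    9   54  789
# 3 166969 l    1   66  969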
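The hard-coded c(5, 3, 0) and 10^b can also be derived from the field widths. Below is a minimal sketch, assuming every ID is a non-negative whole number with a fixed layout of digits; split_id and its widths argument are hypothetical names of my own, not from any package.

```r
# Hypothetical helper: split whole-number IDs into fields of the given digit widths
split_id <- function(id, widths) {
  # Number of digits to the right of each field, e.g. widths 1, 2, 3 -> 5, 3, 0
  shifts <- rev(cumsum(rev(widths))) - widths
  outer(id, seq_along(widths), function(a, b) {
    # Drop the digits to the right of the field, then keep only the field itself
    (a %/% 10^(shifts[b])) %% 10^(widths[b])
  })
}

res <- split_id(c(809372, 954789), widths = c(1, 2, 3))
print(res)
```

With widths = c(1, 2, 3) this should reproduce the three-column split above, and widths = rep(1, 6) should give the one-digit-per-column case. Leading zeros within a field are lost, since the results are numbers (09 is returned as 9).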