Question
I am trying to reproduce the results of a reshape in Stata using base R's reshape function.
Stata
webuse reshape3, clear
li, clean
// reshape long
reshape long inc@r ue, i(id) j(year)
list, sepby(id) clean
Before the reshape, this produces:
. li, clean
    id  sex  inc80r  inc81r  inc82r  ue80  ue81  ue82
1.   1    0    5000    5500    6000     0     1     0
2.   2    1    2000    2200    3300     1     0     0
3.   3    0    3000    2000    1000     0     0     1
Note the pattern of the names for the stub inc. After the reshape, I get:
. list, sepby(id) clean
    id  year  sex  incr  ue
1.   1    80    0  5000   0
2.   1    81    0  5500   1
3.   1    82    0  6000   0
4.   2    80    1  2000   1
5.   2    81    1  2200   0
6.   2    82    1  3300   0
7.   3    80    0  3000   0
8.   3    81    0  2000   0
9.   3    82    0  1000   1
R
I run into trouble in R since I don't know how to specify the regular expression required to parse the wide-format variable names.
library(foreign)
dfReshape3 <- read.dta('http://www.stata-press.com/data/r12/reshape3.dta')
reshape(dfReshape3, dir='long', varying=3:8, v.names=c('inc', 'ue'),
times = c('80', '81', '82'))
However, this gives me:
     id sex time  inc   ue
1.80  1   0   80 5000 5500
2.80  2   1   80 2000 2200
3.80  3   0   80 3000 2000
1.81  1   0   81 6000    0
2.81  2   1   81 3300    1
3.81  3   0   81 1000    0
1.82  1   0   82    1    0
2.82  2   1   82    0    0
3.82  3   0   82    0    1
Any help appreciated.
Answer 1:
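You were really close: just pass a list to varying, with one vector of column indices per long-format variable. With a plain vector such as 3:8, reshape assumes the columns are interleaved (inc, ue, inc, ue, ...), which is why the values landed in the wrong columns.
reshape(dfReshape3, dir='long', varying=list(3:5, 6:8), v.names=c('inc', 'ue'),
        times=c('80', '81', '82'))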
     id sex time  inc ue
1.80  1   0   80 5000  0
2.80  2   1   80 2000  1
3.80  3   0   80 3000  0
1.81  1   0   81 5500  1
2.81  2   1   81 2200  0
3.81  3   0   81 2000  0
1.82  1   0   82 6000  0
2.82  2   1   82 3300  0
3.82  3   0   82 1000  1
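If you would rather let the variable names drive the reshape (the regular-expression part of the question), one possible approach is to rename the wide columns into a stub.time form first and then let reshape split them on the separator. The sketch below is only an illustration under a few assumptions: the data frame is rebuilt by hand from the listing in the question so it runs without the download, the object names wide and long are made up here, and the value column ends up as inc rather than Stata's incr.

```r
# Rebuild the example data inline (values copied from the Stata listing).
wide <- data.frame(
  id     = 1:3,
  sex    = c(0, 1, 0),
  inc80r = c(5000, 2000, 3000),
  inc81r = c(5500, 2200, 2000),
  inc82r = c(6000, 3300, 1000),
  ue80   = c(0, 1, 0),
  ue81   = c(1, 0, 0),
  ue82   = c(0, 0, 1)
)

# Rewrite "inc80r" -> "inc.80" and "ue80" -> "ue.80";
# names that do not match the pattern (id, sex) are left alone.
names(wide) <- sub("^(inc|ue)([0-9]+)r?$", "\\1.\\2", names(wide))

# With names of the form stub.time, reshape can guess v.names and times itself.
long <- reshape(wide, direction = "long",
                varying = grep("^(inc|ue)\\.", names(wide), value = TRUE),
                sep = ".", idvar = "id", timevar = "year")

# Sort by id, then year, to mirror the row order of the Stata output.
long <- long[order(long$id, long$year), ]
print(long)
```

Because the columns are grouped by their stub rather than by position, this variant does not depend on the order of the columns in the wide data.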
Source: https://stackoverflow.com/questions/14673027/reshape-in-r-with-variable-name-patterns