Making more sense of the reshape() function

守給你的承諾、 提交于 2019-12-11 04:04:10

问题


Consider the following dataset which is taken from this question: Going from wide to long w/ coupled-columns: Is there a more R way to do this (i.e. - without using a for loop)?

(dput at the end of the question)

   phrase wo1sp wo2sp wo3sp wo1sc wo2sc wo3sc
1   hello   dan  mark  todd    10     5     4
2   hello  mark   dan chris     8     9     4
3 goodbye  mark   dan   kev     2     4    10
4    what   kev   dan  mark     4     5     5

The goal is to reshape the data from wide to long taking into account that there is some pattern in the column names. The expected output is

    phrase time    sp sc
1    hello    1   dan 10
2    hello    1  mark  8
3  goodbye    1  mark  2
4     what    1   kev  4
5    hello    2  mark  5
6    hello    2   dan  9
7  goodbye    2   dan  4
8     what    2   dan  5
9    hello    3  todd  4
10   hello    3 chris  4
11 goodbye    3   kev 10
12    what    3  mark  5

@docendodiscimus provided a sound solution using melt from data.table but I'd like to use reshape() from the stats library for the sake of practice.

This function is certainly quite powerful but I almost always run into a problem with one of its arguments, so this time it's the new.row.names argument which I rarely used before.

I tried

reshape(
  dat,
  idvar = "phrase",
  varying = list(
    "sp" = grep("sp$", names(dat)), 
    "sc" = grep("sc$", names(dat))
  ),
  direction = "long", 
  v.names = c("sp", "sc") # name of cols in long format
)

This returns the error

error in row.names<-.data.frame(*tmp*, value = paste(ids, times[i], : duplicate 'row.names' are not allowed

In addition: Warning message: non-unique value when setting 'row.names': ‘hello.1’

Reading the error message I found the 'solution' to be the new.row.names argument which I set to be 1:12, see below. (I cheated here because I looked how many rows were returned by the data.table solution.)

My question is what would be the generic solution to this problem?

# works!
reshape(
  dat,
  idvar = "phrase",
  varying = list(
    "sp" = grep("sp$", names(dat)),
    "sc" = grep("sc$", names(dat))
  ),
  direction = "long", 
  v.names = c("sp", "sc"),
  new.row.names = 1:12 # 1:10000 would also work
)

data

dat <- structure(list(phrase = c("hello", "hello", "goodbye", "what"
), wo1sp = c("dan", "mark", "mark", "kev"), wo2sp = c("mark", 
"dan", "dan", "dan"), wo3sp = c("todd", "chris", "kev", "mark"
), wo1sc = c(10L, 8L, 2L, 4L), wo2sc = c(5L, 9L, 4L, 5L), wo3sc = c(4L, 
4L, 10L, 5L)), .Names = c("phrase", "wo1sp", "wo2sp", "wo3sp", 
"wo1sc", "wo2sc", "wo3sc"), class = "data.frame", row.names = c(NA, 
-4L))

来源:https://stackoverflow.com/questions/54082564/making-more-sense-of-the-reshape-function

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!