Reshape horizontal to long format using pivot_longer

久未见 submitted on 2021-02-05 08:09:10

Question


I am trying to reshape my data from wide to long format using the code provided in an earlier answer (link). However, it doesn't work, even after several attempts to modify names_pattern = "(.*)_(pre|post.*)".

My sample data:

data1<-read.table(text="
Serial_ID   pre_EDV pre_ESV pre_LVEF    post_EDV    post_ESV    post_LVEF
1   76.2    32.9    56.8    86.3    36.6    57.6
2   65.4    35.9    45.1    60.1    26.1    56.7
3   64.4    35.1    45.5    72.5    41.1    43.3
4   50      13.9    72.1    46.4    18.4    60.4
5   89.6    32      64.3    70.9    19.3    72.8
6   62      20.6    66.7    55.9    17.8    68.2
7   91.2    37.7    58.6    61.9    23.8    61.6
8   62      24      61.3    69.3    34.9    49.6
9   104.1   22.7    78.8    38.6    11.5    70.1
10  90.6    31.2    65.6    48      16.1    66.4", sep="", header=T)

I want to reshape my data to

  1. Put matching columns below each other, e.g. post_EDV below pre_EDV.
  2. Create a new column "Pre vs. post".
  3. Fix the column headings (remove "pre_" and "post_" so that the heading is "EDV" only, as shown in the screenshot below).

This is the code I used:

library(dplyr);library(tidyr);library(stringr)
out <- data %>% pivot_longer(cols = -Serial_ID, 
           names_to = c(".value", "prevspost"), 
           names_pattern =  "(.*)_(pre|post.*)",
           names_sep="_") #%>% as.data.frame

I also tried names_prefix = c("pre_","post_") instead of names_pattern = "(.*)_(pre|post.*)", but that didn't work either.

Any advice will be greatly appreciated.


Answer 1:


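Edit: I recommend using @Dave2e's superior approach.

Your attempt didn't work because the capture groups in names_pattern have to match the parts of the column name in order. Your column names start with pre/post and end with the measurement, so the pattern (and names_to) must list the pre/post part first. You could try this: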

library(tidyr)
library(dplyr) 
data1 %>% pivot_longer(cols = -Serial_ID, 
           names_to = c("prevspost",".value"), 
           names_pattern =  "(pre|post)_(\\w+)") %>%
   dplyr::arrange(desc(prevspost),Serial_ID)
# A tibble: 20 x 5
   Serial_ID prevspost   EDV   ESV  LVEF
       <int> <chr>       <dbl> <dbl> <dbl>
 1         1 pre          76.2  32.9  56.8
 2         2 pre          65.4  35.9  45.1
 3         3 pre          64.4  35.1  45.5
 4         4 pre          50    13.9  72.1
 5         5 pre          89.6  32    64.3
 6         6 pre          62    20.6  66.7
 7         7 pre          91.2  37.7  58.6
 8         8 pre          62    24    61.3
 9         9 pre         104.   22.7  78.8
10        10 pre          90.6  31.2  65.6
11         1 post         86.3  36.6  57.6
12         2 post         60.1  26.1  56.7
13         3 post         72.5  41.1  43.3
14         4 post         46.4  18.4  60.4
15         5 post         70.9  19.3  72.8
16         6 post         55.9  17.8  68.2
17         7 post         61.9  23.8  61.6
18         8 post         69.3  34.9  49.6
19         9 post         38.6  11.5  70.1
20        10 post         48    16.1  66.4
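
To see why the order matters, both patterns can be checked against the column names directly with base R. This is only an illustrative sketch: the vector `cols` and the variable names below are introduced here and are not part of the original answer.

```r
# Column names from the question's data (everything except Serial_ID).
cols <- c("pre_EDV", "pre_ESV", "pre_LVEF", "post_EDV", "post_ESV", "post_LVEF")

# Pattern from the question: expects "<something>_pre" or "<something>_post...".
question_pattern <- "(.*)_(pre|post.*)"
# Pattern from this answer: expects "pre_<something>" or "post_<something>".
answer_pattern <- "(pre|post)_(\\w+)"

# TRUE where the pattern matches a column name.
question_matches <- grepl(question_pattern, cols)
answer_matches <- grepl(answer_pattern, cols)

# Capture groups for the answer's pattern: full match, group 1, group 2.
captures <- regmatches(cols, regexec(answer_pattern, cols))

print(question_matches)
print(answer_matches)
print(captures[[1]])
```

None of the names contains an underscore followed by "pre" or "post", so the question's pattern matches nothing, while the answer's pattern splits every name into a pre/post part and a measurement part.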



Answer 2:


Your initial approach was very close; it just needed some simplification. Use only one of "names_sep" or "names_pattern", not both.

library(tidyr)
library(dplyr)

data1 %>% pivot_longer(cols = -Serial_ID, 
                      names_to = c("Pre vs. post", '.value'), 
                      names_sep="_")

# A tibble: 20 x 5
   Serial_ID `Pre vs. post`   EDV   ESV  LVEF
       <int> <chr>          <dbl> <dbl> <dbl>
 1         1 pre             76.2  32.9  56.8
 2         1 post            86.3  36.6  57.6
 3         2 pre             65.4  35.9  45.1
 4         2 post            60.1  26.1  56.7
 5         3 pre             64.4  35.1  45.5
 6         3 post            72.5  41.1  43.3
 7         4 pre             50    13.9  72.1
 8         4 post            46.4  18.4  60.4
 9         5 pre             89.6  32    64.3
10         5 post            70.9  19.3  72.8
11         6 pre             62    20.6  66.7
12         6 post            55.9  17.8  68.2
13         7 pre             91.2  37.7  58.6
14         7 post            61.9  23.8  61.6
15         8 pre             62    24    61.3
16         8 post            69.3  34.9  49.6
17         9 pre            104.   22.7  78.8
18         9 post            38.6  11.5  70.1
19        10 pre             90.6  31.2  65.6
20        10 post            48    16.1  66.4
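
As a rough check of the "only one of the two" advice, the sketch below compares the three variants on a two-row subset of the question's data. The subset and the names `both`, `by_sep` and `by_pattern` are assumptions made for illustration; the behaviour described is what I would expect from the tidyr releases I know of, and the exact error text may differ between versions.

```r
library(tidyr)

# Two rows of the question's data, re-entered so the example is self-contained.
data1 <- data.frame(
  Serial_ID = 1:2,
  pre_EDV = c(76.2, 65.4), pre_ESV = c(32.9, 35.9), pre_LVEF = c(56.8, 45.1),
  post_EDV = c(86.3, 60.1), post_ESV = c(36.6, 26.1), post_LVEF = c(57.6, 56.7)
)

# Supplying names_sep and names_pattern together: captured as an error object.
both <- tryCatch(
  pivot_longer(data1, cols = -Serial_ID,
               names_to = c("prevspost", ".value"),
               names_pattern = "(pre|post)_(\\w+)",
               names_sep = "_"),
  error = function(e) e
)

# names_sep alone.
by_sep <- pivot_longer(data1, cols = -Serial_ID,
                       names_to = c("prevspost", ".value"),
                       names_sep = "_")

# names_pattern alone.
by_pattern <- pivot_longer(data1, cols = -Serial_ID,
                           names_to = c("prevspost", ".value"),
                           names_pattern = "(pre|post)_(\\w+)")

print(inherits(both, "error"))
print(by_sep)
```

For column names with a single underscore, names_sep and names_pattern should give the same table, so the simpler names_sep is enough here.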



Answer 3:


Try this:

library(dplyr);library(tidyr);library(stringr)
out <- data1 %>% pivot_longer(-Serial_ID,
                             names_to = c("measurement", "names"),
                             values_to = "values",
                             names_sep = "_")
out
# # A tibble: 60 x 4
#    Serial_ID measurement names values
#        <int> <chr>       <chr>  <dbl>
#  1         1 pre         EDV     76.2
#  2         1 pre         ESV     32.9
#  3         1 pre         LVEF    56.8
#  4         1 post        EDV     86.3
#  5         1 post        ESV     36.6
#  6         1 post        LVEF    57.6
#  7         2 pre         EDV     65.4
#  8         2 pre         ESV     35.9
#  9         2 pre         LVEF    45.1
# 10         2 post        EDV     60.1
# # ... with 50 more rows

Your code snippet passed the object "data" instead of "data1" into the pipe, which produced an error:

"Error: No tidyselect variables were registered".
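
A likely reason the message is a tidyselect error rather than "object not found" is that `data` is already the name of a function in R (utils::data), so the pipe receives a function instead of a data frame. The sketch below checks this with base R only; `check_input` is a hypothetical helper introduced here for illustration, not something from tidyr or the original answer.

```r
# `data` resolves to an existing function even if no data frame of that name exists.
data_is_function <- is.function(utils::data)
print(data_is_function)

# Hypothetical guard: fail early with a clear message if the input is not a data frame.
check_input <- function(x) {
  if (!is.data.frame(x)) {
    stop("Expected a data frame, got an object of class: ", class(x)[1])
  }
  invisible(x)
}

# Passing the function `data` by mistake is caught and returned as an error object.
result <- tryCatch(check_input(utils::data), error = function(e) e)
print(conditionMessage(result))
```

Naming the data frame something other than `data` (as with `data1` in the question) avoids this kind of silent mix-up.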



Source: https://stackoverflow.com/questions/61756213/reshape-horizontal-to-to-long-format-using-pivot-longer
