Split a huge dataframe into many smaller dataframes to create a corpus in R

余生分开走 2021-01-23 23:18

I need to create a corpus from a huge dataframe (about 170,000 rows, but only two columns) to mine some text and group by username according to the search terms. For example

1 answer

清酒与你 (OP)

2021-01-24 00:08
We can split the dataset ('df1') into a list, with one element per username:
```
lst <- split(df1, df1$username)
```
Usually, it is better to stop here and do all the calculations/analysis within the list itself. But if we want to create 1000s of objects in the global environment, one way is to use list2env after naming the list elements with the object names we want.
```
list2env(setNames(lst, paste0('DataFrame', 
                 seq_along(lst))), envir = .GlobalEnv)

DataFrame1
DataFrame2 
```
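For comparison, staying within the list (the option recommended above) could look roughly like the sketch below. The toy `df1`, its `text` column, and the names `rows_per_user` and `docs` are assumptions made for illustration; the original post does not show the real column names.

```r
# Toy stand-in for 'df1'; the 'text' column name and its values are assumptions.
df1 <- data.frame(
  username = c("ann", "bob", "ann", "cat", "bob"),
  text     = c("first post", "hello there", "second post", "only post", "bye"),
  stringsAsFactors = FALSE
)

# One data frame per username, held in a named list.
lst <- split(df1, df1$username)

# Per-user row counts, computed without leaving the list.
rows_per_user <- sapply(lst, nrow)

# Collapse each user's rows into a single string, one "document" per user.
docs <- vapply(lst, function(d) paste(d$text, collapse = " "), character(1))
```

Here `docs` is a named character vector with one entry per username, which may be a convenient starting point for building a corpus without creating thousands of global objects.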
Another way of keeping the data would be to nest it
```
library(dplyr)
library(tidyr)
df1 %>% 
     nest(data = -username)   # tidyr >= 1.0.0 syntax; nest(-username) is deprecated
```
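As a rough sketch of how the nested form might be used afterwards, the example below works on a toy `df1`. The `text` column and the names `nested`, `n_rows`, and `restored` are assumptions for illustration, and the `nest(data = ...)` form assumes tidyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for 'df1'; the 'text' column and its values are assumptions.
df1 <- data.frame(
  username = c("ann", "bob", "ann"),
  text     = c("first post", "hello there", "second post"),
  stringsAsFactors = FALSE
)

# One row per username; the remaining columns go into the list-column 'data'.
nested <- df1 %>%
  nest(data = -username)

# Each element of 'data' is a tibble holding that user's rows.
nested$data[[1]]

# Summarise each nested tibble without unnesting it.
nested <- nested %>%
  mutate(n_rows = sapply(data, nrow))

# unnest() restores the original long layout.
restored <- nested %>%
  unnest(data)
```

The nested layout keeps everything in a single object, so per-user steps can be written as ordinary `mutate()` calls over the list-column.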