Split a huge dataframe into many smaller dataframes to create a corpus in R

余生分开走 2021-01-23 23:18

I need to create a corpus from a huge dataframe (about 170,000 rows, but only two columns) to mine some text and group by username according to the search terms. For example

1 answer
  •  清酒与你
    2021-01-24 00:08

    We can split the dataset ('df1') into a list of data frames, one per username

    lst <- split(df1, df1$username)
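
To make the result of `split` concrete, here is a minimal sketch on made-up data. The sample `df1` and its `text` column name are assumptions for illustration (only `username` appears in the original code), and the per-user summaries are just one possible way to work directly on the list.

```r
# Hypothetical sample data standing in for the real ~170,000-row data frame
df1 <- data.frame(
  username = c("alice", "bob", "alice", "carol"),
  text = c("first post", "hello", "second post", "hi there"),
  stringsAsFactors = FALSE
)

# One data frame per username, held in a named list
lst <- split(df1, df1$username)

# Work on every element without leaving the list:
# count the rows per user and collapse each user's text into one document
posts_per_user <- sapply(lst, nrow)
docs <- vapply(lst, function(d) paste(d$text, collapse = " "), character(1))
```

The list elements are named after the usernames, so a single user's rows can be pulled out with `lst[["alice"]]`, and `docs` is a named character vector that could serve as input to a corpus constructor.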
    

    Usually, it is better to stop here and do all the calculations/analysis within the list itself. But if we want to create 1000s of objects in the global environment, one way is to use list2env after naming the list elements with the object names we want.

    # name the list elements DataFrame1, DataFrame2, ... and copy them into the global environment
    list2env(setNames(lst, paste0('DataFrame', 
                     seq_along(lst))), envir=.GlobalEnv)
    
    DataFrame1
    DataFrame2 
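
If thousands of objects in the global environment feel too cluttered, a possible variation is to point `list2env` at a dedicated environment instead. This is only a sketch under assumptions: the sample `df1`, its `text` column, and the names `named` and `env` are hypothetical and not part of the original answer.

```r
# Hypothetical sample data
df1 <- data.frame(
  username = c("alice", "bob", "alice"),
  text = c("first post", "hello", "second post"),
  stringsAsFactors = FALSE
)

lst <- split(df1, df1$username)

# Rename the elements DataFrame1, DataFrame2, ... in list order
named <- setNames(lst, paste0("DataFrame", seq_along(lst)))

# Copy the elements into a fresh environment rather than .GlobalEnv;
# list2env returns the environment it filled
env <- list2env(named, envir = new.env())

ls(env)          # names of the objects that were created
env$DataFrame1   # rows belonging to the first username in the list
```

Swapping `new.env()` for `.GlobalEnv` gives the behaviour described above; the separate environment just keeps the generated objects in one place.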
    

    Another way of keeping the data would be to nest it

    library(dplyr)
    library(tidyr)
    # tidyr >= 1.0.0 expects the nested column to be named explicitly
    df1 %>% 
         nest(data = -username)
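
As a rough illustration of what the nested form looks like, the sketch below nests a made-up `df1` and adds a per-user row count. The sample data, the `text` column, and the `n_posts` column are assumptions; the call uses the named form `nest(data = -username)` that tidyr 1.0.0 and later expect.

```r
library(dplyr)
library(tidyr)

# Hypothetical sample data
df1 <- data.frame(
  username = c("alice", "bob", "alice", "carol"),
  text = c("first post", "hello", "second post", "hi there"),
  stringsAsFactors = FALSE
)

# One row per username; the remaining columns are packed into a list-column
nested <- df1 %>%
  nest(data = -username) %>%
  mutate(n_posts = vapply(data, nrow, integer(1)))

# Each element of the list-column is itself a small data frame
alice_rows <- nested$data[[which(nested$username == "alice")]]
```

Everything stays in a single object here, so per-user processing can be written as further `mutate` steps over the `data` column.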
    
