How do I read a Parquet file in R and convert it to an R data frame?

北荒 2020-12-28 13:04

I'd like to process Apache Parquet files (in my case, generated in Spark) in the R programming language.

Is an R reader available? Or is work being done on one?

9 answers
  •  隐瞒了意图╮
    2020-12-28 13:26

    As an alternative to SparkR, you can now use sparklyr:

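    # install.packages("sparklyr")
    library(sparklyr)

    # Connect to a local Spark instance
    sc <- spark_connect(master = "local")

    # Register the Parquet directory as a Spark table and get a handle to it
    spark_tbl_handle <- spark_read_parquet(sc, "tbl_name_in_spark", "/path/to/parquetdir")

    # Pull the Spark table into R as a regular in-memory data frame
    regular_df <- collect(spark_tbl_handle)

    # Close the Spark connection when finished
    spark_disconnect(sc)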
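
If running Spark just to read the file is more than you need, the arrow package might be worth a look, since it can read Parquet directly into R. The following is only a minimal sketch, assuming arrow is installed; the temporary file and its contents are made up for illustration and are not from the question.

```r
# Assumption: the arrow package is installed, e.g. install.packages("arrow")
library(arrow)

# Hypothetical example file: write a tiny Parquet file to a temporary path
parquet_path <- file.path(tempdir(), "example.parquet")
write_parquet(data.frame(id = 1:3, label = c("a", "b", "c")), parquet_path)

# read_parquet() loads a single Parquet file into an R data frame
regular_df <- read_parquet(parquet_path)
print(regular_df)

# Spark usually writes a directory of part files rather than one file.
# Assuming dplyr is also installed, a directory could be read like this:
# regular_df <- dplyr::collect(open_dataset("/path/to/parquetdir"))
```

Like `collect()` above, this loads everything into memory, so it suits data that fits on one machine.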
    
