I'd like to process Apache Parquet files (in my case, generated in Spark) in the R programming language.
Is an R reader available? Or is work being done on one?
As an alternative to SparkR, you can now use sparklyr:
# install.packages("sparklyr") library(sparklyr) sc <- spark_connect(master = "local") spark_tbl_handle <- spark_read_parquet(sc, "tbl_name_in_spark", "/path/to/parquetdir") regular_df <- collect(spark_tbl_handle) spark_disconnect(sc)