Loading com.databricks.spark.csv via RStudio

伪装坚强ぢ 2020-12-30 16:31

I have installed Spark 1.4.0 along with its R package, SparkR, and I am able to use it via spark-shell and via RStudio. However, there is one difference I can not …

4 Answers
  •  生来不讨喜
    2020-12-30 17:17

    This is the right syntax (found after hours of trying). Pay particular attention to the first line: each argument inside `SPARKR_SUBMIT_ARGS` must be wrapped in its own double quotes.

    Sys.setenv('SPARKR_SUBMIT_ARGS'='"--packages" "com.databricks:spark-csv_2.10:1.0.3" "sparkr-shell"')
    
    library(SparkR)
    library(magrittr)
    
    # Initialize SparkContext and SQLContext
    sc <- sparkR.init(appName="SparkR-Flights-example")
    sqlContext <- sparkRSQL.init(sc)
    
    
    # The SparkSQL context should already be created for you as sqlContext
    sqlContext
    # Java ref type org.apache.spark.sql.SQLContext id 1
    
    # Load the flights CSV file using `read.df`. Note that we use the CSV reader Spark package here.
    flights <- read.df(sqlContext, "nycflights13.csv", "com.databricks.spark.csv", header="true")
    
