Read CSV as DataFrame in Spark 1.6

礼貌的吻别 2021-01-07 07:08

I have Spark 1.6 and am trying to read a CSV (or TSV) file as a DataFrame. Here are the steps I take:

scala>  val sqlContext= new org.apache.spark.sql.SQLCon         


        
3 Answers
  • 2021-01-07 07:54

    In Java, first add the dependency to your pom.xml file, then run the following code to read the CSV file.

    <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>spark-csv_2.10</artifactId>
        <version>1.4.0</version>
    </dependency>
    
    // Note: Dataset<Row> and SparkSession are Spark 2.x APIs; on Spark 1.6 use DataFrame with sqlContext.read().
    Dataset<Row> df = sparkSession.read()
            .format("com.databricks.spark.csv")
            .option("header", true)
            .option("inferSchema", true)
            .load("hdfs://localhost:9000/usr/local/hadoop_data/loan_100.csv");
    
  • 2021-01-07 08:01

    It looks like your function calls are not chained together properly, so "show()" is being called on the val df, which at that point is a DataFrameReader rather than a DataFrame. If I run the following, I can reproduce your error:

    val df = sqlContext.read
    df.show()
    

    If you restructure the code so that the whole chain, ending in load(), is assigned to df, it works:

    val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("data.csv")
    df.show()
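
Since the question also mentions TSV files, here is a minimal sketch of the same chained call with a tab delimiter. It assumes spark-shell was started with the spark-csv package (for example `--packages com.databricks:spark-csv_2.10:1.4.0`), so that `sqlContext` already exists. The temporary file and its sample rows are made up purely for illustration.

```scala
import java.nio.file.Files

// Write a tiny tab-separated file so the example is self-contained.
// The path and the rows are hypothetical sample data.
val tsvPath = Files.createTempFile("sample", ".tsv")
Files.write(tsvPath, "id\tname\n1\talice\n2\tbob\n".getBytes("UTF-8"))

// Trailing dots keep the chain as one expression when typed line by line
// in the REPL; everything up to load() is evaluated before df is assigned.
val df = sqlContext.read.
  format("com.databricks.spark.csv").
  option("header", "true").
  option("inferSchema", "true").
  option("delimiter", "\t").
  load(tsvPath.toUri.toString)

df.printSchema()
df.show()
```

The only change from the CSV version is the `delimiter` option; everything else, including the requirement that `load()` ends the chain, stays the same.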
    
  • 2021-01-07 08:01

    Use the following instead:

    import org.apache.spark.sql.SQLContext
    val sqlContext = new SQLContext(sc)
    

    It should resolve your issue.
