Reading a TSV into a Spark DataFrame with the Scala API


I have been trying to get the Databricks library for reading CSVs to work. I am trying to read a TSV created by Hive into a Spark DataFrame using the Scala API.


2 Answers
  • 2020-12-03 10:26

    All of the option parameters are passed to the option() function, as shown below:

    // Read a tab-delimited file with the spark-csv package
    val segments = sqlContext.read.format("com.databricks.spark.csv")
        .option("delimiter", "\t")   // use tab as the field delimiter
        .load("s3n://michaeldiscenza/data/test_segments")
    
  • 2020-12-03 10:26

    With Spark 2.0+, use the built-in CSV connector to avoid the third-party dependency and get better performance:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder.getOrCreate()
    val segments = spark.read.option("sep", "\t").csv("/path/to/file")
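
    If you would rather avoid the cost of inference, you can pass an explicit schema to the reader. A minimal sketch with hypothetical column names (the id/name fields are only for illustration; adjust them to the actual TSV layout):

    import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

    // Hypothetical schema; replace with the real column names and types.
    val schema = StructType(Seq(
        StructField("id", IntegerType, nullable = true),
        StructField("name", StringType, nullable = true)
    ))

    val typedSegments = spark.read
        .schema(schema)            // skip inference by supplying the schema up front
        .option("sep", "\t")
        .csv("/path/to/file")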
    