How to create a DataFrame from a text file in Spark

滥情空心 2021-01-31 19:03

I have a text file on HDFS and I want to convert it to a DataFrame in Spark.

I am using the Spark Context to load the file and then try to generate individual columns f

8 answers
  • 2021-01-31 19:52

    A pipe-delimited (|) text file can be read as:


    df = spark.read.option("sep", "|").option("header", "true").csv("s3://bucket_name/folder_path/file_name.txt")
    
  • 2021-01-31 19:53

    If you want to use the toDF method, you have to convert your RDD of Array[String] into an RDD of a case class. For example:

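    // Assumes a SparkSession named `spark`; this import enables toDF() outside spark-shell.
    import spark.implicits._

    case class Test(id: String, field2: String)
    val myFile = sc.textFile("file.txt")
    val df = myFile.map(x => x.split(";")).map(x => Test(x(0), x(1))).toDF()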
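For comparison with the PySpark snippet in the other answer, here is a rough sketch of the same RDD route in Python. It is only an illustration: the names `COLUMNS`, `parse_line`, and `text_file_to_df` are made up for this example, the two column names simply mirror the `Test` case class, and an existing SparkSession is assumed to be passed in as `spark`.

```python
from typing import Optional, Tuple

# Assumed column names, mirroring the fields of the Test case class above.
COLUMNS = ["id", "field2"]


def parse_line(line: str, sep: str = ";") -> Optional[Tuple[str, ...]]:
    """Split one line on `sep`; return None unless it has exactly len(COLUMNS) fields."""
    parts = line.split(sep)
    if len(parts) != len(COLUMNS):
        return None
    return tuple(parts)


def text_file_to_df(spark, path: str, sep: str = ";"):
    """Load a delimited text file through the RDD API and return an all-string DataFrame."""
    # Imported here so parse_line can be used and tested without pyspark installed.
    from pyspark.sql.types import StringType, StructField, StructType

    schema = StructType([StructField(name, StringType(), True) for name in COLUMNS])
    rows = (
        spark.sparkContext.textFile(path)
        .map(lambda line: parse_line(line, sep))
        .filter(lambda row: row is not None)  # drop lines with the wrong field count
    )
    return spark.createDataFrame(rows, schema)
```

With a session available, `df = text_file_to_df(spark, "file.txt")` would give a DataFrame with string columns `id` and `field2`. Unlike the Scala one-liner, which throws on a line with fewer than two fields, this sketch silently skips malformed lines; whether that is desirable depends on the data.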
    