How to create a DataFrame from a text file in Spark

I have a text file on HDFS and I want to convert it to a DataFrame in Spark.

I am using the Spark Context to load the file and then try to generate individual columns f

8 Answers

不知归路

2021-01-31 19:52
A pipe-delimited (|) text file can be read as:
```
df = spark.read.option("sep", "|").option("header", "true").csv("s3://bucket_name/folder_path/file_name.txt")
```
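As a rough sketch, the same read can be wrapped in a small helper that takes the session and the path as parameters, so it works for an HDFS path as in the question. The name `read_delimited` is hypothetical, and the `inferSchema` option is an addition not in the answer above: it asks Spark to guess column types instead of reading every column as a string.

```python
def read_delimited(spark, path, sep="|", header=True):
    """Read a delimited text file into a DataFrame.

    `spark` is an existing SparkSession; `path` can be a local, HDFS, or S3 path.
    """
    return (
        spark.read
        .option("sep", sep)
        # Passed as the strings "true"/"false", as in the answer above
        .option("header", str(header).lower())
        # Assumption: let Spark guess column types (costs an extra pass over the data)
        .option("inferSchema", "true")
        .csv(path)
    )
```

Usage would look like `df = read_delimited(spark, path)` followed by `df.printSchema()` to check what was inferred.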
情话喂你

2021-01-31 19:53
If you want to use the toDF method, you have to convert your RDD of Array[String] into an RDD of a case class. For example:
```
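// Outside spark-shell, toDF() needs the implicits (sqlContext.implicits._ on Spark 1.x)
import spark.implicits._
case class Test(id: String, field2: String)
val myFile = sc.textFile("file.txt")
// Split each line on ";" and map the first two fields into the case class
val df = myFile.map(_.split(";")).map(x => Test(x(0), x(1))).toDF()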
```
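For comparison, here is a minimal PySpark sketch of the same RDD-based route. Python has no case classes, so each line is mapped to a tuple and the column names are passed to `createDataFrame` instead. The function names `parse_line` and `text_file_to_df` are hypothetical, and the sketch assumes every line has at least two `;`-separated fields.

```python
def parse_line(line, sep=";"):
    """Split one line and keep the first two fields as (id, field2)."""
    parts = line.split(sep)
    return (parts[0], parts[1])


def text_file_to_df(spark, path, sep=";"):
    """Load a text file as an RDD of lines, then build a DataFrame from it.

    `spark` is an existing SparkSession.
    """
    lines = spark.sparkContext.textFile(path)
    rows = lines.map(lambda line: parse_line(line, sep))
    # Column names are given explicitly; all values stay strings
    return spark.createDataFrame(rows, ["id", "field2"])
```

Lines with fewer than two fields would raise an `IndexError` inside the Spark job, so real input may need a filter before the map.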