I would like to display the entire Apache Spark SQL DataFrame with the Scala API. I can use the show() method:

myDataFrame.show(Int.MaxValue)

Is there a better way to display the entire DataFrame than passing Int.MaxValue?
I've tried show(), and it seems to work in some cases but not others; give it a try. Note that show() prints the table itself and returns Unit, so there is no need to wrap it in println():

df.show()
One way is to use the count() function to get the total number of records and pass that to show(). Since count() returns a Long and show() takes an Int, an explicit conversion is needed:

df.show(df.count().toInt)
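A slightly fuller sketch of the same idea, assuming the DataFrame is called df and its row count fits in an Int (show() takes an Int, so a larger count would overflow):

val total = df.count().toInt
df.show(total, truncate = false)  // also disable the default truncation of long values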
In Java, I have tried it in two ways, and both work perfectly for me:

1. Limit the number of rows to print:

data.show(someNo);

2. Iterate over the rows and print each one (requires org.apache.spark.sql.Row and org.apache.spark.api.java.function.ForeachFunction):

data.foreach(new ForeachFunction<Row>() {
    public void call(Row row) throws Exception {
        System.out.println(row);
    }
});
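For comparison, a minimal sketch of the same per-row approach in the Scala API, assuming a DataFrame df; on a cluster, the println output lands in the executor logs rather than the driver console:

df.foreach { row =>
  println(row)
}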
Try:

df.show(35, false)

It will display 35 rows with full column values, since the second argument disables the default truncation of long values.
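For reference, a short sketch of the related show() overloads in the Scala API:

df.show()            // first 20 rows, values truncated to 20 characters
df.show(false)       // first 20 rows, full values
df.show(100, false)  // first 100 rows, full values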
As others have suggested, printing out the entire DataFrame is a bad idea. However, you can use df.rdd.foreachPartition(f) to print the data partition by partition without flooding the driver JVM the way collect does.
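A minimal sketch of that approach; note that when running on a cluster the println output goes to the executors' stdout, not the driver console:

df.rdd.foreachPartition { partition =>
  partition.foreach(row => println(row))
}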
Nothing is more succinct than that, but if you want to avoid Int.MaxValue, then you could use collect and process the result, or foreach. But for a tabular format without much manual code, show is the best you can do.
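A minimal sketch of the collect-and-process alternative; collect pulls every row to the driver, so this is only safe when the result fits in driver memory:

df.collect().foreach(println)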