How to show full column content in a Spark Dataframe?

Submitted by 旧巷老猫 on 2019-11-30 06:09:44

Question


I am using spark-csv to load data into a DataFrame. I want to do a simple query and display the content:

val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("my.csv")
df.registerTempTable("tasks")
val results = sqlContext.sql("select col from tasks")
results.show()

The column content appears truncated:

scala> results.show();
+--------------------+
|                 col|
+--------------------+
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-06 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:21:...|
|2015-11-16 07:21:...|
|2015-11-16 07:21:...|
+--------------------+

How do I show the full content of the column?


Answer 1:


results.show(20, false) will not truncate the output. See the source of Dataset.show.
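For reference, the truncation rule applied by show() can be sketched in a few lines of plain Python. This is an illustrative model (the function name truncate_cell is made up, not Spark API): cells longer than the truncate width are cut and suffixed with "...", which is why the question's timestamps appear as "2015-11-16 07:15:...".

```python
def truncate_cell(value: str, truncate: int = 20) -> str:
    """Sketch of show()'s per-cell truncation rule (illustrative, not Spark API)."""
    if truncate <= 0 or len(value) <= truncate:
        return value                      # truncation disabled, or value fits
    if truncate < 4:
        return value[:truncate]           # too narrow to fit an ellipsis
    return value[:truncate - 3] + "..."   # keep a prefix, append "..."

print(truncate_cell("2015-11-16 07:15:30.123"))  # → 2015-11-16 07:15:...
```

Passing false (Scala) or truncate=False (Python) corresponds to disabling this rule entirely.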




Answer 2:


If you call results.show(false), the results will not be truncated.




Answer 3:


The other solutions are good. If these are your goals:

  1. No truncation of columns,
  2. No loss of rows,
  3. Fast and
  4. Efficient

These two lines are useful ...

    df.persist
    df.show(df.count.toInt, false) // Scala; in Python use df.show(df.count(), False)

Persisting (or caching) keeps the interim DataFrame in executor memory, so the two actions, count and show, run faster and more efficiently. See the documentation on persist and cache for more.




Answer 4:


The code below displays all rows without truncating any column (Python):

df.show(df.count(), False)



Answer 5:


Use results.show(20, False) in Python or results.show(20, false) in Java/Scala.




Answer 6:


results.show(false) will show you the full column content.

The show method displays 20 rows by default; pass a row count before false to show more rows.
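The interaction between the row count and the truncate flag can be made concrete with a small pure-Python sketch (illustrative only; show_rows is a made-up name, not Spark's implementation): show(numRows, truncate) takes the first numRows rows and, when truncate is on, cuts long cells to 20 characters.

```python
def show_rows(rows, num_rows=20, truncate=True):
    """Sketch of show(numRows, truncate): limit rows, optionally cut long cells."""
    width = 20 if truncate else 0         # show()'s default cell width is 20
    out = []
    for row in rows[:num_rows]:           # only the first num_rows rows
        cell = str(row)
        if width and len(cell) > width:
            cell = cell[:width - 3] + "..."
        out.append(cell)
    return out

rows = [f"2015-11-16 07:15:{i:02d}.000" for i in range(30)]
print(len(show_rows(rows)))                            # 20 rows by default
print(show_rows(rows, num_rows=30, truncate=False)[0]) # full cell content
```

With truncate=False and a num_rows of df.count(), both limits are lifted, which is what the answers above combine.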




Answer 7:


Try this command (Python); pass truncate=False as well, otherwise the columns are still cut off:

df.show(df.count(), truncate=False)



Answer 8:


Within Databricks you can visualize the DataFrame in a tabular format with the command:

display(results)





Answer 9:


results.show(20, false) did the trick for me in Scala.




Answer 10:


I use this Chrome extension (a user style that widens Jupyter notebooks), and it works pretty well:

https://userstyles.org/styles/157357/jupyter-notebook-wide



Source: https://stackoverflow.com/questions/33742895/how-to-show-full-column-content-in-a-spark-dataframe
