Question
I am using spark-csv to load data into a DataFrame. I want to run a simple query and display the contents:
val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("my.csv")
df.registerTempTable("tasks")
val results = sqlContext.sql("select col from tasks")
results.show()
The column appears to be truncated:
scala> results.show();
+--------------------+
| col|
+--------------------+
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:15:...|
|2015-11-06 07:15:...|
|2015-11-16 07:15:...|
|2015-11-16 07:21:...|
|2015-11-16 07:21:...|
|2015-11-16 07:21:...|
+--------------------+
How do I show the full content of the column?
Answer 1:
results.show(20, false)
will not truncate the column contents. Check the source of show if you are curious.
Answer 2:
If you call results.show(false), the results will not be truncated.
Answer 3:
The other solutions are good. If these are your goals:
- no truncation of columns,
- no loss of rows,
- fast, and
- efficient,
then these two lines are useful:
df.persist
df.show(df.count.toInt, false) // Scala; in Python use df.show(df.count(), False)
By persisting, the two executor actions, count and show, are faster and more efficient, because persist (or cache) keeps the interim underlying DataFrame structure within the executors. See the Spark documentation for more about persist and cache.
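Putting the pieces together, a minimal sketch in Scala (assuming the spark-csv setup from the question; note that in Scala count returns a Long while show expects an Int, hence the toInt):

```scala
// Cache the DataFrame so the two actions below (count and show)
// do not each re-read and re-parse the CSV from disk.
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .load("my.csv")

df.persist()                      // or df.cache()
df.show(df.count().toInt, false)  // all rows, no column truncation
df.unpersist()                    // free the cached blocks when done
```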
Answer 4:
The code below shows every row without truncating any column:
df.show(df.count(), False)
Answer 5:
results.show(20, False)
or results.show(20, false)
depending on whether you are running it in Python or in Java/Scala.
Answer 6:
results.show(false)
will show you the full column content.
By default, show displays at most 20 rows; passing a row count as the first argument (before false) displays more rows.
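To summarize the variants in the answers above, show in Scala takes an optional row count and an optional truncate flag (a sketch; numRows defaults to 20 and truncate defaults to true):

```scala
results.show()          // 20 rows, long values truncated to 20 characters
results.show(false)     // 20 rows, full column content
results.show(50)        // 50 rows, values still truncated
results.show(50, false) // 50 rows, full column content
```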
Answer 7:
Try this command, which prints every row (add False as a second argument if you also need untruncated columns):
df.show(df.count())
Answer 8:
Within Databricks you can visualize the DataFrame in a tabular format. With the command:
display(results)
it is rendered as an interactive, scrollable table.
Answer 9:
results.show(20, false)
did the trick for me in Scala.
Answer 10:
I use this Chrome extension, which works pretty well for widening Jupyter notebook output:
https://userstyles.org/styles/157357/jupyter-notebook-wide
Source: https://stackoverflow.com/questions/33742895/how-to-show-full-column-content-in-a-spark-dataframe