Question
I'm trying to run the simple example provided in the README of spark-xml, but the code won't run:
import org.apache.spark.sql.SQLContext
val sqlContext = new SQLContext(sc)
val df = sqlContext.read
.format("com.databricks.spark.xml")
.option("rowTag", "book")
.load("books.xml")
(copy-pasted from the README; books.xml is indeed in the local directory)
This gives me the error:
Name: Compile Error
Message: <console>:1: error: illegal start of definition
.format("com.databricks.spark.xml")
^
StackTrace:
I'm running this from a Jupyter notebook with a Spark/Scala kernel.
I'm sure there's a simple mistake, but I'm brand new to Scala/Spark.
Version info:
- Spark: 2.0.1
- Scala: 2.11.8
Answer 1:
You can add packages to Spark using the --packages command-line option.
As per the comments on the question, try running the code on a single line; that resolves the "error: illegal start of definition":
val df = spark.read.format("com.databricks.spark.xml").option("rowTag", "book").load("books.xml")
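If you would rather keep the chained calls on separate lines, a minimal sketch (assuming the Spark 2.x spark session used in the answer's code) is to wrap the whole expression in parentheses so the REPL or notebook cell parses it as a single definition:

// Wrapping the chain in parentheses makes it one expression, so the REPL
// does not treat the leading ".format" line as the start of a new definition.
val df = (spark.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "book")
  .load("books.xml"))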
Next, if you get "Failed to find data source: com.databricks.spark.xml", add the package com.databricks:spark-xml_2.11:0.4.1:
spark-shell --packages com.databricks:spark-xml_2.11:0.4.1
val df = spark.read.format("com.databricks.spark.xml").option("rowTag", "book").load("books.xml")
df.show
+-----+--------------------+--------------------+---------------+-----+------------+--------------------+
| _id| author| description| genre|price|publish_date| title|
+-----+--------------------+--------------------+---------------+-----+------------+--------------------+
|bk101|Gambardella, Matthew|An in-depth look ...| Computer|44.95| 2000-10-01|XML Developer's G...|
|bk102| Ralls, Kim|A former architec...| Fantasy| 5.95| 2000-12-16| Midnight Rain|
|bk103| Corets, Eva|After the collaps...| Fantasy| 5.95| 2000-11-17| Maeve Ascendant|
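If you are launching a Jupyter kernel rather than spark-shell, the --packages flag may not be directly available. One possible alternative is a sketch only: it relies on the standard spark.jars.packages configuration, it only takes effect if no SparkContext/SparkSession has been created yet, and the app name below is a hypothetical placeholder.

// Sketch: ask Spark to resolve spark-xml from Maven via spark.jars.packages.
// This configuration is honored only when the session is first created, so it
// will not help if the kernel has already started a SparkContext.
import org.apache.spark.sql.SparkSession

val spark = (SparkSession.builder()
  .appName("spark-xml-example") // hypothetical app name
  .config("spark.jars.packages", "com.databricks:spark-xml_2.11:0.4.1")
  .getOrCreate())

val df = spark.read.format("com.databricks.spark.xml").option("rowTag", "book").load("books.xml")
df.show()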
Source: https://stackoverflow.com/questions/45291259/error-running-introductory-example-of-scala-xml