object databricks is not a member of package com

问题

I am trying to use Stanford NLP library in Spark2 using Zeppelin (HDP 2.6). Apparently there is wrapper built by Databricks for the Stanford NLP library for Spark. Link: https://github.com/databricks/spark-corenlp

I have downloaded the jar for the above wrapper from here and also downloaded Stanford NLP jars from here. Then I have added both sets of jars as dependencies in Spark2 interpreter settings of Zeppelin and restarted the interpreter.

Still the below sample program gives the error "object databricks is not a member of package com import com.databricks.spark.corenlp.functions._"

import org.apache.spark.sql.functions._
import com.databricks.spark.corenlp.functions._

import sqlContext.implicits._

val input = Seq(
  (1, "<xml>Stanford University is located in California. It is a great university.</xml>")
).toDF("id", "text")

val output = input
  .select(cleanxml('text).as('doc))
  .select(explode(ssplit('doc)).as('sen))
  .select('sen, tokenize('sen).as('words), ner('sen).as('nerTags), sentiment('sen).as('sentiment))

output.show(truncate = false)

回答1:

The problem was related to downloading the jar file for Databricks corenlp. I downloaded it from this location. Problem solved.

来源：https://stackoverflow.com/questions/49588799/object-databricks-is-not-a-member-of-package-com

标签

apache-spark

stanford-nlp

apache-zeppelin

databricks

易学教程内所有资源均来自网络或用户发布的内容，如有违反法律规定的内容欢迎反馈！
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!