问题
I'm using Solr 6 and I'm trying to populate it. Here's the main scala I put in place :
object testChildDocToSolr {
def main(args: Array[String]): Unit = {
setProperty("hadoop.home.dir", "c:\\winutils\\")
val sparkSession = SparkSession.builder()
.appName("spark-solr-tester")
.master("local")
.config("spark.ui.enabled", "false")
.config("spark.default.parallelism", "1")
.getOrCreate()
val sc = sparkSession.sparkContext
val collectionName = "testChildDocument"
val testDf = sparkSession.read.json("./child_documents.json")
testDf.printSchema()
testDf.show()
val zkHost = "localhost:8983"
val solrOpts = Map(
"zkhost" -> zkHost,
"collection" -> collectionName,
"gen_uniq_key" -> "true",
"gen_uniq_child_key" -> "true",
"child_doc_fieldname" -> "tags"
)
testDf.write.format("solr").options(solrOpts).mode(Overwrite).save()
// Explicit commit to make sure all docs are visible
val solrCloudClient = SolrSupport.getCachedCloudClient(zkHost)
solrCloudClient.commit(collectionName, true, true)
val solrDf = sparkSession.read.format("solr").options(solrOpts).load()
solrDf.show()
sc.stop()
}
}
I'm getting the error :
Exception in thread "main" com.google.common.util.concurrent.UncheckedExecutionException: org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper localhost:8983 within 10000 ms
It seems that I can't connect to ZooKeeper and I don't know why ...
Here's my full log :
[info] Loading project definition from C:\Users\ebelloei\Documents\Ice\DocToSolr\project
[info] Loading settings from build.sbt ...
[info] Set current project to DocToSolr (in build file:/C:/Users/ebelloei/Documents/Ice/DocToSolr/)
[info] Compiling 1 Scala source to C:\Users\ebelloei\Documents\Ice\DocToSolr\target\scala-2.11\classes ...
[info] Done compiling.
[info] Packaging C:\Users\ebelloei\Documents\Ice\DocToSolr\target\scala-2.11\doctosolr_2.11-0.1.jar ...
[info] Done packaging.
[info] Running (fork) Example.testChildDocToSolr
[info] 2017-12-20 15:09:10,436 [main] WARN NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[info] 2017-12-20 15:09:11,152 [main] INFO log - Logging initialized @2334ms
[info] root
[info] |-- dates: array (nullable = true)
[info] | |-- element: string (containsNull = true)
[info] |-- status: string (nullable = true)
[info] |-- tags: array (nullable = true)
[info] | |-- element: struct (containsNull = true)
[info] | | |-- bar: string (nullable = true)
[info] | | |-- foo: long (nullable = true)
[info] | | |-- parent: string (nullable = true)
[info] |-- user: string (nullable = true)
[info] +--------------------+------+--------------------+----+
[info] | dates|status| tags|user|
[info] +--------------------+------+--------------------+----+
[info] |[2017-05-02, 2017...| OK|[[val1,123,a], [v...| a|
[info] |[2017-04-29, 2017...| OK|[[val1,789,b], [v...| b|
[info] +--------------------+------+--------------------+----+
[error] Exception in thread "main" com.google.common.util.concurrent.UncheckedExecutionException: org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper localhost:8983 within 10000 ms
[error] at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2199)
[error] at com.google.common.cache.LocalCache.get(LocalCache.java:3932)
[error] at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3936)
[error] at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4806)
[error] at com.lucidworks.spark.util.SolrSupport$.getCachedCloudClient(SolrSupport.scala:190)
[error] at com.lucidworks.spark.util.SolrSupport$.getSolrBaseUrl(SolrSupport.scala:194)
[error] at com.lucidworks.spark.SolrRelation.insert(SolrRelation.scala:671)
[error] at solr.DefaultSource.createRelation(DefaultSource.scala:27)
[error] at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:472)
[error] at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
[error] at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
[error] at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
[error] at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
[error] at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
[error] at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
[error] at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
[error] at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
[error] at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
[error] at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
[error] at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
[error] at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
[error] at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:610)
[error] at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
[error] at Example.testChildDocToSolr$.main(Main.scala:39)
[error] at Example.testChildDocToSolr.main(Main.scala)
[error] Caused by: org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper localhost:8983/solr within 10000 ms
[error] at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:183)
[error] at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:117)
[error] at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:107)
[error] at org.apache.solr.common.cloud.ZkStateReader.<init>(ZkStateReader.java:226)
[error] at org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.connect(ZkClientClusterStateProvider.java:131)
[error] at org.apache.solr.client.solrj.impl.CloudSolrClient.connect(CloudSolrClient.java:631)
[error] at com.lucidworks.spark.util.SolrSupport$.getSolrCloudClient(SolrSupport.scala:168)
[error] at com.lucidworks.spark.util.SolrSupport$.getNewSolrCloudClient(SolrSupport.scala:186)
[error] at com.lucidworks.spark.util.CacheCloudSolrClient$$anon$1.load(SolrSupport.scala:37)
[error] at com.lucidworks.spark.util.CacheCloudSolrClient$$anon$1.load(SolrSupport.scala:35)
[error] at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522)
[error] at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315)
[error] at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278)
[error] at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193)
[error] ... 24 more
[error] Caused by: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper localhost:8983/solr within 10000 ms
[error] at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:233)
[error] at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:175)
[error] ... 37 more
[error] java.lang.RuntimeException: Nonzero exit code returned from runner: 1
[error] at sbt.ForkRun.processExitCode$1(Run.scala:29)
[error] at sbt.ForkRun.run(Run.scala:38)
[error] at sbt.Defaults$.$anonfun$bgRunTask$5(Defaults.scala:1155)
[error] at sbt.Defaults$.$anonfun$bgRunTask$5$adapted(Defaults.scala:1150)
[error] at sbt.internal.BackgroundThreadPool.$anonfun$run$1(DefaultBackgroundJobService.scala:359)
[error] at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
[error] at scala.util.Try$.apply(Try.scala:209)
[error] at sbt.internal.BackgroundThreadPool$BackgroundRunnable.run(DefaultBackgroundJobService.scala:282)
[error] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[error] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[error] at java.lang.Thread.run(Thread.java:748)
[error] (compile:run) Nonzero exit code returned from runner: 1
[error] Total time: 26 s, completed 20 déc. 2017 15:09:25
1. Waiting for source changes... (press enter to interrupt)
And my Zoo.cfg :
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# dataDir=/opt/zookeeper/data
# NOTE: Solr defaults the dataDir to <solrHome>/zoo_data
# the port at which the clients will connect
# clientPort=8983
# NOTE: Solr sets this based on zkRun / zkHost params
I'm really new to this and don't know if I'm missing something evident because I can't find someone with the same issue.
回答1:
You're giving the Solr host and port as the zookeeper connection details. Zookeeper and Solr runs on separate ports as they are separate daemons. If you're running zookeeper externally from Solr (i.e. a dedicated Zookeeper installation), the default port is 2181
(so localhost:2181
). If you're using the one embedded in Solr, the port is Solr's port + 1000, usually 9983
(and thus, localhost:9983
).
来源:https://stackoverflow.com/questions/47908373/could-not-connect-to-zookeeper-using-solr-in-localhost