ERROR Error cleaning broadcast Exception [duplicate]

痞子三分冷 提交于 2019-12-11 07:36:10

问题


I get the following error while running my spark streaming application, we have a large application running multiple stateful (with mapWithState) and stateless operations. It's getting difficult to isolate the error since spark itself hangs and the only error we see is in the spark log and not the application log itself.

The error happens only after abount 4-5 mins with a micro-batch interval of 10 seconds. I am using Spark 1.6.1 on an ubuntu server with Kafka based input and output streams.

Please note it's not possible for me to provide the smallest possible code to re-create this bug as it does not occur in unit test-cases, and the application itself is very large

Any direction you can give to solve this issue will be helpful. Please let me know if I can provide any more information.

Error inline below:

[2017-07-11 16:15:15,338] ERROR Error cleaning broadcast 2211 (org.apache.spark.ContextCleaner)

org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout

        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)

        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)

        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)

        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)

        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)

        at org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:136)

        at org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:228)

        at org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)

        at org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:77)

        at org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:233)

        at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:189)

        at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:180)

        at scala.Option.foreach(Option.scala:236)

        at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:180)

        at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1180)

        at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:173)

        at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:68)

    Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]

        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)

        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)

        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)

        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)

        at scala.concurrent.Await$.result(package.scala:107)

        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)

回答1:


Your exception message clearly says that its RPCTimeout due to default configuration of 120 seconds and adjust to optimal value as per your work load. please see 1.6 configuration

your error messages org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. and at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76) confirms that.


For Better understanding please see the below code from

see RpcTimeout.scala

     /**
   * Wait for the completed result and return it. If the result is not available within this
   * timeout, throw a [[RpcTimeoutException]] to indicate which configuration controls the timeout.
   * @param  awaitable  the `Awaitable` to be awaited
   * @throws RpcTimeoutException if after waiting for the specified time `awaitable`
   *         is still not ready
   */
  def awaitResult[T](awaitable: Awaitable[T]): T = {
    try {
      Await.result(awaitable, duration)
    } catch addMessageIfTimeout
  }
}
  • Also see my answer in another context


来源:https://stackoverflow.com/questions/45171175/error-error-cleaning-broadcast-exception

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!