Cascading + libjars = ClassNotFoundException. Sometimes

Submitted by 别来无恙 on 2020-01-06 12:15:28

Question


I am running a Cascading (actually Scalding) Hadoop job that uses the DistributedCache for dependent jars.

The first time it works fine (meaning that the classpath is set up correctly), but then it starts failing with a ClassNotFoundException:

java.io.IOException: Split class cascading.tap.hadoop.io.MultiInputSplit not found
    at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:387)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: cascading.tap.hadoop.io.MultiInputSplit
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
    at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:385)
    ...

Has anybody else had success with Cascading and jars in the DistributedCache?

This message seems to imply that Cascading has some internal handling of the distributed cache jars. Any light you can shed on this?

Edit: I am using Cascading 2.1.6 on Hadoop 1.0.3


Answer 1:


Which version of Hadoop are you using? There are some problems with the distributed cache in 0.20.2. Can you try switching to a newer version?




Answer 2:


Chris K Wensel, the author of Cascading, responded on the mailing list that Cascading does not do anything with the DistributedCache.

I looked further, and it turned out to be a problem in my code: I was not adding these files to the DistributedCache properly.
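
For anyone hitting the same error, one way to check whether a dependency jar actually reached a given JVM's classpath is a small class-loading probe. The following is only a minimal sketch: `ClassProbe` and `locate` are hypothetical names, not part of Hadoop or Cascading, and it assumes you run it (or call `locate`) with the same classpath the failing task sees. It uses only the Java standard library.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Hadoop or Cascading): reports where a
// class is loaded from, or that it is missing from the current classpath.
class ClassProbe {

    /** Returns the location the class was loaded from, or null if it cannot be loaded. */
    static String locate(String className) {
        try {
            // Do not initialize the class; we only want to know whether it resolves.
            Class<?> cls = Class.forName(className, false, ClassProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader have no code source.
            return src == null ? "(bootstrap or unknown location)"
                               : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return null;
        } catch (LinkageError e) {
            // The class was found, but one of its own dependencies is missing or broken.
            return null;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace above.
        String name = args.length > 0 ? args[0] : "cascading.tap.hadoop.io.MultiInputSplit";
        String where = locate(name);
        if (where == null) {
            System.out.println(name + " NOT FOUND; java.class.path = "
                    + System.getProperty("java.class.path"));
        } else {
            System.out.println(name + " loaded from " + where);
        }
    }
}
```

If the probe finds the class on the submitting machine but not inside the task, the jar was most likely not shipped to, or not put on the classpath of, the task JVM, which is consistent with the files not having been added to the DistributedCache correctly.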



Source: https://stackoverflow.com/questions/17861614/cascading-libjars-classnotfoundexception-sometimes
