Spark SQL fails with java.lang.NoClassDefFoundError: org/codehaus/commons/compiler/UncheckedCompileException

一整个雨季 2021-02-05 10:24

Running a Spark SQL (v2.1.0_2.11) program in Java immediately fails with the following exception, as soon as the first action is called on a dataframe:

java.lang.NoClassDefFoundError: org/codehaus/commons/compiler/UncheckedCompileException

6 Answers
  • 2021-02-05 10:55

    The culprit is the commons-compiler library, which ends up on the classpath in a conflicting version.

    To work around this, add the following to your pom.xml:

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.codehaus.janino</groupId>
                <artifactId>commons-compiler</artifactId>
                <version>2.7.8</version>
            </dependency>
        </dependencies>
    </dependencyManagement>
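
    To confirm the conflict before pinning anything, `mvn dependency:tree -Dincludes=org.codehaus.janino` lists every requested version of the two artifacts. As a rough sketch, the build can also be made to fail on such conflicts with the maven-enforcer-plugin and its dependencyConvergence rule. The plugin version and the execution id below are assumptions, not part of the original answer, and the rule may flag unrelated conflicts in a large build:

    ```xml
    <!-- Goes inside <build><plugins> of pom.xml.
         The version is an assumption; use whatever your build standardizes on. -->
    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-enforcer-plugin</artifactId>
        <version>3.4.1</version>
        <executions>
            <execution>
                <!-- Hypothetical id, chosen for illustration -->
                <id>enforce-dependency-convergence</id>
                <goals>
                    <goal>enforce</goal>
                </goals>
                <configuration>
                    <rules>
                        <!-- Fails the build when the same artifact is requested at different versions -->
                        <dependencyConvergence/>
                    </rules>
                </configuration>
            </execution>
        </executions>
    </plugin>
    ```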
    

  • 2021-02-05 10:55

    I had a similar issue when upgrading from spark-2.2.1 to spark-2.3.0.

    In my case, I had to pin the versions of both commons-compiler and janino.

    Spark 2.3 solution:

    <dependencyManagement>
        <dependencies>
            <!--Spark java.lang.NoClassDefFoundError: org/codehaus/janino/InternalCompilerException-->
            <dependency>
                <groupId>org.codehaus.janino</groupId>
                <artifactId>commons-compiler</artifactId>
                <version>3.0.8</version>
            </dependency>
            <dependency>
                <groupId>org.codehaus.janino</groupId>
                <artifactId>janino</artifactId>
                <version>3.0.8</version>
            </dependency>
        </dependencies>
    </dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>commons-compiler</artifactId>
            <version>3.0.8</version>
        </dependency>
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>janino</artifactId>
            <version>3.0.8</version>
        </dependency>
    </dependencies>
    
  • 2021-02-05 10:55

    This error still arises with org.apache.spark:spark-sql_2.12:2.4.6, but the Janino version to use there is 3.0.16. With Gradle:

    implementation 'org.codehaus.janino:commons-compiler:3.0.16'
    implementation 'org.codehaus.janino:janino:3.0.16'
    
  • 2021-02-05 11:03

    My setup is Spring Boot + Scala + Spark (2.4.5).

    For this issue, the solution is to exclude the artifacts 'janino' and 'commons-compiler' from the 'spark-sql_2.12' version 2.4.5 dependency.
    The reason is that version 3.1.2 of both 'janino' and 'commons-compiler' gets resolved together with 'spark-sql_2.12' version 2.4.5.

    After excluding them, add version 3.0.8 of both 'janino' and 'commons-compiler' as separate dependencies.

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.12</artifactId>
            <version>2.4.5</version>
            <exclusions>
                <exclusion>
                    <artifactId>janino</artifactId>
                    <groupId>org.codehaus.janino</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>commons-compiler</artifactId>
                    <groupId>org.codehaus.janino</groupId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <artifactId>janino</artifactId>
            <groupId>org.codehaus.janino</groupId>
            <version>3.0.8</version>
        </dependency>
        <dependency>
            <artifactId>commons-compiler</artifactId>
            <groupId>org.codehaus.janino</groupId>
            <version>3.0.8</version>
        </dependency>
        ...............
    </dependencies>
    
  • 2021-02-05 11:09

    In our migration from CDH Parcel 2.2.0.cloudera1 to 2.3.0.cloudera4, we simply overrode the Maven property:

    <janino.version>3.0.8</janino.version>
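
    For readers unsure where that line goes, here is a minimal sketch of its placement. It assumes the project inherits from a parent POM whose dependency management refers to `${janino.version}`; if the versions come from a BOM imported with `<scope>import</scope>`, overriding the property in the child has no effect and the artifacts need to be pinned explicitly instead:

    ```xml
    <!-- pom.xml of the application module (placement sketch) -->
    <properties>
        <!-- Only takes effect if an inherited parent POM uses ${janino.version}
             for the janino and commons-compiler artifacts (assumption) -->
        <janino.version>3.0.8</janino.version>
    </properties>
    ```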
    

    In addition, we defined the matching version of the Hive dependency in the dependency management section:

    <hive.version>1.1.0-cdh5.13.3</hive.version>
    
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-jdbc</artifactId>
        <version>${hive.version}</version>
        <scope>runtime</scope>
        <exclusions>
            <exclusion>
                <groupId>org.eclipse.jetty.aggregate</groupId>
                <artifactId>*</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
            <exclusion>
                <groupId>com.twitter</groupId>
                <artifactId>parquet-hadoop-bundle</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    

    The exclusions were necessary for the previous version; they might not be necessary anymore.

  • 2021-02-05 11:16

    If you are using Spark 3.0.1 (the latest version at the time of writing), you have to select version 3.0.16 for the two janino dependencies in @Maksym's solution, which works very well.
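
    Author names are not shown on this page, so the following is only a sketch that assumes @Maksym's solution is the dependencyManagement approach from the earlier answers, with the version changed to 3.0.16:

    ```xml
    <!-- Sketch for Spark 3.0.1: pin both Janino artifacts to the same version -->
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.codehaus.janino</groupId>
                <artifactId>commons-compiler</artifactId>
                <version>3.0.16</version>
            </dependency>
            <dependency>
                <groupId>org.codehaus.janino</groupId>
                <artifactId>janino</artifactId>
                <version>3.0.16</version>
            </dependency>
        </dependencies>
    </dependencyManagement>
    ```

    Keeping the two artifacts on the same version matters because janino is built against the matching commons-compiler release.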
