Configuring Apache Spark Logging with Maven and logback, and finally sending messages to Loggly


Question


I'm having trouble getting my Spark application to ignore Log4j so that it uses Logback instead. One of the reasons I'm trying to use logback is the Loggly appender it supports.

I have the following dependencies and exclusions in my POM file (versions are managed in the dependencyManagement section of my main parent POM):

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.12</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
        <exclusion>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </exclusion>
    </exclusions>
</dependency>

<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <scope>test</scope>
</dependency>

<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-core</artifactId>
</dependency>

<dependency>
    <groupId>org.logback-extensions</groupId>
    <artifactId>logback-ext-loggly</artifactId>
</dependency>

<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>log4j-over-slf4j</artifactId>
</dependency>
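
One way to confirm that the exclusions actually removed the Log4j bindings from my build is to filter Maven's dependency tree; a quick sketch, run from the module directory:

    # An empty tree for these patterns means the exclusions took effect
    mvn dependency:tree -Dincludes=org.slf4j:slf4j-log4j12,log4j:log4j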

I have referenced these two articles:

Separating application logs in Logback from Spark Logs in log4j
Configuring Apache Spark Logging with Scala and logback

I first tried using (when running spark-submit):
--conf "spark.driver.userClassPathFirst=true"
--conf "spark.executor.userClassPathFirst=true"

but received the following error:

    Exception in thread "main" java.lang.LinkageError: loader constraint violation: when resolving method
    "org.slf4j.impl.StaticLoggerBinder.getLoggerFactory()Lorg/slf4j/ILoggerFactory;" the class loader
    (instance of org/apache/spark/util/ChildFirstURLClassLoader) of the current class, org/slf4j/LoggerFactory,
    and the class loader (instance of sun/misc/Launcher$AppClassLoader) for the method's defining class,
    org/slf4j/impl/StaticLoggerBinder, have different Class objects for the type org/slf4j/ILoggerFactory
    used in the signature

I would like to get it working with the above, but the error makes some sense: with userClassPathFirst, the child-first class loader and Spark's application class loader each load their own copy of the SLF4J classes, so the two ILoggerFactory types clash. I also looked at trying the following:
--conf "spark.driver.extraClassPath=$libs"
--conf "spark.executor.extraClassPath=$libs"

but since I'm passing my uber jar to spark-submit both locally AND on an Amazon EMR cluster, I really can't specify a library file location that is local to my machine. Since the uber jar contains the files, is there a way for it to use those files? Am I forced to copy these libraries to the master/core nodes of the EMR cluster before the Spark app finally runs there?
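
For reference, a rough sketch of what I mean by the second approach (the /home/hadoop/extra-libs path is hypothetical, and the jars would have to be copied there on the driver and every executor node first):

    # Hypothetical path; the logging jars must already exist there on every node
    libs=/home/hadoop/extra-libs/logback-classic-1.2.3.jar:/home/hadoop/extra-libs/logback-core-1.2.3.jar
    spark-submit \
      --conf "spark.driver.extraClassPath=$libs" \
      --conf "spark.executor.extraClassPath=$libs" \
      ...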

The first approach, using userClassPathFirst, still seems like the best route, though.


Answer 1:


So I solved the issue; there were several problems going on.

To get Spark to allow logback to work, what worked for me was a combination of items from the articles I posted above, plus a certificate file problem.

The certificate file I was passing into spark-submit was incomplete and was overriding the base truststore certs. This was causing a problem when SENDING HTTPS messages to Loggly.

Part 1 change: update Maven to shade (relocate) org.slf4j (as stated in an answer by @matemaciek):

    <dependencies>
        ...
        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
            <version>1.2.3</version>
        </dependency>

        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-core</artifactId>
            <version>1.2.3</version>
        </dependency>

        <dependency>
            <groupId>org.logback-extensions</groupId>
            <artifactId>logback-ext-loggly</artifactId>
            <version>0.1.5</version>
            <scope>runtime</scope>
        </dependency>

        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>log4j-over-slf4j</artifactId>
            <version>1.7.30</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.1</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <transformers>
                        <transformer
                                implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                            <manifestEntries>
                                <Main-Class>com.TestClass</Main-Class>
                            </manifestEntries>
                        </transformer>
                    </transformers>
                    <relocations>
                        <relocation>
                            <pattern>org.slf4j</pattern>
                            <shadedPattern>com.shaded.slf4j</shadedPattern>
                        </relocation>
                    </relocations>
                </configuration>
            </plugin>
        </plugins>
    </build>
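
A quick sanity check that the relocation actually happened is to list the contents of the uber jar after packaging (a sketch; the jar name is taken from the spark-submit command in Part 3):

    # Relocated SLF4J classes should now appear under com/shaded/slf4j
    jar tf target/testproject-0.0.1.jar | grep com/shaded/slf4j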

Part 1a: the logback.xml

<configuration debug="true">
    <appender name="logglyAppender" class="ch.qos.logback.ext.loggly.LogglyAppender">
        <endpointUrl>https://logs-01.loggly.com/bulk/TOKEN/tag/TAGS/</endpointUrl>
        <pattern>${hostName} %d{yyyy-MM-dd HH:mm:ss,SSS}{GMT} %p %t %c %M - %m%n</pattern>
    </appender>
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
          <pattern>${hostName} %d{yyyy-MM-dd HH:mm:ss,SSS}{GMT} %p %t %c %M - %m%n</pattern>
        </encoder>
    </appender>
    <root level="info">
        <appender-ref ref="logglyAppender" />
        <appender-ref ref="STDOUT" />
    </root>
</configuration> 
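
One note on the pattern: ${hostName} is not a variable logback defines automatically (logback's built-in equivalent is ${HOSTNAME}). If it shows up as undefined in the output, one way to define it, as a sketch, is with logback's CanonicalHostNamePropertyDefiner at the top of the configuration:

    <define name="hostName" class="ch.qos.logback.core.property.CanonicalHostNamePropertyDefiner"/>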

Part 2 change: The MainClass

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class TestClass {

    // After shading, this org.slf4j reference is rewritten to com.shaded.slf4j at package time
    static final Logger log = LoggerFactory.getLogger(TestClass.class);

    public static void main(String[] args) throws Exception {
        log.info("this is a test message");
    }
}
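
As a side note, a quick way to confirm at runtime that logback (and not Log4j) ended up as the bound SLF4J implementation is to print the logger factory class; a minimal sketch (the class name BindingCheck is mine):

    import org.slf4j.LoggerFactory;

    public class BindingCheck {
        public static void main(String[] args) {
            // Prints ch.qos.logback.classic.LoggerContext when logback is bound
            System.out.println(LoggerFactory.getILoggerFactory().getClass().getName());
        }
    }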

Part 3 change:
I was submitting the Spark application like this (example):

spark-submit --deploy-mode client --class com.TestClass \
  --conf "spark.executor.extraJavaOptions=-Djavax.net.ssl.trustStore=c:/src/testproject/rds-truststore.jks -Djavax.net.ssl.trustStorePassword=changeit" \
  --conf "spark.driver.extraJavaOptions=-Djavax.net.ssl.trustStore=c:/src/testproject/rds-truststore.jks -Djavax.net.ssl.trustStorePassword=changeit" \
  com/target/testproject-0.0.1.jar

The above spark-submit failed with an HTTPS certificate problem (it surfaced when Loggly was being contacted to send the message to the Loggly service), because rds-truststore.jks overrode the default truststore but was missing some of the required certificates. I changed this to use the cacerts store, which contained all the certs that were needed.

There were no more errors from the Loggly part when submitting like this:

spark-submit --deploy-mode client --class com.TestClass \
  --conf "spark.executor.extraJavaOptions=-Djavax.net.ssl.trustStore=c:/src/testproject/cacerts -Djavax.net.ssl.trustStorePassword=changeit" \
  --conf "spark.driver.extraJavaOptions=-Djavax.net.ssl.trustStore=c:/src/testproject/cacerts -Djavax.net.ssl.trustStorePassword=changeit" \
  com/target/testproject-0.0.1.jar
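
To check up front whether a given truststore contains the certificates needed to reach Loggly over HTTPS, keytool can list its contents (a sketch, using the store and password from the command above):

    # Lists every certificate entry in the truststore passed to spark-submit
    keytool -list -keystore c:/src/testproject/cacerts -storepass changeit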


Source: https://stackoverflow.com/questions/64250847/configuring-apache-spark-logging-with-maven-and-logback-and-finally-throwing-mes
