How to stop INFO messages from displaying on the Spark console?

广开言路 2020-11-22 13:40

I'd like to stop various messages that are coming on the Spark shell.

I tried to edit the log4j.properties file in order to stop these messages.


20 answers
  • 2020-11-22 14:14

    In addition to all the above posts, here is what solved the issue for me.

    Spark uses SLF4J to bind to a logging backend. If log4j is not the first binding found on the classpath, you can edit log4j.properties files all you want; those loggers are simply not used. For example, this could be a possible SLF4J output:

    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/C:/Users/~/.m2/repository/org/slf4j/slf4j-simple/1.6.6/slf4j-simple-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/C:/Users/~/.m2/repository/org/slf4j/slf4j-log4j12/1.7.19/slf4j-log4j12-1.7.19.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]

    So here the SimpleLoggerFactory was used, which does not care about log4j settings.

    Excluding the slf4j-simple package from my project via

    <dependency>
            ...
            <exclusions>
                ...
                <exclusion>
                    <artifactId>slf4j-simple</artifactId>
                    <groupId>org.slf4j</groupId>
                </exclusion>
            </exclusions>
    </dependency>
    

    resolved the issue; now the log4j binding is used and any setting in log4j.properties is adhered to. FYI, my log4j.properties file contains (besides the normal configuration):

    log4j.rootLogger=WARN, stdout
    ...
    log4j.category.org.apache.spark = WARN
    log4j.category.org.apache.parquet.hadoop.ParquetRecordReader = FATAL
    log4j.additivity.org.apache.parquet.hadoop.ParquetRecordReader=false
    log4j.logger.org.apache.parquet.hadoop.ParquetRecordReader=OFF
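
    As a quick sanity check that the log4j binding (and therefore the settings above) is actually in effect, the root logger's effective level can be read back through the JVM gateway from a PySpark shell. A minimal sketch, assuming an active SparkContext named sc:

    # Hedged sketch: read the effective level of the log4j root logger via py4j.
    # With the settings above picked up, this should print "WARN".
    log4j = sc._jvm.org.apache.log4j
    print(log4j.LogManager.getRootLogger().getEffectiveLevel().toString())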
    

    Hope this helps!

  • 2020-11-22 14:17

    In Python/Spark we can do:

    def quiet_logs(sc):
        """Reduce the chatty 'org' and 'akka' loggers to ERROR level via the JVM's log4j."""
        logger = sc._jvm.org.apache.log4j
        logger.LogManager.getLogger("org").setLevel(logger.Level.ERROR)
        logger.LogManager.getLogger("akka").setLevel(logger.Level.ERROR)
    

    Then, after defining the SparkContext 'sc', call this function with: quiet_logs(sc)
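
    A minimal usage sketch (hedged; the master URL and app name below are placeholders, and quiet_logs is the helper defined above):

    from pyspark.sql import SparkSession

    # Build (or reuse) a session, grab its SparkContext, then silence the loggers.
    spark = SparkSession.builder.master("local[*]").appName("quiet-demo").getOrCreate()
    sc = spark.sparkContext
    quiet_logs(sc)  # 'org' and 'akka' loggers now emit only ERROR and above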

  • 2020-11-22 14:17

    An interesting idea is to use the RollingFileAppender as suggested here: http://shzhangji.com/blog/2015/05/31/spark-streaming-logging-configuration/ so that you don't "pollute" the console space, but can still see the results under $YOUR_LOG_PATH_HERE/${dm.logging.name}.log.

    log4j.rootLogger=INFO, rolling
    
    log4j.appender.rolling=org.apache.log4j.RollingFileAppender
    log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
    log4j.appender.rolling.layout.conversionPattern=[%d] %p %m (%c)%n
    log4j.appender.rolling.maxFileSize=50MB
    log4j.appender.rolling.maxBackupIndex=5
    log4j.appender.rolling.file=$YOUR_LOG_PATH_HERE/${dm.logging.name}.log
    log4j.appender.rolling.encoding=UTF-8
    

    Another method that addresses the cause is to look at what kinds of log messages you usually get (coming from different modules and dependencies), set the logging granularity for each, and turn "quiet" the third-party logs that are too verbose:

    For instance,

    # Silence akka remoting
    log4j.logger.Remoting=ERROR
    log4j.logger.akka.event.slf4j=ERROR
    log4j.logger.org.spark-project.jetty.server=ERROR
    log4j.logger.org.apache.spark=ERROR
    log4j.logger.com.anjuke.dm=${dm.logging.level}
    log4j.logger.org.eclipse.jetty=WARN
    log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
    log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
    log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
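
    The same per-logger granularity can also be applied at runtime instead of (or in addition to) the properties file. A minimal, hedged sketch from PySpark, assuming an active SparkContext sc; the logger names simply mirror some of the entries above:

    # Hedged sketch: set individual logger levels through the JVM's log4j classes.
    log4j = sc._jvm.org.apache.log4j
    for name, level in [("org.apache.spark", "ERROR"),
                        ("org.spark-project.jetty.server", "ERROR"),
                        ("org.eclipse.jetty", "WARN")]:
        log4j.LogManager.getLogger(name).setLevel(log4j.Level.toLevel(level))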
    
  • 2020-11-22 14:20

    I just add this line at the top of all my pyspark scripts, just below the import statements.

    SparkSession.builder.getOrCreate().sparkContext.setLogLevel("ERROR")
    

    Example header of my pyspark scripts:

    from pyspark.sql import SparkSession, functions as fs
    SparkSession.builder.getOrCreate().sparkContext.setLogLevel("ERROR")
    
  • 2020-11-22 14:23

    Simple to do on the command line...

    spark2-submit --driver-java-options="-Droot.logger=ERROR,console" ..other options..

  • 2020-11-22 14:24

    If you don't have the ability to edit the Java code to insert the .setLogLevel() statements, and you don't want yet more external files to deploy, you can use a brute-force way to solve this: just filter out the INFO lines using grep.

    spark-submit --deploy-mode client --master local <rest-of-cmd> | grep -v -F "INFO"
    