How to stop INFO messages displaying on spark console?

广开言路 2020-11-22 13:40

I'd like to stop the various messages that appear in the Spark shell.

I tried to edit the log4j.properties file in order to stop these messages.

Her

20 answers
  • 2020-11-22 14:36
    sparkContext.setLogLevel("OFF")
    
  • 2020-11-22 14:37

    This one worked for me. To display only ERROR messages on stdout, the log4j.properties file may look like this:

    # Root logger option
    log4j.rootLogger=ERROR, stdout
    # Direct log messages to stdout
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.Target=System.out
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
    

    NOTE: For this to take effect, put the log4j.properties file in the src/main/resources folder. If log4j.properties doesn't exist (meaning Spark is using the log4j-defaults.properties file), you can create it by going to SPARK_HOME/conf and running mv log4j.properties.template log4j.properties, then make the changes described above.
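
The note above can be sketched as a small script. This is only an illustration in Python using the standard library: the helper name ensure_log4j_properties and the throwaway demo directory are assumptions, and the sketch copies the template instead of moving it so the template stays available.

```python
import re
import shutil
import tempfile
from pathlib import Path


def ensure_log4j_properties(conf_dir: Path, level: str = "ERROR") -> Path:
    """Create log4j.properties from the template if needed, then set the root level."""
    target = conf_dir / "log4j.properties"
    template = conf_dir / "log4j.properties.template"
    if not target.exists():
        # Copy rather than move, so the template is kept for later reference.
        shutil.copyfile(template, target)
    text = target.read_text()
    # Replace only the level on the root logger line; the appender list is kept.
    pattern = re.compile(r"^(log4j\.root(?:Category|Logger)\s*=\s*)\w+", re.MULTILINE)
    target.write_text(pattern.sub(lambda m: m.group(1) + level, text))
    return target


# Demo against a temporary directory standing in for SPARK_HOME/conf.
demo_conf = Path(tempfile.mkdtemp())
(demo_conf / "log4j.properties.template").write_text(
    "# Set everything to be logged to the console\n"
    "log4j.rootCategory=INFO, console\n"
)
result = ensure_log4j_properties(demo_conf).read_text()
print(result)
```

If the file has no rootCategory or rootLogger line, the sketch leaves it unchanged, so it is worth checking the result by hand.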

  • 2020-11-22 14:38

    All the methods collected with examples

    Intro

    There are actually many ways to do it. Some are harder than others, and it is up to you which one suits you best. I will try to showcase them all.


    #1 Programmatically in your app

    This seems to be the easiest, but you will need to recompile your app to change those settings. Personally, I don't like it, but it works fine.

    Example:

    import org.apache.log4j.{Level, Logger}
    
    val rootLogger = Logger.getRootLogger()
    rootLogger.setLevel(Level.ERROR)
    
    Logger.getLogger("org.apache.spark").setLevel(Level.WARN)
    Logger.getLogger("org.spark-project").setLevel(Level.WARN)
    

    You can achieve much more just using log4j API.
    Source: [Log4J Configuration Docs, Configuration section]


    #2 Pass log4j.properties during spark-submit

    This one is tricky, but not impossible. It is also my favorite.

    At app startup, Log4J always looks for a log4j.properties file on the classpath and loads it.

    However, when using spark-submit, the Spark cluster's classpath takes precedence over the app's classpath! This is why putting this file in your fat jar will not override the cluster's settings!

    Add -Dlog4j.configuration=<location of configuration file> to spark.driver.extraJavaOptions (for the driver) or
    spark.executor.extraJavaOptions (for executors).

    Note that if using a file, the file: protocol should be explicitly provided, and the file needs to exist locally on all the nodes.

    To satisfy the last condition, you can either upload the file to a location available to the nodes (like hdfs) or, if using deploy-mode client, access it locally from the driver. Otherwise:

    upload a custom log4j.properties using spark-submit, by adding it to the --files list of files to be uploaded with the application.

    Source: Spark docs, Debugging

    Steps:

    Example log4j.properties:

    # Blacklist all to warn level
    log4j.rootCategory=WARN, console
    
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
    
    # Whitelist our app to info :)
    log4j.logger.com.github.atais=INFO
    

    Executing spark-submit, for cluster mode:

    spark-submit \
        --master yarn \
        --deploy-mode cluster \
        --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" \
        --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" \
        --files "/absolute/path/to/your/log4j.properties" \
        --class com.github.atais.Main \
        "SparkApp.jar"
    

    Note that you must use --driver-java-options if using client mode.

    Source: Spark docs, Runtime env

    Executing spark-submit, for client mode:

    spark-submit \
        --master yarn \
        --deploy-mode client \
        --driver-java-options "-Dlog4j.configuration=file:/absolute/path/to/your/log4j.properties" \
        --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" \
        --files "/absolute/path/to/your/log4j.properties" \
        --class com.github.atais.Main \
        "SparkApp.jar"
    

    Notes:

    1. Files uploaded to spark-cluster with --files will be available at root dir, so there is no need to add any path in file:log4j.properties.
    2. Files listed in --files must be provided with absolute path!
    3. file: prefix in configuration URI is mandatory.
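
The difference between the two deploy modes can be summarized in a short Python sketch that only builds the argument list and does not launch anything. The helper name spark_submit_args is an assumption made for illustration. The flags and values are the ones used in the commands above.

```python
import shlex
from pathlib import PurePosixPath
from typing import List

# Files shipped with --files land in the container's working directory,
# so the uploaded copy is referenced without any path.
UPLOADED_OPT = "-Dlog4j.configuration=file:log4j.properties"


def spark_submit_args(deploy_mode: str, log4j_path: str, main_class: str, jar: str) -> List[str]:
    """Build the spark-submit argument list for cluster or client mode."""
    if not PurePosixPath(log4j_path).is_absolute():
        raise ValueError("--files expects an absolute path to log4j.properties")
    if deploy_mode == "cluster":
        # The driver runs on the cluster and reads the uploaded copy.
        driver = ["--conf", "spark.driver.extraJavaOptions=" + UPLOADED_OPT]
    elif deploy_mode == "client":
        # The driver runs locally, so it is pointed at the local file.
        driver = ["--driver-java-options", "-Dlog4j.configuration=file:" + log4j_path]
    else:
        raise ValueError("deploy_mode must be 'cluster' or 'client'")
    return [
        "spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode,
        *driver,
        "--conf", "spark.executor.extraJavaOptions=" + UPLOADED_OPT,
        "--files", log4j_path,
        "--class", main_class,
        jar,
    ]


log4j_file = "/absolute/path/to/your/log4j.properties"
cluster_args = spark_submit_args("cluster", log4j_file, "com.github.atais.Main", "SparkApp.jar")
client_args = spark_submit_args("client", log4j_file, "com.github.atais.Main", "SparkApp.jar")
print(" ".join(shlex.quote(arg) for arg in client_args))
```

Only the driver option changes between the modes. The executor option and the --files upload are identical in both.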

    #3 Edit cluster's conf/log4j.properties

    This changes global logging configuration file.

    update the $SPARK_CONF_DIR/log4j.properties file and it will be automatically uploaded along with the other configurations.

    Source: Spark docs, Debugging

    To find your SPARK_CONF_DIR you can use spark-shell:

    atais@cluster:~$ spark-shell 
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 2.1.1
          /_/   
    
    scala> System.getenv("SPARK_CONF_DIR")
    res0: String = /var/lib/spark/latest/conf
    

    Now just edit /var/lib/spark/latest/conf/log4j.properties (with example from method #2) and all your apps will share this configuration.


    #4 Override configuration directory

    If you like solution #3 but want to customize it per application, you can copy the conf folder, edit its contents, and specify it as the root configuration during spark-submit.

    To specify a different configuration directory other than the default “SPARK_HOME/conf”, you can set SPARK_CONF_DIR. Spark will use the configuration files (spark-defaults.conf, spark-env.sh, log4j.properties, etc) from this directory.

    Source: Spark docs, Configuration

    Steps:

    1. Copy cluster's conf folder (more info, method #3)
    2. Edit log4j.properties in that folder (example in method #2)
    3. Set SPARK_CONF_DIR to this folder, before executing spark-submit,
      example:

      export SPARK_CONF_DIR=/absolute/path/to/custom/conf
      
      spark-submit \
          --master yarn \
          --deploy-mode cluster \
          --class com.github.atais.Main \
          "SparkApp.jar"
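
The three steps of method #4 could be automated roughly as follows. This is a minimal Python sketch under stated assumptions: the helper name per_app_conf and the temporary directories are hypothetical, and the demo stops at preparing the environment without calling spark-submit.

```python
import os
import shutil
import tempfile
from pathlib import Path


def per_app_conf(cluster_conf: Path, app_conf: Path, log4j_text: str) -> dict:
    """Copy the cluster conf folder, replace its log4j.properties and
    return an environment that points SPARK_CONF_DIR at the copy."""
    shutil.copytree(cluster_conf, app_conf)  # step 1: copy the conf folder
    (app_conf / "log4j.properties").write_text(log4j_text)  # step 2: edit
    env = dict(os.environ)
    env["SPARK_CONF_DIR"] = str(app_conf.resolve())  # step 3: absolute path
    return env


# Demo with temporary directories standing in for the real conf folder.
demo_root = Path(tempfile.mkdtemp())
cluster_conf = demo_root / "cluster-conf"
cluster_conf.mkdir()
(cluster_conf / "spark-defaults.conf").write_text("spark.master yarn\n")
(cluster_conf / "log4j.properties").write_text("log4j.rootCategory=INFO, console\n")

app_conf = demo_root / "app-conf"
env = per_app_conf(cluster_conf, app_conf, "log4j.rootCategory=WARN, console\n")
print(env["SPARK_CONF_DIR"])
```

The returned dictionary can then be passed as the env argument of subprocess.run together with the spark-submit command shown above, which keeps the override local to that one launch.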
      

    Conclusion

    I am not sure if there is any other method, but I hope this covers the topic from A to Z. If not, feel free to ping me in the comments!

    Enjoy your way!

  • 2020-11-22 14:39

    Edit your conf/log4j.properties file and change the following line:

    log4j.rootCategory=INFO, console
    

    to

    log4j.rootCategory=ERROR, console
    

    Another approach:

    Start spark-shell and type in the following:

    import org.apache.log4j.Logger
    import org.apache.log4j.Level
    
    Logger.getLogger("org").setLevel(Level.OFF)
    Logger.getLogger("akka").setLevel(Level.OFF)
    

    You won't see any logs after that.

    Other options for Level include ALL, DEBUG, ERROR, FATAL, INFO, TRACE and WARN. (TRACE_INT, which is also listed in the Level documentation, is an integer constant rather than a Level.)

    Details about each can be found in the documentation.
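
Setting the level on "org" and "akka" is enough because log4j loggers form a hierarchy by dotted name: a logger without its own level inherits from its nearest configured ancestor, and finally from the root logger. The Python sketch below only models that lookup. It does not use log4j, and the function name effective_level is an assumption.

```python
from typing import Dict


def effective_level(logger_name: str, configured: Dict[str, str], root_level: str) -> str:
    """Return the level a logger ends up with under log4j-style inheritance."""
    name = logger_name
    while name:
        if name in configured:
            return configured[name]
        # Drop the last dot-separated segment to move to the parent logger.
        name = name.rpartition(".")[0]
    return root_level


configured = {"org": "OFF", "akka": "OFF"}
spark_level = effective_level("org.apache.spark.SparkContext", configured, "INFO")
akka_level = effective_level("akka.remote.Remoting", configured, "INFO")
app_level = effective_level("com.github.atais.Main", configured, "INFO")
# Matching is per name segment, so "organic" is not a child of "org".
other_level = effective_level("organic.Foo", configured, "INFO")
print(spark_level, akka_level, app_level, other_level)
```

This also shows why loggers outside those two hierarchies, such as an application's own package, keep the root level.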

  • 2020-11-22 14:39

    tl;dr

    For Spark Context you may use:

    sc.setLogLevel(<logLevel>)
    

    where loglevel can be ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE or WARN.
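
A small guard around the call can catch typos in the level name before they reach Spark. The sketch below is illustrative Python: set_log_level and FakeContext are hypothetical names, and FakeContext merely stands in for a real SparkContext so the example runs without Spark. With PySpark, the real context exposes the same setLogLevel method.

```python
VALID_LOG_LEVELS = {"ALL", "DEBUG", "ERROR", "FATAL", "INFO", "OFF", "TRACE", "WARN"}


def set_log_level(sc, level: str) -> str:
    """Normalize and validate the level name, then pass it to sc.setLogLevel."""
    name = level.strip().upper()
    if name not in VALID_LOG_LEVELS:
        raise ValueError(
            "Unknown log level %r, expected one of %s" % (level, sorted(VALID_LOG_LEVELS))
        )
    sc.setLogLevel(name)
    return name


class FakeContext:
    """Stand-in for a SparkContext; it only records the requested level."""

    def __init__(self):
        self.level = None

    def setLogLevel(self, level):
        self.level = level


fake = FakeContext()
applied = set_log_level(fake, "warn")
print(applied)
```

Validating up front matters because log4j 1.x's Level.toLevel falls back to DEBUG for names it does not recognize, so on Spark versions that do not check the name themselves, a typo could make the console noisier instead of quieter.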


    Details

    Internally, setLogLevel calls org.apache.log4j.Level.toLevel(logLevel) and then applies the result with org.apache.log4j.LogManager.getRootLogger().setLevel(level).

    You may directly set the logging levels to OFF using:

    import org.apache.log4j.{Level, LogManager}
    LogManager.getLogger("org").setLevel(Level.OFF)
    

    You can set up the default logging for Spark shell in conf/log4j.properties. Use conf/log4j.properties.template as a starting point.

    Setting Log Levels in Spark Applications

    In standalone Spark applications or while in Spark Shell session, use the following:

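    import org.apache.hadoop.yarn.util.RackResolver  // needs hadoop-yarn on the classpath
    import org.apache.log4j.{Level, Logger}
    
    // Read the level set on one specific logger (null if it inherits from a parent)
    Logger.getLogger(classOf[RackResolver]).getLevel
    // Silence everything under the "org" and "akka" logger hierarchies
    Logger.getLogger("org").setLevel(Level.OFF)
    Logger.getLogger("akka").setLevel(Level.OFF)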
    

    Disabling logging (in log4j):

    Use the following in conf/log4j.properties to turn off every logger under the org namespace, which includes Spark's own org.apache.spark loggers:

    log4j.logger.org=OFF
    

    Reference: Mastering Spark by Jacek Laskowski.

  • 2020-11-22 14:40

    Thanks @AkhlD and @Sachin Janani for suggesting changes in .conf file.

    The following code solved my issue:

    1) Added import org.apache.log4j.{Level, Logger} to the import section

    2) Added the following lines after creating the Spark context object, i.e. after val sc = new SparkContext(conf):

    val rootLogger = Logger.getRootLogger()
    rootLogger.setLevel(Level.ERROR)
    