How to stop INFO messages from displaying on the Spark console?

广开言路 2020-11-22 13:40

I'd like to stop various messages that are coming on the Spark shell.

I tried to edit the log4j.properties file in order to stop these messages.


20 answers
  • 2020-11-22 14:14

    In addition to all the above posts, here is what solved the issue for me.

    Spark uses SLF4J to bind to a logging backend. If log4j is not the first binding found on the classpath, you can edit log4j.properties files all you want; those loggers are simply not used. For example, this could be a possible SLF4J output:

    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/C:/Users/~/.m2/repository/org/slf4j/slf4j-simple/1.6.6/slf4j-simple-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/C:/Users/~/.m2/repository/org/slf4j/slf4j-log4j12/1.7.19/slf4j-log4j12-1.7.19.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]

    So here the SimpleLoggerFactory was used, which does not care about log4j settings.

    Excluding the slf4j-simple package from my project via

    <dependency>
            ...
            <exclusions>
                ...
                <exclusion>
                    <artifactId>slf4j-simple</artifactId>
                    <groupId>org.slf4j</groupId>
                </exclusion>
            </exclusions>
    </dependency>
    

    resolved the issue; now the log4j binding is used and any setting in log4j.properties is adhered to. FYI, my log4j.properties file contains (besides the normal configuration):

    log4j.rootLogger=WARN, stdout
    ...
    log4j.category.org.apache.spark = WARN
    log4j.category.org.apache.parquet.hadoop.ParquetRecordReader = FATAL
    log4j.additivity.org.apache.parquet.hadoop.ParquetRecordReader=false
    log4j.logger.org.apache.parquet.hadoop.ParquetRecordReader=OFF
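
    As a quick sanity check that the log4j binding (and therefore the settings above) is actually in effect, the root logger's effective level can be read back through the JVM gateway from a PySpark shell. A minimal sketch, assuming an active SparkContext named sc:

    # Hedged sketch: read the effective level of the log4j root logger via py4j.
    # With the settings above picked up, this should print "WARN".
    log4j = sc._jvm.org.apache.log4j
    print(log4j.LogManager.getRootLogger().getEffectiveLevel().toString())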
    

    Hope this helps!

  • 2020-11-22 14:17

    In Python/Spark we can do:

    def quiet_logs(sc):
        """Reduce the chatty 'org' and 'akka' loggers to ERROR level via the JVM's log4j."""
        logger = sc._jvm.org.apache.log4j
        logger.LogManager.getLogger("org").setLevel(logger.Level.ERROR)
        logger.LogManager.getLogger("akka").setLevel(logger.Level.ERROR)
    

    Then, after defining the SparkContext 'sc', call this function with: quiet_logs(sc)
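
    A minimal usage sketch (hedged; the master URL and app name below are placeholders, and quiet_logs is the helper defined above):

    from pyspark.sql import SparkSession

    # Build (or reuse) a session, grab its SparkContext, then silence the loggers.
    spark = SparkSession.builder.master("local[*]").appName("quiet-demo").getOrCreate()
    sc = spark.sparkContext
    quiet_logs(sc)  # 'org' and 'akka' loggers now emit only ERROR and above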

  • 2020-11-22 14:17

    An interesting idea is to use the RollingFileAppender as suggested here: http://shzhangji.com/blog/2015/05/31/spark-streaming-logging-configuration/ so that you don't "pollute" the console space, but can still see the results under $YOUR_LOG_PATH_HERE/${dm.logging.name}.log.

    log4j.rootLogger=INFO, rolling
    
    log4j.appender.rolling=org.apache.log4j.RollingFileAppender
    log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
    log4j.appender.rolling.layout.conversionPattern=[%d] %p %m (%c)%n
    log4j.appender.rolling.maxFileSize=50MB
    log4j.appender.rolling.maxBackupIndex=5
    log4j.appender.rolling.file=$YOUR_LOG_PATH_HERE/${dm.logging.name}.log
    log4j.appender.rolling.encoding=UTF-8
    

    Another method that addresses the cause is to look at what kinds of log messages you usually get (coming from different modules and dependencies), set the logging granularity for each, and turn "quiet" the third-party logs that are too verbose:

    For instance,

    # Silence akka remoting
    log4j.logger.Remoting=ERROR
    log4j.logger.akka.event.slf4j=ERROR
    log4j.logger.org.spark-project.jetty.server=ERROR
    log4j.logger.org.apache.spark=ERROR
    log4j.logger.com.anjuke.dm=${dm.logging.level}
    log4j.logger.org.eclipse.jetty=WARN
    log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
    log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
    log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
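
    The same per-logger granularity can also be applied at runtime instead of (or in addition to) the properties file. A minimal, hedged sketch from PySpark, assuming an active SparkContext sc; the logger names simply mirror some of the entries above:

    # Hedged sketch: set individual logger levels through the JVM's log4j classes.
    log4j = sc._jvm.org.apache.log4j
    for name, level in [("org.apache.spark", "ERROR"),
                        ("org.spark-project.jetty.server", "ERROR"),
                        ("org.eclipse.jetty", "WARN")]:
        log4j.LogManager.getLogger(name).setLevel(log4j.Level.toLevel(level))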
    
  • 2020-11-22 14:20

    I just add this line at the top of all my pyspark scripts, just below the import statements.

    SparkSession.builder.getOrCreate().sparkContext.setLogLevel("ERROR")
    

    Example header of my pyspark scripts:

    from pyspark.sql import SparkSession, functions as fs
    SparkSession.builder.getOrCreate().sparkContext.setLogLevel("ERROR")
    
  • 2020-11-22 14:23

    Simple to do on the command line...

    spark2-submit --driver-java-options="-Droot.logger=ERROR,console" ..other options..

  • 2020-11-22 14:24

    If you don't have the ability to edit the Java code to insert the .setLogLevel() statements, and you don't want yet more external files to deploy, you can use a brute-force way to solve this: just filter out the INFO lines using grep.

    spark-submit --deploy-mode client --master local <rest-of-cmd> | grep -v -F "INFO"
    