In Spark, what is the difference between the event log directory and the history server log directory?
spark.eventLog.dir hdfs:///var/log/spark/apps
spark.histo
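spark.eventLog.dir is where Spark applications write their event logs, while spark.history.fs.logDirectory is where the Spark History Server looks for event logs to load. For an application to show up in the History Server, both properties normally point to the same location.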
From the official documentation of Apache Spark:
spark.eventLog.dir is the base directory in which Spark events are logged, if spark.eventLog.enabled is true. Within this base directory, Spark creates a sub-directory for each application, and logs the events specific to the application in this directory. Users may want to set this to a unified location like an HDFS directory so history files can be read by the history server.
See spark.eventLog.dir.
spark.history.fs.logDirectory is, for the filesystem history provider, the URL of the directory containing the application event logs to load. This can be a local file:// path, an HDFS path such as hdfs://namenode/shared/spark-logs, or a path on an alternative filesystem supported by the Hadoop APIs.
See spark.history.fs.logDirectory.
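To make the relationship concrete, here is a minimal Python sketch that checks whether the directory the applications write to matches the directory the History Server reads from. It is not part of Spark: the helper names parse_spark_defaults and history_server_sees_event_logs are hypothetical, the sample configuration is made up for illustration (only the HDFS path comes from the question), and the check is a plain string comparison that ignores Spark's built-in defaults for unset properties.

```python
def parse_spark_defaults(text):
    """Parse spark-defaults.conf style text into a dict of property -> value."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # key and value are separated by whitespace
        if len(parts) == 2:
            props[parts[0]] = parts[1].strip()
    return props


def history_server_sees_event_logs(props):
    """Return True when event logging is enabled and both properties
    name the same directory (simplification: unset properties count as a mismatch)."""
    enabled = props.get("spark.eventLog.enabled", "false").lower() == "true"
    write_dir = props.get("spark.eventLog.dir")             # where applications write
    read_dir = props.get("spark.history.fs.logDirectory")   # where the History Server reads
    if not (enabled and write_dir and read_dir):
        return False
    return write_dir.rstrip("/") == read_dir.rstrip("/")


# Hypothetical configuration; only the HDFS path is taken from the question.
example_conf = """
# event logging must be enabled for anything to be written
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs:///var/log/spark/apps
spark.history.fs.logDirectory    hdfs:///var/log/spark/apps
"""

props = parse_spark_defaults(example_conf)
print(history_server_sees_event_logs(props))
```

If the two values differ, applications still run and still write their event logs, but the History Server has nothing to load from the directory it scans, so those applications will not be listed.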