AWS Glue jobs log output and errors to two different CloudWatch logs, /aws-glue/jobs/error
and /aws-glue/jobs/output
by default. When I include p
I know the article is not new but maybe it could be helpful for someone: For me logging in glue works with the following lines of code:
# create glue context
glueContext = GlueContext(sc)
# set custom logging on
logger = glueContext.get_logger()
...
#write into the log file with:
logger.info("s3_key:" + your_value)
Just in case this helps. This works to change the log level.
sc = SparkContext()
sc.setLogLevel('DEBUG')
glueContext = GlueContext(sc)
logger = glueContext.get_logger()
logger.info('Hello Glue')
Try to use built-in python logger from logging
module, by default it writes messages to standard output stream.
import logging
MSG_FORMAT = '%(asctime)s %(levelname)s %(name)s: %(message)s'
DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'
logging.basicConfig(format=MSG_FORMAT, datefmt=DATETIME_FORMAT)
logger = logging.getLogger(<logger-name-here>)
logger.setLevel(logging.INFO)
...
logger.info("Test log message")
I noticed the above answers are written in python. For Scala you could do the following
import com.amazonaws.services.glue.log.GlueLogger
object GlueApp {
def main(sysArgs: Array[String]) {
val logger = new GlueLogger
logger.info("info message")
logger.warn("warn message")
logger.error("error message")
}
}
You can find both Python and Scala solution from official doc here
I faced the same problem. I resolved it by added
logging.getLogger().addHandler(logging.StreamHandler(sys.stdout))
Before there was no prints at all, even ERROR level
The idea was taken from here https://medium.com/tieto-developers/how-to-do-application-logging-in-aws-745114ac6eb7
Another option would be to log to stdout and glue AWS logging to stdout (using stdout is actually one of the best practices in cloud logging).
Update: it works only for setLevel("WARNING") and when prints ERROR or WARING. I didn't find how to manage it for the INFO level :(