Scrapy - logging to file and stdout simultaneously, with spider names

悲&欢浪女 2021-01-30 18:28

I've decided to use the Python logging module, because the messages generated by Twisted on stderr are too long, and I want meaningful INFO-level messages, such as those generated by the StatsCollector, to be written to a separate log file while keeping the on-screen messages.

7 Answers
  • 2021-01-30 18:47

    I know this is old, but it was a really helpful post, since the class still isn't properly documented in the Scrapy docs. Also, we can skip importing logging and use Scrapy's log module directly. Thanks, all!

    from scrapy import log
    
    logfile = open('testlog.log', 'a')
    log_observer = log.ScrapyFileLogObserver(logfile, level=log.DEBUG)
    log_observer.start()
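
    Note that scrapy.log and ScrapyFileLogObserver only exist in old, pre-1.0 Scrapy releases, which logged through Twisted; since Scrapy 1.0, event logging goes through Python's standard logging module, as shown in the other answers.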
    
  • 2021-01-30 18:48

    As of Scrapy 2.3, none of the answers mentioned above worked for me. In addition, the solution found in the documentation overwrote the log file with every message, which is of course not what you want in a log. I couldn't find a built-in setting that changes the mode to "a" (append). I achieved logging to both file and stdout with the following configuration code:

    import logging
    from scrapy.utils.log import configure_logging
    
    # Install Scrapy's default (console) root handler; LOG_STDOUT=True
    # additionally redirects standard output into the logging system
    configure_logging(settings={
        "LOG_STDOUT": True
    })
    
    # Extra handler on the root logger, appending to the log file
    # (`filename` is the path of your log file)
    file_handler = logging.FileHandler(filename, mode="a")
    formatter = logging.Formatter(
        fmt="%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s",
        datefmt="%H:%M:%S"
    )
    file_handler.setFormatter(formatter)
    file_handler.setLevel("DEBUG")
    logging.root.addHandler(file_handler)
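
    With this in place, Scrapy's default handler keeps writing log records to the console, while the extra FileHandler appends every record of DEBUG level and above to the file, so both destinations receive the full log.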
    
  • 2021-01-30 18:52

    As the official Scrapy documentation says:

    Scrapy uses Python’s builtin logging system for event logging.

    So you can configure your logger just as in a normal Python script.

    First, you have to import the logging module:

    import logging
    

    You can add this line to your spider:

    logging.getLogger().addHandler(logging.StreamHandler())
    

    It adds a stream handler that logs to the console.

    After that, you have to configure the log file path.

    Add a dict named custom_settings containing your spider-specific settings:

    custom_settings = {
        'LOG_FILE': 'my_log.log',
        'LOG_LEVEL': 'INFO',
        # you can add more settings here
    }
    

    The whole class looks like this:

    import logging
    
    import scrapy
    
    
    class AbcSpider(scrapy.Spider):
        name: str = 'abc_spider'
        start_urls = ['your_url']
        custom_settings = {
            'LOG_FILE': 'my_log.log',
            'LOG_LEVEL': 'INFO',
            # you can add more settings here
        }
    
        # Executed at class-definition time: mirror log records
        # to the console in addition to the LOG_FILE handler
        logging.getLogger().addHandler(logging.StreamHandler())
    
        def parse(self, response):
            pass
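
    Running the spider with scrapy crawl abc_spider then writes INFO-and-above messages to my_log.log via the LOG_FILE setting, while the added StreamHandler mirrors the same records on the console.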
    
  • 2021-01-30 18:56

    It is very easy to redirect output using: scrapy some-scrapy's-args 2>&1 | tee -a logname

    This way, everything Scrapy outputs to stdout and stderr will be redirected to the logname file and also printed to the screen.
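
    For example, with a hypothetical spider named myspider (a name not from the original answer):

    scrapy crawl myspider 2>&1 | tee -a scrapy.log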

  • 2021-01-30 19:09

    You want to use the ScrapyFileLogObserver.

    import logging
    from scrapy.log import ScrapyFileLogObserver
    
    logfile = open('testlog.log', 'w')
    log_observer = ScrapyFileLogObserver(logfile, level=logging.DEBUG)
    log_observer.start()
    

    I'm glad you asked this question; I've been wanting to do this myself.

  • 2021-01-30 19:09

    For all those folks who came here before reading the current version of the documentation:

    import logging
    from scrapy.utils.log import configure_logging
    
    # Disable Scrapy's default root handler and configure the root logger directly
    configure_logging(install_root_handler=False)
    logging.basicConfig(
        filename='log.txt',
        filemode='a',
        format='%(levelname)s: %(message)s',
        level=logging.DEBUG
    )
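
    Note that basicConfig() with a filename writes to the file only. To also keep messages on the console (an addition beyond the original answer), attach a StreamHandler to the root logger:

    # Mirror all records to stderr, alongside the file handler
    # set up by basicConfig() above
    logging.getLogger().addHandler(logging.StreamHandler())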
    