Scrapy: CSV output without header

借酒劲吻你 2021-01-21 07:15

When I use the command scrapy crawl -o , I get the output of my Item dictionary with headers. This is good. However

1 answer
  • 2021-01-21 08:12

    CsvItemExporter has an include_headers_line option (True by default), but I don't know how to set it directly. http://doc.scrapy.org/en/latest/topics/exporters.html#csvitemexporter

    But you can create your own exporter that sets include_headers_line=False, in a file exporters.py (in the same folder as settings.py and items.py):

    from scrapy.exporters import CsvItemExporter
    
    
    class HeadlessCsvItemExporter(CsvItemExporter):
    
        def __init__(self, *args, **kwargs):
            kwargs['include_headers_line'] = False
            super(HeadlessCsvItemExporter, self).__init__(*args, **kwargs)
    

    Then register this exporter in settings.py:

    FEED_EXPORTERS = {
        'csv': 'your_project_name.exporters.HeadlessCsvItemExporter',
    }
    

    Scrapy should now write the CSV file without headers:

    scrapy crawl <spider> -o <filename.csv>
    

    Or you can register it under a separate format name:

    FEED_EXPORTERS = {
        'headless': 'your_project_name.exporters.HeadlessCsvItemExporter',
    }
    

    and get CSV without headers only when you pass -t headless:

    scrapy crawl <spider> -o <filename.csv> -t headless
    

    P.S. Don't forget to use your own project name in place of your_project_name in settings.py.
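
    For newer Scrapy releases, a possible alternative that avoids the custom class is the FEEDS setting, whose item_export_kwargs key (documented as added in Scrapy 2.4) is passed to the exporter as keyword arguments. This is only a sketch: the output path "items.csv" is a placeholder, and it is worth checking the option against the documentation for your Scrapy version.

```python
# settings.py (sketch; "items.csv" is a placeholder output path)
FEEDS = {
    "items.csv": {
        "format": "csv",
        # Passed as keyword arguments to the exporter class
        # (CsvItemExporter for the "csv" format)
        "item_export_kwargs": {"include_headers_line": False},
    },
}
```

    With this in settings.py, no -o option is needed on the command line, because the feed is already declared.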


    EDIT:

    This version of the exporter skips the header line only when the output file is not empty (file.tell() > 0), for example when appending to an existing file:

    from scrapy.exporters import CsvItemExporter
    
    
    class HeadlessCsvItemExporter(CsvItemExporter):
    
        def __init__(self, *args, **kwargs):
    
            # args[0] is (opened) file handler
            # if file is not empty then skip headers
            if args[0].tell() > 0:
                kwargs['include_headers_line'] = False
    
            super(HeadlessCsvItemExporter, self).__init__(*args, **kwargs)
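
    The tell() check can be tried without Scrapy. The following is a minimal sketch using only the standard csv module; append_rows and the temporary file path are hypothetical names made up for this illustration, not part of Scrapy.

```python
import csv
import os
import tempfile


def append_rows(path, fieldnames, rows):
    """Append rows to a CSV file, writing the header only if the file is empty."""
    # newline="" is what the csv module documentation recommends for output files
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        # In append mode the position starts at the end of the file,
        # so tell() == 0 means nothing has been written yet
        if f.tell() == 0:
            writer.writeheader()
        writer.writerows(rows)


# Two separate "runs" appending to the same file
path = os.path.join(tempfile.mkdtemp(), "items.csv")
append_rows(path, ["name", "price"], [{"name": "a", "price": 1}])
append_rows(path, ["name", "price"], [{"name": "b", "price": 2}])

with open(path, newline="") as f:
    lines = f.read().splitlines()
```

    After both calls the file holds a single header line followed by the two data rows, which is the same behaviour the exporter above aims for when Scrapy appends to an existing file.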
    