How to get double quotes in Scrapy .csv results

廉价感情. Submitted on 2020-01-04 03:18:09

Question


I have a problem with quoting in Scrapy's CSV output. I am trying to scrape data that contains commas, and this results in double quotes around some columns, like so:

TEST,TEST,TEST,ON,TEST,TEST,"$2,449,000, 4,735 Sq Ft, 6 Bed, 5.1 Bath, Listed 03/01/2016"
TEST,TEST,TEST,ON,TEST,TEST,"$2,895,000, 4,975 Sq Ft, 5 Bed, 4.1 Bath, Listed 01/03/2016"

Only columns with commas get double quoted. How can I double quote all my data columns?

I want Scrapy to output:

"TEST","TEST","TEST","ON","TEST","TEST","$2,449,000, 4,735 Sq Ft, 6 Bed, 5.1 Bath, Listed 03/01/2016"
"TEST","TEST","TEST","ON","TEST","TEST","$2,895,000, 4,975 Sq Ft, 5 Bed, 4.1 Bath, Listed 01/03/2016"

Are there any settings I can change to do this?


Answer 1:


By default, for CSV output, Scrapy uses csv.writer() with its default settings.

For field quoting, the default is csv.QUOTE_MINIMAL:

Instructs writer objects to only quote those fields which contain special characters such as delimiter, quotechar or any of the characters in lineterminator.
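
To see the difference between the two quoting modes outside of Scrapy, here is a minimal sketch using only the standard library. The render() helper and the sample row are made up for illustration; they are not part of Scrapy.

```python
import csv
import io


def render(rows, quoting):
    """Write rows with csv.writer() into a string, using the given quoting mode."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=quoting)
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample row: only the last field contains the delimiter.
rows = [["TEST", "ON", "$2,449,000, 4,735 Sq Ft"]]

minimal = render(rows, csv.QUOTE_MINIMAL)  # quotes only the field with commas
quoted = render(rows, csv.QUOTE_ALL)       # quotes every field

print(minimal, end="")
print(quoted, end="")
```

With QUOTE_MINIMAL only the last field is wrapped in quotes, which matches the behaviour described in the question; with QUOTE_ALL every field is.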

But you can build your own CSV item exporter and set a new dialect, building on the default 'excel' dialect.

For example, in an exporters.py module, define the following:

import csv

from scrapy.exporters import CsvItemExporter


class QuoteAllDialect(csv.excel):
    quoting = csv.QUOTE_ALL


class QuoteAllCsvItemExporter(CsvItemExporter):

    def __init__(self, *args, **kwargs):
        # Force the quote-all dialect; extra keyword arguments reach csv.writer().
        kwargs['dialect'] = QuoteAllDialect
        super().__init__(*args, **kwargs)
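
If you want to check the dialect on its own before wiring it into Scrapy, a quick sketch like the following should do. It passes the same kind of dialect subclass straight to csv.writer(); the header and data rows are simply the values from the example spider further down, written by hand here.

```python
import csv
import io


class QuoteAllDialect(csv.excel):
    # Same idea as in exporters.py: the 'excel' defaults, but quote every field.
    quoting = csv.QUOTE_ALL


buf = io.StringIO()
writer = csv.writer(buf, dialect=QuoteAllDialect)

# Header row and one data row, typed in manually for this check.
writer.writerow(["description", "name", "title"])
writer.writerow([
    'Some description, with commas, quotes (") and all',
    "Some name",
    "Some title, baby!",
])

output = buf.getvalue()
print(output, end="")
```

Note that an embedded double quote is escaped by doubling it, because the 'excel' dialect has doublequote enabled.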

Then you simply need to reference this item exporter in your settings for CSV output, something like:

FEED_EXPORTERS = {
    'csv': 'myproject.exporters.QuoteAllCsvItemExporter',
}

And a simple spider like this:

import scrapy


class ExampleSpider(scrapy.Spider):
    name = "example"
    allowed_domains = ["example.com"]
    start_urls = ['http://example.com/']

    def parse(self, response):
        yield {
            "name": "Some name",
            "title": "Some title, baby!",
            "description": "Some description, with commas, quotes (\") and all"
        }

will output this (the column order may differ depending on your Python version):

"description","name","title"
"Some description, with commas, quotes ("") and all","Some name","Some title, baby!"


Source: https://stackoverflow.com/questions/42658875/how-to-get-double-quotes-in-scrapy-csv-results
