Scrapy: storing the data


I'm new to Python and Scrapy. I'm trying to follow the Scrapy tutorial, but I don't understand the logic of the storage step.


scrapy crawl spidername -o items

2 Answers
  • 2021-02-03 11:22

    You can view a list of available commands by typing scrapy -h from within your project directory, and the options for a single command with e.g. scrapy crawl -h.

    scrapy crawl spidername -o items.json -t json
    
    • -o specifies the output filename for dumped items (items.json)
    • -t specifies the format for dumping items (json)

    scrapy crawl spidername --set FEED_URI=output.csv --set FEED_FORMAT=csv

    • --set is used to set/override a setting
    • FEED_URI sets the storage backend for the dumped items. Here it is a plain file on the local filesystem, output.csv.
    • FEED_FORMAT sets the serialization format for the output feed (here, csv).
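
    To see concretely what the two serialization formats produce, here is a stdlib-only sketch using the plain `json` and `csv` modules (not Scrapy's actual feed exporters, and the item fields are made up) to dump the same scraped items both ways:

    ```python
    import csv
    import io
    import json

    # Hypothetical items, as a spider might yield them
    items = [
        {"title": "Quote 1", "author": "Alice"},
        {"title": "Quote 2", "author": "Bob"},
    ]

    # A JSON feed is one array of objects (FEED_FORMAT=json)
    json_feed = json.dumps(items)

    # A CSV feed is a header row plus one row per item (FEED_FORMAT=csv)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["title", "author"])
    writer.writeheader()
    writer.writerows(items)
    csv_feed = buf.getvalue()

    print(json_feed)
    print(csv_feed)
    ```

    Either way the same items come out; only the on-disk representation changes, which is exactly what FEED_FORMAT (or -t) selects.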

    References (Scrapy documentation):

    1. Available tool commands (for the command line)
    2. Feed exports
  • 2021-02-03 11:28

    --set

    Arguments provided by the command line are the ones that take precedence, overriding any other options.

    You can explicitly override one (or more) settings using the -s (or --set) command line option.

    Example:
    
        scrapy crawl myspider -s LOG_FILE=scrapy.log
    
        sets the LOG_FILE setting to `scrapy.log`
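
    The precedence rule quoted above (command-line -s values override everything else) can be illustrated with plain dicts. This is a rough sketch of the layering, not Scrapy's actual settings machinery, and the setting values are made up:

    ```python
    # Hypothetical illustration of settings precedence:
    # command-line -s options override project settings, which override defaults.
    default_settings = {"LOG_FILE": None, "FEED_FORMAT": "jsonlines"}
    project_settings = {"FEED_FORMAT": "json"}      # from settings.py
    cmdline_settings = {"LOG_FILE": "scrapy.log"}   # from: -s LOG_FILE=scrapy.log

    # Later layers win when keys collide.
    effective = {**default_settings, **project_settings, **cmdline_settings}
    print(effective)
    ```

    So -s LOG_FILE=scrapy.log takes effect regardless of what the project or defaults say, while untouched settings fall through to the lower layers.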
    

    -o

    Specifies the output filename and extension: WHERE the scraped data will be written

    Examples: 
        scrapy crawl quotes -o items.csv
        scrapy crawl quotes -o items.json
        scrapy crawl quotes -o items.xml
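
    In recent Scrapy versions the serialization format is inferred from the -o filename's extension, so -t is usually unnecessary. A rough sketch of that mapping (a hypothetical helper, not Scrapy's actual code):

    ```python
    # Extension -> feed format, as implied by the -o filename.
    EXTENSION_TO_FORMAT = {
        ".json": "json",
        ".jsonl": "jsonlines",
        ".csv": "csv",
        ".xml": "xml",
    }

    def infer_format(filename: str) -> str:
        """Return the feed format implied by the filename's extension."""
        for ext, fmt in EXTENSION_TO_FORMAT.items():
            if filename.endswith(ext):
                return fmt
        raise ValueError(f"cannot infer feed format from {filename!r}")

    print(infer_format("items.csv"))
    ```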
    

    -t

    Specifies the serialisation format or HOW the items are written

    https://www.tutorialspoint.com/scrapy/scrapy_settings.htm
