Django custom management command running Scrapy: How to include Scrapy's options?

时光取名叫无心 2021-02-08 03:41

I want to be able to run the Scrapy web crawling framework from within Django. Scrapy itself only provides a command-line tool, scrapy, to execute its commands, i.e.

2 Answers
  •  温柔的废话
    2021-02-08 04:35

    I think you're really looking for Guideline 10 of the POSIX argument syntax conventions:

    The argument -- should be accepted as a delimiter indicating the end of options. Any following arguments should be treated as operands, even if they begin with the '-' character. The -- argument should not be used as an option or as an operand.

    Python's optparse module behaves this way, even under Windows.
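
    As a minimal sketch of that behaviour, the snippet below parses an argument list with a stand-in optparse parser. The parser and its -o option are hypothetical and only play the role of the outer (manage.py) parser; they are not Django's actual option set.

```python
from optparse import OptionParser

# Hypothetical outer parser standing in for the manage.py option parser.
# It defines its own -o option to show that "--" protects the later -o.
parser = OptionParser()
parser.add_option("-o", "--output", dest="output", default=None)

argv = ["crawl", "domain.com", "--", "-o", "scraped_data.json", "-t", "json"]
options, args = parser.parse_args(argv)

# Everything after "--" is returned as a positional argument, and the
# "--" delimiter itself is dropped from the result.
print(options.output)
print(args)
```

    Without the -- in argv, the outer parser would consume -o itself and then reject -t, which it does not define.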

    I pass the Scrapy project's settings module as the first argument, so each Django app can have its own separate Scrapy project:

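    # /management/commands/scrapy.py
    # absolute_import keeps the scrapy import below from resolving to this
    # file, which is also named scrapy.py (relevant on Python 2).
    from __future__ import absolute_import
    import os
    
    from django.core.management.base import BaseCommand
    
    class Command(BaseCommand):
        def handle(self, *args, **options):
            # First positional argument: dotted path of the Scrapy settings module.
            os.environ['SCRAPY_SETTINGS_MODULE'] = args[0]
            # Imported here, after the environment variable has been set.
            from scrapy.cmdline import execute
            # scrapy ignores args[0] (it sits in the program-name slot) and
            # requires a mutable sequence, hence list().
            execute(list(args))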
    

    Invoked as follows:

    python manage.py scrapy myapp.scrapyproj.settings crawl domain.com -- -o scraped_data.json -t json
    

    Tested with Scrapy 0.12 and Django 1.3.1.
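
    To make the argument flow concrete, here is a rough sketch of what the handle() method above does with the invocation shown, using only the standard library (it imports neither Django nor Scrapy). The helper name prepare_scrapy_call and its environ parameter are hypothetical, and the tuple assumes the command receives its positional arguments with the -- already removed, as optparse does.

```python
import os

def prepare_scrapy_call(args, environ=None):
    """Hypothetical helper mirroring the body of handle() above."""
    environ = os.environ if environ is None else environ
    if not args:
        raise ValueError("expected the Scrapy settings module as the first argument")
    # The first positional argument names the Scrapy settings module.
    environ["SCRAPY_SETTINGS_MODULE"] = args[0]
    # Return a mutable list; element 0 occupies the program-name slot.
    return list(args)

# Positional arguments for the invocation above, assuming "--" was stripped.
received = ("myapp.scrapyproj.settings", "crawl", "domain.com",
            "-o", "scraped_data.json", "-t", "json")

env = {}  # a plain dict keeps the sketch from touching the real environment
argv = prepare_scrapy_call(received, environ=env)

print(env["SCRAPY_SETTINGS_MODULE"])
print(argv[1:])
```

    In the real command, the returned list is what gets passed to scrapy.cmdline.execute; per the comment in the answer's code, Scrapy ignores the first element.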
