Access Django models inside of Scrapy

执笔经年 2020-11-28 19:11

Is it possible to access my django models inside of a Scrapy pipeline, so that I can save my scraped data straight to my model?

I've seen this, but I don't really

8 answers
  • 2020-11-28 19:37

    Set the DJANGO_SETTINGS_MODULE environment variable in your Scrapy project's settings.py:

    import os
    os.environ['DJANGO_SETTINGS_MODULE'] = 'your_django_project.settings'
    

    Now you can use DjangoItem in your scrapy project.

    Edit:
    Make sure the your_django_project package (the one containing settings.py) is importable, i.e. that its parent directory is on PYTHONPATH.
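
    One way to check the PYTHONPATH requirement before starting a crawl is sketched below. The helper name settings_importable is hypothetical and not part of Scrapy or Django; it only asks Python's import machinery whether the settings module can be found.

```python
import importlib.util


def settings_importable(module_name):
    """Return True if module_name can be located on the current import path."""
    try:
        # For a dotted name, find_spec imports the parent package first,
        # so the parent's __init__.py is executed as a side effect.
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised when the parent package itself cannot be imported
        # (ModuleNotFoundError is a subclass of ImportError).
        return False


# 'your_django_project.settings' is the placeholder used in the answer above.
print(settings_importable('your_django_project.settings'))
```

    If this prints False, the directory that contains your_django_project is probably missing from PYTHONPATH (or sys.path).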

  • 2020-11-28 19:46

    Why not create an __init__.py file in the Scrapy project folder and add that package to INSTALLED_APPS? That worked for me. I was then able to simply use:

    pipelines.py

    from my_app.models import MyModel
    

    Hope that helps.
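
    A minimal sketch of what the rest of pipelines.py might look like once the import works. The class name SaveToModelPipeline and the model attribute are hypothetical, and the code assumes that every key of the scraped item is also a field on MyModel.

```python
class SaveToModelPipeline:
    """Scrapy item pipeline that stores each scraped item via a Django model."""

    # Hypothetical hook: set this to a Django model class. When left as None,
    # the model from the answer above is imported lazily.
    model = None

    def get_model(self):
        if self.model is None:
            # Deferred import: Django must already be configured at this point.
            from my_app.models import MyModel
            return MyModel
        return self.model

    def process_item(self, item, spider):
        # Assumes every item key is also a field name on the model.
        self.get_model().objects.create(**dict(item))
        # Return the item so that later pipelines still receive it.
        return item
```

    The pipeline would still have to be enabled through Scrapy's ITEM_PIPELINES setting, using whatever dotted path the class has in your project.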

  • 2020-11-28 19:50

    The opposite approach (set up Scrapy inside a Django management command):

    # -*- coding: utf-8 -*-
    # myapp/management/commands/scrapy.py 
    
    from __future__ import absolute_import
    from django.core.management.base import BaseCommand
    
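    class Command(BaseCommand):
    
        def run_from_argv(self, argv):
            # Keep the raw argv and skip Django's own option parsing,
            # so Scrapy's options are passed through untouched.
            self._argv = argv
            self.execute()
    
        def handle(self, *args, **options):
            # Imported here so Scrapy is only loaded when the command runs.
            from scrapy.cmdline import execute
            # Drop the leading manage.py entry before handing over to Scrapy.
            execute(self._argv[1:])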
    

    and in django's settings.py:

    import os
    os.environ['SCRAPY_SETTINGS_MODULE'] = 'scrapy_project.settings'
    

    Then, instead of scrapy foo, run ./manage.py scrapy foo.

    Update: fixed the code to bypass Django's option parsing.
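
    To make the argument forwarding concrete, here is a small standalone sketch of what self._argv[1:] produces. The function name scrapy_argv and the --some-option flag are hypothetical, and the snippet does not import Scrapy or Django.

```python
def scrapy_argv(manage_argv):
    """Return the argument list handed to Scrapy for a given manage.py argv."""
    # manage_argv looks like ['./manage.py', 'scrapy', 'foo', ...].
    # Dropping the first element leaves 'scrapy' in the program-name
    # position, followed by the Scrapy subcommand and its options.
    return list(manage_argv[1:])


print(scrapy_argv(['./manage.py', 'scrapy', 'foo', '--some-option']))
```

    The resulting list has the same shape as the sys.argv that a plain scrapy foo invocation would see, which is why the command can hand it straight to scrapy.cmdline.execute.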

  • 2020-11-28 19:52

    setup_environ is deprecated as of Django 1.4. For newer versions (django.setup() is available from Django 1.7), you may need to do the following in Scrapy's settings file:

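    def setup_django_env():
        import sys, os, django
    
        # Make the Django project importable and point Django at its settings.
        sys.path.append('/path/to/django/myapp')
        os.environ['DJANGO_SETTINGS_MODULE'] = 'myapp.settings'
    
        # Must run inside the function, where django is imported.
        django.setup()
    
    setup_django_env()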
    
  • 2020-11-28 19:53

    In Django 1.4 the project layout changed. Instead of /myproject/settings.py, the settings module is now at /myproject/myproject/settings.py.

    I also added path's parent directory (/myproject) to sys.path to make it work correctly.

    def setup_django_env(path):
        import imp, os, sys
        from django.core.management import setup_environ
    
        f, filename, desc = imp.find_module('settings', [path])
        project = imp.load_module('settings', f, filename, desc)       
    
        setup_environ(project)
    
        # Add path's parent directory to sys.path
        sys.path.append(os.path.abspath(os.path.join(path, os.path.pardir)))
    
    setup_django_env('/path/to/django/myproject/myproject/')
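
    The sys.path step is easy to get wrong, so here is a standalone sketch of just that path computation. The helper name project_root_for is hypothetical; the expression is the same one used in the function above.

```python
import os


def project_root_for(settings_dir):
    """Return the directory that contains the inner project package."""
    # Joining with os.path.pardir ('..') and normalising via abspath
    # moves one level up, even if settings_dir ends with a slash.
    return os.path.abspath(os.path.join(settings_dir, os.path.pardir))


# Same placeholder path as in the answer above.
print(project_root_for('/path/to/django/myproject/myproject/'))
```

    With the Django 1.4 layout, this is the outer /myproject directory, which has to be on sys.path for the inner myproject package to be importable.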
    
  • 2020-11-28 19:58

    If anyone else is having the same problem, this is how I solved it.

    I added this to my scrapy settings.py file:

    def setup_django_env(path):
        import imp, os
        from django.core.management import setup_environ
    
        f, filename, desc = imp.find_module('settings', [path])
        project = imp.load_module('settings', f, filename, desc)       
    
        setup_environ(project)
    
    setup_django_env('/path/to/django/project/')
    

    Note: the path above is to your django project folder, not the settings.py file.

    Now you will have full access to your django models inside of your scrapy project.
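
    Note that the imp module was removed in Python 3.12 and setup_environ no longer exists in current Django releases, so the function above only runs on old versions. A rough modern equivalent, assuming Django 1.7 or later, might look like the sketch below. The split into prepare_django_env and setup_django_env is my own choice, not something Django or Scrapy prescribes.

```python
import os
import sys


def prepare_django_env(project_root, settings_module):
    """Make settings_module importable and tell Django where to find it."""
    project_root = os.path.abspath(project_root)
    # Avoid adding the same directory twice if settings are loaded repeatedly.
    if project_root not in sys.path:
        sys.path.insert(0, project_root)
    os.environ['DJANGO_SETTINGS_MODULE'] = settings_module
    return project_root


def setup_django_env(project_root, settings_module):
    """Prepare the environment, then initialise Django (needs Django 1.7+)."""
    prepare_django_env(project_root, settings_module)
    # Imported late so prepare_django_env can be used without Django installed.
    import django
    django.setup()
```

    In Scrapy's settings.py this would be called once, for example as setup_django_env('/path/to/django/myapp', 'myapp.settings'), using the placeholder names from the earlier answer.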
