Multiprocessing with Django when importing an external module

痴心易碎 posted on 2020-05-16 05:56:45

Question


I built a scraping module, scraper.py, that can also download files, and I import it into my Django views. The problem is that scraper.py wraps its multiprocessing pool in an if __name__ == "__main__" check, so when the module is imported and called from a view, the pool never runs because the module is not the main script. This is the part of scraper.py that uses the pool:

def download(self, url):
    response = self._is_downloadable(url)
    if response:
        # Fall back to an empty string so re.findall never receives None
        name = response.headers.get('content-disposition', '')
        fname = re.findall('filename=(.+)', name)
        if len(fname) != 0:
            filename = fname[0]
            filename = filename.replace("\"", "")
            print(filename)
        else:
            filename = "Lecture note"
        # Stream the body to disk in 100000-byte chunks
        with open(filename, 'wb') as files:
            for chunk in response.iter_content(100000):
                files.write(chunk)

def download_course_file(self, course):
    username = self._login_data["username"]
    p = Path(f"{username}-{course}.txt").exists()
    if not p:
        self.get_download_links(course)
    statime = time.time()
    # This check is only true when scraper.py itself is run as a script
    if __name__ == "__main__":
        with Pool() as p:
            with open(f"{username}-{course}.txt", "r") as course_link:
                data = course_link.read().splitlines(False)[::2]
                p.map(self.download, data)
                print(data)
    print(f"Process done {time.time()-statime}")
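To illustrate why the check fails on import, here is a minimal sketch. __name__ is looked up in the globals of the module where the code is written, so inside an imported module it holds that module's own name and never "__main__". The module name scraper_demo and the helpers load_module and guard_passes are hypothetical, made up for this demonstration only.

```python
import importlib.util
import os
import tempfile

# Hypothetical stand-in for scraper.py: it only reports whether the guard passes.
MODULE_SOURCE = '''
def guard_passes():
    # __name__ is resolved in this module's globals, not the caller's
    return __name__ == "__main__"
'''


def load_module(name, source):
    """Write source to a temporary file and import it under the given name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, f"{name}.py")
        with open(path, "w") as fh:
            fh.write(source)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module


demo = load_module("scraper_demo", MODULE_SOURCE)
print(demo.__name__)        # the imported module's own name
print(demo.guard_passes())  # False, because the module was imported
```

Under Django the entry point is manage.py or the WSGI server, so the same comparison inside scraper.py evaluates to False and the pool block is skipped.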

scraper.py is imported in the views and then run like this:

import scraper
from django.http import JsonResponse
from django.shortcuts import get_object_or_404

def download_course(request, id):
    course = get_object_or_404(Course, id=id)
    course_name = course.course_name[:6]
    person, error = create_session(request)
    if "invalid" in error:
        data = {"error": error}
        return JsonResponse(data)
    person.download_course_file(course_name)
    data = {"success": "Your notes are being downloaded"}
    return JsonResponse(data)

PS: create_session is a function for initialising the scraper object with a username and password.

Is there a workaround for this __name__ check? And even if there isn't, can I simply remove it when I deploy, as long as the server doesn't run Windows?
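One possible direction, offered as a sketch under assumptions and not taken from the original post: the downloads are I/O-bound, so a thread pool from multiprocessing.pool.ThreadPool could stand in for the process pool. Threads do not start a new interpreter or re-import any module, so no __name__ check is involved and the behaviour does not depend on the operating system. The names fetch and download_all are hypothetical; fetch stands in for the scraper's download method and performs no network access here.

```python
from multiprocessing.pool import ThreadPool


def fetch(url):
    # Hypothetical stand-in for the scraper's download(url) method.
    return f"downloaded {url}"


def download_all(urls, workers=4):
    """Run fetch over urls with a thread pool.

    No __main__ guard is used: worker threads share the current
    interpreter and never re-import the calling module.
    """
    with ThreadPool(workers) as pool:
        # map preserves the order of the input list
        return pool.map(fetch, urls)


links = ["https://example.com/a.pdf", "https://example.com/b.pdf"]
print(download_all(links))
```

Whether threads are fast enough depends on the workload. If a process pool is really required, the guard belongs around the top-level code of the script that launches Python, not inside a method of an imported module.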

Source: https://stackoverflow.com/questions/61566737/multiprocessing-with-django-when-importing-a-external-module
