Python web crawler downloading files

孤街浪徒 2021-01-15 12:26

I have a web crawler that searches for certain files and downloads them, but how do I download a PDF file when the "Save As or Open" dialog pops up? I am currently using

1 answer
  •  轻奢々 (OP)
    2021-01-15 12:58

    You need to modify the preferences of your Firefox profile. To stop that dialog from appearing, set the browser.helperApps.neverAsk.saveToDisk preference on the profile in use. For example (note that this is for CSV/Excel files; I believe your type would be 'application/pdf'):

    from selenium import webdriver

    profile = webdriver.firefox.firefox_profile.FirefoxProfile()
    # MIME types listed here are saved to disk without asking
    profile.set_preference('browser.helperApps.neverAsk.saveToDisk', ('text/csv,'
                                                                      'application/csv,'
                                                                      'application/msexcel'))
    

    For your case (I haven't tested this with a PDF, so take it with a grain of salt :) ), you could try this:

    profile = webdriver.firefox.firefox_profile.FirefoxProfile()
    profile.set_preference('browser.helperApps.neverAsk.saveToDisk', ('application/pdf'))
    

    The second argument is a single comma-separated string of the MIME types that should never trigger a Save As prompt (the parentheses only group the string literals; they do not create a tuple). You then pass this profile into your browser:

    browser = webdriver.Firefox(firefox_profile=profile)
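
    One detail worth checking, as a small sketch: in Python, parentheses around adjacent string literals without a trailing comma produce one str, not a tuple, so the value handed to set_preference is a comma-separated string of MIME types. The helper join_mime_types is a hypothetical name introduced here for illustration; it is not part of Selenium.

```python
# Adjacent string literals inside parentheses are concatenated into ONE str.
# Without a trailing comma, the parentheses do not create a tuple.
csv_types = ('text/csv,'
             'application/csv,'
             'application/msexcel')
pdf_type = ('application/pdf')


def join_mime_types(types):
    """Build the comma-separated preference value from a list of MIME types."""
    return ",".join(types)


print(type(csv_types).__name__, csv_types)
print(join_mime_types(["application/pdf", "text/csv"]))
```

    Building the value from a list this way may be easier to maintain than hand-written string concatenation when several file types are involved.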
    

    Now when you download a file whose type is in that list, it should bypass the prompt and be saved to your default download directory. To change the directory the file downloads to, set two more preferences in the same way (do this before attaching the profile to the browser):

    # 2 tells Firefox to use the custom directory set in browser.download.dir
    profile.set_preference('browser.download.folderList', 2)
    profile.set_preference('browser.download.dir', '/path/to/your/dir')
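
    Putting the steps above together, here is a minimal sketch, assuming a recent Selenium release in which preferences are set on an Options object instead of a FirefoxProfile passed as firefox_profile. build_download_prefs and make_firefox are hypothetical helper names. pdfjs.disabled is an extra preference not mentioned above, added on the assumption that Firefox's built-in PDF viewer would otherwise open the file instead of saving it. Like the snippets above, this has not been tested against a real PDF download.

```python
import os


def build_download_prefs(download_dir, mime_types):
    """Return the Firefox preferences discussed above as a plain dict."""
    return {
        # 2 = use the custom directory given in browser.download.dir
        "browser.download.folderList": 2,
        "browser.download.dir": os.path.abspath(download_dir),
        # Comma-separated MIME types that are saved without a prompt
        "browser.helperApps.neverAsk.saveToDisk": ",".join(mime_types),
        # Assumption: turn off the built-in PDF viewer so PDFs get downloaded
        "pdfjs.disabled": True,
    }


def make_firefox(download_dir, mime_types=("application/pdf",)):
    """Start Firefox with those preferences (needs selenium and geckodriver)."""
    # Imported lazily so build_download_prefs works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for name, value in build_download_prefs(download_dir, mime_types).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)
```

    With this in place, browser = make_firefox('/path/to/your/dir') would take the place of the profile-based setup, and keeping the preferences in a plain dict makes them easy to inspect before a browser is launched.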
    
