Can't store downloaded files in their respective folders

再見小時候 2021-01-12 08:02

I've written a script in Python in combination with Selenium to download a few document files (ending with .doc) from a webpage. The reason I do not wish to use request

3 Answers
  • 2021-01-12 08:09

    Set these preferences when creating the driver object. This example is for Java; Python has a similar way to do it. Chrome will then download every file to the specified location.

        // Create the preferences object
        HashMap<String, Object> chromePrefs = new HashMap<String, Object>();
        // Set the download path
        chromePrefs.put("download.default_directory", "C:\\Reports\\AutomaionDownloads");
        chromePrefs.put("download.directory_upgrade", true);
        ChromeOptions options = new ChromeOptions();
        options.setExperimentalOption("prefs", chromePrefs);
        // Create the Chrome driver with these options
        WebDriver driver = new ChromeDriver(options);
    
  • 2021-01-12 08:25

    I just added a rename of the file to move it. It works just as you have it, but once the file has downloaded, it is moved to the correct path:

    os.rename(desk_location + '\\' + filename, file_location)

    Full Code:

    import os
    import time
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    
    link = 'https://www.online-convert.com/file-format/doc'
    
    dirf = os.path.expanduser('~')
    desk_location = os.path.join(dirf, 'Desktop', 'file_folder')
    if not os.path.exists(desk_location):
        os.mkdir(desk_location)
    
    def download_files():
        driver.get(link)
        # find_elements(By.CSS_SELECTOR, ...) replaces find_elements_by_css_selector, which Selenium 4 removed
        for item in driver.find_elements(By.CSS_SELECTOR, "a[href$='.doc']")[:2]:
            filename = item.get_attribute("href").split("/")[-1]
            # Name a folder after the file (without its extension) to hold the download
            folder_name = filename.split(".")[0]
            # Full path of the folder to create
            new_location = os.path.join(desk_location, folder_name)
            if not os.path.exists(new_location):
                os.mkdir(new_location)
            # Final path of the downloaded file inside its folder
            file_location = os.path.join(new_location, filename)
            item.click()
    
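            # Chrome saves into desk_location first, so wait for the file to appear there
            downloaded = os.path.join(desk_location, filename)
            time_to_wait = 10
            time_counter = 0
    
            while not os.path.exists(downloaded) and time_counter < time_to_wait:
                time.sleep(1)
                time_counter += 1
    
            # Move the finished download into its own folder
            if os.path.exists(downloaded):
                os.rename(downloaded, file_location)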
    
    if __name__ == '__main__':
        chromeOptions = webdriver.ChromeOptions()
        prefs = {
            'download.default_directory': desk_location,
            'profile.default_content_setting_values.automatic_downloads': 1,
        }
        chromeOptions.add_experimental_option('prefs', prefs)
        # options= replaces the deprecated chrome_options= keyword
        driver = webdriver.Chrome(options=chromeOptions)
        download_files()
    
  • 2021-01-12 08:28

    Use the pathlib library in Python 3 (or the pathlib2 library in Python 2) to handle paths. It gives you an object-oriented way to work with files and directories. It also has PurePath objects, which can manipulate paths without touching the filesystem.
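
    As a rough illustration of the pathlib approach, here is a minimal sketch that moves a downloaded file into a folder named after it. The helper name `move_into_own_folder` is hypothetical and not part of the question's script. It assumes the file has an extension and has already finished downloading. Note that `Path.stem` drops only the last suffix, whereas the question's `split(".")[0]` cuts at the first dot.

```python
import shutil
from pathlib import Path


def move_into_own_folder(downloaded: Path) -> Path:
    """Move a file into a sibling folder named after its stem; return the new path.

    Hypothetical helper: assumes `downloaded` exists and has an extension,
    so the folder name (the stem) differs from the file name.
    """
    # e.g. .../file_folder/example.doc -> .../file_folder/example/
    target_dir = downloaded.parent / downloaded.stem
    target_dir.mkdir(parents=True, exist_ok=True)

    # Keep the original file name inside the new folder
    target = target_dir / downloaded.name
    shutil.move(str(downloaded), str(target))
    return target
```

    With the question's variables, this could be called as `move_into_own_folder(Path(desk_location) / filename)` once the download has appeared in `desk_location`.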
