Download file from web in Python 3

星月不相逢 2020-11-22 16:43

I am creating a program that will download a .jar (Java) file from a web server, by reading the URL that is specified in the .jad file of the same game/application. I'm usi

9 Answers
  • 2020-11-22 17:12
    from urllib import request
    
    def get(url):
        with request.urlopen(url) as r:
            return r.read()
    
    
    def download(url, file=None):
        if not file:
            file = url.split('/')[-1]
        with open(file, 'wb') as f:
            f.write(get(url))
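
    One caveat with url.split('/')[-1]: if the URL carries a query string, the query ends up in the file name. A minimal sketch of a more careful default follows. The helper name filename_from_url and the fallback name 'download' are assumptions for illustration, not part of the answer above.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default='download'):
    """Return the last path segment of url, ignoring query and fragment."""
    path = urlsplit(url).path  # drops '?query' and '#fragment'
    name = PurePosixPath(unquote(path)).name  # decode %xx escapes, keep last segment
    return name or default  # a URL such as 'https://example.com/' has no file name


print(filename_from_url('https://www.python.org/static/img/python-logo.png'))
print(filename_from_url('https://example.com/files/foo.zip?version=2'))
```

    The result could then be used in place of url.split('/')[-1] when no file name is passed to download().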
    
  • 2020-11-22 17:13

    If you are on Linux, you can call the wget command-line tool from Python. Here is a sample snippet:

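    import subprocess
    url = 'http://www.example.com/foo.zip'
    # Pass the arguments as a list so the URL is never interpreted by a shell.
    subprocess.run(['wget', url], check=True)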
    
  • 2020-11-22 17:16

    Motivation

    Sometimes we want to fetch an image without saving it to a file, i.e., download the data and keep it in memory.

    For example, suppose I have trained a machine-learning model that recognizes numbers (such as bar codes) in images. When I crawl websites that contain such images, I want to run the model on them without saving the pictures to my disk drive.

    In that case, the method below keeps the downloaded data in memory.

    Points

    import requests
    from io import BytesIO
    response = requests.get(url)  # url: the address of the resource to fetch
    with BytesIO() as io_obj:  # BytesIO must be instantiated before use
        for chunk in response.iter_content(chunk_size=4096):
            io_obj.write(chunk)
    

    This is basically the same approach as @Ranvijay Kumar's answer.

    An Example

    import requests
    from typing import NewType, TypeVar
    from io import StringIO, BytesIO
    import matplotlib.pyplot as plt
    import imageio
    
    URL = NewType('URL', str)
    T_IO = TypeVar('T_IO', StringIO, BytesIO)
    
    
    def download_and_keep_on_memory(url: URL, headers=None, timeout=None, **option) -> T_IO:
        chunk_size = option.get('chunk_size', 4096)  # default 4KB
        max_size = 1024 ** 2 * option.get('max_size', -1)  # MB, default will ignore.
        response = requests.get(url, headers=headers, timeout=timeout)
        if response.status_code != 200:
            raise requests.ConnectionError(f'{response.status_code}')
    
        # iter_content yields bytes unless decode_unicode=True, so this selects BytesIO here.
        instance_io = StringIO if isinstance(next(response.iter_content(chunk_size=1)), str) else BytesIO
        io_obj = instance_io()
        cur_size = 0
        for chunk in response.iter_content(chunk_size=chunk_size):
            cur_size += len(chunk)  # count the bytes actually received, not the requested chunk size
            if 0 < max_size < cur_size:
                break
            io_obj.write(chunk)
        io_obj.seek(0)
        """ save it to real file.
        with open('temp.png', mode='wb') as out_f:
            out_f.write(io_obj.read())
        """
        return io_obj
    
    
    def main():
        headers = {
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
            'Accept-Encoding': 'gzip, deflate',
            'Accept-Language': 'zh-TW,zh;q=0.9,en-US;q=0.8,en;q=0.7',
            'Cache-Control': 'max-age=0',
            'Connection': 'keep-alive',
            'Host': 'statics.591.com.tw',
            'Upgrade-Insecure-Requests': '1',
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36'
        }
        io_img = download_and_keep_on_memory(URL('http://statics.591.com.tw/tools/showPhone.php?info_data=rLsGZe4U%2FbphHOimi2PT%2FhxTPqI&type=rLEFMu4XrrpgEw'),
                                             headers,  # You may need this; otherwise some websites respond with a 404 error.
                                             max_size=4)  # max loading < 4MB
        with io_img:
            plt.rc('axes.spines', top=False, bottom=False, left=False, right=False)
            plt.rc(('xtick', 'ytick'), color=(1, 1, 1, 0))  # same of plt.axis('off')
            plt.imshow(imageio.imread(io_img, as_gray=False, pilmode="RGB"))
            plt.show()
    
    
    if __name__ == '__main__':
        main()
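
    The size cap in the loop above can also be isolated into a small helper that is easy to test without a network connection. This is only a sketch: the name read_capped and its (buffer, truncated) return value are assumptions made for illustration, and the limit is given in bytes rather than in MB.

```python
from io import BytesIO


def read_capped(chunks, max_bytes=None):
    """Collect byte chunks into a BytesIO, stopping before max_bytes would be exceeded.

    Returns (buffer, truncated); truncated is True if some data was left unread.
    """
    buffer = BytesIO()
    truncated = False
    for chunk in chunks:
        # buffer.tell() is the number of bytes written so far.
        if max_bytes is not None and buffer.tell() + len(chunk) > max_bytes:
            truncated = True
            break
        buffer.write(chunk)
    buffer.seek(0)  # rewind so the caller can read from the start
    return buffer, truncated


buf, cut = read_capped([b'abcd', b'efgh', b'ij'], max_bytes=8)
print(buf.getvalue(), cut)
```

    With requests, response.iter_content(chunk_size=4096) could be passed as chunks. Making the request with stream=True keeps the body from being fetched in full before the loop starts.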
    
    
  • 2020-11-22 17:17

    I hope I understood the question correctly: how do you download a file from a server when the URL is stored in a string?

    I download files and save them locally using the code below:

    import requests
    
    url = 'https://www.python.org/static/img/python-logo.png'
    fileName = r'D:\Python\dwnldPythonLogo.png'  # raw string keeps the backslashes literal
    req = requests.get(url)
    # The with block closes the file even if a write fails.
    with open(fileName, 'wb') as file:
        for chunk in req.iter_content(100000):
            file.write(chunk)
    
  • 2020-11-22 17:24

    You can use the wget package from PyPI, which is named after the popular command-line download tool: https://pypi.python.org/pypi/wget. This is probably the simplest method, since you do not need to open the destination file yourself. Here is an example.

    import wget
    url = 'https://i1.wp.com/python3.codes/wp-content/uploads/2015/06/Python3-powered.png?fit=650%2C350'
    wget.download(url, '/Users/scott/Downloads/cat4.jpg')
    
  • 2020-11-22 17:27

    Here we can use urllib's legacy interface in Python 3:

    The following functions and classes are ported from the Python 2 module urllib (as opposed to urllib2). They might become deprecated at some point in the future.

    Example (two lines of code):

    import urllib.request
    
    url = 'https://www.python.org/static/img/python-logo.png'
    urllib.request.urlretrieve(url, "logo.png")
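
    Since the legacy functions might become deprecated, a non-legacy equivalent may be worth having at hand. The sketch below streams the response to disk with urlopen and shutil.copyfileobj. The function name download_to_file and the 30-second default timeout are assumptions for illustration.

```python
import shutil
from urllib.request import urlopen


def download_to_file(url, path, timeout=30):
    """Stream the body of url into path without loading it all into memory."""
    with urlopen(url, timeout=timeout) as response, open(path, 'wb') as out_file:
        shutil.copyfileobj(response, out_file)  # copies in fixed-size blocks
    return path


# Example call (performs a real network request, so it is left commented out):
# download_to_file('https://www.python.org/static/img/python-logo.png', 'logo.png')
```

    Unlike reading the whole response with read(), this copies the data block by block, which helps with large files.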
    