Creating URLs in a loop

﹥>﹥吖頭↗ 提交于 2021-01-29 05:22:58

问题


I am trying to create a list of URLs using a for loop. It prints all the correct URLs, but is not saving them in a list. Ultimately I want to download multiple files using urlretrieve.

for i, j in zip(range(0, 17), range(1, 18)):
    if i < 8 or j < 10:
        url = "https://Here is a URL/P200{}".format(i) + "-0{}".format(j) + ".xls"
        print(url)
    if i == 9 and j == 10:
        url = "https://Here is a URL/P200{}".format(i) + "-{}".format(j) + ".xls"
        print(url)
    if i > 9:
        if i > 9 or j < 8:
            url = "https://Here is a URL/P20{}".format(i) + "-{}".format(j) + ".xls"
            print(url)

Output of above code is:

https://Here is a URL/P2000-01.xls
https://Here is a URL/P2001-02.xls
https://Here is a URL/P2002-03.xls
https://Here is a URL/P2003-04.xls
https://Here is a URL/P2004-05.xls
https://Here is a URL/P2005-06.xls
https://Here is a URL/P2006-07.xls
https://Here is a URL/P2007-08.xls
https://Here is a URL/P2008-09.xls
https://Here is a URL/P2009-10.xls
https://Here is a URL/P2010-11.xls
https://Here is a URL/P2011-12.xls
https://Here is a URL/P2012-13.xls
https://Here is a URL/P2013-14.xls
https://Here is a URL/P2014-15.xls
https://Here is a URL/P2015-16.xls
https://Here is a URL/P2016-17.xls

But this:

url

gives only:

'https://Here is a URL/P2016-17.xls'

How do I get all the URLs, not just the final one?


回答1:


There are several things that could significantly simplify your code. First of all, this:

"https://Here is a URL/P200{}".format(i) + "-0{}".format(j) + ".xls"

could be simplified to this:

"https://Here is a URL/P200{}-0{}.xls".format(i, j)

And if you have at least Python 3.6, you could use an f-string instead:

f"https://Here is a URL/P200{i}-0{j}.xls"

Second of all, Python strings have a builtin zfill method that automatically handles filling in zeroes on the left to a specified length. Additionally, range starts from zero by default.

So your entire original code is equivalent to:

for num in range(17):
    first = str(num).zfill(2)
    second = str(num + 1).zfill(2)
    print(f'https://Here is a URL/P20{first}-{second}.xls')

Now, you want to actually use these URLs, not just print them out. You mentioned building a list, which can be done like so:

urls = []
for num in range(17):
    first = str(num).zfill(2)
    second = str(num + 1).zfill(2)
    urls.append(f'https://Here is a URL/P20{first}-{second}.xls')

Based on your comments here and on your other question, you seem to be confused about what form you need these URLs to be in. Strings like this are already what you need. urlretrieve accepts the URL as a string, so you don't need to do any further processing. See the example in the docs:

local_filename, headers = urllib.request.urlretrieve('http://python.org/')
html = open(local_filename)
html.close()

However, I would recommend not using urlretrieve, for two reasons.

  1. As the documentation mentions, urlretrieve is a legacy method that may become deprecated. If you're going to use urllib, use the urlopen method instead.

  2. However, as Paul Becotte mentioned in an answer to your other question: if you're looking to fetch URLs, I would recommend installing and using Requests instead of urllib. It's more user-friendly.

Regardless of which method you choose, again, strings are fine. Here's code that that uses Requests to download each of the specified spreadsheets to your current directory:

import requests

base_url = 'https://Here is a URL/'

for num in range(17):
    first = str(num).zfill(2)
    second = str(num + 1).zfill(2)
    filename = f'P20{first}-{second}.xls'
    xls = requests.get(base_url + filename)
    with open(filename, 'wb') as f:
        f.write(xls.content)



回答2:


You are overriding the results of the URL with final URL. you need to maintain a final list and keep adding new values to the list

import urllib.parse
url=[];
for i,j in zip(range(0,17),range(1,18)):
    if(i<8 or j<10):
        url.append("https://Here is a URL/P200{}".format(i)+"-0{}".format(j)+".xls")
    if(i==9 and  j==10):
        url.append("https://Here is a URL/P200{}".format(i)+"-{}".format(j)+".xls") 
    if(i>9):
        if((i>9) or (j<8)):
            url.append("https://Here is a URL/P20{}".format(i)+"-{}".format(j)+".xls")

for urlValue in url:
            print(urllib.parse.quote(urlValue))


来源:https://stackoverflow.com/questions/65843258/creating-urls-in-a-loop

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!