Question
I am trying to create a list of URLs using a for loop. It prints all the correct URLs, but is not saving them in a list. Ultimately I want to download multiple files using urlretrieve.
for i, j in zip(range(0, 17), range(1, 18)):
    if i < 8 or j < 10:
        url = "https://Here is a URL/P200{}".format(i) + "-0{}".format(j) + ".xls"
        print(url)
    if i == 9 and j == 10:
        url = "https://Here is a URL/P200{}".format(i) + "-{}".format(j) + ".xls"
        print(url)
    if i > 9:
        if i > 9 or j < 8:
            url = "https://Here is a URL/P20{}".format(i) + "-{}".format(j) + ".xls"
            print(url)
Output of above code is:
https://Here is a URL/P2000-01.xls
https://Here is a URL/P2001-02.xls
https://Here is a URL/P2002-03.xls
https://Here is a URL/P2003-04.xls
https://Here is a URL/P2004-05.xls
https://Here is a URL/P2005-06.xls
https://Here is a URL/P2006-07.xls
https://Here is a URL/P2007-08.xls
https://Here is a URL/P2008-09.xls
https://Here is a URL/P2009-10.xls
https://Here is a URL/P2010-11.xls
https://Here is a URL/P2011-12.xls
https://Here is a URL/P2012-13.xls
https://Here is a URL/P2013-14.xls
https://Here is a URL/P2014-15.xls
https://Here is a URL/P2015-16.xls
https://Here is a URL/P2016-17.xls
But this:
url
gives only:
'https://Here is a URL/P2016-17.xls'
How do I get all the URLs, not just the final one?
Answer 1:
There are several things that could significantly simplify your code. First of all, this:
"https://Here is a URL/P200{}".format(i) + "-0{}".format(j) + ".xls"
could be simplified to this:
"https://Here is a URL/P200{}-0{}.xls".format(i, j)
And if you have at least Python 3.6, you could use an f-string instead:
f"https://Here is a URL/P200{i}-0{j}.xls"
Second of all, Python strings have a builtin zfill method that automatically handles filling in zeroes on the left to a specified length. Additionally, range starts from zero by default.
So your entire original code is equivalent to:
for num in range(17):
    first = str(num).zfill(2)
    second = str(num + 1).zfill(2)
    print(f'https://Here is a URL/P20{first}-{second}.xls')
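As a quick sanity check of that equivalence, the sketch below collects the URLs from the original branching logic and from the zfill version into lists and compares them. The function names original_urls and simplified_urls are made up for illustration, and BASE is just the question's placeholder address.

```python
BASE = "https://Here is a URL/"  # placeholder address from the question


def original_urls():
    """Reproduce the question's branching logic, collecting instead of printing."""
    urls = []
    for i, j in zip(range(0, 17), range(1, 18)):
        if i < 8 or j < 10:
            urls.append(BASE + "P200{}-0{}.xls".format(i, j))
        if i == 9 and j == 10:
            urls.append(BASE + "P200{}-{}.xls".format(i, j))
        if i > 9:
            # The question's inner test "i > 9 or j < 8" is always true here.
            urls.append(BASE + "P20{}-{}.xls".format(i, j))
    return urls


def simplified_urls():
    """Build the same URLs by zero-padding both numbers to two digits."""
    urls = []
    for num in range(17):
        first = str(num).zfill(2)
        second = str(num + 1).zfill(2)
        urls.append(f"{BASE}P20{first}-{second}.xls")
    return urls


# Prints True when both approaches yield the same 17 URLs in the same order.
print(original_urls() == simplified_urls())
```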
Now, you want to actually use these URLs, not just print them out. You mentioned building a list, which can be done like so:
urls = []
for num in range(17):
    first = str(num).zfill(2)
    second = str(num + 1).zfill(2)
    urls.append(f'https://Here is a URL/P20{first}-{second}.xls')
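The same list can also be built in a single expression. This is a sketch of an alternative rather than part of the original answer: it uses a list comprehension and the :02d format specifier, which for the non-negative numbers used here pads to two digits just as zfill(2) does.

```python
# Build all 17 URLs in one list comprehension.
# {num:02d} pads the integer to two digits (0 -> "00", 16 -> "16").
urls = [
    f"https://Here is a URL/P20{num:02d}-{num + 1:02d}.xls"
    for num in range(17)
]

print(len(urls))
print(urls[0])
print(urls[-1])
```

Whether the explicit loop or the comprehension reads better is a matter of taste; both leave every URL available in urls after the loop finishes.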
Based on your comments here and on your other question, you seem to be confused about what form you need these URLs to be in. Strings like this are already what you need. urlretrieve accepts the URL as a string, so you don't need to do any further processing. See the example in the docs:
import urllib.request
local_filename, headers = urllib.request.urlretrieve('http://python.org/')
html = open(local_filename)
html.close()
However, I would recommend not using urlretrieve, for two reasons.
1. As the documentation mentions, urlretrieve is a legacy method that may become deprecated. If you're going to use urllib, use the urlopen method instead.
2. However, as Paul Becotte mentioned in an answer to your other question: if you're looking to fetch URLs, I would recommend installing and using Requests instead of urllib. It's more user-friendly.
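If you do stay with the standard library, a download via urlopen might look like the following minimal sketch. The helper name download and its timeout default are assumptions made for illustration; they are not from the question or from the urllib documentation.

```python
import shutil
import urllib.request


def download(url, filename, timeout=30):
    """Stream the response body at `url` into a local file (hypothetical helper)."""
    # urlopen returns a file-like response object; copyfileobj copies it
    # to disk in chunks instead of reading the whole body into memory.
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(filename, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return filename
```

It would be called once per URL string, for example download(url, 'P2000-01.xls'), with a real address in place of the question's placeholder.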
Regardless of which method you choose, again, strings are fine. Here's code that uses Requests to download each of the specified spreadsheets to your current directory:
import requests
base_url = 'https://Here is a URL/'
for num in range(17):
    first = str(num).zfill(2)
    second = str(num + 1).zfill(2)
    filename = f'P20{first}-{second}.xls'
    xls = requests.get(base_url + filename)
    with open(filename, 'wb') as f:
        f.write(xls.content)
Answer 2:
You are overwriting url on every iteration, so only the final URL is left when the loop ends. You need to maintain a list and append each new value to it:
import urllib.parse
url = []
for i, j in zip(range(0, 17), range(1, 18)):
    if i < 8 or j < 10:
        url.append("https://Here is a URL/P200{}".format(i) + "-0{}".format(j) + ".xls")
    if i == 9 and j == 10:
        url.append("https://Here is a URL/P200{}".format(i) + "-{}".format(j) + ".xls")
    if i > 9:
        if i > 9 or j < 8:
            url.append("https://Here is a URL/P20{}".format(i) + "-{}".format(j) + ".xls")
for urlValue in url:
    print(urllib.parse.quote(urlValue))
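One caveat about that final loop, shown here as a small sketch: with its default safe='/' argument, urllib.parse.quote percent-encodes the colon after the scheme as well as the spaces, so the printed strings no longer start with "https://". Passing safe=':/' keeps that prefix intact. The sample URL is the question's placeholder, and the variable names are made up for illustration.

```python
import urllib.parse

url = "https://Here is a URL/P2000-01.xls"

# Default safe="/" leaves slashes alone but encodes ":" and spaces.
default_quoted = urllib.parse.quote(url)
print(default_quoted)
# -> https%3A//Here%20is%20a%20URL/P2000-01.xls

# Marking ":" as safe too preserves the "https://" prefix.
scheme_preserved = urllib.parse.quote(url, safe=":/")
print(scheme_preserved)
# -> https://Here%20is%20a%20URL/P2000-01.xls
```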
Source: https://stackoverflow.com/questions/65843258/creating-urls-in-a-loop