Question
Hello, I am new to Python and am trying to figure out why my list overwrites the previous elements every time a new page is loaded and scraped during the while loop. Thank you in advance.
def scrapeurls():
    domain = "https://domain234dd.com"
    count = 0
    while count < 10:
        page = requests.get("{}{}".format(domain, count))
        soup = BeautifulSoup(page.content, 'html.parser')
        data = soup.findAll('div', attrs={'class': 'video'})
        urls = []
        for div in data:
            links = div.findAll('a')
            for a in links:
                urls.append(a['href'])
                print(a['href'])
        print(count)
        count += 1
Answer 1:
Because you reset urls to an empty list on every iteration of the loop. Move the urls = [] assignment to before the loop.
(Note, the whole thing would be better expressed as a for loop.)
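As a rough sketch of the for-loop suggestion: the manual counter can be replaced by range, with the list created once before the loop. The names collect_urls, fetch_links and fake_fetch_links are made up for illustration and are not from the question; fake_fetch_links stands in for the requests/BeautifulSoup step so the example runs without a network connection.

```python
def collect_urls(fetch_links, pages=10):
    """Gather links from pages 0..pages-1 into a single list."""
    urls = []  # created once, before the loop
    for count in range(pages):  # range replaces the manual counter
        urls.extend(fetch_links(count))
    return urls


# Hypothetical stand-in for the scraping step, so the sketch runs offline.
def fake_fetch_links(count):
    return ["/video/{}".format(count)]


print(collect_urls(fake_fetch_links, pages=3))
```

In the real scraper, fetch_links would be whatever function downloads page number count and returns the href values found on it.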
Answer 2:
You need to initialize the URL list before the loop. If you initialize it inside the loop, it is reset to an empty list on every iteration.
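A small illustration of the difference, using a made-up nested list in place of scraped pages (the data and the names inside and outside are assumptions for the demo only):

```python
pages = [["a", "b"], ["c"], ["d", "e"]]  # made-up stand-in for the links on each page

# Initialized inside the loop: the name is rebound to a fresh list on every pass,
# so only the last page's items are left at the end.
for page in pages:
    inside = []
    inside.extend(page)

# Initialized once before the loop: the same list keeps growing.
outside = []
for page in pages:
    outside.extend(page)

print(inside)   # ['d', 'e']
print(outside)  # ['a', 'b', 'c', 'd', 'e']
```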
Answer 3:
import requests
from bs4 import BeautifulSoup

domain = "https://domain234dd.com"
count = 0
urls = []  # initialized once, before the loop, so results accumulate
while count < 10:
    page = requests.get("{}{}".format(domain, count))
    soup = BeautifulSoup(page.content, 'html.parser')
    data = soup.findAll('div', attrs={'class': 'video'})
    for div in data:
        links = div.findAll('a')
        for a in links:
            urls.append(a['href'])
            print(a['href'])
    print(count)
    count += 1
Source: https://stackoverflow.com/questions/46565752/python-previous-list-elements-being-overwritten-by-new-elements-during-while-l