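Yesterday a classmate asked me to help write a simple scraper for product attributes on JD.com (京东). The requirement was modest: 500 product records would be enough.
I used bs4 and requests, with no scraping framework.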
import csv

import requests
from bs4 import BeautifulSoup

# Step 1: collect the SKU ids from list pages 1 to 9 of the category.
sku = []
for i in range(1, 10):
    print(i)
    res = requests.get('https://list.jd.com/list.html?cat=9987,653,655&page=' + str(i) + '&sort=sort_rank_asc&trans=1&JL=6_0_0&ms=10#J_main')
    soup = BeautifulSoup(res.text, 'html.parser')
    items = soup.find_all(class_="gl-item")
    for item in items:
        data = item.find(class_='gl-i-wrap j-sku-item')
        sku.append(data['data-sku'])

# Step 2: fetch each product page and write its attributes as one CSV row.
# The file is opened once; newline="" is what the csv module expects.
with open("/Users/liulingzhi/Desktop/recipe.csv", "a", newline="", encoding="utf-8") as csv_file:
    writer = csv.writer(csv_file)
    for i, sku_id in enumerate(sku):
        print(i)
        res = requests.get('https://item.jd.com/' + str(sku_id) + '.html')
        soup = BeautifulSoup(res.text, 'html.parser')
        item = soup.find_all(class_="parameter2 p-parameter-list")[0]
        # Each <li> in the parameter list holds the text of one attribute.
        columns = [li.text for li in item.find_all('li')]
        writer.writerow(columns)
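The script above writes each product's raw `li` texts as one row, so the same attribute can land in different columns for different products. Below is a minimal sketch of one way to line the columns up, assuming each attribute string has the form "name:value" with either an ASCII or a full-width colon (an assumption, not something the original script checks). The helper names `parse_attribute` and `rows_to_csv` and the sample rows are hypothetical, made up for illustration.

```python
import csv
import io


def parse_attribute(text):
    """Split one attribute string into (name, value) at the first colon.

    Assumption: attributes look like "name:value" or "name：value".
    Strings without a colon are returned as (text, '').
    """
    text = text.strip()
    positions = [p for p in (text.find(':'), text.find('：')) if p != -1]
    if not positions:
        return text, ''
    cut = min(positions)
    return text[:cut].strip(), text[cut + 1:].strip()


def rows_to_csv(rows):
    """Turn lists of raw attribute strings into CSV text with a shared header."""
    records = [dict(parse_attribute(t) for t in row) for row in rows]
    # Build the header from every attribute name, in order of first appearance.
    fieldnames = []
    for record in records:
        for name in record:
            if name not in fieldnames:
                fieldnames.append(name)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical sample data standing in for the scraped li texts.
sample_rows = [
    ['Name: Phone A', 'Weight: 180g'],
    ['Name: Phone B', 'Origin: China'],
]
csv_text = rows_to_csv(sample_rows)
print(csv_text)
```

In the scraper, the `columns` lists could be collected into one list and passed to `rows_to_csv` at the end. The trade-off is that all rows are held in memory until the header is known, which is fine for about 500 products.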
The final scraped result:
Source: 51CTO
Author: 崩坏的芝麻
Link: https://blog.csdn.net/wangqingbang/article/details/100535391