Options for HTML scraping? [closed]

Frontend, unresolved

难免孤独 2020-11-22 04:06

30 answers

悲哀的现实 (original poster)

2020-11-22 04:27
Python has several options for HTML scraping in addition to Beautiful Soup. Here are some others:
- mechanize: similar to Perl's WWW::Mechanize. Gives you a browser-like object to interact with web pages.
- lxml: Python binding to libxml2 and libxslt. Supports several ways to traverse and select elements (e.g. XPath and CSS selection).
- scrapemark: high-level library that uses templates to extract information from HTML.
- pyquery: lets you make jQuery-like queries on XML documents.
- scrapy: a high-level scraping and web crawling framework. It can be used to write spiders for data mining, monitoring, and automated testing.
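
To give a feel for the XPath selection mentioned for lxml, here is a minimal sketch. It assumes lxml is installed (`pip install lxml`); the `SAMPLE` markup, the `libs` id, and the `extract_links` helper are made up for illustration and are not part of any library.

```python
from lxml import html

# Hypothetical page fragment, invented for this example.
SAMPLE = """
<html><body>
  <ul id="libs">
    <li><a href="/mechanize">mechanize</a></li>
    <li><a href="/lxml">lxml</a></li>
    <li><a href="/scrapy">scrapy</a></li>
  </ul>
</body></html>
"""


def extract_links(markup):
    """Return (text, href) pairs for every link inside the list with id "libs"."""
    # Parse the HTML string into an element tree.
    tree = html.fromstring(markup)
    # Select the <a> elements under <ul id="libs"> with an XPath expression.
    anchors = tree.xpath('//ul[@id="libs"]/li/a')
    return [(a.text_content().strip(), a.get("href")) for a in anchors]


if __name__ == "__main__":
    for name, href in extract_links(SAMPLE):
        print(name, href)
```

For a real page you would fetch the markup first (for example with `urllib.request` or mechanize) and pass the response body to the same function. CSS selectors are also possible in lxml, but as far as I know they need the separate `cssselect` package, so the sketch sticks to XPath.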