Is it possible to scrape all text messages from WhatsApp Web with Scrapy?

Submitted by 房东的猫 on 2020-06-11 05:45:40

Question


I've been experimenting with web scraping using Scrapy, and I'm interested in retrieving all text messages from all chats on WhatsApp to use as training data for a machine learning project. I know some websites block web crawlers and scrapers, so I would like to know whether it is possible to use Scrapy to obtain these messages and, if it isn't, what alternatives I can use. I understand that I can click the "Email chat" option for each chat, but this might not be feasible if I want to obtain a large amount of data, not just from my own chats but also from other people who are willing to let me use their chats for the project.


Answer 1:


I don't think WhatsApp blocks crawlers and scrapers. You only have access to your own web.whatsapp.com session, and what you do with your own messages is up to you. When I wrote code to read and write WhatsApp messages, I used Selenium WebDriver, which can automate any browser action. It was quite stable with WhatsApp. It could not be fully automated, though, because of the QR code. If you press F12 and open the "Network" tab in your browser, you will see XHR requests that carry the messages. They appear when you load new messages by scrolling or when you open a contact. The payload looks like binary data, so I don't think you can write Scrapy code for that.
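The Selenium approach described above could look roughly like the sketch below. It is only an illustration, not the code I actually used: the CSS selector in MESSAGE_SELECTOR is a hypothetical placeholder (WhatsApp Web's markup is undocumented and changes often, so check it with F12), the function names are made up for this example, and it assumes Selenium with a working Chrome driver is installed. The QR code still has to be scanned by hand.

```python
import time

# Hypothetical CSS selector, assumed for illustration only. Inspect the page
# with F12 and replace it with whatever matches message bubbles for you.
MESSAGE_SELECTOR = "div.message-in, div.message-out"


def visible_message_texts(driver, selector=MESSAGE_SELECTOR):
    """Return the non-empty text of every element currently matching selector."""
    # "css selector" is the locator strategy string behind By.CSS_SELECTOR.
    elements = driver.find_elements("css selector", selector)
    texts = (element.text.strip() for element in elements)
    return [text for text in texts if text]


def run_session():
    """Open WhatsApp Web, wait for a manual QR login, print visible messages."""
    # Imported here so the helper above can be used without launching a browser.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get("https://web.whatsapp.com")
        # The QR code cannot be automated, so wait for the user to log in.
        input("Scan the QR code, open a chat, then press Enter... ")
        time.sleep(2)  # give the chat pane a moment to finish rendering
        for text in visible_message_texts(driver):
            print(text)
    finally:
        driver.quit()
```

Calling run_session() from a script opens the browser and prints only the messages currently rendered in the open chat. Older messages load as you scroll up, so collecting a whole chat would mean scrolling and calling visible_message_texts repeatedly.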



Source: https://stackoverflow.com/questions/50775630/is-it-possible-to-scrape-all-text-messages-from-whatsapp-web-with-scrapy
