Download all messages from a Google group

隐身守侯 提交于 2019-11-30 03:04:47

If you don't mind using #bash, you may try a tool I wrote

https://github.com/icy/google-group-crawler

It can download all mbox files from Google Group. If you have a cookie file, you can even download all files from a private Google Group, and/or to see all original emails. It can also read rss feeds and fetch the latest posts ; and this is useful for daily mirror.

An example result is here http://l.archlinuxvn.org/archlinuxvn/. MHonArch is used to convert mbox files into HTML format.

Ultimately I ended up using the gdata python library to get a list of all groups along with their respective URLs. From there I used selenium to scrape the groups for messages and all replies. Probably not the best solution but it works for what I need.

I made a simple scrap utility by using selenium and htmlunit.. you can use it.. it is not very optimized and can help you download messages of small groups only(up-to 7000 msgs)

https://github.com/himukr/google-grp-scraper

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!