Script for feedparser to Regularly Gather RSS, then Store the Data in a Database

Submitted by 删除回忆录丶 on 2019-12-12 04:37:20

Question


I'm learning Python. To teach myself, I've decided to build a tool that gathers RSS feeds and stores the title, URL, and summary of each entry in a database (I will later build a tool to access the data and scrape the pages).

So far, I have created a local version that gathers content from a list of RSS feeds and puts it into a pandas DataFrame.

What I'm trying to understand next is: what tools do I need to turn this local script into one that runs regularly (say, every 30 minutes) and adds any newly found data to the database?

Any direction would be helpful.

import feedparser
import pandas as pd

rawrss = [
    'http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml',
    'https://www.yahoo.com/news/rss/',
    'http://www.huffingtonpost.co.uk/feeds/index.xml',
    'http://feeds.feedburner.com/TechCrunch/',
    ]

posts = []
for url in rawrss:
    feed = feedparser.parse(url)
    for post in feed.entries:
        posts.append((post.title, post.link, post.summary))
df = pd.DataFrame(posts, columns=['title', 'link', 'summary'])  # build the DataFrame from the collected tuples

df
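One possible direction, sketched below under the assumption that SQLite is an acceptable database and that a plain sleep loop is enough for scheduling: store each entry keyed by its link so repeated runs only insert entries that have not been seen before. The table layout and function names (`init_db`, `store_posts`, `fetch_posts`) are illustrative, not from the original question.

```python
import sqlite3
import time

RAW_RSS = [
    'http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml',
    'https://www.yahoo.com/news/rss/',
    'http://www.huffingtonpost.co.uk/feeds/index.xml',
    'http://feeds.feedburner.com/TechCrunch/',
]

def init_db(path="rss.db"):
    """Open (or create) the SQLite database and ensure the posts table exists."""
    conn = sqlite3.connect(path)
    # The entry link is the primary key, so re-running the script only
    # inserts entries it has not stored before.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS posts (
               link    TEXT PRIMARY KEY,
               title   TEXT,
               summary TEXT
           )"""
    )
    return conn

def store_posts(conn, posts):
    """Insert (title, link, summary) tuples, silently skipping duplicate links.

    Returns the total number of rows now in the table.
    """
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR IGNORE INTO posts (link, title, summary) VALUES (?, ?, ?)",
            [(link, title, summary) for title, link, summary in posts],
        )
    return conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]

def fetch_posts(urls):
    """Collect (title, link, summary) tuples from each feed, as in the question."""
    import feedparser  # imported lazily so the storage code has no hard dependency
    posts = []
    for url in urls:
        feed = feedparser.parse(url)
        for post in feed.entries:
            posts.append((post.title, post.link, post.summary))
    return posts

if __name__ == "__main__":
    conn = init_db()
    while True:
        total = store_posts(conn, fetch_posts(RAW_RSS))
        print(f"{total} posts stored")
        time.sleep(30 * 60)  # wait 30 minutes between polls
```

For production use, a cron entry (or a systemd timer) that runs the script once every 30 minutes is usually preferred over a long-lived sleep loop, since the process then does not need to survive reboots.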

Source: https://stackoverflow.com/questions/45735189/script-for-feedpaser-to-regularly-gather-rss-then-storing-data-in-database
