feedparser

Error importing external library within Django template tag library

徘徊边缘 posted on 2019-12-12 14:37:57
Question: So I'm attempting to write a reusable Django app that provides a method for displaying your Twitter feed on your page. I know well that it already exists 20 times. It's an academic exercise. :) The directory structure is pretty simple:

myproject
|__ __init__.py
|__ manage.py
|__ settings.py
|__ myapp
    |__ __init__.py
    |__ admin.py
    |__ conf
        |__ __init__.py
        |__ appsettings.py
    |__ feedparser.py
    |__ models.py
    |__ templates
        |__ __init__.py
    |__ templatetags
        |__ __init__.py
        |__ twitterfeed.py
    |__ views.py

Script for FeedParser to Regularly Gather RSS, then Storing Data in Database

删除回忆录丶 posted on 2019-12-12 04:37:20
Question: I'm learning Python. To teach myself, I've decided to try to build a tool which gathers RSS feeds and stores the output (title, URL and summary) in a database. (I will later build a tool to access the data and scrape the pages.) So far, I have created a local version that gathers content from a list of RSS feeds and puts it into a pandas DataFrame. What I'm trying to understand next is: what tools do I need to turn this local script into a script that runs every, for example, 30 mins and
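One possible shape for the storage half of such a script, sketched with the standard-library `sqlite3` module. The table name `entries`, its columns, and the function names are assumptions made for illustration, not anything from the question. `feedparser` is imported lazily inside `gather` so the storage helpers can be exercised without it.

```python
import sqlite3


def init_db(conn):
    """Create the (hypothetical) entries table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS entries (
            url     TEXT PRIMARY KEY,
            title   TEXT,
            summary TEXT
        )
        """
    )
    conn.commit()


def store_entries(conn, entries):
    """Insert entries; a URL that is already stored is silently skipped."""
    rows = [
        (entry.get("link"), entry.get("title", ""), entry.get("summary", ""))
        for entry in entries
    ]
    conn.executemany(
        "INSERT OR IGNORE INTO entries (url, title, summary) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


def gather(conn, feed_urls):
    """Fetch every feed with feedparser and store its entries."""
    import feedparser  # third-party; imported here so the helpers above work without it

    for url in feed_urls:
        parsed = feedparser.parse(url)
        store_entries(conn, parsed.entries)
```

Because the primary key makes repeated runs idempotent, the script itself does not need to loop or sleep; an external scheduler such as cron (for example a `*/30 * * * *` entry) can simply invoke it every 30 minutes.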

'QuerySet' object has no attribute 'url' when using feedparser in Django

夙愿已清 posted on 2019-12-12 01:24:51
Question: This is a follow-up to the question "bozo_exception in Django / feedparser". I would like to iterate through many feeds from models/DB and have each of them displayed in the HTML template. While I do understand that I need to iterate through x.feed.entries in the HTML template, I assume that the iteration through each RSS source needs to happen in the view function, correct? def feed5(request): source = Feed.objects.all() for item in source.url: rss = feedparser.parse(item) context = {'rss':
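The error in the title comes from `source.url`: a QuerySet is a collection of model instances, and only the individual instances carry a `url` attribute. A minimal sketch of the per-source iteration follows; `parse_all` is a hypothetical helper name, and the stand-in objects exist only so the sketch runs without Django or network access.

```python
from types import SimpleNamespace


def parse_all(sources, parse):
    """Parse the URL of every source object and return the results in order.

    `sources` is any iterable of objects with a `.url` attribute (for example
    a Django QuerySet); `parse` is a callable such as feedparser.parse.
    """
    return [parse(source.url) for source in sources]


# Stand-ins for Feed.objects.all() and feedparser.parse.
sources = [
    SimpleNamespace(url="http://example.com/a.xml"),
    SimpleNamespace(url="http://example.com/b.xml"),
]
results = parse_all(sources, lambda url: {"href": url})

# In a Django view this might be used as follows (the template name is assumed):
#
#     def feed5(request):
#         feeds = parse_all(Feed.objects.all(), feedparser.parse)
#         return render(request, "feeds.html", {"feeds": feeds})
```

The template can then loop over `feeds` and, inside that loop, over each feed's `entries`.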

FeedParser, Removing Special Characters and Writing to CSV

牧云@^-^@ posted on 2019-12-11 14:45:50
Question: I'm learning Python. I've set myself a wee goal of building an RSS scraper. I'm trying to gather the Author, Link and Title. From there I want to write to a CSV. I'm encountering some problems. I've searched for the answer since last night but can't seem to find a solution. I do have a feeling that there is a bit of knowledge I'm missing between what feedparser is parsing and moving it to a CSV, but I don't have the vocabulary yet to know what to Google. How do I remove special characters such as
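A minimal sketch of the step between parsed entries and a CSV file, using only the standard library. The excerpt is cut off before it names the characters in question, so "special" is assumed here to mean anything without an ASCII equivalent; `to_ascii` and `write_entries` are hypothetical names.

```python
import csv
import io
import unicodedata


def to_ascii(text):
    """Strip accents and drop any character that has no ASCII equivalent."""
    normalized = unicodedata.normalize("NFKD", text or "")
    return normalized.encode("ascii", "ignore").decode("ascii")


def write_entries(entries, out):
    """Write author, link and title of each entry as one CSV row."""
    fields = ("author", "link", "title")
    writer = csv.writer(out)
    writer.writerow(fields)
    for entry in entries:
        writer.writerow([to_ascii(entry.get(field)) for field in fields])


# Entries shaped like feedparser's (dict-like with author/link/title keys).
buffer = io.StringIO()
write_entries(
    [{"author": "José", "link": "http://example.com/1", "title": "Café news"}],
    buffer,
)
```

When writing to a real file, opening it with `open(path, "w", newline="", encoding="utf-8")` avoids blank lines between rows on Windows. With a UTF-8 file the `to_ascii` step is optional, since the `csv` module handles Unicode text and quoting on its own.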

Serializing a FeedParser object to Atom

家住魔仙堡 posted on 2019-12-11 03:17:10
Question: I use feedparser http://www.feedparser.org/ to parse Atom feeds and I do some manipulation on the resulting Python objects. After that, I would like to serialize the objects back to Atom. But feedparser does not seem to offer a way to do so? I noticed other Atom libraries like gdata http://code.google.com/p/gdata-python-client/ or demokritos http://jtauber.com/demokritos/ but, to tell the truth, they seem very difficult for the beginner. I use feedparser precisely because of its extreme
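feedparser only parses, so one option is to write the Atom XML by hand with the standard-library `xml.etree.ElementTree`. The sketch below is deliberately minimal and is an assumption about what "serialize back" would need: it emits only `title`, `id`, `updated` and `link`, and leaves out other elements (such as `author` and a feed-level `id`) that a fully valid Atom document requires. `to_atom` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def to_atom(feed_title, entries):
    """Build a minimal Atom document from dict-like entries and return it as text."""
    ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = feed_title
    for entry in entries:
        node = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
        ET.SubElement(node, f"{{{ATOM_NS}}}title").text = entry.get("title", "")
        ET.SubElement(node, f"{{{ATOM_NS}}}id").text = entry.get("id", "")
        ET.SubElement(node, f"{{{ATOM_NS}}}updated").text = entry.get("updated", "")
        link = entry.get("link")
        if link:
            ET.SubElement(node, f"{{{ATOM_NS}}}link", href=link)
    return ET.tostring(feed, encoding="unicode")


atom_xml = to_atom(
    "Example feed",
    [{"title": "Hello", "id": "urn:example:1",
      "updated": "2015-05-08T16:57:39Z", "link": "http://example.com/1"}],
)
```

Because the function only calls `.get`, it accepts feedparser entries as well as plain dicts.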

Have a correct datetime with correct timezone

喜夏-厌秋 posted on 2019-12-10 21:42:43
Question: I am using feedparser in order to get RSS data. Here is my code:

    >>> import datetime
    >>> import time
    >>> import feedparser
    >>> d=feedparser.parse("http://.../rss.xml")
    >>> datetimee_rss = d.entries[0].published_parsed
    >>> datetimee_rss
    time.struct_time(tm_year=2015, tm_mon=5, tm_mday=8, tm_hour=16, tm_min=57, tm_sec=39, tm_wday=4, tm_yday=128, tm_isdst=0)
    >>> datetime.datetime.fromtimestamp(time.mktime(datetimee_rss))
    datetime.datetime(2015, 5, 8, 17, 57, 39)

In my timezone (FR), the actual
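feedparser normalizes its `*_parsed` dates to UTC, whereas `time.mktime` interprets a `struct_time` as local time, which would explain the shifted hour. A sketch of the conversion using `calendar.timegm`, the UTC counterpart of `mktime`, follows. The fixed +02:00 offset stands in for French summer time and is an assumption for illustration; real code would more likely use a named zone such as `zoneinfo.ZoneInfo("Europe/Paris")`.

```python
import calendar
import time
from datetime import datetime, timedelta, timezone


def struct_time_to_utc(parsed):
    """Convert a UTC struct_time (as feedparser returns) to an aware datetime."""
    return datetime.fromtimestamp(calendar.timegm(parsed), tz=timezone.utc)


# The value shown in the question, rebuilt by hand.
published = time.struct_time((2015, 5, 8, 16, 57, 39, 4, 128, 0))
utc_dt = struct_time_to_utc(published)

# Hypothetical fixed offset for CEST (UTC+2), used only for illustration.
paris_summer = timezone(timedelta(hours=2))
local_dt = utc_dt.astimezone(paris_summer)
```

Keeping the datetime timezone-aware from the start means later conversions go through `astimezone` and do not depend on the machine's local timezone setting.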

Retrieving raw XML for items with feedparser

為{幸葍}努か posted on 2019-12-10 11:13:53
Question: I'm trying to use feedparser to retrieve some specific information from feeds, but also retrieve the raw XML of each entry (i.e. <item> elements for RSS and <entry> for Atom), and I can't see how to do that. Obviously I could parse the XML by hand, but that's not very elegant, would require separate support for RSS and Atom, and I imagine it could fall out of sync with feedparser for ill-formed feeds. Is there a better way? Thanks! Answer 1: I'm the current developer of feedparser. Currently, one of the ways you

Check date format before parsing

戏子无情 posted on 2019-12-08 07:04:18
Question: I am parsing several documents with the field Duration. But in the different files it is in different formats, e.g.: "Duration": "00:43" "Duration": "113.046" "Duration": "21.55 s" I want to parse all of them to the format "Duration": "113.046". How could I check, before any parsing, which format it is in? I need some conditions before this piece of code, because it is not right for all of them: Long duration; DateFormat sdf = new SimpleDateFormat("hh:mm:ss"); try { Date durationD = sdf.parse

not able to parse rss feeds

試著忘記壹切 posted on 2019-12-08 03:17:06
Question: I'm trying to parse RSS feeds from a URL using feedparser in Python.

    >>> import feedparser
    >>> d = feedparser.parse('http://www.shop.inonit.in/RSSFeedDetails.aspx?PID=801')
    >>> d
    {'feed': {'summary': u'<span><h1>Server Error in \'/mobile\' Application.<hr color="silver" size="1" width="100%" /></h1>\n\n <h2> <i>Attempted to divide by zero.</i> </h2></span>\n\n <font face="Arial, Helvetica, Geneva, SunSans-Regular, sans-serif ">\n\n <b> Description: </b>An unhandled exception occurred during

Reading RSS feed and displaying it in Django Template | feedparser

我们两清 posted on 2019-12-06 10:21:25
Question: Refer to this blog: http://johnsmallman.wordpress.com/author/johnsmallman/feed/ I want to fetch the RSS feed for my application. The above blog is a WordPress blog. I am using feedparser:

    import feedparser
    feeds = feedparser.parse('http://johnsmallman.wordpress.com/author/johnsmallman/feed/')

Now feeds['feed']['title'] outputs u"Johnsmallman's Blog \xbb John Smallman". My question is: how exactly do I present this in my app? Let's say this blog contains hundreds of articles. So I want to loop over and
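The articles live under the parsed result's `entries` key, next to the `feed` metadata shown above. One way to prepare them for a template is to reduce each entry to a small plain dict, as sketched below. `entries_for_template` is a hypothetical helper, and the sample data is made up so the sketch runs without network access.

```python
def entries_for_template(parsed, limit=None):
    """Reduce a parsed feed to a list of plain dicts that a template can loop over."""
    entries = parsed.get("entries", [])
    if limit is not None:
        entries = entries[:limit]
    return [
        {
            "title": entry.get("title", ""),
            "link": entry.get("link", ""),
            "summary": entry.get("summary", ""),
        }
        for entry in entries
    ]


# Stand-in for the result of feedparser.parse(...).
sample = {
    "feed": {"title": "Example blog"},
    "entries": [
        {"title": "First", "link": "http://example.com/1"},
        {"title": "Second", "link": "http://example.com/2"},
    ],
}
posts = entries_for_template(sample, limit=1)
```

In a Django view, `posts` would be passed in the context and rendered with a `{% for post in posts %}` loop. The `limit` argument is one simple way to cope with a feed that carries hundreds of articles.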