Question
I am using feedparser to get RSS data. Here is my code:
>>> import datetime
>>> import time
>>> import feedparser
>>> d=feedparser.parse("http://.../rss.xml")
>>> datetimee_rss = d.entries[0].published_parsed
>>> datetimee_rss
time.struct_time(tm_year=2015, tm_mon=5, tm_mday=8, tm_hour=16, tm_min=57, tm_sec=39, tm_wday=4, tm_yday=128, tm_isdst=0)
>>> datetime.datetime.fromtimestamp(time.mktime(datetimee_rss))
datetime.datetime(2015, 5, 8, 17, 57, 39)
In my timezone (FR), the actual date is May 8th, 2015, 18:57.
In the RSS XML, the value is <pubDate>Fri, 08 May 2015 18:57:39 +0200</pubDate>
When I parse it into a datetime, I get 2015, 5, 8, 17, 57, 39.
How can I get 2015, 5, 8, 18, 57, 39 without a dirty hack, simply by configuring the correct timezone?
EDIT:
By doing:
>>> from pytz import timezone
>>> datetime.datetime.fromtimestamp(time.mktime(datetimee_rss),tz=timezone('Europe/Paris'))
datetime.datetime(2015, 5, 8, 17, 57, 39, tzinfo=<DstTzInfo 'Europe/Paris' CEST+2:00:00 DST>)
I got something nicer. However, it doesn't work with the rest of the script: I get plenty of "TypeError: can't compare offset-naive and offset-aware datetimes" errors.
Answer 1:
feedparser does provide the original datetime string (just remove the _parsed suffix from the attribute name), so if you know the format of the string, you can parse it into a tz-aware datetime object yourself.
For example, with your code, you can get the tz-aware object as such:
datetime.datetime.strptime(d.entries[0].published, '%a, %d %b %Y %H:%M:%S %z')
For more on strptime(), see https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
EDIT: Since strptime() in Python 2.x doesn't support the %z directive (Python 3.2+ does), use python-dateutil instead:
pip install python-dateutil
then
from dateutil import parser
datetime_rss = parser.parse(d.entries[0].published)
Documentation: https://dateutil.readthedocs.org/en/latest/
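On Python 3.3 or later, the standard library can also parse this RFC 822 style date without a format string or a third-party package. A minimal sketch, assuming the pubDate value from the question; it is hard-coded here as `published` in place of `d.entries[0].published`, so no feed is fetched:

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# Stand-in for d.entries[0].published (value copied from the question's feed).
published = "Fri, 08 May 2015 18:57:39 +0200"

# Returns an aware datetime that carries the +02:00 offset from the string.
datetime_rss = parsedate_to_datetime(published)

print(datetime_rss.isoformat())
print(datetime_rss.utcoffset() == timedelta(hours=2))
```

This only helps when the feed really uses RFC 822 dates; other formats would still need dateutil or feedparser's own parsing.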
Answer 2:
feedparser returns the time in UTC. It is incorrect to apply time.mktime() to it, because mktime() expects local time (unless your local timezone is UTC, which it isn't). Use calendar.timegm() instead:
import calendar
from datetime import datetime
utc_tuple = d.entries[0].published_parsed
posix_timestamp = calendar.timegm(utc_tuple)
local_time_as_naive_datetime_object = datetime.fromtimestamp(posix_timestamp) # assume non-"right" timezone
RSS feeds may use many different date formats; I would leave the date parsing to the feedparser module.
If you want to get the local time as an aware datetime object:
from tzlocal import get_localzone # $ pip install tzlocal
local_timezone = get_localzone()
local_time = datetime.fromtimestamp(posix_timestamp, local_timezone) # assume non-"right" timezone
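To check the calendar.timegm() approach without fetching a feed, the struct_time from the question can be rebuilt by hand. This is only a sketch for Python 3: the tuple is hard-coded, and the fixed +02:00 offset (`paris_offset`, a name made up for this example) stands in for a real Europe/Paris timezone object, which a real script would get from tzlocal or pytz as shown above.

```python
import calendar
import time
from datetime import datetime, timedelta, timezone

# The UTC tuple feedparser returned in the question, rebuilt by hand.
utc_tuple = time.struct_time((2015, 5, 8, 16, 57, 39, 4, 128, 0))

# timegm() reads the tuple as UTC, whatever the machine's local timezone is.
posix_timestamp = calendar.timegm(utc_tuple)

# Aware datetime in UTC, then converted to the feed's offset.
utc_time = datetime.fromtimestamp(posix_timestamp, tz=timezone.utc)
paris_offset = timezone(timedelta(hours=2))  # assumption: fixed CEST offset
local_time = utc_time.astimezone(paris_offset)

print(utc_time.isoformat())
print(local_time.isoformat())
```

Both results are aware datetimes, so they compare cleanly with each other; comparing either with a naive datetime raises the TypeError mentioned in the question.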
Answer 3:
Try this:
>>> import os
>>> import time
>>> os.environ['TZ'] = 'Europe/Paris'
>>> time.tzset()
>>> time.tzname
('CET', 'CEST')
Source: https://stackoverflow.com/questions/30130588/have-a-correct-datetime-with-correct-timezone