I'd love to know if there is a module to parse "human formatted" dates in Perl. I mean things like "tomorrow", "Tuesday", "next week", "1 hour ago".
My re
You may also find it interesting to look at the DateTime::Format family, specifically DateTime::Format::Natural. Once you've parsed your date/time into a DateTime object, you can manipulate and evaluate it in a whole bunch of different ways.
Here's a sample program:
use strict;
use warnings;
use DateTime::Format::Natural;
my( $parser ) = DateTime::Format::Natural->new;
while ( <> ) {
    chomp;
    my( $dt ) = $parser->parse_datetime( $_ );
    if ( $parser->success ) {
        print join( ' ', $dt->ymd, $dt->hms ) . "\n";
    }
    else {
        print $parser->error . "\n";
    }
}
Output (each line typed as input is followed by the parsed result):
tomorrow
2008-11-18 21:48:49
next Tuesday
2008-11-25 21:48:53
1 week from now
2008-11-24 21:48:57
1 hour ago
2008-11-17 20:48:59
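As a rough sketch of the kind of manipulation mentioned above, the snippet below parses a phrase and then works with the resulting DateTime object. The helper name parse_phrase is made up for illustration, and the base date passed through the parser's datetime option is only there to make the run repeatable; leave it out to parse relative to the current time.

```perl
use strict;
use warnings;
use DateTime;
use DateTime::Format::Natural;

# Hypothetical helper: parse a phrase, optionally relative to a fixed base
# date instead of "now". Returns a DateTime object, or undef on failure.
sub parse_phrase {
    my ( $phrase, $base ) = @_;
    my $parser = DateTime::Format::Natural->new(
        $base ? ( datetime => $base ) : ()
    );
    my $dt = $parser->parse_datetime( $phrase );
    return $parser->success ? $dt : undef;
}

# Assumed base date, chosen to match the date of the sample run above.
my $base = DateTime->new(
    year => 2008, month  => 11, day    => 17,
    hour => 21,   minute => 48, second => 49,
);

my $dt = parse_phrase( 'tomorrow', $base );
if ( $dt ) {
    print $dt->ymd, ' is a ', $dt->day_name, "\n";
    # clone first so the original object is left unchanged
    print 'three days later: ', $dt->clone->add( days => 3 )->ymd, "\n";
    print 'epoch seconds: ', $dt->epoch, "\n";
}
```

Cloning before calling add matters because DateTime objects are mutable; without it, $dt itself would be moved forward.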
TMTOWTDI :)
-steve
I assume you have context. How could NLP help here? As a wild guess, you could find the nearest exact date (one that is not relative to today) and resolve today/tomorrow/yesterday relative to that.
Date::Manip does exactly this.
Here is an example program:
#!/usr/bin/perl
use strict;
use Date::Manip;
while (<DATA>)
{
    chomp;
    print UnixDate($_, "%Y-%m-%d %H:%M:%S"), " ($_)\n";
}
__DATA__
today
yesterday
tomorrow
last Tuesday
next Tuesday
1 hour ago
next week
Which results in the following output:
2008-11-17 15:21:04 (today)
2008-11-16 15:21:04 (yesterday)
2008-11-18 15:21:04 (tomorrow)
2008-11-11 00:00:00 (last Tuesday)
2008-11-18 00:00:00 (next Tuesday)
2008-11-17 14:21:04 (1 hour ago)
2008-11-24 00:00:00 (next week)
UnixDate is one of the functions provided by Date::Manip. The first argument is a date/time in any format that the module supports; the second argument describes how to format the date/time. There are other functions that just parse these "human" dates, without formatting them, to be used in delta calculations, etc.
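For the parse-only route, a minimal sketch might look like the following. ParseDate and DateCalc are functions exported by Date::Manip; the wrapper delta_between is a made-up name for illustration, and the exact text of the returned delta string depends on the Date::Manip version installed, so it is printed as-is.

```perl
use strict;
use warnings;
use Date::Manip;

# Hypothetical helper: parse two "human" dates and return the delta
# between them, or undef if either one cannot be parsed.
sub delta_between {
    my ( $from, $to ) = @_;
    my $start = ParseDate( $from );
    my $end   = ParseDate( $to );
    # ParseDate returns an empty string when it cannot parse its input
    return undef unless $start && $end;
    return DateCalc( $start, $end );    # delta from $start to $end
}

my $delta = delta_between( 'yesterday', 'tomorrow' );
print defined $delta ? "delta: $delta\n" : "could not parse\n";
```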
Personally, I've always used Time::ParseDate for this. It understands pretty much every format I've tried.
Absolute date formats:
Dow, dd Mon yy
Dow, dd Mon yyyy
Dow, dd Mon
dd Mon yy
dd Mon yyyy
Month day{st,nd,rd,th}, year
Month day{st,nd,rd,th}
Mon dd yyyy
yyyy/mm/dd
yyyy-mm-dd (usually the best date specification syntax)
yyyy/mm
mm/dd/yy
mm/dd/yyyy
mm/yy
yy/mm (only if year > 12, or > 31 if UK)
yy/mm/dd (only if year > 12 and day < 32, or year > 31 if UK)
dd/mm/yy (only if UK, or an invalid mm/dd/yy or yy/mm/dd)
dd/mm/yyyy (only if UK, or an invalid mm/dd/yyyy)
dd/mm (only if UK, or an invalid mm/dd)
Relative date formats:
count "days"
count "weeks"
count "months"
count "years"
Dow "after next"
Dow "before last"
Dow (requires PREFER_PAST or PREFER_FUTURE)
"next" Dow
"tomorrow"
"today"
"yesterday"
"last" dow
"last week"
"now"
"now" "+" count units
"now" "-" count units
"+" count units
"-" count units
count units "ago"
Absolute time formats:
hh:mm:ss[.ddd]
hh:mm
hh:mm[AP]M
hh[AP]M
hhmmss[[AP]M]
"noon"
"midnight"
Relative time formats:
count "minutes" (count can be fractional "1.5" or "1 1/2")
count "seconds"
count "hours"
"+" count units
"+" count
"-" count units
"-" count
count units "ago"
Timezone formats:
[+-]dddd
GMT[+-]d+
[+-]dddd (TZN)
TZN
Special formats:
[ d]d/Mon/yyyy:hh:mm:ss [[+-]dddd]
yy/mm/dd.hh:mm
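To show how the formats above are used in practice, here is a small sketch built on parsedate, the function Time::ParseDate exports. The wrapper to_epoch and the sample phrases are illustrative only. NOW, PREFER_PAST and PREFER_FUTURE are options the module accepts; any phrase the module does not understand is reported as unparseable.

```perl
use strict;
use warnings;
use Time::ParseDate;           # exports parsedate()
use POSIX qw(strftime);

# Hypothetical helper: turn a phrase into epoch seconds, passing any extra
# options (NOW, PREFER_FUTURE, UK, ...) straight through to parsedate().
# Returns undef when the phrase is not understood.
sub to_epoch {
    my ( $phrase, %opts ) = @_;
    return scalar parsedate( $phrase, %opts );
}

for my $phrase ( 'tomorrow', 'next Tuesday', '1 hour ago', 'now + 2 weeks' ) {
    my $epoch = to_epoch( $phrase );
    printf "%-14s %s\n", $phrase,
        defined $epoch
            ? strftime( '%Y-%m-%d %H:%M:%S', localtime $epoch )
            : 'unparseable';
}

# A bare day of the week needs a hint about which direction to look.
my $tuesday = to_epoch( 'Tuesday', PREFER_FUTURE => 1 );
print 'Tuesday        ', scalar localtime( $tuesday ), "\n" if defined $tuesday;
```

The result is plain epoch seconds, so it can be handed to localtime, strftime, or ordinary arithmetic without any further conversion.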