Splitting a string containing a longitude or latitude expression in Perl

Submitted by 亡梦爱人 on 2020-01-24 15:11:45

Question


I retrieve data from the net that contains real geodesic expressions, by which I mean degrees, minutes and seconds marked with the Unicode symbols U+00B0, U+2032 and U+2033 (named Degree Sign, Prime and Double Prime). Example:

my $Lat = "48° 25′ 43″ N";

My objective is to convert such an expression first to decimal degrees and then to radians, for use in a Perl module I am writing that implements the Vincenty inverse formula to calculate distances on an ellipsoid. All my code objectives have been met with pseudo geodesics such as "48:25:43 N", but that is hand-entered test data, not real-world data. I am struggling to craft a regular expression that splits the real data the way I now split the pseudo data:

my ($deg, $min, $sec, $dir) = split(/[\s:]+/, $_[0], 4); # this works

I have tried many regular expressions including

/[°′″\s]+/ and
/[\x{0B00}\x{2032}\x{2033}\s]/+

all with dismal results, such as $deg = "48?", $min = "?", $sec = "25′43″ N" and $dir = undef. I have also wrapped the code in a block {} and, within that scope, added use utf8; and use feature 'unicode_strings'; but nothing changed.

Input data example:

my $Lat = "48° 25′ 43″ N"; 

Expected output:

$deg = 48, $min = 25, $sec = 43 and $dir = "N"

Answer 1:


You may try this regex to split the string:

[^\dNSEW.]+

Sample source:

my $str = '48° 25′ 43″ N';
my $regex = qr/[^\dNSEW.]+/p;
my ($deg, $min, $sec, $dir) = split $regex, $str;
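
As a quick check of that character class, the following sketch runs the same split over a few input styles. Only the first sample string comes from the question; the other two are made-up inputs added for illustration, and `use utf8` is assumed because the source contains non-ASCII literals.

```perl
use strict;
use warnings;
use utf8;    # the string literals below contain non-ASCII characters

# Split on any run of characters that is not a digit, a hemisphere letter, or a dot.
my $regex = qr/[^\dNSEW.]+/;

# Only the first string is from the question; the others are hypothetical test inputs.
my @samples = ('48° 25′ 43″ N', '48:25:43 N', '123° 7′ 12.5″ W');

# Each element of @parsed is an array reference: [deg, min, sec, dir].
my @parsed = map { [ split $regex, $_ ] } @samples;
print join(', ', @$_), "\n" for @parsed;
```

Because the class keeps only digits, dots and the four hemisphere letters, it handles decimal seconds and the ':' notation alike. It would not preserve a leading minus sign, so it suits input that marks the hemisphere with a letter.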



Answer 2:


My bad! Pilot error!

The original regex I posted, and was struggling with, was:

/[\x{0B00}\x{2032}\x{2033}\s]/+

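There are two errors: where I placed the '+' character (outside the closing delimiter instead of directly after the character class) and the hex value of the degree character (\x{0B00} is code point U+0B00, not U+00B0). That regex should have been written: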

/[\x{B0}\x{2032}\x{2033}\s]+/
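
One way to catch this kind of hex slip is to print the code point of each symbol and test both escapes against the degree sign. This is a minimal sketch, assuming the script is saved as UTF-8 and uses `use utf8`; the variable names are illustrative only.

```perl
use strict;
use warnings;
use utf8;    # the symbols below are written literally in the source

# Report the code point of each symbol in U+XXXX form.
my @symbols = ('°', '′', '″');
my @codes   = map { sprintf('U+%04X', ord($_)) } @symbols;
print "@codes\n";    # prints: U+00B0 U+2032 U+2033

# \x{0B00} names a different code point, so it cannot match the degree sign;
# \x{B0} (equivalently \x{00B0}) does.
my $matches_wrong = ('°' =~ /\x{0B00}/) ? 1 : 0;
my $matches_right = ('°' =~ /\x{B0}/)   ? 1 : 0;
print "0B00 matches: $matches_wrong, B0 matches: $matches_right\n";
```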

The answer from @Rizwan was illuminating but I was determined to make regular expressions in Perl work with Unicode, so I persevered, and now this is my solution:

use strict;
use warnings;
use utf8;    # the string literal below contains non-ASCII characters

my $dms = "48° 25′ 43.314560″ N";
my $regex = qr/[\x{B0}\x{2032}\x{2033}:\s]+/; # some geodesics do use ':'
my ($deg, $min, $sec, $dir) = split $regex, $dms;
printf("\$deg: %s, \$min: %s, \$sec: %s, \$dir: %s\n",
       $deg, $min, $sec, $dir);
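
For the stated goal of getting from such a string to radians, the split can feed a small conversion step. The sketch below is only an illustration: `dms_to_degrees` and `degrees_to_radians` are hypothetical helper names, not part of the module described in the question, and the code assumes well-formed input with all four fields present and south/west treated as negative.

```perl
use strict;
use warnings;
use utf8;    # the sample string contains non-ASCII characters

# Hypothetical helper: parse a D° M′ S″ H string into signed decimal degrees.
sub dms_to_degrees {
    my ($dms) = @_;
    my ($deg, $min, $sec, $dir) = split /[\x{B0}\x{2032}\x{2033}:\s]+/, $dms;
    my $degrees = $deg + $min / 60 + $sec / 3600;
    # Assumed convention: south and west are negative.
    $degrees = -$degrees if $dir =~ /^[SW]$/i;
    return $degrees;
}

# Hypothetical helper: convert decimal degrees to radians.
sub degrees_to_radians {
    my ($degrees) = @_;
    my $pi = 4 * atan2(1, 1);
    return $degrees * $pi / 180;
}

my $lat_deg = dms_to_degrees("48° 25′ 43″ N");
my $lat_rad = degrees_to_radians($lat_deg);
printf "%.6f degrees = %.6f radians\n", $lat_deg, $lat_rad;
```

Keeping parsing and unit conversion in separate subs means the same radians step can be reused for values that arrive already in decimal degrees.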

Like it or not, Unicode is the future.



Source: https://stackoverflow.com/questions/48534863/splitting-a-string-containing-a-longitude-or-latitude-expression-in-perl
