Can Encode::Guess tell utf-8 from iso-8859-1?

Posted by 寵の児 on 2019-12-23 09:37:20

Question


I have a string $data that is encoded in UTF-8, but suppose I don't know whether it is UTF-8 or ISO-8859-1. I want to use Perl's Encode::Guess module to find out which one it is, but I'm having trouble figuring out how the module works.

I have tried the following four methods (from http://perldoc.perl.org/Encode/Guess.html):

use Encode::Guess qw/utf8 latin1/;

my $decoder = guess_encoding($data);

print "$decoder\n";

Result: iso-8859-1 or utf8

use Encode::Guess qw/utf8 latin1/;

my $enc = guess_encoding($data, qw/utf8 latin1/);
ref($enc) or die "Can't guess: $enc";
my $utf8 = $enc->decode($data); 

print "$utf8\n";

Result: Can't guess: iso-8859-1 or utf8 at encodage-windows.pl line 25, line 18110.

use Encode::Guess qw/utf8 latin1/;

my $decoder = Encode::Guess->guess($data);
die $decoder unless ref($decoder);
my $utf8 = $decoder->decode($data);

print "$utf8\n";

Result: iso-8859-1 or utf8 at encodage-windows.pl line 30, line 18110.

use Encode::Guess qw/utf8 latin1/;

my $utf8 = Encode::decode("Guess", $data);

print "$utf8\n";

Result: iso-8859-1 or utf8 at /usr/local/lib/perl5/Encode.pm line 175.

My first question is: which of these methods am I supposed to use, if any? And my second: what do I need to change to make it work?


Answer 1:


I normally check the possible encodings one at a time, like this:

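use Encode::Guess;    # exports guess_encoding

# Try UTF-8 first; on failure guess_encoding returns an error string, not an object
my $decoder = guess_encoding($data, 'utf8');
# Fall back to ISO-8859-1 only if the UTF-8 guess failed
$decoder = guess_encoding($data, 'iso-8859-1') unless ref $decoder;
die $decoder unless ref $decoder;

printf "Decoding as %s\n\n", $decoder->name;
$data = $decoder->decode($data);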

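This picks UTF-8 if the data is valid UTF-8; otherwise it tries ISO-8859-1 and either picks that or dies. Each call is a simple yes/no test for one encoding, so the guess can never come back with two candidates, which is the error you're getting. The ambiguity arises because any valid UTF-8 byte sequence is also valid ISO-8859-1, so with both encodings as suspects Encode::Guess cannot rule either one out.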
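As a rough, self-contained sketch of the same fallback, the snippet below wraps the two calls in a helper and runs it on the same word encoded both ways. The helper name decode_utf8_or_latin1 and the sample byte strings are made up for illustration and are not part of Encode::Guess. Keep in mind that ISO-8859-1 accepts any byte sequence, so the fallback step should always succeed, and that ISO-8859-1 text which happens to form valid UTF-8 sequences will be reported as UTF-8.

```perl
use strict;
use warnings;
use Encode::Guess;    # exports guess_encoding

# Hypothetical helper (not part of Encode::Guess): try UTF-8 first,
# then fall back to ISO-8859-1. Returns the encoding name and decoded text.
sub decode_utf8_or_latin1 {
    my ($bytes) = @_;

    # guess_encoding returns an object on success, an error string on failure
    my $decoder = guess_encoding($bytes, 'utf8');
    $decoder = guess_encoding($bytes, 'iso-8859-1') unless ref $decoder;
    die "Can't guess: $decoder" unless ref $decoder;

    return ($decoder->name, $decoder->decode($bytes));
}

# Illustrative input: "caf" followed by e-acute, as raw bytes in each encoding
my $utf8_bytes   = "caf\xC3\xA9";
my $latin1_bytes = "caf\xE9";

for my $bytes ($utf8_bytes, $latin1_bytes) {
    my ($name, $text) = decode_utf8_or_latin1($bytes);
    printf "%-10s -> %d characters\n", $name, length $text;
}
```

Both inputs should decode to the same four-character string, with the first reported as utf8 and the second as iso-8859-1.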



Source: https://stackoverflow.com/questions/23015155/can-encodeguess-tell-utf-8-from-iso-8859-1
