Perl 6 error message: Malformed UTF-8 in block <unit>

Submitted by 大憨熊 on 2019-12-08 15:58:32

Question


I'm trying to read a downloaded HTML file:

my $file = "sn.html";
my $in_fh = open $file, :r;
my $text = $in_fh.slurp;

and I get the following error message:

Malformed UTF-8
  in block <unit> at prog.p6 line 10

How can I avoid this and get access to the file's contents?


Answer 1:


If you do not specify an encoding when opening a file, Perl 6 assumes UTF-8. Apparently the file you are trying to open contains bytes that cannot be interpreted as UTF-8, hence the error message.

Depending on what you want to do with the file contents, you could either set the :bin named parameter to open the file in binary mode, or use the special utf8-c8 encoding, which assumes UTF-8 until it encounters bytes it cannot decode; for those bytes it generates temporary code points.

See https://docs.raku.org/language/unicode#UTF8-C8 for more information.
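The two options can be sketched as follows. This is only an illustration under stated assumptions: the file name sample.html and its four bytes are made up so that the snippet runs on its own, and for the file from the question you would open "sn.html" instead.

```raku
# Hypothetical sample file: the Latin-1 bytes of "café".
# The lone byte 0xE9 is not valid UTF-8, so a plain slurp would fail on it.
my $file = "sample.html";
spurt $file, Buf.new(0x63, 0x61, 0x66, 0xE9);

# Option 1: binary mode. No decoding happens; .slurp returns the raw bytes.
my $bin_fh = open $file, :r, :bin;
my $bytes  = $bin_fh.slurp;
$bin_fh.close;
say $bytes.elems;   # 4 (the number of bytes written above)

# Option 2: utf8-c8. Valid UTF-8 is decoded normally; the byte that cannot
# be decoded is kept as a placeholder code point instead of raising an error.
my $in_fh = open $file, :r, :enc<utf8-c8>;
my $text  = $in_fh.slurp;
$in_fh.close;
say $text.chars;
```

Binary mode suits cases where the bytes are passed on or decoded later. utf8-c8 is the more convenient choice when most of the file is valid UTF-8 and you want to work with it as a string.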




Answer 2:


For slurp, if you have some idea of the file's encoding, you can also specify it explicitly.

From documentation (https://docs.perl6.org/routine/slurp):

my $text_contents   = slurp "path/to/file", enc => "latin1";

I used it today for a stupid file encoded in ISO-8859-1.
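The same encoding can also be passed to open, which is closer to the code in the question. The following is a minimal sketch, assuming the file really is ISO-8859-1; the file name sample-latin1.html and its contents are invented here so that the snippet is self-contained.

```raku
# Hypothetical sample file: the ISO-8859-1 (Latin-1) bytes of "café".
my $file = "sample-latin1.html";
spurt $file, Buf.new(0x63, 0x61, 0x66, 0xE9);

# Open with an explicit encoding instead of the default UTF-8.
my $in_fh = open $file, :r, :enc<latin1>;
my $text  = $in_fh.slurp;
$in_fh.close;

say $text;   # café
```

Note that every byte sequence is valid Latin-1, so this never raises a decoding error. If the guess about the encoding is wrong, the result is silently wrong characters instead.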



Source: https://stackoverflow.com/questions/49320061/perl-6-error-message-malformed-utf-8-in-block-unit
