Question
I'm trying to read a downloaded HTML file:
my $file = "sn.html";
my $in_fh = open $file, :r;
my $text = $in_fh.slurp;
and I get the following error message:
Malformed UTF-8
in block <unit> at prog.p6 line 10
How can I avoid this and get access to the file's contents?
Answer 1:
If you do not specify an encoding when opening a file, utf8 is assumed. Apparently, the file you wish to open contains bytes that cannot be interpreted as UTF-8, hence the error message.
Depending on what you want to do with the file contents, you could either set the :bin named parameter to have the file opened in binary mode, or use the special utf8-c8 encoding, which assumes UTF-8 until it encounters bytes it cannot decode; in that case it generates temporary code points.
See https://docs.raku.org/language/unicode#UTF8-C8 for more information.
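The two options can be sketched roughly as follows. This is only an illustration: the file name demo-sn.html and its four-byte contents are made up so the example runs on its own, and the last byte (0xE9) is deliberately not valid UTF-8 by itself.

```raku
# Hypothetical demo file: "caf" followed by a lone 0xE9 byte,
# which cannot be decoded as UTF-8 on its own.
my $file = "demo-sn.html";
spurt $file, Buf.new(0x63, 0x61, 0x66, 0xE9);

# Option 1: binary mode. No decoding happens; slurp returns the raw bytes.
my $bytes = $file.IO.slurp(:bin);
say $bytes.elems;

# Option 2: utf8-c8. Decodes what it can and keeps track of the bytes it cannot.
my $text = $file.IO.slurp(:enc<utf8-c8>);
say $text.chars;

# Encoding back with utf8-c8 is meant to give the original bytes again;
# print both lists to compare them.
say $text.encode("utf8-c8").list;
say $bytes.list;

unlink $file;   # remove the demo file
```

Binary mode is the safer choice if you only need to pass the bytes along; utf8-c8 is more convenient if you want to treat the contents as a string and most of it really is UTF-8.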
Answer 2:
With slurp, if you have some idea of the file's encoding, you can also specify it explicitly.
From the documentation (https://docs.perl6.org/routine/slurp):
my $text_contents = slurp "path/to/file", enc => "latin1";
I used it today for a stupid file encoded in ISO-8859-1.
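A small self-contained sketch of the same idea, assuming the file really is ISO-8859-1. The file name demo-latin1.txt and its contents are invented for the demonstration.

```raku
# Hypothetical demo file: "caf" followed by 0xE9, which is "é" in ISO-8859-1.
my $file = "demo-latin1.txt";
spurt $file, Buf.new(0x63, 0x61, 0x66, 0xE9);

# Tell slurp which encoding to use instead of the default utf8.
my $text = slurp $file, enc => "latin1";
say $text;         # prints "café"
say $text.chars;   # prints 4

unlink $file;      # remove the demo file
```

Note that latin1 maps every possible byte to some character, so decoding never fails; if the guess about the encoding is wrong you get wrong characters rather than an error.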
Source: https://stackoverflow.com/questions/49320061/perl-6-error-message-malformed-utf-8-in-block-unit