Insert BOM to a CSV File using PERL

白昼怎懂夜的黑 submitted on 2020-01-06 14:42:08

Question


Hi, I am having a problem making my CSV file readable. I am currently trying to do it using Perl. Here's my code:

#!/usr/bin/perl

$infile = @ARGV[0];
$outfile = @ARGV[1];

open(INFILE,"$infile") || die "cannot open input file : $infile : ";

open(OUTFILE,">$outfile") || die "cannot open output file";

$/="undef";

while(<INFILE>)

{

  $temp=$_;

}

close(INFILE);

  print OUTFILE "\x{feff}".$temp;

close(OUTFILE);

However, the CSV file is still unreadable. Is there anything I can do to insert a BOM? Thanks!


Answer 1:


Before we do this, let me tell you that BOMs are an incredible pain in most cases and should be avoided wherever possible. They are only technically necessary with UTF-16 encodings.

The BOM is the Unicode character U+FEFF. It is encoded in UTF-8 as EF BB BF, in UTF-16LE as FF FE, and in UTF-16BE as FE FF. It seems you are assuming that your input is UTF-16BE. In that case, you could write the bytes directly:

open my $in,  "<:raw", $ARGV[0] or die "Can't open $ARGV[0]: $!";
open my $out, ">:raw", $ARGV[1] or die "Can't open $ARGV[1]: $!";

print $out "\xFE\xFF";
while (<$in>) {
    print $out $_;
}

But it would probably be better to decode the input, encode the output again, and explicitly specify the BOM as a character:

open my $in,  "<:encoding(UTF-16BE)", $ARGV[0] or die "Can't open $ARGV[0]: $!";
open my $out, ">:encoding(UTF-16BE)", $ARGV[1] or die "Can't open $ARGV[1]: $!";

print $out "\N{U+FEFF}";
while (<$in>) {
    print $out $_;
}
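
The byte sequences listed in this answer can be checked with the core Encode module. The following is only a minimal sketch; the helper name bom_hex is made up for illustration and is not part of any library.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: encode U+FEFF in the given encoding and
# return the resulting bytes as space-separated uppercase hex pairs.
sub bom_hex {
    my ($encoding) = @_;
    my $bytes = encode($encoding, "\x{FEFF}");
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# Print one line per encoding: the name followed by the BOM bytes.
for my $encoding ('UTF-8', 'UTF-16LE', 'UTF-16BE') {
    printf "%-8s %s\n", $encoding, bom_hex($encoding);
}
```

Comparing this output with the first bytes of the input file (for example in a hex dump) is one way to find out which encoding, if any, the file already declares.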



Answer 2:


What you probably want to do, rather than manually inserting a BOM, is set the output file encoding to whatever it is you need.

Also:

  • You are setting the input record separator to the literal string "undef", which is definitely not what you want (although it happens to work as long as "undef" doesn't appear in the input files). Remove the quotes there.
  • Add use warnings; use strict; to the top of your script.
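
Put together, these suggestions might look like the sketch below. It assumes the input file is UTF-8 and that UTF-16 output is wanted; neither is stated in the question. The sub name reencode_with_bom is hypothetical. With the plain UTF-16 encoding (no LE or BE suffix), Perl's Encode module writes big-endian output with a BOM prepended, so no manual BOM is needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read $infile as UTF-8 (an assumption about the
# input) and write it to $outfile as UTF-16, letting the encoding
# layer emit the BOM.
sub reencode_with_bom {
    my ($infile, $outfile) = @_;

    open my $in,  '<:encoding(UTF-8)',  $infile  or die "Can't open $infile: $!";
    open my $out, '>:encoding(UTF-16)', $outfile or die "Can't open $outfile: $!";

    local $/;               # undef (no quotes): read the whole file at once
    my $content = <$in>;
    $content = '' unless defined $content;   # empty input file

    print {$out} $content;

    close $in;
    close $out or die "Can't close $outfile: $!";
    return;
}

# Run only when called as: script.pl input.csv output.csv
reencode_with_bom(@ARGV) if @ARGV == 2;
```

If the input already starts with a BOM, this sketch would keep it as an ordinary character and the output would contain two, so it may be worth checking the input first.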



Answer 3:


I think you need something like this at the top of your code:

use open OUT => ':encoding(UTF-16)';



Answer 4:


You've got a few answers about your BOM. But here's your code written in more idiomatic Perl.

#!/usr/bin/perl

use strict;
use warnings;

my ($infile, $outfile) = @ARGV;

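open my $in_fh, '<', $infile or die "cannot open input file : $infile : $!";
open my $out_fh, '>', $outfile or die "cannot open output file: $!";

# With no encoding layer on $out_fh, Perl warns "Wide character in print"
# here and writes the character as UTF-8 (bytes EF BB BF).
print $out_fh "\x{feff}";
# Pass $_ explicitly so $out_fh is unambiguously the filehandle.
print $out_fh $_ while <$in_fh>;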


Source: https://stackoverflow.com/questions/22710558/insert-bom-to-a-csv-file-using-perl
