Remove multiple BOMs from a file

我只是一个虾纸丫 submitted on 2019-12-29 04:42:06

Question


I am using a JavaScript file that is a concatenation of other JavaScript files.

Unfortunately, the person who concatenated these files did not handle the encoding properly when reading them, so the BOM of every single source file was written into the concatenated JavaScript file.

Does anyone know a simple way to search through the concatenated file and remove any/all BOM markers?

A solution in PHP or a bash script for Mac OS X would be great.
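To illustrate the situation described above, here is a minimal Python sketch. It is not part of the original question, and the two file contents are made up. It shows how byte-level concatenation keeps every BOM, and how decoding each part as "utf-8-sig" would have dropped them at concatenation time.

```python
import codecs

# Two hypothetical source files, each saved as UTF-8 with a leading BOM.
parts = [
    codecs.BOM_UTF8 + b"var a = 1;\n",
    codecs.BOM_UTF8 + b"var b = 2;\n",
]

# Naive byte-level concatenation keeps every BOM, as in the question.
naive = b"".join(parts)

# Decoding each part as "utf-8-sig" drops its leading BOM before joining.
clean = "".join(p.decode("utf-8-sig") for p in parts).encode("utf-8")

print(naive.count(codecs.BOM_UTF8))  # BOMs left in the naive result
print(clean.count(codecs.BOM_UTF8))  # BOMs left in the clean result
```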


Answer 1:


See also: Using awk to remove the Byte-order mark

To remove multiple BOMs from anywhere within a text file, you can try something similar. Just leave out the ^ anchor and add the g flag so that every occurrence on a line is removed:

perl -e 's/\xef\xbb\xbf//g;' -pi~ file.js

(This edits the file in place, but creates a backup named file.js~.)
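The same idea can be sketched in Python for readers who prefer to see the byte-level logic spelled out. This is only an illustration, not the answerer's method: the function name strip_boms and the demo file are hypothetical, and the "~" backup suffix simply mirrors the perl command above.

```python
import shutil
import tempfile
from pathlib import Path

BOM = b"\xef\xbb\xbf"  # UTF-8 encoding of U+FEFF


def strip_boms(path, backup_suffix="~"):
    """Remove every UTF-8 BOM in the file; return how many were removed."""
    path = Path(path)
    data = path.read_bytes()
    count = data.count(BOM)
    if count:
        # Keep a copy of the original next to it, e.g. file.js~
        shutil.copyfile(path, path.with_name(path.name + backup_suffix))
        path.write_bytes(data.replace(BOM, b""))
    return count


# Demo on a throwaway file (file.js is only an illustrative name).
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "file.js"
    target.write_bytes(BOM + b"var a = 1;\n" + BOM + b"var b = 2;\n")
    removed = strip_boms(target)
    cleaned = target.read_bytes()
    backup_exists = (Path(tmp) / "file.js~").exists()
```

Working on raw bytes avoids any re-encoding of the rest of the file. Note that it also removes U+FEFF characters that were placed in the text deliberately.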




Answer 2:


I normally do it using vim:

vim -c "set nobomb" -c wq! myfile
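Vim's bomb option concerns the BOM at the very start of a file, so with a concatenated file it may be worth checking where the BOMs actually sit before and after running the command. A minimal Python sketch for that check follows; the helper name bom_offsets and the sample data are assumptions made for illustration.

```python
BOM = b"\xef\xbb\xbf"  # UTF-8 encoding of U+FEFF


def bom_offsets(data):
    """Return the byte offset of every UTF-8 BOM found in data."""
    offsets = []
    pos = data.find(BOM)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(BOM, pos + len(BOM))
    return offsets


# Made-up sample: two concatenated parts, each starting with a BOM.
sample = BOM + b"var a = 1;\n" + BOM + b"var b = 2;\n"
offsets = bom_offsets(sample)
print(offsets)
```

Any offset other than 0 indicates a BOM embedded in the middle of the file, which is the case the question is about.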



Answer 3:


Find the files that contain a BOM:

grep -rIlo $'^\xEF\xBB\xBF' ./

Remove the BOM from those files (the --in-place option and the \x escapes assume GNU sed):

grep -rIlo $'^\xEF\xBB\xBF' . | xargs sed --in-place -e 's/\xef\xbb\xbf//g'

Exclude the .svn directory:

grep -rIlo --exclude-dir=".svn" $'^\xEF\xBB\xBF' . | xargs sed --in-place -e 's/\xef\xbb\xbf//g'

  • See more at: http://www.a5go.com/how-to-remove-bom-from-utf-8-using-sed.html#
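Where GNU grep and sed are not available, the same find-and-strip pass can be approximated in Python. This is a rough sketch under stated assumptions rather than an equivalent of the commands above: strip_boms_in_tree is a hypothetical helper, it matches a BOM anywhere in a file rather than only at the start of a line, and it does not skip binary files the way grep -I does.

```python
import os
import tempfile

BOM = b"\xef\xbb\xbf"  # UTF-8 encoding of U+FEFF


def strip_boms_in_tree(root, skip_dirs=(".svn",)):
    """Strip UTF-8 BOMs from every file under root; return the changed paths."""
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                data = fh.read()
            if BOM in data:
                with open(path, "wb") as fh:
                    fh.write(data.replace(BOM, b""))
                changed.append(path)
    return changed


# Demo on a throwaway tree (all file names are illustrative).
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, ".svn"))
    js_path = os.path.join(root, "a.js")
    svn_path = os.path.join(root, ".svn", "entries")
    for p in (js_path, svn_path):
        with open(p, "wb") as fh:
            fh.write(BOM + b"x\n" + BOM + b"y\n")
    changed = strip_boms_in_tree(root)
    with open(js_path, "rb") as fh:
        js_after = fh.read()
    with open(svn_path, "rb") as fh:
        svn_after = fh.read()
```

Unlike the perl command in Answer 1, this sketch rewrites files without keeping a backup, so it is best tried on a copy first.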



Answer 4:


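I also figured out this solution, which works entirely in PHP:

// Build the UTF-8 BOM as a three-byte string (EF BB BF)
$packed = pack("CCC", 0xef, 0xbb, 0xbf);
// Remove every occurrence from $contents, which is assumed to already hold the file's contents
$contents = preg_replace('/' . $packed . '/', '', $contents);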


Source: https://stackoverflow.com/questions/9100728/remove-multiple-boms-from-a-file
