Convert ISO-8859-1 to UTF-8 using Groovy

情歌与酒 2021-02-14 09:24

I need to convert an ISO-8859-1 file to UTF-8 encoding without losing any content information...

I have a file which looks like this:



        
2 answers
  • 2021-02-14 09:53
    def f=new File('c:/data/myiso88591.xml').getText('ISO-8859-1')
    new File('c:/data/myutf8.xml').write(f,'utf-8')
    

    (I just gave it a try, it works :-)

    Same as in Java: the libraries do the conversion for you. As deceze said, when you specify an encoding while reading, the text is converted to the internal string format (UTF-16, as far as I know). When you specify another encoding while writing the string, it is converted to that encoding.
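
    To make the decode-then-encode step concrete, here is a minimal sketch that does the same conversion on an in-memory byte array instead of a file. The sample text is an assumption chosen for illustration; it contains one non-ASCII character so the difference between the two encodings is visible.

```groovy
// Hypothetical sample text; \u00e9 is "e" with an acute accent.
def text = 'caf\u00e9'

// In ISO-8859-1 every character is exactly one byte.
byte[] latin1 = text.getBytes('ISO-8859-1')

// Decoding: bytes plus a named charset become a String (the internal format).
def decoded = new String(latin1, 'ISO-8859-1')

// Encoding: the String is written out again in the target charset.
byte[] utf8 = decoded.getBytes('UTF-8')

// The accented character is one byte (e9) in ISO-8859-1 and two (c3 a9) in UTF-8.
println latin1.encodeHex()
println utf8.encodeHex()
```

    The string itself never changes; only its byte representation does, which is why no content is lost as long as the source encoding is named correctly.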

    But if you work with XML, you shouldn't have to worry about the encoding anyway, because the XML parser takes care of it. It reads the first characters, <?xml, and determines the basic encoding from them. After that, it can read the encoding information from the XML header and use it.
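
    A small sketch of that behaviour, assuming the JDK's built-in DOM parser and a made-up one-element document: the parser is handed raw ISO-8859-1 bytes and picks the encoding up from the declaration by itself. The key point is to pass a stream of bytes rather than a String or Reader, otherwise the decoding has already happened before the parser sees the data.

```groovy
import javax.xml.parsers.DocumentBuilderFactory

// Hypothetical sample document, stored as ISO-8859-1 bytes.
def xml = '<?xml version="1.0" encoding="ISO-8859-1"?><name>Jos\u00e9</name>'
byte[] raw = xml.getBytes('ISO-8859-1')

// No charset is passed here; the parser reads it from the XML declaration.
def doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder()
        .parse(new ByteArrayInputStream(raw))

// The text arrives as a normal String with the accented character intact.
def name = doc.documentElement.textContent
println name
```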

  • 2021-02-14 10:05

    Making it a little more Groovy, and not requiring the whole file to fit in memory, you can use the readers and writers to stream the file. This was my solution when I had files too big for plain old Unix iconv(1).

    new FileOutputStream('out.txt').withWriter('UTF-8') { writer ->
        new FileInputStream('in.txt').withReader('ISO-8859-1') { reader ->
            writer << reader
        }
    }
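
    For trying the streaming approach without touching real data, the same pattern can be wrapped in a small helper and run against temporary files. This is only a sketch: the name transcode and the temporary files are assumptions made for illustration, not part of the original answer.

```groovy
// Hypothetical helper: stream src into dst, re-encoding along the way.
def transcode(File src, String srcEnc, File dst, String dstEnc) {
    dst.withWriter(dstEnc) { writer ->
        src.withReader(srcEnc) { reader ->
            writer << reader   // appends the reader's contents to the writer
        }
    }
}

// Temporary files stand in for the real input and output paths.
def src = File.createTempFile('latin1', '.txt')
def dst = File.createTempFile('utf8', '.txt')
[src, dst]*.deleteOnExit()

// Write a small ISO-8859-1 sample: "cafe" with an accented e, plus a newline.
src.bytes = 'caf\u00e9\n'.getBytes('ISO-8859-1')

transcode(src, 'ISO-8859-1', dst, 'UTF-8')

// Reading the result back as UTF-8 yields the original text.
println dst.getText('UTF-8')
```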
    
    • http://www.hjsoft.com/blog/link/A_Useful_Example_in_Java_Ruby_and_Groovy