Question
I'm trying to export the GeoTools HSQL 2 database and load it back into HSQL 1 for a legacy system that needs the older database format. The tables include characters like the degree symbol, but it's coming out as the escape sequence \u0080
rather than the encoded character. I need to either fix that, or have the HSQL 1 import convert the escaped characters back into the correct encoding.
e.g.
cp modules/plugin/epsg-hsql/src/main/resources/org/geotools/referencing/factory/epsg/EPSG.zip /tmp
cd /tmp
unzip EPSG.zip
java -jar hsqldb-2.4.1.jar
# For the file, put jdbc:hsqldb:file:/tmp/EPSG
SCRIPT 'epsg-dump'
And in the results I see things like this \u00b5:
INSERT INTO EPSG_ALIAS VALUES(389,'epsg_unitofmeasure',9109,7302,'\u00b5rad','')
Looking into hsqldb, I'm not sure how to control the encoding of the data being written, assuming that this is the correct location to look:
https://github.com/ryenus/hsqldb/blob/master/src/org/hsqldb/scriptio/ScriptWriterText.java
Answer 1:
You can use the following procedure:
- In the source database, create TEXT tables with exactly the same columns as the original tables. Use CREATE TEXT TABLE thecopyname (LIKE thesourcename) for each table.
- Use SET TABLE thecopyname SOURCE 'thecopyname.csv;encoding=UTF-8' for each of the copy tables.
- INSERT into each thecopyname table with SELECT * FROM thesourcename.
- Use SET TABLE thecopyname SOURCE OFF for each thecopyname.
- You will now have several thecopyname.csv files (each with its own name) with UTF-8 encoding.
- Use the reverse procedure on the target database. You need to explicitly create the TEXT tables, then use SET TABLE thecopyname SOURCE 'thecopyname.csv;encoding=UTF-8' (a worked example for a single table is sketched below).
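A minimal worked sketch of the above for a single table, using the EPSG_ALIAS table from the dump and a hypothetical copy table EPSG_ALIAS_COPY backed by a file epsg_alias_copy.csv (both names are illustrative). On the HSQL 1 side the column list may need to be written out by hand if CREATE TEXT TABLE ... (LIKE ...) is not supported there:
-- On the source (HSQL 2) database: export one table to a UTF-8 CSV file.
CREATE TEXT TABLE EPSG_ALIAS_COPY (LIKE EPSG_ALIAS);
SET TABLE EPSG_ALIAS_COPY SOURCE 'epsg_alias_copy.csv;encoding=UTF-8';
INSERT INTO EPSG_ALIAS_COPY SELECT * FROM EPSG_ALIAS;
SET TABLE EPSG_ALIAS_COPY SOURCE OFF;
-- On the target (HSQL 1) database: attach the same CSV as a TEXT table,
-- then copy the rows into the ordinary EPSG_ALIAS table created beforehand.
CREATE TEXT TABLE EPSG_ALIAS_COPY (LIKE EPSG_ALIAS);
SET TABLE EPSG_ALIAS_COPY SOURCE 'epsg_alias_copy.csv;encoding=UTF-8';
INSERT INTO EPSG_ALIAS SELECT * FROM EPSG_ALIAS_COPY;
SET TABLE EPSG_ALIAS_COPY SOURCE OFF;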
Answer 2:
The encoding looks like Unicode (one to four hex digits).
Try this in bash (quick & dirty):
echo -ne "$(< dump.sql)" > dump_utf8.sql
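Here $(< dump.sql) expands to the contents of the dump file, and echo -ne interprets the \uXXXX escape sequences (bash's builtin echo supports \u escapes since version 4.2), so the literal UTF-8 characters are written to dump_utf8.sql. Note that this also interprets any other backslash escapes that happen to appear in the dump, so inspect the result before importing it.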
Source: https://stackoverflow.com/questions/54208286/exporting-hsqldb-database-with-utf-8-encoding