I'm using import.sql to write my development data to the DB. I'm using MySQL Server 5.5 and my persistence.xml is here:
Here's a reliable solution without setting any system property.
We assume that the import files are encoded in UTF-8 but the Java default charset is different, let's say latin1.
1) Register a custom class through the hibernate.hbm2ddl.import_files_sql_extractor property:
hibernate.hbm2ddl.import_files_sql_extractor=com.pragmasphere.hibernate.CustomSqlExtractor
2) Fix the invalid strings read by Hibernate in the implementation:
package com.pragmasphere.hibernate;

import org.hibernate.tool.hbm2ddl.MultipleLinesSqlCommandExtractor;

import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CustomSqlExtractor extends MultipleLinesSqlCommandExtractor {

    // Encoding the import files are actually saved in.
    private static final Charset SOURCE_CHARSET = StandardCharsets.UTF_8;

    @Override
    public String[] extractCommands(final Reader reader) {
        String[] lines = super.extractCommands(reader);
        Charset defaultCharset = Charset.defaultCharset();
        if (!defaultCharset.equals(SOURCE_CHARSET)) {
            for (int i = 0; i < lines.length; i++) {
                // Hibernate decoded the file with the default charset: recover
                // the raw bytes, then decode them with the real encoding.
                lines[i] = new String(lines[i].getBytes(defaultCharset), SOURCE_CHARSET);
            }
        }
        return lines;
    }
}
You can change the value of SOURCE_CHARSET to whichever encoding your import files use.
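The repair in extractCommands works by re-encoding each wrongly decoded string with the default charset to get the original bytes back, then decoding those bytes as UTF-8. A minimal standalone sketch of that round trip follows; the class and method names are hypothetical and only the JDK is used. It also shows the limitation: the trick is lossless only when the default charset maps every byte to a character, as ISO-8859-1 (latin1) does. With a charset such as US-ASCII the original bytes cannot be recovered.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Hibernate.
class CharsetRoundTrip {

    // Simulates UTF-8 file content being decoded with the wrong charset.
    static String misdecode(String original, Charset wrong) {
        return new String(original.getBytes(StandardCharsets.UTF_8), wrong);
    }

    // Same idea as the extractor: re-encode with the wrong charset to get
    // the raw bytes back, then decode them as UTF-8.
    static String repair(String garbled, Charset wrong) {
        return new String(garbled.getBytes(wrong), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e

        // latin1 maps all 256 byte values, so the round trip is lossless.
        String garbled = misdecode(original, StandardCharsets.ISO_8859_1);
        System.out.println(repair(garbled, StandardCharsets.ISO_8859_1).equals(original)); // true

        // US-ASCII has no characters for bytes >= 0x80, so they are lost.
        String lost = misdecode(original, StandardCharsets.US_ASCII);
        System.out.println(repair(lost, StandardCharsets.US_ASCII).equals(original)); // false
    }
}
```

If the platform default is not a single-byte charset that covers every byte value, fixing the encoding at read time (rather than repairing afterwards) is the safer route.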
Since version 5.2.3 there is a new property in Hibernate for cases like this.
<property name="hibernate.hbm2ddl.charset_name" value="UTF-8" />
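If persistence.xml cannot be edited, the same property can presumably be supplied as a programmatic override instead. The sketch below only builds the override map with the JDK; the class name and the persistence unit name "myUnit" are hypothetical, and the JPA call is left as a comment because it needs the JPA API on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, shown only to illustrate passing the property in code.
class ImportCharsetSettings {

    // Builds the overrides that would otherwise live in persistence.xml.
    static Map<String, String> overrides() {
        Map<String, String> props = new HashMap<>();
        props.put("hibernate.hbm2ddl.charset_name", "UTF-8");
        return props;
    }

    public static void main(String[] args) {
        // With JPA available, the map could be passed as the second argument
        // ("myUnit" is an assumed persistence unit name):
        // Persistence.createEntityManagerFactory("myUnit", overrides());
        System.out.println(overrides());
    }
}
```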
I'm using import.sql to populate the database during the test phase, and this link helped me solve the encoding problem: http://javacimrman.blogspot.ru/2011/07/hibernate-importsql-encoding-when.html.
When creating the reader for that file, Hibernate uses new InputStreamReader(stream); directly, without explicit encoding (the default execution platform charset encoding is assumed/used). So, in other words, your import.sql file must be in the default execution platform charset encoding.
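To make the effect concrete, here is a small sketch of what happens when UTF-8 bytes go through an InputStreamReader with the wrong charset. It is not Hibernate's code: the class name, the helper and the sample SQL statement are made up for illustration, and ISO-8859-1 is passed explicitly to stand in for a JVM whose default charset is latin1.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Hibernate.
class ReaderEncodingDemo {

    // Reads the whole stream as text using the given charset.
    static String readAll(InputStream stream, Charset charset) {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(stream, charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample statement containing a non-ASCII character (u with umlaut).
        String sql = "INSERT INTO city (name) VALUES ('M\u00fcnchen');";
        byte[] fileBytes = sql.getBytes(StandardCharsets.UTF_8);

        // Stand-in for a platform whose default charset is latin1.
        String wrong = readAll(new ByteArrayInputStream(fileBytes), StandardCharsets.ISO_8859_1);
        // Explicit charset matching the file's real encoding.
        String right = readAll(new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8);

        System.out.println(wrong.equals(sql)); // false
        System.out.println(right.equals(sql)); // true
    }
}
```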
There is an old (2006!) open issue for this, in case one wishes to send a patch: https://hibernate.atlassian.net/browse/HBX-711
Options to fix:
Add -Dfile.encoding=UTF-8 to the JAVA_OPTS environment variable, such as:
# Linux/Unix
export JAVA_OPTS=-Dfile.encoding=UTF-8
# Windows
set JAVA_OPTS=-Dfile.encoding=UTF-8
# Caution: check first whether JAVA_OPTS already has a value. If so,
# append to it instead (the quotes are needed because of the space):
export JAVA_OPTS="$JAVA_OPTS -Dfile.encoding=UTF-8"
# or
set JAVA_OPTS=%JAVA_OPTS% -Dfile.encoding=UTF-8
Set a property in your Maven plugin (could be surefire, failsafe or other, depending on how you run the code that imports the Hibernate file). Example for surefire:
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <configuration>
        <argLine>-Dfile.encoding=UTF-8</argLine>
    </configuration>
</plugin>
With Gradle: add systemProperty 'file.encoding', 'UTF-8' to the task configuration block. (Thanks @meztihn)
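Whichever option you pick, it can be worth confirming that the JVM that actually runs the import picked it up. A minimal check, assuming nothing beyond the JDK (the class name is hypothetical), is to print the property and the resulting default charset from the same environment, for example from a test run by the same Maven or Gradle task:

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic class for checking the effective encoding.
class DefaultCharsetCheck {

    // Reports the file.encoding property and the charset the JVM really uses.
    static String report() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

If the reported default charset is not the encoding of import.sql, the option did not reach the JVM that performs the import.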