Hibernate/JPA import.sql utf8 characters corrupted

感情败类 2020-12-31 07:48

I'm using import.sql to load my development data into the database. I'm running MySQL Server 5.5, and my persistence.xml is here:



        
4 Answers
  • 2020-12-31 07:54

    Here's a reliable solution that doesn't require setting any system property.

    Assume the import files are encoded in UTF-8 but the Java default charset is something else, say latin1.

    1) Register a custom class as the import_files_sql_extractor:

       hibernate.hbm2ddl.import_files_sql_extractor=com.pragmasphere.hibernate.CustomSqlExtractor

    2) In that class, fix the strings that Hibernate decoded with the wrong charset.

    package com.pragmasphere.hibernate;
    
    import org.hibernate.tool.hbm2ddl.MultipleLinesSqlCommandExtractor;
    
    import java.io.Reader;
    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;
    
    public class CustomSqlExtractor extends MultipleLinesSqlCommandExtractor {
    
        // Encoding the import files are actually saved in.
        private static final Charset SOURCE_CHARSET = StandardCharsets.UTF_8;
    
        @Override
        public String[] extractCommands(final Reader reader) {
            String[] lines = super.extractCommands(reader);
    
            // Hibernate decoded the file with the platform default charset.
            Charset defaultCharset = Charset.defaultCharset();
            if (!defaultCharset.equals(SOURCE_CHARSET)) {
                for (int i = 0; i < lines.length; i++) {
                    // Re-encode with the default charset to recover the original bytes,
                    // then decode them with the real encoding. This is only lossless if
                    // the default charset maps every byte to a character (e.g. ISO-8859-1).
                    lines[i] = new String(lines[i].getBytes(defaultCharset), SOURCE_CHARSET);
                }
            }
    
            return lines;
        }
    }
    

    You can change SOURCE_CHARSET to whatever encoding your import files actually use.
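
    To see why the re-encode/decode trick in step 2 works, here is a small standalone sketch that needs no Hibernate at all. The class and method names are made up for illustration, and ISO-8859-1 stands in for the "wrong" default charset. It only prints booleans so the result doesn't depend on your console encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeRoundTrip {

    // Simulates the bug: UTF-8 bytes decoded with the wrong charset.
    static String misdecode(String original, Charset wrongCharset) {
        return new String(original.getBytes(StandardCharsets.UTF_8), wrongCharset);
    }

    // Reverses it: re-encode with the wrong charset to get the original
    // bytes back, then decode those bytes as UTF-8.
    static String repair(String corrupted, Charset wrongCharset) {
        return new String(corrupted.getBytes(wrongCharset), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café", written with an escape to stay ASCII-only
        String corrupted = misdecode(original, StandardCharsets.ISO_8859_1);
        String repaired = repair(corrupted, StandardCharsets.ISO_8859_1);

        System.out.println(corrupted.equals(original)); // false: the accent became two characters
        System.out.println(repaired.equals(original));  // true: the round trip restored it
    }
}
```

    The round trip is safe with ISO-8859-1 because every byte value maps to exactly one character there. With a default charset that has unmapped bytes, some characters may not survive, so treat this as a workaround rather than a guarantee.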

  • 2020-12-31 07:56

    Since version 5.2.3 there is a new property in Hibernate for cases like this.

    <property name="hibernate.hbm2ddl.charset_name" value="UTF-8" />
    
  • 2020-12-31 08:12

    I use import.sql to populate the database during the test phase, and this link helped me solve the encoding problem: http://javacimrman.blogspot.ru/2011/07/hibernate-importsql-encoding-when.html.

  • 2020-12-31 08:19

    When creating the reader for that file, Hibernate calls new InputStreamReader(stream) directly, without an explicit encoding, so the platform default charset is used.

    So, in other words, your import.sql file must be in the default execution platform charset encoding.
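
    A minimal sketch of the difference, outside Hibernate: the same UTF-8 bytes are read once with the right charset and once with ISO-8859-1, which stands in for a non-UTF-8 platform default. The class name, the helper and the city table are all made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReaderCharsetDemo {

    // Reads all bytes through an InputStreamReader that uses the given charset.
    static String readAll(byte[] data, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[256];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical import.sql line containing a non-ASCII character (u-umlaut).
        String sql = "INSERT INTO city (name) VALUES ('M\u00fcnchen');";
        byte[] fileBytes = sql.getBytes(StandardCharsets.UTF_8); // the file as saved on disk

        System.out.println(readAll(fileBytes, StandardCharsets.UTF_8).equals(sql));      // true
        System.out.println(readAll(fileBytes, StandardCharsets.ISO_8859_1).equals(sql)); // false

        // new InputStreamReader(stream) without a charset behaves like passing this one:
        System.out.println(Charset.defaultCharset());
    }
}
```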

    There is an old (2006!) open issue for this, in case one wishes to send a patch: https://hibernate.atlassian.net/browse/HBX-711


    Options to fix:

    • Add -Dfile.encoding=UTF-8 to the JAVA_OPTS environment variable:

      # Linux/Unix
      export JAVA_OPTS=-Dfile.encoding=UTF-8
      # Windows
      set JAVA_OPTS=-Dfile.encoding=UTF-8
      
      # Check first whether JAVA_OPTS already has a value. If it does, append instead
      # (the quotes are required on Linux/Unix because the value contains a space):
      export JAVA_OPTS="$JAVA_OPTS -Dfile.encoding=UTF-8"
      # or, on Windows
      set JAVA_OPTS=%JAVA_OPTS% -Dfile.encoding=UTF-8
      
    • Set the property in your Maven plugin (surefire, failsafe or another one, depending on how you run the code that imports the file). Example for surefire:

      <plugin>
         <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-surefire-plugin</artifactId>
         <configuration>
            <argLine>-Dfile.encoding=UTF-8</argLine>
         </configuration>
      </plugin>
      
    • With Gradle, add systemProperty 'file.encoding', 'UTF-8' to the task configuration block. (Thanks @meztihn)
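
    Whichever option you pick, it's worth checking that the setting actually reached the JVM that runs the import. A tiny sketch (the class name is made up) that you could run with the same options, or whose describe() line you could log from a test:

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {

    // Reports the file.encoding property and the charset the JVM really uses by default.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

    If defaultCharset is not UTF-8 here, the option was not picked up by that process. Note that on JDK 18 and later the default charset is already UTF-8 unless it is overridden.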
