JDBC-ODBC Bridge queries to Access fail when they have accented characters

后端 未结 2 597
无人及你
无人及你 2020-12-10 23:03

I\'m sending a query through the JDBC-ODBC Bridge to an Access database from Java, like this:

\"SELECT * FROM localities WHERE locName=\'\" + cityName + \"\'         


        
相关标签:
2条回答
  • 2020-12-10 23:47

    Try to use PreparedStatement.

    I have just tested MS Access Northwind database with Jython 2.5 which uses JDBC-ODBC bridge:

    c = db.createStatement()
    TRADH = u'Tradi\xe7\u0103o Hipermercados'
    pstm = db.prepareStatement("SELECT CustomerID, CompanyName FROM customers WHERE CompanyName=?")
    pstm.setString(1, TRADH)
    rs = pstm.executeQuery()
    while (rs.next()):
        try:
            s1 = rs.getString(1)
            s2 = rs.getString(2)
            print('[%s] [%s]' % (s1, s2))
        except UnicodeEncodeError:
            print('[%s] [%s] !!!' % (s1, repr(s2)))
    c.close()
    

    In your code it will look like:

    pstm = db.prepareStatement("SELECT * FROM localities WHERE locName=?");
    pstm.setString(1, locName);
    rs = pstm.executeQuery();
    ...
    
    0 讨论(0)
  • 2020-12-11 00:04

    It sounds like your Java source file is encoded as UTF-8 so when the cityName string contains LEÓN it is encoded as

    L  E  Ó     N
    -- -- ----- --
    4C 45 C3 93 4E
    

    That is not how Access stores the value. Access does store characters as Unicode, but it does not use UTF-8 encoding. It uses a variation of UTF-16LE encoding where characters with code points U+00FF and below are stored in a single byte, and characters with code points above U+00FF are stored as a Null (0x0) value followed by their UTF-16LE byte pair(s). In this case Ó is U+00D3, which is below U+00FF, so Access stores all four characters of the string as single bytes:

    L  E  Ó  N
    -- -- -- --
    4C 45 D3 4E
    

    The net effect is that the encoding of the string in the Access database is the same as it would be for the ISO 8859-1 character set.

    This can be confirmed with the following Java code which uses the JDBC-ODBC Bridge. It fails to find the desired record when the Java source file is encoded as UTF-8, but it works when the Java source file is encoded as cp1252 in Eclipse:

    import java.sql.*;
    
    public class accentTestMain {
    
        public static void main(String[] args) {
            String connectionString = 
                    "jdbc:odbc:Driver={Microsoft Access Driver (*.mdb, *.accdb)};" + 
                    "DBQ=C:\\__tmp\\test\\accented.accdb;";
            try {
                Connection con = DriverManager.getConnection(connectionString);
                PreparedStatement stmt = con.prepareStatement("SELECT * FROM localities WHERE locName=?");
                String cityName = "LEÓN";
                stmt.setString(1, cityName);
                stmt.execute();
                ResultSet rs = stmt.getResultSet();
                if (rs.next()) {
                    System.out.println(String.format("Record found, ID=%d", rs.getInt("ID")));
                }
                else {
                    System.out.print("Record not found.");
                }
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    
    }
    

    If you can make do with supporting only the accented characters represented in the cp1252 character set then you should be able to simply use cp1252 as the encoding setting for your Java source file(s).

    On the other hand, if you really need full Unicode character support with an Access database then the JDBC-ODBC Bridge is not going to get the job done for you. This is a long-standing interoperability issue between the JDBC-ODBC Bridge and the Access ODBC driver, and it is not going to be fixed. (More details here.)

    In that case you might want to consider using UCanAccess which is a pure-Java JDBC driver for Access. The corresponding code using UCanAccess with a UTF-8 encoded source file would be

    // assumes...
    //     import java.sql.*;
    Connection conn=DriverManager.getConnection(
            "jdbc:ucanaccess://C:/__tmp/test/accented.accdb");
    PreparedStatement ps = conn.prepareStatement(
            "SELECT ID FROM localities WHERE locName=?");
    ps.setString(1, "LEÓN");
    ResultSet rs = ps.executeQuery();
    if (rs.next()) {
        System.out.println(String.format(
                "Record found, ID=%d", 
                rs.getInt("ID")));
    }
    else {
        System.out.println("Record not found.");
    }
    

    For more information on using UCanAccess, see the related question here.

    Another solution would be to use Jackcess to manipulate the Access database like so (again, the Java source file is encoded as UTF-8):

    import java.io.File;
    import java.io.IOException;
    import com.healthmarketscience.jackcess.*;
    
    public class accentTestMain {
    
        public static void main(String[] args) {
            Database db;
            try {
                db = DatabaseBuilder.open(new File("C:\\__tmp\\test\\accented.accdb"));
                try {
                    Table tbl = db.getTable("localities");
                    Cursor crsr = CursorBuilder.createCursor(tbl.getIndex("locName"));
                    if (crsr.findFirstRow(tbl.getColumn("locName"), "LEÓN")) {
                        System.out.println(String.format("Record found, ID=%d", crsr.getCurrentRowValue(tbl.getColumn("ID"))));
                    }
                    else {
                        System.out.println("Record not found.");
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                } finally {
                    db.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    
    }
    
    0 讨论(0)
提交回复
热议问题