How to set charset for MySQL in RODBC?

暖寄归人 2021-01-19 17:31

I have data with Chinese characters in both the field names and the values. I imported the data from XLS into Access 2007 and exported it to ODBC. I then use RODBC to read it into R, the

2 Answers
  • 2021-01-19 18:02

    I'm not familiar with ODBC and RODBC, but my reading of the above snippet of documentation is that SET NAMES 'utf8'; is part of MySQL's SQL dialect, so you run it as you would any other SQL statement that you use to retrieve data from your database.

    Something like (not tested):

    sqlQuery(myChannel, query = "SET NAMES 'utf8';")
    

    where myChannel is the connection handle returned by odbcConnect().

    Is there a reason you are using RODBC over the RMySQL package? I've had good experience using RMySQL for extensive data processing and retrieval of complex sets of data all from within R.

    Update: There is some evidence that, at least at one point, SET NAMES was deactivated in the MySQL ODBC driver. If you are confident you can read the characters via direct access to the database (via mysql or one of MySQL's GUI front ends), then you could try to replicate what SET NAMES does. The following is from the MySQL manual:

    A SET NAMES 'x' statement is equivalent to these three statements:
    
    SET character_set_client = x;
    SET character_set_results = x;
    SET character_set_connection = x;
    

    You could try executing those three SQL statements in place of SET NAMES and see if that works.
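
    As a rough, untested sketch of that idea: the helper names `set_names_statements` and `apply_set_names` below are made up for illustration, and `channel` is assumed to be an open handle from odbcConnect(). The last query only reports which character sets the session ended up with, so you can see whether the driver honoured the statements.

```r
# Build the three statements the MySQL manual lists as equivalent to SET NAMES.
# (Hypothetical helper; requires nothing beyond base R.)
set_names_statements <- function(charset) {
  # Only accept plain identifiers so the value can be pasted into SQL safely.
  stopifnot(grepl("^[A-Za-z0-9_]+$", charset))
  sprintf("SET %s = %s",
          c("character_set_client",
            "character_set_results",
            "character_set_connection"),
          charset)
}

# Run the statements on an open RODBC channel, then report the session settings.
# (Hypothetical helper; needs the RODBC package and a live connection.)
apply_set_names <- function(channel, charset = "utf8") {
  for (stmt in set_names_statements(charset)) {
    RODBC::sqlQuery(channel, stmt)
  }
  RODBC::sqlQuery(channel, "SHOW VARIABLES LIKE 'character_set%'")
}
```

    With a channel in hand, `apply_set_names(myChannel)` should return a small data frame of the character_set_* variables for the session.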

    The same manual also documents SET CHARACTER SET, which can be used in the same way as SET NAMES:


    SET CHARACTER SET charset_name

    SET CHARACTER SET is similar to SET NAMES but sets character_set_connection and collation_connection to character_set_database and collation_database. A SET CHARACTER SET x statement is equivalent to these three statements:

    SET character_set_client = x;
    SET character_set_results = x;
    SET collation_connection = @@collation_database;
    

    Setting collation_connection also sets character_set_connection to the character set associated with the collation (equivalent to executing SET character_set_connection = @@character_set_database). It is not necessary to set character_set_connection explicitly.


    You could try using SET CHARACTER SET 'utf8' instead.

    Finally, what character set / locale are you running in? It looks like you are on Windows. Is this a UTF-8 locale? I also note some confusion in your question. You say you imported your data into MS Access and then exported it to ODBC. Do you mean you exported it to MySQL? I thought ODBC was a connection driver that allows communication with and between a range of databases, not something you could "export to".

    Are your data really in MySQL? Could you not connect to MS Access via RODBC and read the data from there?
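
    If the data are still in the Access file, something along these lines might work (not tested). RODBC's odbcConnectAccess2007() is Windows-only and needs the Access ODBC driver installed; the function name `read_access_table`, the file path, and the table name are placeholders I made up.

```r
# Read one table straight from an Access 2007 file via RODBC.
# (Hypothetical helper; Windows only, requires the RODBC package.)
read_access_table <- function(path, table) {
  ch <- RODBC::odbcConnectAccess2007(path)
  # odbcConnect* functions return -1 (with a warning) when the connection fails.
  if (!inherits(ch, "RODBC")) stop("could not open Access file: ", path)
  on.exit(RODBC::odbcClose(ch))
  RODBC::sqlFetch(ch, table, stringsAsFactors = FALSE)
}

# Example call with placeholder names:
# df <- read_access_table("C:/data/mydata.accdb", "Sheet1")
```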

    If the data are in MySQL, try using the RMySQL package to connect to the database and read the data.
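
    A minimal sketch of the RMySQL route (not tested here): the function name `read_mysql_table` and all connection details are placeholders, and issuing SET NAMES utf8 assumes the data really are stored as UTF-8 on the server.

```r
# Read one table from MySQL through RMySQL/DBI instead of ODBC.
# (Hypothetical helper; requires the DBI and RMySQL packages.)
read_mysql_table <- function(dbname, table, user, password,
                             host = "localhost") {
  con <- DBI::dbConnect(RMySQL::MySQL(),
                        dbname = dbname, user = user,
                        password = password, host = host)
  on.exit(DBI::dbDisconnect(con))
  # Ask the server to exchange text with this session as UTF-8 (assumption).
  DBI::dbGetQuery(con, "SET NAMES utf8")
  DBI::dbReadTable(con, table)
}

# Example call with placeholder names:
# df <- read_mysql_table("mydb", "mytable", user = "me", password = "secret")
```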

  • 2021-01-19 18:12

    I just found the cure. Don't know if I can post.

    1. Set up the MySQL database to be UTF-8 based;

    2. Set up the ODBC DSN and do NOT set the "character set" option.

    3. ch <- odbcConnect("mydb", DBMSencoding = "UTF-8")

    That's it.
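
    Put together, the R side of those steps might look like the sketch below. The DSN "mydb" comes from step 3; the function name `read_utf8_table` and the table name in the example call are placeholders, and I have only wrapped the same odbcConnect() call in a little error handling.

```r
# Open the DSN with DBMSencoding = "UTF-8" and read one table.
# (Hypothetical helper; requires the RODBC package and a configured DSN.)
read_utf8_table <- function(dsn, table) {
  ch <- RODBC::odbcConnect(dsn, DBMSencoding = "UTF-8")
  # odbcConnect returns -1 (with a warning) when the connection fails.
  if (!inherits(ch, "RODBC")) stop("could not open DSN: ", dsn)
  on.exit(RODBC::odbcClose(ch))
  RODBC::sqlFetch(ch, table, stringsAsFactors = FALSE)
}

# Example call with a placeholder table name:
# df <- read_utf8_table("mydb", "mytable")
# names(df)  # the Chinese field names should now display correctly
```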
