I am trying to query data from a MySQL database that contains some strings. For the connection and data retrieval I am using RMySQL in R, which works fine apart from one thing: the strings I retrieve do not seem to be in UTF-8. I need that because the strings contain German umlauts. When I ask the database which encodings it uses with
dbGetQuery(db, "SHOW VARIABLES LIKE 'character_set_%';")
I get the desired answer:
Variable_name Value
1 character_set_client utf8
2 character_set_connection utf8
3 character_set_database utf8
4 character_set_filesystem binary
5 character_set_results utf8
6 character_set_server utf8
7 character_set_system utf8
8 character_sets_dir C:\\Program Files\\MySQL\\MySQL Server 5.7\\share\\charsets\\
But e.g. I receive
Andreas WÃ¼nsche
instead of
Andreas Wünsche
I hope somebody knows how to deal with this. If additional information is needed, just ask and I can provide it.
I found something that is a bit tricky but works for me:
you have to manually mark the columns of your data frame as UTF-8, like this:
x <- "WÃ¼nsche"
Encoding(x) <- "UTF-8"
x
[1] "Wünsche"
I think you have to do this for every character vector in your result.
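If the result has several string columns, the same marking can be applied in a loop. This is only a sketch: the helper name mark_utf8 is made up for illustration, and it assumes the bytes returned by the driver really are UTF-8 and are merely labelled with the wrong encoding.

```r
# Hypothetical helper: mark every character column of a data frame as UTF-8.
# This changes only the encoding label, not the underlying bytes.
mark_utf8 <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  for (nm in names(df)[is_chr]) {
    Encoding(df[[nm]]) <- "UTF-8"
  }
  df
}

# Example with UTF-8 bytes built from raw values, as a driver might return them
raw_name <- rawToChar(as.raw(c(0x57, 0xc3, 0xbc, 0x6e, 0x73, 0x63, 0x68, 0x65)))
res <- data.frame(name = raw_name, id = 1L, stringsAsFactors = FALSE)
res <- mark_utf8(res)
Encoding(res$name)
```

Non-character columns are left untouched, and pure ASCII strings keep the "unknown" label because they need no marking.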
EDIT:
A related answer seems to fix the same problem by running 'set character set "utf8"'
inside dbSendQuery().
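A minimal sketch of that approach, assuming an open RMySQL connection like the db object in the question. The helper name query_as_utf8 and the table and column names in the usage comment are placeholders, not part of any package.

```r
# Hypothetical helper: switch the session character set, then run a query.
# `db` is an open connection created with dbConnect(RMySQL::MySQL(), ...).
query_as_utf8 <- function(db, sql) {
  rs <- DBI::dbSendQuery(db, "SET CHARACTER SET utf8")
  DBI::dbClearResult(rs)  # the SET statement returns no rows; free the result
  DBI::dbGetQuery(db, sql)
}

# Usage (placeholder table and column names):
# persons <- query_as_utf8(db, "SELECT name FROM persons")
```

The statement only affects the current session, so it has to be sent again after every reconnect. SET NAMES utf8 is a closely related MySQL statement that also sets character_set_connection and may be worth trying if this one is not enough.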
When trying to use utf8/utf8mb4, if you see Mojibake, check the following. This discussion also applies to Double Encoding, which is not necessarily visible.
- The bytes to be stored need to be utf8-encoded.
- The connection when INSERTing and SELECTing text needs to specify utf8 or utf8mb4.
- The column needs to be declared CHARACTER SET utf8 (or utf8mb4).
- HTML should start with <meta charset=UTF-8>.
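To tell plain Mojibake from Double Encoding, it can help to look at the raw bytes on the R side. The following sketch simulates double encoding locally rather than reading from a database. The variable names are illustrative, and it assumes the damage came from UTF-8 bytes being read as latin1.

```r
# "Wünsche" as correct UTF-8 bytes, built from raw values so the example
# does not depend on how this text itself is encoded.
good <- rawToChar(as.raw(c(0x57, 0xc3, 0xbc, 0x6e, 0x73, 0x63, 0x68, 0x65)))
Encoding(good) <- "UTF-8"

# Simulate double encoding: the UTF-8 bytes are treated as latin1 and
# converted to UTF-8 a second time.
double <- iconv(good, from = "latin1", to = "UTF-8")

charToRaw(good)    # ü is stored as c3 bc
charToRaw(double)  # ü is stored as c3 83 c2 bc

# Undo one round of encoding, then mark the result as UTF-8.
fixed <- iconv(double, from = "UTF-8", to = "latin1")
Encoding(fixed) <- "UTF-8"
identical(charToRaw(fixed), charToRaw(good))
```

If the bytes of a retrieved string already look like the first case, only the encoding label is wrong and marking the string as UTF-8 is sufficient. If they look like the second case, the data was stored double-encoded and the connection or column settings from the checklist need fixing.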
Source: https://stackoverflow.com/questions/38347348/utf8-encoding-using-rmysql