How can I handle unicode with Perl's DBI?

北城以北 submitted on 2019-12-06 17:07:40

Question


My delicious-to-wp Perl script works, but it turns every "weird" character into even weirder output. So I tried

$description = decode_utf8( $description ); 

but that doesn't make a difference. For example, I would like “go live” to come out as “go live” and not as “go live†. How can I handle Unicode in Perl so that this works?

UPDATE: The problem turned out to be the encoding of the DBI connection. I had to set it from Perl:

my $sql = qq{SET NAMES 'utf8';};
$dbh->do($sql);

That was the tricky part I had been missing. Thanks!


Answer 1:


It may have nothing to do with Perl. Check that the pertinent MySQL table columns actually use a UTF-8 character set.
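
One way to run that check from Perl is to ask MySQL's information_schema which character set each column uses. This is only a sketch: the helper name column_charsets is hypothetical, and it assumes an already connected DBI handle.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBI): map each text column of a table
# to the character set MySQL reports for it.
sub column_charsets {
    my ($dbh, $schema, $table) = @_;

    # Non-text columns have a NULL CHARACTER_SET_NAME, so skip them.
    my $rows = $dbh->selectall_arrayref(
        q{SELECT COLUMN_NAME, CHARACTER_SET_NAME
            FROM information_schema.COLUMNS
           WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ?
             AND CHARACTER_SET_NAME IS NOT NULL},
        undef, $schema, $table,
    );

    my %charset_of;
    $charset_of{ $_->[0] } = $_->[1] for @$rows;
    return \%charset_of;
}
```

Calling column_charsets($dbh, 'db_name', 'some_table') with your own schema and table names returns a hash reference. Any value other than utf8 or utf8mb4 points at a column that is not stored as UTF-8.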




Answer 2:


It's worth noting that if you're running a new enough version of DBD::mysql (3.0008 onward), you can set $dbh->{'mysql_enable_utf8'} = 1; and everything is then decoded/encoded for you on the way out of/into DBI.




Answer 3:


Enable UTF-8 when you connect to the database, like this:

my $dbh = DBI->connect(
    "dbi:mysql:dbname=db_name",
    "db_user", "db_pass",
    { RaiseError => 0, PrintError => 0, mysql_enable_utf8 => 1 },
) or die "Connect to database failed: $DBI::errstr";

This should get you character mode strings with the UTF8 flag set as needed.

From DBI General Interface Rules & Caveats:

Perl supports two kinds of strings: Unicode (utf8 internally) and non-Unicode (defaults to iso-8859-1 if forced to assume an encoding). Drivers should accept both kinds of strings and, if required, convert them to the character set of the database being used. Similarly, when fetching from the database character data that isn't iso-8859-1 the driver should convert it into utf8.

And the specifics from the DBD::mysql documentation for mysql_enable_utf8:

Additionally, turning on this flag tells MySQL that incoming data should be treated as UTF-8. This will only take effect if used as part of the call to connect(). If you turn the flag on after connecting, you will need to issue the command SET NAMES utf8 to get the same effect.
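
The caveat in that quote can be sketched as a small helper for a handle that is already connected. The name enable_utf8_late is hypothetical. The helper only uses the mysql_enable_utf8 attribute and the SET NAMES statement named in the quote.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBI or DBD::mysql): switch an already
# connected handle to UTF-8, following the caveat quoted above.
sub enable_utf8_late {
    my ($dbh) = @_;

    # Ask DBD::mysql to decode/encode strings for us from now on.
    $dbh->{mysql_enable_utf8} = 1;

    # The flag was set after connect(), so also tell the server which
    # character set this connection uses.
    $dbh->do('SET NAMES utf8');

    return $dbh;
}
```

Passing mysql_enable_utf8 => 1 to connect(), as shown earlier in this answer, makes the extra statement unnecessary.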




Answer 4:


The statement

$dbh->do(qq{SET NAMES 'utf8';});

definitely saves the day when accessing a database declared as UTF-8. Take note, though: if you are going to do any Perl processing of data obtained from the database, it is wise to decode it into a Perl character string first, as this is not done implicitly:

use Encode qw(decode);   # decode() comes from the Encode module
$utfstring = decode('utf8', $string_from_db);

Of course, for proper I/O handling of UTF-8 strings (reading, printing, writing to output), remember to set

use open ':utf8';

and

binmode STDOUT, ":utf8";

The latter is essential for printing out UTF-8 strings. Hope this helps.
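
Put together, the steps in this answer might look like the sketch below. It runs without a database: the variable $string_from_db is a stand-in, holding the UTF-8 bytes that a fetch without mysql_enable_utf8 is assumed to return.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Stand-in for a fetched value: the raw UTF-8 bytes of the quoted phrase
# "go live" with curly quotes (U+201C and U+201D).
my $string_from_db = "\xE2\x80\x9Cgo live\xE2\x80\x9D";

# Decode the bytes into a Perl character string before processing them.
my $utfstring = decode('utf8', $string_from_db);

# The byte string is 13 bytes long, the character string 9 characters.
my $byte_len = length $string_from_db;
my $char_len = length $utfstring;

# Encode character strings back to UTF-8 when printing.
binmode STDOUT, ":utf8";
print "$utfstring ($char_len characters, $byte_len bytes from the db)\n";
```

The length difference is the practical reason for decoding first: functions such as length, substr and regular expressions then work on characters rather than on individual bytes.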




Answer 5:


Leave this öne out:

binmode STDOUT, ":utf8";

when using:

$dbh->do(qq{SET NAMES 'utf8';});

Otherwise your output will be UTF-8 encoded twice, resulting in unreadable double-byte characters! It took me a couple of hours to figure this out.
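
A sketch of why this happens, assuming the strings come back from the database as undecoded UTF-8 bytes (no mysql_enable_utf8 and no decode). Encoding such bytes a second time, which is what a :utf8 output layer does, treats every byte as its own character.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Raw UTF-8 bytes for U+00F6 (o with diaeresis), as they might arrive
# undecoded from the database.
my $bytes = "\xC3\xB6";

# Encoding undecoded bytes again treats each byte as a Latin-1 character.
my $double = encode('UTF-8', $bytes);

# Decoding first and then encoding once gives back the original two bytes.
my $single = encode('UTF-8', decode('UTF-8', $bytes));

printf "double-encoded: %s\n", join ' ', map { sprintf '%02X', ord } split //, $double;
printf "encoded once:   %s\n", join ' ', map { sprintf '%02X', ord } split //, $single;
```

The double-encoded form is four bytes (C3 83 C2 B6) instead of two (C3 B6). If the data is decoded first, or mysql_enable_utf8 is in use, the output layer is needed again.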




Answer 6:


By default, the Perl MySQL driver handles data as binary (at least that is what I concluded from some experiments with MySQL 5.1 and 5.5).

Without setting mysql_enable_utf8, I encoded strings to UTF-8 before writing them to the database and decoded them from UTF-8 after reading them back.

Do not rely on Perl's internal string representation as an array of bytes. Be aware that the internal 'utf8' is not guaranteed to be standard UTF-8 and, conversely, that the single-byte encoding is not guaranteed to be ISO-8859-1. Always encode/decode explicitly to/from UTF-8 (and not 'utf8').
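
The difference between the two names can be seen with the Encode module alone. This sketch is an illustration with made-up sample data. It round-trips a string through strict UTF-8 and shows the strict decoder refusing a byte sequence that is not standard UTF-8.

```perl
use strict;
use warnings;
use Encode qw(encode decode find_encoding);

# Encode knows two related encodings: the strict "UTF-8" and the lax "utf8".
my $strict_name = find_encoding('UTF-8')->name;
my $lax_name    = find_encoding('utf8')->name;

# Round trip through standard UTF-8: encode before writing, decode after reading.
my $text    = "\x{201C}go live\x{201D}";
my $octets  = encode('UTF-8', $text);     # what would be sent to the database
my $decoded = decode('UTF-8', $octets);   # what we get after reading it back

# The strict decoder rejects sequences that are not standard UTF-8, here
# the encoding of a code point above U+10FFFF.
my $bad = "\xF4\x90\x80\x80";
my $ok  = eval { decode('UTF-8', $bad, Encode::FB_CROAK | Encode::LEAVE_SRC); 1 };
my $strict_rejected = $ok ? 0 : 1;

print "strict encoding: $strict_name, lax encoding: $lax_name\n";
print "malformed input rejected by strict decoder: $strict_rejected\n";
```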

There are also some MySQL settings that affect encodings (such as SET NAMES above). As far as I remember there is a client encoding, a connection encoding, and a server encoding, and how they interact when they do not all have the same value is not quite clear to me. Setting all of them to UTF-8, together with the recipe above, worked for me.
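
To see what those settings currently are, one option is to read MySQL's character_set_* variables through DBI. The sketch below assumes a connected handle, and both helper names are hypothetical. It compares only the three variables that SET NAMES changes: character_set_client, character_set_connection and character_set_results.

```perl
use strict;
use warnings;

# Hypothetical helper: return the character_set_* variables of the current
# session as a hash reference, given a connected DBI handle for MySQL.
sub charset_settings {
    my ($dbh) = @_;
    my $rows = $dbh->selectall_arrayref(q{SHOW VARIABLES LIKE 'character_set%'});

    my %settings;
    $settings{ $_->[0] } = $_->[1] for @$rows;
    return \%settings;
}

# Hypothetical helper: list which of the variables changed by SET NAMES
# differ from the wanted character set.
sub charset_mismatches {
    my ($settings, $wanted) = @_;
    my @checked = qw(character_set_client character_set_connection character_set_results);
    return grep { ($settings->{$_} // '') ne $wanted } @checked;
}
```

An empty list from charset_mismatches(charset_settings($dbh), 'utf8') means the client, connection and results encodings all agree.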



Source: https://stackoverflow.com/questions/983778/how-can-i-handle-unicode-with-perls-dbi
