How to compare and output latin characters?

Submitted by 柔情痞子 on 2019-12-13 20:34:22

Question


I have an array of countries with one having a Latin character "Å":

$country["af"] = "Afghanistan";
$country["ax"] = "Åland Islands";
$country["al"] = "Albania";

While looping through this array and performing a comparison of the first character of the country name, I cannot match the Latin character.

foreach($country as $cc => $name)
{
 if($name[0] == "Å")
 {
  echo "matched";
 }
 else
 {
  echo $name[0];
 }
}

The result I got is: A�A

Why does the Latin character Å become �, and how do I properly compare and output the Latin character Å?

Additional note: The HTTP header and the HTML document have already been specified as UTF-8.

Additional note 2: If I just echo $name instead of $name[0], I do get the Å in Åland Islands. Using substr($name, 0, 1) has the same effect as $name[0], which gives me �.


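Answer 1:


Change your script to the following. Multibyte UTF-8 characters cannot be split with the normal string functions, which work on single bytes. Å is two bytes in UTF-8, so $name[0] returns only the first of them, which is not a valid character on its own. You have to use the multibyte (mb_*) functions.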

foreach($country as $cc => $name)
{
     if(mb_substr($name,0,1,"UTF-8") == "Å")
     {
      echo "matched";
     }
     else
     {
      echo mb_substr($name,0,1,"UTF-8");
     }
}
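
To see why $name[0] fails, it helps to look at the bytes. The following is a minimal sketch, assuming PHP 7 or later with the mbstring extension enabled. It writes Å as the escape "\u{00C5}" so the result does not depend on how the file itself is saved. The helper name firstChar is hypothetical and not part of the answer above.

```php
<?php
// Å (U+00C5) written as a Unicode escape, which PHP 7+ encodes as UTF-8.
$name = "\u{00C5}land Islands";

// Byte-oriented access: Å occupies two bytes (0xC3 0x85) in UTF-8,
// so $name[0] is only the first half of the character.
$firstByte  = bin2hex($name[0]);          // "c3"
$byteLength = strlen($name);              // 14 bytes
$charLength = mb_strlen($name, "UTF-8");  // 13 characters

// Hypothetical helper: return the first character of a UTF-8 string.
function firstChar(string $s): string
{
    return mb_substr($s, 0, 1, "UTF-8");
}

$matched = firstChar($name) === "\u{00C5}"; // true
```

The lone byte 0xC3 is not a complete UTF-8 sequence, which is why the browser renders it as �.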



Answer 2:


The problem is that programs have different ways of representing different characters. This is referred to as character encoding. Your browser, server, and PHP code are currently confused about which encoding you are using because you are mixing UTF-8 characters with ANSI code.

You can learn more about encoding here: http://vlaurie.com/computers2/Articles/characters.htm

There are three things that I do whenever I build a UTF-8 PHP site. These three things should resolve your problem:

Add a PHP UTF-8 Header

Add this to the top of your code:

<?php
header('Content-Type: text/html; charset=utf-8'); 
...

I believe that this instructs other servers and your browser to parse this document using UTF-8, instead of ANSI. You can read more about this here: Set HTTP header to UTF-8 using PHP

Add HTML UTF-8 Meta Tags

Add this code to the top of the HTML that you return:

<!doctype html>
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8" /> 
...

This also instructs your browser to read the characters in UTF-8 (instead of ANSI). You can read more about this here: Set HTTP header to UTF-8 using PHP

Save the PHP File as UTF-8 without BOM

By default, your files usually save in ANSI encoding. If you want to work with international characters, then you need to save them in UTF-8 encoding. This will let you work with the Å character properly.

If you are using Notepad++ as your text editor, then you can set the encoding of your document under the Encoding menu. Set it to Encode in UTF-8 without BOM.

Gotcha

UTF-8 without BOM is not the same thing as UTF-8. UTF-8 files are often prepended with 3 bytes of data that indicate that the file is a UTF-8 file. This is referred to as the Byte Order Mark (BOM). You can read more about the BOM here: http://www.arclab.com/products/amlc/utf-8-php-cannot-modify-header-information.html

Most programs can tell that the file is UTF-8 anyway, so the BOM is redundant. If you save the file with the BOM, PHP sends those three bytes as output before your code runs, so a later call to header() will probably produce an error message like this:

Warning: Cannot modify header information – headers already sent

If you see this error message, then you probably have a BOM problem.
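
If you want to check a file for a BOM without relying on an editor, something like the following may help. It is only a sketch: the function name hasUtf8Bom and the temporary files are made up for illustration, and it tests only for the UTF-8 BOM bytes EF BB BF.

```php
<?php
// Hypothetical helper: true if the file starts with the UTF-8 BOM (EF BB BF).
function hasUtf8Bom(string $path): bool
{
    $handle = fopen($path, "rb");
    if ($handle === false) {
        return false;
    }
    $prefix = fread($handle, 3);
    fclose($handle);
    return $prefix === "\xEF\xBB\xBF";
}

// Demonstration with two temporary files, one with a BOM and one without.
$withBom    = tempnam(sys_get_temp_dir(), "bom");
$withoutBom = tempnam(sys_get_temp_dir(), "bom");
file_put_contents($withBom, "\xEF\xBB\xBF" . "hello");
file_put_contents($withoutBom, "hello");

$bomDetected   = hasUtf8Bom($withBom);     // true
$cleanDetected = hasUtf8Bom($withoutBom);  // false

unlink($withBom);
unlink($withoutBom);
```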




Answer 3:


The question mark appears because your viewer (browser) is trying to display a character that is not supported in the current character set. Why this happens when accessing the first character with $name[0], I'm not sure.

Based on the post here: PHP: Convert specific-Bosnian characters to non-bosnian (utf8 standard chars)

I tried the following:

$result = iconv("UTF-8", "ASCII//TRANSLIT", $test);

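$result now contains Aland Islands; the special characters are converted to their plain ASCII equivalents. Note that how //TRANSLIT behaves depends on the system's iconv implementation, so the result may differ on other platforms.

$result[0] should now contain A.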




Answer 4:


Please set the character encoding for both the file (the stored code) and the output.
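
On the PHP side, one way to keep the encoding consistent is to set the default for the mb_* functions once and to validate strings before working on them. This is a sketch under the assumption that the mbstring extension is available; the sample strings are illustrative only.

```php
<?php
// Make UTF-8 the default for mb_* functions so the encoding argument can be omitted.
mb_internal_encoding("UTF-8");

$name = "\u{00C5}land Islands";

// Confirm the string is valid UTF-8 before doing character-level work.
$isValid = mb_check_encoding($name, "UTF-8");             // true

// A lone 0xC5 byte is how ISO-8859-1 stores Å; it is not valid UTF-8.
$isValidLatin1 = mb_check_encoding("\xC5land", "UTF-8");  // false

// With the internal encoding set, mb_substr works on characters by default.
$first = mb_substr($name, 0, 1);
```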



Source: https://stackoverflow.com/questions/12603421/how-to-compare-and-output-latin-characters
