问题
I have a text file that contains Persian words and the file is saved using ANSI encoding. when I try to read the Persian words from the text file, I get some characters like '?'. In order to solve the problem, I wrote a method that change the file encoding to UTF8 and re-write the text file. the method:
public void Convert2UTF8(string filePath)
{
//first, read the text file with "ANSI" endocing
StreamReader fileStream = new StreamReader(filePath, Encoding.Default);
string fileContent = fileStream.ReadToEnd();
fileStream.Close();
//Now change the file encoding and replace it with the UTF8
StreamWriter utf8Writer = new StreamWriter(filePath.Replace(".txt", ".txt"), false, Encoding.UTF8);
utf8Writer.Write(fileContent);
utf8Writer.Close();
}
Now the first problem is solved; However, the main issue is in here: I want to insert the words into a table in SQL server database. I do this, but every time that I want to search a Persian word from the database table, the result is null while the record does exist in database table. What`s the solution to find my Persian words that exist in table. The code that I currently use is simply like:
SELECT * FROM [dbo].[WordDirectory]
WHERE Word = N'کلمه'
'Word' is the field that Persian words are saved in. The type of the field is NVARCHAR. My SQL server version is 2012. Should I change the collation or something?
回答1:
DECLARE @Table TABLE(Field NVARCHAR(4000) COLLATE Frisian_100_CI_AI)
INSERT INTO @Table (Field) VALUES
(N'همهٔ افراد بش'),
(N'میآیند و حیثیت '),
(N'ميشه آهسته تر صحبت کنيد؟'),
(N'روح'),
(N' رفتار')
SELECT * FROM @Table
WHERE Field LIKE N'%آهسته%'
The both Queries return the same result
RESULT Set: ميشه آهسته تر صحبت کنيد؟
You have to make sure that when you are inserting the values you prefix then witn N
thats to tell sql server there can be unicode character in the passed string. Same is true when you are searching for them strings in Select statement.
回答2:
Probably you have problem with Persian and Arabic versions of the 'ي' and 'ك' during search. These characters even look the same, have different Unicode numbers:
select NCHAR(1740), -- Persian ى
NCHAR(1610), -- Arabic ي
NCHAR(1705), -- Persian ك
NCHAR(1603) -- Arabic ك
more info: http://www.dotnettips.info/post/90
来源:https://stackoverflow.com/questions/21957438/searching-persian-characters-and-words-in-sql-server-with-different-encoding