Searching Persian characters and words in SQL server with different encoding

拈花ヽ惹草 提交于 2019-12-11 02:29:31

问题


I have a text file that contains Persian words and the file is saved using ANSI encoding. when I try to read the Persian words from the text file, I get some characters like '?'. In order to solve the problem, I wrote a method that change the file encoding to UTF8 and re-write the text file. the method:

    public void Convert2UTF8(string filePath)
    {
        //first, read the text file with "ANSI" endocing
        StreamReader fileStream = new StreamReader(filePath, Encoding.Default);
        string fileContent = fileStream.ReadToEnd();
        fileStream.Close();
        //Now change the file encoding and replace it with the UTF8
        StreamWriter utf8Writer = new StreamWriter(filePath.Replace(".txt", ".txt"), false, Encoding.UTF8);
        utf8Writer.Write(fileContent);
        utf8Writer.Close();
    }

Now the first problem is solved; However, the main issue is in here: I want to insert the words into a table in SQL server database. I do this, but every time that I want to search a Persian word from the database table, the result is null while the record does exist in database table. What`s the solution to find my Persian words that exist in table. The code that I currently use is simply like:

SELECT * FROM [dbo].[WordDirectory] 
WHERE Word = N'کلمه'

'Word' is the field that Persian words are saved in. The type of the field is NVARCHAR. My SQL server version is 2012. Should I change the collation or something?


回答1:


DECLARE @Table TABLE(Field NVARCHAR(4000) COLLATE Frisian_100_CI_AI)

INSERT INTO @Table (Field) VALUES
(N'همهٔ افراد بش'),
(N'می‌آیند و حیثیت '),
(N'ميشه آهسته تر صحبت کنيد؟'),
(N'روح'),
(N' رفتار')   

SELECT * FROM @Table
WHERE Field LIKE N'%آهسته%'

The both Queries return the same result

RESULT Set:  ميشه آهسته تر صحبت کنيد؟

You have to make sure that when you are inserting the values you prefix then witn N thats to tell sql server there can be unicode character in the passed string. Same is true when you are searching for them strings in Select statement.




回答2:


Probably you have problem with Persian and Arabic versions of the 'ي' and 'ك' during search. These characters even look the same, have different Unicode numbers:

select NCHAR(1740),  -- Persian ى
       NCHAR(1610),  -- Arabic ي
       NCHAR(1705), -- Persian ك
       NCHAR(1603) -- Arabic ك

more info: http://www.dotnettips.info/post/90



来源:https://stackoverflow.com/questions/21957438/searching-persian-characters-and-words-in-sql-server-with-different-encoding

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!