Converting wide char string to lowercase in C++

别说谁变了你拦得住时间么 submitted on 2019-12-30 09:07:21

Question


How do I convert a wchar_t string from upper case to lower case in C++?

The string contains a mixture of Japanese, Chinese, German and Greek characters.

I thought about using towlower...

http://msdn.microsoft.com/en-us/library/8h19t214%28VS.80%29.aspx

.. but the documentation says that:

The case conversion of towlower is locale-specific. Only the characters relevant to the current locale are changed in case.

Edit: Maybe I should describe what I'm doing. I receive a Unicode search query from a user. It arrives in UTF-8 encoding, and I convert it to a wide-character string (I may be wrong on the wording). My debugger (VS2008) correctly shows the Japanese, German, etc. characters in the "variable quick watch". I need to go through another set of Unicode data and find matches of the search string. This is no problem when the search is case sensitive, but it is more problematic when the search is case insensitive. My (maybe naive) approach would be to convert both the search string and the data being searched to lower case and then compare them.
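
The lowercase-then-compare approach described above can be sketched roughly as follows. This is only an illustration: `fold_for_compare` and `contains_ignoring_case` are hypothetical helper names, not part of any library, and the sketch assumes that `std::towlower` under the active locale covers the characters in question, which is exactly the concern raised in the question.

```cpp
#include <algorithm>
#include <cwctype>
#include <string>

// Hypothetical helper: returns a copy of `text` with every wchar_t passed
// through std::towlower. Characters that have no lower-case mapping in the
// current C locale are left unchanged.
std::wstring fold_for_compare(std::wstring text) {
    std::transform(text.begin(), text.end(), text.begin(), [](wchar_t ch) {
        return static_cast<wchar_t>(std::towlower(static_cast<std::wint_t>(ch)));
    });
    return text;
}

// Hypothetical helper: case-insensitive substring search that lowercases
// both sides first and then does an ordinary (case-sensitive) find.
bool contains_ignoring_case(const std::wstring& haystack,
                            const std::wstring& needle) {
    return fold_for_compare(haystack).find(fold_for_compare(needle)) !=
           std::wstring::npos;
}
```

For example, `contains_ignoring_case(L"Hello World", L"WORLD")` returns `true`. Simple per-character lowercasing like this does not handle mappings that change the length of the string, so it should be read as a starting point rather than full Unicode case folding.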


Answer 1:


If your string contains all those characters, the codeset must be Unicode-based. The Unicode standard (Chapter 4, 'Character Properties') defines properties for each character, including whether it is upper case and what its lower-case mapping is, and a proper implementation should follow those definitions.

Given that preamble, the towlower() function from <wctype.h> is the correct tool to use. If it doesn't do the job, you have a QoI (Quality of Implementation) problem to discuss with your vendor. If you find the vendor unresponsive, then look at alternative libraries. In this case, you might consider ICU (International Components for Unicode).
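
A minimal sketch of calling `towlower()` over a whole string might look like this. The helper name `towlower_copy` and its `locale_name` parameter are assumptions made for illustration. Which non-ASCII characters actually get converted depends on the C library and on the locale selected with `std::setlocale`, so results for Greek or German text will vary between platforms.

```cpp
#include <clocale>
#include <cwctype>
#include <string>

// Hypothetical helper: selects a C locale for character classification and
// lowercases a copy of `input` with std::towlower.
// An empty `locale_name` asks for the user's environment locale.
// Note: std::setlocale changes process-wide state, and if the requested
// locale is unavailable it returns nullptr and the previous locale stays.
std::wstring towlower_copy(const std::wstring& input,
                           const char* locale_name = "") {
    std::setlocale(LC_CTYPE, locale_name);

    std::wstring output;
    output.reserve(input.size());
    for (wchar_t ch : input) {
        output.push_back(static_cast<wchar_t>(
            std::towlower(static_cast<std::wint_t>(ch))));
    }
    return output;
}
```

Calling `towlower_copy(L"GROSS", "C")` yields `L"gross"`. Under the plain "C" locale only the basic Latin letters are guaranteed to be mapped, which is why picking a suitable locale (or moving to a library such as ICU) matters for mixed-language input.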




Answer 2:


You have a nasty problem on your hands. A Japanese locale will not help with converting German text, and vice versa. There are also languages that have no concept of capitalization at all (toupper and friends would be a no-op there, I suppose). So, can you break your string up into chunks of words from the same language? If you can, then you can convert each piece under the appropriate locale and string the pieces back together.




Answer 3:


This SO answer shows how to use facets to work with several locales. If this is on Windows, you can consider using the Win32 API functions. If you can work with C++.NET (managed C++), you can use the Char.ToLower and String.ToLower functions, which are Unicode compliant.
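
As a rough illustration of the facet approach (not the code from the linked answer, which is not reproduced here), the standard `std::ctype<wchar_t>` facet can lowercase a buffer under an explicitly chosen `std::locale`. The helper name `lower_with_facet` is hypothetical, and the default of `std::locale::classic()` is only there so that the sketch runs everywhere.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: lowercases a copy of `text` using the ctype facet of
// the given locale instead of the global C locale.
std::wstring lower_with_facet(std::wstring text,
                              const std::locale& loc = std::locale::classic()) {
    if (text.empty()) {
        return text;  // avoid taking the address of text[0] on an empty string
    }
    const std::ctype<wchar_t>& facet =
        std::use_facet<std::ctype<wchar_t>>(loc);
    // The range overload converts the characters in [first, last) in place.
    facet.tolower(&text[0], &text[0] + text.size());
    return text;
}
```

To try several locales, the same function can be called once per locale object. Locale names such as `"de_DE.UTF-8"` are platform-specific, and constructing a `std::locale` from a name that is not installed throws `std::runtime_error`, so that construction is best wrapped in a `try` block.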




Answer 4:


Have a look at _wcslwr_l in <wchar.h> (MSDN).

You should be able to run the function on the input for each of the locales.



Source: https://stackoverflow.com/questions/1614595/converting-wide-char-string-to-lowercase-in-c
