Convert a Unicode String in C++ to Upper Case

予麋鹿 2020-12-01 14:51

How can we convert a multi-language (Unicode) string to upper or lower case in C or C++?

9 answers
  • 2020-12-01 14:51

    I found two solutions to this problem:

    1. setlocale(LC_CTYPE, "en_US.UTF-8"); // the locale will be the UTF-8 enabled English

    std::wstring str = L"Zoë Saldaña played in La maldición del padre Cardona.ëèñ";
    
    std::wcout << str << std::endl;
    
    for (std::wstring::iterator it = str.begin(); it != str.end(); ++it)
        *it = towupper(*it);
    
    std::wcout << "toUpper_onGCC_LLVM_1 :: "<< str << std::endl;
    

    This works with the LLVM GCC 4.2 compiler.

    2. std::locale::global(std::locale("en_US.UTF-8")); // the locale will be the UTF-8 enabled English

    std::wcout.imbue(std::locale());
    const std::ctype<wchar_t>& f = std::use_facet< std::ctype<wchar_t> >(std::locale());
    
    std::wstring str = L"Chloëè";//"Zoë Saldaña played in La maldición del padre Cardona.";
    
    f.toupper(&str[0], &str[0] + str.size());   
    
    std::wcout << str << std::endl;
    

    This works with Apple LLVM 4.2.

    I ran both cases in Xcode. I am still looking for a way to run this code in Eclipse with the g++ compiler.
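
    One thing that can differ between toolchains is whether the named locale is installed at all: std::locale's constructor throws std::runtime_error for a name it does not know. The sketch below wraps the second approach in a hypothetical helper, upper_in_locale (the name and the fallback policy are assumptions, not part of the original code), that falls back to the classic "C" locale instead of aborting.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: upper-case a wide string with the ctype facet of the
// named locale, falling back to the classic "C" locale if it is unavailable.
std::wstring upper_in_locale(std::wstring text, const char* locale_name) {
    std::locale loc = std::locale::classic();
    try {
        loc = std::locale(locale_name);  // throws std::runtime_error if unknown
    } catch (const std::runtime_error&) {
        // Keep the classic locale; only ASCII letters are sure to be converted.
    }
    const std::ctype<wchar_t>& facet = std::use_facet<std::ctype<wchar_t> >(loc);
    if (!text.empty()) {
        // Range overload: converts [begin, end) in place.
        facet.toupper(&text[0], &text[0] + text.size());
    }
    return text;
}
```

    A call such as upper_in_locale(L"Chloëè", "en_US.UTF-8") would then convert the accented letters only when that locale is actually present on the machine.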

  • 2020-12-01 14:51

    You can iterate through a wstring and apply towupper or towlower to each character:

    for (std::wstring::iterator it = a.begin(); it != a.end(); ++it)
        *it = towupper(*it);
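
    The same loop can be packaged as a pair of small functions. This is only a sketch: to_upper_wide and to_lower_wide are hypothetical names, and the result for non-ASCII letters depends on the C locale in effect (LC_CTYPE). The mapping is one character to one character, so a case like ß to SS cannot be expressed this way.

```cpp
#include <algorithm>
#include <cwctype>
#include <string>

// Hypothetical helper: map every wchar_t through std::towupper, which
// consults the current C locale (LC_CTYPE category).
std::wstring to_upper_wide(std::wstring text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [](wchar_t c) { return static_cast<wchar_t>(std::towupper(c)); });
    return text;
}

// Hypothetical helper: the lower-case counterpart, using std::towlower.
std::wstring to_lower_wide(std::wstring text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [](wchar_t c) { return static_cast<wchar_t>(std::towlower(c)); });
    return text;
}
```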
    
  • 2020-12-01 14:51

    For C, I would use toupper after setting the C locale. Note that setlocale changes the locale for the whole process, not just the current thread.

    setlocale(LC_CTYPE, "en_US.UTF8");
    

    For C++ I would use the toupper method of std::ctype<char>:

    std::locale loc;
    
    auto& f = std::use_facet<std::ctype<char>>(loc);
    
    char str[80] = "Hello World";
    
    f.toupper(str, str+strlen(str));
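
    For std::string the same facet call can be wrapped as shown below. This is a sketch, with to_upper_narrow as a hypothetical name. std::ctype<char> converts one char at a time, so in a UTF-8 string a letter such as ë, which is stored as two bytes, is not converted as a unit; the wide-character or ICU approaches are the better fit for such text.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: upper-case a narrow string byte by byte using the
// ctype<char> facet of the given locale (default: the current global locale).
std::string to_upper_narrow(std::string text,
                            const std::locale& loc = std::locale()) {
    const std::ctype<char>& facet = std::use_facet<std::ctype<char> >(loc);
    if (!text.empty()) {
        // Range overload: converts [begin, end) in place.
        facet.toupper(&text[0], &text[0] + text.size());
    }
    return text;
}
```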
    
  • 2020-12-01 14:53

    In Windows, consider CharUpperBuffW and CharLowerBuffW for mixed-language applications where locale is unknown. These functions handle diacritics where toupper() does not.

  • 2020-12-01 14:54

    With quite a lot of difficulty if you're going to do it right.

    The usual use-case for this is for comparison purposes, but the problem is more general than that.

    Matt Austern wrote a fairly detailed paper on this for C++ Report, circa 2000 (PDF).
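
    To make the comparison use case concrete, here is a deliberately naive sketch. It is not taken from the paper, and equal_ignoring_case is a hypothetical name. It upper-cases both strings one character at a time with the locale's ctype facet, which is exactly where the difficulty shows: two strings whose case mappings differ in length, such as "straße" and "STRASSE", can never compare equal this way.

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical helper: naive case-insensitive equality. Compares the
// upper-cased form of each character pair; assumes a one-to-one case mapping.
bool equal_ignoring_case(const std::wstring& a, const std::wstring& b,
                         const std::locale& loc = std::locale()) {
    if (a.size() != b.size()) {
        return false;  // length-changing mappings (e.g. ß -> SS) are not handled
    }
    const std::ctype<wchar_t>& facet = std::use_facet<std::ctype<wchar_t> >(loc);
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (facet.toupper(a[i]) != facet.toupper(b[i])) {
            return false;
        }
    }
    return true;
}
```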

  • 2020-12-01 15:01

    If you want a sane and mature solution, look at IBM's ICU. Here's an example:

    #include <iostream>
    #include <unicode/unistr.h>
    #include <string>
    
    int main(){
        icu::UnicodeString us("óóßChloë");
        us.toUpper(); //convert to uppercase in-place
        std::string s;
        us.toUTF8String(s);
        std::cout<<"Upper: "<<s<<"\n";
    
        us.toLower(); //convert to lowercase in-place
        s.clear();
        us.toUTF8String(s);
        std::cout<<"Lower: "<<s<<"\n";
        return 0;
    }
    

    Output:

    Upper: ÓÓSSCHLOË
    Lower: óósschloë
    

    Note: in the second step, SS is lowercased to ss rather than mapped back to the German ß, so the round trip does not restore the original string.
