utf | 易学教程

Convert UTF-16 to UTF-8

Question: I am currently using VC++ 2008 with MFC. Because PostgreSQL doesn't support UTF-16 (the encoding Windows uses for Unicode), I need to convert strings from UTF-16 to UTF-8 before storing them. Here is my code snippet.

    // demo.cpp : Defines the entry point for the console application.
    //
    #include "stdafx.h"
    #include "demo.h"
    #include "Utils.h"
    #include <iostream>

    #ifdef _DEBUG
    #define new DEBUG_NEW
    #endif

    // The one and only application object
    CWinApp theApp;

    using namespace std;

    int _tmain(int argc, TCHAR*
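The conversion the question asks about amounts to decoding the UTF-16 code units and re-encoding the result as UTF-8. The following is a minimal sketch of that idea in Python rather than the asker's VC++, assuming the input is little-endian UTF-16 without a byte order mark; the function name `utf16le_to_utf8` is hypothetical.

```python
def utf16le_to_utf8(raw: bytes) -> bytes:
    """Transcode UTF-16LE bytes (no BOM assumed) to UTF-8 bytes."""
    # Decode to an abstract string first, then encode in the target encoding.
    return raw.decode("utf-16-le").encode("utf-8")


# Sample input: five characters, two bytes each in UTF-16LE.
wide = "Gr\u00fc\u00dfe".encode("utf-16-le")
narrow = utf16le_to_utf8(wide)
print(narrow)
```

Going through an intermediate decoded string keeps the two encodings independent, so the same pattern works for any source encoding the runtime knows.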

Convert keyboard emoticons into custom png and vice versa

Question: This is a straightforward question. How can I achieve these two things?

First:
    Input  - hey I'm smiling 😁
    Output - hey I'm smiling <span class="smile"></span>
And vice versa.

Second:
    Input  - hey I'm smiling :smile:
    Output - hey I'm smiling 😁

I already know how to extract the words. I just don't know what form keyboard emoticons take. For the first: I know this can be achieved by checking each word with a switch-case, but what goes inside the case statements? For the second: This one
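An emoji typed on a keyboard is an ordinary Unicode character, so both directions can be handled with a lookup table instead of a switch-case. Here is a minimal Python sketch under that assumption; the table contents and the function names `emoji_to_span` and `shortcode_to_emoji` are hypothetical.

```python
import re

# Hypothetical table: extend it with whatever emoji the application supports.
EMOJI_TO_NAME = {"\U0001F601": "smile", "\U0001F499": "blue_heart"}
NAME_TO_EMOJI = {name: emoji for emoji, name in EMOJI_TO_NAME.items()}


def emoji_to_span(text: str) -> str:
    """Replace each known emoji character with an empty <span> of that class."""
    for emoji, name in EMOJI_TO_NAME.items():
        text = text.replace(emoji, f'<span class="{name}"></span>')
    return text


def shortcode_to_emoji(text: str) -> str:
    """Replace :name: shortcodes with the emoji; unknown names stay as-is."""
    return re.sub(
        r":(\w+):",
        lambda m: NAME_TO_EMOJI.get(m.group(1), m.group(0)),
        text,
    )


print(emoji_to_span("hey I'm smiling \U0001F601"))
print(shortcode_to_emoji("hey I'm smiling :smile:"))
```

Keeping one table and deriving the reverse mapping from it avoids the two directions drifting apart.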

utf8 representation as normal text

Question:

    $text = "\xd0\xa2\xd0\xb0\xd0\xb9\xd0\xbd\xd0\xb0";
    $text = iconv('UTF-8', 'UTF-8//IGNORE', $text);
    var_dump($text); // Тайна - good

    $text = file_get_contents('log.txt');
    $text = iconv('UTF-8', 'UTF-8//IGNORE', trim($text));
    var_dump($text); // \xd0\xa2\xd0\xb0\xd0\xb9\xd0\xbd\xd0\xb0 - bad

Why does iconv not work when the string \xd0\xa2\xd0\xb0\xd0\xb9\xd0\xbd\xd0\xb0 is read from a file, and how can I fix it?

Answer 1: The string literal and the text in the file are not equivalent. $text is already utf-8
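One plausible reading of the difference is that the file holds the four literal characters backslash, x, d, 0 and so on, whereas the double-quoted literal is turned into raw bytes by the language. The Python sketch below (not the asker's PHP) illustrates that reading: it converts literal \xHH sequences into bytes and then decodes them as UTF-8. The function name `unescape_hex` is hypothetical, and the sample string stands in for the contents of log.txt.

```python
import re


def unescape_hex(text: str) -> str:
    """Turn literal \\xHH sequences into bytes, then decode as UTF-8."""
    raw = re.sub(
        rb"\\x([0-9a-fA-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]),  # two hex digits -> one byte
        text.encode("utf-8"),
    )
    return raw.decode("utf-8")


# Raw string: the backslashes are kept as characters, as they would be in a file.
from_file = r"\xd0\xa2\xd0\xb0\xd0\xb9\xd0\xbd\xd0\xb0"
decoded = unescape_hex(from_file)
print(decoded)
```

The replacement works on bytes, not characters, because each escape stands for a single byte of a multi-byte UTF-8 sequence.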

What is QString::toUtf8 doing?

Question: This may sound like an obvious question, but I'm missing something about either how UTF-8 is encoded or how the toUtf8 function works. Let's look at a very simple program:

    QString str("Müller");
    qDebug() << str << str.toUtf8().toHex();

Then I get the output

    "Müller" "4dc383c2bc6c6c6572"

But I had the idea that the letter ü should have been encoded as c3bc and not c383c2bc. Thanks, Johan

Answer 1: It depends on the encoding of your source code. I tend to think that your file is already encoded in UTF
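The byte sequence c383c2bc is what results when the UTF-8 bytes of ü are misread as Latin-1 and then encoded to UTF-8 a second time. The Python sketch below reproduces both hex strings from the question; it illustrates the double-encoding effect only and makes no claim about Qt's internals.

```python
# Escape sequence used so the example does not depend on this file's encoding.
name = "M\u00fcller"

correct = name.encode("utf-8")

# Misread the UTF-8 bytes as Latin-1 (one character per byte), then encode again.
double = correct.decode("latin-1").encode("utf-8")

print(correct.hex())  # 4dc3bc6c6c6572
print(double.hex())   # 4dc383c2bc6c6c6572
```

Each of the two bytes c3 and bc becomes its own character (Ã and ¼), and each of those needs two bytes in UTF-8, which is why one letter ends up as four bytes.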

Printing unicode characters to stdout in python prints wrong glyphs

Question: I want to print a set of Unicode characters to my command prompt terminal. Even when I force the encoding to "UTF-8", the terminal prints garbage.

    $python -c "import sys; print sys.stdout.write(u'\u2044'.encode('UTF-8'))"
    ΓüäNone
    $python -c "import sys; print sys.stdout.encoding"
    cp437

My default terminal encoding is cp437 and I am trying to override that. The expected output here is the fraction slash ( ⁄ ): http://www.fileformat.info/info/unicode/char/2044/index.htm The same piece of
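The three garbage characters can be accounted for: they are the three UTF-8 bytes of U+2044 displayed one by one through the cp437 code page. A short Python 3 sketch of that interpretation (the question itself uses Python 2):

```python
# U+2044 FRACTION SLASH takes three bytes in UTF-8.
encoded = "\u2044".encode("utf-8")

# A cp437 terminal shows each byte as a separate character.
as_seen = encoded.decode("cp437")

print(encoded.hex())
print(as_seen)
```

The trailing None in the question's output comes from printing the return value of sys.stdout.write, which is a separate matter from the encoding mismatch.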

How can I put a 💙, or any other emoji inside an XML string?

Question: How can I do this? I'm pretty new to Java and Android and I have the problem described above. When I paste the emoji into the XML file, it shows a white square and another weird character which "copies" the next character. Any idea how to work this out?

Answer 1: Try using this library: emoji-java. I know you want an XML solution and this is Java, but it may help you. Example:

    String str = "An 😀awesome 😃string with a few 😉emojis!";
    String result = EmojiParser.parseToAliases(str);
    System.out.println
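Independently of the library suggested in the answer, XML itself allows any character to be written as a numeric character reference, which keeps the file pure ASCII. The Python sketch below computes the reference for 💙 and checks that a standard XML parser reads it back. Whether a particular Android toolchain handles the reference as hoped is an assumption that the source does not confirm; the helper `to_char_ref` and the element name `string` are hypothetical.

```python
import xml.etree.ElementTree as ET


def to_char_ref(ch: str) -> str:
    """Return the hexadecimal XML character reference for one character."""
    return f"&#x{ord(ch):X};"


ref = to_char_ref("\U0001F499")  # blue heart
element = ET.fromstring(f"<string>{ref}</string>")

print(ref)
print(element.text == "\U0001F499")
```

The reference uses the full code point, not the two UTF-16 surrogate halves, since XML character references are defined in terms of code points.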

difference between NLS_NCHAR_CHARACTERSET and NLS_CHARACTERSET for Oracle

Question: I have a quick question: what is the difference between the NLS_NCHAR_CHARACTERSET and NLS_CHARACTERSET settings in Oracle? From my understanding, NLS_NCHAR_CHARACTERSET is for NVARCHAR data types and NLS_CHARACTERSET is for VARCHAR2 data types. I tried to test this on my development server, whose current character set settings are as follows:

    PARAMETER                      VALUE
    ------------------------------ ----------------------------------------
    NLS_NCHAR_CHARACTERSET

OSX Emacs: unbind just the right alt?

Question: I'm using the Emacs build from emacsformacosx.com and would like to stop Meta_R (right meta, i.e. the right option key) on my Apple keyboard from acting as an Emacs meta key. The reason is that I want to keep using the right option key as a character modifier so that I can enter UTF-8 characters when writing in Emacs. I know I can do C-x 8 RET and type em dash, for example, but that's a lot more work than Alt_R - ! Is there some way of passing the keycode to global-unset-key? Or something else I'm

Unicode characters from JSON.stringify to real unicode characters

Question: I use JSON.stringify() to serialize JS objects for sending to PHP via AJAX. The problem arises when JSON.stringify encodes Unicode characters in the \uxxxx format (e.g. \u000a). How do I convert those escapes to regular Unicode characters in PHP?

Answer 1: See "Output UTF-16? A little stuck". This converts to UTF-8:

    function unescape_utf16($string) {
        /* go for possible surrogate pairs first */
        $string = preg_replace_callback(
            '/\\\\u(D[89ab][0-9a-f]{2})\\\\u(D[c-f][0-9a-f]{2
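The \uxxxx escapes are part of the JSON string syntax, so a JSON parser already knows how to undo them, including surrogate pairs for characters outside the Basic Multilingual Plane. The following Python sketch (not the PHP of the answer above) shows that approach; the helper name `unescape_json_text` is hypothetical, and it assumes the input contains no unescaped double quotes or raw control characters.

```python
import json


def unescape_json_text(escaped: str) -> str:
    """Decode JSON-style escapes by parsing the text as a JSON string literal."""
    # Assumption: `escaped` has no bare double quotes or raw control characters.
    return json.loads(f'"{escaped}"')


# Raw strings keep the backslashes, as they would arrive over the wire.
smile = unescape_json_text(r"\ud83d\ude01")  # surrogate pair for U+1F601
newline = unescape_json_text(r"\u000a")

print(smile)
print(repr(newline))
```

When the whole payload is valid JSON, parsing the payload directly is simpler than unescaping individual strings, because the parser performs this step for every string it reads.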