utf-8

Convert CESU-8 to UTF-8 with high performance

霸气de小男生 submitted on 2021-02-08 02:36:11
Question: I have some raw text that is usually a valid UTF-8 string. However, every now and then the input turns out to be a CESU-8 string instead. It is technically possible to detect this and convert to UTF-8, but as this happens rarely, I would rather not spend lots of CPU time on it. Is there any fast method to detect whether a string is encoded as CESU-8 or UTF-8? I guess I could always blindly convert "UTF-8" to UTF-16LE and then to UTF-8 using iconv() and I would probably get the
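
One possible approach, sketched in Python rather than with iconv(): valid UTF-8 never contains encoded surrogates, so a byte scan for the six-byte pattern of a CESU-8 surrogate pair (ED A0..AF xx ED B0..BF xx) is enough to flag the rare CESU-8 input, and only flagged inputs pay for a conversion. The function names below are made up for this illustration, and the conversion step relies on Python's "surrogatepass" error handler.

```python
import re

# A CESU-8 supplementary character is a UTF-16 surrogate pair with each half
# encoded as a 3-byte sequence: ED A0..AF 80..BF followed by ED B0..BF 80..BF.
# Well-formed UTF-8 never contains these bytes, so a match means CESU-8.
_CESU8_PAIR = re.compile(rb"\xed[\xa0-\xaf][\x80-\xbf]\xed[\xb0-\xbf][\x80-\xbf]")


def looks_like_cesu8(data: bytes) -> bool:
    """Return True if the input contains at least one CESU-8 surrogate pair."""
    return _CESU8_PAIR.search(data) is not None


def cesu8_to_utf8(data: bytes) -> bytes:
    """Convert CESU-8 to UTF-8; input without surrogate pairs is returned as-is."""
    if not looks_like_cesu8(data):
        return data  # fast path: nothing to convert
    # Decode the surrogate halves as lone surrogates, then round-trip through
    # UTF-16 so that each pair is joined into a single code point.
    text = data.decode("utf-8", "surrogatepass")
    text = text.encode("utf-16", "surrogatepass").decode("utf-16")
    return text.encode("utf-8")
```

The fast path is a single regular-expression scan over the bytes, so ordinary UTF-8 input is never decoded or copied. Input that is neither valid UTF-8 nor valid CESU-8 still raises UnicodeDecodeError or UnicodeEncodeError in the slow path.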

Convert all *.cs files to unicode in VisualStudio

故事扮演 submitted on 2021-02-07 20:59:22
Question: My team does not pay attention to file encoding (and that is correct, because humans should not be bothered by file encodings). However, some files are saved in UTF-8 and some in a regional encoding (cp1250). I need a tool that can do two things: 1. Force UTF-8 on all files that will be created in the future. 2. Convert all existing files with a given extension (or at least *.cs) to UTF-8. How can I achieve these goals using Visual Studio or ReSharper plugins, and/or external tools? I tried to do #2 with
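
For goal #2, an external script is one option. The sketch below is a minimal Python example, not a Visual Studio or ReSharper feature: it assumes that any file which does not decode as UTF-8 is cp1250, and the function name and parameters are invented for illustration.

```python
from pathlib import Path


def convert_to_utf8(root, pattern="*.cs", fallback="cp1250"):
    """Rewrite every matching file under root that is not valid UTF-8.

    Assumption: a file that fails to decode as UTF-8 is in the fallback
    encoding. Returns the list of paths that were converted.
    """
    converted = []
    for path in Path(root).rglob(pattern):
        raw = path.read_bytes()
        try:
            raw.decode("utf-8")
            continue  # already valid UTF-8 (or pure ASCII): leave untouched
        except UnicodeDecodeError:
            pass
        text = raw.decode(fallback)
        # Working on bytes keeps the original line endings intact.
        path.write_bytes(text.encode("utf-8"))
        converted.append(path)
    return converted
```

A call such as convert_to_utf8("path/to/solution") converts the files in place, so it is safest to run on a clean working copy under version control. A few byte values are undefined in cp1250, so a file in some third encoding may raise UnicodeDecodeError instead of being converted silently.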

LINUX to Windows bad encoding response

本秂侑毒 submitted on 2021-02-07 19:54:53
Question: We have a PHP client (Linux) accessing a C# server (Windows / MVC 2) using Zend_Http_Client. We are trying to set an Authorization header corresponding to the body of the request. Unfortunately, the request body received in C# contains some different characters, even though the encoding is UTF-8 in both PHP and C#. However, the C# server can set Request.Files correctly and the picture works fine. Also, a simple text file with UTF-8 encoded characters goes through without any problem. Here is

Russian symbols in Python output corrupted (ENCODING)

我的梦境 submitted on 2021-02-07 19:40:19
Question: I parsed an HTML document that has Russian text in it. When I try to print it in Python, I get this: ÐлÑбниÑнÑй новогодний пÑÐ½Ñ I tried to decode it and got ISO-8859-1 as the encoding. I'm trying to decode it like this: print drink_name.decode('iso8859-1') But I get an error. How can I print this text, or encode it in Unicode? Answer 1: You have a Mojibake; UTF-8 bytes decoded as Latin-1 or CP1251 in this case. You can repair it by reversing the process: >>> print u'ÐлÑбнÐ
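
The "reverse the process" idea from the answer can be sketched as follows. This is a Python 3 illustration (the excerpt uses Python 2 syntax), it assumes the text was mis-decoded as Latin-1, and the helper name is hypothetical.

```python
def fix_mojibake(garbled: str, wrong_encoding: str = "latin-1") -> str:
    """Undo a wrong decode: recover the original bytes, then decode as UTF-8.

    Assumption: the text was produced by decoding UTF-8 bytes with
    wrong_encoding, and no bytes were lost along the way.
    """
    original_bytes = garbled.encode(wrong_encoding)
    return original_bytes.decode("utf-8")


# Simulate the problem, then repair it.
correct = "Привет"
garbled = correct.encode("utf-8").decode("latin-1")
repaired = fix_mojibake(garbled)
```

If some characters were dropped or replaced after the wrong decode, the final decode step raises UnicodeDecodeError, which signals that the damage cannot be reversed this simply.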

How to encode cyrillic symbols in HTTP-requests in Java?

我怕爱的太早我们不能终老 submitted on 2021-02-07 15:49:10
Question: Hello! My Android app executes an HTTP request to one of Google's API services. It works when the request parameter is in English, but when I test my function with Cyrillic, I get a 400 error. It seems the problem is encoding the Win-1251 string to UTF-8. How can this be done in Java? Answer 1: Try: URLEncoder.encode(yourString, HTTP.UTF_8); Answer 2: You should use URLEncoder#encode() to encode request parameters. String query = "name1=" + URLEncoder.encode(value1, "UTF-8")
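
To show what the percent-encoded value should look like on the wire, here is a short Python illustration. It is not the Java code from the answers, only a cross-check of the expected output; the parameter name name1 is taken from Answer 2 and the sample value is made up.

```python
from urllib.parse import quote_plus, urlencode

value = "привет"  # sample Cyrillic parameter value (made up for this sketch)

# Each character is first encoded to UTF-8 bytes, then every byte is written as %XX.
encoded_value = quote_plus(value, encoding="utf-8")

# Build a whole query string the same way.
query = urlencode({"name1": value}, encoding="utf-8")

print(encoded_value)  # %D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82
print(query)          # name1=%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82
```

Each Cyrillic letter becomes two UTF-8 bytes, and so two %XX escapes.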

Python - Reading and writing csv files with utf-8 encoding

我的未来我决定 submitted on 2021-02-07 13:37:16
Question: I'm trying to read a CSV file whose header contains foreign characters, and I'm having a lot of problems with this. First of all, I'm reading the file with a simple csv.reader:
    filename = 'C:\\Users\\yuval\\Desktop\\בית ספר\\עבודג\\new\\resources\\mk'+ str(mkNum) + 'Data.csv'
    raw_data = open(filename, 'rt', encoding="utf8")
    reader = csv.reader(raw_data, delimiter=',', quoting=csv.QUOTE_NONE)
    x = list(reader)
    header = x[0]
    data = np.array(x[1:]).astype('float')
The var header should be an
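
The excerpt cuts off before the actual symptom is described, so the following is only a guess at a common cause: a UTF-8 byte order mark (BOM) at the start of the file that ends up glued to the first header cell. A minimal sketch, with hypothetical helper names, that reads with the "utf-8-sig" codec (which strips a BOM if present and otherwise behaves like UTF-8) and writes plain UTF-8:

```python
import csv


def read_csv_utf8(path):
    """Return (header, rows) from a UTF-8 CSV file, tolerating a leading BOM."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        rows = list(csv.reader(f))
    return rows[0], rows[1:]


def write_csv_utf8(path, header, rows):
    """Write header and rows as a UTF-8 CSV file without a BOM."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
```

The with blocks also close the file handle, which the original snippet leaves open.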

UnicodeString to char* (UTF-8)

早过忘川 submitted on 2021-02-07 12:11:18
Question: I am using the ICU library in C++ on OS X. All of my strings are UnicodeStrings, but I need to use system calls like fopen, fread and so forth. These functions take const char* or char* as arguments. I have read that OS X supports UTF-8 internally, so all I need to do is convert my UnicodeString to UTF-8, but I don't know how to do that. UnicodeString has a toUTF8() member function, but it returns a ByteSink. I've also found these examples: http://source.icu-project.org/repos/icu/icu

WCF Change message encoding from Utf-16 to Utf-8

江枫思渺然 submitted on 2021-02-07 10:10:58
Question: I have a WCF connected service in a .NET Core application. I'm using the code that is autogenerated from the WSDL definition. Currently, the top of the request XML includes this line: <?xml version="1.0" encoding="utf-16"?> I can't find a simple way to change this encoding to UTF-8 when sending the request. Since I couldn't find a configuration option on the request/client objects, I've tried to change the message with the following code in IClientMessageInspector.BeforeSendRequest public
