unicode-escapes

Convert Unicode Escape to Hebrew text

匆匆过客 submitted on 2019-12-02 08:57:55
Question: I have the following text in a JSON file: "\u00d7\u0090\u00d7\u0097\u00d7\u0095\u00d7\u0096\u00d7\u00aa \u00d7\u00a4\u00d7\u0095\u00d7\u009c\u00d7\u0092" which represents the text "אחוזת פולג" in Hebrew. No matter which encoding/decoding I use, I can't seem to get it right with Python 3. For example, if I try: text = "\u00d7\u0090\u00d7\u0097\u00d7\u0095\u00d7\u0096\u00d7\u00aa \u00d7\u00a4\u00d7\u0095\u00d7\u009c\u00d7\u0092".encode('unicode-escape') print(text) I get that text is: b'\\xd7\

Print Unicode escape codes from variable

感情迁移 submitted on 2019-12-02 07:17:24
I have a list of Unicode character codes that I would like to output with rumoji. Here's the code I'm using to iterate over my data. require "rumoji" # this works puts Rumoji.decode("\u{1F600}") # feed some data data = [ "1F600", "1F476", "1F474" ] data.each do |line| # this doesn't work puts Rumoji.decode("\u{#{line}}") puts Rumoji.decode("\u{" + line + "}") end I'm not sure how I can use variable names inside the escaped string. One cannot use \u along with string interpolation, since \u takes precedence. What one might do is to Array#pack an array of integers: ▶ data.map { |e| e.to_i(16)

Convert Unicode Escape to Hebrew text

一个人想着一个人 submitted on 2019-12-02 05:09:50
I have the following text in a JSON file: "\u00d7\u0090\u00d7\u0097\u00d7\u0095\u00d7\u0096\u00d7\u00aa \u00d7\u00a4\u00d7\u0095\u00d7\u009c\u00d7\u0092" which represents the text "אחוזת פולג" in Hebrew. No matter which encoding/decoding I use, I can't seem to get it right with Python 3. For example, if I try: text = "\u00d7\u0090\u00d7\u0097\u00d7\u0095\u00d7\u0096\u00d7\u00aa \u00d7\u00a4\u00d7\u0095\u00d7\u009c\u00d7\u0092".encode('unicode-escape') print(text) I get that text is: b'\\xd7\\x90\\xd7\\x97\\xd7\\x95\\xd7\\x96\\xd7\\xaa \\xd7\\xa4\\xd7\\x95\\xd7\\x9c\\xd7\\x92' which in bytecode
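One possible reading of the excerpt above: each \u00XX escape holds a single byte of the UTF-8 encoding of the Hebrew text, so the string is UTF-8 that was mis-decoded one byte per character. Under that assumption, a minimal Python 3 sketch is to map the code points back to bytes with Latin-1 and decode those bytes as UTF-8. The variable names are illustrative only.

```python
# Assumption: every code point in the string is really one raw UTF-8 byte.
mojibake = (
    "\u00d7\u0090\u00d7\u0097\u00d7\u0095\u00d7\u0096\u00d7\u00aa "
    "\u00d7\u00a4\u00d7\u0095\u00d7\u009c\u00d7\u0092"
)

# Latin-1 maps code points 0-255 to the byte with the same value,
# which recovers the original byte sequence.
raw_bytes = mojibake.encode("latin-1")

# Those bytes are valid UTF-8 for the intended Hebrew text.
hebrew = raw_bytes.decode("utf-8")
print(hebrew)
```

If the input ever contains a code point above U+00FF, the `encode("latin-1")` step raises `UnicodeEncodeError`, which is a useful signal that the assumption does not hold for that string.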

Six digit unicode escaped value comparison

余生长醉 submitted on 2019-12-01 08:06:00
I have a six-digit Unicode character, for example U+100000, which I wish to compare with another char in my C# code. My reading of the MSDN documentation is that this character cannot be represented by a char, and must instead be represented by a string: "a Unicode character in the range U+10000 to U+10FFFF is not permitted in a character literal and is represented using a Unicode surrogate pair in a string literal". I feel that I'm missing something obvious, but how can you get the following comparison to work correctly: public bool IsCharLessThan(char myChar, string upperBound) {
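The surrogate pair mentioned in the quoted documentation follows from the standard UTF-16 arithmetic. The sketch below illustrates that arithmetic in Python rather than C#, and it is not an answer to the comparison question itself; `to_surrogate_pair` is a hypothetical helper name.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary-plane code point into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in the range U+10000..U+10FFFF")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x100000)
print(f"U+100000 -> {high:04X} {low:04X}")

# Cross-check against Python's own UTF-16 encoder.
encoded = chr(0x100000).encode("utf-16-be")
print(encoded == high.to_bytes(2, "big") + low.to_bytes(2, "big"))
```

This is why a single 16-bit `char` cannot hold such a character: it takes two 16-bit code units.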

Six digit unicode escaped value comparison

别来无恙 submitted on 2019-12-01 07:19:41
Question: I have a six-digit Unicode character, for example U+100000, which I wish to compare with another char in my C# code. My reading of the MSDN documentation is that this character cannot be represented by a char, and must instead be represented by a string: "a Unicode character in the range U+10000 to U+10FFFF is not permitted in a character literal and is represented using a Unicode surrogate pair in a string literal". I feel that I'm missing something obvious, but how can you get the

Is it possible to decode bytes to UTF-8, converting errors to escape sequences in Rust?

泄露秘密 submitted on 2019-12-01 02:54:10
Question: In Rust it's possible to get UTF-8 from bytes by doing this: if let Ok(s) = str::from_utf8(some_u8_slice) { println!("example {}", s); } This either works or it doesn't, but Python has the ability to handle errors, e.g.: s = some_bytes.decode(encoding='utf-8', errors='surrogateescape'); In this example the argument surrogateescape converts invalid UTF-8 sequences to escape codes, so instead of ignoring or replacing text that can't be decoded, they are replaced with a byte literal expression,
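For reference, the Python behaviour the question describes can be sketched as follows. The sample bytes are made up for illustration. Note that `surrogateescape` maps each undecodable byte to a lone surrogate code point (U+DC80 to U+DCFF), while the separate `backslashreplace` handler is the one that produces visible `\xNN` text.

```python
# Hypothetical input: 0xE9 followed by a space is not valid UTF-8.
some_bytes = b"caf\xe9 ok"

# surrogateescape: the bad byte 0xE9 becomes the lone surrogate U+DCE9.
escaped = some_bytes.decode("utf-8", errors="surrogateescape")
print(ascii(escaped))

# The mapping is reversible, so the original bytes can be recovered.
roundtrip = escaped.encode("utf-8", errors="surrogateescape")
print(roundtrip == some_bytes)

# backslashreplace: the bad byte becomes the literal text \xe9.
visible = some_bytes.decode("utf-8", errors="backslashreplace")
print(visible)
```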

Difference in URL decode/encode UTF-8 between Java and JS/AS3 (bug!?)

眉间皱痕 submitted on 2019-11-30 22:29:58
I am having an issue URL-decoding a UTF-8 string in Java that is encoded with either JavaScript or ActionScript 3. I've set up a test case as follows. The string in question is Produktgröße. When I encode with JS/AS3 I get the following string: escape('Produktgröße') Produktgr%F6%DFe When I unescape this with JS I get no change: unescape('Produktgr%F6%DFe') Produktgr%F6%DFe So, by this I assume that JS isn't encoding the string properly? The following JSP produces this output: <%@page import="java.net.URLEncoder"%> <%@page import="java.net.URLDecoder"%> <%=(URLDecoder.decode("Produktgr%F6%DFe",
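The percent-escapes in the excerpt above (%F6, %DF) are the Latin-1 byte values of ö and ß, not their UTF-8 bytes, which would explain a mismatch with a decoder that expects UTF-8. The sketch below shows the difference using Python's `urllib.parse` as a stand-in; it is not the Java or JavaScript code from the question.

```python
from urllib.parse import quote, unquote

original = "Produktgröße"

# Percent-encoding the UTF-8 bytes (the default for quote()).
utf8_encoded = quote(original)
print(utf8_encoded)

# Percent-encoding the Latin-1 bytes reproduces the string in the question.
latin1_encoded = quote(original, encoding="latin-1")
print(latin1_encoded)

# Decoding the Latin-1 form as UTF-8 yields replacement characters,
# while decoding it as Latin-1 restores the original text.
as_utf8 = unquote(latin1_encoded)
as_latin1 = unquote(latin1_encoded, encoding="latin-1")
print(as_utf8, as_latin1)
```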

What does “\1” represent in this Java string?

谁说我不能喝 submitted on 2019-11-30 16:24:49
Question: System.out.println("\1"); I thought it would not compile because of the unrecognized escape sequence. What exactly does "\1" represent? Answer 1: It's an octal escape sequence, as listed in section 3.10.6 of the JLS. So for example: String x = "\16"; is equivalent to: String x = "\u000E"; (As octal 16 = hex E.) So \1 is U+0001, the "start of heading" character. Octal escape sequences are very rarely used in Java in my experience, and I'd personally avoid them where possible. When I want to specify
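The octal-to-hex arithmetic in the answer can be checked quickly. The sketch below uses Python, whose string literals also accept octal escapes; it illustrates the arithmetic only and says nothing further about Java's rules beyond what the answer states.

```python
# Octal 16 is 1*8 + 6 = 14, which is hex E.
octal_value = int("16", 8)
print(octal_value, hex(octal_value))

# Python string literals accept the same style of octal escape.
same_char = "\16" == "\u000e"
print(same_char)

# "\1" is the single character U+0001 (start of heading).
start_of_heading = "\1"
print(len(start_of_heading), f"U+{ord(start_of_heading):04X}")
```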

Python 2.7: How to convert unicode escapes in a string into actual utf-8 characters

笑着哭i submitted on 2019-11-30 07:39:23
Question: I use Python 2.7 and I'm receiving a string from a server (not in unicode!). Inside that string I find text with Unicode escape sequences, for example: <a href = "http://www.mypage.com/\u0441andmoretext">\u00b2<\a> How do I convert those \uxxxx sequences back to UTF-8? The answers I found either dealt with &# or required eval(), which is too slow for my purposes. I need a universal solution for any text containing such sequences. Edit: <\a> is a typo, but I want tolerance against such
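One way to approach this without eval() is a regular expression that rewrites only well-formed \uXXXX sequences and leaves every other backslash (such as the stray <\a>) untouched. The sketch below is written for Python 3, not the Python 2.7 of the question; on 2.7 the same idea would need unichr() and an explicit encode to UTF-8. `decode_unicode_escapes` is a hypothetical helper name, and surrogate pairs written as two escapes are not combined here.

```python
import re

# Matches a backslash, "u", and exactly four hex digits.
_UNICODE_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_unicode_escapes(text: str) -> str:
    """Replace each \\uXXXX sequence with the character it names."""
    return _UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# Raw string: the backslashes below are literal characters in the input.
sample = r'<a href = "http://www.mypage.com/\u0441andmoretext">\u00b2<\a>'
decoded = decode_unicode_escapes(sample)
print(decoded)

# Encode to UTF-8 bytes if bytes are what the caller needs.
utf8_bytes = decoded.encode("utf-8")
```

Because only the exact \uXXXX pattern is matched, malformed input such as <\a> passes through unchanged instead of raising an error.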

Escape String in Grails to avoid JSON error

假如想象 submitted on 2019-11-29 18:00:16
I have a few strings like "12.10 On-Going Submission of ""Made Up"" Samples." 10. PRODUCT STANDARDS; APPROVAL. which I render as JSON in Grails. The quotes and any other possible special characters are giving me trouble, i.e., they make the JSON invalid when returning a response from the REST service. How do I solve this? I have tried a few things but nothing seems to work: //text: java.net.URLEncoder.encode(artifact.text, "UTF-8"), //Loses the original format //text : artifact.text.encodeAsJavaScript(), // give problem with ; //text: artifact.text.encodeAsHTML(), // gives &qoute(not wanted) in the