decoding | 易学教程

Unbaking mojibake

When you have incorrectly decoded characters, how can you identify likely candidates for the original string?

Ä×èÈÄÄî▒è¤ô_üiâAâjâüâpâXüj_10òb.png

I know for a fact that this image filename should have been some Japanese characters. But despite various guesses at urllib quoting/unquoting and at encoding and decoding with iso8859-1 and utf8, I haven't been able to unmunge it and get the original filename. Is the corruption reversible?

galinden: You could use chardet (install with pip):

    import chardet
    your_str = "Ä×èÈÄÄî▒è¤ô_üiâAâjâüâpâXüj_10òb"
    detected_encoding = chardet.detect(your_str)["encoding"]
    try:
        correct_str =
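A brute-force alternative to detection is to undo the suspected wrong decode and retry with another codec. The following is only a sketch: the function name `candidate_repairs` and both codec lists are assumptions chosen for illustration, and nothing here confirms which pair, if any, produced the filename above.

```python
def candidate_repairs(garbled,
                      wrong_codecs=("cp437", "cp850", "cp1252", "latin-1"),
                      right_codecs=("shift_jis", "euc_jp", "utf-8")):
    """Return (wrong, right, text) for every codec pair that round-trips.

    The idea: if bytes in `right` were mistakenly decoded as `wrong`,
    then encoding the garbled text with `wrong` recovers the bytes,
    and decoding them with `right` recovers the original text.
    """
    results = []
    for wrong in wrong_codecs:
        try:
            raw = garbled.encode(wrong)   # undo the suspected bad decode
        except UnicodeError:
            continue                      # garbled text not expressible in this codec
        for right in right_codecs:
            try:
                results.append((wrong, right, raw.decode(right)))
            except UnicodeError:
                pass                      # bytes are not valid in this codec
    return results


# Build a known mojibake sample so the repair can be checked.
original = "日本語"
sample = original.encode("shift_jis").decode("cp437")
for wrong, right, text in candidate_repairs(sample):
    print(wrong, "->", right, ":", text)
```

Several pairs may decode without error, so the candidates still need a human (or a language heuristic) to pick the plausible one. If characters were dropped or replaced during the bad decode, the corruption is not fully reversible.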

How can I do LZW decoding in Java?

I have a database which contains picture data stored as a binary blob. The documentation says the data is encoded using LZW. I thought that I could decode it using the Zip or GZip input streams found in the Java library, but it didn't work: I got an exception that said the format of the data is not correct. From what I've read, the library uses DEFLATE, which is not LZW. Also, I've read about some licensing problems for using the LZW algorithm.

What can I use to decode the data? Is there a library? Do I have to implement it myself? What about the licensing problems?

Stephen C: Here are a
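For reference, the core of LZW decoding is small. The sketch below is textbook LZW in Python (not Java), working on a list of integer codes; it assumes a 256-entry initial dictionary with no clear or end codes. Real blob formats usually pack codes at variable bit widths, and the format of the database blob in the question is unknown, so this shows the technique rather than a drop-in decoder. The names `lzw_encode` and `lzw_decode` are illustrative.

```python
def lzw_encode(data):
    """Encode bytes into a list of integer LZW codes (used here to test the decoder)."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for value in data:
        extended = current + bytes([value])
        if extended in table:
            current = extended
        else:
            codes.append(table[current])
            table[extended] = len(table)   # next free code
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decode(codes):
    """Decode a list of integer LZW codes back into bytes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Special case: the code refers to the entry being built right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code: %d" % code)
        output.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(output)


message = b"TOBEORNOTTOBEORTOBEORNOT"
assert lzw_decode(lzw_encode(message)) == message
```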

Python - Replace non-ascii character in string (»)

Question

I need to replace the character "»" in a string with a whitespace, but I still get an error. This is the code I use:

    # -*- coding: utf-8 -*-
    from bs4 import BeautifulSoup
    # other code
    soup = BeautifulSoup(data, 'lxml')
    mystring = soup.find('a').text.replace(' »','')

    UnicodeEncodeError: 'ascii' codec can't encode character u'\xbb' in position 13: ordinal not in range(128)

But if I test it with this other script:

    # -*- coding: utf-8 -*-
    a = "hi »"
    b = a.replace('»','')

it works. Why is this?

Answer 1:
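The traceback names the ascii codec, which hints that the failure may come from a conversion to ASCII somewhere around the replace rather than from the replacement itself. That is an assumption, since the answer text is cut off here. A minimal Python 3 sketch of both behaviours, with `link_text` as a made-up stand-in for the text of the `a` element:

```python
# -*- coding: utf-8 -*-

link_text = "Next page »"   # hypothetical stand-in for soup.find('a').text

# Replacing on a text (Unicode) string works without any codec involved.
cleaned = link_text.replace(" »", "")
print(cleaned)

# Forcing the same text through the ascii codec reproduces the kind of
# error shown in the question.
try:
    link_text.encode("ascii")
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)
print(error_message)

# Encoding with a codec that can represent the character succeeds.
encoded = link_text.encode("utf-8")
```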

Request returns bytes and I'm failing to decode them

Essentially I made a request to a website and got a bytes response back: b'[{"geonameId":"703448"}..........'. I'm confused because although it is of type bytes, it is very human readable and looks like a list of JSON objects. I know the response is encoded in latin1, because r.encoding returned ISO-8859-1, and I have tried to decode it, but it just returns an empty string. Here's what I have so far:

    r = response.content
    string = r.decode("ISO-8859-1")
    print (string)

This is where it prints a blank line. However, when I run len(string) I get back 31023. How can I decode these bytes
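A decoded string of length 31023 is not empty, so the decode step itself appears to work. One way to check is to inspect the result with repr() and hand it to a JSON parser. The payload below is a short hypothetical stand-in for the real response body, which is elided in the question:

```python
import json

# Hypothetical stand-in for response.content; the real body is not shown.
raw = b'[{"geonameId":"703448"}]'

text = raw.decode("ISO-8859-1")

# repr() makes leading newlines, carriage returns and other invisible
# characters visible, which a plain print() can hide.
print(repr(text[:80]))

# Once the bytes are text, they can be parsed as JSON.
records = json.loads(text)
print(records[0]["geonameId"])
```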

How do I allow HTML tags to be submitted in a textbox in ASP.NET?

First, I want to let everyone know that I am using the aspx engine, not the Razor engine. I have a table within a form. One of my textboxes contains HTML tags, like Phone: 814-888-9999 Email: aaa@gmail.com. When I go to build it, it gives me an error that says: A potentially dangerous Request.Form value was detected from the client (QuestionAnswer="...ics Phone: 814-888-9999<br..."). I tried validateRequest="false" but it did not work. I'm sorry I didn't add my HTML code for you to look at so far. I am pulling some questions up so that I can edit them if need be. <%@

Unicode Encoding and decoding issues in QRCode

I am trying to generate a UTF-8 QR code so that I can encode accents and Unicode characters. To test it, I am using several decoding solutions:

http://zxing.org/w/decode.jspx - the zxing project, also used in Android
http://www.drhu.org/QRCode/QRDecoder.php - a PHP decoder
http://zbar.sf.net - the ZBar bar code reader, an open-source C project for embedded use

All of them always give me the same result. You can try this image, which works well with Unicode characters. But if I try to use zxing or the Google Chart API to generate the QR code, I cannot decode it correctly. I have tried this: http://chart.apis

base64 encoding that doesn't use “+/=” (plus or equals) characters?

I need to encode a string of about 1000 characters that can be any byte value (00-FF). I don't want to use hex because it's not dense enough. The problem with base64, as I understand it, is that it includes +, / and =, which are characters I cannot tolerate in my application. Any suggestions?

As Ciaran says, base64 isn't terribly hard to implement, but you may want to have a look for existing libraries which allow you to specify a custom set of characters to use. I'm pretty sure there are plenty out there, but you haven't specified which platform you need this for. Basically, you just need 65
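The question does not name a platform, so as one illustration in Python: the standard library's URL-safe base64 variant swaps + and / for - and _, and the = padding can be stripped and restored from the length. This sketch assumes - and _ are acceptable to the application; the helper names are made up.

```python
import base64


def encode_no_plus_slash_equals(data):
    """Base64-encode bytes using '-' and '_' and without '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_no_plus_slash_equals(text):
    """Reverse encode_no_plus_slash_equals by restoring the padding first."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


payload = bytes(range(256))          # every possible byte value
token = encode_no_plus_slash_equals(payload)
print(token[:40])
assert decode_no_plus_slash_equals(token) == payload
```

The density is the same as ordinary base64: 4 output characters for every 3 input bytes, against 6 for hex.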

Android decoding html in xml file

Question

In my software I'm receiving an XML file that contains some HTML entities like &amp; or similar. I'm successfully decoding the XML, but not the HTML entities. The strings are cut off when they reach an HTML entity. Can anybody help? This is the code I currently have to decode the XML:

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    InputStream inputStream = entity.getContent();
    Document dom = builder.parse

How to write single bits to a file in C

Question

I am programming an entropy coding algorithm and I want to write single bits, such as an encoded character, to a file. For example, I want to write 011 to a file, but if I stored it as characters it would take up 3 bytes instead of 3 bits. So my final question is: how can I write single bits to a file? Thanks in advance!

Answer 1:

You can't write individual bits to a file; the resolution is a single byte. If you want to write bits in sequence, you have to batch them up until you have a full byte, then
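The batching idea from the answer can be sketched as follows. The sketch is in Python rather than C, and the class name `BitWriter`, the most-significant-bit-first order and the zero padding of the final byte are all assumptions made for illustration; a real format also has to record how many bits of the last byte are valid.

```python
import io


class BitWriter:
    """Collect single bits and write them to a binary stream one byte at a time."""

    def __init__(self, stream):
        self._stream = stream
        self._accumulator = 0   # bits collected so far
        self._count = 0         # number of bits in the accumulator

    def write_bit(self, bit):
        # Shift earlier bits left and append the new one on the right.
        self._accumulator = (self._accumulator << 1) | (bit & 1)
        self._count += 1
        if self._count == 8:
            self._emit()

    def flush(self):
        """Pad the last partial byte with zero bits and write it out."""
        if self._count:
            self._accumulator <<= 8 - self._count
            self._emit()

    def _emit(self):
        self._stream.write(bytes([self._accumulator]))
        self._accumulator = 0
        self._count = 0


buffer = io.BytesIO()           # stands in for a file opened in "wb" mode
writer = BitWriter(buffer)
for bit in (0, 1, 1):
    writer.write_bit(bit)
writer.flush()
print(buffer.getvalue())        # the three bits occupy the top of one byte
```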

Convert hex to ascii characters

Is it possible to represent a sequence of hex characters (0-9A-F) with a sequence of 0-9a-zA-Z characters, so that the resulting sequence is shorter and can be decoded? For example:

    $hex = '5d41402abc4b2a76b9719d911017c592';
    echo $string = encode($hex); // someASCIIletters123
    echo decode($string) == $hex; // true

Jon: You can trivially adapt the solution I presented here using the function base_convert_arbitrary. Edit: I had not read carefully enough :) Base 16 to base 62 is still very doable, as above. See it in action.

Andrew: I think you're looking for this:

    function hex2str($hex) {
        $str = '';
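The base 16 to base 62 conversion Jon mentions can be sketched by treating the hex string as one large integer. This is a Python illustration, not the base_convert_arbitrary function referred to above; the function names and the digit order of the alphabet are assumptions. Leading zeros are lost in the integer form, so the decoder takes the original width.

```python
import string

# 0-9, a-z, A-Z: 62 symbols in total.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def hex_to_base62(hex_text):
    """Convert a hex string to a shorter base-62 string."""
    number = int(hex_text, 16)
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def base62_to_hex(text, width):
    """Convert back to lowercase hex, left-padded with zeros to `width` digits."""
    number = 0
    for char in text:
        number = number * 62 + ALPHABET.index(char)
    return format(number, "x").zfill(width)


hex_value = "5d41402abc4b2a76b9719d911017c592"
encoded = hex_to_base62(hex_value)
print(encoded, len(encoded))
assert base62_to_hex(encoded, len(hex_value)) == hex_value
```

A 32-digit hex string holds 128 bits, which fits in at most 22 base-62 characters.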