Unbaking mojibake
When you have incorrectly decoded characters, how can you identify likely candidates for the original string?

    Ä×èÈÄÄî▒è¤ô_üiâAâjâüâpâXüj_10òb.png

I know for a fact that this image filename should have been some Japanese characters. But with various guesses at urllib quoting/unquoting, encode and decode iso8859-1, utf8, I haven't been able to unmunge and get the original filename.

Is the corruption reversible?

galinden

You could use chardet (install with pip):

    import chardet
    your_str = "Ä×èÈÄÄî▒è¤ô_üiâAâjâüâpâXüj_10òb"
    detected_encoding = chardet.detect(your_str)["encoding"]
    try:
        correct_str =
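
On the "identify likely candidates" part of the question: chardet.detect expects bytes rather than text that has already been decoded, so another way to look for candidates is to brute-force pairs of codecs. Re-encode the garbled text with a guessed wrong codec to recover the raw bytes, then decode those bytes with a guessed right codec. Below is a minimal sketch of that idea. The function name `unbake_candidates` and the codec list are assumptions made for illustration, not something from the thread, and the reversal can only work if the wrong decode did not drop or merge any bytes.

```python
from itertools import permutations

# Assumed candidate codecs: a few Western code pages plus common Japanese
# encodings. Extend or trim this list to fit what you suspect happened.
CANDIDATES = ["cp437", "cp850", "cp1252", "latin-1",
              "utf-8", "shift_jis", "cp932", "euc_jp"]


def unbake_candidates(garbled, encodings=CANDIDATES):
    """Yield (wrong_codec, right_codec, text) for each pair that works.

    wrong_codec is the codec assumed to have mis-decoded the bytes;
    right_codec is the codec the bytes are assumed to have been written in.
    """
    for wrong, right in permutations(encodings, 2):
        try:
            # Undo the bad decode, then decode the recovered bytes again.
            text = garbled.encode(wrong).decode(right)
        except UnicodeError:
            continue  # this pair cannot explain the garbled text
        if text != garbled:
            yield wrong, right, text


if __name__ == "__main__":
    garbled = "Ä×èÈÄÄî▒è¤ô_üiâAâjâüâpâXüj_10òb.png"
    for wrong, right, text in unbake_candidates(garbled):
        print(f"{wrong} -> {right}: {text}")
```

Several pairs may decode without error, so the output still has to be judged by eye: the plausible candidate is the one that reads as real Japanese.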