How to replace � in a string

挽巷 2020-11-28 07:50

I have a string that contains the character �, and I haven't been able to replace it correctly.

String.replace("�", "");

doesn't work, d

10 answers
  • 2020-11-28 08:30

    For details, see this example:

    import java.io.UnsupportedEncodingException;
    
    /**
     * File: BOM.java
     *
     * Checks whether the given string starts with the UTF-8 BOM, prints the
     * string with the BOM bytes skipped, then prints it decoded as UTF-8 (for
     * a UTF-8 console).
     */
    
    public class BOM
    {
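        // The UTF-8 BOM bytes (EF BB BF) written as ISO-8859-1 characters, followed by the text.
        private final static String BOM_STRING = "\u00EF\u00BB\u00BFHello World";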
        private final static String ISO_ENCODING = "ISO-8859-1";
        private final static String UTF8_ENCODING = "UTF-8";
        private final static int UTF8_BOM_LENGTH = 3;
    
        public static void main(String[] args) throws UnsupportedEncodingException {
            final byte[] bytes = BOM_STRING.getBytes(ISO_ENCODING);
            if (isUTF8(bytes)) {
                printSkippedBomString(bytes);
                printUTF8String(bytes);
            }
        }
    
        private static void printSkippedBomString(final byte[] bytes) throws UnsupportedEncodingException {
            int length = bytes.length - UTF8_BOM_LENGTH;
            byte[] barray = new byte[length];
            System.arraycopy(bytes, UTF8_BOM_LENGTH, barray, 0, barray.length);
            System.out.println(new String(barray, ISO_ENCODING));
        }
    
        private static void printUTF8String(final byte[] bytes) throws UnsupportedEncodingException {
            System.out.println(new String(bytes, UTF8_ENCODING));
        }
    
        private static boolean isUTF8(byte[] bytes) {
            // Guard against inputs shorter than the 3-byte BOM before indexing.
            return bytes.length >= UTF8_BOM_LENGTH
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
        }
    }
    
  • 2020-11-28 08:31

    Change the encoding to UTF-8 while parsing. This will remove the special characters.
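
    A minimal sketch of what decoding with an explicit charset can look like, assuming the input arrives as raw bytes and really is UTF-8 (if it is in some other encoding, this will not help). The names DecodeDemo and decodeUtf8 are made up for illustration:

    ```java
    import java.nio.charset.StandardCharsets;

    final class DecodeDemo {
        // Decode raw bytes with an explicit charset instead of the platform default.
        static String decodeUtf8(byte[] raw) {
            return new String(raw, StandardCharsets.UTF_8);
        }

        public static void main(String[] args) {
            // The letter e-acute encoded as UTF-8 is the two bytes C3 A9.
            byte[] raw = {(byte) 0xC3, (byte) 0xA9};

            // Wrong charset: each byte is turned into its own character.
            String wrong = new String(raw, StandardCharsets.ISO_8859_1);
            // Right charset: the two bytes become a single character.
            String right = decodeUtf8(raw);

            System.out.println(wrong.length()); // prints 2
            System.out.println(right.length()); // prints 1
        }
    }
    ```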

  • 2020-11-28 08:32

    As others have said, you posted 3 characters instead of one. I suggest you run this little snippet of code to see what's actually in your string:

    public static void dumpString(String text)
    {
        for (int i=0; i < text.length(); i++)
        {
            System.out.println("U+" + Integer.toString(text.charAt(i), 16) 
                               + " " + text.charAt(i));
        }
    }
    

    If you post the results of that, it'll be easier to work out what's going on. (I haven't bothered padding the string - we can do that by inspection...)

  • 2020-11-28 08:33

    Character issues like this are difficult to diagnose because information is easily lost through misinterpretation of characters via application bugs, misconfiguration, cut'n'paste, etc.

    As I (and apparently others) see it, you've pasted three characters:

    codepoint   glyph   escaped    windows-1252    info
    =======================================================================
    U+00ef      ï       \u00ef     ef,             LATIN_1_SUPPLEMENT, LOWERCASE_LETTER
    U+00bf      ¿       \u00bf     bf,             LATIN_1_SUPPLEMENT, OTHER_PUNCTUATION
    U+00bd      ½       \u00bd     bd,             LATIN_1_SUPPLEMENT, OTHER_NUMBER
    

    To identify the character, download and run the program from this page. Paste your character into the text field and select the glyph mode; paste the report into your question. It'll help people identify the problematic character.
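
    For what it's worth, those three code points are exactly what you get when the UTF-8 bytes of U+FFFD (the replacement character, EF BF BD) are decoded with a single-byte charset. A small sketch that reproduces this, using ISO-8859-1 as the stand-in (it agrees with windows-1252 for these three bytes); the class and method names are hypothetical:

    ```java
    import java.nio.charset.StandardCharsets;

    final class MojibakeDemo {
        // Encode as UTF-8, then (wrongly) decode the bytes as ISO-8859-1.
        static String misdecode(String s) {
            byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
            return new String(utf8, StandardCharsets.ISO_8859_1);
        }

        public static void main(String[] args) {
            String garbled = misdecode("\uFFFD");
            // Prints U+00EF, U+00BF and U+00BD, one per line.
            for (int i = 0; i < garbled.length(); i++) {
                System.out.printf("U+%04X%n", (int) garbled.charAt(i));
            }
        }
    }
    ```

    If that is what happened here, the string never contained a single odd character to begin with: the original data was already damaged before it was mis-decoded a second time.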

  • 2020-11-28 08:39

    Use the unicode escape sequence. First you'll have to find the codepoint for the character you seek to replace (let's just say it is ABCD in hex):

    str = str.replaceAll("\uABCD", "");
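
    A sketch of the same idea with concrete code points, assuming the character in question is U+FFFD or its three-character mis-decoded form (U+00EF U+00BF U+00BD); check your own string first, since that is a guess. It uses String.replace rather than replaceAll because no regular expression is needed. The class name is made up:

    ```java
    final class StripReplacementChar {
        // Remove U+FFFD and its mis-decoded three-character form from the input.
        static String strip(String s) {
            return s.replace("\uFFFD", "")
                    .replace("\u00EF\u00BF\u00BD", "");
        }

        public static void main(String[] args) {
            System.out.println(strip("caf\uFFFD"));             // prints caf
            System.out.println(strip("caf\u00EF\u00BF\u00BD")); // prints caf
        }
    }
    ```

    Note that strings are immutable in Java, so the result has to be assigned back (str = strip(str)); calling replace and discarding the return value changes nothing.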
    
  • 2020-11-28 08:42

    None of the answers above resolved my issue. When I download the XML, it appends <xml to my XML. I simply strip the first three characters:

    xml = parser.getXmlFromUrl(url);
    
    xml = xml.substring(3); // removes the first three characters of the string
    

    Now it runs correctly.
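
    An unconditional substring(3) will also cut into good data whenever the prefix is missing. A slightly safer sketch, assuming the three extra characters are a UTF-8 BOM that was decoded with a single-byte charset (the answer does not say so, this is a guess); BomStripper and stripBom are hypothetical names:

    ```java
    final class BomStripper {
        // Strip a leading BOM only if one is actually present.
        static String stripBom(String s) {
            if (s.startsWith("\uFEFF")) {
                // BOM decoded correctly as UTF-8: a single character.
                return s.substring(1);
            }
            if (s.startsWith("\u00EF\u00BB\u00BF")) {
                // BOM bytes EF BB BF decoded as ISO-8859-1: three characters.
                return s.substring(3);
            }
            return s;
        }

        public static void main(String[] args) {
            System.out.println(stripBom("\u00EF\u00BB\u00BF<xml/>")); // prints <xml/>
            System.out.println(stripBom("<xml/>"));                   // prints <xml/>
        }
    }
    ```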
