How to replace ï¿½ in a string

I have a string that contains the character ï¿½, and I haven't been able to replace it correctly.

String.replace("ï¿½", "");

doesn't work, d

10 answers

臣服心动

2020-11-28 08:30

For details, the following program checks whether a string starts with the UTF-8 BOM bytes, prints it with the BOM skipped, and then prints it decoded as UTF-8:

```
import java.io.UnsupportedEncodingException;

/**
 * File: BOM.java
 *
 * Checks whether the UTF-8 BOM is present in the given string, prints the
 * string after skipping the BOM bytes, then prints the string decoded as
 * UTF-8 (intended for a UTF-8 console).
 */
public class BOM
{
    // The BOM bytes EF BB BF as they appear when misread as ISO-8859-1.
    private final static String BOM_STRING = "ï»¿Hello World";
    private final static String ISO_ENCODING = "ISO-8859-1";
    private final static String UTF8_ENCODING = "UTF-8";
    private final static int UTF8_BOM_LENGTH = 3;

    public static void main(String[] args) throws UnsupportedEncodingException {
        // ISO-8859-1 maps each char 0x00-0xFF to one byte, recovering the raw bytes.
        final byte[] bytes = BOM_STRING.getBytes(ISO_ENCODING);
        if (isUTF8(bytes)) {
            printSkippedBomString(bytes);
            printUTF8String(bytes);
        }
    }

    private static void printSkippedBomString(final byte[] bytes) throws UnsupportedEncodingException {
        // Copy everything after the three BOM bytes and print it.
        int length = bytes.length - UTF8_BOM_LENGTH;
        byte[] barray = new byte[length];
        System.arraycopy(bytes, UTF8_BOM_LENGTH, barray, 0, barray.length);
        System.out.println(new String(barray, ISO_ENCODING));
    }

    private static void printUTF8String(final byte[] bytes) throws UnsupportedEncodingException {
        System.out.println(new String(bytes, UTF8_ENCODING));
    }

    private static boolean isUTF8(byte[] bytes) {
        // Guard against inputs shorter than the BOM, then compare with EF BB BF.
        return bytes.length >= UTF8_BOM_LENGTH
            && (bytes[0] & 0xFF) == 0xEF
            && (bytes[1] & 0xFF) == 0xBB
            && (bytes[2] & 0xFF) == 0xBF;
    }
}
```


北恋

2020-11-28 08:31

Change the encoding to UTF-8 while parsing. This will remove the special characters.
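
A minimal sketch of what "changing the encoding" could look like, assuming the data arrives as raw bytes that really are UTF-8. The class and method names (`DecodeDemo`, `decodeUtf8`, `decodeLatin1`) are made up for illustration. Note that this only helps if the bytes are still intact; if the text was already damaged before it reached you, decoding as UTF-8 will not bring the original characters back.

```java
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Decode the bytes as UTF-8 (the assumed real encoding of the data).
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Decode the same bytes as ISO-8859-1, which turns every byte into one char.
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // C3 A9 is the UTF-8 encoding of U+00E9 (e with acute accent).
        byte[] raw = {(byte) 0xC3, (byte) 0xA9};
        System.out.println(decodeUtf8(raw).length());   // prints 1
        System.out.println(decodeLatin1(raw).length()); // prints 2
    }
}
```

The same idea applies to readers: pass the charset explicitly (for example to `InputStreamReader`) instead of relying on the platform default.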

旧巷少年郎

2020-11-28 08:32
As others have said, you posted 3 characters instead of one. I suggest you run this little snippet of code to see what's actually in your string:
```
public static void dumpString(String text)
{
    for (int i=0; i < text.length(); i++)
    {
        System.out.println("U+" + Integer.toString(text.charAt(i), 16) 
                           + " " + text.charAt(i));
    }
}
```
If you post the results of that, it'll be easier to work out what's going on. (I haven't bothered padding the string - we can do that by inspection...)
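
For reference, a variant of the snippet above with zero-padded output, sketched under the assumption that Java 8 or later is available. It iterates over code points rather than `char`s, so characters outside the Basic Multilingual Plane are reported as one entry. The names `DumpString` and `describe` are illustrative only.

```java
public class DumpString {
    // Build one line per code point: "U+XXXX <glyph>".
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp ->
            sb.append(String.format("U+%04X", cp))
              .append(' ')
              .appendCodePoint(cp)
              .append('\n'));
        return sb.toString();
    }

    public static void main(String[] args) {
        // Pass the suspicious string as the first argument, or use a sample.
        String sample = args.length > 0 ? args[0] : "\u00EF\u00BF\u00BD";
        System.out.print(describe(sample));
    }
}
```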
日久生厌

2020-11-28 08:33
Character issues like this are difficult to diagnose because information is easily lost through misinterpretation of characters via application bugs, misconfiguration, cut'n'paste, etc.

As I (and apparently others) see it, you've pasted three characters:
```
codepoint   glyph   escaped    windows-1252    info
=======================================================================
U+00ef      ï       \u00ef     ef,             LATIN_1_SUPPLEMENT, LOWERCASE_LETTER
U+00bf      ¿       \u00bf     bf,             LATIN_1_SUPPLEMENT, OTHER_PUNCTUATION
U+00bd      ½       \u00bd     bd,             LATIN_1_SUPPLEMENT, OTHER_NUMBER
```
To identify the character, download and run the program from this page. Paste your character into the text field and select the glyph mode; paste the report into your question. It'll help people identify the problematic character.
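
One plausible explanation for those three characters, offered as a guess rather than a diagnosis: the bytes EF BF BD are the UTF-8 encoding of U+FFFD (the replacement character), and reading them as ISO-8859-1 or windows-1252 yields exactly ï, ¿ and ½. The sketch below reproduces that effect; `MojibakeDemo` and `misdecode` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode as UTF-8, then wrongly decode the bytes as ISO-8859-1.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The replacement character becomes the three chars from the table.
        System.out.println(misdecode("\uFFFD").equals("\u00EF\u00BF\u00BD")); // prints true
        // The byte order mark becomes the three chars EF BB BF.
        System.out.println(misdecode("\uFEFF").equals("\u00EF\u00BB\u00BF")); // prints true
    }
}
```

If that is what happened, the original character was already lost before the string reached your code, and the fix belongs at the point where the bytes are decoded.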
小蘑菇

2020-11-28 08:39
Use the Unicode escape sequence. First you'll have to find the code point of the character you want to replace (let's say it is ABCD in hex):
```
str = str.replaceAll("\uABCD", "");
```
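
A slightly fuller sketch of the same idea, assuming the offending character turns out to be either U+FFFD or the three-character sequence U+00EF U+00BF U+00BD; substitute whatever code points your own dump reports. It uses `replace` rather than `replaceAll`, since no regular expression is needed, and it assigns the result because Java strings are immutable. The class and method names are illustrative.

```java
public class ReplaceDemo {
    // Remove the Unicode replacement character, if that is what the string holds.
    static String stripReplacementChar(String s) {
        return s.replace("\uFFFD", "");
    }

    // Remove the three-char sequence that appears when EF BF BD is read as Latin-1.
    static String stripMisdecoded(String s) {
        return s.replace("\u00EF\u00BF\u00BD", "");
    }

    public static void main(String[] args) {
        String input = "a\uFFFDb";
        // replace returns a new string; the original is left unchanged.
        String cleaned = stripReplacementChar(input);
        System.out.println(cleaned); // prints ab
    }
}
```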
遇见更好的自我

2020-11-28 08:42
None of the answers above resolved my issue. When I download the XML, ï»¿ gets prepended to it, so the string starts with ï»¿<xml. I simply drop those characters:
```
xml = parser.getXmlFromUrl(url);

xml = xml.substring(3); // removes the first three characters from the string
```
Now it works correctly.
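
An unconditional `substring(3)` would also cut into documents that arrive without the prefix. A more defensive sketch, assuming the prefix is either a real BOM (U+FEFF) or its three-character misdecoded form; `BomStripper` and `stripBom` are hypothetical names, not part of any parser library:

```java
public class BomStripper {
    // The byte order mark as a single correctly decoded character.
    private static final String BOM = "\uFEFF";
    // The BOM bytes EF BB BF misread as three Latin-1 characters.
    private static final String MISDECODED_BOM = "\u00EF\u00BB\u00BF";

    // Strip a leading BOM in either form; return the text unchanged otherwise.
    static String stripBom(String text) {
        if (text.startsWith(BOM)) {
            return text.substring(BOM.length());
        }
        if (text.startsWith(MISDECODED_BOM)) {
            return text.substring(MISDECODED_BOM.length());
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(stripBom("\u00EF\u00BB\u00BF<xml/>")); // prints <xml/>
        System.out.println(stripBom("<xml/>"));                   // prints <xml/>
    }
}
```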