Removing invalid XML characters from a string in Java
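Hi, I would like to remove all invalid XML characters from a string. I would like to use a regular expression with the string.replace method, like line.replace(regExp, ""); What is the right regExp to use? An invalid XML character is everything that is not this: [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] Thanks.

Java's regex supports supplementary characters, so you can specify those high ranges with two UTF-16 encoded chars (a surrogate pair). Note that String.replace treats its first argument as literal text, so a regular expression has to go through String.replaceAll or a compiled Pattern instead. Also, the range quoted in the question is the XML 1.1 definition; XML 1.0 is stricter and excludes most control characters below #x20. Here is the pattern for removing characters that are illegal in XML 1.0:

// XML 1.0
// #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]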
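The XML 1.0 ranges listed in the comment can be turned into a negated character class. What follows is only a minimal sketch, not the original answer's exact pattern: the class name XmlCharFilter and the method strip are hypothetical, and the supplementary range is written with the \x{...} code point escape (which assumes Java 7 or later) rather than as a literal surrogate pair.

```java
import java.util.regex.Pattern;

final class XmlCharFilter {
    // Negated character class: matches every code point that XML 1.0 does NOT allow.
    // Allowed: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    // The doubled backslashes hand the escapes to the regex engine, not the compiler.
    private static final Pattern INVALID_XML10 = Pattern.compile(
            "[^\\x09\\x0A\\x0D\\x20-\\uD7FF\\uE000-\\uFFFD\\x{10000}-\\x{10FFFF}]");

    // Hypothetical helper: returns the input with all illegal characters removed.
    static String strip(String input) {
        return INVALID_XML10.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // NUL and backspace are illegal in XML 1.0; the space is kept.
        System.out.println(strip("ok\u0000 text\u0008")); // prints "ok text"
    }
}
```

Compiling the pattern once and reusing it avoids recompiling on every call, which is what line.replaceAll(regExp, "") would do internally each time.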