This is a sample text: \\1f\\1e\\1d\\020028
. I cannot modify the input text; I am reading long strings of text from a file.
I wan
Try adding a . at the end, like:
\\[a-fA-F0-9].
(answer changed after OP added more details)
Your string
String inputText = "\1f\1e\1d\02002868BF03030000000000000000S023\1f\1e\1d\03\0d";
doesn't actually contain any \
literals. According to the Java Language Specification, section 3.10.6 (Escape Sequences for Character and String Literals), \xxx
is interpreted as the character whose Unicode code point is the octal (base 8) value given by the xxx
part.
Example: \123
= 1*8^2 + 2*8^1 + 3*8^0 = 1*64 + 2*8 + 3*1 = 64 + 16 + 3 = 83, which represents the character S.
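The arithmetic above can be checked directly in Java. This is only a minimal sketch; the class name OctalEscapeDemo and the helper octalValue are made up for illustration:

```java
public class OctalEscapeDemo {
    // Hypothetical helper: parses a run of octal digits, e.g. "123" -> 83.
    static int octalValue(String octalDigits) {
        return Integer.parseInt(octalDigits, 8);
    }

    public static void main(String[] args) {
        String escaped = "\123"; // octal escape, resolved by the compiler
        System.out.println(octalValue("123"));   // 83
        System.out.println(escaped);             // S
        System.out.println(escaped.equals("S")); // true
    }
}
```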
If the string in your question appears exactly like this in your text file, then the equivalent Java literal is
String inputText = "\\1f\\1e\\1d\\02002868BF03030000000000000000S023\\1f\\1e\\1d\\03\\0d";
(with each \
escaped as \\, so that it represents a literal backslash).
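To see the difference between the two literals, you can compare their lengths and check for a backslash. A small sketch, with the class and constant names chosen only for this example:

```java
public class BackslashLiteralDemo {
    static final String UNESCAPED = "\1f"; // octal escape \1 (code point 1), then 'f'
    static final String ESCAPED = "\\1f";  // literal backslash, then '1', then 'f'

    public static void main(String[] args) {
        System.out.println(UNESCAPED.length());       // 2
        System.out.println(UNESCAPED.contains("\\")); // false
        System.out.println(ESCAPED.length());         // 3
        System.out.println(ESCAPED.contains("\\"));   // true
    }
}
```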
(older version of my answer)
It is hard to tell exactly what went wrong without seeing your code. You should be able to find at least \1, \1, \1 and \0, since your regex matches one \
followed by one hexadecimal character.
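For illustration, here is what a pattern with a single hexadecimal character finds. I am assuming the original regex was \\[a-fA-F0-9] (it is not shown in full), and findAll is a hypothetical helper written for this sketch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SingleDigitMatchDemo {
    // Hypothetical helper: collects every match of regex in text, in order.
    static List<String> findAll(String regex, String text) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            results.add(m.group());
        }
        return results;
    }

    public static void main(String[] args) {
        String text = "\\1f\\1e\\1d\\020028";
        // One backslash followed by exactly one hex character (assumed original regex).
        System.out.println(findAll("\\\\[a-fA-F0-9]", text)); // [\1, \1, \1, \0]
    }
}
```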
Anyway, this is how you can find the results mentioned in your question:
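import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Demo {
    public static void main(String[] args) {
        String text = "\\1f\\1e\\1d\\020028";
        // {2} means: exactly two hexadecimal characters after the \
        Pattern p = Pattern.compile("\\\\[a-fA-F0-9]{2}");
        Matcher m = p.matcher(text);
        while (m.find()) {
            System.out.println(m.group());
        }
    }
}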
Output:
\1f
\1e
\1d
\02
If you don't want to modify the input string, you could try something like:
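import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ControlCharMatch {
    public static void main(String[] argv) {
        // The compiler has already turned the octal escapes into control characters.
        String s = "\1f\1e\1d\020028";
        // Match one control character (0x00-0x1f) followed by one hex digit.
        Pattern regex = Pattern.compile("[\\x00-\\x1f][0-9A-Fa-f]");
        Matcher match = regex.matcher(s);
        while (match.find()) {
            char[] c = match.group().toCharArray();
            // Print the control character's code point in decimal, then the hex digit.
            System.out.println(String.format("\\%d%s", c[0] + 0, c[1]));
        }
    }
}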
Yes, it's not perfect, but you get the idea.
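To see why the result is only approximate, it helps to look at what the compiler actually stores for that literal. In the sketch below (class and method names are made up), \020 is consumed as a single octal escape with code point 16, so the fourth match prints as \160 rather than \02:

```java
public class ControlCharDemo {
    // Hypothetical helper: renders each char of s as its decimal code point.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append((int) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\1f\1e\1d\020028";
        System.out.println(s.length());    // 10
        System.out.println(codePoints(s)); // 1 102 1 101 1 100 16 48 50 56
    }
}
```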
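Text read from a file already contains literal '\' characters; doubling them to '\\' is only required when the text is written as a Java string literal. The code below doubles them after reading, which for this input does not change what the pattern finds. Assume there is a file called test_file.txt on your classpath with this content: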
\1f\1e\1d\02002868BF03030000000000000000S023\1f\1e\1d\03\0d
Here is the code to read the file and extract values:
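import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) throws IOException {
        new Test().test();
    }

    public void test() throws IOException {
        // Compile once, outside the loop: one backslash followed by two hex characters.
        Pattern pattern = Pattern.compile("\\\\[a-fA-F0-9]{2}");
        // try-with-resources closes the reader even if an exception is thrown.
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                getClass().getResourceAsStream("/test_file.txt"), StandardCharsets.UTF_8))) {
            String inputText;
            while ((inputText = br.readLine()) != null) {
                inputText = inputText.replace("\\", "\\\\");
                Matcher match = pattern.matcher(inputText);
                while (match.find()) {
                    System.out.println(match.group());
                }
            }
        }
    }
}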
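As far as I can tell, the same values can also be extracted without touching the backslashes at all, because a line returned by readLine() already holds literal backslash characters. The sketch below uses a StringReader as a stand-in for the file so that it runs on its own; FileLineMatchDemo and extract are hypothetical names:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FileLineMatchDemo {
    // One backslash followed by exactly two hexadecimal characters.
    private static final Pattern HEX_ESCAPE = Pattern.compile("\\\\[a-fA-F0-9]{2}");

    // Hypothetical helper: reads source line by line and collects every match.
    static List<String> extract(Reader source) {
        List<String> results = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                Matcher m = HEX_ESCAPE.matcher(line); // no replace() step
                while (m.find()) {
                    results.add(m.group());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        // Stand-in for the file content. The backslashes are doubled here only
        // because this is a Java literal; the file itself has single backslashes.
        String fileContent =
            "\\1f\\1e\\1d\\02002868BF03030000000000000000S023\\1f\\1e\\1d\\03\\0d\n";
        System.out.println(extract(new StringReader(fileContent)));
        // [\1f, \1e, \1d, \02, \1f, \1e, \1d, \03, \0d]
    }
}
```

With a real file, the StringReader would be replaced by an InputStreamReader over the file or classpath resource.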