Java Pattern/Matcher

既然无缘 2021-01-14 08:42

This is a sample text: \\1f\\1e\\1d\\020028. I cannot modify the input text; I am reading long strings of text from a file.


I wan

4 Answers
  • 2021-01-14 08:54

    Try adding a . at the end, like:

    \\[a-fA-F0-9].
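
    As a rough illustration of that suggestion, the pattern can be tried against the sample text from the question. The class name DotSuffixDemo and the helper findAll are made up for this sketch. Note that inside a Java string literal the regex needs four backslashes to match one literal \, and that the trailing . accepts any character, not only a hexadecimal digit:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotSuffixDemo {
    // Hypothetical helper: collects every match of regex in text.
    public static List<String> findAll(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // The text as it would appear in the file: single literal backslashes.
        String text = "\\1f\\1e\\1d\\020028";
        // The regex \\[a-fA-F0-9]. written as a Java string literal.
        System.out.println(findAll("\\\\[a-fA-F0-9].", text));
        // prints [\1f, \1e, \1d, \02]
    }
}
```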
    
  • 2021-01-14 08:56

    (answer changed after OP added more details)

    Your string

    String inputText = "\1f\1e\1d\02002868BF03030000000000000000S023\1f\1e\1d\03\0d";
    

    doesn't actually contain any \ literals, because according to the Java Language Specification, section 3.10.6 (Escape Sequences for Character and String Literals), \xxx is interpreted as the character whose index in the Unicode table is the octal (base 8) value xxx.

    Example: \123 = 1*8^2 + 2*8^1 + 3*8^0 = 1*64 + 2*8 + 3*1 = 64 + 16 + 3 = 83, which represents the character S.

    If the string in your question is written exactly like that in your text file, then in Java source you should write it as

    String inputText = "\\1f\\1e\\1d\\02002868BF03030000000000000000S023\\1f\\1e\\1d\\03\\0d";
    

    (with each \ escaped, so that it now represents a literal backslash).
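
    The difference between the two literals can be checked with a small sketch. The class name OctalEscapeDemo and the helper codes are invented for illustration; the code only prints the numeric value of each char in a string:

```java
import java.util.Arrays;

public class OctalEscapeDemo {
    // Hypothetical helper: returns the numeric value of every char in s.
    public static int[] codes(String s) {
        return s.chars().toArray();
    }

    public static void main(String[] args) {
        // \123 is a single character: octal 123 = decimal 83 = 'S'.
        System.out.println("\123".equals("S"));                 // prints true
        System.out.println(Integer.parseInt("123", 8));         // prints 83
        // \1f is two characters: U+0001 followed by the letter f.
        System.out.println(Arrays.toString(codes("\1f")));      // prints [1, 102]
        // \\1f is three characters: a literal backslash, '1' and 'f'.
        System.out.println(Arrays.toString(codes("\\1f")));     // prints [92, 49, 102]
    }
}
```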


    (older version of my answer)

    It is hard to tell exactly what you did wrong without seeing your code. You should be able to find at least \1, \1, \1, \0, since your regex matches one \ followed by one hexadecimal character.

    Anyway, this is how you can find the results you mentioned in the question:

    String text = "\\1f\\1e\\1d\\020028";
    Pattern p = Pattern.compile("\\\\[a-fA-F0-9]{2}");
    //                                          ^^^--we want to find two hexadecimal 
    //                                               characters after \
    Matcher m = p.matcher(text);
    while (m.find())
        System.out.println(m.group());
    

    Output:

    \1f
    \1e
    \1d
    \02
    
  • 2021-01-14 08:59

    If you don't want to modify the input string, you could try something like:

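    public static void main(String[] argv) {
        // "\1f" here is an octal escape (U+0001) followed by 'f', not a backslash.
        String s = "\1f\1e\1d\020028";
        // Match one control character (U+0000 to U+001F) plus one hex digit.
        Pattern regex = Pattern.compile("[\\x00-\\x1f][0-9A-Fa-f]");
        Matcher match = regex.matcher(s);
        while (match.find()) {
            char[] c = match.group().toCharArray();
            // Print the control character's value in octal, the base the escape used.
            // Leading zeros are lost: \020 followed by '0' prints as \200.
            System.out.println(String.format("\\%o%s", c[0] + 0, c[1]));
        }
    }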
    

    Yes, it's not perfect, but you get the idea.

  • 2021-01-14 09:03

    You need to read the file properly and replace '\' characters with '\\'. Assume that there is a file called test_file.txt on your project's classpath with this content:

    \1f\1e\1d\02002868BF03030000000000000000S023\1f\1e\1d\03\0d
    

    Here is the code to read the file and extract values:

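    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Test {
        public static void main(String[] args) throws IOException {
            Test t = new Test();
            t.test();
        }

        public void test() throws IOException {
            // Compile once: one literal backslash followed by two hex digits.
            Pattern pattern = Pattern.compile("\\\\[a-fA-F0-9]{2}");

            // try-with-resources closes the reader even if reading fails.
            try (BufferedReader br =
                    new BufferedReader(
                        new InputStreamReader(
                            getClass().getResourceAsStream("/test_file.txt"),
                            StandardCharsets.UTF_8))) {
                String inputText;
                while ((inputText = br.readLine()) != null) {
                    // Doubles every backslash; the pattern still matches the second
                    // backslash of each pair, so the printed tokens are unchanged.
                    inputText = inputText.replace("\\", "\\\\");

                    Matcher match = pattern.matcher(inputText);
                    while (match.find()) {
                        System.out.println(match.group());
                    }
                }
            }
        }
    }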
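
    For comparison, here is a minimal sketch of the same extraction with java.nio.file.Files instead of a classpath resource. It assumes the file contains single literal backslashes, as in the sample content above. Text read at runtime does not go through Java's escape processing, so in this sketch the pattern is applied to each line as read. The class name FileEscapeScan, the helper scan and the temporary file are made up to keep the example self-contained:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FileEscapeScan {
    // One literal backslash followed by two hexadecimal digits.
    private static final Pattern HEX_ESCAPE = Pattern.compile("\\\\[a-fA-F0-9]{2}");

    // Hypothetical helper: returns every matching token found in the file.
    public static List<String> scan(Path file) throws IOException {
        List<String> found = new ArrayList<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            Matcher m = HEX_ESCAPE.matcher(line);
            while (m.find()) {
                found.add(m.group());
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("test_file", ".txt");
        // "\\" in Java source writes one backslash into the file.
        String content = "\\1f\\1e\\1d\\02002868BF03030000000000000000S023\\1f\\1e\\1d\\03\\0d";
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
        System.out.println(scan(tmp));
        // prints [\1f, \1e, \1d, \02, \1f, \1e, \1d, \03, \0d]
        Files.delete(tmp);
    }
}
```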
    