Using Regular Expressions to Extract a Value in Java

失恋的感觉 2020-11-22 13:08

I have several strings in the rough form:

[some text] [some number] [some more text]

I want to extract the text in [some number] using regular expressions in Java.

13 Answers
  • 2020-11-22 13:22

    Sometimes you can use the simple .split("REGEXP") method available in java.lang.String. For example:

    String input = "first,second,third";
    String[] parts = input.split(",");

    // parts[0] is "first"
    String first = parts[0];
    // parts[1] is "second"
    String second = parts[1];
    // parts[2] is "third"
    String third = parts[2];
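
For the question's string shape, the same split idea can be sketched by splitting on runs of non-digits, so that only the digit runs remain. The class and method names here are hypothetical, and returning null when there are no digits is an assumption of this sketch, not something the answer specifies.

```java
final class SplitExtract {
    // Hypothetical helper: returns the first run of digits in input, or null if there is none.
    static String firstNumber(String input) {
        // Splitting on runs of non-digits leaves only the digit runs.
        // A leading empty string appears when the input starts with a non-digit, so skip it.
        for (String part : input.split("\\D+")) {
            if (!part.isEmpty()) {
                return part;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("some text 4234 some more text"));
    }
}
```

This allocates an array of every digit run in the input, so for long strings a Matcher that stops at the first hit is likely the better fit.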
    
  • 2020-11-22 13:23

    Allain basically has the Java code, so you can use that. However, his expression only matches if your number is preceded by a run of word characters.

    "(\\d+)"
    

    should find the first string of digits. You don't need to specify what comes before it if you're sure it's going to be the first string of digits. Likewise, there is no need to specify what comes after it, unless you want that part too. If you just want the number, and are sure that it will be the first string of one or more digits, then that's all you need.
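
A minimal sketch of that expression with Matcher.find(), which scans forward to the first place the pattern matches. The class and method names are hypothetical, and the null return for "no digits" is an assumption of this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FirstDigits {
    private static final Pattern DIGITS = Pattern.compile("(\\d+)");

    // Hypothetical helper: first run of digits, or null when the input has none.
    static String firstDigits(String input) {
        Matcher m = DIGITS.matcher(input);
        // find() skips over any leading text on its own, so no prefix is needed in the pattern
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstDigits("Testing4234Things99"));
    }
}
```

Only the first run is returned; later runs such as the trailing 99 are ignored unless find() is called again.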

    If you expect the number to be set off by spaces, specifying them makes the match even more distinct:

    "\\s+(\\d+)\\s+"
    
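
To illustrate the whitespace-delimited variant, here is a small sketch with hypothetical names. It assumes the number has whitespace on both sides, which also means a number at the very start or end of the string is not matched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SpacedDigits {
    private static final Pattern SPACED = Pattern.compile("\\s+(\\d+)\\s+");

    // Hypothetical helper: first digit run that has whitespace on both sides, or null.
    static String firstSpacedNumber(String input) {
        Matcher m = SPACED.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // "42" is glued to "item", so the whitespace-delimited "17" is the one found
        System.out.println(firstSpacedNumber("item42 costs 17 dollars"));
        // No trailing whitespace after "17", so nothing matches here
        System.out.println(firstSpacedNumber("total 17"));
    }
}
```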

    If you need all three parts, this will do:

    "(\\D+)(\\d+)(.*)"
    

    EDIT: The expressions given by Allain and Jack suggest that you need to specify some subset of non-digits in order to capture digits. If you tell the regex engine you're looking for \d, then find() skips everything before the digits. If Jack's or Allain's expression fits your pattern, then the whole match equals the input string, and there's no reason to specify it. It probably slows a clean match down, if it isn't ignored entirely.
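
Where all three parts are wanted, the three-group expression from this answer could be used roughly as follows. This is a sketch with hypothetical names; it uses matches(), which requires the pattern to cover the entire input, and that is the one case where describing the text around the digits earns its keep.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ThreeParts {
    private static final Pattern PARTS = Pattern.compile("(\\D+)(\\d+)(.*)");

    // Hypothetical helper: returns {text before, number, text after},
    // or null if the input does not have that shape.
    static String[] parts(String input) {
        Matcher m = PARTS.matcher(input);
        // matches() succeeds only if the whole input fits the pattern
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] p = parts("some text 4234 some more text");
        System.out.println("[" + p[0] + "] [" + p[1] + "] [" + p[2] + "]");
    }
}
```

Because \\D+ needs at least one non-digit, an input that begins with the number yields null here.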

  • 2020-11-22 13:27

    In addition to Pattern, the Java String class has several methods that work with regular expressions. In your case the code would be:

    "ab123abc".replaceFirst("\\D*(\\d*).*", "$1")
    

    where \\D matches a non-digit character, \\d matches a digit, and $1 refers to the first capture group.
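
One thing to keep in mind with this one-liner is that the pattern also matches when there are no digits at all, in which case the result is an empty string. A sketch with hypothetical names, assuming single-line input and treating the empty result as "not found":

```java
final class ReplaceFirstExtract {
    // Hypothetical helper: first run of digits via replaceFirst, or null if there is none.
    // Assumes the input contains no line terminators, since . does not match them by default.
    static String firstNumber(String input) {
        String digits = input.replaceFirst("\\D*(\\d*).*", "$1");
        // With no digits present, group 1 is empty and so is the result
        return digits.isEmpty() ? null : digits;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("ab123abc"));
        System.out.println(firstNumber("abc"));
    }
}
```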

  • 2020-11-22 13:27

    How about:

    [^\\d]*([0-9]+[\\s]*[.,]{0,1}[\\s]*[0-9]*).*

    I think it takes care of numbers with a fractional part. It allows white space and accepts , as well as . as a possible separator. I'm trying to get numbers, including floats, out of a string while allowing for the user mistakenly typing white space inside the number.
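
A sketch of how this expression might be used, with hypothetical names. The clean-up step (removing white space and treating a comma as the decimal separator) is an assumption about what the author intends, not something stated in the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LooseNumber {
    private static final Pattern NUMBER =
        Pattern.compile("[^\\d]*([0-9]+[\\s]*[.,]{0,1}[\\s]*[0-9]*).*");

    // Hypothetical helper: the first number as a Double, or null if no digits are found.
    static Double parse(String input) {
        Matcher m = NUMBER.matcher(input);
        if (!m.find()) {
            return null;
        }
        String cleaned = m.group(1)
            .replaceAll("\\s+", "")  // drop white space typed inside the number
            .replace(',', '.');      // assumption: a comma is a decimal separator
        return Double.parseDouble(cleaned);
    }

    public static void main(String[] args) {
        System.out.println(parse("price: 12 , 5 usd"));
    }
}
```

Because the separator is optional, two integers separated only by a space, as in "42 7", are captured together and collapse into 427 after clean-up, so this fits only inputs known to contain a single number.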

  • 2020-11-22 13:34

    Simple Solution

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Regexplanation:
    // ^       beginning of the input
    // \\D+    1+ non-digit characters
    // (\\d+)  1+ digit characters in a capture group
    // .*      0+ of any character (except line terminators)
    String regexStr = "^\\D+(\\d+).*";
    
    // Compile the regex String into a Pattern
    Pattern p = Pattern.compile(regexStr);
    
    // Create a matcher with the input String (inputStr is assumed to be a non-null String)
    Matcher m = p.matcher(inputStr);
    
    // If we find a match
    if (m.find()) {
        // Get the String from the first capture group
        String someDigits = m.group(1);
        // ...do something with someDigits
    }
    

    Solution in a Util Class

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class MyUtil {
        // A compiled Pattern is immutable and thread-safe, so it can be shared
        private static final Pattern PATTERN = Pattern.compile("^\\D+(\\d+).*");
    
        // Assumptions: inputStr is a non-null String
        public static String extractFirstNumber(String inputStr) {
            // A Matcher is not thread-safe, so create a new one for each call
            Matcher matcher = PATTERN.matcher(inputStr);
    
            // Check if there's a match
            if (matcher.find()) {
                // Return the number (in the first capture group)
                return matcher.group(1);
            } else {
                // Return some default value, if there is no match
                return null;
            }
        }
    }
    
    ...
    
    // Use the util function and print out the result
    String firstNum = MyUtil.extractFirstNumber("Testing4234Things");
    System.out.println(firstNum);
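
Note that the pattern in this answer requires at least one non-digit before the number, so an input that starts with digits does not match. The sketch below contrasts it with a variant that uses \\D* instead. The variant and all names are hypothetical, offered only for the case where leading digits are possible.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LeadingDigits {
    // The pattern from the answer: needs 1+ non-digits before the number
    private static final Pattern STRICT = Pattern.compile("^\\D+(\\d+).*");
    // Hypothetical variant: 0+ non-digits, so the number may come first
    private static final Pattern LENIENT = Pattern.compile("^\\D*(\\d+).*");

    private static String extract(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    static String strict(String input) {
        return extract(STRICT, input);
    }

    static String lenient(String input) {
        return extract(LENIENT, input);
    }

    public static void main(String[] args) {
        System.out.println(strict("4234Things"));   // no match, so null
        System.out.println(lenient("4234Things"));  // the leading number is found
    }
}
```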
    
  • 2020-11-22 13:38

    In Java 1.4 and up:

    String input = "...";
    Matcher matcher = Pattern.compile("[^0-9]+([0-9]+)[^0-9]+").matcher(input);
    if (matcher.find()) {
        String someNumberStr = matcher.group(1);
        // if you need this to be an int:
        int someNumberInt = Integer.parseInt(someNumberStr);
    }
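
If the digits can be arbitrarily long, Integer.parseInt throws a NumberFormatException once the value no longer fits in an int. A sketch of one way to guard against that, using the same expression. The names and the choice of OptionalInt as the return type are assumptions of this sketch.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SafeParse {
    private static final Pattern NUMBER = Pattern.compile("[^0-9]+([0-9]+)[^0-9]+");

    // Hypothetical helper: the number as an int, or empty if there is
    // no match or the digits do not fit in an int.
    static OptionalInt extractInt(String input) {
        Matcher m = NUMBER.matcher(input);
        if (!m.find()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        } catch (NumberFormatException e) {
            // The digit run is too large for an int
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(extractInt("order 4234 shipped"));
        System.out.println(extractInt("id 99999999999 x"));
    }
}
```

This expression needs non-digits on both sides of the number, so an input consisting only of digits also comes back empty.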
    