What is the Java equivalent of sscanf for parsing values from a string using a known pattern?

梦如初夏 2020-11-27 06:01

So I come from a C background (originally, though I haven't used that language for almost 5 years) and I'm trying to parse some values from a string in Java. In

8 answers
  • 2020-11-27 06:34

    None of these examples were really satisfactory to me, so I made my own Java sscanf utility:

    https://github.com/driedler/java-sscanf/tree/master/src/util/sscanf

    Here's an example of parsing a hex string:

    String buffer = "my hex string: DEADBEEF\n";
    Object output[] = Sscanf.scan(buffer, "my hex string: %X\n", 1);
    
    System.out.println("parse count: " + output.length);
    System.out.println("hex str1: " + (Long)output[0]);
    
    // Output:
    // parse count: 1
    // hex str1: 3735928559
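
    For comparison, the same value can probably be extracted with the standard library alone. The following is only a sketch, not part of the utility above: the class name HexScanSketch and the method parseHex are made up for illustration, and the regex is an assumed stand-in for the "my hex string: %X\n" format.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of java-sscanf: standard-library take on "%X".
class HexScanSketch {
    // Literal prefix, one capture group of hex digits, optional trailing whitespace.
    private static final Pattern HEX_LINE =
            Pattern.compile("my hex string: ([0-9A-Fa-f]+)\\s*");

    /** Returns the parsed value, or throws if the input does not match the pattern. */
    static long parseHex(String buffer) {
        Matcher m = HEX_LINE.matcher(buffer);
        if (!m.matches()) {
            throw new IllegalArgumentException("Input does not match pattern: " + buffer);
        }
        // Radix 16; a long is needed because DEADBEEF exceeds Integer.MAX_VALUE.
        return Long.parseLong(m.group(1), 16);
    }

    public static void main(String[] args) {
        // prints "hex str1: 3735928559"
        System.out.println("hex str1: " + parseHex("my hex string: DEADBEEF\n"));
    }
}
```

    The trade-off is that the format is written as a regex rather than a C-style format string, and a mismatch surfaces as an exception instead of a short result array.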
    
  • 2020-11-27 06:40

    The problem is that Java has no out parameters (or pass-by-reference), unlike C or C#.

    But there is a better (and more robust) way: use regular expressions.

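    // needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
    Pattern p = Pattern.compile("(\\d+)-(\\p{Alpha}+)-(\\d+) (\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)");
    Matcher m = p.matcher("17-MAR-11 15.52.25.000000000");
    if (m.matches()) {  // group() throws IllegalStateException unless a match was attempted and succeeded
        String day = m.group(1);
        String month = m.group(2);
        ....
    }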
    

    Of course, the C code is more concise, but this technique has one advantage: a pattern specifies the format more precisely than '%s' and '%d'. For example, you can use \d{2} to require that the day consist of exactly 2 digits.
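
    To make the \d{2} point concrete, here is a minimal sketch that returns the parsed fields in a small value object, which stands in for the out parameters Java lacks. The class name TimestampSketch, its field names, and the exact field widths are assumptions chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value class: one instance holds everything sscanf would write to its arguments.
final class TimestampSketch {
    // Named groups with explicit widths: stricter than %d and %s.
    private static final Pattern FORMAT = Pattern.compile(
            "(?<day>\\d{2})-(?<month>\\p{Alpha}{3})-(?<year>\\d{2}) "
            + "(?<hour>\\d{2})\\.(?<min>\\d{2})\\.(?<sec>\\d{2})\\.(?<frac>\\d+)");

    final int day;
    final String month;
    final int year;
    final int hour;
    final int min;
    final int sec;
    final String frac; // kept as text so leading zeros are preserved

    private TimestampSketch(Matcher m) {
        day = Integer.parseInt(m.group("day"));
        month = m.group("month");
        year = Integer.parseInt(m.group("year"));
        hour = Integer.parseInt(m.group("hour"));
        min = Integer.parseInt(m.group("min"));
        sec = Integer.parseInt(m.group("sec"));
        frac = m.group("frac");
    }

    /** Parses the whole string, or throws if any field has the wrong shape. */
    static TimestampSketch parse(String s) {
        Matcher m = FORMAT.matcher(s);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not in the expected format: " + s);
        }
        return new TimestampSketch(m);
    }

    public static void main(String[] args) {
        TimestampSketch t = parse("17-MAR-11 15.52.25.000000000");
        // prints "day=17 month=MAR year=11"
        System.out.println("day=" + t.day + " month=" + t.month + " year=" + t.year);
    }
}
```

    With this pattern an input such as "7-MAR-11 15.52.25.000000000" is rejected, because the day is only one digit long.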

  • 2020-11-27 06:46

    Here is a solution using scanners:

    Scanner scanner = new Scanner("17-MAR-11 15.52.25.000000000");
    
    Scanner dayScanner = new Scanner(scanner.next());
    Scanner timeScanner = new Scanner(scanner.next());
    
    dayScanner.useDelimiter("-");
    System.out.println("day=" + dayScanner.nextInt());
    System.out.println("month=" + dayScanner.next());
    System.out.println("year=" + dayScanner.nextInt());
    
    timeScanner.useDelimiter("\\.");
    System.out.println("hour=" + timeScanner.nextInt());
    System.out.println("min=" + timeScanner.nextInt());
    System.out.println("sec=" + timeScanner.nextInt());
    System.out.println("fracpart=" + timeScanner.nextInt());
    
  • 2020-11-27 06:47

    This is far from as elegant as a regex solution, but it ought to work.

    public static void stringStuffThing() {
        String x = "17-MAR-11 15.52.25.000000000";
        String[] y = x.split(" ");            // date part, time part
    
        for (String s : y) {
            System.out.println(s);
        }
        String[] date = y[0].split("-");      // day, month, year
        String[] values = y[1].split("\\.");  // hour, minute, second, fraction
    
        for (String s : date) {
            System.out.println(s);
        }
        for (String s : values) {
            System.out.println(s);
        }
    }
    
  • Are you familiar with the concept of regular expressions? Java provides you with the ability to use regex by using the Pattern class. Check this one out: http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html

    You can test your String like this:

    // yourRegex is a placeholder for the pattern you want to match
    Matcher matcher = Pattern.compile(yourRegex).matcher(yourString);
    matcher.find();
    

    and then use the methods provided by Matcher (such as group(), start() and end()) to work with the text that was found, or handle the case where find() returned false.
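
    As a rough illustration of find(), the sketch below collects every run of digits from the question's sample string. The class name FindSketch, the method digitRuns, and the choice of \d+ as the pattern are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class showing repeated Matcher.find() calls.
class FindSketch {
    /** Collects every run of digits in the input, in order of appearance. */
    static List<String> digitRuns(String input) {
        Matcher matcher = Pattern.compile("\\d+").matcher(input);
        List<String> runs = new ArrayList<>();
        // Each find() call moves on to the next match and returns false when none is left.
        while (matcher.find()) {
            runs.add(matcher.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        // prints "[17, 11, 15, 52, 25, 000000000]"
        System.out.println(digitRuns("17-MAR-11 15.52.25.000000000"));
    }
}
```

    Note that find() looks for the pattern anywhere in the input, while matches() requires the whole input to fit the pattern, so matches() is usually the closer analogue of a full sscanf format.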

  • 2020-11-27 06:53

    2019 answer: Java's Scanner is flexible for reading a wide range of formats. But if your format has simple {%d, %f, %s} fields then you can scan easily with this small class (~90 lines):

    import java.util.ArrayList;
    
    /**
     * Basic C-style string formatting and scanning.
     * The format strings can contain %d, %f and %s codes.
     * @author Adam Gawne-Cain
     */
    public class CFormat {
        private static boolean accept(char t, char c, int i) {
            if (t == 'd')
                return "0123456789".indexOf(c) >= 0 || i == 0 && c == '-';
            else if (t == 'f')
                return "-0123456789.+Ee".indexOf(c) >= 0;
            else if (t == 's')
                return Character.isLetterOrDigit(c);
            throw new RuntimeException("Unknown format code: " + t);
        }
    
        /**
         * Returns string formatted like C, or throws exception if anything wrong.
         * @param fmt format specification
         * @param args values to format
         * @return string formatted like C.
         */
        public static String printf(String fmt, Object... args) {
            int a = 0;
            StringBuilder sb = new StringBuilder();
            int n = fmt.length();
            for (int i = 0; i < n; i++) {
                char c = fmt.charAt(i);
                if (c == '%') {
                    char t = fmt.charAt(++i);
                    if (t == 'd')
                        sb.append(((Number) args[a++]).intValue());
                    else if (t == 'f')
                        sb.append(((Number) args[a++]).doubleValue());
                    else if (t == 's')
                        sb.append(args[a++]);
                    else if (t == '%')
                        sb.append(t);
                    else
                        throw new RuntimeException("Unknown format code: " + t);
                } else
                    sb.append(c);
            }
            return sb.toString();
        }
    
        /**
         * Returns scanned values, or throws exception if anything wrong.
         * @param fmt format specification
         * @param str string to scan
         * @return scanned values
         */
        public static Object[] scanf(String fmt, String str) {
            ArrayList<Object> ans = new ArrayList<>();
            int s = 0;
            int ns = str.length();
            int n = fmt.length();
            for (int i = 0; i < n; i++) {
                char c = fmt.charAt(i);
                if (c == '%') {
                    char t = fmt.charAt(++i);
                    if (t=='%')
                        c=t;
                    else {
                        int s0 = s;
                        while ((s == s0 || s < ns) && accept(t, str.charAt(s), s - s0))
                            s++;
                        String sub = str.substring(s0, s);
                        if (t == 'd')
                            ans.add(Integer.parseInt(sub));
                        else if (t == 'f')
                            ans.add(Double.parseDouble(sub));
                        else
                            ans.add(sub);
                        continue;
                    }
                }
                if (s >= ns || str.charAt(s++) != c)
                    throw new RuntimeException("Input does not match format at literal '" + c + "'");
            }
            if (s < ns)
                throw new RuntimeException("Unmatched characters at end of string");
            return ans.toArray();
        }
    }
    

    For example, the OP's case can be handled like this:

        // Example of "CFormat.scanf"
        String str = "17-MAR-11 15.52.25.000000000";
        Object[] a = CFormat.scanf("%d-%s-%d %d.%d.%f", str);
    
        // Pick out scanned fields
        int day = (Integer) a[0];
        String month = (String) a[1];
        int year = (Integer) a[2];
        int hour = (Integer) a[3];
        int min = (Integer) a[4];
        double sec = (Double) a[5];
    
        // Example of "CFormat.printf"  
        System.out.println(CFormat.printf("Got day=%d month=%s year=%d hour=%d min=%d sec=%f\n", day, month, year, hour, min, sec));
    