Java regex for UUID

半阙折子戏 2021-01-17 20:34

I want to parse a String which has UUID in the below format

\"<urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce>\"

I have tried

2 answers
  • 2021-01-17 21:04

    If this format never changes, I think a faster way is to use the String.substring() method. Example:

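    // "&lt;urn:uuid:" is 13 characters long and the UUID itself is 36, hence 13..49.
    String val = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
    String sUuid = val.substring(13, 49);
    UUID uuid = UUID.fromString(sUuid); // needs: import java.util.UUID;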
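
    The hard-coded indices above only work while the prefix stays exactly the same. As a minimal sketch, assuming the input always has the form shown in the question, the offsets can be derived from the prefix and the format checked before cutting. The class name UuidSubstringExample and the method parse are made up for this illustration.

```java
import java.util.UUID;

class UuidSubstringExample {
    // Assumed fixed prefix and suffix, taken from the sample input in the question.
    private static final String PREFIX = "&lt;urn:uuid:";
    private static final String SUFFIX = "&gt;";
    private static final int UUID_LENGTH = 36;

    /** Extracts the UUID from a string of the form "&lt;urn:uuid:...&gt;". */
    static UUID parse(String val) {
        boolean lengthOk = val.length() == PREFIX.length() + UUID_LENGTH + SUFFIX.length();
        if (!lengthOk || !val.startsWith(PREFIX) || !val.endsWith(SUFFIX)) {
            throw new IllegalArgumentException("Unexpected format: " + val);
        }
        int begin = PREFIX.length(); // 13 for this prefix
        return UUID.fromString(val.substring(begin, begin + UUID_LENGTH));
    }

    public static void main(String[] args) {
        UUID uuid = parse("&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;");
        System.out.println(uuid);
    }
}
```

    The extra checks cost a few comparisons, but they turn a silently wrong substring into an explicit exception if the format ever changes.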
    

    Internally, the String class (java.lang.String) stores its data in a char array (this is the layout in Java 8 and earlier):

    public final class String
        implements java.io.Serializable, Comparable<String>, CharSequence {
    ...
        /** The value is used for character storage. */
        private final char value[];
    ...
    }
    

    The method 'String substring(int beginIndex, int endIndex)' copies the array elements from the begin index up to (but not including) the end index and creates a new String from the new array. Copying an array is a very fast operation.
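
    To make the copying step concrete, here is a rough illustration of the idea using Arrays.copyOfRange. This is not the JDK's actual source; the class SubstringCopySketch and the method substringByCopy are hypothetical names used only for this sketch.

```java
import java.util.Arrays;

class SubstringCopySketch {
    /** Builds a new String from a copy of the characters in [beginIndex, endIndex). */
    static String substringByCopy(String s, int beginIndex, int endIndex) {
        char[] all = s.toCharArray();                                // the full character data
        char[] part = Arrays.copyOfRange(all, beginIndex, endIndex); // copy only the wanted range
        return new String(part);                                     // new String backed by the copy
    }

    public static void main(String[] args) {
        String val = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
        // Same indices as in the substring example above.
        System.out.println(substringByCopy(val, 13, 49));
    }
}
```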

  • 2021-01-17 21:06

    Your example of a faster regex uses < where the input has &lt;, which is confusing.

    Regarding speed: first, your UUID is hexadecimal, so don't match with A-Z but rather a-f. Second, you give no indication that the case is mixed, so don't use case-insensitive matching; write the correct case in the range instead.

    You don't explain if you need the part preceding the UUID. If not, don't include .*?, and you may as well write the literals for re1 and re2 together in your final Pattern. There's no indication you need DOTALL either.

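    // 8 hex digits, four "-xxxx" groups, then 8 more hex digits: the last
    // "-xxxx" group plus those 8 digits form the final 12-digit block.
    private static final Pattern splitter =
      Pattern.compile("([a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8})");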
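
    As a usage sketch, assuming lowercase hex as discussed above, a precompiled pattern can be applied with Matcher.find(). The pattern below spells out the 8-4-4-4-12 grouping explicitly, which matches the same strings as the compact form. The names UuidRegexExample and findUuid are invented for this example.

```java
import java.util.Optional;
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UuidRegexExample {
    // Lowercase hex only, grouped 8-4-4-4-12; compiled once and reused.
    private static final Pattern UUID_PATTERN = Pattern.compile(
            "[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}");

    /** Returns the first UUID found anywhere in the input, if any. */
    static Optional<UUID> findUuid(String input) {
        Matcher m = UUID_PATTERN.matcher(input);
        if (m.find()) {
            return Optional.of(UUID.fromString(m.group()));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(findUuid("&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;"));
        System.out.println(findUuid("no uuid here"));
    }
}
```

    Because the character class is lowercase only, an upper-case UUID is not matched; widen the range to a-fA-F if mixed case can occur.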
    

    Alternatively, if you have measured your regular expression and found it too slow, you might try another approach. For example:
    Is each UUID preceded by "uuid:" as in your example? If so, you can

    1. find the first index of "uuid:" as i, then
    2. substring 0 to i+5 [assuming you needed it at all], and
    3. substring i+5 to i+41, if I counted that right (36 characters in length).
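
    The three steps above can be sketched as follows, assuming every input contains the literal marker "uuid:" followed by a 36-character UUID. The class UuidIndexOfExample and the method split are hypothetical names.

```java
import java.util.UUID;

class UuidIndexOfExample {
    private static final String MARKER = "uuid:";
    private static final int UUID_LENGTH = 36;

    /** Returns { text up to and including "uuid:", the 36 characters after it }. */
    static String[] split(String input) {
        int i = input.indexOf(MARKER);                              // step 1
        if (i < 0 || input.length() < i + MARKER.length() + UUID_LENGTH) {
            throw new IllegalArgumentException("No UUID found in: " + input);
        }
        int start = i + MARKER.length();                            // i + 5
        String prefix = input.substring(0, start);                  // step 2
        String uuid = input.substring(start, start + UUID_LENGTH);  // step 3: i+5 to i+41
        return new String[] { prefix, uuid };
    }

    public static void main(String[] args) {
        String[] parts = split("&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;");
        System.out.println(parts[0]);
        System.out.println(UUID.fromString(parts[1]));
    }
}
```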

    Along similar lines your faster regex could be:

    private static final Pattern URN_UUID_PATTERN =
        Pattern.compile("^&lt;urn:uuid:(.{36})&gt;");
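
    A short usage sketch for the anchored pattern, assuming the input starts with the literal text "&lt;urn:uuid:". Note that (.{36}) captures any 36 characters without checking that they are hex, so the result is passed to UUID.fromString only in the demo. The names UrnUuidPatternExample and extract are made up for this illustration.

```java
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrnUuidPatternExample {
    private static final Pattern URN_UUID_PATTERN =
            Pattern.compile("^&lt;urn:uuid:(.{36})&gt;");

    /** Returns the 36 captured characters, or null when the input does not match. */
    static String extract(String input) {
        Matcher m = URN_UUID_PATTERN.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String captured = extract("&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;");
        System.out.println(UUID.fromString(captured));
    }
}
```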
    

    On the other hand, if all your input strings start with those exact characters, there is no need for step 1 in the previous suggestion; just use input.substring(13, 49).
