How would I create a java.util.UUID from a string with no dashes?
"5231b533ba17478798a3f2df37de2aD7" => #uuid "5231b533-ba17-4787-98a3-f2df37de2aD7"
Optimized version of @maerics's answer:
String[] digitsList= {
"daa70a7ffa904841bf9a81a67bdfdb45",
"529737c950e6428f80c0bac104668b54",
"5673c26e2e8f4c129906c74ec634b807",
"dd5a5ee3a3c44e4fb53d2e947eceeda5",
"faacc25d264d4e9498ade7a994dc612e",
"9a1d322dc70349c996dc1d5b76b44a0a",
"5fcfa683af5148a99c1bd900f57ea69c",
"fd9eae8272394dfd8fd42d2bc2933579",
"4b14d571dd4a4c9690796da318fc0c3a",
"d0c88286f24147f4a5d38e6198ee2d18"
};
//Use compiled pattern to improve performance of bulk operations
Pattern pattern = Pattern.compile("(\\w{8})(\\w{4})(\\w{4})(\\w{4})(\\w{12})");
for (int i = 0; i < digitsList.length; i++)
{
String uuid = pattern.matcher(digitsList[i]).replaceAll("$1-$2-$3-$4-$5");
System.out.println(uuid);
}
A much (~ 900%) faster solution compared to using regexps and string manipulation is to just parse the hex string into 2 longs and create the UUID instance from those:
(defn uuid-from-string
"Converts a 32digit hex string into java.util.UUID"
[hex]
(java.util.UUID.
(Long/parseUnsignedLong (subs hex 0 16) 16)
(Long/parseUnsignedLong (subs hex 16) 16)))
The regexp solution is probably faster, but here is an alternative you can also look at :)
String withoutDashes = "44e128a5-ac7a-4c9a-be4c-224b6bf81b20".replaceAll("-", "");
BigInteger bi1 = new BigInteger(withoutDashes.substring(0, 16), 16);
BigInteger bi2 = new BigInteger(withoutDashes.substring(16, 32), 16);
UUID uuid = new UUID(bi1.longValue(), bi2.longValue());
String withDashes = uuid.toString();
By the way, here is the conversion from 16 binary bytes to a UUID:
InputStream is = ..binary input..;
byte[] bytes = IOUtils.toByteArray(is);
ByteBuffer bb = ByteBuffer.wrap(bytes);
UUID uuidWithDashesObj = new UUID(bb.getLong(), bb.getLong());
String uuidWithDashes = uuidWithDashesObj.toString();
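As a self-contained variant of the byte conversion above, here is a minimal sketch that skips the InputStream and works directly on a byte array. The class name `UuidBytes` and its method names are hypothetical; the length check and the reverse conversion are additions for illustration, not part of the original answer.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidBytes {
    /** Builds a UUID from exactly 16 bytes, high-order half first. */
    public static UUID fromBytes(byte[] bytes) {
        if (bytes == null || bytes.length != 16) {
            throw new IllegalArgumentException("expected exactly 16 bytes");
        }
        // ByteBuffer is big-endian by default, so the first 8 bytes
        // become the most significant half of the UUID.
        ByteBuffer bb = ByteBuffer.wrap(bytes);
        long mostSigBits = bb.getLong();
        long leastSigBits = bb.getLong();
        return new UUID(mostSigBits, leastSigBits);
    }

    /** Reverse direction: writes the two 64-bit halves into 16 bytes. */
    public static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }
}
```

Reading the two halves into local variables makes the ordering explicit instead of relying on argument evaluation order.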
Here is an example that is faster because it doesn't use regexp.
public class Example1 {
/**
* Get a UUID with hyphens from 32 char hexadecimal.
*
* @param string a hexadecimal string
* @return a UUID string
*/
public static String toUuidString(String string) {
if (string == null || string.length() != 32) {
throw new IllegalArgumentException("invalid input string!");
}
char[] input = string.toCharArray();
char[] output = new char[36];
System.arraycopy(input, 0, output, 0, 8);
System.arraycopy(input, 8, output, 9, 4);
System.arraycopy(input, 12, output, 14, 4);
System.arraycopy(input, 16, output, 19, 4);
System.arraycopy(input, 20, output, 24, 12);
output[8] = '-';
output[13] = '-';
output[18] = '-';
output[23] = '-';
return new String(output);
}
public static void main(String[] args) {
String example = "daa70a7ffa904841bf9a81a67bdfdb45";
String canonical = toUuidString(example);
UUID uuid = UUID.fromString(canonical);
}
}
java.util.UUID.fromString(
"5231b533ba17478798a3f2df37de2aD7"
.replaceFirst(
"(\\p{XDigit}{8})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}+)", "$1-$2-$3-$4-$5"
)
).toString()
5231b533-ba17-4787-98a3-f2df37de2ad7
Or parse each half of the hexadecimal string as a long integer, and pass both to the constructor of UUID.
UUID uuid = new UUID ( long1 , long2 ) ;
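A minimal sketch of that two-longs approach, assuming Java 8 or later for `Long.parseUnsignedLong`. The class name `HexToUuid` and the method `fromHex` are hypothetical names chosen for illustration.

```java
import java.util.UUID;

public class HexToUuid {
    /** Parses a 32-character hex string (no hyphens) into a UUID. */
    public static UUID fromHex(String hex) {
        if (hex == null || hex.length() != 32) {
            throw new IllegalArgumentException("expected 32 hex characters");
        }
        // parseUnsignedLong accepts the full unsigned 64-bit range;
        // Long.parseLong would reject halves whose top bit is set.
        long mostSigBits = Long.parseUnsignedLong(hex.substring(0, 16), 16);
        long leastSigBits = Long.parseUnsignedLong(hex.substring(16), 16);
        return new UUID(mostSigBits, leastSigBits);
    }

    public static void main(String[] args) {
        // prints 5231b533-ba17-4787-98a3-f2df37de2ad7
        System.out.println(fromHex("5231b533ba17478798a3f2df37de2aD7"));
    }
}
```

Non-hex characters cause a `NumberFormatException` from `parseUnsignedLong`, so malformed input fails loudly rather than producing a wrong UUID.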
A UUID is a 128-bit value. A UUID is not actually made up of letters and digits, it is made up of bits. You can think of it as describing a very, very large number.
We could display those bits as one hundred and twenty-eight 0 and 1 characters.
0111 0100 1101 0010 0101 0001 0101 0110 0110 0000 1110 0110 0100 0100 0100 1100 1010 0001 0111 0111 1010 1001 0110 1110 0110 0111 1110 1100 1111 1100 0101 1111
Humans do not easily read bits, so for convenience we usually represent the 128-bit value as a hexadecimal string made up of letters and digits.
74d25156-60e6-444c-a177-a96e67ecfc5f
Such a hex string is not the UUID itself, only a human-friendly representation. The hyphens are added per the UUID spec as canonical formatting, but are optional.
74d2515660e6444ca177a96e67ecfc5f
By the way, the UUID spec clearly states that lowercase letters must be used when generating the hex string while uppercase should be tolerated as input. Unfortunately, many implementations violate that lowercase-generation rule, including those from Apple, Microsoft, and others. See my blog post.
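To make the "bits, not text" point concrete in Java, here is a small sketch showing that java.util.UUID stores two 64-bit halves, tolerates uppercase input, and renders lowercase hex. The class name `UuidBits` and the helper `halves` are hypothetical.

```java
import java.util.UUID;

public class UuidBits {
    /** Returns the two 64-bit halves of a UUID as hex strings, high half first. */
    static String[] halves(UUID uuid) {
        return new String[] {
            Long.toHexString(uuid.getMostSignificantBits()),
            Long.toHexString(uuid.getLeastSignificantBits())
        };
    }

    public static void main(String[] args) {
        // Uppercase hex is accepted as input...
        UUID uuid = UUID.fromString("74D25156-60E6-444C-A177-A96E67ECFC5F");
        // ...but toString() renders lowercase: 74d25156-60e6-444c-a177-a96e67ecfc5f
        System.out.println(uuid);
        String[] h = halves(uuid);
        System.out.println(h[0]); // 74d2515660e6444c
        System.out.println(h[1]); // a177a96e67ecfc5f
    }
}
```

Note that `Long.toHexString` drops leading zeros, so a half that starts with a zero digit would print shorter than 16 characters.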
The following refers to Java, not Clojure.
In Java 7 (and earlier), you may use the java.util.UUID class to instantiate a UUID based on a hex string with hyphens as input. Example:
java.util.UUID uuidFromHyphens = java.util.UUID.fromString("6f34f25e-0b0d-4426-8ece-a8b3f27f4b63");
System.out.println( "UUID from string with hyphens: " + uuidFromHyphens );
However, that UUID class fails when given a hex string without hyphens as input. This failure is unfortunate as the UUID spec does not require the hyphens in a hex string representation. This fails:
java.util.UUID uuidFromNoHyphens = java.util.UUID.fromString("6f34f25e0b0d44268ecea8b3f27f4b63");
One workaround is to format the hex string to add the canonical hyphens. Here's my attempt at using regex to format the hex string. Beware… This code works, but I'm no regex expert. You should make this code more robust, say checking that the length of the string is 32 characters before formatting and 36 after.
// -----| With Hyphens |----------------------
java.util.UUID uuidFromHyphens = java.util.UUID.fromString( "6f34f25e-0b0d-4426-8ece-a8b3f27f4b63" );
System.out.println( "UUID from string with hyphens: " + uuidFromHyphens );
System.out.println();
// -----| Without Hyphens |----------------------
String hexStringWithoutHyphens = "6f34f25e0b0d44268ecea8b3f27f4b63";
// Use regex to format the hex string by inserting hyphens in the canonical format: 8-4-4-4-12
String hexStringWithInsertedHyphens = hexStringWithoutHyphens.replaceFirst( "([0-9a-fA-F]{8})([0-9a-fA-F]{4})([0-9a-fA-F]{4})([0-9a-fA-F]{4})([0-9a-fA-F]+)", "$1-$2-$3-$4-$5" );
System.out.println( "hexStringWithInsertedHyphens: " + hexStringWithInsertedHyphens );
java.util.UUID myUuid = java.util.UUID.fromString( hexStringWithInsertedHyphens );
System.out.println( "myUuid: " + myUuid );
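The robustness suggested above could be sketched as follows. This is one possible approach, not the only one: it uses `Matcher.matches()` so the whole input must be exactly 32 hex digits, which makes a separate length check unnecessary. The class name `HyphenlessUuidParser` and the method `parse` are hypothetical.

```java
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HyphenlessUuidParser {
    // Exactly 32 hex digits, captured in the canonical 8-4-4-4-12 grouping.
    private static final Pattern HEX_32 = Pattern.compile(
        "(\\p{XDigit}{8})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{12})");

    public static UUID parse(String hex) {
        if (hex == null) {
            throw new IllegalArgumentException("input is null");
        }
        Matcher m = HEX_32.matcher(hex);
        // matches() requires the entire input to fit the pattern, so
        // wrong lengths and non-hex characters are both rejected here.
        if (!m.matches()) {
            throw new IllegalArgumentException("expected 32 hex digits: " + hex);
        }
        // Five groups plus four hyphens always gives 36 characters.
        String canonical = String.join("-",
            m.group(1), m.group(2), m.group(3), m.group(4), m.group(5));
        return UUID.fromString(canonical);
    }
}
```

The last group uses `{12}` instead of `+`, so an overlong input is rejected rather than silently passed through to `UUID.fromString`.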
You might find this alternative syntax more readable, using Posix notation within the regex where \\p{XDigit} takes the place of [0-9a-fA-F] (see Pattern doc):
String hexStringWithInsertedHyphens = hexStringWithoutHyphens.replaceFirst( "(\\p{XDigit}{8})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}+)", "$1-$2-$3-$4-$5" );
Complete example.
java.util.UUID uuid =
java.util.UUID.fromString (
"5231b533ba17478798a3f2df37de2aD7"
.replaceFirst (
"(\\p{XDigit}{8})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}+)",
"$1-$2-$3-$4-$5"
)
);
System.out.println ( "uuid.toString(): " + uuid );
uuid.toString(): 5231b533-ba17-4787-98a3-f2df37de2ad7
Another solution would be something similar to Pawel's solution, but without creating new Strings and solving only the question's problem. If performance is a concern, avoid regex/split/replaceAll and UUID.fromString like the plague.
String hyphenlessUuid = in.nextString();
BigInteger bigInteger = new BigInteger(hyphenlessUuid, 16);
UUID uuid = new UUID(bigInteger.shiftRight(64).longValue(), bigInteger.longValue());