How to generate a random alpha-numeric string

忘掉有多难 2020-11-21 05:38

I've been looking for a simple Java algorithm to generate a pseudo-random alpha-numeric string. In my situation it would be used as a unique session/key identifie

30 answers
  •  余生分开走
    2020-11-21 06:02

    This is easily achievable without any external libraries.

    1. Cryptographic Pseudo Random Data Generation (PRNG)

    First you need a cryptographically secure PRNG. Java provides SecureRandom for this; it typically draws on the best entropy source available on the machine (e.g. /dev/random).

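    import java.security.SecureRandom;

    int byteLength = 16; // token length in bytes; 16 bytes = 128 bits
    SecureRandom rnd = new SecureRandom();
    byte[] token = new byte[byteLength];
    rnd.nextBytes(token); // fills the array with random bytes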
    

    Note: SecureRandom is the slowest but most secure way to generate random bytes in Java. I recommend ignoring performance here, since it usually has no real impact on your application unless you have to generate millions of tokens per second.

    2. Required Space of Possible Values

    Next you have to decide "how unique" your token needs to be. The whole and only point of considering entropy is to make sure that the system can resist brute force attacks: the space of possible values must be so large that any attacker could only try a negligible proportion of the values in non-ludicrous time [1].

    Unique identifiers such as random (version 4) UUIDs have 122 bits of entropy (i.e., 2^122 ≈ 5.3 x 10^36 possible values). On the chance of collision: "(...) for there to be a one in a billion chance of duplication, 103 trillion version 4 UUIDs must be generated" [2]. We will choose 128 bits since it fits exactly into 16 bytes and is considered more than sufficient for uniqueness in all but the most extreme use cases, so you don't have to think about duplicates. Here is a simple comparison table of entropy including simple analysis of the birthday problem.

    For simple requirements, 8 or 12 byte length might suffice, but with 16 bytes you are on the "safe side".
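    The birthday-problem estimate behind these numbers can be sketched in a few lines. For n uniformly random tokens of b bits, the collision probability is approximately 1 - exp(-n(n-1) / 2^(b+1)). The class and method names below are made up for illustration; this is a rough approximation, not an exact computation.

```java
final class CollisionEstimate {
    /**
     * Approximate probability that at least two of n uniformly random
     * tokens of the given bit length collide (birthday approximation).
     */
    static double collisionProbability(double n, int bits) {
        double space = Math.pow(2.0, bits);                 // number of possible values
        double exponent = -(n * (n - 1.0)) / (2.0 * space);
        return -Math.expm1(exponent);                       // 1 - e^x, stable for tiny x
    }

    public static void main(String[] args) {
        double n = 1e9; // one billion tokens
        for (int bits : new int[] {64, 96, 122, 128}) {
            System.out.printf("%3d bits: %.3e%n", bits, collisionProbability(n, bits));
        }
    }
}
```

    Plugging in 103 trillion tokens at 122 bits gives roughly one in a billion, which matches the UUID figure quoted above.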

    And that's basically it. The last thing is to think about encoding so it can be represented as a printable text (read, a String).

    3. Binary to Text Encoding

    Typical encodings include:

    • Base64: every character encodes 6 bits, creating a 33% overhead. Fortunately there are standard implementations in Java 8+ (java.util.Base64) and Android (android.util.Base64). With older Java versions you can use any of the numerous third-party libraries. If you want your tokens to be URL-safe, use the URL-safe variant from RFC 4648 (which most implementations support). Example encoding 16 bytes with padding: XfJhfv3C0P6ag7y9VQxSbw==

    • Base32: every character encodes 5 bits, creating a 60% overhead (8 characters for every 5 bytes). It uses A-Z and 2-7, making it reasonably space efficient while being case-insensitive alpha-numeric. There isn't any standard implementation in the JDK. Example encoding 16 bytes without padding: WUPIL5DQTZGMF4D3NX5L7LNFOY

    • Base16 (hexadecimal): every character encodes 4 bits, requiring two characters per byte (i.e., 16 bytes create a string of length 32). Hexadecimal is therefore less space efficient than Base32, but it is safe to use in most contexts (e.g. URLs) since it only uses 0-9 and A-F. Example encoding 16 bytes: 4fa3dd0f57cb3bf331441ed285b27735.

    Additional encodings like Base85 and the exotic Base122 exist with better/worse space efficiency. You can create your own encoding (which basically most answers in this thread do), but I would advise against it, if you don't have very specific requirements. See more encoding schemes in the Wikipedia article.
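    Since the JDK ships no Base32 codec, here is a minimal sketch of an encoder using the standard RFC 4648 alphabet (A-Z, 2-7) without '=' padding. The class name Base32Sketch is made up for illustration. It implements the standard encoding rather than a home-brew one, but in production you would probably prefer a vetted library such as Apache Commons Codec or Guava.

```java
final class Base32Sketch {
    private static final char[] ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".toCharArray();

    /** Encodes bytes with the RFC 4648 Base32 alphabet, omitting '=' padding. */
    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder((data.length * 8 + 4) / 5);
        int buffer = 0;   // accumulates input bits; only the low bitsLeft bits are pending
        int bitsLeft = 0; // number of pending bits in buffer
        for (byte b : data) {
            buffer = (buffer << 8) | (b & 0xFF);
            bitsLeft += 8;
            while (bitsLeft >= 5) {
                // emit the top 5 pending bits as one character
                out.append(ALPHABET[(buffer >> (bitsLeft - 5)) & 0x1F]);
                bitsLeft -= 5;
            }
        }
        if (bitsLeft > 0) {
            // final partial group: pad with zero bits on the right
            out.append(ALPHABET[(buffer << (5 - bitsLeft)) & 0x1F]);
        }
        return out.toString();
    }
}
```

    With this sketch, 16 random bytes always encode to 26 characters, the same length as the Base32 example above.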

    4. Summary and Example

    • Use SecureRandom
    • Use at least 16 bytes (2^128 possible values)
    • Encode according to your requirements (usually hex or base32 if you need it to be alpha-numeric)

    Don't

    • ... use a home-brew encoding: code is easier to maintain and read when others can see which standard encoding you use, instead of puzzling over for loops that build the string one character at a time.
    • ... use UUID: the format itself gives no guarantees on randomness, you waste 6 bits of entropy (the fixed version and variant bits), and the string representation is verbose.
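    To make the UUID point concrete, here is a small sketch that inspects a random UUID using only java.util.UUID. The class name UuidOverhead is made up for illustration. It shows the fixed version and variant fields (4 + 2 = 6 bits that carry no entropy) and the 36-character string form that holds only 122 random bits.

```java
import java.util.UUID;

final class UuidOverhead {
    /** Summarizes the fixed fields and the string length of a UUID. */
    static String describe(UUID uuid) {
        return "version=" + uuid.version()      // 4 fixed bits
                + " variant=" + uuid.variant()  // 2 fixed bits for the standard variant
                + " length=" + uuid.toString().length();
    }

    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID();
        System.out.println(uuid);
        System.out.println(describe(uuid));
    }
}
```

    For comparison, 16 random bytes carry 128 bits of entropy in 32 hex characters, or 22 characters as unpadded Base64.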

    Example: Hexadecimal Token Generator

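    import java.math.BigInteger;
    import java.security.SecureRandom;

    public static String generateRandomHexToken(int byteLength) {
        SecureRandom secureRandom = new SecureRandom();
        byte[] token = new byte[byteLength];
        secureRandom.nextBytes(token);
        // Hexadecimal encoding, left-padded with zeros so the result always has
        // 2 * byteLength characters (BigInteger.toString(16) alone drops leading
        // zeros). Assumes byteLength >= 1.
        return String.format("%0" + (byteLength * 2) + "x", new BigInteger(1, token));
    }

    //generateRandomHexToken(16) -> 2189df7475e96aa3982dbeab266497cd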
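
    One thing to watch with BigInteger-based hex conversion: leading zero bytes do not survive toString(16), so a token can come out shorter than two characters per byte. A minimal sketch with a fixed input that starts with a zero byte shows the difference; the class and method names are made up for illustration.

```java
import java.math.BigInteger;

final class HexPaddingDemo {
    /** Plain BigInteger conversion: leading zero digits vanish. */
    static String unpadded(byte[] bytes) {
        return new BigInteger(1, bytes).toString(16);
    }

    /** Same conversion, left-padded to two hex characters per byte. */
    static String padded(byte[] bytes) {
        return String.format("%0" + (bytes.length * 2) + "x", new BigInteger(1, bytes));
    }

    public static void main(String[] args) {
        // Fixed sample input starting with a zero byte
        byte[] sample = {0x00, 0x0f, (byte) 0xa3, 0x7c};
        System.out.println(unpadded(sample)); // prints fa37c
        System.out.println(padded(sample));   // prints 000fa37c
    }
}
```

    With random tokens this happens about once in every 16 tokens (whenever the first hex digit is 0), so fixed-length consumers should use the padded form.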
    

    Example: Base64 Token Generator (URL Safe)

    import java.security.SecureRandom;
    import java.util.Base64;

    public static String generateRandomBase64Token(int byteLength) {
        SecureRandom secureRandom = new SecureRandom();
        byte[] token = new byte[byteLength];
        secureRandom.nextBytes(token);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(token); //base64 encoding
    }
    
    //generateRandomBase64Token(16) -> EEcCCAYuUcQk7IuzdaPzrg
    

    Example: Java CLI Tool

    If you want a ready-to-use CLI tool, you may use dice.

    Example: Related issue - Protect Your Current Ids

    If you already have an id you can use (e.g., a synthetic long in your entity), but don't want to publish the internal value, you can use this library to encrypt it and obfuscate it: https://github.com/patrickfav/id-mask

    IdMask idMask = IdMasks.forLongIds(Config.builder(key).build());
    String maskedId = idMask.mask(id);
    // Example: NPSBolhMyabUBdTyanrbqT8
    long originalId = idMask.unmask(maskedId);
    
