How do I create a URL shortener?

我寻月下人不归 2020-11-22 05:11

I want to create a URL shortener service where you can write a long URL into an input field and the service shortens the URL to "http://www.example.org/abcdef".

30 answers
  • 2020-11-22 05:40

    Why not just generate a random string and append it to the base URL? This is a very simplified version of doing this in C#.

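    // Requires: using System;
    static readonly string chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890";
    static readonly string baseUrl = "https://google.com/";
    // One shared generator: creating a new Random on every call can repeat seeds
    // on older runtimes, which is what the Thread.Sleep workaround was hiding.
    // Random is not thread-safe; lock around it if this is called concurrently.
    static readonly Random rnd = new Random();
    
    private static string RandomString(int length)
    {
        char[] s = new char[length];
        for (int x = 0; x < length; x++)
        {
            // Pick one character uniformly from the alphabet
            s[x] = chars[rnd.Next(chars.Length)];
        }
    
        return new string(s);
    }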
    

    Then just append the random string to the base URL:

    string tinyURL = baseUrl + RandomString(5);
    

    Remember that this is a very simplified version, and the RandomString method can produce duplicate strings. In production you would need to account for duplicates so that every short URL is unique. I have some code that handles duplicates by querying a database table, which I could share if anyone is interested.
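
    As a rough illustration of the duplicate check mentioned above (a sketch in Node.js, not the code the author offers to share): generate a code, look it up, and retry on collision. The `Map` stands in for the database table, and `shorten`, `randomCode`, `BASE_URL`, and `MAX_ATTEMPTS` are hypothetical names.

    ```javascript
    const crypto = require("crypto");

    const CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890";
    const BASE_URL = "https://example.org/"; // placeholder base URL
    const MAX_ATTEMPTS = 10;                 // assumed retry limit
    const table = new Map();                 // stand-in for a database table: code -> long URL

    // Build a random code by drawing each character uniformly from CHARS.
    function randomCode(length) {
      let code = "";
      for (let i = 0; i < length; i++) {
        code += CHARS[crypto.randomInt(CHARS.length)];
      }
      return code;
    }

    // Store longUrl under a code that is not already taken and return the short URL.
    function shorten(longUrl, length = 5) {
      for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
        const code = randomCode(length);
        if (!table.has(code)) {
          table.set(code, longUrl);
          return BASE_URL + code;
        }
      }
      throw new Error("could not find a free code; consider longer codes");
    }
    ```

    With a real database, a unique constraint on the code column plus a retry on a constraint violation would also close the gap between the lookup and the insert.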

  • 2020-11-22 05:41

    Function based on Xeoncross's class:

    // Encodes a non-negative integer as a base-62 string, or decodes a base-62 string to an integer.
    // Caveat: an all-digit string such as "123" passes is_numeric() and is encoded, not decoded.
    function shortly($input){
        $dictionary = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z','0','1','2','3','4','5','6','7','8','9'];
        if($input === 0)
            return $dictionary[0];
        $base = count($dictionary);
        if(is_numeric($input)){
            // Encode: collect base-62 digits from least to most significant
            $input = (int)$input;
            $result = [];
            while($input > 0){
                $result[] = $dictionary[$input % $base];
                $input = intdiv($input, $base);
            }
            return join("", array_reverse($result));
        }
        // Decode: accumulate the positional value of each character
        $i = 0;
        foreach(str_split($input) as $char){
            $pos = array_search($char, $dictionary);
            $i = $i * $base + $pos;
        }
        return $i;
    }
    
  • 2020-11-22 05:41

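    Here is a Node.js implementation, likely similar to what bit.ly does, that generates a highly random seven-character string.

    It uses Node.js crypto to generate 25 random bytes (a 50-character hex string) and then picks seven characters from that string, rather than picking seven characters directly from a fixed alphabet. Note that the result therefore contains only hex digits (0-9, a-f).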

    var crypto = require("crypto");

    exports.shortURL = {
        getShortURL: function () {
            var sURL = '',
                // 25 random bytes rendered as a 50-character hex string
                _rand = crypto.randomBytes(25).toString('hex');
            // Pick seven characters at random positions in that string
            for (var i = 0; i < 7; i++)
                sURL += _rand.charAt(Math.floor(Math.random() * _rand.length));
            return sURL;
        }
    };
    
  • 2020-11-22 05:45

    These are my initial thoughts. More thinking can be done, or a simulation can be run, to see whether this works well or needs improvement.

    My answer is to store the long URL in a database and use its ID, from 0 to 9999999999999999 (or however large a number is needed).

    But using the IDs 0 to 9999999999999999 directly can be an issue, because:

    1. The ID can be shorter if we use hexadecimal, or even base62 or base64 (base64 as YouTube does, using A-Z, a-z, 0-9, _ and -).
    2. If the IDs increase sequentially from 0 to 9999999999999999, attackers can visit them in order and learn which URLs people are sending each other, which is a privacy issue.

    We can do this:

    1. Have one allocating server hand out IDs 0 to 999 to one server, Server A, so that Server A now holds 1000 IDs. If there are 20 or 200 servers constantly wanting new IDs, they don't have to ask for each ID individually; each asks once for a batch of 1000.
    2. Reverse the bits of each ID. For ID 1, for example, 000...00000001 becomes 10000...000, so that after conversion to base64 consecutive IDs no longer look sequential.
    3. Use XOR to flip bits of the final ID. For example, XOR with 0xD5AA96...2373 (acting as a secret key): wherever the key has a 1 bit, the corresponding bit of the ID is flipped. This makes the IDs even harder to guess and makes them appear more random.
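
    Steps 2 and 3 might look roughly like this in Node.js. This is a sketch under assumptions, not a definitive implementation: IDs are taken to be 32-bit unsigned integers, `SECRET_KEY` is a made-up placeholder (the key in step 3 is elided), and all function names are hypothetical. The transformation is reversible, so the original ID can be recovered for the database lookup.

    ```javascript
    // Placeholder 32-bit key; an assumption, not the key from the answer.
    const SECRET_KEY = 0x5bd1e995;
    // URL-safe base64 alphabet: A-Z a-z 0-9 - _
    const ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

    // Reverse the 32 bits of an unsigned integer (step 2).
    function reverseBits32(n) {
      let r = 0;
      for (let i = 0; i < 32; i++) {
        r = (r << 1) | (n & 1);
        n >>>= 1;
      }
      return r >>> 0; // back to unsigned
    }

    // Encode an unsigned 32-bit integer, 6 bits per character.
    function encode64(n) {
      let s = "";
      do {
        s = ALPHABET[n & 63] + s;
        n >>>= 6;
      } while (n > 0);
      return s;
    }

    function decode64(s) {
      let n = 0;
      for (const ch of s) n = n * 64 + ALPHABET.indexOf(ch);
      return n;
    }

    // ID -> short code: reverse bits, XOR with the key (step 3), then encode.
    function idToShort(id) {
      return encode64((reverseBits32(id) ^ SECRET_KEY) >>> 0);
    }

    // Short code -> ID: undo the steps in the opposite order.
    function shortToId(code) {
      return reverseBits32((decode64(code) ^ SECRET_KEY) >>> 0);
    }
    ```

    Bit reversal plus XOR only obscures the sequence; anyone who learns the key can still enumerate IDs.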

    Following this scheme, either the single allocating server can form the final IDs, or the 20 or 200 requesting servers can. The allocating server has to use a lock or semaphore to prevent two requesting servers from getting the same batch (if it accepts only one connection at a time, that already solves the problem). We also don't want the queue of servers waiting for an allocation to grow too long, which is why allocating 1000 or 10000 IDs at a time helps.
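
    The batch allocation could be sketched as follows, again in Node.js and with hypothetical class names (`BatchAllocator`, `IdSource`). Everything runs in one process here, so no lock is needed; across real servers the `allocate` step is the part that would need the lock or a single-connection queue.

    ```javascript
    // Central allocator: hands out disjoint, consecutive ranges of IDs.
    class BatchAllocator {
      constructor(batchSize = 1000) {
        this.batchSize = batchSize;
        this.next = 0; // first ID of the next batch to hand out
      }

      allocate() {
        const start = this.next;
        this.next += this.batchSize;
        return { start, end: start + this.batchSize - 1 };
      }
    }

    // One requesting server: consumes its batch and asks for another when it runs out.
    class IdSource {
      constructor(allocator) {
        this.allocator = allocator;
        this.current = 0;
        this.end = -1; // empty batch, forces an allocation on first use
      }

      nextId() {
        if (this.current > this.end) {
          const batch = this.allocator.allocate();
          this.current = batch.start;
          this.end = batch.end;
        }
        return this.current++;
      }
    }
    ```

    With a batch size of 1000, each server contacts the allocator once per 1000 new URLs instead of once per URL.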

  • 2020-11-22 05:47

    Take a look at https://hashids.org/. It is open source and available in many languages.

    Their page outlines some of the pitfalls of other approaches.

  • 2020-11-22 05:47

    Why not just translate your id to a string? You just need a function that maps a digit between, say, 0 and 61 to a single letter (upper/lower case) or digit. Then apply this to create, say, 4-letter codes, and you've got 14.7 million URLs covered.
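
    A minimal sketch of that mapping in Node.js, assuming the alphabet order below and hypothetical function names `idToCode` and `codeToId`. With 62 symbols and 4 positions there are 62^4 = 14,776,336 distinct codes, which is where the 14.7 million figure comes from.

    ```javascript
    // Digit values 0..61 map to these characters (the order is an arbitrary choice).
    const ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Convert a numeric ID to a fixed-width base-62 code, left-padded with the zero digit.
    function idToCode(id, width = 4) {
      if (!Number.isInteger(id) || id < 0 || id >= ALPHABET.length ** width) {
        throw new RangeError("id does not fit in " + width + " characters");
      }
      let code = "";
      for (let i = 0; i < width; i++) {
        code = ALPHABET[id % 62] + code;
        id = Math.floor(id / 62);
      }
      return code;
    }

    // Convert a code back to the numeric ID used as the database key.
    function codeToId(code) {
      let id = 0;
      for (const ch of code) {
        const digit = ALPHABET.indexOf(ch);
        if (digit < 0) throw new RangeError("invalid character: " + ch);
        id = id * 62 + digit;
      }
      return id;
    }
    ```

    Once the 4-character space is exhausted, increasing `width` to 5 gives 62 times as many codes.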
