How do I create a URL shortener?

后端 未结 30 2039
我寻月下人不归
我寻月下人不归 2020-11-22 05:11

I want to create a URL shortener service where you can write a long URL into an input field and the service shortens the URL to \"http://www.example.org/abcdef\

30条回答
  •  旧巷少年郎
    2020-11-22 05:45

    This is my initial thoughts, and more thinking can be done, or some simulation can be made to see if it works well or any improvement is needed:

    My answer is to remember the long URL in the database, and use the ID 0 to 9999999999999999 (or however large the number is needed).

    But the ID 0 to 9999999999999999 can be an issue, because

    1. it can be shorter if we use hexadecimal, or even base62 or base64. (base64 just like YouTube using A-Z a-z 0-9 _ and -)
    2. if it increases from 0 to 9999999999999999 uniformly, then hackers can visit them in that order and know what URLs people are sending each other, so it can be a privacy issue

    We can do this:

    1. have one server allocate 0 to 999 to one server, Server A, so now Server A has 1000 of such IDs. So if there are 20 or 200 servers constantly wanting new IDs, it doesn't have to keep asking for each new ID, but rather asking once for 1000 IDs
    2. for the ID 1, for example, reverse the bits. So 000...00000001 becomes 10000...000, so that when converted to base64, it will be non-uniformly increasing IDs each time.
    3. use XOR to flip the bits for the final IDs. For example, XOR with 0xD5AA96...2373 (like a secret key), and the some bits will be flipped. (whenever the secret key has the 1 bit on, it will flip the bit of the ID). This will make the IDs even harder to guess and appear more random

    Following this scheme, the single server that allocates the IDs can form the IDs, and so can the 20 or 200 servers requesting the allocation of IDs. The allocating server has to use a lock / semaphore to prevent two requesting servers from getting the same batch (or if it is accepting one connection at a time, this already solves the problem). So we don't want the line (queue) to be too long for waiting to get an allocation. So that's why allocating 1000 or 10000 at a time can solve the issue.

提交回复
热议问题