Django - short non-linear non-predictable ID in the URL

后端 未结 1 1502
天涯浪人
天涯浪人 2021-02-09 06:17

I know there are similar questions (like this, this, this and this) but I have specific requirements and looking for a less-expensive way to do the following (on Django 1.10.2):

相关标签:
1条回答
  • 2021-02-09 06:51

    After researching a lot of options on SO, blogs etc., I ended up doing the following:

    • Encoding the id to base32 only for the URLs and decoding it back in urls.py (using an edited version of Django’s util functions to encode to base 36 since I needed uppercase letters instead of lowercase).
    • Not storing the encoded id anywhere. Just encoding and decoding everytime on the fly.
    • Keeping the default id intact and using it as primary key.

    (good hints, posts and especially this comment helped a lot)

    What this solution helps achieve:

    1. Absolutely no edits to models or post_save signals.
    2. No collision checks needed. Avoiding one extra request to the db.
    3. Lookup still happens on the default id which is fast. Also, no double save()requests on the model for every insert.
    4. Short and sweet encoded ID (the number of characters go up as the number of records increase but still not very long)

    What it doesn’t help achieve/any drawbacks:

    1. Encryption - the ID is encoded but not encrypted, so the user may still be able to figure out the pattern to get to the id (but I dont care about it much, as mentioned above).
    2. A tiny overhead of encoding and decoding on each URL construction/request but perhaps that’s better than collision checks and/or multiple save() calls on the model object for insertions.

    For reference, looks like there are multiple ways to generate random IDs that I discovered along the way (like Django’s get_random_string, Python’s random, Django’s UUIDField etc.) and many ways to encode the current ID (base 36, base 62, XORing, and what not). The encoded ID can also be stored as another (indexed) field and looked up every time (like here) but depends on the performance parameters of the web app (since looking up a varchar id is less performant that looking up an integer id). This identifier field can either be saved from a overwritten model’s save() function, or by using a post_save() signal (see here) (while both approaches will need the save() function to be called twice for every insert).

    All ears to optimizations to the above approach. I love SO and the community. Everytime there’s so much to learn here.

    Update: After more than a year of this post, I found this great library called hashids which does pretty much the same thing quite well! Its available in many languages including Python.

    0 讨论(0)
提交回复
热议问题