Generate unique hashes for Django models

感情败类 2021-02-01 07:20

I want to use unique hashes for each model rather than ids.

I implemented the following function to use it across the board easily.

import random,hashlib         


        
4 Answers
  • 2021-02-01 07:35

    I do not like this bit:

    uuid = uuid[:5]
    

    In the best case (the values are uniformly distributed), the probability of at least one collision passes 0.5 after only about 1.2k elements!

    This follows from the birthday problem. In brief, the probability of at least one collision reaches about 0.5 once the number of elements is on the order of the square root of the number of possible labels (roughly 1.18 times that square root).

    You have 16^5 = 1,048,576 ≈ 10^6 labels (different numbers), so collisions become likely after only about 1,000 generated values.
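
    The estimate above can be checked numerically. The following is a minimal sketch, not part of the original answer: the function name collision_probability is hypothetical, and it assumes labels are drawn uniformly and independently.

```python
def collision_probability(n, labels):
    """Probability that n uniform, independent draws from `labels` values contain a duplicate."""
    p_unique = 1.0
    for k in range(n):
        # The (k+1)-th draw must avoid the k values already taken.
        p_unique *= (labels - k) / labels
    return 1.0 - p_unique


labels = 16 ** 5  # five hex digits, as in uuid[:5]
for n in (500, 1000, 1300, 2000):
    print(n, round(collision_probability(n, labels), 3))
```

    For five hex digits the probability is a bit under 0.4 at 1,000 values and crosses 0.5 at roughly 1,200, so the square-root rule gives the right order of magnitude.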

    Even if you widen the slice to -1, you still have a problem here:

    str(random.random())[2:]
    

    You will start having collisions after about 3 * 10^6 values (the same calculation applies).

    I think your best bet is to use the uuid module, whose values are far more likely to be unique. Here is an example:

    >>> import uuid
    >>> uuid.uuid1().hex
    '7e0e52d0386411df81ce001b631bdd31'
    

    Update: If you do not trust the math, just run the following sample to see the collision:

     >>> import hashlib
     >>> # str(i) must be encoded to bytes before hashing on Python 3
     >>> len(set(hashlib.sha256(str(i).encode()).hexdigest()[:5] for i in range(0,2000)))
     1999 # it should obviously print 2000 if there weren't any collisions
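
    The same experiment can be repeated for other prefix lengths. A small sketch under the same setup (SHA-256 of the decimal string of each integer); count_distinct_prefixes is a hypothetical helper name, not something from the answer:

```python
import hashlib


def count_distinct_prefixes(n, prefix_len):
    """Hash the integers 0..n-1 and count the distinct hex prefixes of the given length."""
    prefixes = {
        hashlib.sha256(str(i).encode()).hexdigest()[:prefix_len]
        for i in range(n)
    }
    return len(prefixes)


for prefix_len in (3, 5, 8, 64):
    distinct = count_distinct_prefixes(2000, prefix_len)
    # Any shortfall from 2000 means at least two inputs shared a prefix.
    print(prefix_len, distinct, "shortfall:", 2000 - distinct)
```

    Shorter prefixes lose more values to collisions, while the full 64-character digest should keep all 2000 distinct.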
    
  • 2021-02-01 07:49

    Django 1.8+ has a built-in UUIDField. Here's the suggested implementation, using the standard library's uuid module, from the docs:

    import uuid
    from django.db import models
    
    class MyUUIDModel(models.Model):
        id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
        # other fields
    

    For older Django versions you can use the django-uuidfield package.

  • 2021-02-01 08:01

    The ugly:

    import random

    From the documentation:

    This module implements pseudo-random number generators for various distributions.

    If anything, please use os.urandom

    Return a string of n random bytes suitable for cryptographic use.

    This is how I use it in my models:

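    import os
    from binascii import hexlify

    from django.db import models

    def _createId():
        # hexlify() returns bytes on Python 3; decode so the CharField default is a str
        return hexlify(os.urandom(16)).decode("ascii")

    class Book(models.Model):
        id_book = models.CharField(max_length=32, primary_key=True, default=_createId)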
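
    On Python 3.6 and later, the standard library's secrets module provides the same kind of OS-backed randomness in a single call. A minimal sketch, assuming a 32-character hex ID is wanted as in the model above; create_id is a hypothetical name, not something from the answer:

```python
import secrets


def create_id():
    # 16 random bytes from the OS CSPRNG, rendered as 32 lowercase hex characters (a str, not bytes).
    return secrets.token_hex(16)


print(create_id())
```

    It could be passed as default=create_id to the CharField in place of _createId.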
    
  • 2021-02-01 08:02

    Use your database engine's UUID support instead of making up your own hash. Almost everything beyond SQLite supports them, so there's little reason to not use them.
