How to implement “autoincrement” on Google AppEngine

面向向阳花 2020-11-27 16:23

I have to label something in a "strictly monotonically increasing" fashion, be it invoice numbers, shipping label numbers, or the like.

  1. A number MUST NOT BE used twi
9 Answers
  • 2020-11-27 17:03

    If you drop the requirement that IDs must be strictly sequential, you can use a hierarchical allocation scheme. The basic idea/limitation is that transactions must not affect multiple storage groups.

    For example, assuming you have the notion of "users", you can allocate a storage group for each user (creating some global object per user). Each user has a list of reserved IDs. When allocating an ID for a user, pick a reserved one (in a transaction). If no IDs are left, make a new transaction allocating 100 IDs (say) from the global pool, then make a new transaction to add them to the user and simultaneously withdraw one. Assuming each user interacts with the application only sequentially, there will be no concurrency on the user objects.
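
    A minimal in-memory sketch of this scheme follows. It does not use the datastore; `GlobalPool` and `UserIds` are hypothetical names standing in for the global entity and the per-user entity, and each method body marks where a transaction would go in a real application.

```python
class GlobalPool:
    """Stand-in for the single global counter entity (hypothetical name)."""

    def __init__(self, start=1):
        self.next_id = start

    def reserve(self, count):
        # In a real app this would be one transaction on the global entity.
        first = self.next_id
        self.next_id += count
        return list(range(first, first + count))


class UserIds:
    """Stand-in for the per-user entity holding that user's reserved IDs."""

    def __init__(self, pool, batch_size=100):
        self.pool = pool
        self.batch_size = batch_size
        self.reserved = []

    def allocate(self):
        # Refill from the global pool only when the user's list is empty,
        # so the global entity is touched once per batch_size allocations.
        if not self.reserved:
            self.reserved = self.pool.reserve(self.batch_size)
        return self.reserved.pop(0)


pool = GlobalPool()
alice = UserIds(pool, batch_size=3)
bob = UserIds(pool, batch_size=3)
ids = [alice.allocate(), bob.allocate(), alice.allocate(), bob.allocate()]
print(ids)
```

    The IDs are unique across users but interleave between batches, which is the price of dropping the strictly sequential requirement.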

  • 2020-11-27 17:03

    If you aren't too strict about the sequence, you can "shard" your incrementer. This could be thought of as an "eventually sequential" counter.

    Basically, you have one entity that is the "master" count. Then you have a number of entities (based on the load you need to handle) that have their own counters. These shards reserve chunks of ids from the master and serve out from their range until they run out of values.

    Quick algorithm:

    1. You need to get an ID.
    2. Pick a shard at random.
    3. If the shard's start is less than its end, take its start and increment it.
    4. If the shard's start is equal to (or more, uh-oh) its end, go to the master, take the value and add an amount n to it. Set the shard's start to the retrieved value plus one and its end to the retrieved value plus n.

    This can scale quite well; however, the amount you can be out by is the number of shards multiplied by your n value. If you only want your records to appear to go up, this will probably work, but if you want the IDs to represent creation order, it won't be accurate. It is also important to note that the latest values may have holes, so if you scan over them for some reason you will have to mind the gaps.
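
    The steps above can be sketched in memory roughly as follows. This is an illustration only, not the code mentioned in the edit below: `Master`, `Shard`, and `next_id` are hypothetical names, no datastore is involved, and the sketch tracks an inclusive range per shard so that every reserved value is eventually served.

```python
import random


class Master:
    """Stand-in for the "master" count entity (hypothetical name)."""

    def __init__(self):
        self.value = 0  # highest ID reserved by any shard so far

    def reserve(self, n):
        # Would be a transaction on the master entity in a real datastore.
        first = self.value + 1
        self.value += n
        return first, self.value  # inclusive range [first, last]


class Shard:
    """Stand-in for one shard entity with its own reserved range."""

    def __init__(self):
        self.next = 1
        self.last = 0  # empty range: forces a refill on first use

    def take(self, master, n):
        if self.next > self.last:  # shard exhausted, reserve a new chunk
            self.next, self.last = master.reserve(n)
        value = self.next
        self.next += 1
        return value


def next_id(master, shards, n=10):
    shard = random.choice(shards)  # pick a shard at random
    return shard.take(master, n)


master = Master()
shards = [Shard() for _ in range(4)]
ids = [next_id(master, shards) for _ in range(1000)]
print(len(set(ids)), max(ids))
```

    Every ID is unique, but the list is not in increasing order, and the largest ID can exceed the number of IDs issued by up to the number of shards times n.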

    Edit

    I needed this for my app (that was why I was searching for this question :P ) so I have implemented my solution. It can grab single IDs as well as efficiently grab batches. I have tested it in a controlled environment (on App Engine) and it performed very well. You can find the code on GitHub.

  • 2020-11-27 17:08

    Remember: sharding increases the probability that you will get a unique, auto-incrementing value, but does not guarantee it. Please take Nick's advice if you MUST have a unique auto-increment.

  • 2020-11-27 17:12

    Alternatively, you could use allocate_ids(), as people have suggested, and then create these entities up front (i.e. with placeholder property values).

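    from google.appengine.ext import ndb  # assumes MyModel is an ndb.Model subclass
    first, last = MyModel.allocate_ids(1000000)  # reserves the inclusive ID range [first, last]
    keys = [ndb.Key(MyModel, i) for i in range(first, last + 1)]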
    

    Then, when creating a new invoice, your code could run through these entries to find the one with the lowest ID such that the placeholder properties have not yet been overwritten with real data.

    I haven't put that into practice, but seems like it should work in theory, most likely with the same limitations people have already mentioned.

  • 2020-11-27 17:13

    If you absolutely have to have sequentially increasing numbers with no gaps, you'll need to use a single entity, which you update in a transaction to 'consume' each new number. You'll be limited, in practice, to about 1-5 numbers generated per second - which sounds like it'll be fine for your requirements.
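
    As a rough illustration of the idea, here is an in-memory sketch in which a lock plays the role of the datastore transaction. `Counter` and `next_number` are hypothetical names; in a real application the read-increment-write would run inside a transaction on the single counter entity instead.

```python
import threading


class Counter:
    """Stand-in for the single counter entity (hypothetical name)."""

    def __init__(self, start=0):
        self._value = start
        self._lock = threading.Lock()  # plays the role of the transaction

    def next_number(self):
        # Read, increment, and write happen atomically, so no number is
        # skipped and none is handed out twice.
        with self._lock:
            self._value += 1
            return self._value


counter = Counter()
results = []
results_lock = threading.Lock()


def worker(count):
    for _ in range(count):
        number = counter.next_number()
        with results_lock:
            results.append(number)


threads = [threading.Thread(target=worker, args=(250,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results))
```

    Because every caller serializes on the same entity, the sequence has no gaps, and that serialization is also what limits throughput.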

  • 2020-11-27 17:13

    I implemented something very simplistic for my blog, which increments an IntegerProperty, iden, rather than the key ID.

    I define max_iden() to find the maximum iden integer currently in use. This function queries the existing blog posts, ordered by iden descending, and takes the first result.

    def max_iden():
        max_entity = Post.gql("order by iden desc").get()
        if max_entity:
            return max_entity.iden
        return 1000    # If this is the very first entry, start at number 1000
    

    Then, when creating a new blog post, I assign it an iden property of max_iden() + 1

    new_iden = max_iden() + 1
    p = Post(parent=blog_key(), header=header, body=body, iden=new_iden)
    p.put()
    

    I wonder if you might also want to add some sort of verification function after this, i.e. to ensure that max_iden() has actually incremented before moving on to the next invoice.

    Altogether: fragile, inefficient code.
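
    One source of that fragility can be shown with a small in-memory illustration. The list `posts` is a hypothetical stand-in for the stored Post entities, and the two reads model two requests that both compute the maximum before either has written its post.

```python
posts = []  # hypothetical stand-in for the stored Post entities


def max_iden(posts):
    # Same rule as above: start at 1000 when there are no posts yet.
    return max((p["iden"] for p in posts), default=1000)


# Two requests read the maximum before either one has written.
first_request_iden = max_iden(posts) + 1
second_request_iden = max_iden(posts) + 1

posts.append({"iden": first_request_iden})
posts.append({"iden": second_request_iden})
print([p["iden"] for p in posts])
```

    Both posts end up with the same iden, which is why the read and the write need to happen together in a transaction if numbers must never repeat.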
