Handling race condition in model.save()

Submitted by 旧街凉风 on 2019-12-18 02:12:42

Question


How should one handle a possible race condition in a model's save() method?

For example, the following code implements a model with an ordered list of related items. When a new Item is created, the current list size is used as its position.

From what I can tell, this can go wrong if multiple Items are created concurrently.

from django.db import models

class OrderedList(models.Model):
    # ....
    @property
    def item_count(self):
        return self.item_set.count()

class Item(models.Model):
    # ...
    name   = models.CharField(max_length=100)
    parent = models.ForeignKey(OrderedList)
    position = models.IntegerField()
    class Meta:
        unique_together = (('parent','position'), ('parent', 'name'))

    def save(self, *args, **kwargs):
        if not self.id:
            # use item count as next position number
            self.position = self.parent.item_count
        super(Item, self).save(*args, **kwargs)
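
The race described above can be reproduced without Django. The following is a minimal sketch that uses the standard-library sqlite3 module in place of the ORM; the table layout mirrors the two unique_together constraints, and the helper names (make_db, next_position, insert_item, simulate_race) are hypothetical. It interleaves the two "count" reads by hand instead of using real threads, so the outcome is deterministic.

```python
import sqlite3


def make_db():
    """Create an in-memory table with the same unique constraints as Item."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER NOT NULL,"
        " name TEXT NOT NULL,"
        " position INTEGER NOT NULL,"
        " UNIQUE (parent_id, position),"
        " UNIQUE (parent_id, name))"
    )
    return conn


def next_position(conn, parent_id):
    """Same idea as item_count: the number of rows already in the list."""
    row = conn.execute(
        "SELECT COUNT(*) FROM item WHERE parent_id = ?", (parent_id,)
    ).fetchone()
    return row[0]


def insert_item(conn, parent_id, name, position):
    conn.execute(
        "INSERT INTO item (parent_id, name, position) VALUES (?, ?, ?)",
        (parent_id, name, position),
    )


def simulate_race():
    """Two writers read the count before either one has written."""
    conn = make_db()
    pos_a = next_position(conn, 1)  # writer A sees an empty list
    pos_b = next_position(conn, 1)  # writer B sees the same empty list
    insert_item(conn, 1, "a", pos_a)
    try:
        insert_item(conn, 1, "b", pos_b)  # same (parent, position) pair
    except sqlite3.IntegrityError:
        return "conflict"
    return "ok"
```

Because both writers compute the same position, the second insert violates the (parent, position) constraint. In Django the same situation surfaces as django.db.IntegrityError raised from save().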

I've come across @transaction.commit_on_success, but that seems to apply only to views. Even if it did apply to model methods, I still wouldn't know how to properly handle a failed transaction.

I am currently handling it like so, but it feels more like a hack than a solution:

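import time
from django.db import IntegrityError

def save(self, *args, **kwargs):
    # keep trying until the row has been inserted and has an id
    while not self.id:
        try:
            self.position = self.parent.item_count
            super(Item, self).save(*args, **kwargs)
        except IntegrityError:
            # chill out, then try again
            time.sleep(0.5)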

Any suggestions?

Update:

Another problem with the above solution is that the while loop will never end if the IntegrityError is caused by a name conflict (or a conflict on any other unique field, for that matter).

For the record, here's what I have so far, which seems to do what I need:

import random
import time
from django.core.exceptions import ObjectDoesNotExist
from django.db import IntegrityError

def save(self, *args, **kwargs):
    # for object update, do the usual save
    if self.id:
        super(Item, self).save(*args, **kwargs)
        return

    # for object creation, assign a unique position
    while not self.id:
        try:
            self.position = self.parent.item_count
            super(Item, self).save(*args, **kwargs)
        except IntegrityError as error:
            try:
                rival = self.parent.item_set.get(position=self.position)
            except ObjectDoesNotExist: # not a conflict on "position"
                raise error
            else:
                time.sleep(random.uniform(0.5, 1)) # chill out, then try again

Answer 1:


It may feel like a hack to you, but to me it looks like a legitimate, reasonable implementation of the "optimistic concurrency" approach -- try doing whatever, detect conflicts caused by race conditions, and if one occurs, retry a bit later. Some databases systematically use that instead of locking, and it can lead to much better performance except in systems under a lot of write load (which are quite rare in real life).

I like it a lot because I see it as a general case of the Hopper Principle: "it's easier to ask forgiveness than permission", which applies widely in programming (especially but not exclusively in Python -- the language Hopper is usually credited for is, after all, Cobol;-).

One improvement I'd recommend is to wait a random amount of time -- this avoids a "meta-race condition" where two processes try at the same time, both find conflicts, and both retry at the same time, leading to "starvation". time.sleep(random.uniform(0.1, 0.6)) or the like should suffice.

A more refined improvement is to lengthen the expected wait as more conflicts are met -- this is what is known as "exponential backoff" in TCP/IP (you wouldn't have to lengthen things exponentially, i.e. by a constant multiplier > 1 each time, of course, but that approach has nice mathematical properties). It's only warranted to limit problems for very write-loaded systems (where multiple conflicts during attempted writes happen quite often), and it is probably not worth it in your specific case.
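
As a rough illustration of the two suggestions above (random waits, and waits that grow with each conflict), here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the Conflict exception stands in for a retryable error such as IntegrityError, and the names backoff_delays and retry, along with the default base, factor, and cap values, are hypothetical. The delay is drawn from zero up to a growing limit, which is one common way to add jitter, not the only one.

```python
import random
import time


class Conflict(Exception):
    """Hypothetical stand-in for a retryable error such as IntegrityError."""


def backoff_delays(base=0.1, factor=2.0, cap=5.0, attempts=5, rng=random):
    """Yield one random delay per retry; the upper bound grows by `factor`."""
    limit = base
    for _ in range(attempts):
        yield rng.uniform(0, limit)
        limit = min(cap, limit * factor)


def retry(operation, sleep=time.sleep, **backoff):
    """Call operation() until it succeeds or the delays run out."""
    for delay in backoff_delays(**backoff):
        try:
            return operation()
        except Conflict:
            sleep(delay)  # wait a random, growing amount before retrying
    # One last attempt; a Conflict raised here propagates to the caller.
    return operation()
```

With the defaults, the operation is tried at most six times, and the upper bound of the wait doubles after every conflict until it reaches the cap. Passing a different sleep callable makes the helper easy to exercise without actually waiting.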




Answer 2:


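See the Django ticket "Add optional FOR UPDATE clause to QuerySets": http://code.djangoproject.com/ticket/2705. Row locking of this kind is available as QuerySet.select_for_update() since Django 1.4.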




Answer 3:


I use Shawn Chin's solution and it has proved very useful. The only change I made was to replace

self.position = self.parent.item_count

with

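self.position = self.parent.item_set.latest('position').position + 1

just to make sure I am building on the latest position number (which in my case might not be item_count because of some reserved unused positions). Note that latest() raises DoesNotExist when the list is empty, so that case needs separate handling.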



Source: https://stackoverflow.com/questions/3522827/handling-race-condition-in-model-save
