ID Generation for Sharded Database (Azure Federated Database)

后悔当初 2021-02-09 12:29

I have been looking for articles or guidance on best practices for ID generation (for the federation/primary key) in Azure federated databases and haven't found anything co

4 Answers
  • 2021-02-09 13:10

    When choosing a federation key, pick one that actually distributes data well across federation members; in many cases a generated ID is a poor choice. For example, partitioning on order ID means that all the latest orders land in the latest federation member, which is likely the one most users are acting on, so the benefits of federation are greatly reduced. Partitioning on country, customer ID, etc. is more likely to achieve the scalability benefits federation is designed to bring.
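To illustrate the distribution point, here is a minimal Python sketch. It assumes four range-partitioned members with made-up boundaries; `BOUNDS`, `member_for`, and the ID ranges are hypothetical and not part of any Azure API. It compares where the 500 most recent orders land when keyed by a monotonically increasing order ID versus by the ID of the customer who placed them.

```python
import bisect
import random
from collections import Counter

# Hypothetical range boundaries: member i owns keys in [BOUNDS[i-1], BOUNDS[i]).
# Four members split the key space 0..3999 evenly.
BOUNDS = [1000, 2000, 3000]


def member_for(key: int) -> int:
    """Return the index of the range-partitioned member that owns this key."""
    return bisect.bisect_right(BOUNDS, key)


def distribution(keys) -> Counter:
    """Count how many keys land on each member."""
    return Counter(member_for(k) for k in keys)


# The 500 most recent orders, keyed by a monotonically increasing order ID.
recent_order_ids = range(3500, 4000)
by_order_id = distribution(recent_order_ids)

# The same 500 orders, keyed by (randomly chosen) customer IDs instead.
rng = random.Random(42)
customer_ids = [rng.randrange(0, 4000) for _ in recent_order_ids]
by_customer_id = distribution(customer_ids)

print("by order id:   ", dict(by_order_id))
print("by customer id:", dict(by_customer_id))
```

Keyed by order ID, every recent write hits the last member; keyed by customer ID, the same writes are spread over all four.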

    When it comes to a row's unique identity, keep in mind that entities will be stored across different databases, so identity or sequence generation is not available. Check out Cihan Biyikoglu's blog post on this; his recommendation is to use either uniqueidentifier or datetimeoffset.

  • 2021-02-09 13:11

    You could create sequences in the application using a variety of techniques, but they are not straightforward because of the distributed nature of the system. One that works quite well is using blob storage with preconditions.
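One way the blob-plus-precondition idea could look is sketched below in Python. This is not the Azure SDK: `InMemoryBlob` is a hypothetical in-memory stand-in for a single blob that stores the next unallocated ID, and `write_if_match` plays the role of a conditional write that only succeeds when the caller's ETag still matches. Each caller reserves a whole block of IDs per round trip and retries if another writer got there first.

```python
import threading


class PreconditionFailed(Exception):
    """Raised when the caller's ETag no longer matches the stored blob."""


class InMemoryBlob:
    """Hypothetical stand-in for one blob holding the next unallocated ID."""

    def __init__(self, value: int = 1):
        self._value = value
        self._etag = 0
        self._lock = threading.Lock()

    def read(self):
        """Return the current value together with its ETag."""
        with self._lock:
            return self._value, self._etag

    def write_if_match(self, value: int, etag: int) -> None:
        """Store the value only if the blob is unchanged since `etag` was read."""
        with self._lock:
            if etag != self._etag:
                raise PreconditionFailed()
            self._value = value
            self._etag += 1


def reserve_block(blob: InMemoryBlob, size: int) -> range:
    """Reserve `size` consecutive IDs, retrying when another writer wins."""
    while True:
        start, etag = blob.read()
        try:
            blob.write_if_match(start + size, etag)
        except PreconditionFailed:
            continue  # someone else reserved first; re-read and try again
        return range(start, start + size)


blob = InMemoryBlob()
first = reserve_block(blob, 100)
second = reserve_block(blob, 100)
print(first, second)
```

Reserving blocks rather than single IDs keeps the number of storage round trips low; the trade-off is that IDs from a block are lost if the process holding it crashes.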

    Depending on your project schedule, you may want to use the SQL Server 2012 SEQUENCE feature and put all your sequences in a small non-federated database. SEQUENCE is not yet available on SQL Azure.

  • 2021-02-09 13:11

    In my projects I always use a GUID for the federation key, as I don't think it causes a massive performance problem. Maybe my projects are not that large, but it works for me. So my answer to your first question is 'yes'.

    As for your next question, I've been thinking about having an ID generator service, exactly as you described, but yes, it could become a bottleneck. One idea is an ID pool that uses a distributed cache to store the IDs generated by this service, so that anyone who wants an ID retrieves it from the pool rather than having it generated on demand. The ID generator keeps pushing IDs into the pool and consumers pop IDs from it. That might help, but I've never implemented it this way, so I can't say whether it's best practice.
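The pool idea can be sketched roughly as follows in Python. This is only an illustration under simplifying assumptions: an in-process `queue.Queue` stands in for the distributed cache, a plain counter stands in for the real ID generator, and the `IdPool` class and its method names are hypothetical.

```python
import itertools
import queue
import threading


class IdPool:
    """Hypothetical pool: a background producer keeps a bounded buffer of IDs full."""

    def __init__(self, capacity: int = 100):
        self._ids = queue.Queue(maxsize=capacity)  # stand-in for a distributed cache
        self._counter = itertools.count(1)         # stand-in for the real generator
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._fill, daemon=True)
        self._thread.start()

    def _fill(self) -> None:
        """Push IDs into the pool, waiting whenever the pool is full."""
        next_id = next(self._counter)
        while not self._stop.is_set():
            try:
                self._ids.put(next_id, timeout=0.1)
            except queue.Full:
                continue  # pool is full; check the stop flag and retry the same ID
            next_id = next(self._counter)

    def take(self, timeout: float = 1.0) -> int:
        """Pop one pre-generated ID; raises queue.Empty if none arrives in time."""
        return self._ids.get(timeout=timeout)

    def close(self) -> None:
        """Stop the background producer."""
        self._stop.set()
        self._thread.join()


pool = IdPool(capacity=10)
ids = [pool.take() for _ in range(5)]
pool.close()
print(ids)
```

Consumers only block when the pool runs dry, so the generator stops being on the critical path of every insert as long as it refills faster than IDs are consumed.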

    Hope this helps.

  • 2021-02-09 13:16

    The one drawback of using a GUID as a primary key is that if the table is clustered on the primary key, inserts cause substantial page splits. This is because good GUIDs are deliberately not generated in chronological order, so that they are hard to guess.
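A rough way to see why random GUIDs hurt a clustered index is to count how often a new key has to be placed before an existing key in sorted order. The Python sketch below is only a model: a sorted list stands in for the clustered index, `mid_inserts` is a hypothetical helper, and Python orders `uuid.UUID` values differently from SQL Server's uniqueidentifier comparison, although the random placement is the same in spirit.

```python
import bisect
import uuid


def mid_inserts(keys) -> int:
    """Count keys that had to be placed before an existing key in sorted order."""
    ordered = []
    count = 0
    for key in keys:
        pos = bisect.bisect_right(ordered, key)
        if pos < len(ordered):
            count += 1  # lands in the middle: existing entries must make room
        ordered.insert(pos, key)
    return count


n = 1000
sequential = mid_inserts(range(n))
random_guids = mid_inserts(uuid.uuid4() for _ in range(n))

print("sequential keys inserted mid-index:", sequential)
print("random GUIDs inserted mid-index:  ", random_guids)
```

Sequential keys always append at the end, while nearly every random GUID lands somewhere in the middle, which is the access pattern that leads to page splits.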

    Azure SQL tables do need a clustered index. My suggestion is to put the clustered index on a range-based value (like a datetime) and use a non-clustered index for the primary key, which would be the GUID.
