Proposed solution: Generate unique IDs in a distributed environment

后端 未结 2 1656
深忆病人
深忆病人 2021-02-01 08:47

I\'ve been browsing the net trying to find a solution that will allow us to generate unique IDs in a regionally distributed environment.

I looked at the following option

2条回答
  •  不知归路
    2021-02-01 09:28

    You are concerned about IDs for two reasons:

    1. Potential for collisions in a complex network infrastructure
    2. Appearance

    Starting with the second issue, Appearance. While a UUID certainly isn't a great beauty when it comes to an identifier, there are diminishing returns as you introduce a truly unique number across a complex data center (or data centers) as you mention. I'm not convinced that there is a dramatic change in perception of an application when a long number versus a UUID is used for example in a URL to a web application. Ideally, neither would be shown, and the ID would only ever be sent via Ajax requests, etc. While a nice clean memorable URL is preferable, it's never stopped me from shopping at Amazon (where they have absolutely hideous URLs). :)

    Even with your proposal, the identifiers, while they would be shorter in the number of characters than a UUID, they are no more memorable than a UUID. So, the appearance likely would remain debatable.

    Talking about the first point..., yes, there are a few cases where UUIDs have been known to generate conflicts. While that shouldn't happen in a properly configured and consistently obtained architecture, I can see how it might happen (but I'm personally a lot less concerned about it).

    So, if you're talking about alternatives, I've become a fan of the simplicity of the MongoDB ObjectId and its techniques for avoiding duplication when generating an ID. The full documentation is here. The quick relevant pieces are similar to your potential design in several ways:

    ObjectId is a 12-byte BSON type, constructed using:

    • a 4-byte value representing the seconds since the Unix epoch,
    • a 3-byte machine identifier,
    • a 2-byte process id, and
    • a 3-byte counter, starting with a random value.

    The timestamp can often be useful for sorting. The machine identifier is similar to your application server having a unique ID. The process id is just additional entropy, and finally to prevent conflicts, there is a counter that is auto incremented whenever the timestamp is the same as the last time an ObjectId is generated (so that ObjectIds can be created rapidly). ObjectIds can be generated on the client or on the database. Further, ObjectIds do take up fewer bytes than a UUID (but only 4). Of course, you could not use the timestamp and drop 4 bytes.

    For clarification, I'm not suggesting you use MongoDB, but be inspired by the technique they use for ID generation.

    So, I think your solution is decent (and maybe you want to be inspired by MongoDB's implementation of a unique ID) and doable. As to whether you need to do it, I think that's a question only you can answer.

提交回复
热议问题