Will UUID as primary key in PostgreSQL give bad index performance?

抹茶落季 2021-01-29 20:30

I have created an app in Rails on Heroku using a PostgreSQL database.

It has a couple of tables designed to be able to sync with mobile devices where data can be created

2 Answers
  • 2021-01-29 20:51

    As the accepted answer states, range queries may be slow in this case, but not only on id.

    An autoincrement key is naturally sorted by creation date, so rows keyed this way end up stored chronologically on disk (see B-Tree), which speeds up reads because an HDD does not have to seek. For example, when listing all users, the natural order is by creation date, which matches the autoincrement order, so range queries execute faster on HDDs. On SSDs, I guess, the difference would be nonexistent, since SSDs are random access by design (no head seeking, no mechanical parts, just pure electricity).

  • 2021-01-29 21:02

    (I work on Heroku Postgres)

    We use UUIDs as primary keys on a few systems and it works great.

    I recommend you use the uuid-ossp extension, and even have postgres generate UUIDs for you:

    heroku pg:psql
    psql (9.1.4, server 9.1.6)
    SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256)
    Type "help" for help.
    
    dcvgo3fvfmbl44=> CREATE EXTENSION "uuid-ossp"; 
    CREATE EXTENSION  
    dcvgo3fvfmbl44=> CREATE TABLE test (id uuid primary key default uuid_generate_v4(), name text);  
    NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "test_pkey" for table "test"
    CREATE TABLE  
    dcvgo3fvfmbl44=> \d test
                     Table "public.test"
    Column | Type |              Modifiers
    --------+------+-------------------------------------
    id     | uuid | not null default uuid_generate_v4()
    name   | text |
    Indexes:
        "test_pkey" PRIMARY KEY, btree (id)
    
    dcvgo3fvfmbl44=> insert into test (name) values ('hgmnz'); 
    INSERT 0 1 
    dcvgo3fvfmbl44=> select * from test;
                      id                  | name  
    --------------------------------------+-------   
     e535d271-91be-4291-832f-f7883a2d374f | hgmnz  
    (1 row)
    

    EDIT: performance implications

    It will always depend on your workload.

    The integer primary key has the advantage of locality, where like data sits closer together. This can be helpful for range queries such as WHERE id BETWEEN 1 AND 10000, although lock contention is worse, since concurrent inserts all land at the same end of the index.
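A rough way to see this locality outside the database: the sketch below, in Python rather than SQL, keeps keys in a sorted list as a stand-in for a B-tree index and counts how many inserts land at the right-hand end. The helper `fraction_appended` and the sample size `N` are made up for illustration, and a sorted list is only a loose model of what PostgreSQL actually does.

```python
import bisect
import uuid


def fraction_appended(keys):
    """Return the share of inserts that land at the right-hand end of a sorted list."""
    ordered = []
    appended = 0
    for key in keys:
        # Position the key would take in an ordered structure such as a B-tree.
        pos = bisect.bisect_right(ordered, key)
        if pos == len(ordered):
            appended += 1
        ordered.insert(pos, key)
    return appended / len(keys)


N = 2000  # hypothetical number of inserted rows
sequential_keys = list(range(1, N + 1))             # autoincrement-style ids
random_keys = [uuid.uuid4().int for _ in range(N)]  # random (version 4) UUIDs

seq_share = fraction_appended(sequential_keys)
uuid_share = fraction_appended(random_keys)

print(f"sequential: {seq_share:.2%} of inserts hit the end of the index")
print(f"uuid v4:    {uuid_share:.2%} of inserts hit the end of the index")
```

Sequential keys always append, which is where both the locality and the contention on the last page come from. Random UUIDs almost never do, so their inserts are spread across the whole key range.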

    If your read workload is totally random in that you always make primary key lookups, there shouldn't be any measurable performance degradation: you only pay for the larger data type.
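To put a number on "the larger data type": a uuid takes 16 bytes, against 4 for integer and 8 for bigint. The Python sketch below only multiplies those widths by a hypothetical row count (`rows` is an assumption, not a figure from this thread), and it ignores per-row and per-index-entry overhead, so treat the result as a lower bound on the difference rather than a measurement.

```python
import struct
import uuid

uuid_bytes = len(uuid.uuid4().bytes)  # a UUID is 128 bits
int4_bytes = struct.calcsize("<i")    # same width as a PostgreSQL integer
int8_bytes = struct.calcsize("<q")    # same width as a PostgreSQL bigint

rows = 10_000_000  # hypothetical table size

for label, width in [("integer", int4_bytes), ("bigint", int8_bytes), ("uuid", uuid_bytes)]:
    raw_mib = rows * width / 1024**2
    print(f"{label:8s}{width:3d} bytes/key  ~{raw_mib:.0f} MiB of raw key data")
```

The same extra bytes are paid again in every foreign key column and index that references the UUID.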

    Do you write a lot to this table, and is this table very big? It's possible, although I haven't measured this, that there are implications in maintaining that index. For lots of datasets UUIDs are just fine though, and using UUIDs as identifiers has some nice properties.

    Finally, I may not be the most qualified person to discuss or advise on this, as I have never run a table with a UUID PK large enough for it to become a problem. YMMV. (Having said that, I'd love to hear from people who have run into problems with the approach!)
