brad's life - UUIDs in the database [entries|archive|friends|userinfo]
Brad Fitzpatrick

[ website | ]
[ userinfo | livejournal userinfo ]
[ archive | journal archive ]

UUIDs in the database [Nov. 11th, 2005|11:28 am]
Previous Entry Add to Memories Share Next Entry
[Tags|, ]

There's talk at work of changing an upcoming's product's schema to use UUIDs as primary keys instead of dual-column integers of the form (userid, per-user-assetid), and I'm not really comfortably with the idea, but I wanted to throw it out for discussion.

The advantages to UUIDs that motivated the discussion are:

  • No reliance on centralized number allocators. Any node in a cluster could inject a record without coordination with others. Multiple data centers could be merged without breaking numbers.
My concerns are:
  • Bigger keys -- 16 byte primary keys instead of 8. And for inter-asset relationships (for the same user) where normally a table would have 4 bytes (user) + 4 bytes (asset1) + 4 bytes (asset2) for a total of 12 bytes, now we'd have 16 bytes + 16 bytes for a total of 32 bytes.... that's a big increase, and there'd still be no way to do a prefix match on the key to get all relationships for a given users without adding YET ANOTHER index to the table, or doing a join. Let me make that more clear. Consider this schema:
    create table thingy_relation (
      userid  INT UNSIGNED NOT NULL,   /* 4 bytes */
      thing1  INT UNSIGNED NOT NULL,   /* 4 bytes */
      thing2  INT UNSIGNED NOT NULL,   /* 4 bytes */
      PRIMARY KEY (userid, thing1, thing2),  /* 12 bytes */
      INDEX       (userid, thing2)           /* 8 bytes  */
    Now we'd have to do:
    create table thingy_relation_uuid (
      thing1  CHAR(16) BINARY NOT NULL, /* 16 bytes */
      thing2  CHAR(16) BINARY NOT NULL, /* 16 bytes */
      PRIMARY KEY (thing1, thing2),   /* 32 bytes */
      INDEX       (thing2)            /* 16 bytes */

    This hypothetical schema tracks which assets are related to each other in some way, and lets you query either forward or backward from an asset.

    And notice in the thingy_relation_uuid table, there's no way to query and just get all relations from a userid.

  • Clustered indexes, now not clustered by userid (spatial locality) and assetid (temporal locality) -- with a dual-key (userid,assetid) schema, all of a user's rows are clustered together in the B-Tree (with InnoDB, at least), and better, sequential posts/photos/etc (potentially related and/or near each other in time) are also clustered next to each other. So a query like, "Give me the most recent 100 blog entries form userid 8232." generally means one real disk seek and very few database pages. But if you're using a UUID, records will most likely be all over the B-Tree, meaning a lot more disk seeks and database pages brought in to memory, meaning the "needed records / pages brought in" ratio goes down, wasting memory. The high bits of a UUID are generally low-res time, so you still get temporal locality for things like "Give me the most recent 100 blog posts for all users on the server", but we're already partitioning the data, so that's not a query we'd even do per-partition. So things like "Give me 100 most recent blog posts from bob" will be efficient if bob does 100 posts all back-to-back without a few million other posts from other users mixed in, but that's not going to happen on a busy server.
  • MySQL shell -- "select * from thingy_relation_uuid". Whoops. You just screwed your terminal with binary gibberish. And if you have to debug something and do some one-off queries for reports/investigation/etc, how do you type in the binary packaged UUID structure? You don't, not without a new MySQL shell and/or some UDFs to convert to/from UUID ASCII form. This is all solvable, but I'm afraid it'll never get done.
  • Silent replication failures -- another thing that was argued about UUIDs is that it makes really tricky replication topologies easier, due to PKs not colliding, but it also means it's possible to mess up the replication and not notice it for the same reasons... you won't get errors when you do something wrong. So I see this as a sort of non-bonus.

Counter argument to ease of number allocation: there are lots of ways to do number allocation, several of which LJ has used, several of which other people have done. I don't thing something as drastic as UUIDs should be seriously considered for that reason alone, without considering other approaches to number allocations, and all the potential traps of UUIDs.

It may seem like I'm leaning heavily against UUIDs, and that might be true currently, but I'm really curious what other people's experiences with UUIDs as PKs have been.

Enlighten me? Any other pros/cons?


From: evan
2005-11-11 08:12 pm (UTC)


We encounter lots of these sorts of tradeoffs (with likely bigger data sets) and generally switch from one to the other depending on the particular use case.

Custom UUIDs with shared prefixes, as someone else proposed, have clustering problems: if you cluster (by that I mean split into disjoint clusters) by the prefix, you get locality, but you get nonuniform distribution of data; if you cluster by the suffix, you lose locality. Again, this tradeoff depends on a number of factors, like how many clusters you have and what the variance in your user dataset size is.