Context
Web application, PHP 5, MySQL 5.0.91
The Problem
I recently switched from using an auto-incremented integer
Your concern that "most of the UUID is useless and is wasting space" is inherent to the size of the data type. You will never have as many entries in your database as the theoretical limit of 16 bytes allows.
In fact, a V1 UUID is a better fit than V4 if you use the UUID just as a table ID, because it uses the MAC address and a timestamp to prevent clashes. V4 has no such mechanism, although in practice you don't need to worry too much about clashes either :) You should use a V4 UUID instead of V1 if you need your UUIDs to be unpredictable.
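To make the V1/V4 difference concrete, here is a minimal sketch. It uses Python's standard uuid module purely for illustration (the question is about PHP, so treat the language as my assumption; the UUID layout is the same everywhere).

```python
import uuid

# Version 1: built from a timestamp, a clock sequence and a 48-bit node ID.
# The node is normally the MAC address; a random node is used if none is found.
u1 = uuid.uuid1()

# Version 4: random bits only, no timestamp or node information.
u4 = uuid.uuid4()

print(u1, "version", u1.version)
print(u4, "version", u4.version)

# Two V1 UUIDs generated in the same process share the node field
# but still differ, because the timestamp part moves on.
a, b = uuid.uuid1(), uuid.uuid1()
same_node = a.node == b.node
distinct = a != b
print(same_node, distinct)
```

The shared node field is exactly the "similar part" that shows up when you look at V1 UUIDs generated on one server.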
Also, note that composing, for example, four 4-byte random values may not be the same as creating one 16-byte random value. As always with crypto and randomness, I would advise against implementing your own UUID::V4 routine.
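As a rough illustration of why a library routine is more than "some random bytes", the sketch below (Python's standard library again, used only as an example) shows that a V4 UUID has six fixed version/variant bits on top of the random data. It is not meant as a template for rolling your own generator.

```python
import os
import uuid

raw = os.urandom(16)                   # 16 bytes from the OS random source
plain = uuid.UUID(bytes=raw)           # raw bytes taken as-is: usually not a valid V4
v4 = uuid.UUID(bytes=raw, version=4)   # the library overwrites version and variant bits

is_rfc_variant = v4.variant == uuid.RFC_4122
print(v4.version, is_rfc_variant)

# Only the 4 version bits and 2 variant bits can differ between the two values.
changed_bits = bin(plain.int ^ v4.int).count("1")
print(changed_bits <= 6)
```

The quality of the random source matters just as much as the bit layout, which is the other reason to leave this to a well-tested library.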
If it is installed on your machine, you can make use of the php-uuid package.
Example code (which can be used in your application as is) can be found here:
http://rommelsantor.com/clog/2012/02/23/generate-uuid-in-php/
Use it like this:
$uuid = uuid_create(1);
Users who are able to install packages on their web server can install the required packages like this (here for Ubuntu):
apt-get install php5-dev uuid-dev
pecl install uuid
It's actually worth appreciating the "similar parts". They let you use the MAC address to identify "which of my servers generated this UUID?", which is extremely helpful when migrating data between remote locations. You can even tell "this is my test data" from "this is my production data" this way.
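A small sketch of that idea, assuming V1 UUIDs and a lookup table of known node IDs. The server names, the MAC values and the origin_of helper are all made up for illustration, and Python is used only as an example language.

```python
import uuid

# Hypothetical node IDs (MAC addresses as integers) for two machines.
SERVERS = {
    0x001122334455: "web-1 (production)",
    0x66778899AABB: "test-box",
}

def origin_of(value: str) -> str:
    """Return the name of the server whose node ID is embedded in a V1 UUID."""
    u = uuid.UUID(value)
    if u.version != 1:
        return "unknown (not a V1 UUID)"
    return SERVERS.get(u.node, "unknown node %012x" % u.node)

# Sample IDs are built with explicit node values so the example is reproducible.
prod_id = str(uuid.uuid1(node=0x001122334455))
test_id = str(uuid.uuid1(node=0x66778899AABB))

print(origin_of(prod_id))
print(origin_of(test_id))
print(origin_of(str(uuid.uuid4())))
```

The node ID is simply the last 12 hex digits of the textual UUID, so the same check can be done with a plain string comparison in any language.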
PHP has a large number of UUID-generator libraries.
Here's one PECL/PEAR thing (I never used it):
http://pecl.php.net/package/uuid
From the CakePHP framework:
http://api.cakephp.org/class/string#method-Stringuuid (cake 2.x)
http://api13.cakephp.org/class/string#method-Stringuuid (cake 1.3)
Last generator option:
Consider using the Linux command-line uuid program, which has a -v flag to select the UUID version (plus related options), and using that to feed your database. It's somewhat inefficient, but at least you won't have to write your own generator functions.
http://linux.die.net/man/1/uuid - man page (package uuid for Debian)
I noticed that for the namespace versions, you'll be generating lots of "long human names" to convert into UUIDs. As long as those names don't conflict, this can work very nicely. For example, for users registering with e-mail addresses: get the v5 UUID for that e-mail address and you'll always find that person! It produces the same UUID each time, and the UUID represents the unique relationship bob@bob.com has with example.com as a member.
uuid -v5 ns:URL "http://example.com/member/bob@bob.com/"
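The same determinism can be checked from code. This is a minimal sketch using Python's standard uuid module as an example language; the second member URL is made up for the comparison.

```python
import uuid

name = "http://example.com/member/bob@bob.com/"

# A v5 UUID is a SHA-1 hash of the namespace plus the name, so the same
# input always yields the same UUID.
first = uuid.uuid5(uuid.NAMESPACE_URL, name)
second = uuid.uuid5(uuid.NAMESPACE_URL, name)

# A different (hypothetical) member URL gives a different UUID.
other = uuid.uuid5(uuid.NAMESPACE_URL, "http://example.com/member/alice@example.com/")

print(first)
print(first == second)
print(first == other)
```

Because the result depends only on the namespace and the name, any server can recompute a member's ID without a database lookup.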
Commentary:
Also, it looks like you are storing UUIDs as CHAR(36)? You might regret that once comparison operators kick in.
Postgres treats a UUID as a 128-bit value (and presumably does optimized binary operations), whereas MySQL's CHAR(36) solution is looking at 36 bytes = 288 bits in a single-byte character set, or more in a multibyte character set such as utf8 (where MySQL may reserve up to 3 bytes per character), plus or minus some bits/bytes for bookkeeping (and presumably uses much slower character-by-character string routines).
I've actually given a lot of thought to the issues with MySQL plus UUID, and my conclusion was that you'd want to write a stored function that converts the hex representation into the binary representation for storage, which would make every "select" statement require a conversion back into the hex representation... and who knows how efficient any of that would be... so in the end, just switch to Postgres. XD
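For a sense of what that hex-to-binary round trip involves, here is a sketch done on the application side rather than in a stored function. The helper names are hypothetical, a 16-byte binary column is assumed, and Python is used only as an example language.

```python
import uuid

def to_binary(text: str) -> bytes:
    """Convert the 36-character textual form into 16 raw bytes for storage."""
    return uuid.UUID(text).bytes

def to_text(raw: bytes) -> str:
    """Convert 16 raw bytes read from storage back into the textual form."""
    return str(uuid.UUID(bytes=raw))

# Sample value: the URL namespace UUID defined in RFC 4122.
text = "6ba7b811-9dad-11d1-80b4-00c04fd430c8"
raw = to_binary(text)

print(len(text), len(raw))   # 36 characters versus 16 bytes
print(to_text(raw) == text)  # the round trip is lossless
```

Doing the conversion in the application keeps the queries themselves free of conversion calls, at the cost of IDs that are unreadable when you inspect the table by hand.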
If you do want to switch to Postgres, be very careful about installing it on your existing server(s) if they are production servers. As in... make a clone to test the migration process before actually doing a migration. I somehow managed to kill my system after a warning along the lines of "installing this package will remove a large number of important other packages" (I don't know how the installer made those decisions).
Alternatively, go with Microsoft SQL Server for its GUID equivalent, if you're prepared to eventually pay them lots of money to operate a DB...
Doing UUID and MySQL just tends to be a bad idea at the moment.