Instagram's Postgres method of implementing custom Ids for Sharding is great, but I need the implementation in MySQL.
So, I converted the method found at the bottom of this blog, here: http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
MySQL Version:
CREATE TRIGGER shard_insert BEFORE INSERT ON tablename
FOR EACH ROW BEGIN
DECLARE seq_id BIGINT;
DECLARE now_millis BIGINT;
DECLARE our_epoch BIGINT DEFAULT 1314220021721;
DECLARE shard_id INT DEFAULT 1;
SET now_millis = (SELECT UNIX_TIMESTAMP(NOW(3)) * 1000);
SET seq_id = (SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE TABLE_SCHEMA = "dbname" AND TABLE_NAME = "tablename");
SET NEW.id = (SELECT ((now_millis - our_epoch) << 23) | (shard_id << 10) | (SELECT MOD(seq_id, 1024)));
END
The table looks roughly like this:
CREATE TABLE tablename (
id BIGINT AUTO_INCREMENT,
...
)
Question:
- There is a concurrency problem here. When spawning 100 threads and running inserts, I am getting duplicate sequence values, meaning two triggers are getting the same auto_increment value. How can I fix this?
I tried creating a new table, e.g. "tablename_seq", with one row, a counter to store my own auto_increment values, then doing updates to that table inside the TRIGGER, but the problem is I can't LOCK the table in a Stored Procedure (trigger), so I have the exact same problem, I can't guarantee a counter to be unique between triggers :(.
I'm stumped and really would appreciate any tips!
Possible Solution:
- MySQL 5.6 has UUID_SHORT() which generates unique incrementing values which are guaranteed to be unique. It appears in practice when calling this that each call increments the value +1. By using: SET seq_id = (SELECT UUID_SHORT()); it appears to remove the concurrency problem. The side effect of this is that now (roughly) no more than 1024 inserts can happen per millisecond in the entire system. If more do, then it's possible for a DUPLICATE PRIMARY KEY error. The good news is that in benchmarks on my machine, I get ~3,000 inserts/s with or wtihout the trigger contianing UUID_SHORT(), so it doesn't appear to slow it down at all.
The following SQL Fiddle generates an output as shown below:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 45
Server version: 5.5.35-1
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select `id` from `tablename`;
+-------------------+
| id |
+-------------------+
| 11829806563853313 |
| 11829806563853314 |
| 11829806563853315 |
| 11829806563853316 |
| 11829806563853317 |
| 11829806563853318 |
| 11829806563853319 |
| 11829806563853320 |
| 11829806563853321 |
| 11829806563853322 |
| 11829806563853323 |
| 11829806563853324 |
| 11829806563853325 |
| 11829806563853326 |
| 11829806563853327 |
| 11829806563853328 |
| 11829806563853329 |
| 11829806563853330 |
| 11829806563853331 |
| 11829806563853332 |
| 11829806563853333 |
| 11829806563853334 |
| 11829806563853335 |
| 11829806563853336 |
| 11829806563853337 |
| 11829806563853338 |
| 11829806563853339 |
| 11829806563853340 |
| 11829806563853341 |
| 11829806563853342 |
| 11829806563853343 |
| 11829806563853344 |
| 11829806563853345 |
| 11829806563853346 |
| 11829806563853347 |
| 11829806563853348 |
| 11829806563853349 |
| 11829806563853350 |
| 11829806563853351 |
| 11829806563853352 |
+-------------------+
40 rows in set (0.01 sec)
Accept the answer if it really solves your need.
UPDATE
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 46
Server version: 5.5.35-1
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> DELIMITER //
mysql> DROP FUNCTION IF EXISTS `nextval`//
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> DROP TRIGGER IF EXISTS `shard_insert`//
Query OK, 0 rows affected (0.00 sec)
mysql> DROP TABLE IF EXISTS `tablename_seq`, `tablename`;
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE `tablename_seq` (
-> `seq` BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY
-> )//
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE `tablename` (
-> `id` BIGINT UNSIGNED PRIMARY KEY
-> )//
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE FUNCTION `nextval`()
-> RETURNS BIGINT UNSIGNED
-> DETERMINISTIC
-> BEGIN
-> DECLARE `_last_insert_id` BIGINT UNSIGNED;
-> INSERT INTO `tablename_seq` VALUES (NULL);
-> SET `_last_insert_id` := LAST_INSERT_ID();
-> DELETE FROM `tablename_seq`
-> WHERE `seq` = `_last_insert_id`;
-> RETURN `_last_insert_id`;
-> END//
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TRIGGER `shard_insert` BEFORE INSERT ON `tablename`
-> FOR EACH ROW
-> BEGIN
-> DECLARE `seq_id`, `now_millis` BIGINT UNSIGNED;
-> DECLARE `our_epoch` BIGINT UNSIGNED DEFAULT 1314220021721;
-> DECLARE `shard_id` INT UNSIGNED DEFAULT 1;
-> SET `now_millis` := `our_epoch` + UNIX_TIMESTAMP();
-> SET `seq_id` := `nextval`();
-> SET NEW.`id` := (SELECT (`now_millis` - `our_epoch`) << 23 |
-> `shard_id` << 10 |
-> MOD(`seq_id`, 1024)
-> );
-> END//
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO `tablename`
-> VALUES
-> (0), (0), (0), (0), (0),
-> (0), (0), (0), (0), (0),
-> (0), (0), (0), (0), (0),
-> (0), (0), (0), (0), (0),
-> (0), (0), (0), (0), (0),
-> (0), (0), (0), (0), (0),
-> (0), (0), (0), (0), (0),
-> (0), (0), (0), (0), (0)//
Query OK, 40 rows affected (0.00 sec)
Records: 40 Duplicates: 0 Warnings: 0
mysql> DELIMITER ;
mysql> SELECT `id` FROM `tablename`;
+-------------------+
| id |
+-------------------+
| 12581084357198849 |
| 12581084357198850 |
| 12581084357198851 |
| 12581084357198852 |
| 12581084357198853 |
| 12581084357198854 |
| 12581084357198855 |
| 12581084357198856 |
| 12581084357198857 |
| 12581084357198858 |
| 12581084357198859 |
| 12581084357198860 |
| 12581084357198861 |
| 12581084357198862 |
| 12581084357198863 |
| 12581084357198864 |
| 12581084357198865 |
| 12581084357198866 |
| 12581084357198867 |
| 12581084357198868 |
| 12581084357198869 |
| 12581084357198870 |
| 12581084357198871 |
| 12581084357198872 |
| 12581084357198873 |
| 12581084357198874 |
| 12581084357198875 |
| 12581084357198876 |
| 12581084357198877 |
| 12581084357198878 |
| 12581084357198879 |
| 12581084357198880 |
| 12581084357198881 |
| 12581084357198882 |
| 12581084357198883 |
| 12581084357198884 |
| 12581084357198885 |
| 12581084357198886 |
| 12581084357198887 |
| 12581084357198888 |
+-------------------+
40 rows in set (0.00 sec)
See db-fiddle.
An alternative is to grab blocks of auto increment numbers. If you set MySQLs auto increment increment to something like 1000, a process can do an insert in the "sequence" table and get the auto increment value. The process then knows that it has 1000 sequential numbers it can use, starting at that number, that will be free of conflicts. There is no need to record every increment in a central table if all you are recording is a number.
This is most commonly used in multiple master setups in addition to the auto increment offset. You could go the multiple master route too, and insert on the different masters. The auto increment increment and offset would assure no conflicts. This would require solid knowledge of MySQL replication.
来源:https://stackoverflow.com/questions/25677554/can-auto-increment-be-safely-used-in-a-before-trigger-in-mysql