So, one of my tables in MySQL which uses the InnoDB storage engine will contain multi-billion rows(with potentially no limit to how many will be inserted).
Can you t
A simple answer to your question would be yes InnoDB would be the perfect choice for a multi-billion row data set.
There is a host of optimization that is possbile.
The most obvious optimizations would be setting a large buffer pool, as buffer pool is the single most important thing when it comes to InnoDB because InnoDB buffers the data as well as the index in the buffer pool. If you have a dedicated MySQL server with only InnoDB tables, then you should set upto 80% of the available RAM to be used by InnoDB.
Another most important optimization is having proper indexes on the table (keeping in mind the data access/update pattern), both primary and secondary. (Remember that primary indexes are automatically appended to secondary indexes).
With InnoDB there are some extra goodies, such as protection from data corruption, auto-recovery etc.
As for increasing write-performance, you should setup your transaction log files to be upto a total of 4G.
One other thing that you can do is partition the table.
You can eek out more performance, by setting the bin-log-format to "row", and setting the auto_inc_lock_mode to 2 (that will ensure that innodb does not hold table level locks when inserting into auto-increment columns).
If you need any specific advice you can contact me, I would be more than willing to help.
You may want to have a look at:
http://www.mysqlperformanceblog.com/2006/06/09/why-mysql-could-be-slow-with-large-tables/
http://forums.whirlpool.net.au/archive/954126
If you have a very large table (Billions of records) and need to data mine the table (queries that read lots of data), mysql can slow to a crawl. Large databases (200+GB) are fine, but they are bound by IO/ temp table to disk and multiple other issues when attempting to read large groups that don't fit in memory.
No idea on the MySQL/InnoDB part (I'd assume it'll cope). But if you end up looking at alternatives, PostgreSQL can manage a DB of unlimited size on paper. (At least one 32TB database exists according to the FAQ.)
Can you tell me what sort of optimizations i can do to help speed up things?
Your milage will vary depending on your application. But with billions of rows, you're at least looking into partitioning your data, in order to work on smaller tables.
In the case of PostgreSQL, you'd also look into creating partial indexes where appropriate.
optimizations
MySQL as of version 5.4 does not support partial indexes.[3] In MySQL, the term "partial index" is sometimes used to refer to prefix indexes, where only a truncated prefix of each value is stored in the index. This is another technique for reducing index size.[4]