How is data compression more effective than indexing for search performance?

一生所求 2021-01-14 18:53

For our application, we keep large amounts of data indexed by three integer columns (source, type and time). Loading significant chunks of that data can take some time and w

2 Answers
  •  悲哀的现实
    2021-01-14 19:39

    This made me wonder if the performance impact of disk I/O is actually much heavier than I thought.

    Definitely. If you have to go to disk, each access is many orders of magnitude slower than reading from memory. This reminds me of the classic Jim Gray paper, Distributed Computing Economics:

    Computing economics are changing. Today there is rough price parity between (1) one database access, (2) ten bytes of network traffic, (3) 100,000 instructions, (4) 10 bytes of disk storage, and (5) a megabyte of disk bandwidth. This has implications for how one structures Internet-scale distributed computing: one puts computing as close to the data as possible in order to avoid expensive network traffic.

    The question, then, is how much data do you have and how much memory can you afford?

    And if the database gets really big -- as in nobody could ever afford that much memory, even in 20 years -- you need clever distributed database systems like Google's BigTable or Hadoop.
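
    As a rough illustration of why compression can pay off when disk I/O dominates, here is a back-of-envelope sketch. The function names and every number in it (disk throughput, compression ratio, decompression speed, row size) are hypothetical assumptions chosen for the arithmetic, not measurements from the question; plug in your own figures.

```python
def load_time_seconds(raw_bytes, disk_bytes_per_s, ratio=1.0,
                      decompress_bytes_per_s=None):
    """Estimate the time to load raw_bytes of data.

    ratio is the assumed compression ratio (raw size / on-disk size).
    decompress_bytes_per_s is the assumed decompression speed measured
    in bytes of output; None means the data is stored uncompressed.
    """
    on_disk_bytes = raw_bytes / ratio
    io_time = on_disk_bytes / disk_bytes_per_s
    cpu_time = 0.0
    if decompress_bytes_per_s is not None:
        cpu_time = raw_bytes / decompress_bytes_per_s
    return io_time + cpu_time


def fits_in_memory(rows, bytes_per_row, ram_bytes):
    """Check whether the whole data set could be held in RAM."""
    return rows * bytes_per_row <= ram_bytes


# Hypothetical figures: 1 GB of raw data, a disk that streams 100 MB/s,
# a 4:1 compression ratio, and decompression at 500 MB/s of output.
RAW = 1_000_000_000
uncompressed = load_time_seconds(RAW, 100_000_000)
compressed = load_time_seconds(RAW, 100_000_000, ratio=4.0,
                               decompress_bytes_per_s=500_000_000)

# Hypothetical sizing: 500 million rows of three 32-bit integers (12 bytes)
# against 8 GB of RAM.
all_in_ram = fits_in_memory(500_000_000, 12, 8_000_000_000)

print(f"uncompressed: {uncompressed:.1f} s, compressed: {compressed:.1f} s")
print(f"fits in memory: {all_in_ram}")
```

    Under these assumed numbers the compressed load wins because the saved disk time outweighs the extra CPU time. With a much faster disk or a slower decompressor the balance can tip the other way.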
