What is considered a “large” table in SQL Server?

独厮守ぢ 2021-02-01 04:47

I have a table with 10 million records in it. Is that considered a lot of records? Should I be worried about search times? If not, it will keep growing, so what is cons

6 Answers
  • 2021-02-01 05:24

    As Aaron said, it is relative. But maybe I can elaborate some.

    First, one major factor is how large the columns are. If you have a table of nothing but 10 million integers (and there are reasons you might want something like that; look at Tally Tables), then it is not large at all. On the other hand, a denormalized table of merely a hundred rows might take up a lot of space and have massive performance problems if each row contained, say, an integer id column acting as the primary key, followed by a varchar(max) holding HTML and then a sequence of varbinary(max) columns holding the JPEGs used by that HTML.

    So, to get a handle on the size of the table, you need to look at both the number of rows and the size of each row. A more useful metric for size is the space the table takes up. (Assuming this is later than SQL Server 2000, you can right-click the table in SSMS, go to Properties, and then to the Storage page.)
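To make the "rows times row size" point concrete, here is a rough Python sketch. It is only back-of-the-envelope arithmetic: the 5 MiB per row figure for the HTML-and-images table is an assumption picked for illustration, and row overhead, page overhead, and indexes are all ignored.

```python
# Rough, illustrative size estimate: rows x average bytes per row.
# Ignores row/page overhead, indexes, and compression (all assumptions).

def estimate_table_bytes(row_count: int, avg_row_bytes: int) -> int:
    """Return a back-of-the-envelope data size in bytes."""
    return row_count * avg_row_bytes


def to_mib(size_bytes: int) -> float:
    """Convert bytes to mebibytes."""
    return size_bytes / (1024 * 1024)


# 10 million rows holding a single 4-byte int each.
tally_bytes = estimate_table_bytes(10_000_000, 4)

# 100 rows, each ASSUMED to carry about 5 MiB of HTML and images.
blob_bytes = estimate_table_bytes(100, 5 * 1024 * 1024)

print(f"tally table: {to_mib(tally_bytes):.1f} MiB")
print(f"blob table:  {to_mib(blob_bytes):.1f} MiB")
```

Under those assumptions the hundred-row table comes out more than ten times bigger than the 10-million-row one, which is why row count alone tells you little.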

    Of course, it's still hard to say when that will start affecting performance. You will certainly notice a change in performance once the table gets too large to fit in RAM, but that happens frequently with decent-sized datasets, especially if you choose to partially denormalize, and it is not in itself a cause for concern. Indexes that are too large to fit in RAM can be a bigger performance concern, and that can be cause for evaluation. But it's not necessarily a problem, especially if the index is meant to be a covering index for some query and you are working in a RAM-constrained environment (what RAM-constrained means is also relative, but as a rough rule of thumb I would try to put at least 8 GB in even a desktop that was going to do serious work with SQL Server).

    Now, table size certainly can be a factor in search speed, and there are ways to deal with it. But before I talk about those, let me point out that it is normally one of the smaller factors I would look at in terms of performance. I wrote an article about this recently here. Before thinking about table size, I would make sure the queries were optimized and the indexes made sense. I would even look at increasing RAM and getting faster hard drives (SSDs make a difference if you can afford one large enough for your purposes) before worrying about table sizes.

    But, if you want to decrease table size:

    • Normalize. This can actually have some big drawbacks for performance, but it can have some performance advantages and it has big data consistency advantages as well as storage advantages.
    • Consider your datatypes. If you need NVarchar, you need NVarchar. But if varchar will work, then it will use up less space. Same with int vs bigint.
    • Partition. Again, done wrong this can degrade performance instead of improving it, but done right it can help with performance. It can be somewhat tricky to do right so approach with caution.
    • Move old, unnecessary data to an archival warehouse and out of the main system. Of course, this depends on getting the definition of unnecessary data right.
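As a rough illustration of the datatype point in the list above, the sketch below does the arithmetic in Python. The int (4 bytes) and bigint (8 bytes) sizes are SQL Server's fixed sizes; the string sizes assume a single-byte code page for varchar and no supplementary characters for nvarchar, and the 50-character average length is purely hypothetical.

```python
# Fixed storage sizes in bytes for SQL Server integer types.
INT_BYTES = 4
BIGINT_BYTES = 8


def string_bytes(char_count: int, is_unicode: bool) -> int:
    """Approximate data bytes for one variable-length string value.

    Assumes 1 byte per character for varchar (single-byte code page) and
    2 bytes per character for nvarchar (no supplementary characters).
    The 2-byte length overhead is ignored because both types share it.
    """
    return char_count * (2 if is_unicode else 1)


ROWS = 10_000_000   # row count from the question
AVG_CHARS = 50      # HYPOTHETICAL average string length

# Bytes saved across the table by choosing the smaller type.
key_savings = ROWS * (BIGINT_BYTES - INT_BYTES)
text_savings = ROWS * (
    string_bytes(AVG_CHARS, True) - string_bytes(AVG_CHARS, False)
)

print(f"int instead of bigint:       {key_savings / 1_000_000:.0f} MB saved")
print(f"varchar instead of nvarchar: {text_savings / 1_000_000:.0f} MB saved")
```

The same savings apply again to every index that includes the column, so the real difference is usually larger than this estimate.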

    Summary:

    This got longer than I expected, so to summarize:

    1. What is large is relative, but you have to consider the column size along with the number of rows.
    2. The table size can definitely affect performance, but lots of other things affect it more, so I wouldn't look there first or even second.
    3. If you must reduce table size, basically get rid of data you don't need, and reallocate other data to other places. But you have to be smart about how or you can do more harm than good.
  • 2021-02-01 05:30

    Ditto other posters: how "large" is depends on what your data is, what kind of queries you want to run, what your hardware is, and what your definition of a reasonable search time is.

    But here's one way to define "large": a "large" table is one that exceeds the amount of real memory the host can allocate to SQL Server. SQL Server is perfectly capable of working with tables that greatly exceed physical memory in size, but any time a query requires a table scan (i.e., reading every record) of such a table, you will get clobbered. Ideally you want to keep the entire table in memory; if that is not possible, you at least want to keep the necessary indexes in memory. If you have an index that supports your query and you can keep that index in RAM, performance will still scale pretty well.
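That definition can be written down as a tiny Python sketch. The function name, the labels it returns, and the 6 GiB index size are all made up for illustration; in practice you would plug in the sizes you measure and the memory you have actually given SQL Server.

```python
GIB = 1024 ** 3


def memory_fit(table_bytes: int, index_bytes: int, sql_memory_bytes: int) -> str:
    """Classify a table by what fits in the memory available to SQL Server."""
    if table_bytes <= sql_memory_bytes:
        return "table fits in memory"
    if index_bytes <= sql_memory_bytes:
        return "only the index fits in memory"
    return "neither fits in memory"


# A 100 GiB table with a HYPOTHETICAL 6 GiB supporting index,
# checked against three different memory budgets.
small_box = memory_fit(100 * GIB, 6 * GIB, 4 * GIB)
medium_box = memory_fit(100 * GIB, 6 * GIB, 64 * GIB)
large_box = memory_fit(100 * GIB, 6 * GIB, 128 * GIB)

print(small_box)
print(medium_box)
print(large_box)
```

By this yardstick the same table is "large" on one machine and not on another, which is the whole point.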

    If it is not obvious to you as a designer what your clustered index (physical arrangement of data) and non-clustered indexes (pointers to the clustered index, essentially) should be, SQL Server comes with very good profiling tools that will help you define indexes in appropriate ways for your workload.

    Finally, consider throwing hardware at the problem. SQL Server performance is nearly always memory-bound rather than CPU-bound, so don't buy a fast 8-core machine and cripple it with 4 GB of physical memory. If you need reliably low latency from a 100 GB database, consider hosting it on a machine with 64 GB, or even 128 GB, of RAM.

  • 2021-02-01 05:31

    If you have 10 million records in any table, it is time to look into it. If the table is some kind of audit log, that can be OK, but otherwise you have to be careful about performance.

  • 2021-02-01 05:36

    Everything is relative...

    I used to be a DBA for a company that designed, built and hosted marketing databases and it wasn't uncommon for there to be databases with billions of rows. So other databases with millions of rows were considered "small".

    Also, there tend to be a few tables in any schema that have lots of data (e.g. transactions), while others might be smaller look-up tables.

    What I'm getting at is that there is no point at which a table becomes "large".

    If you have a large table then that is certainly a possible candidate for optimisation. I say "possible" as it is perfectly reasonable for a table to become very large but seldom be used for queries (e.g. some kind of history table).

  • 2021-02-01 05:42

    "Large" is like "smart" - it's relative. 10 million rows is a good size, but whether the table is large depends on a number of factors:

    • how many columns and what are their data types?
    • how many indexes?
    • what is the actual size of the table (e.g. number of pages * 8 KB, which you can get from sys.dm_db_partition_stats)?
    • what type of queries are run against it?
    • are individual indexes held in memory or do most queries benefit from a clustered index scan (where, essentially, the whole table needs to be in memory)?
    • how much memory is on the machine?
    • what do you consider large?
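For the "number of pages * 8 KB" item in the list above, the conversion is simple enough to sketch in Python. SQL Server pages are 8 KB (8192 bytes); the page count used here is a made-up figure standing in for whatever you read from a column such as used_page_count in sys.dm_db_partition_stats.

```python
PAGE_BYTES = 8 * 1024  # SQL Server data pages are 8 KB


def pages_to_bytes(page_count: int) -> int:
    """Convert a page count into bytes."""
    return page_count * PAGE_BYTES


def pages_to_gib(page_count: int) -> float:
    """Convert a page count into gibibytes."""
    return pages_to_bytes(page_count) / (1024 ** 3)


# HYPOTHETICAL page count, standing in for a value read from the DMV.
used_pages = 262_144

size_bytes = pages_to_bytes(used_pages)
size_gib = pages_to_gib(used_pages)

print(f"{used_pages} pages = {size_bytes} bytes = {size_gib:.2f} GiB")
```

A figure like this, compared against the memory on the machine, is a more useful measure of "large" than the row count.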

    Search times are not necessarily driven by size per se, but rather the effectiveness of your indexing strategy and the types of queries you're running for searches. If you have things like:

    WHERE description LIKE '%foo%'
    

    Then a normal index is not going to help you whatsoever, and you should start to get worried. You might consider Full-Text Search for cases like this.
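You can see the effect of a leading wildcard without a SQL Server instance. The Python sketch below uses SQLite (from the standard library) purely as a stand-in: its planner and plan output differ from SQL Server's, and the table and index names are invented, but the principle that a pattern starting with % cannot be used for an index seek is the same.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, description TEXT)")
# SQLite's LIKE is case-insensitive by default, so the index uses NOCASE
# to make it eligible for prefix searches.
conn.execute("CREATE INDEX ix_products_description "
             "ON products (description COLLATE NOCASE)")


def plan(sql: str) -> str:
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Prefix pattern: the index can be used to seek to a range of values.
prefix_plan = plan("SELECT id FROM products WHERE description LIKE 'foo%'")

# Leading wildcard: every row (or every index entry) must be examined.
contains_plan = plan("SELECT id FROM products WHERE description LIKE '%foo%'")

print("prefix:  ", prefix_plan)
print("contains:", contains_plan)
```

In SQLite's output, SEARCH means an index seek and SCAN means reading everything; the exact wording varies between SQLite versions. On SQL Server you would compare the actual execution plans instead, looking for an Index Seek versus a scan.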

    10 million rows in a table with a single INT column (e.g. a Numbers table) is nothing. 10 million rows of Products with long descriptions, XML, Geography data, images etc. is quite another.

    There is a reason that the max capacity specifications for SQL Server do not document an upper bound for number of rows in a table.

  • 2021-02-01 05:44

    "Large" is not a useful concept in database design.

    Performance is determined by many things, but the label large is not one of them. Instead, concern yourself with:

    • hardware
    • OS and db configuration
    • schema design
    • indexing
    • query optimization
    • most importantly, testing for yourself on equivalent hardware with equivalent volume of data and under concurrent usage

    Only then will you have an answer that is relevant to you. Beyond this, application design is also a huge factor. N+1 queries and caching can have huge effects on perceived (and real) performance.
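In case the N+1 pattern is unfamiliar, here is a minimal Python sketch of it. It uses an in-memory SQLite database as a stand-in for SQL Server, and the customers/orders schema and the run helper are hypothetical. The point is only the number of round trips: one query per parent row versus a single joined query.

```python
import sqlite3

# In-memory SQLite database with a HYPOTHETICAL schema, as a stand-in.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it as one round trip to the database."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def totals_n_plus_one():
    """One query for the customers, then one more query per customer."""
    result = {}
    for cust_id, name in run("SELECT id, name FROM customers"):
        rows = run("SELECT COALESCE(SUM(total), 0) FROM orders "
                   "WHERE customer_id = ?", (cust_id,))
        result[name] = rows[0][0]
    return result


def totals_single_query():
    """The same result from a single joined, grouped query."""
    rows = run("SELECT c.name, COALESCE(SUM(o.total), 0) "
               "FROM customers c LEFT JOIN orders o ON o.customer_id = c.id "
               "GROUP BY c.id, c.name")
    return dict(rows)


query_count = 0
slow = totals_n_plus_one()
n_plus_one_queries = query_count

query_count = 0
fast = totals_single_query()
single_queries = query_count

print(n_plus_one_queries, single_queries, slow == fast)
```

With three customers the difference is four queries against one; with 10 million rows the per-row version is what makes a table feel "large", regardless of how well it is indexed.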
