\"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.\" (Donald Knuth). My SQL tables are unlikely to conta
Put indexes ONLY if you have to :)
There are times when putting indexes can actually hurt performance, depending on what the table is used for...
So, in other words, you would think about putting indexes on tables when it is necessary as determined by profiling the application.
Knuth's wise words are not applicable to the creation (or not) of indexes, since by adding indexes you are not optimising anything directly: you are providing an index that the DBMSs optimiser may use to optimise some queries. In fact, you could better argue that deciding not to index a small table is premature optimisation, as by doing so you restrict the DBMS optimiser's options!
Different DBMSs will have different guidelines for choosing whether or not to index columns based on various factors including table size, and it is these that should be considered.
What is an example of premature optimisation in databases: "denormalising for performance" before any benchmarking has indicated that the normalised database actually has any performance issues.
Absolutely incorrect. 100% incorrect. Don't put a million pointless indexes, but you do want a Primary Key (in most cases), and you do want it CLUSTERED correctly.
Here's why:
SELECT * FROM MySmallTable <-- No worries... Index won't help
SELECT
*
FROM
MyBigTable INNER JOIN MySmallTable ON... <-- Ahh, now I'm glad I have my index.
Here's a good rule to go by.
"Since I have a TABLE, I'm likely going to want to query it at some time... If I'm going to query it, I'm likely going to do so in a consistent way..." <-- That's how you should index the table.
EDIT: I'm adding this line: If you have a concrete example in mind, I'll show you how to index it, and how much of a savings you'll get from doing so. Please supply a table, and an example of how you plan in using that table.
I guess there is an auto indexing on the primary key of the table which should be sufficient when querying on a table with less data.
So, yes explicit indexes can be avoided in case there is a small data set to be worked upon.
The value of indexes is in speeding reads. For instance, if you are doing lots of SELECTs based on a range of dates in a date column, it makes sense to put an index on that column. And of course, generally you add indexes on any column you're going to be JOINing on with any significant frequency. The efficiency gain is also related to the ratio of the size of your typical recordsets to the number of records (i.e. grabbing 20/2000 records benefits more from indexing than grabbing 90/100 records). A lookup on an unindexed column is essentially a linear search.
The cost of indexes comes on writes, because every INSERT also requires an internal insert to each column index.
So, the answer depends entirely on your application -- if it's something like a dynamic website where the number of reads can be 100x or 1000x the writes, and you're doing frequent, disparate lookups based on data columns, indexing may well be beneficial. But if writes greatly outnumber reads, then your tuning should focus on speeding those queries.
It takes very little time to identify and benchmark a handful of your app's most frequent operations both with and without indexes on the JOIN/WHERE columns, I suggest you do that. It's also smart to monitor your production app and identify the most expensive, and most frequent queries, and focus your optimization efforts on the intersection of those two sets of queries (which could mean indexes or something totally different, like allocating more or less memory for query or join caches).
If the rows have narrow width, and a few thousand rows fit on say 10-20 8K pages, it is unlikely that the SQL optimiser would elect to use an index even if you create one.