Which is faster/best? SELECT * or SELECT column1, column2, column3, etc

清歌不尽 2020-11-21 23:59

I've heard that SELECT * is generally bad practice to use when writing SQL commands because it is more efficient to SELECT columns you specificall

30 answers
  •  暗喜 (original poster)
    2020-11-22 00:30

    Given your specification that you are selecting all columns, there is little difference at this time. Realize, however, that database schemas do change. If you use SELECT * you are going to get any new columns added to the table, even though in all likelihood, your code is not prepared to use or present that new data. This means that you are exposing your system to unexpected performance and functionality changes.
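To make the schema-drift risk concrete, here is a minimal sketch using Python's standard-library sqlite3 module with an in-memory database. The `users` table, its columns, and the added `audit_ts` column are hypothetical names chosen only for illustration; the effect itself is not specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")

# Against the original schema, both queries return two values per row.
star_before = conn.execute("SELECT * FROM users").fetchone()
named_before = conn.execute("SELECT id, name FROM users").fetchone()

# A hypothetical schema change made without touching application code.
conn.execute("ALTER TABLE users ADD COLUMN audit_ts TEXT DEFAULT 'n/a'")

star_after = conn.execute("SELECT * FROM users").fetchone()
named_after = conn.execute("SELECT id, name FROM users").fetchone()

print(len(star_before), len(star_after))    # row width changes for SELECT *
print(len(named_before), len(named_after))  # row width is stable for named columns
```

Application code that unpacks a row as `user_id, name = row` would keep working with the named-column query, but would raise a `ValueError` on the `SELECT *` row once the extra column appears.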

    You may be willing to dismiss this as a minor cost, but realize that columns that you don't need still must be:

    1. Read from database
    2. Sent across the network
    3. Marshalled into your process
    4. (for ADO-type technologies) Saved in a data-table in-memory
    5. Ignored and discarded / garbage-collected

    Item #1 has many hidden costs, including ruling out a potential covering index, causing data-page loads (and server cache thrashing), and incurring row / page / table locks that might otherwise be avoided.
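The covering-index point can be observed directly. The sketch below uses SQLite through Python's sqlite3 module rather than SQL Server, so the plan wording is SQLite's, not what SQL Server would print; the `orders` table and the index name are hypothetical. It asks the planner how it would run the same lookup with a named column list and with `*`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "total REAL, notes TEXT)"
)
# The index holds customer_id and total, but not notes.
conn.execute("CREATE INDEX idx_orders_customer_total ON orders (customer_id, total)")


def plan(sql):
    """Return SQLite's query-plan detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


named_plan = plan("SELECT customer_id, total FROM orders WHERE customer_id = 42")
star_plan = plan("SELECT * FROM orders WHERE customer_id = 42")

print(named_plan)  # can be answered from the index alone
print(star_plan)   # must also visit the table to fetch notes
```

With the named columns the plan reports a covering index, meaning the table rows are never read. With `*`, the `notes` column forces a lookup into the table for every matching row.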

    Balance this against the potential savings of specifying the columns versus an * and the only potential savings are:

    1. Programmer doesn't need to revisit the SQL to add columns
    2. The network-transport of the SQL is smaller / faster
    3. SQL Server query parse / validation time
    4. SQL Server query plan cache

    For item 1, the reality is that you're going to add / change code to use any new column you might add anyway, so it is a wash.

    For item 2, the difference is rarely enough to push you into a different packet-size or number of network packets. If you get to the point where SQL statement transmission time is the predominant issue, you probably need to reduce the rate of statements first.

    For item 3, there are NO savings, as the expansion of the * has to happen anyway, which means consulting the table schema(s). Realistically, listing the columns incurs the same cost, because they have to be validated against the schema. In other words, this is a complete wash.
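As a small illustration that the `*` is resolved against the schema before execution, the sketch below (again SQLite via Python's sqlite3, with a hypothetical `items` table) compares the result-column metadata of the two forms. It shows only that both statements end up with the same resolved column list; it does not measure parse time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, sku TEXT, price REAL)")

# cursor.description carries one entry per result column; the name comes first.
star_cols = [d[0] for d in conn.execute("SELECT * FROM items").description]
named_cols = [d[0] for d in conn.execute("SELECT id, sku, price FROM items").description]

print(star_cols)
print(named_cols)
```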

    For item 4, when you list specific columns, your query plan cache could get larger, but only if you are dealing with different sets of columns (which is not what you've specified). In that case, you do want different cache entries, because you want different plans as needed.

    So, because of the way you specified the question, this all comes down to the issue of resiliency in the face of eventual schema modifications. If you're burning this schema into ROM (it happens), then an * is perfectly acceptable.

    However, my general guideline is that you should only select the columns you need, which means that sometimes it will look like you are asking for all of them, but DBAs and schema evolution mean that some new columns might appear that could greatly affect the query.

    My advice is that you should ALWAYS SELECT specific columns. Remember that you get good at what you do over and over, so just get in the habit of doing it right.

    If you are wondering why a schema might change without code changing, think in terms of audit logging, effective/expiration dates, and other similar things that DBAs add systemically for compliance reasons. Another source of underhanded changes is denormalization done for performance elsewhere in the system, or user-defined fields.
