What is the reason not to use select *?

独厮守ぢ 2020-11-21 07:14

I've seen a number of people claim that you should specifically name each column you want in your select query.

Assuming I'm going to use all of the columns anyway, why should I not use select *?

20 answers
  • 2020-11-21 08:00
    1. In a roundabout way you are breaking the modularity rule about using strict typing wherever possible. Explicit is almost universally better.

    2. Even if you need every column in the table today, more could be added later. Those extra columns will then be pulled down every time you run the query, which could hurt performance, because:

      • you are pulling more data over the wire; and
      • you might defeat the optimizer's ability to answer the query straight from an index (for queries whose columns are all part of that index), forcing a lookup in the table itself.
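To make the index point above concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for whatever engine you use, and the `people` table, its columns, and the `plan` helper are made up for illustration. It asks the engine for its query plan with and without the star.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, last_name TEXT, bio TEXT);
    CREATE INDEX idx_people_last_name ON people (last_name);
""")

def plan(sql):
    """Return the plan detail strings SQLite reports for a query (hypothetical helper)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column

# Only indexed data is requested, so the index alone can answer the query.
explicit_plan = plan("SELECT last_name FROM people WHERE last_name = 'Smith'")

# The bio column is not in the index, so the table has to be visited as well.
star_plan = plan("SELECT * FROM people WHERE last_name = 'Smith'")

print(explicit_plan)
print(star_plan)
```

In SQLite the first plan should mention a covering index and the second should not. Other engines word their plans differently, but the trade-off is the same idea.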

    When TO use select *

    When you explicitly NEED every column in the table, as opposed to needing every column in the table THAT EXISTED AT THE TIME YOU WROTE THE QUERY. For example, if you were writing a DB management app that needed to display the entire contents of the table (whatever they happened to be), you might use that approach.
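A rough sketch of that kind of tool, again with Python's sqlite3 as a stand-in. The `dump_table` function and the `notes` table are hypothetical names. The point is that the code reads the column names from the result instead of assuming them, so a column added later shows up on its own.

```python
import sqlite3

def dump_table(conn, table):
    """Return (column_names, rows) for every column the table has right now."""
    # The table name is interpolated, so it must come from a trusted source.
    cur = conn.execute(f'SELECT * FROM "{table}"')
    names = [col[0] for col in cur.description]
    return names, cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES ('hello')")
names_before, _ = dump_table(conn, "notes")

# A column added after the tool was written is picked up without code changes.
conn.execute("ALTER TABLE notes ADD COLUMN author TEXT")
names_after, rows_after = dump_table(conn, "notes")

print(names_before)
print(names_after)
```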

  • 2020-11-21 08:03

    The essence of the quote of not prematurely optimizing is to go for simple and straightforward code and then use a profiler to point out the hot spots, which you can then optimize to be efficient.

    When you use select *, you make it impossible to profile; you are therefore not writing clear and straightforward code, and you are going against the spirit of the quote. select * is an anti-pattern.


    So selecting columns is not a premature optimization. A few things off the top of my head ....

    1. If you specify columns in a SQL statement, the SQL execution engine will error if that column is removed from the table and the query is executed.
    2. You can more easily scan code where that column is being used.
    3. You should always write queries to bring back the least amount of information.
    4. As others mention, if you use ordinal column access you should never use select *
    5. If your SQL statement joins tables, select * gives you all columns from all tables in the join
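Point 1 in the list above is easy to try out. This is a small sketch in Python with sqlite3 (a stand-in engine; the `users` table and its columns are invented for the example). It simulates a schema change that drops a column and compares what each style of query does afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO users (name, email) VALUES ('Ann', 'ann@example.com')")

# Simulate a schema change that removes the email column.
conn.executescript("""
    DROP TABLE users;
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users (name) VALUES ('Ann');
""")

# The explicit column list fails at the query, where the problem is easy to find.
try:
    conn.execute("SELECT name, email FROM users").fetchall()
    explicit_error = None
except sqlite3.OperationalError as exc:
    explicit_error = str(exc)

# The star query still runs and just returns fewer columns than callers expect.
star_rows = conn.execute("SELECT * FROM users").fetchall()

print(explicit_error)
print(star_rows)
```

The explicit query raises an error naming the missing column, while the star query succeeds and leaves it to the calling code to notice that a value is gone.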

    The corollary is that using select * ...

    1. The columns used by the application are opaque
    2. DBAs and their query profilers are unable to help with your application's poor performance
    3. The code is more brittle when changes occur
    4. Your database and network are suffering because they are bringing back too much data (I/O)
    5. Database engine optimizations are minimal as you're bringing back all data regardless (logical).
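The brittleness point can be sketched with ordinal column access, which an earlier item in this answer warns about. The example below uses Python's sqlite3 as a stand-in; the `star_test` table and both helper functions are made up, loosely following the a/b/c/d columns used elsewhere in this thread.

```python
import sqlite3

def c_by_position(conn):
    # Brittle: assumes column c is always at index 3 of SELECT *.
    return [row[3] for row in conn.execute("SELECT * FROM star_test ORDER BY id")]

def c_by_name(conn):
    # Names the column, so its position in the table does not matter.
    return [row[0] for row in conn.execute("SELECT c FROM star_test ORDER BY id")]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE star_test (id INTEGER PRIMARY KEY, a TEXT, b TEXT, c TEXT);
    INSERT INTO star_test (a, b, c) VALUES ('a1', 'b1', 'c1');
""")
before = c_by_position(conn)

# Recreate the table with a new column d placed before c.
conn.executescript("""
    DROP TABLE star_test;
    CREATE TABLE star_test (id INTEGER PRIMARY KEY, a TEXT, b TEXT, d TEXT, c TEXT);
    INSERT INTO star_test (a, b, d, c) VALUES ('a1', 'b1', 'd1', 'c1');
""")
after_position = c_by_position(conn)
after_name = c_by_name(conn)

print(before, after_position, after_name)
```

After the change, the positional version quietly returns values from d, with no error to point at the cause.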

    Writing correct SQL is just as easy as writing select *. So the real lazy person writes proper SQL because they don't want to revisit the code and try to remember what they were doing when they did it. They don't want to explain every bit of code to the DBAs. They don't want to explain to their clients why the application runs like a dog.

  • 2020-11-21 08:03

    I understand where you're going regarding premature optimization, but that really only goes to a point. The intent is to avoid unnecessary optimization in the beginning. Are your tables unindexed? Would you use nvarchar(4000) to store a zip code?

    As others have pointed out, there are other positives to specifying each column you intend to use in the query (such as maintainability).

  • I actually noticed a strange behaviour when I used select * in views in SQL Server 2005.

    Run the following query and you will see what I mean.

    IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[starTest]') AND type in (N'U'))
    DROP TABLE [dbo].[starTest]
    CREATE TABLE [dbo].[starTest](
        [id] [int] IDENTITY(1,1) NOT NULL,
        [A] [varchar](50) NULL,
        [B] [varchar](50) NULL,
        [C] [varchar](50) NULL
    ) ON [PRIMARY]
    
    GO
    
    insert into dbo.starTest
    select 'a1','b1','c1'
    union all select 'a2','b2','c2'
    union all select 'a3','b3','c3'
    
    go
    IF  EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[dbo].[vStartest]'))
    DROP VIEW [dbo].[vStartest]
    go
    create view dbo.vStartest as
    select * from dbo.starTest
    go
    IF  EXISTS (SELECT * FROM sys.views WHERE object_id = OBJECT_ID(N'[dbo].[vExplicittest]'))
    DROP VIEW [dbo].[vExplicittest]
    go
    create view dbo.[vExplicittest] as
    select a,b,c from dbo.starTest
    go
    
    
    -- Baseline: at this point both views return the same rows.
    select a,b,c from dbo.vStartest
    select a,b,c from dbo.vExplicittest
    
    -- Recreate the table with a new column D placed before C.
    IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[starTest]') AND type in (N'U'))
    DROP TABLE [dbo].[starTest]
    CREATE TABLE [dbo].[starTest](
        [id] [int] IDENTITY(1,1) NOT NULL,
        [A] [varchar](50) NULL,
        [B] [varchar](50) NULL,
        [D] [varchar](50) NULL,
        [C] [varchar](50) NULL
    ) ON [PRIMARY]
    
    GO
    
    insert into dbo.starTest
    select 'a1','b1','d1','c1'
    union all select 'a2','b2','d2','c2'
    union all select 'a3','b3','d3','c3'
    
    -- Run the same two queries again against the views, which have not been rebuilt.
    select a,b,c from dbo.vStartest
    select a,b,c from dbo.vExplicittest
    

    Compare the results of the last two select statements. I believe what you will see is a result of select * referencing columns by index instead of by name.

    If you rebuild the view it will work fine again.

    EDIT

    I have added a separate question, *“select * from table” vs “select colA, colB, etc. from table” interesting behaviour in SQL Server 2005*, to look into that behaviour in more detail.

  • 2020-11-21 08:11

    You might join two tables and use column A from the second table. If you later add a column A to the first table (with the same name but possibly a different meaning), you'll most likely get the values from the first table rather than from the second, as you did before. That won't happen if you explicitly specify the columns you want to select.
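Here is a small sketch of that join scenario, using Python's sqlite3 as a stand-in engine. The `orders` and `customers` tables, the `status` column, and the helper functions are all hypothetical. The client looks up `status` by name in a select * result and takes the first match, which is one common way client code resolves duplicate names (an assumption about the client, not a rule every driver follows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, status TEXT);
    INSERT INTO customers VALUES (1, 'gold');
    INSERT INTO orders VALUES (10, 1);
""")

STAR = "SELECT * FROM orders o JOIN customers c ON c.customer_id = o.customer_id"
EXPLICIT = "SELECT c.status FROM orders o JOIN customers c ON c.customer_id = o.customer_id"

def status_from_star(conn):
    """Look up 'status' by name in a SELECT * result, taking the first match."""
    cur = conn.execute(STAR)
    names = [col[0] for col in cur.description]
    return cur.fetchone()[names.index("status")]

def status_from_explicit(conn):
    """Ask for the customers.status column by its qualified name."""
    return conn.execute(EXPLICIT).fetchone()[0]

star_before = status_from_star(conn)

# Later, the first table gains a column with the same name but a different meaning.
conn.execute("ALTER TABLE orders ADD COLUMN status TEXT DEFAULT 'shipped'")

star_after = status_from_star(conn)
explicit_after = status_from_explicit(conn)

print(star_before, star_after, explicit_after)
```

The select * lookup switches from the customer's status to the order's status without any error, while the qualified column keeps returning the customer's value.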

    Of course specifying the columns also sometimes causes bugs if you forget to add the new columns to every select clause. If the new column is not needed every time the query is executed, it may take some time before the bug gets noticed.

  • 2020-11-21 08:11

    When you're specifying columns, you're also tying yourself into a specific set of columns and making yourself less flexible, making Feuerstein roll over in, well, wherever he is. Just a thought.
