What's the best way to select the minimum value from several columns?

余生分开走 2020-11-27 02:47

Given the following table in SQL Server 2005:

ID   Col1   Col2   Col3
--   ----   ----   ----
1       3     34     76  
2      32    976     24
3       7             


        
19 Answers
  • 2020-11-27 03:30

    I know this question is old, but I still needed an answer and was not happy with the other answers, so I devised my own, which is a twist on @paxdiablo's answer.


    I came from the land of SAP ASE 16.0, and I only needed a peek at statistics of certain data which are IMHO validly stored in different columns of a single row (they represent different times: when the arrival of something was planned, when it was expected once the action started, and finally what the actual time was). Thus I transposed the columns into rows of a temporary table and performed my query over this as usual.

    N.B. Not the one-size-fits-all solution ahead!

    -- Work table: one row per (ID, source column) pair
    CREATE TABLE #tempTable (ID int, columnName varchar(20), dataValue int)
    
    -- Transpose each column into rows, skipping NULLs
    INSERT INTO #tempTable 
      SELECT ID, 'Col1', Col1
        FROM sourceTable
       WHERE Col1 IS NOT NULL
    INSERT INTO #tempTable 
      SELECT ID, 'Col2', Col2
        FROM sourceTable
       WHERE Col2 IS NOT NULL
    INSERT INTO #tempTable 
      SELECT ID, 'Col3', Col3
        FROM sourceTable
       WHERE Col3 IS NOT NULL
    
    -- Aggregate the transposed values per ID
    SELECT ID
         , min(dataValue) AS 'Min'
         , max(dataValue) AS 'Max'
         , max(dataValue) - min(dataValue) AS 'Diff' 
      FROM #tempTable 
      GROUP BY ID
    

    This took some 30 seconds on a source set of 630000 rows and used only index data, so it is not the thing to run in a time-critical process, but for things like a one-time data inspection or an end-of-day report you might be fine (but verify this with your peers or superiors, please!). The main bonus of this style for me was that I could readily use more or fewer columns and change grouping, filtering, etc., especially once the data was copied over.

    The additional data (columnName, maxes, ...) were to aid me in my search, so you might not need them; I left them here to maybe spark some ideas :-).

  • 2020-11-27 03:31

    The best way to do that is probably not to do it - it's strange that people insist on storing their data in a way that requires SQL "gymnastics" to extract meaningful information, when there are far easier ways to achieve the desired result if you just structure your schema a little better :-)

    The right way to do this, in my opinion, is to have the following table:

    ID    Col    Val
    --    ---    ---
     1      1      3
     1      2     34
     1      3     76
    
     2      1     32
     2      2    976
     2      3     24
    
     3      1      7
     3      2    235
     3      3      3
    
     4      1    245
     4      2      1
     4      3    792
    

    with ID/Col as the primary key (and possibly Col as an extra key, depending on your needs). Then your query becomes a simple select id, min(val) from tbl group by id, and you can still treat the individual 'old columns' separately by using where col = 2 in your other queries. This also allows for easy expansion should the number of 'old columns' grow.

    This makes your queries so much easier. The general guideline I tend to use is, if you ever have something that looks like an array in a database row, you're probably doing something wrong and should think about restructuring the data.
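
    As a rough sketch of that layout in T-SQL (the names tbl, Col and Val follow the example above; the constraint name and the column types are assumptions for illustration), the per-ID minimum becomes a plain aggregate:

    ```sql
    -- Hypothetical normalized table: one row per (ID, old column number).
    CREATE TABLE tbl (
        ID  int NOT NULL,
        Col int NOT NULL,
        Val int NOT NULL,
        CONSTRAINT PK_tbl PRIMARY KEY (ID, Col)
    );

    -- Single-row inserts, since SQL Server 2005 has no multi-row VALUES.
    INSERT INTO tbl (ID, Col, Val) VALUES (1, 1, 3);
    INSERT INTO tbl (ID, Col, Val) VALUES (1, 2, 34);
    INSERT INTO tbl (ID, Col, Val) VALUES (1, 3, 76);
    INSERT INTO tbl (ID, Col, Val) VALUES (2, 1, 32);
    INSERT INTO tbl (ID, Col, Val) VALUES (2, 2, 976);
    INSERT INTO tbl (ID, Col, Val) VALUES (2, 3, 24);

    -- Minimum per ID: returns (1, 3) and (2, 24) for the rows above.
    SELECT ID, MIN(Val) AS MinVal
      FROM tbl
     GROUP BY ID;

    -- An individual 'old column' is still easy to reach.
    SELECT ID, Val
      FROM tbl
     WHERE Col = 2;
    ```

    A missing value is simply an absent row here, so no NULL handling is needed in the aggregate.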


    However, if for some reason you can't change those columns, I'd suggest using insert and update triggers and add another column which these triggers set to the minimum on Col1/2/3. This will move the 'cost' of the operation away from the select to the update/insert where it belongs - most database tables in my experience are read far more often than written so incurring the cost on write tends to be more efficient over time.

    In other words, the minimum for a row only changes when one of the other columns changes, so that's when you should be calculating it, not every time you select (which is wasted if the data isn't changing). You would then end up with a table like:

    ID   Col1   Col2   Col3   MinVal
    --   ----   ----   ----   ------
     1      3     34     76        3
     2     32    976     24       24
     3      7    235      3        3
     4    245      1    792        1
    

    Any other option that has to make decisions at select time is usually a bad idea performance-wise, since the data only changes on insert/update. The additional column takes up more space in the DB and makes inserts and updates slightly slower, but selects can be much faster. The preferred approach should depend on your priorities there but, as stated, most tables are read far more often than they're written.
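
    One way such a trigger might look in T-SQL is sketched below. It assumes a table named sourceTable whose ID column is unique; the trigger name is made up for illustration, and the sketch relies on the RECURSIVE_TRIGGERS database option being OFF (the default) so that the trigger's own UPDATE does not fire it again.

    ```sql
    -- Assumption: sourceTable(ID, Col1, Col2, Col3) already exists and ID is unique.
    ALTER TABLE sourceTable ADD MinVal int NULL;
    GO  -- batch separator understood by SSMS and sqlcmd

    -- Hypothetical trigger name; recomputes MinVal for the rows just written.
    CREATE TRIGGER trg_sourceTable_MinVal
    ON sourceTable
    AFTER INSERT, UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;

        UPDATE t
           SET MinVal = (SELECT MIN(v.Val)          -- MIN skips NULL inputs
                           FROM (SELECT i.Col1 AS Val
                                 UNION ALL SELECT i.Col2
                                 UNION ALL SELECT i.Col3) AS v)
          FROM sourceTable AS t
          JOIN inserted AS i ON i.ID = t.ID;        -- only the affected rows
    END;
    GO
    ```

    Rows that existed before the trigger was created keep a NULL MinVal until they are next updated, so a one-time backfill would still be needed.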

  • 2020-11-27 03:32

    Both this question and this question try to answer this.

    The recap is that Oracle has a built-in function for this (LEAST); with SQL Server you are stuck either defining a user-defined function or using CASE statements.
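
    A minimal sketch of the user-defined-function route follows. The function name dbo.MinOf2 and the table name sourceTable are hypothetical, and treating NULL as "no value" rather than propagating it is a design choice of this sketch, not something the linked questions prescribe.

    ```sql
    -- Hypothetical scalar function: smaller of two ints, ignoring NULLs.
    CREATE FUNCTION dbo.MinOf2 (@a int, @b int)
    RETURNS int
    AS
    BEGIN
        RETURN CASE
                   WHEN @a IS NULL THEN @b
                   WHEN @b IS NULL THEN @a
                   WHEN @a <= @b   THEN @a
                   ELSE @b
               END;
    END;
    GO  -- batch separator understood by SSMS and sqlcmd

    -- Nest the calls to cover three columns.
    SELECT ID, dbo.MinOf2(Col1, dbo.MinOf2(Col2, Col3)) AS TheMin
      FROM sourceTable;
    ```

    Scalar functions are evaluated row by row, so on large tables an inline CASE expression may well be faster.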

  • 2020-11-27 03:33

    For multiple columns it's best to use a CASE statement; however, for two numeric columns i and j you can use simple math:

    min(i,j) = (i+j)/2 - abs(i-j)/2

    This formula can be used to get the minimum value of multiple columns, but it's really messy past 2: min(i,j,k) would be min(i,min(j,k)).
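
    The nested form min(i,min(j,k)) might be written in T-SQL roughly as below, assuming a table named sourceTable with int columns Col1 to Col3. The sketch uses the algebraically equivalent (i + j - abs(i - j)) / 2, dividing only once, because with integer columns dividing the two halves separately can be off by one when i + j is negative and odd.

    ```sql
    -- Inner query: minimum of Col2 and Col3; outer query: minimum of that and Col1.
    SELECT s.ID,
           (s.Col1 + s.Min23 - ABS(s.Col1 - s.Min23)) / 2 AS TheMin
      FROM (SELECT ID, Col1,
                   (Col2 + Col3 - ABS(Col2 - Col3)) / 2 AS Min23
              FROM sourceTable) AS s;
    ```

    The result is NULL as soon as any of the three columns is NULL, and the intermediate sums can overflow for values near the limits of int.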

  • 2020-11-27 03:34

    This is brute force but works

    select case when col1 <= col2 and col1 <= col3 then col1
                when col2 <= col1 and col2 <= col3 then col2
                when col3 <= col1 and col3 <= col2 then col3
           end as TheMin
    from [Table] T
    

    ... because min() works only on one column and not across columns.
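
    The CASE above yields NULL whenever one of the columns is NULL, as in row 3 of the question's table. A possible NULL-tolerant variant is sketched here; the table name sourceTable is an assumption, and COALESCE makes each comparison against a missing column trivially true.

    ```sql
    SELECT ID,
           CASE
               -- Col1 wins if it is <= every non-NULL sibling
               WHEN Col1 <= COALESCE(Col2, Col1)
                AND Col1 <= COALESCE(Col3, Col1) THEN Col1
               -- otherwise Col2, by the same test
               WHEN Col2 <= COALESCE(Col1, Col2)
                AND Col2 <= COALESCE(Col3, Col2) THEN Col2
               -- otherwise Col3 (NULL only when all three are NULL)
               ELSE Col3
           END AS TheMin
      FROM sourceTable;
    ```

    For the row (7, NULL, NULL) this returns 7 instead of NULL.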

  • 2020-11-27 03:35
    SELECT ID, Col1, Col2, Col3, 
        (SELECT MIN(Col) FROM (VALUES (Col1), (Col2), (Col3)) AS X(Col)) AS TheMin
    FROM [Table]
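
    Using VALUES as a derived table like this requires SQL Server 2008 or later. On SQL Server 2005, which the question targets, the same idea can be sketched with UNION ALL (the table name sourceTable is an assumption):

    ```sql
    SELECT ID, Col1, Col2, Col3,
           (SELECT MIN(Col)                       -- MIN skips NULL inputs
              FROM (SELECT Col1 AS Col
                    UNION ALL SELECT Col2
                    UNION ALL SELECT Col3) AS X) AS TheMin
      FROM sourceTable;
    ```

    Because the aggregate ignores NULLs, a row such as (7, NULL, NULL) gives 7.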
    