Convert this SQL query to Linq (Not Exists + sub query)

前端 未结 1 1009
执念已碎
执念已碎 2021-01-26 01:42

I would like this SQL to be converted to LINQ. (it shouldl select rows from input which do not exist in table production based on 3 columns. If a column in both tables contains

相关标签:
1条回答
  • First of all - this is not a good SQL query. Every column is wrapped in a non-sargable function which means that the engine won't be able to take advantage of any indexes on any of those columns (assuming you have any).

    Let's start by rewriting this as a semi-decent SQL query:

    SELECT i.*
    FROM Input i
    LEFT JOIN Production p
        ON (p.CustID = i.CustID OR (p.CustID IS NULL AND i.CustID IS NULL))
        AND (p.CustName = i.CustName OR (p.CustName IS NULL AND i.CustName IS NULL))
        AND (p.Household = i.Household OR
            (p.Household IS NULL AND i.Household IS NULL))
    WHERE p.CustID IS NULL
    

    Now having said this, LEFT JOIN / IS NULL is not great for efficiency either, but we don't have much choice here because we're comparing on multiple columns. Based on your column names, I'm starting to wonder if the schema is properly normalized. A CustID should most likely be associated with one and only one CustName - the fact that you have to compare both of these seems a bit odd. And Household - I'm not sure what that is, but if it's a varchar(x)/nvarchar(x) column then I wonder if it might also have a 1:1 relationship with the customer.

    If I'm speculating too much here then feel free to dismiss this paragraph; but just in case, I want to say that if this data isn't properly normalized, normalizing it would make it much easier and faster to query on:

    SELECT *
    FROM Input
    WHERE CustID NOT IN (SELECT CustID FROM Production)
    

    Anyway, going back to the first query, since that's what we have to work with for now. Unfortunately it's impossible to create a join on those specific conditions in Linq, so we need to rewrite the SQL query as something slightly worse (because we now have to read from Input twice):

    SELECT *
    FROM Input
    WHERE <Primary Key> NOT IN
    (
        SELECT i.<Primary Key>
        FROM Input i
        INNER JOIN Production p
        ON (p.CustID = i.CustID OR (p.CustID IS NULL AND i.CustID IS NULL))
        AND (p.CustName = i.CustName OR (p.CustName IS NULL AND i.CustName IS NULL))
        AND (p.Household = i.Household OR
            (p.Household IS NULL AND i.Household IS NULL))
    )
    

    Now we have something we can finally translate to Linq syntax. We still can't do the join explicitly, which would be best, but we go old-school, start from the cartesian join and toss the join conditions into the WHERE segment, and the server will still be able to sort it out:

    var excluded =
        from i in input
        from p in production
        where
            ((p.CustID == i.CustID) || ((p.CustID == null) && (i.CustID == null))) &&
            ((p.CustName == i.CustName) || 
                ((p.CustName == null) && (i.CustName == null))) &&
            ((p.Household == i.Household) ||
                ((p.Household == null) && (i.Household == null)));
        select i.PrimaryKey;
    
    var results =
        from i in input
        where !excluded.Contains(i.PrimaryKey)
        select i;
    

    I'm assuming here that you have some sort of primary key on the table. If you don't, you've got other problems, but you can get around this particular problem using EXCEPT:

    var excluded =
        from i in input
        from p in production
        where
            ((p.CustID == i.CustID) || ((p.CustID == null) && (i.CustID == null))) &&
            ((p.CustName == i.CustName) || 
                ((p.CustName == null) && (i.CustName == null))) &&
            ((p.Household == i.Household) ||
                ((p.Household == null) && (i.Household == null)));
        select i;
    
    var results = input.Except(excluded);
    
    0 讨论(0)
提交回复
热议问题