Does “foreach” cause repeated Linq execution?

自闭症患者 2020-11-30 08:25

I've been working for the first time with the Entity Framework in .NET, and have been writing LINQ queries in order to get information from my model. I would like to progr

8 Answers
  • 2020-11-30 08:56

    For a single pass, the LINQ statement executes the same number of times whether or not you call .ToList() first. Here is an example with colored console output:

    What happens in the code (see code at the bottom):

    • Create a list of 100 ints (0-99).
    • Create a LINQ statement that prints every int from the list followed by two * to the console in red color, and then return the int if it's an even number.
    • Do a foreach on the query, printing out every even number in green color.
    • Do a foreach on the query.ToList(), printing out every even number in green color.

    As you can see in the output below, the number of ints written to the console is the same, meaning the LINQ statement is executed the same number of times.

    The difference is in when the statement is executed. When you foreach over the query directly (without calling .ToList() on it), the source list and the IEnumerable returned by the LINQ statement are enumerated in lockstep: each element is filtered just before the loop body sees it.

    When you cache the results with .ToList() first, the filtering finishes before the loop starts, but the predicate still runs the same number of times.

    This difference is important to understand: if the list is modified after you have defined your LINQ statement, the statement operates on the modified list when it is eventually executed (e.g. by .ToList()). But if you force execution with .ToList() and modify the list afterwards, the result you already hold does not reflect the modification.
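
    A minimal sketch of that point, assuming a plain in-memory List<int> as the source (the class name DeferredDemo and the method Run are made up for illustration and are not part of the example in this answer):

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class DeferredDemo
    {
        // Returns how many evens the deferred query sees versus the ToList() snapshot.
        public static (int Deferred, int Snapshot) Run()
        {
            var source = new List<int> { 1, 2, 3, 4 };

            // Deferred: nothing has been filtered yet.
            IEnumerable<int> evens = source.Where(x => x % 2 == 0);

            // Forced: the filter runs now and the results are copied into a new list.
            List<int> snapshot = evens.ToList();

            // Modify the source after both were defined.
            source.Add(6);

            // The deferred query re-reads the modified source; the snapshot does not.
            return (evens.Count(), snapshot.Count);
        }

        public static void Main()
        {
            var (deferred, snapshot) = Run();
            Console.WriteLine($"deferred query sees {deferred} evens"); // 3
            Console.WriteLine($"snapshot holds {snapshot} evens");      // 2
        }
    }
    ```

    The deferred query picks up the added 6 because it walks the source again each time it is enumerated, while the snapshot was fixed at the moment ToList() ran.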

    Here's the output: [console screenshot, not reproduced here]

    Here's my code:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            IEnumerable<int> ints = Enumerable.Range(0, 100);

            // Deferred query: the lambda runs only when the query is enumerated.
            var query = ints.Where(x =>
            {
                Console.ForegroundColor = ConsoleColor.Red;
                Console.Write($"{x}**, ");
                return x % 2 == 0;
            });

            DoForeach(query, "query");
            DoForeach(query, "query.ToList()");

            Console.ResetColor();
        }

        // Enumerates the collection, materializing it first when the name asks for ToList().
        private static void DoForeach(IEnumerable<int> collection, string collectionName)
        {
            Console.ForegroundColor = ConsoleColor.Yellow;
            Console.WriteLine($"\n--- {collectionName} FOREACH BEGIN: ---");

            // Force execution up front so all red output appears before any green output.
            if (collectionName.Contains("query.ToList()"))
                collection = collection.ToList();

            foreach (var item in collection)
            {
                Console.ForegroundColor = ConsoleColor.Green;
                Console.Write($"{item}, ");
            }

            Console.ForegroundColor = ConsoleColor.Yellow;
            Console.WriteLine($"\n--- {collectionName} FOREACH END ---");
        }
    }
    

    Note about execution time: I did a few timing tests (not enough to post it here though) and I didn't find any consistency in either method being faster than the other (including the execution of .ToList() in the timing). On larger collections, caching the collection first and then iterating it seemed a bit faster, but there was no definitive conclusion from my test.

  • 2020-11-30 09:03

    foreach, by itself, runs through its data exactly once. You can't look ahead or back, or alter the index the way you can with a for loop.

    However, if you have multiple foreach loops in your code, all operating on the same LINQ query, the query may be executed multiple times. This depends entirely on the data source, though. If you're iterating over a LINQ-based IEnumerable/IQueryable that represents a database query, it will run that query each time. If you're iterating over a List or other collection of objects, it will run through the list each time, but won't hit your database repeatedly.

    In other words, this is a property of LINQ, not a property of foreach.

  • 2020-11-30 09:04

    It depends on how the LINQ query is being used.

    var q = {some linq query here}
    
    while (true)
    {
        foreach(var item in q)
        {
        ...
        }
    }
    

    The code above will execute the LINQ query multiple times. Not because of the foreach, but because the foreach is inside another loop, so the foreach itself is being executed multiple times.

    If all consumers of a LINQ query use it "carefully" and avoid mistakes such as the nested loops above, then the query should not be executed multiple times needlessly.

    There are occasions when reducing a LINQ query to an in-memory result set using ToList() is warranted, but in my opinion ToList() is used far, far too often. ToList() almost always becomes a poison pill whenever large data is involved, because it forces the entire result set (potentially millions of rows) to be pulled into memory and cached, even if the outermost consumer/enumerator only needs 10 rows. Avoid ToList() unless you have a very specific justification and you know your data will never be large.
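
    To illustrate the "only needs 10 rows" case, here is a rough sketch. It assumes an in-memory iterator as a stand-in for a large data source; Rows, Produced and LazyTakeDemo are hypothetical names, and a real database query would behave analogously only to the extent that the provider streams results:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class LazyTakeDemo
    {
        // Counts how many items the source actually had to produce.
        public static int Produced;

        // Hypothetical stand-in for a large result set.
        public static IEnumerable<int> Rows(int total)
        {
            for (int i = 0; i < total; i++)
            {
                Produced++;
                yield return i;
            }
        }

        public static (int Lazy, int Eager) Run()
        {
            // Lazy: Take(10) stops pulling from the source after ten items.
            Produced = 0;
            foreach (var row in Rows(1_000_000).Take(10)) { }
            int lazy = Produced;

            // Eager: ToList() drains the whole source before Take(10) is applied.
            Produced = 0;
            foreach (var row in Rows(1_000_000).ToList().Take(10)) { }
            int eager = Produced;

            return (lazy, eager);
        }

        public static void Main()
        {
            var (lazy, eager) = Run();
            Console.WriteLine($"without ToList(): {lazy} items produced");
            Console.WriteLine($"with ToList():    {eager} items produced");
        }
    }
    ```

    Without ToList() the source yields only the ten items the consumer asked for; with it, all one million are produced and held in memory first.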

  • 2020-11-30 09:07

    Try this in LINQPad:

    void Main()
    {
        var testList = Enumerable.Range(1,10);
        var query = testList.Where(x => 
        {
            Console.WriteLine(string.Format("Doing where on {0}", x));
            return x % 2 == 0;
        });
        Console.WriteLine("First foreach starting");
        foreach(var i in query)
        {
            Console.WriteLine(string.Format("Foreached where on {0}", i));
        }
    
        Console.WriteLine("First foreach ending");
        Console.WriteLine("Second foreach starting");
        foreach(var i in query)
        {
            Console.WriteLine(string.Format("Foreached where on {0} for the second time.", i));
        }
        Console.WriteLine("Second foreach ending");
    }
    

    Every time the Where delegate runs, it writes a line to the console, so the output shows exactly when the LINQ query is executing. The second foreach loop still causes "Doing where on" to print, which shows that the second foreach does in fact run the Where clause again, potentially causing a slowdown.

    First foreach starting
    Doing where on 1
    Doing where on 2
    Foreached where on 2
    Doing where on 3
    Doing where on 4
    Foreached where on 4
    Doing where on 5
    Doing where on 6
    Foreached where on 6
    Doing where on 7
    Doing where on 8
    Foreached where on 8
    Doing where on 9
    Doing where on 10
    Foreached where on 10
    First foreach ending
    Second foreach starting
    Doing where on 1
    Doing where on 2
    Foreached where on 2 for the second time.
    Doing where on 3
    Doing where on 4
    Foreached where on 4 for the second time.
    Doing where on 5
    Doing where on 6
    Foreached where on 6 for the second time.
    Doing where on 7
    Doing where on 8
    Foreached where on 8 for the second time.
    Doing where on 9
    Doing where on 10
    Foreached where on 10 for the second time.
    Second foreach ending
    
  • 2020-11-30 09:07

    The difference is in the underlying type. As LINQ is built on top of IEnumerable (or IQueryable) the same LINQ operator may have completely different performance characteristics.

    A List will always be quick to respond, but it takes an upfront effort to build a list.

    An iterator is also IEnumerable and may employ any algorithm every time it fetches the "next" item. This will be faster if you don't actually need to go through the complete set of items.

    You can turn any IEnumerable into a list by calling ToList() on it and storing the resulting list in a local variable. This is advisable if

    • You don't depend on deferred execution.
    • You need to read items more times in total than the set contains, for example by iterating it more than once.
    • You can pay the upfront cost of retrieving and storing all items.
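
    As a rough sketch of the second bullet, the following counts how often a Where predicate runs over two passes, with and without caching. The names CacheDemo and Run are illustrative only:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class CacheDemo
    {
        // Returns predicate call counts for two passes: uncached query vs. cached list.
        public static (int Uncached, int Cached) Run()
        {
            int calls = 0;
            IEnumerable<int> query = Enumerable.Range(1, 10).Where(x =>
            {
                calls++;
                return x % 2 == 0;
            });

            // Two passes over the deferred query: the predicate runs on every pass.
            foreach (var i in query) { }
            foreach (var i in query) { }
            int uncached = calls;

            // ToList() runs the predicate once per element; later passes read the list.
            calls = 0;
            List<int> cached = query.ToList();
            foreach (var i in cached) { }
            foreach (var i in cached) { }
            int cachedCalls = calls;

            return (uncached, cachedCalls);
        }

        public static void Main()
        {
            var (uncached, cached) = Run();
            Console.WriteLine($"uncached: {uncached} predicate calls"); // 20
            Console.WriteLine($"cached:   {cached} predicate calls");   // 10
        }
    }
    ```

    With ten source items and two passes, the deferred query evaluates the predicate 20 times, while the cached list pays for 10 evaluations up front and none afterwards.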
  • 2020-11-30 09:12

    Even without entities, LINQ uses deferred execution: the expression is only evaluated when something forces an iteration. In that sense, each time you enumerate the LINQ expression it is evaluated again.

    With Entity Framework the same applies, but there is more going on. Each time the query is enumerated, Entity Framework sends it to the database. What it keeps in memory is the entities, not the query result: if a returned row matches an entity the context is already tracking, you get the existing in-memory instance back rather than a freshly materialized one.

    This can make your life easier, but it can also be a pain. For instance, if you load all records from a table and some of those rows are changed in the database afterwards, evaluating the same LINQ expression again on the same context will, by default, still hand you the tracked entities with their old values.

    The Entity Framework is a complicated thing. There are of course ways to make it refresh what it holds in its own memory model from the database.

    I suggest reading "Programming Entity Framework" by Julia Lerman. It addresses lots of issues like the one you're having right now.
