Background: I\'ve got a bunch of strings that I\'m getting from a database, and I want to return them. Traditionally, it would be something like this:
public Li
The only way this would cause problems is if the caller abuses the protocol of IEnumerable<T>
. The correct way to use it is to call Dispose
on it when it is no longer needed.
The implementation generated by yield return
takes the Dispose
call as a signal to execute any open finally
blocks, which in your example will call Dispose
on the objects you've created in the using
statements.
There are a number of language features (in particular foreach
) which make it very easy to use IEnumerable<T>
correctly.
It's a balancing act: do you want to force all the data into memory immediately so you can free up the connection, or do you want to benefit from streaming the data, at the cost of tying up the connection for all that time?
The way I look at it, that decision should potentially be up to the caller, who knows more about what they want to do. If you write the code using an iterator block, the caller can very easily turned that streaming form into a fully-buffered form:
List<string> stuff = new List<string>(GetStuff(connectionString));
If, on the other hand, you do the buffering yourself, there's no way the caller can go back to a streaming model.
So I'd probably use the streaming model and say explicitly in the documentation what it does, and advise the caller to decide appropriately. You might even want to provide a helper method to basically call the streamed version and convert it into a list.
Of course, if you don't trust your callers to make the appropriate decision, and you have good reason to believe that they'll never really want to stream the data (e.g. it's never going to return much anyway) then go for the list approach. Either way, document it - it could very well affect how the return value is used.
Another option for dealing with large amounts of data is to use batches, of course - that's thinking somewhat away from the original question, but it's a different approach to consider in the situation where streaming would normally be attractive.
I've bumped into this wall a few times. SQL database queries are not easily streamable like files. Instead, query only as much as you think you'll need and return it as whatever container you want (IList<>
, DataTable
, etc.). IEnumerable
won't help you here.
Dont use yield here. your sample is fine.
As an aside - note that the IEnumerable<T>
approach is essentially what the LINQ providers (LINQ-to-SQL, LINQ-to-Entities) do for a living. The approach has advantages, as Jon says. However, there are definite problems too - in particular (for me) in terms of (the combination of) separation | abstraction.
What I mean here is that:
.ToList()
etc)This ties in a bit with my thoughts here: Pragmatic LINQ.
But I should stress - there are definitely times when the streaming is highly desirable. It isn't a simple "always vs never" thing...