Background: I\'ve got a bunch of strings that I\'m getting from a database, and I want to return them. Traditionally, it would be something like this:
public Li
You're not always unsafe with the IEnumerable. If you leave the framework call GetEnumerator
(which is what most of the people will do), then you're safe. Basically, you're as safe as the carefullness of the code using your method:
class Program
{
static void Main(string[] args)
{
// safe
var firstOnly = GetList().First();
// safe
foreach (var item in GetList())
{
if(item == "2")
break;
}
// safe
using (var enumerator = GetList().GetEnumerator())
{
for (int i = 0; i < 2; i++)
{
enumerator.MoveNext();
}
}
// unsafe
var enumerator2 = GetList().GetEnumerator();
for (int i = 0; i < 2; i++)
{
enumerator2.MoveNext();
}
}
static IEnumerable<string> GetList()
{
using (new Test())
{
yield return "1";
yield return "2";
yield return "3";
}
}
}
class Test : IDisposable
{
public void Dispose()
{
Console.WriteLine("dispose called");
}
}
Whether you can affort to leave the database connection open or not depends on your architecture as well. If the caller participates in an transaction (and your connection is auto enlisted), then the connection will be kept open by the framework anyway.
Another advantage of yield
is (when using a server-side cursor), your code doesn't have to read all data (example: 1,000 items) from the database, if your consumer wants to get out of the loop earlier (example: after the 10th item). This can speed up querying data. Especially in an Oracle environment, where server-side cursors are the common way to retrieve data.
You are not missing anything. Your sample shows how NOT to use yield return. Add the items to a list, close the connection, and return the list. Your method signature can still return IEnumerable.
Edit: That said, Jon has a point (so surprised!): there are rare occasions where streaming is actually the best thing to do from a performance perspective. After all, if it's 100,000 (1,000,000? 10,000,000?) rows we're talking about here, you don't want to be loading that all into memory first.
Slightly more concise way to force evaluation of iterator:
using System.Linq;
//...
var stuff = GetStuff(connectionString).ToList();
No, you are on the right path... the yield will lock the reader... you can test it doing another database call while calling the IEnumerable
You could always use a separate thread to buffer the data (perhaps to a queue) while also doing a yeild to return the data. When the user requests data (returned via a yeild), an item is removed from the queue. Data is also being continuously added to the queue via the separate thread. That way, if the user requests the data fast enough, the queue is never very full and you do not have to worry about memory issues. If they don't, then the queue will fill up, which may not be so bad. If there is some sort of limitation you would like to impose on memory, you could enforce a maximum queue size (at which point the other thread would wait for items to be removed before adding more to the queue). Naturally, you will want to make sure you handle resources (i.e., the queue) correctly between the two threads.
As an alternative, you could force the user to pass in a boolean to indicate whether or not the data should be buffered. If true, the data is buffered and the connection is closed as soon as possible. If false, the data is not buffered and the database connection stays open as long as the user needs it to be. Having a boolean parameter forces the user to make the choice, which ensures they know about the issue.
What you can do is use a SqlDataAdapter instead and fill a DataTable. Something like this:
public IEnumerable<string> GetStuff(string connectionString)
{
DataTable table = new DataTable();
using (SqlConnection sqlConnection = new SqlConnection(connectionString))
{
string commandText = "GetStuff";
using (SqlCommand sqlCommand = new SqlCommand(commandText, sqlConnection))
{
sqlCommand.CommandType = CommandType.StoredProcedure;
SqlDataAdapter dataAdapter = new SqlDataAdapter(sqlCommand);
dataAdapter.Fill(table);
}
}
foreach(DataRow row in table.Rows)
{
yield return row["myImportantColumn"].ToString();
}
}
This way, you're querying everything in one shot, and closing the connection immediately, yet you're still lazily iterating the result. Furthermore, the caller of this method can't cast the result to a List and do something they shouldn't be doing.