Can someone suggest a way to create batches of a certain size in LINQ?
Ideally I want to be able to perform operations in chunks of some configurable amount.
I'm joining this very late, but I found something interesting: we can use Skip and Take here for better performance.
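using System.Collections.Generic;
using System.Linq;

public static class MyExtensions
{
    // Splits items into consecutive batches of at most maxItems elements
    // by grouping each element on (index / maxItems).
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> items, int maxItems)
    {
        return items.Select((item, index) => new { item, index })
                    .GroupBy(x => x.index / maxItems)
                    .Select(g => g.Select(x => x.item));
    }

    // Returns a single batch: skips the first `skip` items, then takes up to `take` items.
    public static IEnumerable<T> Batch2<T>(this IEnumerable<T> items, int skip, int take)
    {
        return items.Skip(skip).Take(take);
    }
}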
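For illustration, here is a small sketch of how the two helpers could be called on a short list. The `BatchDemo` class and the sample numbers are made up for this example, and the extension methods are repeated only so that the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MyExtensions
{
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> items, int maxItems)
    {
        return items.Select((item, index) => new { item, index })
                    .GroupBy(x => x.index / maxItems)
                    .Select(g => g.Select(x => x.item));
    }

    public static IEnumerable<T> Batch2<T>(this IEnumerable<T> items, int skip, int take)
    {
        return items.Skip(skip).Take(take);
    }
}

// Hypothetical demo class, not part of the original answer.
public static class BatchDemo
{
    public static void Main()
    {
        List<int> numbers = Enumerable.Range(1, 10).ToList();

        // GroupBy version: a single call yields every batch.
        foreach (var batch in numbers.Batch(4))
        {
            Console.WriteLine(string.Join(",", batch));
        }
        // Prints: 1,2,3,4 then 5,6,7,8 then 9,10

        // Skip/Take version: the caller advances the offset for each batch.
        for (int start = 0; start < numbers.Count; start += 4)
        {
            Console.WriteLine(string.Join(",", numbers.Batch2(start, 4)));
        }
        // Prints the same three lines.
    }
}
```

If you are on .NET 6 or later, the built-in Enumerable.Chunk method covers the same use case as Batch and returns each batch as an array, so it may be worth trying before writing a custom extension.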
Next, I checked with 100,000 records. The looping alone takes more time in the case of Batch.
Code of the console application:
// Needs: using System; using System.Collections.Generic; using System.Diagnostics;
static void Main(string[] args)
{
    List<string> Ids = GetData("First");
    List<string> Ids2 = GetData("tsriF");

    // Time the GroupBy-based Batch. Iterating the outer sequence makes
    // GroupBy read the whole list before the first batch is returned.
    Stopwatch FirstWatch = new Stopwatch();
    FirstWatch.Start();
    foreach (var batch in Ids2.Batch(5000))
    {
        // Console.WriteLine("Batch Output:= " + string.Join(",", batch));
    }
    FirstWatch.Stop();
    Console.WriteLine("Done Processing time taken:= " + FirstWatch.Elapsed.ToString());

    // Time the Skip/Take-based Batch2. Skip and Take are deferred, so SecBatch
    // is only enumerated if the Console.WriteLine below is uncommented.
    Stopwatch Second = new Stopwatch();
    Second.Start();
    int Length = Ids2.Count;
    int StartIndex = 0;
    int BatchSize = 5000;
    while (Length > 0)
    {
        var SecBatch = Ids2.Batch2(StartIndex, BatchSize);
        // Console.WriteLine("Second Batch Output:= " + string.Join(",", SecBatch));
        Length = Length - BatchSize;
        StartIndex += BatchSize;
    }
    Second.Stop();
    Console.WriteLine("Done Processing time taken Second:= " + Second.Elapsed.ToString());
    Console.ReadKey();
}

// Builds 100000 sample strings such as "First 0", "First 1", and so on.
static List<string> GetData(string name)
{
    List<string> Data = new List<string>();
    for (int i = 0; i < 100000; i++)
    {
        Data.Add(string.Format("{0} {1}", name, i.ToString()));
    }
    return Data;
}
The time taken was as follows:
First (Batch): 00:00:00.0708, 00:00:00.0660
Second (Skip and Take): 00:00:00.0008, 00:00:00.0008
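One thing to keep in mind when reading these numbers: Skip and Take are deferred, so in the loop above the Batch2 result is never enumerated while its output line stays commented out, whereas GroupBy has to read the whole list as soon as the first batch is requested. A sketch of a comparison that forces both versions to produce every element (via ToList) might look like the following. The `BatchTiming` class and the `Measure` helper are hypothetical names introduced for this example, and no timings are quoted because they depend on the runtime and the machine.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical class, not part of the original answer.
public static class BatchTiming
{
    // Hypothetical helper: runs the action once and returns the elapsed time.
    public static TimeSpan Measure(Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        List<string> ids = Enumerable.Range(0, 100000)
                                     .Select(i => "tsriF " + i)
                                     .ToList();
        const int batchSize = 5000;

        // GroupBy-based batching, with every batch materialized.
        int groupedCount = 0;
        TimeSpan grouped = Measure(() =>
        {
            var batches = ids.Select((item, index) => new { item, index })
                             .GroupBy(x => x.index / batchSize)
                             .Select(g => g.Select(x => x.item));
            foreach (var batch in batches)
            {
                groupedCount += batch.ToList().Count;
            }
        });

        // Skip/Take-based batching, with every batch materialized.
        int slicedCount = 0;
        TimeSpan sliced = Measure(() =>
        {
            for (int start = 0; start < ids.Count; start += batchSize)
            {
                slicedCount += ids.Skip(start).Take(batchSize).ToList().Count;
            }
        });

        // Both counts should equal the list size, which confirms equal work was done.
        Console.WriteLine("GroupBy:   {0} items in {1}", groupedCount, grouped);
        Console.WriteLine("Skip/Take: {0} items in {1}", slicedCount, sliced);
    }
}
```

Calling ToList on each batch is a deliberate choice here: it makes both loops touch all 100,000 items, so the two elapsed times describe comparable work.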