I am looking for a better pattern for working with a list of elements which each need to be processed and then, depending on the outcome, are removed from the list.
The cost of removing an item from the list is proportional to the number of items following the one to be removed. In the case where the first half of the items qualify for removal, any approach which is based upon removing items individually will end up having to perform about N*N/4 item-copy operations, which can get very expensive if the list is large.
A faster approach is to scan through the list to find the first item to be removed (if any), and then from that point forward copy each item which should be retained to the spot where it belongs. Once this is done, if R items should be retained, the first R items in the list will be those R items, and all of the items requiring deletion will be at the end. If those items are deleted in reverse order, the system won't end up having to copy any of them, so if the list had N items of which R items, including all of the first F, were retained, it will be necessary to copy R-F items, and shrink the list by one item N-R times. All linear time.
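The scan-and-compact approach described above could be sketched roughly as follows. The class and method names (`ListCompaction`, `RemoveWhere`) are hypothetical and chosen only for illustration; in practice `List<T>.RemoveAll` already gives the same linear-time behavior.

```csharp
using System;
using System.Collections.Generic;

public static class ListCompaction
{
    // Hypothetical helper: removes every item for which shouldRemove
    // returns true and reports how many items were removed.
    public static int RemoveWhere<T>(List<T> list, Predicate<T> shouldRemove)
    {
        // Find the first item to be removed; everything before it stays put.
        int write = 0;
        while (write < list.Count && !shouldRemove(list[write]))
        {
            write++;
        }
        if (write == list.Count)
        {
            return 0; // nothing qualifies for removal
        }

        // From that point forward, copy each retained item down to its final slot.
        for (int read = write + 1; read < list.Count; read++)
        {
            if (!shouldRemove(list[read]))
            {
                list[write++] = list[read];
            }
        }

        // The leftovers are now at the end; delete them in reverse order
        // so that no further element copying is needed.
        int removed = list.Count - write;
        for (int i = list.Count - 1; i >= write; i--)
        {
            list.RemoveAt(i);
        }
        return removed;
    }
}
```

Each item is tested once and copied at most once, which is where the linear bound comes from.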
Copy the list you want to modify. Then iterate over the copy and remove from the original. Going backwards is confusing and doesn't work well when looping in parallel.
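var ids = new List<int> { 1, 2, 3, 4 };
var iterableIds = ids.ToList(); // snapshot to iterate over
var gate = new object();
Parallel.ForEach(iterableIds, id =>
{
    // List<T> is not thread-safe, so serialize the mutation.
    lock (gate)
    {
        ids.Remove(id);
    }
});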
I wish the "pattern" was something like this:
foreach( thing in thingpile )
{
    if( /* condition#1 */ )
    {
        foreach.markfordeleting( thing );
    }
    elseif( /* condition#2 */ )
    {
        foreach.markforkeeping( thing );
    }
}
foreachcompleted
{
    // then the programmer's choices would be:
    // delete everything that was marked for deleting
    foreach.deletenow(thingpile);
    // ...or... keep only things that were marked for keeping
    foreach.keepnow(thingpile);
    // ...or even... make a new list of the unmarked items
    others = foreach.unmarked(thingpile);
}
This would align the code with the process that goes on in the programmer's brain.
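Something close to this wish can be approximated in existing C# by collecting marks during the loop and acting on them afterwards. The sketch below is only an illustration: the class name `MarkThenSweep` and the "negative numbers" condition are made up, and because it marks by value, every element equal to a marked value is deleted.

```csharp
using System;
using System.Collections.Generic;

public static class MarkThenSweep
{
    // Hypothetical example: mark negative numbers during the loop,
    // then delete everything that was marked once the loop is done.
    public static void DeleteNegatives(List<int> thingpile)
    {
        var markedForDeleting = new HashSet<int>();

        foreach (int thing in thingpile)
        {
            if (thing < 0) // stands in for "condition#1"
            {
                markedForDeleting.Add(thing); // thingpile itself is not modified here
            }
        }

        // The "foreachcompleted" step: enumeration is over, so mutation is safe.
        thingpile.RemoveAll(markedForDeleting.Contains);
    }
}
```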
Select the elements you do want rather than trying to remove the elements you don't want. This is so much easier (and generally more efficient too) than removing elements.
var newSequence = (from el in list
                   where el.Something || el.AnotherThing < 0
                   select el);
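One detail worth keeping in mind is that a LINQ query like the one above is evaluated lazily. If a concrete list is needed, the result can be materialized with `ToList()`. A minimal sketch, using a hypothetical `KeepWanted.KeepPositive` helper and a made-up "positive numbers" condition:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeepWanted
{
    // Returns a new list containing only the elements we want to keep.
    // The source sequence is left untouched.
    public static List<int> KeepPositive(IEnumerable<int> source)
    {
        return source.Where(x => x > 0).ToList();
    }
}
```

Because the original collection is never modified, this is safe to do while something else is still enumerating it.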
I wanted to post this as a comment in response to the comment left by Michael Dillon below, but it's too long and probably useful to have in my answer anyway:
Personally, I'd never remove items one-by-one. If you do need removal, then call RemoveAll, which takes a predicate and only rearranges the internal array once, whereas Remove does an Array.Copy operation for every element you remove. RemoveAll is vastly more efficient.
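For illustration, a call to `List<T>.RemoveAll` might look like the sketch below. The wrapper class `RemoveAllExample` and the "negative values" predicate are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

public static class RemoveAllExample
{
    // Removes all negative values in a single pass over the list
    // and returns the number of elements removed.
    public static int DropNegatives(List<int> values)
    {
        return values.RemoveAll(v => v < 0);
    }
}
```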
And when you're iterating backwards over a list, you already have the index of the element you want to remove, so it would be far more efficient to call RemoveAt, because Remove first does a traversal of the list to find the index of the element you're trying to remove, but you already know that index.
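A backwards loop that uses the known index might look like this. It is a sketch only; the class name `BackwardRemoval` and the "odd numbers" condition are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class BackwardRemoval
{
    // Removes odd numbers while walking from the end of the list.
    // Removing at index i only shifts elements after i, which have
    // already been visited, so no element is skipped.
    public static void RemoveOdd(List<int> values)
    {
        for (int i = values.Count - 1; i >= 0; i--)
        {
            if (values[i] % 2 != 0)
            {
                values.RemoveAt(i); // index is already known, no search needed
            }
        }
    }
}
```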
So all in all, I don't see any reason to ever call Remove in a for-loop. And ideally, if it is at all possible, use the above code to stream elements from the list as needed so no second data structure has to be created at all.
For loops are a bad construct for this. Use while instead:
var numbers = new List<int>(Enumerable.Range(1, 3));
while (numbers.Count > 0)
{
    numbers.RemoveAt(0);
}
But, if you absolutely must use for:
var numbers = new List<int>(Enumerable.Range(1, 3));
for (; numbers.Count > 0;)
{
    numbers.RemoveAt(0);
}
Or, this:
public static class Extensions
{
    public static IList<T> Remove<T>(
        this IList<T> numbers,
        Func<T, bool> predicate)
    {
        numbers.ForEachBackwards(predicate, (n, index) => numbers.RemoveAt(index));
        return numbers;
    }

    public static void ForEachBackwards<T>(
        this IList<T> numbers,
        Func<T, bool> predicate,
        Action<T, int> action)
    {
        for (var i = numbers.Count - 1; i >= 0; i--)
        {
            if (predicate(numbers[i]))
            {
                action(numbers[i], i);
            }
        }
    }
}
Usage:
var numbers = new List<int>(Enumerable.Range(1, 10)).Remove((n) => n > 5);
I found myself in a similar situation where I had to remove every nth element in a given List<T>.
for (int i = 0, j = 0, n = 3; i < list.Count; i++)
{
    if ((j + 1) % n == 0) // j tracks the original position; is this an nth element?
    {
        list.RemoveAt(i);
        j++; // The next element slides into slot i and is skipped by i++,
             // so count it here to keep j in step (this assumes n >= 2).
    }
    j++; // This will always advance the j counter
}
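As an alternative that avoids the second counter, the same result can probably be had by walking backwards in steps of n, starting from the last position that is a multiple of n. This is a sketch under that assumption; the names `EveryNth` and `RemoveEveryNth` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class EveryNth
{
    // Removes the elements at 1-based positions n, 2n, 3n, ...
    public static void RemoveEveryNth<T>(List<T> list, int n)
    {
        if (n < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(n));
        }

        // Zero-based index of the last element whose 1-based position
        // is a multiple of n (or -1 if the list is shorter than n).
        int last = list.Count - (list.Count % n) - 1;

        // Going backwards means earlier indices are unaffected by each removal.
        for (int i = last; i >= 0; i -= n)
        {
            list.RemoveAt(i);
        }
    }
}
```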