I have a quite large List named items (>= 1,000,000 items) and some condition denoted by
Removing a lot of elements from an ArrayList one at a time is an O(n^2) operation in the worst case, because each removal shifts every element that follows it. I would recommend simply using a LinkedList, which is better suited to insertion and removal (but not to random access). Note that LinkedList has a bit of memory overhead per element.
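As a minimal sketch of the LinkedList approach: the class name LinkedListRemoval, the helper removeMatching, and the "is even" condition are made up for illustration, since the actual condition is not shown here. The point is to remove through the iterator, so each removal only unlinks the current node instead of searching for it again.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper class, not part of the original question.
class LinkedListRemoval {

    // Removes every element matching the condition in a single pass.
    static <T> void removeMatching(LinkedList<T> items, Predicate<? super T> condition) {
        Iterator<T> it = items.iterator();
        while (it.hasNext()) {
            if (condition.test(it.next())) {
                it.remove(); // unlinks the node just returned; nothing is shifted
            }
        }
    }

    public static void main(String[] args) {
        LinkedList<Integer> items = new LinkedList<>(List.of(1, 2, 3, 4, 5, 6));
        removeMatching(items, n -> n % 2 == 0); // assumed condition: drop even numbers
        System.out.println(items); // prints [1, 3, 5]
    }
}
```

Calling items.remove(index) inside an indexed loop would not give the same benefit, because a LinkedList has to walk to that index first.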
If you do need to keep an ArrayList, then you are better off creating a new list.
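A rough sketch of the new-list approach, under the same assumptions as before (NewListFilter, keepNonMatching, and the "is even" condition are hypothetical names chosen for the example): walk the original list once and copy over only the elements that should survive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper class, not part of the original question.
class NewListFilter {

    // Returns a new ArrayList holding the elements that do NOT match the condition.
    static <T> List<T> keepNonMatching(List<T> items, Predicate<? super T> condition) {
        List<T> kept = new ArrayList<>();
        for (T item : items) {
            if (!condition.test(item)) {
                kept.add(item); // appending to the end is amortized O(1)
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>(List.of(1, 2, 3, 4, 5, 6));
        items = keepNonMatching(items, n -> n % 2 == 0); // assumed condition: drop even numbers
        System.out.println(items); // prints [1, 3, 5]
    }
}
```

The whole pass is O(n), and the original list is left untouched until the reference is reassigned.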
Update: Comparing with creating a new list:
When reusing the same list, the main cost comes from deleting the node and updating the appropriate pointers in the LinkedList. This is a constant-time operation for any node, as long as you remove through the iterator while traversing.
When constructing a new list, the main cost comes from creating the list and initializing the array entries. Both are cheap operations. You might also incur the cost of resizing the new list's backing array, assuming that the final array is larger than half of the incoming array.
So if you were to remove only one element, then the LinkedList approach is probably faster. If you were to delete all nodes except for one, the new list approach is probably faster.
There are more complications once you bring memory management and GC into the picture. I'd like to leave these out.
The best option is to implement the alternatives yourself and benchmark the results when running your typical load.
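For a first impression, a crude timing harness along these lines may be enough. It is only a sketch: the class RemovalBenchmark, its method names, the list size, and the "divisible by 3" condition are all assumptions, and System.nanoTime measurements are noisy because of JIT warm-up and GC. A dedicated harness such as JMH would give more trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical harness; replace the data and condition with your typical load.
class RemovalBenchmark {

    static List<Integer> buildItems(int n) {
        List<Integer> items = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            items.add(i);
        }
        return items;
    }

    // Alternative 1: remove in place from a LinkedList via its iterator.
    // Note: the copy into the LinkedList is part of the measured work here.
    static List<Integer> viaLinkedList(List<Integer> source, Predicate<Integer> condition) {
        LinkedList<Integer> list = new LinkedList<>(source);
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (condition.test(it.next())) {
                it.remove();
            }
        }
        return list;
    }

    // Alternative 2: build a new ArrayList with the surviving elements.
    static List<Integer> viaNewList(List<Integer> source, Predicate<Integer> condition) {
        List<Integer> kept = new ArrayList<>();
        for (Integer item : source) {
            if (!condition.test(item)) {
                kept.add(item);
            }
        }
        return kept;
    }

    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        List<Integer> items = buildItems(1_000_000);
        Predicate<Integer> condition = n -> n % 3 == 0; // assumed condition
        // Several rounds, so that the later ones run after JIT warm-up.
        for (int round = 0; round < 5; round++) {
            long linked = timeMillis(() -> viaLinkedList(items, condition));
            long fresh = timeMillis(() -> viaNewList(items, condition));
            System.out.println("round " + round + ": LinkedList " + linked
                    + " ms, new list " + fresh + " ms");
        }
    }
}
```

Vary the fraction of elements the condition removes, since that fraction is what decides which alternative wins.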