I was following a previous post on this that says:
For LinkedList
- get is O(n)
- add is O(1)
- remove is O(n)
Remember that big-O complexity describes asymptotic behaviour and may not reflect actual implementation speed. It describes how the cost of each operation grows with the size of the list, not the speed of each operation. For example, the following implementation of add is O(1) but is not fast:
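public class MyList extends LinkedList<Object> {
    @Override
    public boolean add(Object o) {
        try {
            // Constant delay: slow, but independent of the list size.
            Thread.sleep(10000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return super.add(o);
    }
}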
I suspect in your case ArrayList is performing well because it increases its internal buffer size fairly aggressively, so there will not be a large number of reallocations. When the buffer does not need to be resized, ArrayList will have faster adds.
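To see why reallocations are rare, here is a small sketch that simulates the growth of the backing array. It assumes an initial capacity of 10 and growth by roughly 1.5x per resize, which is what current OpenJDK does but is an implementation detail rather than part of the API contract. GrowthSketch and countResizes are names made up for this illustration.

```java
// Hypothetical helper: simulates how often a growable array must reallocate.
class GrowthSketch {
    // Assumed growth policy: start at 10, grow by about 1.5x when full.
    static int countResizes(int elements) {
        int capacity = 10;
        int resizes = 0;
        while (capacity < elements) {
            capacity += capacity >> 1; // newCapacity = old + old / 2
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println("Resizes for 500000 adds: " + countResizes(500000));
    }
}
```

Under those assumptions the number of resizes grows only logarithmically with the number of elements, so the copying cost is spread thinly across all the adds.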
You also need to be very careful when you do this kind of profiling. I'd suggest you change your profiling code to do a warm-up phase (so the JIT has the opportunity to do some optimization without affecting your results) and average the results over a number of runs.
private final static int WARMUP = 1000;
private final static int TEST = 1000;
private final static int SIZE = 500000;
public void perfTest() {
// Warmup
for (int i = 0; i < WARMUP; ++i) {
buildArrayList();
}
// Test
long sum = 0;
for (int i = 0; i < TEST; ++i) {
sum += buildArrayList();
}
System.out.println("Average time to build array list: " + (sum / TEST));
}
public long buildArrayList() {
long start = System.nanoTime();
ArrayList<Integer> a = new ArrayList<Integer>();
for (int i = 0; i < SIZE; ++i) {
a.add(i);
}
long end = System.nanoTime();
return end - start;
}
... same for buildLinkedList
(Note that sum may overflow and you might be better off using System.currentTimeMillis().)
It's also possible that the compiler is optimizing away your empty get loops. Make sure the loop actually does something to ensure that the right code is getting called.
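One way to make the loop "do something" is to consume what get returns. A minimal sketch, assuming a list of Integer values; GetLoopSketch and sumViaGet are hypothetical names, not part of your code:

```java
import java.util.ArrayList;
import java.util.List;

class GetLoopSketch {
    // Uses every value returned by get(), which makes the calls much harder
    // for the JIT to discard as dead code.
    static long sumViaGet(List<Integer> list) {
        long checksum = 0;
        for (int i = 0; i < list.size(); i++) {
            checksum += list.get(i);
        }
        return checksum;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            list.add(i);
        }
        long start = System.nanoTime();
        long checksum = sumViaGet(list);
        long elapsed = System.nanoTime() - start;
        // Printing the checksum makes the result observable outside the loop.
        System.out.println("checksum=" + checksum + " elapsedNanos=" + elapsed);
    }
}
```

A dedicated harness would be more robust, but this is usually enough to keep a hand-rolled timing loop honest.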
To understand why the results you got do not contradict the "big O" characterization, we need to go back to first principles, i.e. the definition.
Let f(x) and g(x) be two functions defined on some subset of the real numbers. One writes
f(x) = O(g(x)) as x -> infinity
if and only if, for sufficiently large values of x, f(x) is at most a constant multiplied by g(x) in absolute value. That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| <= M |g(x)| for all x > x0.
In many contexts, the assumption that we are interested in the growth rate as the variable x goes to infinity is left unstated, and one writes more simply that f(x) = O(g(x)).
So, the statement "add1 is O(1)" means that the time cost of an add1 operation on a list of size N is bounded above by a constant Cadd1 as N tends to infinity.
And the statement "add2 is O(1) amortized over N operations" means that the average time cost of one of a sequence of N add2 operations is bounded above by a constant Cadd2 as N tends to infinity.
What it does not say is what those constants Cadd1 and Cadd2 are. In fact, the reason that LinkedList is slower than ArrayList in your benchmark is that Cadd1 is larger than Cadd2.
The lesson is that big O notation does not predict absolute or even relative performance. All it predicts is the shape of the performance function as the controlling variable gets very large. This is useful to know, but it doesn't tell you everything you need to know.
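As a made-up illustration of why the constants matter, the sketch below compares two hypothetical cost models: an O(1) operation with a large constant and an O(n) operation with a small one. The numbers (1000 and 2) are invented for the example and are not measurements of ArrayList or LinkedList.

```java
class ConstantFactorSketch {
    // Hypothetical O(1) cost: constant, but large.
    static long constantCost(long n) {
        return 1000;
    }

    // Hypothetical O(n) cost: grows with n, but with a small factor.
    static long linearCost(long n) {
        return 2 * n;
    }

    // Smallest n at which the linear model becomes more expensive than the constant one.
    static long crossover() {
        long n = 1;
        while (linearCost(n) <= constantCost(n)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println("Linear model is no more expensive below n = " + crossover());
    }
}
```

Below the crossover the "worse" complexity class wins, which is exactly the kind of thing big O alone cannot tell you.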
You can think of add or remove as a two-step operation.
LinkedList: If you add an element at index n, you first have to move the pointer from 0 to n-1, and only then can you perform the so-called O(1) add operation. The remove operation is the same.
ArrayList: ArrayList implements the RandomAccess interface, which means it can access an element in O(1).
If you add an element at index n, it can go to index n-1 in O(1), move the elements after n-1, and set the element in slot n.
The moving operation is performed by a native method called System.arraycopy, and it's pretty fast.
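The shift-then-set step can be sketched on a plain int array. This is an illustration of the idea, not the JDK source; ArrayInsertSketch and insertAt are hypothetical names, and the array is assumed to have spare capacity.

```java
import java.util.Arrays;

class ArrayInsertSketch {
    // Inserts value at index into the first `size` slots of data.
    // Assumes data.length > size, i.e. there is spare capacity.
    static void insertAt(int[] data, int size, int index, int value) {
        // Shift elements [index, size) one slot to the right.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] data = new int[6];
        for (int i = 0; i < 5; i++) {
            data[i] = i * 10; // 0, 10, 20, 30, 40
        }
        insertAt(data, 5, 2, 99);
        System.out.println(Arrays.toString(data)); // [0, 10, 99, 20, 30, 40]
    }
}
```

The copy is still O(n) in the number of shifted elements, but it is a single bulk memory move rather than n pointer hops.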
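public static void main(String[] args) {
List<Integer> arrayList = new ArrayList<Integer>();
List<Integer> linkList = new LinkedList<Integer>();
// Fill both lists first: add(index, element) throws IndexOutOfBoundsException if index > size().
for (int i = 0; i < 100000; i++) {
arrayList.add(i);
linkList.add(i);
}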
long start = 0;
long end = 0;
Random random = new Random();
start = System.currentTimeMillis();
for (int i = 0; i < 10000; i++) {
linkList.add(random.nextInt(100000), 7);
}
end = System.currentTimeMillis();
System.out.println("LinkedList add ,random index" + (end - start));
start = System.currentTimeMillis();
for (int i = 0; i < 10000; i++) {
arrayList.add(random.nextInt(100000), 7);
}
end = System.currentTimeMillis();
System.out.println("ArrayList add ,random index" + (end - start));
start = System.currentTimeMillis();
for (int i = 0; i < 10000; i++) {
linkList.add(0, 7);
}
end = System.currentTimeMillis();
System.out.println("LinkedList add ,index == 0" + (end - start));
start = System.currentTimeMillis();
for (int i = 0; i < 10000; i++) {
arrayList.add(0, 7);
}
end = System.currentTimeMillis();
System.out.println("ArrayList add ,index == 0" + (end - start));
start = System.currentTimeMillis();
for (int i = 0; i < 10000; i++) {
linkList.add(i);
}
end = System.currentTimeMillis();
System.out.println("LinkedList add ,index == size-1" + (end - start));
start = System.currentTimeMillis();
for (int i = 0; i < 10000; i++) {
arrayList.add(i);
}
end = System.currentTimeMillis();
System.out.println("ArrayList add ,index == size-1" + (end - start));
start = System.currentTimeMillis();
for (int i = 0; i < 10000; i++) {
linkList.remove(Integer.valueOf(random.nextInt(100000)));
}
end = System.currentTimeMillis();
System.out.println("LinkedList remove ,random value" + (end - start)); // remove(Object) removes by value, not by index
start = System.currentTimeMillis();
for (int i = 0; i < 10000; i++) {
arrayList.remove(Integer.valueOf(random.nextInt(100000)));
}
end = System.currentTimeMillis();
System.out.println("ArrayList remove ,random value" + (end - start)); // remove(Object) removes by value, not by index
start = System.currentTimeMillis();
for (int i = 0; i < 10000; i++) {
linkList.remove(0);
}
end = System.currentTimeMillis();
System.out.println("LinkedList remove ,index == 0" + (end - start));
start = System.currentTimeMillis();
for (int i = 0; i < 10000; i++) {
arrayList.remove(0);
}
end = System.currentTimeMillis();
System.out.println("ArrayList remove ,index == 0" + (end - start));
}
This is a bad benchmark IMO.
ArrayList resizes, which is costly. If you had constructed the ArrayList as new ArrayList(500000), the backing array would be allocated in one go and all subsequent adds would be quite cheap (a single preallocated backing array).
As for list.get, you will find linked lists are awful for grabbing anything other than the first or last element. For an ArrayList, the JDK get is what you'd expect:
public E get(int index) {
RangeCheck(index);
return elementData[index];
}
(basically it just returns the indexed array element).
For a linkedlist:
public E get(int index) {
return entry(index).element;
}
Looks similar? Not quite. entry is a method, not a primitive array, and look what it has to do:
private Entry<E> entry(int index) {
if (index < 0 || index >= size)
throw new IndexOutOfBoundsException("Index: "+index+
", Size: "+size);
Entry<E> e = header;
if (index < (size >> 1)) {
for (int i = 0; i <= index; i++)
e = e.next;
} else {
for (int i = size; i > index; i--)
e = e.previous;
}
return e;
}
That's right: if you ask for, say, list.get(250000), it has to start at the head and repeatedly follow the next reference, 250000 accesses or so. (There's an optimization in the code where it starts at the head or the tail, depending on which needs fewer accesses.)
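The practical consequence is that an indexed loop over a LinkedList repeats that walk on every call, while an iterator remembers where it is. A minimal sketch (TraversalSketch and both method names are made up for this example):

```java
import java.util.LinkedList;
import java.util.List;

class TraversalSketch {
    // Each get(i) walks from one end of the list, so this loop is quadratic on a LinkedList.
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // The iterator behind for-each keeps its position, so each step is O(1).
    static long sumByIterator(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < 10000; i++) {
            list.add(i);
        }
        // Same result either way; only the amount of work differs.
        System.out.println(sumByIndex(list) == sumByIterator(list));
    }
}
```

Both methods return the same sum, so if you must use a LinkedList, prefer the iterator form for traversal.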
1) Underlying Data Structure The first difference between ArrayList and LinkedList is that ArrayList is backed by an array while LinkedList is backed by a doubly linked list of nodes. This leads to further differences in performance.
2) LinkedList implements Deque Another difference between ArrayList and LinkedList is that, apart from the List interface, LinkedList also implements the Deque interface, which provides first-in, first-out operations through add() and poll(), along with several other Deque functions.
3) Adding elements in ArrayList Appending an element to an ArrayList is an O(1) operation if it doesn't trigger a re-size of the array; a re-size copies every element and is O(n), although it happens rarely enough that appends are O(1) amortized. Appending an element to a LinkedList, on the other hand, is always an O(1) operation, as it doesn't require any navigation.
4) Removing element from a position To remove an element at a particular index, e.g. by calling remove(index), ArrayList shifts the following elements with a copy operation, which makes it O(n), while LinkedList has to traverse to that position, which takes at most about n/2 steps because it can traverse from either end, whichever is closer.
5) Iterating over ArrayList or LinkedList Iteration is an O(n) operation for both LinkedList and ArrayList, where n is the number of elements.
6) Retrieving element from a position The get(index) operation is O(1) in ArrayList while it's O(n/2) in LinkedList, as it needs to traverse to that entry. In Big O notation, though, O(n/2) is just O(n) because we ignore constants there.
7) Memory LinkedList uses a wrapper object, Entry, which is a static nested class storing the data plus two references, next and previous, while ArrayList just stores the data in an array.
So memory requirement seems less in the case of ArrayList than LinkedList except for the case where Array performs the re-size operation when it copies content from one Array to another.
If Array is large enough it may take a lot of memory at that point and trigger Garbage collection, which can slow response time.
From all the above differences between ArrayList and LinkedList, it looks like ArrayList is the better choice in almost all cases, except when add() operations are much more frequent than remove() or get().
It's easier to modify a linked list than ArrayList, especially if you are adding or removing elements from start or end because linked list internally keeps references of those positions and they are accessible in O(1) time.
In other words, you don't need to traverse the linked list to reach those positions. If you do have to traverse to reach the position where you want to add an element, the addition becomes an O(n) operation, for example when inserting or deleting an element in the middle of a linked list.
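The head and tail operations mentioned above are exposed through the Deque interface. A short sketch of how they might be used (DequeSketch and demo are hypothetical names for this example):

```java
import java.util.Deque;
import java.util.LinkedList;

class DequeSketch {
    // Adds and removes at both ends; none of these calls needs to traverse the list.
    static String demo() {
        Deque<String> deque = new LinkedList<>();
        deque.addLast("b");   // append at the tail
        deque.addFirst("a");  // prepend at the head
        deque.addLast("c");
        String first = deque.pollFirst(); // removes "a"
        String last = deque.pollLast();   // removes "c"
        return first + " " + last + " " + deque;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // a c [b]
    }
}
```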
In my opinion, use ArrayList over LinkedList for most practical purposes in Java.
An ArrayList is a simpler data structure than a LinkedList. An ArrayList has a single array of pointers in contiguous memory locations. It only has to be recreated if the array is expanded beyond its allocated size.
A LinkedList consists of a chain of nodes; each node is separately allocated and has front and back pointers to other nodes.
So what does this mean? Unless you need to insert in the middle, splice, delete in the middle etc., an ArrayList will usually be faster. It needs fewer memory allocations, has much better locality of reference (which is important for processor caching) etc.