I\'m pasting this text from an ebook I have. It says the complexity if O(n2) and also gives an explanation for it, but I fail to see how.
Question: What i
There's a typo in this book.
1st case :
public String makeSentence(String[] words) {
String sentence = new String();
for (String w : words) sentence += w;
return sentence;
}
Complexity : O(n^2) -> (n words) x (n characters copied at each iteration, for copying the current sentence into a StringBuffer)
2nd case :
public String makeSentence(String[] words) {
StringBuffer sentence = new StringBuffer();
for (String w : words) sentence.append(w);
return sentence.toString();
}
Complexity : O(n) -> (n words) x O(1) (amortized complexity for StringBuffer concatenation)
I tried to check it using this program
public class Test {
private static String[] create(int n) {
String[] res = new String[n];
for (int i = 0; i < n; i++) {
res[i] = "abcdefghijklmnopqrst";
}
return res;
}
private static String makeSentence(String[] words) {
StringBuffer sentence = new StringBuffer();
for (String w : words) sentence.append(w);
return sentence.toString();
}
public static void main(String[] args) {
String[] ar = create(Integer.parseInt(args[0]));
long begin = System.currentTimeMillis();
String res = makeSentence(ar);
System.out.println(System.currentTimeMillis() - begin);
}
}
And result was, as expected, O(n):
java Test 200000 - 128 ms
java Test 500000 - 370 ms
java Test 1000000 - 698 ms
Version 1.6.0.21
As the explanation given in the book, for ever word in the string array a new object of sentence gets created and that sentence object first copies the previous sentence and then traverses till the end of the array and then appends the new word, hence the complexity of n^2
.
Hence n*n
will be n^2
.
It's a bit difficult to answer a question about the complexity of this code when it is written at a high level which abstracts away the details of the implementation. The Java documentation doesn't seem to give any guarantees in terms of the complexity of the append
function. As others have pointed out, the StringBuffer
class can (and should) be written so that the complexity of appending strings does not depend on the current length of the string held in StringBuffer
.
However, I suspect it is not that helpful to the person asking this question to simply say "your book is wrong!" - instead, let us see what assumptions are being made and make clear what the author was trying to say.
You can make the following assumptions:
new StringBuffer
is O(1)w
in words
is O(1)sentence.toString
is at most O(n).The question is really what is the order of sentence.append(w)
, and that depends on how it happens inside the StringBuffer
. The naive way is to do it like Shlemiel the Painter.
Suppose you use a C-style null-terminated string for the contents of StringBuffer
. The way you find the end of such a string is by reading each character, one by one, until you find the null character - then to append a new string S, you can start copying characters from S to the StringBuffer
string (finishing with another null character). If you write append
this way, it is O(a + b), where a is the number of characters currently in the StringBuffer
, and b is the number of characters in the new word. If you loop over an array of words, and each time you have to read all the characters you just appended before appending the new word, then the complexity of the loop is O(n^2), where n is the total number of characters in all the words (also the number of characters in the final sentence).
On the other hand, suppose that the contents of StringBuffer
is still an array of characters, but we also store an integer size
which tells us how long the string is (number of characters). Now we no longer have to read every character in the StringBuffer
in order to find the end of the string; we can just look up index size
in the array, which is O(1) instead of O(a). Then the append
function now only depends on the number of characters being appended, O(b). In this case the complexity of the loop is O(n), where n is the total number of characters in all the words.
Finally, there's one more aspect of the implementation that hasn't been covered yet, and that is the one actually brought up by the answer in the textbook - memory allocation. Each time you want to write more characters to your StringBuffer
, you're not guaranteed to have enough space in your character array to actually fit the new word in. If there isn't enough space, your computer needs to first allocate some more room in a clean section of memory, and then copy all the information in the old StringBuffer
array across, and then it can continue as before. Copying data like this will take O(a) time (where a is the number of characters to be copied).
In the worst case, you have to allocate more memory every time you add a new word. This basically takes us back to square one where the loop has O(n^2) complexity, and is what the book seems to suggest. If you assume that nothing crazy is happening (the words aren't getting longer at an exponential rate!), then you can probably reduce the number of memory allocations to something more like O(log(n)) by having the allocated memory grow exponentially. If that's the number of memory allocations, and memory allocations in general are O(a), then the total complexity attributed just to memory management in the loop is O(n log(n)). Since the appending work is O(n) and less than the complexity of the memory management, the total complexity of the function is O(n log(n)).
Again, the Java documentation doesn't help us in terms of how the capacity of the StringBuffer
grows, it just says "If the internal buffer overflows, it is automatically made larger". Depending on how it happens, you could end up with either O(n^2) or O(n log(n)) overall.
As an exercise left to the reader: Find an easy way to modify the function so that the overall complexity is O(n), by removing memory reallocation issues.