Why is the complexity of simple string concatenation O(n^2)?

Question

I have read in several manuals and online sources that the running time of "simple string concatenation" is O(n^2).

The algorithm is this: we take the first 2 strings, create a new string, copy the characters of the 2 original strings into the new string, and repeat this process until all strings are concatenated. We are not using StringBuilder or similar implementations: just simple string concatenation.

I think the running time should be something like O(kn) where k = number of strings, n = total number of characters. You don't copy the same characters n times, but k times, so it should not be O(n^2). For example, if you have 2 strings, it's just O(n). Basically it's n + (n-x) + (n-y) + (n-z)... but k times, not n times.

Where am I wrong?

Answer 1:

A precise problem statement is necessary here:

There are two metrics to consider: How much space is required and how much time is required. This note looks at the time requirements.

The concatenation operation is specified to concatenate only two strings at a time, with concatenation performed left-associatively:

((k1 + k2) + k3) ...

There are two parameters that may be considered, and two ways of looking at the second parameter.

The first parameter is the total size (in characters) of the strings which are to be concatenated.

The second parameter is either the number of strings which are to be concatenated, or is the size of each of the strings which are to be concatenated.

Considering the first case:

n - Total size (in characters) of the strings to be concatenated.

k - Total number of strings to be concatenated.

The time the concatenation takes is roughly:

(n/k) * (k^2) / 2

Or, to within a constant factor:

n * k

Then, for a fixed 'k', the concatenation time is linear!
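
The estimate (n/k) * (k^2) / 2 can be reconstructed as follows. This is a sketch under the assumption that all k strings have the same length n/k, and that the step producing the concatenation of the first i strings copies all i * (n/k) of its characters into a fresh string:

```latex
\begin{align*}
T(n,k) &= \sum_{i=2}^{k} i \cdot \frac{n}{k}
        = \frac{n}{k}\left(\frac{k(k+1)}{2} - 1\right) \\
       &\approx \frac{n}{k}\cdot\frac{k^{2}}{2}
        = \frac{n\,k}{2}
\end{align*}
```

Dropping the constant 1/2 gives the n * k figure above.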

Considering instead the second case:

n - Total size of the strings

m - Size of each of the sub-strings

This corresponds to the prior case but with:

k = n / m

The prior estimate then becomes:

n * k = n * (n / m) = n^2 / m

That is, for a fixed 'm', the concatenation time is quadratic.
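
The two cases can be checked by counting copied characters directly. The following is a minimal sketch, not a benchmark: the class name `ConcatCost` and the method `copiedChars` are hypothetical, and it assumes k >= 1 equal-length strings that are joined left to right, with a new array allocated for every intermediate result.

```java
public class ConcatCost {

    // Returns how many characters are copied when k strings of length m
    // are concatenated as ((s1 + s2) + s3) ..., assuming k >= 1 and that
    // every step copies both operands into a freshly allocated array.
    static long copiedChars(int k, int m) {
        char[] acc = new char[m];          // s1, no copying needed yet
        char[] piece = new char[m];        // stands in for s2, s3, ...
        long copied = 0;
        for (int i = 2; i <= k; i++) {
            char[] next = new char[acc.length + piece.length];
            System.arraycopy(acc, 0, next, 0, acc.length);
            System.arraycopy(piece, 0, next, acc.length, piece.length);
            copied += next.length;         // i * m characters in this step
            acc = next;
        }
        return copied;
    }

    public static void main(String[] args) {
        // Fixed k = 4, n = 1000 and n = 2000
        System.out.println(copiedChars(4, 250));
        System.out.println(copiedChars(4, 500));
        // Fixed m = 10, n = 1000 and n = 2000
        System.out.println(copiedChars(100, 10));
        System.out.println(copiedChars(200, 10));
    }
}
```

With k held fixed, doubling n doubles the count; with m held fixed, doubling n roughly quadruples it, which matches the linear and quadratic conclusions above.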

Answer 2:

If you write some tests and look at the bytecode, you will see that StringBuilder is used to implement concatenation, and that it sometimes pre-allocates its internal array to make this more efficient. That is clearly not O(n^2) complexity.

Here is the Java code.


    public static void main(String[] args) {
        String[] william = {
            "To ", "be ", "or ", "not ", "to ", ", that", "is ", "the ",
            "question."
        };
        String quote = "";
        for (String word : william) {
            quote += word;
        }
    }

Here is the byte code.

 public static void main(java.lang.String[] args);
      0  bipush 9
      2  anewarray java.lang.String [16]
      5  dup
      6  iconst_0
      7  ldc <String "To "> [18]
      9  aastore
     10  dup
     11  iconst_1
     12  ldc <String "be "> [20]
     14  aastore
     15  dup
     16  iconst_2
     17  ldc <String "or "> [22]
     19  aastore
     20  dup
     21  iconst_3
     22  ldc <String "not "> [24]
     24  aastore
     25  dup
     26  iconst_4
     27  ldc <String "to "> [26]
     29  aastore
     30  dup
     31  iconst_5
     32  ldc <String ", that"> [28]
     34  aastore
     35  dup
     36  bipush 6
     38  ldc <String "is "> [30]
     40  aastore
     41  dup
     42  bipush 7
     44  ldc <String "the "> [32]
     46  aastore
     47  dup
     48  bipush 8
     50  ldc <String "question."> [34]
     52  aastore
     53  astore_1 [william]
     54  ldc <String ""> [36]
     56  astore_2 [quote]
     57  aload_1 [william]
     58  dup
     59  astore 6
     61  arraylength
     62  istore 5
     64  iconst_0
     65  istore 4
     67  goto 98
     70  aload 6
     72  iload 4
     74  aaload
     75  astore_3 [word]
     76  new java.lang.StringBuilder [38]
     79  dup
     80  aload_2 [quote]
     81  invokestatic java.lang.String.valueOf(java.lang.Object) : java.lang.String [40]
     84  invokespecial java.lang.StringBuilder(java.lang.String) [44]
     87  aload_3 [word]
     88  invokevirtual java.lang.StringBuilder.append(java.lang.String) : java.lang.StringBuilder [47]
     91  invokevirtual java.lang.StringBuilder.toString() : java.lang.String [51]
     94  astore_2 [quote]
     95  iinc 4 1
     98  iload 4
    100  iload 5
    102  if_icmplt 70
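
Read back into Java, the loop body at offsets 76 to 94 corresponds roughly to the sketch below. The class name `DesugaredConcat` and the helper `join` are hypothetical, introduced only for illustration; the chain of calls inside the loop is taken from the listing above. Note that the `new java.lang.StringBuilder` instruction sits inside the loop (between the branch target at offset 70 and the branch at offset 102), so a new builder is created on every pass.

```java
public class DesugaredConcat {

    // Hand-written equivalent of `quote += word` as it appears in the
    // bytecode listing: a new StringBuilder per iteration, seeded with
    // the current value of quote, followed by append and toString.
    static String join(String[] words) {
        String quote = "";
        for (String word : words) {
            quote = new StringBuilder(String.valueOf(quote))
                    .append(word)
                    .toString();
        }
        return quote;
    }

    public static void main(String[] args) {
        String[] william = {
            "To ", "be ", "or ", "not ", "to ", ", that", "is ", "the ",
            "question."
        };
        System.out.println(join(william));
    }
}
```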

Source: https://stackoverflow.com/questions/58309852/why-is-the-complexity-of-simple-string-concatenation-on2

Tags

string

time-complexity

concatenation

big-o