问题
My question is this: Is string concatenation in C# safe? If string concatenation leads to unexpected errors, and replacing that string concatenation by using StringBuilder causes those errors to disappear, what might that indicate?
Background: I am developing a small command line C# application. It takes command line arguments, performs a slightly complicated SQL query, and outputs about 1300 rows of data into a formatted XML file.
My initial program would always run fine in debug mode. However, in release mode it would get to about the 750th SQL result, and then die with an error. The error was that a certain column of data could not be read, even through the Read() method of the SqlDataReader object had just returned true.
This problem was fixed by using StringBuilder for all operations in the code, where previously there had been "string1 + string2". I'm not talking about string concatenation inside the SQL query loop, where StringBuilder was already in use. I'm talking about simple concatenations between two or three short string variables earlier in the code.
I had the impression that C# was smart enough to handle the memory management for adding a few strings together. Am I wrong? Or does this indicate some other sort of code problem?
回答1:
Apart from what you're doing is probably best done with XML APIs instead of strings or StringBuilder I doubt that the error you see is due to string concatenation. Maybe switching to StringBuilder just masked the error or went over it gracefully, but I doubt using strings really was the cause.
回答2:
To answer your question: String contatenation in C# (and .NET in general) is "safe", but doing it in a tight loop as you describe is likely to cause severe memory pressure and put strain on the garbage collector.
I would hazard a guess that the errors you speak of were related to resource exhaustion of some sort, but it would be helpful if you could provide more detail — for example, did you receive an exception? Did the application terminate abnormally?
Background: .NET strings are immutable, so when you do a concatenation like this:
var stringList = new List<string> {"aaa", "bbb", "ccc", "ddd", //... };
string result = String.Empty;
foreach (var s in stringList)
{
result = result + s;
}
This is roughly equivalent to the following:
string result = "";
result = "aaa"
string temp1 = result + "bbb";
result = temp1;
string temp2 = temp1 + "ccc";
result = temp2;
string temp3 = temp2 + "ddd";
result = temp3;
// ...
result = tempN + x;
The purpose of this example is to emphasise that each time around the loop results in the allocation of a new temporary string.
Since the strings are immutable, the runtime has no alternative options but to allocate a new string each time you add another string to the end of your result.
Although the result
string is constantly updated to point to the latest and greatest intermediate result, you are producing a lot of these un-named temporary string that become eligible for garbage collection almost immediately.
At the end of this concatenation you will have the following strings stored in memory (assuming, for simplicity, that the garbage collector has not yet run).
string a = "aaa";
string b = "bbb";
string c = "ccc";
// ...
string temp1 = "aaabbb";
string temp2 = "aaabbbccc";
string temp3 = "aaabbbcccddd";
string temp4 = "aaabbbcccdddeee";
string temp5 = "aaabbbcccdddeeefff";
string temp6 = "aaabbbcccdddeeefffggg";
// ...
Although all of these implicit temporary variables are eligible for garbage collection almost immediately, they still have to be allocated. When performing concatenation in a tight loop this is going to put a lot of strain on the garbage collector and, if nothing else, will make your code run very slowly. I have seen the performance impact of this first hand, and it becomes truly dramatic as your concatenated string becomes larger.
The recommended approach is to always use a StringBuilder
if you are doing more than a few string concatenations. StringBuilder
uses a mutable buffer to reduce the number of allocations that are necessary in building up your string.
回答3:
String concatenation is safe though more memory intensive than using a StringBuilder if contatenating large numbers of strings in a loop. And in extreme cases you could be running out of memory.
It's almost certainly a bug in your code.
Maybe you're contatenating a very large number of strings. Or maybe it's something else completely different.
I'd go back to debugging without any preconceptions of the root cause - if you're still having problems try to reduce it to the minimum needed to repro the problem and post code.
回答4:
How long would it take the concatenation version vs the string builder version? It's possible that your connection to the DB is being closed. If you are doing a lot of concatenation, i would go w/ StringBuilder as it is a bit more efficient.
回答5:
One cause may be that strings are immutable in .Net so when you do an operation on one such as concatenation you are actually creating a new string.
Another possible cause is that string length is an int so the maximum possible length is Int32.MaxValue or 2,147,483,647.
In either case a StringBuilder is better than "string1 + string2" for this type of operation. Although, using the built-in XML capabilities would be even better.
回答6:
string.Concat(string[])
is by far the fastest way to concatenate strings. It litterly kills StringBuilder
in performance when used in loops, especially if you create the StringBuilder
in each iteration.
There are loads of references if you Google "c# string format vs stringbuilder" or something like that.
http://www.codeproject.com/KB/cs/StringBuilder_vs_String.aspx gives you an ideer about the times. Here string.Join wins the concatenation test but I belive this is because the string.Concat(string, string)
is used instead of the overloaded version that takes an array.
If you take a look at the MSIL code that is generated by the different methods you'll see what going on beneath the hood.
回答7:
Here is my shot in the dark...
Strings in .NET (not stringbuilders) go into the String Intern Pool. This is basically an area managed by the CLR to share strings to improve performance. There has to be some limit here, although I have no idea what that limit is. I imagine all the concatenation you are doing is hitting the ceiling of the string intern pool. So SQL says yes I have a value for you, but it can't put it anywhere so you get an exception.
A quick and easy test would be to nGen your assembly and see if you still get the error. After nGen'ing, you application no longer will use the pool.
If that fails, I'd contact Microsoft to try and get some hard details. I think my idea sounds plausible, but I have no idea why it works in debug mode. Perhaps in debug mode strings aren't interned. I am also no expert.
回答8:
When compounding strings together I always use StringBuilder. It's designed for it and is more efficient that simply using "string1 + string2".
来源:https://stackoverflow.com/questions/779051/string-concatenation-unsafe-in-c-need-to-use-stringbuilder