I've been tinkering with small functions on my own time, trying to find ways to refactor them (I recently read Martin Fowler's book Refactoring: Improving the Design of Ex
This is in response to ctacke's comment on Jon Skeet's answer (it's too long for a comment).
I always thought foreach was pretty well known to be slower than a for loop since it has to use the iterator.
Actually, no, in this case foreach would be faster. Index access is bounds-checked (i.e. i is checked to be in range three times in the loop: once in the for() and once each for the two ca[i]s), which makes a for loop slower than foreach.
If the compiler (as far as I know this happens in the JIT rather than in the C# compiler itself) detects the specific syntax:
for(i = 0; i < ca.Length; i++)
then it will perform an ad hoc optimization, removing the internal bounds checks and making the for() loop faster. However, since here we must treat ca[0] as a special case (to prevent a leading space on the output), we can't trigger that optimization.
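To make the loop shapes concrete, here is a minimal sketch. The class and method names (LoopShapes, SumFor, SumForeach, SumSkippingFirst) are mine, not from the question, and summing the characters is just a stand-in for real work. Whether the bounds checks are actually removed depends on the runtime version, so treat this as something to benchmark rather than a guarantee.

```csharp
using System;

public static class LoopShapes
{
    // Canonical shape: i runs from 0 up to ca.Length. This is the pattern
    // described above as eligible for bounds-check removal.
    public static int SumFor(char[] ca)
    {
        int total = 0;
        for (int i = 0; i < ca.Length; i++)
        {
            total += ca[i];
        }
        return total;
    }

    // foreach: there is no explicit index in the source code at all.
    public static int SumForeach(char[] ca)
    {
        int total = 0;
        foreach (char c in ca)
        {
            total += c;
        }
        return total;
    }

    // Starts at 1 because ca[0] is handled separately, so the loop no
    // longer has the canonical "0 to Length" shape.
    public static int SumSkippingFirst(char[] ca)
    {
        int total = 0;
        for (int i = 1; i < ca.Length; i++)
        {
            total += ca[i];
        }
        return total;
    }

    public static void Main()
    {
        char[] ca = "LookOut".ToCharArray();
        Console.WriteLine(SumFor(ca) == SumForeach(ca));
        Console.WriteLine(SumSkippingFirst(ca));
    }
}
```

All three produce the same kind of result; the only difference that matters for this discussion is the loop header.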
Try refactoring so that the regular expression you're using to split the string in the second method is stored in a static field and is built with the RegexOptions.Compiled option. More info about this here: http://msdn.microsoft.com/en-us/library/8zbs0h2f.aspx.
I didn't test the theory, but I'd imagine that having to recreate the regex every time would be time consuming.
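Something along these lines is what I have in mind. This is only a sketch: the pattern, the class name NiceStrings and the method SplitOnCapitals are placeholders I made up, since I don't know the exact regex in your second method. The point is just that the Regex object is created once and reused.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NiceStrings
{
    // Created once and shared by every call. RegexOptions.Compiled costs
    // more at construction time but makes repeated matching faster.
    // Assumed pattern: a zero-width position before each capital letter,
    // except at the very start of the string.
    private static readonly Regex CapitalBoundary =
        new Regex("(?<!^)(?=[A-Z])", RegexOptions.Compiled);

    public static string SplitOnCapitals(string input)
    {
        // Split at the boundaries, then put single spaces between the pieces.
        string[] parts = CapitalBoundary.Split(input);
        return string.Join(" ", parts);
    }

    public static void Main()
    {
        Console.WriteLine(SplitOnCapitals("LookOutMomma"));
    }
}
```

If the method is only called a handful of times, the compile cost may outweigh the gain, so it is worth timing both variants.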
I know what they say about RegEx (use it to solve a problem and now you have two problems), but I remain a fan. Just for grins, here is a RegEx version. RegEx, with a little initiation, is easy to read, takes less code, and lets you easily snap in additional delimiters (as I did with the comma).
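// Needs: using System.Text; using System.Text.RegularExpressions;
s1 = MakeNiceString( "LookOut,Momma,There'sAWhiteBoatComingUpTheRiver" );

private string MakeNiceString( string input )
{
    StringBuilder sb = new StringBuilder( input );
    int Incrementer = 0; // number of spaces inserted so far
    const string SPACE = " ";
    // Match every capital letter or comma. Inside [] a '|' is a literal
    // pipe, not an "or", so it is left out of the character class.
    MatchCollection mc = Regex.Matches( input, "[A-Z,]" );
    foreach ( Match m in mc )
    {
        if ( m.Index > 0 ) // no leading space before the first character
        {
            // Shift by the spaces already inserted ahead of this match
            sb.Insert( m.Index + Incrementer, SPACE );
            Incrementer++;
        }
    }
    return sb.ToString().TrimEnd();
}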
In C# (.NET, really), when you append to a string there are several things going on in the background. Now, I forget the specifics, but it is something like:
string A = B + C;
A += D; A += E;
// ... rinse-repeat for 10,000 iterations
For each line above, .NET will: 1) Allocate new memory big enough for the combined string. 2) Copy the existing string into the new memory. 3) Copy the appended string in after it. 4) Point A at the new string, leaving the old one for the garbage collector.
The longer the string A, the more time each append takes. And since every append makes A longer, the total time grows roughly quadratically with the number of appends, not linearly.
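To put a rough number on that, here is a small sketch that just counts characters copied. It assumes the simplified model above (every append copies the whole result so far plus the new piece); CopyCost and CharsCopiedByConcat are names I invented for the illustration, and the real runtime may behave differently in detail.

```csharp
using System;

public static class CopyCost
{
    // Counts the characters copied when a string is built by repeated
    // concatenation, under the assumption that each append copies the
    // entire new result into freshly allocated memory.
    public static long CharsCopiedByConcat(int appends, int pieceLength)
    {
        long total = 0;
        long currentLength = 0;
        for (int i = 0; i < appends; i++)
        {
            currentLength += pieceLength; // length after this append
            total += currentLength;       // old contents + new piece copied
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(CharsCopiedByConcat(10000, 10));
    }
}
```

Under that model, 10,000 appends of 10 characters each copy 500,050,000 characters in total to produce a string of only 100,000 characters.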
However, StringBuilder does not allocate a brand-new string on every append, so you skip that problem.
If you say:
StringBuilder A = new StringBuilder();
A.Append(B);
A.Append(C);
// .. rinse/repeat for 10,000 times...
string sA = A.ToString();
StringBuilder (Edit: fixed description) keeps its contents in an internal buffer. It doesn't need to re-allocate the entire string for each added substring. When you call ToString(), the text is already assembled in the proper order.
One shot instead of a loop that takes an increasingly longer period.
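If you want to see the difference yourself, a rough timing sketch could look like this. AppendTiming and its two Build methods are names I made up, the 10-character piece is arbitrary, and Stopwatch timings like this are crude, so take the numbers as an indication only.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class AppendTiming
{
    public static string BuildWithConcat(string piece, int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
        {
            result += piece; // creates a new string on every iteration
        }
        return result;
    }

    public static string BuildWithStringBuilder(string piece, int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append(piece); // appends into the builder's internal buffer
        }
        return sb.ToString(); // the final string is produced once, here
    }

    public static void Main()
    {
        const int Count = 10000; // same iteration count as in the text

        Stopwatch sw = Stopwatch.StartNew();
        string a = BuildWithConcat("abcdefghij", Count);
        sw.Stop();
        Console.WriteLine("concat:        " + sw.ElapsedMilliseconds + " ms");

        sw = Stopwatch.StartNew();
        string b = BuildWithStringBuilder("abcdefghij", Count);
        sw.Stop();
        Console.WriteLine("StringBuilder: " + sw.ElapsedMilliseconds + " ms");

        // Both approaches must yield the same text.
        Console.WriteLine(a == b);
    }
}
```

Raise Count if the two timings are too close to tell apart on your machine.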
I hope that helps answer the WHY it took so much less time.