Taking substrings of a string is a very common string manipulation operation, but I heard that there might be considerable differences in performance/implementation between Java and .NET.
In .NET, Substring is O(n) rather than the O(1) of Java. This is because in .NET, the String object contains all the actual character data itself1 - so taking a substring involves copying all the data within the new substring. In Java, substring can just create a new object referring to the original char array, with a different starting index and length.
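To make the array-sharing idea concrete, here is a simplified sketch of that approach. It is not the actual java.lang.String source; the class name, field names and lack of bounds checking are all illustrative.

```java
// Simplified sketch of an array-sharing string: a "substring" is just a new
// view (offset + length) over the same char[], so no character data is copied.
final class SharedString {
    private final char[] value;   // backing array, shared between views
    private final int offset;     // where this string starts within value
    private final int length;     // how many chars belong to this string

    SharedString(char[] value, int offset, int length) {
        this.value = value;
        this.offset = offset;
        this.length = length;
    }

    // O(1): no copying, just a new object pointing into the same array.
    SharedString substring(int begin, int end) {
        return new SharedString(value, offset + begin, end - begin);
    }

    char charAt(int index) {
        return value[offset + index];
    }

    int length() {
        return length;
    }
}
```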
There are pros and cons of each approach:

- The Java approach makes taking a substring very cheap, but it means that a small substring can keep a large char[] from being garbage collected (the pitfall is sketched below).
- The .NET approach keeps each string self-contained, which is a particularly good fit for small strings2 and avoids that garbage collection issue. I believe in some cases it can make interop very easy too, internally.

There's a little more detail in my strings article.
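Here is a hedged example of the garbage collection pitfall. It assumes an implementation where substring shares the original char[] (as older Java versions did); the class and helper names are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SubstringRetention {
    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            // Stand-in for reading a large record from a file or network.
            String hugeLine = readHugeLine(i);

            // Under an array-sharing implementation, this 8-char key can keep
            // the entire backing char[] of hugeLine reachable.
            keys.add(hugeLine.substring(0, 8));

            // Classic workaround for such implementations: force a copy so that
            // only the 8 characters are retained.
            // keys.add(new String(hugeLine.substring(0, 8)));
        }
        System.out.println("Stored " + keys.size() + " keys");
    }

    // Builds a large string whose first 8 characters act as a key.
    private static String readHugeLine(int i) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("%08d", i));
        for (int j = 0; j < 100_000; j++) {
            sb.append('x');
        }
        return sb.toString();
    }
}
```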
As for the general question of avoiding performance pitfalls, I think I should have a canned answer ready to cut and paste: make sure your architecture is efficient, and implement it in the most readable way you can. Measure the performance, and optimise where you find bottlenecks.
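If you do want to measure something like substring cost, a rough timing sketch is below. A proper benchmark harness such as JMH would give far more trustworthy numbers; this only shows the "measure first" habit, and the class name and sizes are arbitrary.

```java
public class SubstringTiming {

    // Accumulator so the JIT can't optimise the substring calls away entirely.
    private static long sink;

    public static void main(String[] args) {
        // Build a reasonably large source string (String.repeat needs Java 11+).
        String source = "x".repeat(100_000);

        // Warm up so the JIT has had a chance to compile the hot path.
        for (int i = 0; i < 10; i++) {
            timePerCallNanos(source);
        }

        System.out.println("substring of half the source took roughly "
                + timePerCallNanos(source) + " ns per call (sink=" + sink + ")");
    }

    // Returns an approximate per-call cost in nanoseconds.
    private static long timePerCallNanos(String source) {
        int iterations = 1_000;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += source.substring(0, source.length() / 2).length();
        }
        return (System.nanoTime() - start) / iterations;
    }
}
```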
1 Incidentally, this makes string very special - it's the only non-array type whose memory footprint varies by instance within the same CLR.
2 For small strings, this is a big win. It's bad enough that there's all the overhead of one object, but when there's an extra array involved as well, a single-character string could take around 36 bytes in Java. (That's a "finger-in-the-air" number - I can't remember the exact object overheads. It will also depend on the VM you're using.)
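For a sense of where a number in that region could come from (these figures are assumed for illustration, not measured): roughly 16 bytes for the String object itself (object header plus the array reference, offset and length fields), plus roughly 16 bytes for the char[] object's header and length, plus 2 bytes for the single character, each rounded up to the VM's alignment, lands comfortably in the mid-30s.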