Why is it that they decided to make String
immutable in Java and .NET (and some other languages)? Why didn\'t they make it mutable?
The decision to have string mutable in C++ causes a lot of problems, see this excellent article by Kelvin Henney about Mad COW Disease.
COW = Copy On Write.
For most purposes, a "string" is (used/treated as/thought of/assumed to be) a meaningful atomic unit, just like a number.
You should know why. Just think about it.
I hate to say it, but unfortunately we're debating this because our language sucks, and we're trying to using a single word, string, to describe a complex, contextually situated concept or class of object.
We perform calculations and comparisons with "strings" similar to how we do with numbers. If strings (or integers) were mutable, we'd have to write special code to lock their values into immutable local forms in order to perform any kind of calculation reliably. Therefore, it is best to think of a string like a numeric identifier, but instead of being 16, 32, or 64 bits long, it could be hundreds of bits long.
When someone says "string", we all think of different things. Those who think of it simply as a set of characters, with no particular purpose in mind, will of course be appalled that someone just decided that they should not be able to manipulate those characters. But the "string" class isn't just an array of characters. It's a STRING
, not a char[]
. There are some basic assumptions about the concept we refer to as a "string", and it generally can be described as meaningful, atomic unit of coded data like a number. When people talk about "manipulating strings", perhaps they're really talking about manipulating characters to build strings, and a StringBuilder is great for that. Just think a bit about what the word "string" truly means.
Consider for a moment what it would be like if strings were mutable. The following API function could be tricked into returning information for a different user if the mutable username string is intentionally or unintentionally modified by another thread while this function is using it:
string GetPersonalInfo( string username, string password )
{
string stored_password = DBQuery.GetPasswordFor( username );
if (password == stored_password)
{
//another thread modifies the mutable 'username' string
return DBQuery.GetPersonalInfoFor( username );
}
}
Security isn't just about 'access control', it's also about 'safety' and 'guaranteeing correctness'. If a method can't be easily written and depended upon to perform a simple calculation or comparison reliably, then it's not safe to call it, but it would be safe to call into question the programming language itself.
String
is not a primitive type, yet you normally want to use it with value semantics, i.e. like a value.
A value is something you can trust won't change behind your back.
If you write: String str = someExpr();
You don't want it to change unless YOU do something with str
.
String
as an Object
has naturally pointer semantics, to get value semantics as well it needs to be immutable.
Immutability is not so closely tied to security. For that, at least in .NET, you get the SecureString
class.
Later edit: In Java you will find GuardedString
, a similar implementation.
According to Effective Java, chapter 4, page 73, 2nd edition:
"There are many good reasons for this: Immutable classes are easier to design, implement, and use than mutable classes. They are less prone to error and are more secure.
[...]
"Immutable objects are simple. An immutable object can be in exactly one state, the state in which it was created. If you make sure that all constructors establish class invariants, then it is guaranteed that these invariants will remain true for all time, with no effort on your part.
[...]
Immutable objects are inherently thread-safe; they require no synchronization. They cannot be corrupted by multiple threads accessing them concurrently. This is far and away the easiest approach to achieving thread safety. In fact, no thread can ever observe any effect of another thread on an immutable object. Therefore, immutable objects can be shared freely
[...]
Other small points from the same chapter:
Not only can you share immutable objects, but you can share their internals.
[...]
Immutable objects make great building blocks for other objects, whether mutable or immutable.
[...]
The only real disadvantage of immutable classes is that they require a separate object for each distinct value.
There are at least two reasons.
First - security http://www.javafaq.nu/java-article1060.html
The main reason why String made immutable was security. Look at this example: We have a file open method with login check. We pass a String to this method to process authentication which is necessary before the call will be passed to OS. If String was mutable it was possible somehow to modify its content after the authentication check before OS gets request from program then it is possible to request any file. So if you have a right to open text file in user directory but then on the fly when somehow you manage to change the file name you can request to open "passwd" file or any other. Then a file can be modified and it will be possible to login directly to OS.
Second - Memory efficiency http://hikrish.blogspot.com/2006/07/why-string-class-is-immutable.html
JVM internally maintains the "String Pool". To achive the memory efficiency, JVM will refer the String object from pool. It will not create the new String objects. So, whenever you create a new string literal, JVM will check in the pool whether it already exists or not. If already present in the pool, just give the reference to the same object or create the new object in the pool. There will be many references point to the same String objects, if someone changes the value, it will affect all the references. So, sun decided to make it immutable.