I'm fairly new to Java (been writing other stuff for many years) and unless I'm missing something (and I'm happy to be wrong here) the following is a fatal flaw...
Dave, you have to forgive me (well, I guess you don't "have to", but I'd rather you did) but that explanation is not overly convincing. The security gains are fairly minimal since anyone who needs to change the value of the string will find a way to do it with some ugly workaround. And speed?! You yourself (quite correctly) assert that the whole business with the + is extremely expensive.
The rest of you guys, please understand that I GET how it works, I'm asking WHY it works that way... please stop explaining the difference between the methodologies.
(and I honestly am not looking for any sort of fight here, btw, I just don't see how this was a rational decision).
When you pass "foo", you're passing the reference to "foo" as a value to ThisDoesntWork(). That means that when you do the assignment to "foo" inside of your method, you are merely setting a local variable (foo)'s reference to be a reference to your new string.
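A minimal sketch of that paragraph, assuming the caller starts with an empty string. The class name PassByValueDemo and the helper callAndReport are made up for illustration and are not from the question:

```java
class PassByValueDemo {
    // Reassigning the parameter only changes the method's own copy of the reference.
    static void thisDoesntWork(String foo) {
        foo = "howdy";
    }

    // Hypothetical helper: returns what the caller's variable holds after the call.
    static String callAndReport() {
        String foo = "";
        thisDoesntWork(foo);
        return foo;
    }

    public static void main(String[] args) {
        System.out.println("[" + callAndReport() + "]"); // prints []
    }
}
```

The caller's foo still refers to the empty string, because the assignment inside the method never touched the caller's variable.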
Another thing to keep in mind when thinking about how strings behave in Java is that strings are immutable. It works the same way in C#, and for some good reasons:
Now onto your bigger question. Why are objects passed this way? Well, if Java passed your string as what you'd traditionally call "by value", it would have to actually copy the entire string before passing it to your function. That's quite slow. If it passed the string by reference and let you change it (like C does), you'd have the problems I just listed.
Since my original answer was "Why it happened" and not "Why was the language designed so it happened," I'll give this another go.
To simplify things, I'll get rid of the method call and show what is happening in another way.
String a = "hello";
String b = a;
b = "howdy";
System.out.print(a); // prints hello
To get the last statement to print "howdy", b would have to point to the same "hole" in memory that a points to (a pointer). This is what you want when you want pass by reference. There are a couple of reasons Java decided not to go in this direction:
Pointers are Confusing: The designers of Java tried to remove some of the more confusing things about other languages. Pointers are one of the most misunderstood and improperly used constructs of C/C++, along with operator overloading.
Pointers are Security Risks: Pointers cause many security problems when misused. A malicious program assigns something to that part of memory, then what you thought was your object is actually someone else's. (Java already got rid of the biggest security problem, buffer overflows, with checked arrays.)
Abstraction Leakage: When you start dealing with exactly "what's in memory and where", your abstraction becomes less of an abstraction. While abstraction leakage almost certainly creeps into a language, the designers didn't want to bake it in directly.
Objects Are All You Care About: In Java, everything is an object, not the space an object occupies. Adding pointers would make the space an object occupies important, though.
You could emulate what you want by creating a "Hole" object. You could even use generics to make it type safe. For example:
public class Hole<T> {
    private T objectInHole;

    public void putInHole(T object) {
        this.objectInHole = object;
    }

    public T getOutOfHole() {
        return objectInHole;
    }

    @Override
    public String toString() {
        return objectInHole.toString();
    }

    // ... equals, hashCode, etc.
}

Hole<String> foo = new Hole<String>();
foo.putInHole(new String());
System.out.println(foo); // this prints nothing (an empty line)
thisWorks(foo);
System.out.println(foo); // this prints howdy

public static void thisWorks(Hole<String> foo) {
    foo.putInHole("howdy");
}
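If you would rather not write your own holder, the standard library's java.util.concurrent.atomic.AtomicReference can play the same role. This is only a sketch of the same idea; the class name AtomicReferenceHoleDemo and the helper callAndReport are invented for the example:

```java
import java.util.concurrent.atomic.AtomicReference;

class AtomicReferenceHoleDemo {
    // The method receives a copy of the reference to the holder,
    // but both copies point at the same holder object.
    static void thisWorks(AtomicReference<String> foo) {
        foo.set("howdy");
    }

    // Hypothetical helper: returns what the caller sees after the call.
    static String callAndReport() {
        AtomicReference<String> foo = new AtomicReference<>("");
        thisWorks(foo);
        return foo.get();
    }

    public static void main(String[] args) {
        System.out.println(callAndReport()); // prints howdy
    }
}
```

AtomicReference also brings thread-safety guarantees you may not need here; a one-element array would be a cruder alternative.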
In Java, everything passed to a method is passed by value, even objects. Every argument a method receives is a copy of the original value. In the case of your string example, the original pointer (it's actually a reference, but to avoid confusion I'll use a different word) is copied into a new variable, which becomes the parameter to the method.
It would be a pain if everything were passed by reference: you would need to make private copies all over the place. Making value types immutable tends to make programs simpler and easier to scale.
Some benefits include:
- No need to make defensive copies.
- Thread safety: no need to worry about locking just in case someone else wants to change the object.
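To make the "defensive copies" point concrete, here is a rough sketch. The classes Named and Log are hypothetical, made up for this example. A class that stores a String can keep the reference it was given, while a class that stores something mutable such as a StringBuilder has to copy it to stay safe:

```java
class ImmutabilityDemo {
    // Hypothetical holder of an immutable String: keeping the reference is safe.
    static final class Named {
        private final String name;
        Named(String name) { this.name = name; }
        String name() { return name; }
    }

    // Hypothetical holder of mutable state: it copies the builder on the way in.
    static final class Log {
        private final StringBuilder text;
        Log(StringBuilder text) { this.text = new StringBuilder(text); } // defensive copy
        String text() { return text.toString(); }
    }

    public static void main(String[] args) {
        String s = "hello";
        Named n = new Named(s);
        s = s.toUpperCase();            // builds a new String; n is unaffected
        System.out.println(n.name());   // prints hello

        StringBuilder sb = new StringBuilder("hello");
        Log log = new Log(sb);
        sb.append(" world");            // changes only the caller's builder
        System.out.println(log.text()); // prints hello
    }
}
```

Without the copy in Log's constructor, the second println would show "hello world", because the caller and the Log would share one mutable object.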
Reference typed arguments are passed as references to objects themselves (not references to other variables that refer to objects). You can call methods on the object that has been passed. However, in your code sample:
public static void thisDoesntWork(String foo){
foo = "howdy";
}
you are only storing a reference to the string "howdy" in a variable that is local to the method. That local variable (foo) was initialized to the value of the caller's foo when the method was called, but has no reference to the caller's variable itself. After initialization:
caller      data        method
------      ------      ------
(foo) -->   ""     <--  (foo)
After the assignment in your method:
caller      data        method
------      ------      ------
(foo) -->   ""
            "howdy" <-- (foo)
There is another issue here: String instances are immutable (by design, for security), so you can't modify their value.
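The difference between calling a method on the passed object and reassigning the parameter is easier to see with a mutable type. The following is only a sketch using StringBuilder in place of String; the class and method names are invented for the example:

```java
class MutateVersusReassign {
    // Calls a method on the object the caller also refers to.
    static void mutate(StringBuilder sb) {
        sb.append("howdy");
    }

    // Only redirects the method's local copy of the reference.
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("howdy");
    }

    // Hypothetical helpers: return what the caller sees after each call.
    static String afterMutate() {
        StringBuilder foo = new StringBuilder();
        mutate(foo);
        return foo.toString();
    }

    static String afterReassign() {
        StringBuilder foo = new StringBuilder();
        reassign(foo);
        return foo.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + afterMutate() + "]");   // prints [howdy]
        System.out.println("[" + afterReassign() + "]"); // prints []
    }
}
```

With String there is no equivalent of mutate, since no method on String changes the instance, which is why only the reassignment case shows up in the original question.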
If you really want your method to provide an initial value for your string (or a new value at any time in its life, for that matter), then have your method return a String value which you assign to the caller's variable at the point of the call. Something like this, for example:
String foo = thisWorks();
System.out.println(foo); // this prints the value assigned to foo in initialization

public static String thisWorks() {
    return "howdy";
}
Here is a rough analogy in C-style pseudocode (with a C#-like string type and ref keyword) and assembler:
void Main()
{
    // message lives on the stack at 0x8001; the string "Hello" lives at 0x0001.
    string message = "Hello";
    // assembly equivalent of message = "Hello":
    // [0x8001] = 0x0001

    // stack address of message
    printf("%d", &message); // 0x8001
    printf("%d", message);  // address stored in message (at 0x8001): 0x0001

    PassStringByValue(message); // passes the address stored in message, 0x0001, not 0x8001

    printf("%d", message);  // address stored in message (at 0x8001): 0x0001, still the same
    // the stack address of message doesn't change
    printf("%d", &message); // 0x8001
}

void PassStringByValue(string foo)
{
    printf("%d", &foo); // &foo is foo's own *stack* address (0x4001)
    // foo (at 0x4001) holds the same address that message holds, 0x0001
    printf("%d", foo);  // 0x0001

    // "World" lives at 0x0002
    foo = "World"; // at foo's address (0x4001), store the new address 0x0002
    // assembly equivalent of foo = "World":
    // [0x4001] = 0x0002

    // print the address foo now holds
    printf("%d", foo); // 0x0002

    // Conclusion: 0x8001 was never involved in this function, so it cannot change Main's message.
    // foo = "World" is the same as [0x4001] = 0x0002
}
// The same program again, this time passing by reference:
void Main()
{
    // message lives on the stack at 0x8001; the string "Hello" lives at 0x0001.
    string message = "Hello";
    // assembly equivalent of message = "Hello":
    // [0x8001] = 0x0001

    // stack address of message
    printf("%d", &message); // 0x8001
    printf("%d", message);  // address stored in message (at 0x8001): 0x0001

    PassStringByRef(ref message); // passes the stack address of message, 0x8001, not 0x0001

    printf("%d", message);  // address stored in message (at 0x8001): 0x0002, it was changed
    // the stack address of message doesn't change
    printf("%d", &message); // 0x8001
}

void PassStringByRef(ref string foo)
{
    printf("%d", &foo); // &foo is foo's own *stack* address (0x4001)
    // foo (at 0x4001) holds the address of message (0x8001)
    printf("%d", foo);  // 0x8001

    // "World" lives at 0x0002
    foo = "World"; // at message's address (0x8001), store the new address 0x0002
    // assembly equivalent of foo = "World":
    // [0x8001] = 0x0002

    // print the address message now holds
    printf("%d", foo); // 0x0002

    // Conclusion: 0x8001 was involved in this function, so it can change Main's message.
    // foo = "World" is the same as [0x8001] = 0x0002
}
One possible reason why everything is passed by value in Java is that its designers wanted to simplify the language and have everything done in an OOP manner.
They would rather have you write an integer swapper using objects than provide first-class support for by-reference passing. The same goes for delegates (Gosling feels icky about pointers to functions and would rather cram that functionality into objects) and enums.
They oversimplified the language (everything is an object) at the cost of first-class support for many language constructs; passing by reference, delegates, enums, and properties come to mind.
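For what it's worth, the "integer swapper using objects" looks roughly like this. It is only a sketch; IntHolder is a made-up class for the example, not something from the standard library:

```java
class SwapDemo {
    // Hypothetical mutable wrapper around an int.
    static final class IntHolder {
        int value;
        IntHolder(int value) { this.value = value; }
    }

    // Swaps the contents of the two holders; the references themselves are still passed by value.
    static void swap(IntHolder a, IntHolder b) {
        int tmp = a.value;
        a.value = b.value;
        b.value = tmp;
    }

    public static void main(String[] args) {
        IntHolder x = new IntHolder(1);
        IntHolder y = new IntHolder(2);
        swap(x, y);
        System.out.println(x.value + " " + y.value); // prints 2 1
    }
}
```

A swap(int a, int b) written directly on primitives would have no visible effect on the caller, which is the gap a ref parameter fills in C#.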