I have some slides from IBM titled "From Java Code to Java Heap: Understanding the Memory Usage of Your Application", which say that when we use String
instead of <
In the JVM, a char variable is stored in a single 16-bit memory location, and changes to that variable overwrite the same location. This makes creating or updating char variables very fast and memory-cheap, but increases the JVM's overhead compared to the static allocation used in Strings.
The JVM stores Java Strings in a variable-size memory space (essentially, an array) that is exactly the size of the string when the String object is created or first assigned a value. Java strings are not null-terminated, so there is no extra slot for a termination character. Thus, an object with initial value "HELP!" would be allocated 80 bits of character storage (5 characters, each 16 bits in size), not counting object headers. This value is immutable, allowing the JVM to inline references to that variable, which makes static string assignments very fast, very compact, and very efficient from the JVM's point of view.
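A quick way to see how many chars a String holds is to ask it. This is only a minimal sketch: the class and method names are made up for illustration, and it counts character data only, not object headers.

```java
// Illustrative helper (hypothetical names): counts the chars in a String
// and the bits needed for that character data alone.
final class StringLengthCheck {

    // Number of 16-bit char values in the String.
    static int charCount(String s) {
        return s.toCharArray().length;
    }

    // Bits of raw character data, ignoring headers and other fields.
    static int dataBits(String s) {
        return charCount(s) * Character.SIZE; // Character.SIZE is 16
    }

    public static void main(String[] args) {
        String s = "HELP!";
        System.out.println(charCount(s)); // 5
        System.out.println(dataBits(s));  // 80
    }
}
```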
Reference
This figure relates to 32-bit JDK 6.
In the pre-Java-7 world, strings were implemented as a pointer to a region of a char[] array:
// "8 (4)" reads "8 bytes for x64, 4 bytes for x32"
class String {       // 8 (4) housekeeping + 8 (4) class pointer
    char[] buf;      // 12 (8) bytes + 2 bytes per char -> 24 (16) aligned
    int offset;      // 4 bytes -> three int
    int length;      // 4 bytes -> fields align to
    int hash;        // 4 bytes -> 16 (12) bytes
}
So I counted:
36 bytes per new String("a") for JDK 6 x32 <-- the overhead from the article
56 bytes per new String("a") for JDK 6 x64.
Just to compare, in JDK 7+ String is a class which holds only a char[] buffer and a hash field.
class String {       // 8 (4) + 8 (4) bytes -> 16 (8) aligned
    char[] buf;      // 12 (8) bytes + 2 bytes per char -> 24 (16) aligned
    int hash;        // 4 bytes -> 8 (4) aligned
}
So it's:
28 bytes per String for JDK 7 x32
48 bytes per String for JDK 7 x64.
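The sums above can be reproduced with a few lines of arithmetic. This is a sketch that simply adds up the per-part sizes quoted in the layout comments for a one-character string; the numbers are the answer's estimates, not measurements, and the class and method names are invented for illustration.

```java
// Adds up the per-part sizes quoted in the layout comments above
// (one-character string). Hypothetical helper, not a measurement tool.
final class StringFootprint {

    // JDK 6 layout: header + char[] + three int fields (offset, length, hash).
    static int jdk6(boolean is64Bit) {
        int header = is64Bit ? 16 : 8;     // housekeeping + class pointer
        int charArray = is64Bit ? 24 : 16; // char[] with one char, aligned
        int intFields = is64Bit ? 16 : 12; // three ints, aligned
        return header + charArray + intFields;
    }

    // JDK 7 layout: header + char[] + hash field only.
    static int jdk7(boolean is64Bit) {
        int header = is64Bit ? 16 : 8;
        int charArray = is64Bit ? 24 : 16;
        int hashField = is64Bit ? 8 : 4;   // one int, aligned
        return header + charArray + hashField;
    }

    public static void main(String[] args) {
        System.out.println("JDK 6 x32: " + jdk6(false)); // 36
        System.out.println("JDK 6 x64: " + jdk6(true));  // 56
        System.out.println("JDK 7 x32: " + jdk7(false)); // 28
        System.out.println("JDK 7 x64: " + jdk7(true));  // 48
    }
}
```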
UPDATE
For the 3.75:1 ratio, see @Andrey's explanation below. The ratio approaches 1 as the length of the string grows.
I'll try explaining the numbers referenced in the source article.
The article describes object metadata typically consisting of: class, flags and lock.
The class and lock are stored in the object header and take 8 bytes on a 32-bit VM. However, I haven't found any information about JVM implementations that keep flags in the object header. They might be stored somewhere externally (e.g. by the garbage collector, to count references to the object).
So let's assume that the article talks about some x32 AbstractJVM which uses 12 bytes of memory to store meta information about the object.
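Then for char[] we have:
12 bytes of meta information
4 bytes for the array length
2 bytes for each character stored
2 bytes of alignment padding if the number of characters is odd (... 2 * (4 - (length + 2) % 4))
For java.lang.String we have:
12 bytes of meta information
16 bytes for the String fields
the memory needed to store the char[] as described above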
So, let's count how much memory is needed to store "MyString" as a String object:
12 + 16 + (12 + 4 + 2 * "MyString".length + 2 * ("MyString".length % 2)) = 60 bytes.
On the other hand, we know that to store only the data (without information about the data type, length or anything else) we need:
2 * "MyString".length = 16 bytes
Overhead is 60 / 16 = 3.75
Similarly, for a single-character string we get the 'maximum overhead':
12 + 16 + (12 + 4 + 2 * "a".length + 2 * ("a".length % 2)) = 48 bytes
2 * "a".length = 2 bytes
48 / 2 = 24
Following the article authors' logic, the ultimate maximum overhead, infinity, is reached when we store an empty string :).
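The formula used above is easy to turn into a helper to see how the ratio behaves for other lengths. This is a sketch under the same assumptions as the calculation (a 32-bit "AbstractJVM" with 12 bytes of meta information and 16 bytes of String fields); the class and method names are illustrative only.

```java
// Encodes the formula from the text:
// 12 + 16 + (12 + 4 + 2 * length + 2 * (length % 2))
// The constants are the assumed sizes from the answer, not measured values.
final class StringOverhead {

    // Estimated bytes for a String of the given length, including its char[].
    static int stringBytes(int length) {
        int meta = 12;         // assumed object meta information
        int stringFields = 16; // assumed String fields
        int charArray = 12 + 4 + 2 * length + 2 * (length % 2);
        return meta + stringFields + charArray;
    }

    // Bytes needed for the character data alone.
    static int dataBytes(int length) {
        return 2 * length;
    }

    // Total bytes divided by data bytes.
    static double ratio(int length) {
        return (double) stringBytes(length) / dataBytes(length);
    }

    public static void main(String[] args) {
        System.out.println(ratio("MyString".length())); // 3.75
        System.out.println(ratio("a".length()));        // 24.0
        System.out.println(ratio(1000));                // close to 1
        System.out.println(ratio(0));                   // Infinity
    }
}
```

For a length of 1000 the estimate is 2044 bytes against 2000 bytes of data, which illustrates how the ratio falls toward 1 as the string grows.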
I read the following in an old Stack Overflow answer but could not follow it. In Oracle's JDK a String has four instance-level fields:
A character array
An integral offset
An integral character count
An integral hash value
That means that each String introduces an extra object reference (the String itself), and three integers in addition to the character array itself. (The offset and character count are there to allow sharing of the character array among String instances produced through the String#substring() methods, a design choice that some other Java library implementers have eschewed.) Beyond the extra storage cost, there's also one more level of access indirection, not to mention the bounds checking with which the String guards its character array.
If you can get away with allocating and consuming just the basic character array, there's space to be saved there. It's certainly not idiomatic to do so in Java though; judicious comments would be warranted to justify the choice, preferably with mention of evidence from having profiled the difference.
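As a rough illustration of working with the bare character array, the sketch below compares and prints char[] values directly and wraps them in a String only at the point where one is required. The class and method names are hypothetical, and whether this saves anything worthwhile in a given program is something only profiling can show.

```java
import java.util.Arrays;

// Hypothetical helper showing char[] used in place of String.
final class RawChars {

    // Content comparison; '==' on arrays would compare references only.
    static boolean sameText(char[] a, char[] b) {
        return Arrays.equals(a, b);
    }

    // Create a String only when an API actually needs one.
    static String asString(char[] chars) {
        return String.valueOf(chars);
    }

    public static void main(String[] args) {
        char[] word = {'h', 'e', 'l', 'p'};
        char[] other = "help".toCharArray();
        System.out.println(sameText(word, other)); // true
        System.out.println(asString(word));        // help
    }
}
```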