Question
To create a HashMap/HashSet for N elements, we generally write new HashMap((int)(N/0.75F)+1), which is annoying.
Why doesn't the library take care of this in the first place and allow initialization like new HashMap(N) (which should not rehash until N elements have been inserted), handling the (int)(N/0.75F)+1 calculation itself?
Answer 1:
Update
Updating to reflect the changed question. No, there is no such standard API, but there is a method Maps.newHashMapWithExpectedSize(int) in Guava:
Creates a HashMap instance, with a high enough "initial capacity" that it should hold expectedSize elements without growth.
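As a rough sketch (not Guava's actual source), a helper with that contract can be written in a few lines of plain Java; the class and method names here are hypothetical, and the capacity formula is the same one quoted in the question:

```java
import java.util.HashMap;
import java.util.Map;

public class ExpectedSizeMap {
    // Hypothetical helper mirroring the contract of Guava's
    // Maps.newHashMapWithExpectedSize: pick an initial capacity large enough
    // that expectedSize entries fit without a resize at the default 0.75 load factor.
    static <K, V> Map<K, V> newHashMapWithExpectedSize(int expectedSize) {
        return new HashMap<>((int) (expectedSize / 0.75f) + 1);
    }

    public static void main(String[] args) {
        Map<String, Integer> m = newHashMapWithExpectedSize(100);
        m.put("a", 1);
        System.out.println(m.get("a")); // 1
    }
}
```

This keeps the annoying arithmetic in one place instead of repeating it at every call site.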
"I have to initialize it to (int)(N/0.75F)+1"
No, you don't. If you create a new HashMap from another Map, HashMap calculates the capacity for you by default:
public HashMap(Map<? extends K, ? extends V> m) {
this(Math.max((int) (m.size() / DEFAULT_LOAD_FACTOR) + 1,
DEFAULT_INITIAL_CAPACITY), DEFAULT_LOAD_FACTOR);
putAllForCreate(m);
}
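To illustrate the point above, a short sketch using the copy constructor; no manual capacity arithmetic is needed because the constructor derives it from source.size():

```java
import java.util.HashMap;
import java.util.Map;

public class CopyConstructorDemo {
    public static void main(String[] args) {
        Map<String, Integer> source = new HashMap<>();
        for (int i = 0; i < 100; i++) source.put("key" + i, i);

        // The copy constructor sizes the new map from source.size(),
        // so the caller never writes (int)(N/0.75F)+1 by hand.
        Map<String, Integer> copy = new HashMap<>(source);
        System.out.println(copy.size()); // 100
    }
}
```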
If you add elements one by one, the same process happens as well:
void addEntry(int hash, K key, V value, int bucketIndex) {
if ((size >= threshold) && (null != table[bucketIndex])) {
resize(2 * table.length);
//...
}
createEntry(hash, key, value, bucketIndex);
}
The only reason to use the HashMap(int initialCapacity, float loadFactor) constructor is when you know from the very beginning how many elements you want to store in the HashMap, thus avoiding resizing and rehashing later (the map has the correct size from the very beginning).
One interesting implementation detail is that the initial capacity is rounded up to the nearest power of two (see: Why ArrayList grows at a rate of 1.5, but for Hashmap it's 2?):
// Find a power of 2 >= initialCapacity
int capacity = 1;
while (capacity < initialCapacity)
capacity <<= 1;
So if you want your HashMap to have exactly the capacity you define, just use powers of two.
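The rounding loop quoted above can be run standalone to see the effect; the class and method names here are just for illustration, but the loop body is the one from the pre-Java-8 HashMap source shown above:

```java
public class CapacityRounding {
    // Same rounding loop the old HashMap constructor uses:
    // find the smallest power of two >= initialCapacity.
    static int roundUpToPowerOfTwo(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOfTwo(12)); // 16
        System.out.println(roundUpToPowerOfTwo(16)); // 16
        System.out.println(roundUpToPowerOfTwo(17)); // 32
    }
}
```

So a requested capacity of 17 actually allocates a table of 32 buckets.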
Choosing a different loadFactor allows you to trade space for performance: a smaller value means more memory, but fewer collisions.
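A small sketch of that trade-off, assuming the standard relationship threshold = capacity * loadFactor (the entry count at which a resize is triggered):

```java
import java.util.HashMap;
import java.util.Map;

public class LoadFactorTradeoff {
    public static void main(String[] args) {
        // threshold = capacity * loadFactor: the entry count that triggers a resize.
        // A lower load factor leaves more buckets empty (more memory used per entry)
        // but makes hash collisions less likely.
        int capacity = 16;
        System.out.println("threshold at 0.50f: " + (int) (capacity * 0.50f)); // 8
        System.out.println("threshold at 0.75f: " + (int) (capacity * 0.75f)); // 12

        // Both constructions are legal; only the resize point differs.
        Map<Integer, Integer> lowCollision = new HashMap<>(capacity, 0.5f);
        Map<Integer, Integer> compact = new HashMap<>(capacity, 0.75f);
        lowCollision.put(1, 1);
        compact.put(1, 1);
    }
}
```

With a load factor of 0.5 the 16-bucket table resizes after only 8 entries, so you pay for the lower collision rate with roughly twice the table space.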
Answer 2:
I ran the following program:
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

public static void main(String... args) throws IllegalAccessException, NoSuchFieldException {
for (int i = 12; i < 80; i++) {
Map<Integer, Integer> map = new HashMap<Integer, Integer>((int) Math.ceil(i / 0.75));
int beforeAdding = Array.getLength(getField(map, "table"));
for (int j = 0; j < i; j++) map.put(j, j);
int afterAdding = Array.getLength(getField(map, "table"));
map.put(i, i);
int oneMore = Array.getLength(getField(map, "table"));
System.out.printf("%,d: initial %,d, after N %,d, after N+1 %,d%n ",
i, beforeAdding, afterAdding, oneMore);
}
}
private static <T> T getField(Map<Integer, Integer> map, String fieldName) throws NoSuchFieldException, IllegalAccessException {
Field table = map.getClass().getDeclaredField(fieldName);
table.setAccessible(true);
return (T) table.get(map);
}
which prints out
12: initial 16, after N 16, after N+1 32
13: initial 32, after N 32, after N+1 32
.. deleted ..
24: initial 32, after N 32, after N+1 64
25: initial 64, after N 64, after N+1 64
.. deleted ..
47: initial 64, after N 64, after N+1 64
48: initial 64, after N 64, after N+1 128
49: initial 128, after N 128, after N+1 128
.. deleted ..
79: initial 128, after N 128, after N+1 128
This shows that in the default initialiser the initial capacity is rounded up to the next power of two. The problem with this value is that if you want N to be the eventual size, you have to take the load factor into account yourself to avoid resizing. Ideally you shouldn't have to, in the same way the Map copy constructor handles it for you.
Answer 3:
Most implementations grow automatically as you add more elements. The performance of most implementations also tends to decrease when the containers get fuller. That's why there is a load factor in the first place: to leave some empty space available.
Source: https://stackoverflow.com/questions/13329819/why-hashmap-initial-capacity-is-not-properly-handled-by-the-library