In HashMap, why is the threshold value (the next size value at which to resize) capacity * load factor? Why is it not simply equal to the size or the capacity?
Javadoc, Javadoc, Javadoc. That is the first place to look. The HashMap Javadoc says:
As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
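To make that last sentence concrete, here is a minimal sketch of pre-sizing a HashMap so that no rehash should occur. The entry count and class name are just for illustration; the constructor HashMap(initialCapacity, loadFactor) is the standard one from java.util.

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedMapExample {
    public static void main(String[] args) {
        int expectedEntries = 1000;   // illustrative figure, not from the quoted Javadoc
        float loadFactor = 0.75f;     // HashMap's default load factor

        // Per the Javadoc: if the initial capacity is greater than
        // expectedEntries / loadFactor, no rehash should ever occur
        // while the map holds at most expectedEntries entries.
        int initialCapacity = (int) Math.ceil(expectedEntries / loadFactor);

        Map<String, Integer> map = new HashMap<>(initialCapacity, loadFactor);

        for (int i = 0; i < expectedEntries; i++) {
            map.put("key-" + i, i);   // no resize expected during these puts
        }
        System.out.println("Stored " + map.size() + " entries; no rehash expected.");
    }
}
```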
As for the theory of hash maps: if your map is full, you're doing something very, very wrong. By that point you're likely at O(sqrt(N)) lookups with random data, which is bad. You never want your hash map to be full. But a very sparse map wastes too much space (as you noted) and takes too long to iterate through. Hence there should be a load factor that is less than 1 for most use cases.
Note: The "wasted space" is proportional to the size of the map, and inversely proportional to the load factor. However lookup times have a more complex expected performance function. This means that the same load factor will not work for different size hash maps - as it will mean different scale tradeoffs.
A general overview of these tradeoffs can be found in Knuth, The Art of Computer Programming, Vol. 3 (Sorting and Searching).