I have cells whose numeric value can be anything between 0 and Integer.MAX_VALUE. I would like to color-code these cells accordingly.
If the val
I figured a log fit would be good for this, but looking at the results, I'm not so sure.
However, Wolfram|Alpha is great for experimenting with this sort of thing:
I started with that, and ended up with:
r(x) = floor(((11.5553 * log(14.4266 * (x + 1.0))) - 30.8419) / 0.9687)
Interestingly, it turns out that this gives nearly identical results to Artelius's answer of:
r(x) = floor(255 * log(x + 1) / log(2^31 + 1))
IMHO, you'd be best served with a split function for 0-10000 and 10000-2^31.
Note that if you want the cells to get brighter and brighter, perceived luminosity is not linear, so a straight mapping from value to color will not give a good result.
The Color class has a method to make a brighter color. Have a look at that.
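For what it's worth, here is a minimal sketch of that idea, assuming the Color class meant here is java.awt.Color, whose brighter() method returns a new, lighter Color. The class name BrighterSketch and the helper brighten are made up for illustration.

```java
import java.awt.Color;

final class BrighterSketch {
    // Hypothetical helper: applies Color.brighter() the given number of times.
    static Color brighten(Color base, int steps) {
        Color c = base;
        for (int i = 0; i < steps; i++) {
            c = c.brighter(); // each call returns a new, lighter Color
        }
        return c;
    }

    public static void main(String[] args) {
        Color base = new Color(40, 0, 0); // dark red, chosen arbitrarily
        for (int steps = 0; steps <= 4; steps++) {
            System.out.println(steps + " step(s): " + brighten(base, steps));
        }
    }
}
```

You would still need some rule for turning a cell value into a number of steps; that part is left open here.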
In general (since it's not clear to me whether this is a Java or language-agnostic question), you would divide the value you have by Integer.MAX_VALUE, multiply by 255, and convert to an integer.
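If this is Java, a rough sketch of that recipe could look like the following. One thing to watch: 255 * value overflows an int for large values, so the sketch widens to long before multiplying. The class and method names are invented for the example.

```java
final class NaiveLinearScale {
    // Divide by Integer.MAX_VALUE, multiply by 255, truncate to an int.
    // The multiplication is done in long arithmetic to avoid int overflow.
    static int scale(int value) {
        return (int) (255L * value / Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        int[] samples = {0, 10_000, Integer.MAX_VALUE / 2, Integer.MAX_VALUE};
        for (int v : samples) {
            System.out.println(v + " -> " + scale(v));
        }
    }
}
```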
For a linear mapping of the range 0-2^32 to 0-255, just take the high-order byte. Here is how that would look using binary & and bit-shifting (the parentheses are needed because >> binds more tightly than & in C-family languages):
r = (value & 0xff000000) >> 24
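A Java sketch of the high-order-byte idea, under two assumptions of mine: a full 0-2^32 range has to live in a long, since Java's int is signed, and for a non-negative int (0 to 2^31 - 1) the top 8 of its 31 value bits are what span 0-255. The names are hypothetical.

```java
final class HighByteSketch {
    // Unsigned 32-bit quantity stored in a long (0 .. 2^32 - 1): take the top byte.
    static int highByte(long unsigned32) {
        return (int) ((unsigned32 & 0xff000000L) >>> 24);
    }

    // Non-negative Java int (0 .. 2^31 - 1): keep the top 8 of the 31 value bits.
    static int top8Of31(int value) {
        return value >>> 23;
    }

    public static void main(String[] args) {
        System.out.println(highByte(0xffffffffL));
        System.out.println(top8Of31(Integer.MAX_VALUE));
        // Without parentheses the shift is applied to the mask first,
        // so the mask becomes all ones and the value comes back unchanged.
        System.out.println(5 & 0xff000000 >> 24);
    }
}
```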
Using mod 256 will certainly return a value 0-255, but you won't be able to draw any grouping sense from the results: 1, 257, 513, and 1025 will all map to the scaled value 1, even though they are far from each other.
If you want to be more discriminating among low values, and merge many more large values together, then a log expression will work:
r = log(value)/log(pow(2,32))*256
EDIT: Yikes, my high school algebra teacher Mrs. Buckenmeyer would faint! log(pow(2,32)) is the same as 32*log(2), and much cheaper to evaluate. And now we can also factor this better, since 256/32 is a nice even 8:
r = 8 * log(value)/log(2)
log(value)/log(2) is actually log-base-2 of value, which log does for us very neatly:
r = 8 * log(value,2)
There, Mrs. Buckenmeyer - your efforts weren't entirely wasted!
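In Java specifically, Math.log takes a single argument (natural log), so the log(value)/log(2) form is the one to use. A minimal sketch, with two choices that are mine rather than the answer's: values below 1 are mapped to 0 because log(0) is undefined, and the result is truncated to an int.

```java
final class Log2ScaleSketch {
    // r = 8 * log2(value), with log2 computed as ln(value) / ln(2).
    static int scale(int value) {
        if (value < 1) {
            return 0; // assumption: treat 0 (and negatives) as the lowest bucket
        }
        return (int) (8 * Math.log(value) / Math.log(2));
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 2, 1000, Integer.MAX_VALUE};
        for (int v : samples) {
            System.out.println(v + " -> " + scale(v));
        }
    }
}
```

With a Java int the input stops just short of 2^31, so the result tops out at 247 rather than 255.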
The "fairest" linear scaling is actually done like this:
floor(256 * value / (Integer.MAX_VALUE + 1))
Note that this is just pseudocode and assumes floating-point calculations.
If we assume that Integer.MAX_VALUE + 1 is 2^31, and that / will give us integer division, then it simplifies to
value / 8388608
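Here is how the two forms above might look in Java, as a sketch with invented names. Note that Integer.MAX_VALUE + 1 overflows in int arithmetic, so the first form does the sum in long; the second form is plain integer division.

```java
final class FairLinearScale {
    // floor(256 * value / (Integer.MAX_VALUE + 1)), computed in long arithmetic.
    static int scale(int value) {
        return (int) (256L * value / (Integer.MAX_VALUE + 1L));
    }

    // Simplified form: 2^31 / 256 = 2^23 = 8388608.
    static int scaleByDivision(int value) {
        return value / 8388608;
    }

    public static void main(String[] args) {
        int[] samples = {0, 8388607, 8388608, Integer.MAX_VALUE};
        for (int v : samples) {
            System.out.println(v + " -> " + scale(v) + " / " + scaleByDivision(v));
        }
    }
}
```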
Why other answers are wrong
Some answers (as well as the question itself) suggested a variation of (255 * value / Integer.MAX_VALUE). Presumably this has to be converted to an integer, either using round() or floor().
If using floor(), the only value that produces 255 is Integer.MAX_VALUE itself. This distribution is uneven.
If using round(), 0 and 255 will each get hit half as many times as 1-254. Also uneven.
Using the scaling method I mention above, no such problem occurs.
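To see the unevenness without looping over two billion values, here is a scaled-down sketch. It is an analogy of my own, not the real range: inputs run from 0 to 1023 (standing in for Integer.MAX_VALUE) and outputs from 0 to 3 (standing in for 255). It counts how many inputs land in each output bucket under the three schemes.

```java
import java.util.Arrays;

final class BucketCountSketch {
    static final int MAX = 1023; // stand-in for Integer.MAX_VALUE
    static final int TOP = 3;    // stand-in for 255

    // floor(TOP * v / MAX): only v == MAX reaches the top bucket.
    static int[] countFloor() {
        int[] counts = new int[TOP + 1];
        for (int v = 0; v <= MAX; v++) counts[TOP * v / MAX]++;
        return counts;
    }

    // round(TOP * v / MAX): the two end buckets get about half as many hits.
    static int[] countRound() {
        int[] counts = new int[TOP + 1];
        for (int v = 0; v <= MAX; v++) counts[(int) Math.round((double) TOP * v / MAX)]++;
        return counts;
    }

    // floor((TOP + 1) * v / (MAX + 1)): every bucket gets the same number of hits.
    static int[] countFair() {
        int[] counts = new int[TOP + 1];
        for (int v = 0; v <= MAX; v++) counts[(TOP + 1) * v / (MAX + 1)]++;
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("floor: " + Arrays.toString(countFloor())); // [341, 341, 341, 1]
        System.out.println("round: " + Arrays.toString(countRound())); // [171, 341, 341, 171]
        System.out.println("fair:  " + Arrays.toString(countFair()));  // [256, 256, 256, 256]
    }
}
```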
Non-linear methods
If you want to use logs, try this:
255 * log(value + 1) / log(Integer.MAX_VALUE + 1)
You could also just take the square root of the value. For inputs up to Integer.MAX_VALUE the raw square root tops out near 46341 rather than 255, so you would need to rescale it to fit.
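A Java sketch of both non-linear options, with invented names. The log version follows the formula above; the square-root version normalizes the value to 0-1 before taking the root so that the result spans 0-255, which is my own choice of rescaling. Both truncate to an int.

```java
final class NonLinearScaleSketch {
    // 255 * log(value + 1) / log(Integer.MAX_VALUE + 1).
    // The + 1.0 promotes to double, so nothing overflows; the ratio is taken
    // first so that MAX_VALUE maps to exactly 255.
    static int logScale(int value) {
        double ratio = Math.log(value + 1.0) / Math.log(Integer.MAX_VALUE + 1.0);
        return (int) (255 * ratio);
    }

    // Assumed rescaling: 255 * sqrt(value / Integer.MAX_VALUE).
    static int sqrtScale(int value) {
        return (int) (255 * Math.sqrt((double) value / Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        int[] samples = {0, 10_000, 1 << 29, Integer.MAX_VALUE};
        for (int v : samples) {
            System.out.println(v + " -> log " + logScale(v) + ", sqrt " + sqrtScale(v));
        }
    }
}
```

The log curve spends most of its output range on small inputs, while the square root is gentler; which one looks better depends on how your values are distributed.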
Here are a bunch of algorithms for scaling, normalizing, ranking, etc. numbers by using Extension Methods in C#, although you can adapt them to other languages:
http://www.redowlconsulting.com/Blog/post/2011/07/28/StatisticalTricksForLists.aspx
There are explanations and graphics that explain when you might want to use one method or another.