When I run the code
hist(1:5)
or
hist(c(1,2,3,4,5))
The generated histogram shows that the first number \
Taking your first example, hist(1:5), you have five numbers, which get put into four bins. So two of those five get lumped into one.
The bins have upper edges at 2, 3, 4, and 5, so you can reasonably infer that the rule hist uses to decide where a number is plotted is:
# pseudocode: i is a data value, upper_edge is the upper break of a bin
if (i <= upper_edge) {
  # plot i in this bin
}
You can specify the breaks manually to solve this:
hist(1:5, breaks=0:5)
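One way to check this inference without reading it off the plot is sketched below. It assumes the default arguments of hist (right-closed bins, with the lowest edge included in the first bin) and uses cut() to mimic the same rule. The variable names are illustrative only.

```r
x <- 1:5

# plot = FALSE returns the histogram object instead of drawing it.
h <- hist(x, plot = FALSE)
h$breaks  # bin edges chosen by hist()
h$counts  # number of values in each bin

# Mimic the binning rule with cut(): bins are closed on the right and
# the lowest edge belongs to the first bin, so 1 and 2 share a bin.
bins <- cut(x, breaks = h$breaks, right = TRUE, include.lowest = TRUE)
table(bins)

# With manual breaks 0:5, every value falls into its own bin.
h2 <- hist(x, breaks = 0:5, plot = FALSE)
h2$counts
```

If the counts from table(bins) match h$counts, the right-closed rule is a fair description of what hist is doing here.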
What you are seeing is that hist is placing 1:5 into four bins, so one bin will have a count of 2.
If you specify the cutoff points like so:
hist(1:5, breaks = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5))
then you will get the behaviour that you expect.
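For integer-valued data in general, the same cutoffs can be generated rather than typed out. The sketch below uses a hypothetical helper, integer_breaks, which is not part of R; it simply places a break half a unit either side of each integer in the data's range.

```r
# Hypothetical helper (not part of R): breaks at k - 0.5 and k + 0.5
# for every integer k between min(x) and max(x).
integer_breaks <- function(x) {
  seq(floor(min(x)) - 0.5, ceiling(max(x)) + 0.5, by = 1)
}

x <- 1:5
b <- integer_breaks(x)  # same cutoffs as typed out above
h <- hist(x, breaks = b, plot = FALSE)
h$counts                # one value per bin
```

Because no data value sits on a break, it no longer matters which side of each bin is closed.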
Try this (the step number passed to at may differ between R versions):
> trace("hist.default", quote(print(fuzzybreaks)), at = 25)
Tracing function "hist.default" in package "graphics"
[1] "hist.default"
>
> out <- hist(1:5)
Tracing hist.default(1:5) step 25
[1] 0.9999999 2.0000001 3.0000001 4.0000001 5.0000001
> out$count
[1] 2 1 1 1
which shows the actual fuzzybreaks value it is using as well as the count in each bin. Clearly there are two points in the first bin (between 0.9999999 and 2.0000001) and one point in every other bin.
Compare with:
> out <- hist(1:5, breaks = 0:5 + 0.5)
Tracing hist.default(1:5, breaks = 0:5 + 0.5) step 25
[1] 0.4999999 1.5000001 2.5000001 3.5000001 4.5000001 5.5000001
> out$count
[1] 1 1 1 1 1
Now there is clearly one point in each bin.