Histogram of uniform distribution not plotted correctly in R

When I run the code

```
hist(1:5)
hist(c(1,2,3,4,5))
```

The generated histogram shows that the first number \

3 Answers

鱼传尺愫

2020-12-20 23:25
Taking your first example, hist(1:5): you have five numbers, which get put into four bins, so two of the five are lumped into one bin.

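The histogram has breaks at 1, 2, 3, 4, and 5. By default each bin is closed on the right (right = TRUE), and the smallest value is also included in the first bin (include.lowest = TRUE), so both 1 and 2 land in the first bin. The same rule can be reproduced with cut:
```
# Bins are (lower, upper]; include.lowest = TRUE makes the first one [1, 2].
x <- 1:5
breaks <- 1:5
table(cut(x, breaks, right = TRUE, include.lowest = TRUE))
# counts per bin: 2 1 1 1
```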
You can specify the breaks manually to solve this:
```
hist(1:5, breaks=0:5)
```
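
To check the bin counts without reading them off the plot, one option is to ask hist for its computed object with plot = FALSE. This is a minimal sketch; the names default_h and manual_h are only illustrative:

```r
# Compute the histograms without drawing them so the counts can be inspected.
default_h <- hist(1:5, plot = FALSE)               # default breaks
manual_h  <- hist(1:5, breaks = 0:5, plot = FALSE) # breaks given by hand

default_h$breaks   # 1 2 3 4 5
default_h$counts   # 2 1 1 1
manual_h$counts    # 1 1 1 1 1
```

With breaks = 0:5 each value gets its own bin, although each bar then spans the unit interval ending at its value rather than being centred on it.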
耶瑟儿～

2020-12-20 23:39
What you are seeing is that hist is placing 1:5 into four bins. So there will be one bin with 2 counts.

If you specify the cutoff points like so:
```
hist(1:5, breaks = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5))
```
then you will get the behaviour that you expect.
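
For other integer-valued data the same half-integer breaks can be generated rather than typed out. The helper below is a sketch, and integer_breaks is a hypothetical name, not a function from base R:

```r
# Hypothetical helper: one bin per integer, centred on that integer.
integer_breaks <- function(x) {
  seq(floor(min(x)) - 0.5, ceiling(max(x)) + 0.5, by = 1)
}

x <- 1:5
b <- integer_breaks(x)                            # 0.5 1.5 2.5 3.5 4.5 5.5
counts <- hist(x, breaks = b, plot = FALSE)$counts
counts                                            # 1 1 1 1 1
```

Dropping plot = FALSE draws the histogram with one bar centred on each integer.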

无人及你

2020-12-20 23:47

Try this:

```
> trace("hist.default", quote(print(fuzzybreaks)), at = 25)
Tracing function "hist.default" in package "graphics"
[1] "hist.default"
>
> out <- hist(1:5)
Tracing hist.default(1:5) step 25 
[1] 0.9999999 2.0000001 3.0000001 4.0000001 5.0000001
> out$count
[1] 2 1 1 1
```

which shows the actual fuzzybreaks value it is using as well as the count in each bin. Clearly there are two points in the first bin (between 0.9999999 and 2.0000001) and one point in every other bin.

Compare with:

```
> out <- hist(1:5, breaks = 0:5 + 0.5)
Tracing hist.default(1:5, breaks = 0:5 + 0.5) step 25 
[1] 0.4999999 1.5000001 2.5000001 3.5000001 4.5000001 5.5000001
> out$count
[1] 1 1 1 1 1
```

Now there is clearly one point in each bin.
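
Which bin receives the extra point depends on which side of each interval is closed. As a rough illustration (it only compares the documented right argument of hist and does not reproduce its internals), switching to left-closed bins moves the doubled count from the first bin to the last:

```r
x <- 1:5

# Default: right-closed bins, with the minimum folded into the first bin.
right_closed <- hist(x, plot = FALSE)$counts                 # 2 1 1 1

# right = FALSE: left-closed bins, with the maximum folded into the last bin.
left_closed <- hist(x, right = FALSE, plot = FALSE)$counts   # 1 1 1 2
```

Either way, five values fall into four bins, so explicit breaks are still needed to get one value per bin.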
