R: summary() returns strange 1st Qu

倾然丶 夕夏残阳落幕 submitted on 2020-01-02 23:13:06

Question


There's an exercise in Khan Academy's Probability and Statistics course on creating a box-and-whisker plot. (The original post included a screenshot of the correct solution.) But when I tried to check the solution in R, I got the following:

d <- c(11, 4, 1, 4, 2, 2, 6, 10, 5, 6, 0, 6, 3, 3)
summary(d)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   0.00    2.25    4.00    4.50    6.00   11.00 

You can see that 1st Qu. is reported as 2.25, but the correct value is 2. All the other values returned by summary() are correct. Any ideas why summary() returns the wrong result?


Answer 1:


In a nutshell, there are many reasonable ways to compute quantiles. This is evidenced by the nine (!) different methods supported by the quantile function.

summary is not wrong; it is just using a different method from the one you're expecting. It is likely using quantile's default, method 7 (called "type 7" in the help page). Like most of the other methods, it performs linear interpolation between two adjacent sorted values, here 2 and 3.
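To make the interpolation concrete, here is a minimal sketch that reproduces the 2.25 by hand. It assumes the type 7 rule as described in the quantile help page, where the p-th quantile sits at fractional position (n - 1) * p + 1 in the sorted data; the variable names (x, h, lo, hi, q1_manual) are only for illustration.

```r
# Data from the question
d <- c(11, 4, 1, 4, 2, 2, 6, 10, 5, 6, 0, 6, 3, 3)

x <- sort(d)            # 0 1 2 2 3 3 4 4 5 6 6 6 10 11
n <- length(x)          # 14 observations
p <- 0.25

# Type 7: fractional position of the p-th quantile in the sorted data
h  <- (n - 1) * p + 1   # 4.25
lo <- floor(h)          # 4
hi <- ceiling(h)        # 5

# Linear interpolation between the 4th and 5th sorted values (2 and 3)
q1_manual <- x[lo] + (h - lo) * (x[hi] - x[lo])

q1_manual                                  # 2.25
unname(quantile(d, probs = 0.25))          # built-in default, for comparison
```

The position 4.25 falls a quarter of the way from the 4th value (2) to the 5th value (3), which is where the extra 0.25 comes from.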

You could try experimenting with the other methods by calling quantile with the appropriate type argument:

> quantile(d, type=1)
  0%  25%  50%  75% 100% 
   0    2    4    6   11 
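If you want to see all the methods side by side rather than trying them one at a time, something like the following sketch should work. The vector name q1_by_type and its labels are made up for this example; only quantile and its type argument come from base R.

```r
# Data from the question
d <- c(11, 4, 1, 4, 2, 2, 6, 10, 5, 6, 0, 6, 3, 3)

# First quartile under each of the nine quantile types
q1_by_type <- sapply(1:9, function(t) {
  unname(quantile(d, probs = 0.25, type = t))
})
names(q1_by_type) <- paste0("type", 1:9)

q1_by_type
```

Comparing the entries against the value you expect is a quick way to find out which definition a textbook or course is using.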



Answer 2:


I had this very same issue as well. I think it has to do with the type of quantile calculation used.

This article explains it better than I can: http://datapigtechnologies.com/blog/index.php/why-excel-has-multiple-quartile-functions-and-how-to-replicate-the-quartiles-from-r-and-other-statistical-packages/

To see examples:

quantile(d, probs=0.25)
25% 
2.25 
quantile(d, probs=0.25, type=6)
25% 
2 
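For what it's worth, the value of 2 is also what you get from the "median of each half" rule often taught in introductory courses. That the exercise uses this rule is an assumption on my part, not something stated in the question. A rough sketch, with made-up variable names, next to base R's fivenum (Tukey's five-number summary):

```r
# Data from the question
d <- c(11, 4, 1, 4, 2, 2, 6, 10, 5, 6, 0, 6, 3, 3)

x    <- sort(d)
n    <- length(x)
half <- floor(n / 2)               # size of each half; drops the middle value if n is odd

lower <- x[seq_len(half)]          # smallest half of the data
upper <- x[(n - half + 1):n]       # largest half of the data

q1_halves <- median(lower)         # first quartile as the median of the lower half
q3_halves <- median(upper)         # third quartile as the median of the upper half

c(Q1 = q1_halves, Q3 = q3_halves)  # 2 and 6
fivenum(d)                         # 0 2 4 6 11
```

For this even-length data set the two approaches agree. For odd-length data, fivenum includes the median in both halves, so its hinges can differ from the rule sketched above.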


Source: https://stackoverflow.com/questions/27210745/r-summary-returns-strange-1st-qu
