Question
A bit confused. I have a few LoadRunner Analysis reports from tests I've run. I'm new to testing. My understanding of the 90th percentile is that, because it takes the 90th percentile and leaves out the outliers, it presents a truer picture. However, I'm looking at two different reports, and in both the 90th percentile response time is higher than the average response time given in the Summary Report. How can that be possible?
I'm looking at the graph of transaction response times (Percentile) and the last 10% shoot up, which suggests to me that taking the 90% should give a lower response time.
Example
Transaction 1
Min 0.012
Avg 1.919
Max 20.935
SD 2.718
90 Percentile 6.412
A lot of the transactions look like this, more-or-less. Why is the 90th percentile higher than the average?
Answer 1:
The 90th percentile means that 90% of the values fall below this value. The value in this case would be your response time. So if you had 1000 values and the 90th percentile is n, 900 of those values would be below n, and only 100 above n -- so it makes sense that the average is less than the 90th percentile.
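As a rough illustration of the point above, here is a minimal Python sketch. The response times are made up for the example, and `percentile_nearest_rank` is a hypothetical helper using the nearest-rank definition; LoadRunner's exact percentile method may differ.

```python
import math
from statistics import mean


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical response times in seconds: mostly fast, with a slow tail.
times = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 6.0, 20.0]

avg = mean(times)
p90 = percentile_nearest_rank(times, 90)

print(f"avg={avg:.2f}  p90={p90:.2f}")
print("p90 is above the average:", p90 > avg)
```

Eight of the ten values sit well below the average, so the value that 90% of the data falls at or below ends up above the average, much like the Transaction 1 figures in the question.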
Answer 2:
The median is the 50th percentile. It can never be above the 90th percentile. The average, however, can actually be higher than the 90th percentile if a small percentage of your data set is significantly larger than the rest, dragging the average for the entire data set higher.
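A small sketch of that opposite case, again with invented numbers and a hypothetical nearest-rank helper (an assumption, not LoadRunner's documented method): one very slow response out of twenty is enough to lift the average above the 90th percentile.

```python
import math
from statistics import mean, median


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical data: 19 fast responses and a single 60-second outlier.
times = [0.2] * 19 + [60.0]

avg = mean(times)
med = median(times)
p90 = percentile_nearest_rank(times, 90)

print(f"median={med:.2f}  p90={p90:.2f}  avg={avg:.2f}")
print("average is above p90:", avg > p90)
```

Here the outlier makes up only 5% of the samples, so it lies beyond the 90th percentile and does not move it, yet it still dominates the average.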
Answer 3:
Giles says: "The 90th percentile means that 90% of the values fall below this value. The value in this case would be your response time. So if you had 1000 values and the 90th percentile is n, 900 of those values would be below n, and only 100 above n -- so it makes sense that the average is less than the 90th percentile." Sorry, I fail to see how it makes sense. I would say that if you cut away the longest responses, what is left is the shorter response times, so when you calculate the average of those smaller numbers you get a smaller amount: the 90th percentile would always be less than the average, which is certainly not the case!
Isn't the 90th percentile meant to show how the site performs for 90% of the customers? So you gather the most frequently occurring results and cut off some rare extremes (on both ends) that don't happen too often? This would explain why, in LoadRunner's output, the average is almost always smaller than the 90th percentile. I think this is how it works: [Image: 90% calculation]
Answer 4:
Mean is very different from n-quantile / median / quartiles / percentiles.
It is possible to have a set of values with mean > median, or even with 90th percentile < mean. They are just not the same thing.
See this https://math.stackexchange.com/questions/382117/average-is-higher-than-percentile-90
The general assumptions people make about their data rely on the unstated assumption that the data follow a symmetric distribution with mean ~= median (like a Gaussian).
Just look at power-law / Pareto distributions to see how wrong this assumption can be. The same applies to multimodal distributions. It is crucial not to make such assumptions without proper analysis; otherwise this is just some kind of "data bullshit".
(Btw that is why mean income gives less information than median income)
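To make the Pareto remark concrete, here is a hedged sketch using the standard closed-form mean and quantile function of a Pareto (type I) distribution. The shape `alpha = 1.1` and scale `x_m = 1.0` are arbitrary values chosen for illustration, not anything taken from the question's data.

```python
# Pareto (type I) with scale x_m and shape alpha:
#   CDF:       F(x) = 1 - (x_m / x) ** alpha   for x >= x_m
#   quantile:  q(p) = x_m * (1 - p) ** (-1 / alpha)
#   mean:      alpha * x_m / (alpha - 1)        for alpha > 1

x_m = 1.0    # assumed scale (minimum possible value)
alpha = 1.1  # assumed shape; values close to 1 give a very heavy tail


def pareto_quantile(p, x_m, alpha):
    """Value below which a fraction p of the distribution lies."""
    return x_m * (1 - p) ** (-1 / alpha)


pareto_mean = alpha * x_m / (alpha - 1)
pareto_median = pareto_quantile(0.5, x_m, alpha)
pareto_p90 = pareto_quantile(0.9, x_m, alpha)

print(f"median={pareto_median:.2f}  p90={pareto_p90:.2f}  mean={pareto_mean:.2f}")
print("mean exceeds the 90th percentile:", pareto_mean > pareto_p90)
```

With a tail this heavy, the mean lies above the 90th percentile and far above the median, which is the kind of case where assuming mean ~= median goes badly wrong.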
Source: https://stackoverflow.com/questions/40874994/loadrunner-analysis-how-can-the-90th-percentile-be-higher-than-the-average