Question
The SQL below gets the average interval of the timestamp column, grouped by icao_address, flight_number, and flight_date. I'm trying to do the same for the standard deviation, and although I get a figure, it is wrong: I get back 14.06, while it should be around 1.8.
Below is what I'm using for the stddev calculation:
STDDEV_POP(UNIX_SECONDS(timestamp)) as standard_deviation
Below is my SQL:
#standardSQL
select DATE(timestamp) as flight_date, safe_divide(timestamp_diff(max(timestamp), min(timestamp),SECOND), (COUNT(DISTINCT(timestamp)) - 1))as avg_interval_message, STDDEV_POP(UNIX_SECONDS(timestamp))as standard_deviation,
icao_address, flight_number, min(timestamp) as firstrecord, max(timestamp) as lastrecord, count(timestamp) as target_updates
from `ais-data-analysis._analytics._aoi_table`
group by icao_address, flight_number, flight_date
having avg_interval_message is not null and flight_number is not null and icao_address = '4B8E41'
order by flight_date, avg_interval_message ASC
What I'm trying to get is the standard deviation of the intervals between consecutive values of the timestamp column. The group in question has 10 records.
Answer 1:
You can use STDDEV_POP(<FLOAT>) to calculate the standard deviation. The documentation describes it as follows:
Description
Returns the population (biased) standard deviation of the values. The return result is between 0 and +Inf.
This function ignores any NULL inputs. If all inputs are ignored, this function returns NULL.
If this function receives a single non-NULL input, it returns 0.
Supported Input Types
FLOAT64
Optional Clauses
The clauses are applied in the following order:
OVER: Specifies a window. See Analytic Functions. This clause is currently incompatible with all other clauses within STDDEV_POP().
DISTINCT: Each distinct value of expression is aggregated only once into the result.
Return Data Type
FLOAT64
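Note that STDDEV_POP(UNIX_SECONDS(timestamp)) measures the spread of the timestamps themselves, not of the gaps between them, which is probably why the figure in the question looks too large. A minimal sketch of one way to get the standard deviation of the intervals is to compute each gap with LAG first and then aggregate. The CTE name intervals and the column interval_seconds are hypothetical names chosen for illustration. The table and filter are taken from the question, and the query has not been run against that data.

```sql
#standardSQL
-- Sketch only: "intervals" and "interval_seconds" are illustrative names.
WITH intervals AS (
  SELECT
    icao_address,
    flight_number,
    DATE(timestamp) AS flight_date,
    -- Gap in seconds between this record and the previous one in the same group.
    -- The first record of each group has no predecessor, so its gap is NULL.
    TIMESTAMP_DIFF(
      timestamp,
      LAG(timestamp) OVER (
        PARTITION BY icao_address, flight_number, DATE(timestamp)
        ORDER BY timestamp
      ),
      SECOND
    ) AS interval_seconds
  FROM `ais-data-analysis._analytics._aoi_table`
  WHERE flight_number IS NOT NULL
    AND icao_address = '4B8E41'
)
SELECT
  flight_date,
  icao_address,
  flight_number,
  AVG(interval_seconds) AS avg_interval_message,
  -- STDDEV_POP ignores the NULL gap of each group's first record.
  STDDEV_POP(CAST(interval_seconds AS FLOAT64)) AS standard_deviation
FROM intervals
GROUP BY icao_address, flight_number, flight_date
-- Drop groups with a single record, which have no interval at all.
HAVING COUNT(interval_seconds) > 0
ORDER BY flight_date, avg_interval_message ASC
```

Unlike the original query, this sketch does not deduplicate timestamps, so repeated timestamps would contribute zero-length intervals. If that matters, the input could be reduced to distinct timestamps per group before applying LAG.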
I hope it helps
Source: https://stackoverflow.com/questions/59724358/grouping-records-and-getting-standard-deviation-intervals-for-grouped-records-in