Moving Average based on Timestamps in PostgreSQL

后端 未结 3 1390
南方客
南方客 2020-12-08 16:04

I wanted to perform moving average through timestamps. I have two columns: Temperature and timestamps (time-date) and I want to perform the moving average based on every 15

相关标签:
3条回答
  • 2020-12-08 16:23

    You can join your table with itself:

    select l1.id, avg( l2.Temperature )
    from l l1
    inner join l l2 
       on l2.id <= l1.id and
          l2.Timestamps + interval '15 minutes' > l1.Timestamps
    group by l1.id
    order by id
    ;
    

    Results:

    | ID |            AVG |
    -----------------------
    |  1 |           5.39 |
    |  2 |          5.365 |
    |  3 | 5.296666666667 |
    |  4 |         5.3575 |
    |  5 |          5.346 |
    |  6 | 5.321666666667 |
    |  7 | 5.331428571429 |
    

    Notice: Only 'hard work' is made. You should join result with original table or append new columns to query. I don't know your final query needed. Adapt this solution or ask for more help.

    0 讨论(0)
  • 2020-12-08 16:26

    Assuming you want to restart the rolling average after each 15 minute interval:

    select id, 
           temp,
           avg(temp) over (partition by group_nr order by time_read) as rolling_avg
    from (       
      select id, 
             temp,
             time_read, 
             interval_group,
             id - row_number() over (partition by interval_group order by time_read) as group_nr
      from (
        select id, 
               time_read, 
               'epoch'::timestamp + '900 seconds'::interval * (extract(epoch from time_read)::int4 / 900) as interval_group,
               temp
        from readings
      ) t1
    ) t2
    order by time_read;
    

    It is based on Depesz's solution to group by "time ranges":

    Here is an SQLFiddle example: http://sqlfiddle.com/#!1/0f3f0/2

    0 讨论(0)
  • 2020-12-08 16:49

    Here's an approach that utilises the facility to use an aggregation function as a window function. The aggregate function keeps the last 15 minutes' worth of observations in an array, along with the current running total. The state transition function shifts elements off the array that have fallen behind the 15-minute window, and pushes on the latest observation. The final function simply computes the mean temperature in the array.

    Now, as to whether this is a benefit or not... it depends. It focuses on the plgpsql-execution part of postgresql rather than database-access part, and my own experiences is that plpgsql is not fast. If you can easily do lookups back to the table to find the previous 15 minutes' rows for each observation, a self-join (as in @danihp answer) will do well. However, this approach can deal with the observations coming from some more complex source, where those lookups aren't practical. As ever, trial and compare on your own system.

    -- based on using this table definition
    create table observation(id int primary key, timestamps timestamp not null unique,
                             temperature numeric(5,2) not null);
    
    -- note that I'm reusing the table structure as a type for the state here
    create type rollavg_state as (memory observation[], total numeric(5,2));
    
    create function rollavg_func(state rollavg_state, next_in observation) returns rollavg_state immutable language plpgsql as $$
    declare
      cutoff timestamp;
      i int;
      updated_memory observation[];
    begin
      raise debug 'rollavg_func: state=%, next_in=%', state, next_in;
      cutoff := next_in.timestamps - '15 minutes'::interval;
      i := array_lower(state.memory, 1);
      raise debug 'cutoff is %', cutoff;
      while i <= array_upper(state.memory, 1) and state.memory[i].timestamps < cutoff loop
        raise debug 'shifting %', state.memory[i].timestamps;
        i := i + 1;
        state.total := state.total - state.memory[i].temperature;
      end loop;
      state.memory := array_append(state.memory[i:array_upper(state.memory, 1)], next_in);
      state.total := coalesce(state.total, 0) + next_in.temperature;
      return state;
    end
    $$;
    
    create function rollavg_output(state rollavg_state) returns float8 immutable language plpgsql as $$
    begin
      raise debug 'rollavg_output: state=% len=%', state, array_length(state.memory, 1);
      if array_length(state.memory, 1) > 0 then
        return state.total / array_length(state.memory, 1);
      else
        return null;
      end if;
    end
    $$;
    
    create aggregate rollavg(observation) (sfunc = rollavg_func, finalfunc = rollavg_output, stype = rollavg_state);
    
    -- referring to just a table name means a tuple value of the row as a whole, whose type is the table type
    -- the aggregate relies on inputs arriving in ascending timestamp order
    select rollavg(observation) over (order by timestamps) from observation;
    
    0 讨论(0)
提交回复
热议问题