Linear Interpolation in Oracle with special cases

 ̄綄美尐妖づ submitted on 2021-01-29 10:36:20

Question


I need to fill in missing values by interpolating them.

I have a dataset like the following (example):

Country    Year    Value
A          2000    1.5
A          2001    2.5
A          2002    null
A          2003    4.5
B          2000    null
B          2000    null    
B          2002    5.3
B          2003    6.3
C          2000    1
C          2001    null
C          2002    null
C          2003    4

As a result I would expect:

Country    Year    Value
A          2000    1.5
A          2001    2.5
A          2002    3.5
A          2003    4.5
B          2000    3.3
B          2000    4.3    
B          2002    5.3
B          2003    6.3
C          2000    1
C          2001    2
C          2002    3
C          2003    4

How can I fill in these values by linear interpolation? I cannot work out how to do it efficiently in Oracle.


Answer 1:


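Oracle's linear regression functions can be used for linear interpolation; REGR_SLOPE and REGR_INTERCEPT are helpful here. Both ignore rows where either argument is null, so the line is fitted to the known values only.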

The trick in your case is that it is not a linear regression between value and year. It is a linear regression between value and the row number within the country group. So we need to calculate that row number before we can calculate the interpolation.
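As a rough check of the idea, the fit can be worked by hand for two of the groups. This assumes rows are numbered 1 to 4 within each country, and that with exactly two known points the least-squares line is simply the line through them.

```latex
% Country C: known (rn, value) pairs are (1, 1) and (4, 4)
\begin{align*}
\text{slope}     &= \frac{4 - 1}{4 - 1} = 1 \\
\text{intercept} &= \bar{y} - \text{slope}\cdot\bar{x} = 2.5 - 1 \cdot 2.5 = 0 \\
\hat{y}(rn)      &= 1 \cdot rn + 0 \;\Rightarrow\; \hat{y}(2) = 2,\quad \hat{y}(3) = 3
\end{align*}

% Country B: known (rn, value) pairs are (3, 5.3) and (4, 6.3)
\begin{align*}
\text{slope}     &= \frac{6.3 - 5.3}{4 - 3} = 1 \\
\text{intercept} &= 5.8 - 1 \cdot 3.5 = 2.3 \\
\hat{y}(rn)      &= 1 \cdot rn + 2.3 \;\Rightarrow\; \hat{y}(1) = 3.3,\quad \hat{y}(2) = 4.3
\end{align*}
```

For country B the missing rows lie before the first known value, so the line is being extrapolated backwards rather than interpolated.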

with input_data (country, year, value) AS (
  SELECT 'A',          2000,    1.5  FROM DUAL UNION ALL
  SELECT 'A',          2001,    2.5  FROM DUAL UNION ALL
  SELECT 'A',          2002,    null FROM DUAL UNION ALL
  SELECT 'A',          2003,    4.5  FROM DUAL UNION ALL
  SELECT 'B',          2000,    null FROM DUAL UNION ALL
  SELECT 'B',          2000,    null FROM DUAL UNION ALL
  SELECT 'B',          2002,    5.3  FROM DUAL UNION ALL
  SELECT 'B',          2003,    6.3  FROM DUAL UNION ALL
  SELECT 'C',          2000,    1    FROM DUAL UNION ALL
  SELECT 'C',          2001,    null FROM DUAL UNION ALL
  SELECT 'C',          2002,    null FROM DUAL UNION ALL
  SELECT 'C',          2003,    4    FROM DUAL
), ordered_input as (   
  SELECT
    i.*,
    row_number() over ( partition by country order by year) rn
  FROM input_data i
)
SELECT 
  country,
  year,
  value, 
  rn * regr_slope(value, rn) over ( partition by country) +
       regr_intercept(value, rn) over ( partition by country)
    as interpolated_value
FROM ordered_input
ORDER BY country, year, rn;

Output:

+---------+------+-------+--------------------+
| COUNTRY | YEAR | VALUE | INTERPOLATED_VALUE |
+---------+------+-------+--------------------+
| A       | 2000 |   1.5 |                1.5 |
| A       | 2001 |   2.5 |                2.5 |
| A       | 2002 |       |                3.5 |
| A       | 2003 |   4.5 |                4.5 |
| B       | 2000 |       |                3.3 |
| B       | 2000 |       |                4.3 |
| B       | 2002 |   5.3 |                5.3 |
| B       | 2003 |   6.3 |                6.3 |
| C       | 2000 |     1 |                  1 |
| C       | 2001 |       |                  2 |
| C       | 2002 |       |                  3 |
| C       | 2003 |     4 |                  4 |
+---------+------+-------+--------------------+
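One caveat: a regression line passes exactly through the known points only when they are collinear within a group, as they are in the sample data. If that may not hold, a possible variant (a sketch, not part of the original answer) wraps the fitted value in COALESCE so that known values are returned unchanged and only the nulls are filled. The column alias filled_value is a name chosen here for illustration.

```sql
-- Sketch: same regression as the query above, but known values pass through unchanged.
WITH input_data (country, year, value) AS (
  SELECT 'A', 2000, 1.5  FROM DUAL UNION ALL
  SELECT 'A', 2001, 2.5  FROM DUAL UNION ALL
  SELECT 'A', 2002, null FROM DUAL UNION ALL
  SELECT 'A', 2003, 4.5  FROM DUAL UNION ALL
  SELECT 'B', 2000, null FROM DUAL UNION ALL
  SELECT 'B', 2000, null FROM DUAL UNION ALL
  SELECT 'B', 2002, 5.3  FROM DUAL UNION ALL
  SELECT 'B', 2003, 6.3  FROM DUAL UNION ALL
  SELECT 'C', 2000, 1    FROM DUAL UNION ALL
  SELECT 'C', 2001, null FROM DUAL UNION ALL
  SELECT 'C', 2002, null FROM DUAL UNION ALL
  SELECT 'C', 2003, 4    FROM DUAL
), ordered_input AS (
  -- Number the rows within each country; the regression uses rn, not year
  SELECT
    i.*,
    ROW_NUMBER() OVER (PARTITION BY country ORDER BY year) AS rn
  FROM input_data i
)
SELECT
  country,
  year,
  value,
  -- Use the known value when present; otherwise evaluate the fitted line at rn
  COALESCE(
    value,
    rn * REGR_SLOPE(value, rn) OVER (PARTITION BY country)
       + REGR_INTERCEPT(value, rn) OVER (PARTITION BY country)
  ) AS filled_value
FROM ordered_input
ORDER BY country, rn;
```

When the known points are not collinear, this fills the gaps from a single best-fit line per country rather than from the two nearest neighbours, so it is not a piecewise linear interpolation.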


Source: https://stackoverflow.com/questions/54596147/linear-interpolation-in-oracle-with-special-cases
