Weighted sum of a column vector and a derived bit vector

野趣味 2021-01-28 14:06

We have a table of bid prices and sizes of two buyers. A bid price p with size s means that the buyer is willing to buy s units of the product at price p.

1 answer
  • 2021-01-28 14:52

    Below is for BigQuery Standard SQL and is generic enough not to depend on the number of buyers or on the names of the price and size fields. The only expectation is that all prices come first, followed by all the respective sizes, as in your example. I also assume all numbers are integers (as in the example in the question), but this can be adjusted to deal with FLOATs.

    #standardSQL
    WITH t_ext AS (
      SELECT * EXCEPT(arr),
        -- first half of arr: one price per buyer
        ARRAY(SELECT CAST(val AS INT64) FROM UNNEST(arr) val WITH OFFSET WHERE OFFSET < 4) AS prices,
        -- second half of arr: the respective sizes, in the same buyer order
        ARRAY(SELECT CAST(val AS INT64) FROM UNNEST(arr) val WITH OFFSET WHERE OFFSET >= 4) AS sizes,
        -- highest bid price in the row
        (SELECT MAX(CAST(val AS INT64)) FROM UNNEST(arr) val WITH OFFSET WHERE OFFSET < 4) AS bestPrice
      FROM (
        -- serialize each row to JSON and extract every integer value, in column order
        SELECT *, REGEXP_EXTRACT_ALL(TO_JSON_STRING(t), r':(\d+)') AS arr
        FROM `project.dataset.table` t
      )
    )
    SELECT * EXCEPT(prices, sizes),
      -- pair prices with sizes by position and sum the sizes bid at the best price
      (SELECT SUM(size)
        FROM UNNEST(prices) price WITH OFFSET
        JOIN UNNEST(sizes) size WITH OFFSET
        USING(OFFSET)
        WHERE price = bestPrice
      ) AS bS
    FROM t_ext
    

    The only thing you need to change in the above query is the number of buyers, which is hard-coded as 4 in the three expressions below. Alternatively, 4 can be replaced in each of them with ARRAY_LENGTH(arr) / 2:

    WHERE OFFSET < 4
    WHERE OFFSET >= 4
    WHERE OFFSET < 4
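
    As a sketch of that substitution (not taken verbatim from the answer), the query below uses ARRAY_LENGTH(arr) / 2 in place of the hard-coded 4, so the same text should work for any number of buyers. The three-buyer dummy data is invented for illustration, and the ORDER BY OFFSET clauses are an added precaution that makes the element order of the two arrays explicit.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  -- hypothetical data: 3 buyers, prices first, then the respective sizes
  SELECT 1 pA, 2 pB, 3 pC, 1 sA, 1 sB, 4 sC UNION ALL
  SELECT 5, 5, 2, 2, 3, 1
), t_ext AS (
  SELECT * EXCEPT(arr),
    -- first half of arr holds the prices
    ARRAY(SELECT CAST(val AS INT64) FROM UNNEST(arr) val WITH OFFSET
          WHERE OFFSET < ARRAY_LENGTH(arr) / 2 ORDER BY OFFSET) AS prices,
    -- second half of arr holds the sizes
    ARRAY(SELECT CAST(val AS INT64) FROM UNNEST(arr) val WITH OFFSET
          WHERE OFFSET >= ARRAY_LENGTH(arr) / 2 ORDER BY OFFSET) AS sizes,
    -- best price is the maximum over the first half
    (SELECT MAX(CAST(val AS INT64)) FROM UNNEST(arr) val WITH OFFSET
     WHERE OFFSET < ARRAY_LENGTH(arr) / 2) AS bestPrice
  FROM (
    SELECT *, REGEXP_EXTRACT_ALL(TO_JSON_STRING(t), r':(\d+)') AS arr
    FROM `project.dataset.table` t
  )
)
SELECT * EXCEPT(prices, sizes),
  -- sum the sizes of every buyer whose price equals the best price
  (SELECT SUM(size)
    FROM UNNEST(prices) price WITH OFFSET
    JOIN UNNEST(sizes) size WITH OFFSET
    USING(OFFSET)
    WHERE price = bestPrice
  ) AS bS
FROM t_ext
```

    With this data the first row should give bestPrice 3 and bS 4, and the second row bestPrice 5 and bS 5 (2 + 3). This still relies on the same layout assumption as the original query: an even number of numeric columns, all prices before all sizes.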
    

    For example, with the dummy data below (4 buyers):

    #standardSQL
    WITH `project.dataset.table` AS (
      SELECT 1 pA, 2 pB, 3 pC, 4 pD, 1 sA, 1 sB, 1 sC, 5 sD UNION ALL
      SELECT 1, 4, 2, 4, 1, 6, 1, 5 UNION ALL
      SELECT 4, 4, 2, 1, 7, 1, 1, 1
    ), t_ext AS (
      SELECT * EXCEPT(arr), 
        ARRAY(SELECT CAST(val AS INT64) FROM UNNEST(arr) val WITH OFFSET WHERE OFFSET < 4) AS prices,
        ARRAY(SELECT CAST(val AS INT64) FROM UNNEST(arr) val WITH OFFSET WHERE OFFSET >= 4) AS sizes,
        (SELECT MAX(CAST(val AS INT64)) FROM UNNEST(arr) val WITH OFFSET WHERE OFFSET < 4) AS bestPrice
      FROM (
        SELECT *, REGEXP_EXTRACT_ALL(TO_JSON_STRING(t), r':(\d+)') AS arr
        FROM `project.dataset.table` t
      )
    )
    SELECT * EXCEPT(prices, sizes), 
      (SELECT SUM(size)
        FROM UNNEST(prices) price WITH OFFSET
        JOIN UNNEST(sizes) size WITH OFFSET
        USING(OFFSET) 
        WHERE price = bestPrice
      ) AS bS
    FROM t_ext
    

    the result is:

    Row pA  pB  pC  pD  sA  sB  sC  sD  bestPrice   bS   
    1   1   2   3   4   1   1   1   5   4           5    
    2   1   4   2   4   1   6   1   5   4           11   
    3   4   4   2   1   7   1   1   1   4           8    
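
    The answer mentions that the query can be adjusted to deal with FLOATs. One possible adjustment, sketched here under assumptions rather than taken from the answer: widen the regular expression so it also captures a decimal part, and cast to FLOAT64 instead of INT64. The two-buyer data is invented, and the pattern assumes TO_JSON_STRING prints every value in plain decimal notation (no exponent form, no negative numbers, no NULL, NaN or infinity); a value the pattern does not match would be skipped and shift the price/size split.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  -- hypothetical data: 2 buyers with fractional prices and integer sizes
  SELECT 1.5 pA, 2.5 pB, 10 sA, 20 sB UNION ALL
  SELECT 3.25, 3.25, 5, 7
), t_ext AS (
  SELECT * EXCEPT(arr),
    ARRAY(SELECT CAST(val AS FLOAT64) FROM UNNEST(arr) val WITH OFFSET
          WHERE OFFSET < ARRAY_LENGTH(arr) / 2 ORDER BY OFFSET) AS prices,
    ARRAY(SELECT CAST(val AS FLOAT64) FROM UNNEST(arr) val WITH OFFSET
          WHERE OFFSET >= ARRAY_LENGTH(arr) / 2 ORDER BY OFFSET) AS sizes,
    (SELECT MAX(CAST(val AS FLOAT64)) FROM UNNEST(arr) val WITH OFFSET
     WHERE OFFSET < ARRAY_LENGTH(arr) / 2) AS bestPrice
  FROM (
    -- digits with an optional fractional part, e.g. 10 or 3.25
    SELECT *, REGEXP_EXTRACT_ALL(TO_JSON_STRING(t), r':(\d+(?:\.\d+)?)') AS arr
    FROM `project.dataset.table` t
  )
)
SELECT * EXCEPT(prices, sizes),
  (SELECT SUM(size)
    FROM UNNEST(prices) price WITH OFFSET
    JOIN UNNEST(sizes) size WITH OFFSET
    USING(OFFSET)
    WHERE price = bestPrice
  ) AS bS
FROM t_ext
```

    Here the first row should give bestPrice 2.5 with bS 20.0, and the second bestPrice 3.25 with bS 12.0 (5 + 7). The equality test price = bestPrice stays exact because bestPrice is the maximum of the very same cast values. Note that bS comes back as FLOAT64 in this variant, since sizes are cast along with prices.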
    