Calculating acceptance_ratio with LEFT JOIN and SELF JOIN and aggregate function

Submitted by 老子叫甜甜 on 2020-01-05 07:56:14

Question


I am trying to calculate daily acceptance ratios from the 'connecting' table, which has 4 fields. Sample values:

date          action         sender_id        recipient_id 
'2017-01-05', 'request_link', 'frank', 'joe' 
'2017-01-06', 'request_link', 'sally', 'ann' 
'2017-01-07', 'request_link', 'bill', 'ted' 
'2017-01-07', 'accept_link', 'joe', 'frank' 
'2017-01-06', 'accept_link', 'ann', 'sally' 
'2017-01-06', 'accept_link', 'ted', 'bill' 

Because there are 0 accepts and 1 request on 01-05, its daily acceptance ratio should be 0/1 = 0. Similarly, the ratio for 01-06 should be 2/1, and it should be 1/1 for 01-07.
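The arithmetic above can be checked with a minimal Python sketch. It only counts requests and accepts per day and ignores the sender/recipient pairing; the function name `daily_ratios` and the use of `None` for a day without requests are assumptions made for illustration.

```python
from collections import Counter

# Sample rows from the question: (date, action, sender_id, recipient_id)
rows = [
    ("2017-01-05", "request_link", "frank", "joe"),
    ("2017-01-06", "request_link", "sally", "ann"),
    ("2017-01-07", "request_link", "bill", "ted"),
    ("2017-01-07", "accept_link", "joe", "frank"),
    ("2017-01-06", "accept_link", "ann", "sally"),
    ("2017-01-06", "accept_link", "ted", "bill"),
]

def daily_ratios(rows):
    """Return {date: accepts / requests}, ignoring sender/recipient pairing."""
    requests = Counter(d for d, action, _, _ in rows if action == "request_link")
    accepts = Counter(d for d, action, _, _ in rows if action == "accept_link")
    # None marks a day with no requests, where the ratio is undefined.
    return {
        d: (accepts[d] / requests[d] if requests[d] else None)
        for d in sorted(set(requests) | set(accepts))
    }

print(daily_ratios(rows))
# {'2017-01-05': 0.0, '2017-01-06': 2.0, '2017-01-07': 1.0}
```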

It is important, however, that each accept_link has a corresponding request_link where the sender_id of the request_link = the recipient_id of the accept_link (and vice versa). So I believe a self-join is required here to ensure that Joe accepts Frank's request, regardless of the date.
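To make the pairing rule concrete, here is a small Python sketch of the check, assuming the sample rows above. The helper name `has_matching_request` is hypothetical; it treats an accept as valid when a request exists in the opposite direction, on any date.

```python
# Sample rows from the question: (date, action, sender_id, recipient_id)
rows = [
    ("2017-01-05", "request_link", "frank", "joe"),
    ("2017-01-06", "request_link", "sally", "ann"),
    ("2017-01-07", "request_link", "bill", "ted"),
    ("2017-01-07", "accept_link", "joe", "frank"),
    ("2017-01-06", "accept_link", "ann", "sally"),
    ("2017-01-06", "accept_link", "ted", "bill"),
]

# Every request as a (sender, recipient) pair, regardless of date.
requested = {(s, r) for _, action, s, r in rows if action == "request_link"}

def has_matching_request(sender, recipient):
    """An accept from `sender` to `recipient` needs a request sent the other way."""
    return (recipient, sender) in requested

# Accepts that have no corresponding request.
unmatched = [
    row for row in rows
    if row[1] == "accept_link" and not has_matching_request(row[2], row[3])
]
print(unmatched)  # [] for the sample data: every accept has a request
```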

How can the below query be corrected so that the aggregation works correctly while retaining the required join conditions? Will the query calculate correctly as is if the two WHERE conditions are removed, or are they necessary?

SELECT f1.date, 
    SUM(CASE WHEN f2.action = 'accept_link' THEN 1 ELSE 0 END) /
    SUM(CASE WHEN f2.action = 'request_link' THEN 1 ELSE 0 END) AS acceptance_ratio
FROM connecting f1
LEFT JOIN connecting f2
ON f1.sender_id = f2.recipient_id
LEFT JOIN connecting f2
ON f1.recipient_id = f2.sender_id
WHERE f1.action = 'request_link'
AND f2.action = 'accept_link'
GROUP BY f1.date
ORDER BY f1.date ASC

Expected output should look something like:

date          acceptance_ratio
'2017-01-05'  0.0000
'2017-01-06'  2.0000
'2017-01-07'  1.0000

Thanks in advance.


Answer 1:


Once again, I don't think you need a self join to do the counting. Conditional aggregation over the entire table counts the requests and accepts that happened on each day; the LEFT JOIN in the query below only checks that each accept has a matching request:

SELECT t.date,
       CASE WHEN t.num_requests = 0
            THEN 'No requests available'
            ELSE CAST(t.num_accepts / t.num_requests AS CHAR(50))
       END AS acceptance_ratio
FROM
(
    SELECT c1.date,
           SUM(CASE WHEN c1.action = 'accept_link' AND c2.action IS NOT NULL
                    THEN 1 ELSE 0 END) AS num_accepts,
           SUM(CASE WHEN c1.action = 'request_link' THEN 1 ELSE 0 END) AS num_requests
    FROM connecting c1
    LEFT JOIN connecting c2
        ON c1.action       = 'accept_link'   AND
           c2.action       = 'request_link'  AND
           c1.sender_id    = c2.recipient_id AND
           c1.recipient_id = c2.sender_id
    GROUP BY c1.date
) t
ORDER BY t.date

Note that I use a CASE expression to handle division by zero, which could occur if a certain day has no requests. I also assume that the same invitation will not be sent out more than once; a duplicated request would match the same accept twice in the join and inflate the accept count.
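As a rough way to try this out, the sketch below loads the sample rows into an in-memory SQLite database and runs the inner aggregation, with a join condition that pairs sender and recipient in both directions (`c1.sender_id = c2.recipient_id` and `c1.recipient_id = c2.sender_id`). SQLite is an assumption here, not the asker's database: its `/` on integers truncates, so the division and the zero-request case are handled in Python instead of in a SQL CASE expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE connecting (date TEXT, action TEXT, sender_id TEXT, recipient_id TEXT)"
)
conn.executemany(
    "INSERT INTO connecting VALUES (?, ?, ?, ?)",
    [
        ("2017-01-05", "request_link", "frank", "joe"),
        ("2017-01-06", "request_link", "sally", "ann"),
        ("2017-01-07", "request_link", "bill", "ted"),
        ("2017-01-07", "accept_link", "joe", "frank"),
        ("2017-01-06", "accept_link", "ann", "sally"),
        ("2017-01-06", "accept_link", "ted", "bill"),
    ],
)

# Count an accept only when a request exists in the opposite direction.
query = """
SELECT c1.date,
       SUM(CASE WHEN c1.action = 'accept_link' AND c2.action IS NOT NULL
                THEN 1 ELSE 0 END) AS num_accepts,
       SUM(CASE WHEN c1.action = 'request_link' THEN 1 ELSE 0 END) AS num_requests
FROM connecting c1
LEFT JOIN connecting c2
    ON c1.action       = 'accept_link'   AND
       c2.action       = 'request_link'  AND
       c1.sender_id    = c2.recipient_id AND
       c1.recipient_id = c2.sender_id
GROUP BY c1.date
ORDER BY c1.date
"""

results = []
for date, num_accepts, num_requests in conn.execute(query):
    # Divide in Python to avoid SQLite's integer division; None means no requests.
    ratio = num_accepts / num_requests if num_requests else None
    results.append((date, ratio))
    print(date, ratio)
```

On the sample data this gives 0.0, 2.0 and 1.0 for the three days, matching the expected output in the question.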



Source: https://stackoverflow.com/questions/42130476/calculating-acceptance-ratio-with-left-join-and-self-join-and-aggregate-function
