Left outer join acting like inner join

Summary

My goal is to find every user who has ever been assigned to a task, and then generate some statistics over a particular date range, and associate the stats with

2 Answers

终归单人心

2021-01-23 05:49
The query can probably be simplified to:
```
SELECT u.name AS user_name
     , p.name AS project_name
     , tl.created_on::date AS changeday
     , coalesce(sum(nullif(new_value, '')::numeric), 0)
     - coalesce(sum(nullif(old_value, '')::numeric), 0) AS hours
FROM   users             u
LEFT   JOIN (
        tasks            t 
   JOIN fixins           f  ON  f.id = t.fixin_id
   JOIN projects         p  ON  p.id = f.project_id
   JOIN task_log_entries tl ON  tl.task_id = t.id
                           AND  tl.field_id = 18
                           AND (tl.created_on IS NULL OR
                                tl.created_on >= '2013-09-08' AND
                                tl.created_on <  '2013-09-09') -- upper border!
       ) ON t.assignee_id = u.id
WHERE  EXISTS (SELECT 1 FROM tasks t1 WHERE t1.assignee_id = u.id)
GROUP  BY 1, 2, 3
ORDER  BY 1, 2, 3;
```
This returns every user who has ever been assigned a task,
plus figures per project and day wherever task_log_entries has data in the specified date range.
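
To check that behaviour without a Postgres instance, here is a minimal sketch using Python's built-in sqlite3 as a stand-in. The tables are cut down to a few columns and the rows (alice, bob, carol) are made-up sample data, so treat it as an illustration of the join semantics, not of the real schema. It compares the date filter placed in WHERE with the same filter placed in the JOIN condition:

```python
import sqlite3

# SQLite (in memory) stands in for Postgres; tables are cut down to the
# columns needed to show the join behaviour. All rows are made-up sample data.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tasks (id INTEGER PRIMARY KEY, assignee_id INTEGER);
CREATE TABLE task_log_entries (task_id INTEGER, created_on TEXT, new_value TEXT);

INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO tasks VALUES (10, 1), (20, 2);           -- carol never had a task
INSERT INTO task_log_entries VALUES
    (10, '2013-09-08 10:00', '3'),                   -- alice: inside the range
    (20, '2013-08-01 09:00', '5');                   -- bob: outside the range
""")

# Range filter in WHERE: rows NULL-extended by the LEFT JOIN fail the test,
# so bob disappears and the outer join behaves like an inner join.
filter_in_where = con.execute("""
    SELECT u.name, tl.new_value
    FROM   users u
    LEFT   JOIN tasks t             ON t.assignee_id = u.id
    LEFT   JOIN task_log_entries tl ON tl.task_id = t.id
    WHERE  tl.created_on >= '2013-09-08'
    AND    tl.created_on <  '2013-09-09'
    ORDER  BY u.name
""").fetchall()

# Range filter in ON: every user survives the join; EXISTS then keeps only
# users that were ever assigned a task.
filter_in_on = con.execute("""
    SELECT u.name, tl.new_value
    FROM   users u
    LEFT   JOIN tasks t             ON t.assignee_id = u.id
    LEFT   JOIN task_log_entries tl ON tl.task_id = t.id
                                   AND tl.created_on >= '2013-09-08'
                                   AND tl.created_on <  '2013-09-09'
    WHERE  EXISTS (SELECT 1 FROM tasks t1 WHERE t1.assignee_id = u.id)
    ORDER  BY u.name
""").fetchall()

print(filter_in_where)  # [('alice', '3')]
print(filter_in_on)     # [('alice', '3'), ('bob', None)]
```

bob has a task but no log entry in the range, so he only shows up in the second variant; carol never had a task and is dropped by EXISTS in both.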

Major points
- The aggregate function sum() ignores NULL values. COALESCE() per row is not required any more as soon as you recast the calculation as the difference of two sums:
```
 ,coalesce(sum(nullif(new_value, '')::numeric), 0) -
  coalesce(sum(nullif(old_value, '')::numeric), 0) AS hours
```
  However, if it is possible that every row of a group has NULL or an empty string in these columns, both sums come out NULL, so wrap each sum in COALESCE once.
  I am using numeric instead of float: the safer alternative to minimize rounding errors.
- Your attempt to get distinct values from the join of users and tasks is futile, since you join to tasks once more further down. Flatten the whole query to make it simpler and faster.
- These positional references are just a notational convenience:
```
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3
```
  ... doing the same as in your original query.
- To get a date from a timestamp you can simply cast to date:
```
tl.created_on::date AS changeday
```
  But it's much better to test with original values in the WHERE clause or JOIN condition (if possible, and it is possible here), so Postgres can use plain indices on the column (if available):
```
 AND (tl.created_on IS NULL OR
      tl.created_on >= '2013-09-08' AND
      tl.created_on <  '2013-09-09')  -- next day as excluded upper border
```
  Note that a date literal is converted to a timestamp at 00:00 of that day in your current time zone. So pick the next day and exclude it as the upper border, or provide a more explicit timestamp literal like '2013-09-22 0:0 +2'::timestamptz. More on excluding the upper border:
  - Calculate number of concurrent events in SQL
  - Find overlapping date ranges in PostgreSQL
- For the requirement "every user who has ever been assigned to a task", add the WHERE clause:
```
WHERE EXISTS (SELECT 1 FROM tasks t1 WHERE t1.assignee_id = u.id)
```
- Most importantly: A LEFT [OUTER] JOIN preserves all rows to the left of the join. Adding a WHERE clause on the right table can void this effect. Instead, move the filter expression to the JOIN clause. More explanation here:
  - Query with LEFT JOIN not returning rows for count of 0
- Parentheses can be used to force the order in which tables are joined. Rarely needed for simple queries, but very useful in this case. I use the feature to join tasks, fixins, projects and task_log_entries before left-joining all of it to users, without a subquery.
- Table aliases make writing complex queries easier.
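
The first point above (sum() ignoring NULL, and wrapping each sum in COALESCE once) can be tried out with a small sketch. It again uses Python's sqlite3 as a stand-in for Postgres, so CAST(... AS REAL) replaces the ::numeric cast; the log table, its grp column and the rows are assumptions made up for the illustration:

```python
import sqlite3

# SQLite stands in for Postgres; CAST(... AS REAL) replaces the ::numeric cast.
# The log table and its rows are made-up sample data.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE log (grp TEXT, old_value TEXT, new_value TEXT);
INSERT INTO log VALUES
    ('a', '',   '2'),     -- empty string: nullif() turns it into NULL
    ('a', '2',  '5'),
    ('b', NULL, '');      -- group with nothing to add up
""")

rows = con.execute("""
    SELECT grp
         , sum(CAST(nullif(new_value, '') AS REAL))
         - sum(CAST(nullif(old_value, '') AS REAL))              AS bare
         , coalesce(sum(CAST(nullif(new_value, '') AS REAL)), 0)
         - coalesce(sum(CAST(nullif(old_value, '') AS REAL)), 0) AS hours
    FROM   log
    GROUP  BY grp
    ORDER  BY grp
""").fetchall()

print(rows)  # [('a', 5.0, 5.0), ('b', None, 0)]
```

Group a mixes empty strings with real numbers and still sums correctly without any per-row COALESCE. Group b has nothing but NULL and empty strings, so the bare difference is NULL and only the version with COALESCE around each sum returns 0.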

说谎

2021-01-23 06:02
It doesn't work because the first query is inner-joined with tasks. The same table is then used to perform the outer join (through a subquery, but nevertheless), but the first query (tasked users) doesn't contain the relevant records in the first place (the ones that lack a match).

Try using
```
....
FROM (
  SELECT DISTINCT
    users.id,
    users.name AS user_name
  FROM users    
) tasked_users
...
```