I need to calculate all Employees that have X number of consecutive absences within a date range in SQL.
We have an Absences Table with 1 record for each day an emp
Tested in PostgreSQL; SQL uses example values from posting.
There are no primary keys defined on the provided tables, and the code below addresses that. Better would be to add primary keys and optimize below code to leverage them: better data quality, better performance, cleaner code, happier people.
Removing the tbl prefix provides more flexibility in the database implementation. Table, view, and synonym can then be used interchangeably without impacting code that references the database object or breaking the naming convention.
/* period length as the desired number of consecutive days */
/* time window as the period to be analyzed */
SELECT DISTINCT
/* Consolidate employees with multiple periods */
a.employeeid
FROM
(SELECT
/* Generate all possible periods */
pk_c.workday begin_date,
/* End date for given period length; less one for closed boundaries */
LEAD(pk_c.workday,3-1,NULL) OVER (ORDER BY pk_c.workday) end_date
FROM (SELECT DISTINCT
/* No calendar PK, remove dupes; if PK, pull in-line view up */
c.workday
FROM sandbox.calendar c) pk_c
) p
INNER JOIN sandbox.absences a ON
/* Match absences with periods */
(a.absencedate BETWEEN p.begin_date AND p.end_date)
WHERE
/* In time window desired; exclude periods extending beyond boundaries */
(p.begin_date BETWEEN '2009-01-01' AND '2009-03-01'
AND /* NOT NULL exclusion implied for periods beyond calendar boundaries */
p.end_date BETWEEN '2009-01-01' AND '2009-03-01')
GROUP BY
a.employeeid,
/* Also group period, display only employee */
p.begin_date
HAVING
/* Number of absence days to match to the period length */
/* DISTINCT due to missing absences PK; if PK, drop DISTINCT */
COUNT(DISTINCT a.absencedate) = 3
;
Enjoy. Stripped version below:
SELECT DISTINCT
a.employeeid
FROM
(SELECT
pk_c.workday begin_date,
LEAD(pk_c.workday,3-1,NULL) OVER (ORDER BY pk_c.workday) end_date
FROM (SELECT DISTINCT c.workday FROM sandbox.calendar c) pk_c) p
INNER JOIN sandbox.absences a ON
(a.absencedate BETWEEN p.begin_date AND p.end_date)
WHERE
(p.begin_date BETWEEN '2009-01-01' AND '2009-03-01'
AND p.end_date BETWEEN '2009-01-01' AND '2009-03-01')
GROUP BY
a.employeeid, p.begin_date
HAVING
COUNT(DISTINCT a.absencedate) = 3
;
This should work for you. GROUP BY on ConsecDates to find who was absent more than X number of times.
select a.*,
(
select min(b.absenceDate) from tblAbsences b where a.employeeId = b.employeeId
and b.absenceDate >= a.absenceDate
and not exists (
select 1 from tblabsences c where c.employeeId = b.employeeId and dateadd( dd, 1, b.absenceDate) = c.absenceDate
)
) ConsecDates
from dbo.tblAbsences a
order by a.AbsenceDate asc