问题
I have a sample table (table_name: track_task) like below:
task_id stage log_date
----------------------------------------
3 1 2011-06-01 08:36:21
9 1 2011-06-03 12:35:47
3 2 2011-06-05 14:25:42
21 1 2011-06-11 13:03:34
9 2 2011-06-11 15:25:57
3 3 2011-06-12 10:16:09
21 2 2011-06-15 15:30:29
3 4 2011-06-22 15:34:33
21 3 2011-06-23 12:53:49
9 4 2011-06-25 16:25:08
The data above is automatically populated when a task stage is progressed by some action in the application code. The stages run from 1 to 4. However, due to some logic, a task may skip stage 3. But all tasks end at stage 4. Probable task paths are like so:
(1,2,3,4) / (1,2,4) - Completed tasks
(1,2,3) / (1,2) - In progress tasks
I need to query and retrieve a report that shows how long (in days) a task takes in each stage at a given time. I have come up with the following query so far:
SELECT z.task_id, a.log_date begin_date, d.log_date end_date,
DATEDIFF( b.log_date, a.log_date ) step1_days,
DATEDIFF( c.log_date, b.log_date ) step2_days,
DATEDIFF( d.log_date, c.log_date ) step3_days,
DATEDIFF( d.log_date, a.log_date ) cycle_days
FROM track_task z
LEFT JOIN track_task a ON ( z.task_id = a.task_id AND a.staging_id =1 )
LEFT JOIN track_task b ON ( z.task_id = b.task_id AND b.staging_id =2 )
LEFT JOIN track_task c ON ( z.task_id = c.task_id AND c.staging_id =3 )
LEFT JOIN track_task d ON ( z.task_id = d.task_id AND d.staging_id =4 )
GROUP BY z.oppty_id
to derive a result set like below:
task_id begin_date end_date step1_days step2_days step3_days cycle_days
-------------------------------------------------------------------------------
3 2011-06-01 2011-06-22 4 7 10 21
9 2011-06-03 2011-06-25 8 NULL NULL 22
21 2011-06-11 NULL 4 8 NULL NULL
Is this a good way to go about it or there is a better way? How can I have the NULL values reported as zero? How can I retrieve then end_date for a task that is still in progress?
回答1:
That looks about How I would do something like this, assuming there's only going to ever be one instance of a stage
used per task
(if not, you need more work, both in defining your requirements, and writing the query).
To make null
appear as 0
, use the COALESCE()
function:
SELECT z.task_id, COALESCE(DATEDIFF(b.log_date, a.log_date), 0) as step1_days
回答2:
No suggestions to improve the general approach to the problem. However, you can report NULL values as 0 (or anything else for that matter) using MySQL's COALESCE function http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html#function_coalesce.
Coalesce takes a list as input and returns the first non-null value.
For example:
SELECT z.task_id, a.log_date begin_date, d.log_date end_date,
COALESCE(DATEDIFF( b.log_date, a.log_date ),0) step1_days,
COALESCE(DATEDIFF( c.log_date, b.log_date ),0) step2_days,
...
Will return 0
if step1_days
or step2_days
is NULL
.
回答3:
About the third question, the following edit to the query resolves it.
SELECT z.task_id, a.log_date begin_date, e.log_date end_date,
COALESCE(DATEDIFF( b.log_date, a.log_date ),0) step1_days,
COALESCE(DATEDIFF( c.log_date, b.log_date ),0) step2_days,
COALESCE(DATEDIFF( d.log_date, c.log_date ),0) step3_days,
COALESCE(DATEDIFF( d.log_date, a.log_date ),0) cycle_days
FROM track_task z
LEFT JOIN track_task a ON ( z.task_id = a.task_id AND a.staging_id =1 )
LEFT JOIN track_task b ON ( z.task_id = b.task_id AND b.staging_id =2 )
LEFT JOIN track_task c ON ( z.task_id = c.task_id AND c.staging_id =3 )
LEFT JOIN track_task d ON ( z.task_id = d.task_id AND d.staging_id =4 )
LEFT JOIN track_task e ON ( z.task_id = e.task_id AND e.staging_id IN
(
SELECT max( staging_id )
FROM z_table_trial
GROUP BY oppty_id
)
)
GROUP BY z.oppty_id
来源:https://stackoverflow.com/questions/8566278/compute-dates-and-durations-in-mysql-query