SQL Server find datediff between different rows, sum

Question

I am trying to build a query that analyzes data in our time tracking system. Every time a user swipes in or out, the system writes a row recording the swipe time and whether the user went on or off site (entry or exit). In the case of user 'Joe Bloggs' there are 4 rows, which I want to pair up to calculate the total time Joe Bloggs spent on site.

The problem is that some records are not as easy to pair. In the example below, the second user has two consecutive 'on' rows, and I need a way to ignore repeated 'on' or 'off' rows.

ID  | Time                    |OnOffSite| UserName   |
------------------------------------------------------
123 | 2011-10-25 09:00:00.000 | on      | Bloggs Joe |
124 | 2011-10-25 12:00:00.000 | off     | Bloggs Joe |
125 | 2011-10-25 13:00:00.000 | on      | Bloggs Joe |
126 | 2011-10-25 17:00:00.000 | off     | Bloggs Joe |
127 | 2011-10-25 09:00:00.000 | on      | Jonesy Ian |
128 | 2011-10-25 10:00:00.000 | on      | Jonesy Ian |
129 | 2011-10-25 11:00:00.000 | off     | Jonesy Ian |
130 | 2011-10-25 12:00:00.000 | on      | Jonesy Ian |
131 | 2011-10-25 15:00:00.000 | off     | Jonesy Ian |

My system is MS SQL Server 2005. The reporting period for the query is monthly.

Can anyone suggest a solution? My data is already grouped in a table by UserName and Time, with the ID field being an identity column.

Answer 1:

-- =====================
-- sample data
-- =====================
declare @t table
(
    ID int,
    Time datetime,
    OnOffSite varchar(3),
    UserName varchar(50)
)

insert into @t values(123, '2011-10-25 09:00:00.000', 'on', 'Bloggs Joe')
insert into @t values(124, '2011-10-25 12:00:00.000', 'off', 'Bloggs Joe')
insert into @t values(125, '2011-10-25 13:00:00.000', 'on', 'Bloggs Joe')
insert into @t values(126, '2011-10-25 17:00:00.000', 'off', 'Bloggs Joe')
insert into @t values(127, '2011-10-25 09:00:00.000', 'on', 'Jonesy Ian')
insert into @t values(128, '2011-10-25 10:00:00.000', 'on', 'Jonesy Ian')
insert into @t values(129, '2011-10-25 11:00:00.000', 'off', 'Jonesy Ian')
insert into @t values(130, '2011-10-25 12:00:00.000', 'on', 'Jonesy Ian')
insert into @t values(131, '2011-10-25 15:00:00.000', 'off', 'Jonesy Ian')

-- =====================
-- solution
-- =====================
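select
    UserName, timeon, timeoff,
    -- note: DATEDIFF(hh, ...) counts hour boundaries crossed, not elapsed hours;
    -- use DATEDIFF(minute, timeon, timeoff) / 60.0 if swipes are not on the hour
    diffinhours = DATEDIFF(hh, timeon, timeoff)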
from
(
    select
        UserName,
        timeon = max(case when k = 2 and OnOffSite = 'on' then Time end),
        timeoff = max(case when k = 1 and OnOffSite = 'off' then Time end)
    from
    (
        select
            ID,
            UserName,
            OnOffSite,
            Time,
            rn = ROW_NUMBER() over(partition by username order by id)
        from
        (
            select
                ID,
                UserName,
                OnOffSite,
                Time,
                rn2 = case OnOffSite 
                -- '(..order by id)' takes earliest 'on' in the sequence of 'on's
                -- to take the latest use '(...order by id desc)'
                when 'on' then 
                    ROW_NUMBER() over(partition by UserName, OnOffSite, rn1 order by id)
                -- '(... order by id desc)' takes the latest 'off' in the sequence of 'off's
                -- to take the earliest use '(...order by id)'
                when 'off' then
                    ROW_NUMBER() over(partition by UserName, OnOffSite, rn1 order by id desc)
                end,
                rn1
            from
            (
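                select
                    *,
                    -- rn1 stays constant within each run of consecutive rows that share
                    -- the same OnOffSite value for a user, so it identifies each run
                    rn1 = ROW_NUMBER() over(partition by username order by id) +
                        ROW_NUMBER() over(partition by username, onoffsite order by id desc)
                from @t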
            ) t
        ) t
        where rn2 = 1
    ) t1
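    -- duplicate every row with k = 1 and k = 2 so that an 'on' row (rn, k = 2)
    -- and the 'off' row right after it (rn + 1, k = 1) share the same rn + k group
    cross join
    (
        select k = 1 union select k = 2
    ) t2
    group by UserName, rn + k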
) t
where timeon is not null or timeoff is not null
order by username
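
The query above returns one row per on/off pair rather than the per-user total the question asks for. Here is a minimal sketch of the final summing step. It assumes the paired rows have been stored in a hypothetical table variable @pairs, and the rows inserted below are the pairs the query yields for the sample data. It measures minutes rather than hours so that partial hours are not lost.

```sql
-- @pairs is a hypothetical holder for the paired intervals (not part of the original answer)
declare @pairs table
(
    UserName varchar(50),
    timeon datetime,
    timeoff datetime
)

insert into @pairs values('Bloggs Joe', '2011-10-25 09:00:00.000', '2011-10-25 12:00:00.000')
insert into @pairs values('Bloggs Joe', '2011-10-25 13:00:00.000', '2011-10-25 17:00:00.000')
insert into @pairs values('Jonesy Ian', '2011-10-25 09:00:00.000', '2011-10-25 11:00:00.000')
insert into @pairs values('Jonesy Ian', '2011-10-25 12:00:00.000', '2011-10-25 15:00:00.000')

select
    UserName,
    -- sum whole minutes first, then convert to fractional hours
    totalhours = sum(DATEDIFF(minute, timeon, timeoff)) / 60.0
from @pairs
where timeon is not null and timeoff is not null  -- skip swipes left without a partner
group by UserName
order by UserName
```

For the sample data this gives 7 hours for Bloggs Joe and 5 hours for Jonesy Ian. For a monthly report, a filter on timeon could be added to the where clause.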

Answer 2:

First you need to talk with the business side and decide on a set of matching rules.

After that I suggest that you add a status field to the table where you record the status of each row (matched, unmatched, deleted etc). Whenever a row is added you should try to match it to make a pair. A successful match sets the status of both rows to matched, otherwise the new row will be unmatched.
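
A rough sketch of how that could look, with everything here assumed for illustration: the @Swipes table variable, the Status values, and the matching rule itself (an 'off' swipe pairs with the latest unmatched 'on' swipe of the same user). The real rule is whatever the business side agrees on.

```sql
-- hypothetical table: the original columns plus a Status column
declare @Swipes table
(
    ID int identity(1,1) primary key,
    Time datetime,
    OnOffSite varchar(3),
    UserName varchar(50),
    Status varchar(10) not null default 'unmatched'
)

-- an earlier 'on' swipe that is still waiting for its partner
insert into @Swipes (Time, OnOffSite, UserName)
values ('2011-10-25 09:00:00.000', 'on', 'Bloggs Joe')

-- a new 'off' swipe arrives; remember its ID
declare @newId int
insert into @Swipes (Time, OnOffSite, UserName)
values ('2011-10-25 12:00:00.000', 'off', 'Bloggs Joe')
set @newId = SCOPE_IDENTITY()

-- assumed rule: pair the new 'off' with the latest unmatched 'on' of the same user
declare @matchId int
select top 1 @matchId = s.ID
from @Swipes s
join @Swipes n on n.ID = @newId
where n.OnOffSite = 'off'
  and s.UserName = n.UserName
  and s.OnOffSite = 'on'
  and s.Status = 'unmatched'
  and s.Time < n.Time
order by s.Time desc

-- a successful match marks both rows; otherwise the new row stays 'unmatched'
if @matchId is not null
    update @Swipes set Status = 'matched' where ID in (@matchId, @newId)

select ID, Time, OnOffSite, UserName, Status from @Swipes order by ID
```

With the pairs marked at insert time, the reporting query only has to join matched rows instead of untangling repeated swipes after the fact.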

Source: https://stackoverflow.com/questions/7899408/sql-server-find-datediff-between-different-rows-sum

Tags

sql

sql-server

sum

datediff