I track web visitors. I store the IP address as well as the timestamp of the visit.
ip_address time_stamp
180.2.79.3 1301654105
180.2.79.3 1301654106
180.2.
For giggles' sake, here is an UPDATE hack that accomplishes what you need. There are myriad reasons not to implement this, including but not limited to the fact that it depends on the order in which rows are updated, which is not guaranteed, so it may simply stop working some day. Anyway, assuming your table is initially ordered by ip -> timestamp, this should (usually) give you the correct answers. Again, this is for completeness; if you implement this, look up the risks beforehand.
CREATE TABLE #TestIPs
(
ip_address varchar(max),
time_stamp decimal(12,0),
cnt int
)
INSERT INTO #TestIPs (ip_address, time_stamp)
SELECT '180.2.79.3', 1301654105 UNION ALL
SELECT '180.2.79.3', 1301654106 UNION ALL
SELECT '180.2.79.3', 1301654354 UNION ALL
SELECT '180.2.79.3', 1301654356 UNION ALL
SELECT '180.2.79.3', 1301654358 UNION ALL
SELECT '180.2.79.3', 1301654366 UNION ALL
SELECT '180.2.79.3', 1301654368 UNION ALL
SELECT '180.2.79.3', 1301654422 UNION ALL
SELECT '180.2.79.4', 1301654105 UNION ALL
SELECT '180.2.79.4', 1301654106 UNION ALL
SELECT '180.2.79.4', 1301654354 UNION ALL
SELECT '180.2.79.4', 1301654356 UNION ALL
SELECT '180.2.79.4', 1301654358 UNION ALL
SELECT '180.2.79.4', 1301654366 UNION ALL
SELECT '180.2.79.4', 1301654368 UNION ALL
SELECT '180.2.79.4', 1301654422
DECLARE @count int; SET @count = 0
DECLARE @ip varchar(max); SET @ip = 'z'
DECLARE @timestamp decimal(12,0); SET @timestamp = 0;
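-- Relies on rows being updated in (ip_address, time_stamp) order, which is not guaranteed.
UPDATE #TestIPs
SET @count = cnt = CASE WHEN @ip <> ip_address THEN 1 -- new IP: restart the counter
WHEN time_stamp - @timestamp > 10 THEN @count + 1 -- gap of more than 10s: new visit
ELSE @count END, -- otherwise still the same visit
@timestamp = time_stamp,
@ip = ip_address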
SELECT ip_address, MAX(cnt) AS 'Visits' FROM #TestIPs GROUP BY ip_address
Results:
ip_address Visits
------------ -----------
180.2.79.3 3
180.2.79.4 3
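If you want to check what the UPDATE is supposed to compute without depending on its row-order behaviour, the same running-counter idea can be written as one explicit pass over sorted rows. This is only a minimal Python sketch, not part of the original answer; the function name count_visits_ordered and the gap parameter are made up for illustration.

```python
def count_visits_ordered(rows, gap=10):
    """Count visits per IP in a single pass over rows sorted by (ip, timestamp).

    A new visit starts when the IP changes or when more than `gap` seconds
    have passed since the previous row of the same IP.
    """
    visits = {}
    prev_ip, prev_ts = None, None
    for ip, ts in sorted(rows):  # explicit sort instead of trusting storage order
        if ip != prev_ip:
            visits[ip] = 1       # first row of a new IP: restart the counter
        elif ts - prev_ts > gap:
            visits[ip] += 1      # gap of more than `gap` seconds: new visit
        prev_ip, prev_ts = ip, ts
    return visits


# Same test data as the #TestIPs table above.
timestamps = [1301654105, 1301654106, 1301654354, 1301654356,
              1301654358, 1301654366, 1301654368, 1301654422]
rows = [(ip, ts) for ip in ("180.2.79.3", "180.2.79.4") for ts in timestamps]

print(count_visits_ordered(rows))  # {'180.2.79.3': 3, '180.2.79.4': 3}
```

The explicit sorted() call is the point here: the ordering the UPDATE silently assumes is stated up front.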
As usual with SQL, there are many solutions to your problem. I would use the following query, which is simple and should be "good enough":
SELECT COUNT(*) AS tracks
FROM (
SELECT ip_address
FROM tracking
GROUP BY ip_address, FLOOR(time_stamp / 10)
) AS buckets
The subquery groups the visits of a single user into 10s windows, so that all hits in the same window are counted as one visit.
Of course, it is possible to find cases in which two hits fall into different 10s windows even though the interval between them is less than 10s. Eliminating such cases would require much more complex logic, and the analytical value of that added complexity would be dubious (a 10s interval sounds like an arbitrary value anyway).
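To make the window-boundary caveat concrete, here is a small Python sketch of the same bucketing idea (integer division of the timestamp by the window size). The function name count_bucketed is hypothetical, and the timestamps are the ones used for 180.2.79.3 in the test data earlier in this thread.

```python
def count_bucketed(rows, window=10):
    """Count distinct (ip, fixed time window) pairs, like GROUP BY ip, FLOOR(ts / 10)."""
    return len({(ip, ts // window) for ip, ts in rows})


timestamps = [1301654105, 1301654106, 1301654354, 1301654356,
              1301654358, 1301654366, 1301654368, 1301654422]
rows = [("180.2.79.3", ts) for ts in timestamps]

print(count_bucketed(rows))  # 4

# 1301654358 and 1301654366 are only 8 seconds apart, but they land in
# different windows, so they are counted as two separate visits.
print(1301654358 // 10, 1301654366 // 10)  # 130165435 130165436
```

So for this one IP the fixed windows give 4, where a "gap of more than 10s" rule would give 3. Whether that difference matters depends on how seriously you take the 10s threshold.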
The simplest way to do this is to divide the timestamps by 10, and count the distinct combinations of those values and the ip_address values. That way each 10 second period is counted separately.
If you run this on your sample data it will give you 4 tracks, which I think is what you want.
Give it a try and see if it gives you the desired results on your full data set (note that COUNT(DISTINCT ...) with more than one expression is MySQL syntax):
SELECT COUNT(DISTINCT ip_address, FLOOR(time_stamp/10)) AS tracks
FROM tracking
The following logic will only count a visit as a 'unique visit' if there wasn't a preceding record from the same ip address within the preceding 10 seconds.
This means that {1,11,21,32,42,52,62,72} will count as 2 visits, with 3 and 5 tracks respectively.
It accomplishes this by first identifying the unique visits. Then it counts all visits that happened between that unique visit and the next unique visit.
WITH
unique_visits
AS
(
-- A row starts a unique visit if the same ip has no row in the preceding 10 seconds.
SELECT
ip_address, time_stamp
FROM
visitors
WHERE
NOT EXISTS (SELECT * FROM visitors AS [previous]
WHERE [previous].ip_address = visitors.ip_address
AND [previous].time_stamp >= visitors.time_stamp - 10
AND [previous].time_stamp < visitors.time_stamp)
)
SELECT
unique_visits.ip_address,
unique_visits.time_stamp,
COUNT(*) AS [total_tracks]
FROM
unique_visits
INNER JOIN
visitors
ON visitors.ip_address = unique_visits.ip_address
AND visitors.time_stamp >= unique_visits.time_stamp
-- Upper bound: the start of the next unique visit for this ip, if there is one.
AND visitors.time_stamp < ISNULL(
(SELECT MIN([next].time_stamp) FROM unique_visits AS [next]
WHERE [next].ip_address = unique_visits.ip_address
AND [next].time_stamp > unique_visits.time_stamp)
, visitors.time_stamp + 1
)
GROUP BY
unique_visits.ip_address,
unique_visits.time_stamp
You will also want either an index or a primary key on (ip_address, time_stamp).
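If you want to sanity-check the "unique visit, then count its tracks" logic outside the database, the following is a rough Python sketch of it. The function name visits_with_tracks is made up, and it assumes timestamps are distinct per IP (which a primary key on (ip_address, time_stamp) would enforce); with duplicate timestamps it would not match the SQL exactly.

```python
from collections import defaultdict


def visits_with_tracks(rows, gap=10):
    """Return (ip, visit_start, track_count) for each unique visit.

    A row starts a new visit when the same IP has no earlier row within the
    preceding `gap` seconds; later rows are counted as tracks of that visit.
    Assumes timestamps are distinct per IP.
    """
    by_ip = defaultdict(list)
    for ip, ts in rows:
        by_ip[ip].append(ts)

    result = []
    for ip, stamps in sorted(by_ip.items()):
        stamps.sort()
        prev = None
        for ts in stamps:
            if prev is None or ts - prev > gap:
                result.append([ip, ts, 1])   # nothing in [ts - gap, ts): new visit
            else:
                result[-1][2] += 1           # another track in the current visit
            prev = ts
    return [tuple(r) for r in result]


rows = [("180.2.79.3", t) for t in (1, 11, 21, 32, 42, 52, 62, 72)]
print(visits_with_tracks(rows))
# [('180.2.79.3', 1, 3), ('180.2.79.3', 32, 5)]
```

Comparing against only the immediately preceding row is enough once the rows are sorted, because if the nearest earlier row is more than 10 seconds back, every other earlier row is too.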
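Select Z.ip_address, Count(*) As VisitCount
From (
Select V.ip_address
From visitors As V
Left Join visitors As V2
On V2.ip_address = V.ip_address
And V2.time_stamp > V.time_stamp
Group By V.ip_address, V.time_stamp
-- No later entry at all (last row for this ip), or the next entry is 10s or more away
Having Min(V2.time_stamp) Is Null
Or Min(V2.time_stamp) - V.time_stamp >= 10
) As Z
Group By Z.ip_address
This counts an entry as a visit when the next entry from the same IP is 10 seconds or more away, or when there is no later entry at all.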
Make a left join against the records with the same ip and a close time, and filter out the records where there is a match:
select count(*) as visits
from (
select t.ip_address
from tracking t
left join tracking t2
on t2.ip_address = t.ip_address
and t2.time_stamp > t.time_stamp and t2.time_stamp <= t.time_stamp + 10
where t2.ip_address is null
) x
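In other words, the anti-join keeps exactly one row per visit: the last one, which has no follow-up within 10 seconds. Here is a minimal Python sketch of that rule for checking results by hand; the name count_visit_ends is hypothetical and the data is the test set from earlier in this thread.

```python
from bisect import bisect_right


def count_visit_ends(rows, gap=10):
    """Count rows that have no later row from the same IP within `gap` seconds."""
    by_ip = {}
    for ip, ts in rows:
        by_ip.setdefault(ip, []).append(ts)

    total = 0
    for stamps in by_ip.values():
        stamps.sort()
        for ts in stamps:
            i = bisect_right(stamps, ts)  # index of the first strictly later row
            if i == len(stamps) or stamps[i] > ts + gap:
                total += 1                # no match in (ts, ts + gap]: end of a visit
    return total


timestamps = [1301654105, 1301654106, 1301654354, 1301654356,
              1301654358, 1301654366, 1301654368, 1301654422]
rows = [(ip, ts) for ip in ("180.2.79.3", "180.2.79.4") for ts in timestamps]

print(count_visit_ends(rows))  # 6
```

That is 3 visits for each of the two IPs, counted together because the query has no GROUP BY on ip_address.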