I have a very large MySQL table containing data read from a number of sensors. Essentially, there's a time stamp and a value column. I'll omit the sensor id, indexes other
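I suppose switching DB engines is not an option for you. If it is, window functions would let you write something like this (MySQL 8.0 and later also have lag(), though there you would write NOT (d.value <=> d.previous_value) instead of IS DISTINCT FROM):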
SELECT d.*
FROM (
    SELECT d.*, lag(d.value) OVER (ORDER BY d.time) AS previous_value
    FROM data d
) AS d
WHERE d.value IS DISTINCT FROM d.previous_value;
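A quick way to see why the NULL-safe comparison matters: the first row has no previous value, so a plain <> would silently drop it. Here is a minimal sketch using Python's built-in sqlite3 module rather than the engine the query above targets. The sample data and the names QUERY_TEMPLATE and rows_where_value_changed are made up for illustration, SQLite's IS NOT stands in for IS DISTINCT FROM, and lag() needs SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample readings for a single sensor: (time, value).
SAMPLE = [(1, 10), (2, 10), (3, 12), (4, 12), (5, 10), (6, 10)]

# {op} is filled in with the comparison operator under test.
QUERY_TEMPLATE = """
SELECT d.time, d.value
FROM (
    SELECT d.time, d.value,
           lag(d.value) OVER (ORDER BY d.time) AS previous_value
    FROM data d
) AS d
WHERE d.value {op} d.previous_value
ORDER BY d.time
"""


def rows_where_value_changed(samples, op):
    """Load samples into an in-memory table and run the lag() query with op."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE data (time INTEGER PRIMARY KEY, value INTEGER NOT NULL)"
        )
        conn.executemany("INSERT INTO data (time, value) VALUES (?, ?)", samples)
        return conn.execute(QUERY_TEMPLATE.format(op=op)).fetchall()
    finally:
        conn.close()


# NULL-safe comparison keeps the first row (its previous value is NULL).
print(rows_where_value_changed(SAMPLE, "IS NOT"))  # [(1, 10), (3, 12), (5, 10)]
# A plain <> evaluates to NULL for the first row, so that row is filtered out.
print(rows_where_value_changed(SAMPLE, "<>"))      # [(3, 12), (5, 10)]
```

Whether you want the very first reading reported as a "change" is up to you; the point is only that the choice of operator decides it.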
If switching engines is not an option, you could try to rewrite the query like so:
select data.*
from data
left join (
    -- for each row, the time of the latest earlier row of the same sensor
    select data.measure_id,
           data.time,
           max(prev_data.time) as prev_time
    from data
    left join data as prev_data
        on prev_data.measure_id = data.measure_id
       and prev_data.time < data.time
    group by data.measure_id, data.time
) as prev_data_time
    on prev_data_time.measure_id = data.measure_id
   and prev_data_time.time = data.time
left join data as prev_data_value
    on prev_data_value.measure_id = data.measure_id
   and prev_data_value.time = prev_data_time.prev_time
where data.value <> prev_data_value.value or prev_data_value.value is null
You might try this - I'm not going to guarantee that it will perform better, but it's my usual way of correlating a row with a "previous" row:
SELECT
    * -- TODO: list columns
FROM
    data d
    left join
    data d_prev
        on
            d_prev.time < d.time -- TODO: other key columns?
    left join
    data d_inter
        on
            d_inter.time < d.time and
            d_prev.time < d_inter.time -- TODO: other key columns?
WHERE
    d_inter.time is null AND
    (d_prev.value is null OR d_prev.value <> d.value)
(I think this is right - could do with some sample data to validate it).
Basically, the idea is to join the table to itself, and for each row (in d), find candidate rows (in d_prev) for the "previous" row. Then do a further join, to try to find a row (in d_inter) that exists between the current row (in d) and the candidate row (in d_prev). If we cannot find such a row (d_inter.time is null), then that candidate was indeed the previous row.
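To check the self-join against some sample data, here is a rough sketch that runs it in an in-memory SQLite database via Python's sqlite3 module. This is not MySQL, so it only validates the logic, not the performance. The readings and the names SELF_JOIN_QUERY and changed_rows_self_join are invented for the example, and it assumes a single sensor with unique time stamps.

```python
import sqlite3

# Hypothetical sample readings for a single sensor: (time, value).
SAMPLE = [(1, 10), (2, 10), (3, 12), (4, 12), (5, 10), (6, 10)]

# d_prev: every earlier row is a candidate "previous" row.
# d_inter: any row strictly between the candidate and the current row.
# Keeping only rows where no d_inter exists leaves the true previous row.
SELF_JOIN_QUERY = """
SELECT d.time, d.value
FROM data d
LEFT JOIN data d_prev
       ON d_prev.time < d.time
LEFT JOIN data d_inter
       ON d_inter.time < d.time
      AND d_prev.time < d_inter.time
WHERE d_inter.time IS NULL
  AND (d_prev.value IS NULL OR d_prev.value <> d.value)
ORDER BY d.time
"""


def changed_rows_self_join(samples):
    """Return the rows whose value differs from the immediately preceding row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE data (time INTEGER PRIMARY KEY, value INTEGER NOT NULL)"
        )
        conn.executemany("INSERT INTO data (time, value) VALUES (?, ?)", samples)
        return conn.execute(SELF_JOIN_QUERY).fetchall()
    finally:
        conn.close()


print(changed_rows_self_join(SAMPLE))  # [(1, 10), (3, 12), (5, 10)]
```

The first reading comes back because it has no d_prev row at all, which is what the d_prev.value is null branch is for. Repeated values at times 2, 4 and 6 are dropped.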