I have a feed application, and I am trying to group consecutive results that belong together. My table looks like this:
postid | posttype | target | action |
Here's another version that works with MySQL variables and doesn't require nesting three levels deep. The inner query pre-sorts the records by postID and date and assigns each one a sequence number that increases whenever the post ID, type, or action changes. From there it's a simple GROUP BY, with no comparing of record version T to T2 to T3. What if you wanted 4 or 5 criteria? Would you rather nest even more queries, or just add 2 more @variables to the comparison test?
Your call on which is more efficient...
select
      PreQuery.postID,
      PreQuery.PostType,
      PreQuery.Target,
      PreQuery.Action,
      PreQuery.Title,
      min( PreQuery.Date ) as FirstActionDate,
      max( PreQuery.Date ) as LastActionDate,
      count(*) as ActionEntries,
      group_concat( PreQuery.content ) as Content
   from
      ( select
              t.*,
              -- keep the same sequence number while action, post ID and type
              -- are unchanged from the previous row; otherwise start a new one
              @lastSeq := if( t.action = @lastAction
                          AND t.postID = @lastPostID
                          AND t.postType = @lastPostType, @lastSeq, @lastSeq +1 ) as ActionSeq,
              -- remember this row's values for the comparison on the next row
              @lastAction := t.action,
              @lastPostID := t.postID,
              @lastPostType := t.PostType
           from
              t,
              -- initialize the variables once
              ( select @lastAction := ' ',
                       @lastPostID := 0,
                       @lastPostType := ' ',
                       @lastSeq := 0 ) sqlVars
           order by
              t.postid,
              t.date ) PreQuery
   group by
      PreQuery.postID,
      PreQuery.ActionSeq,
      PreQuery.PostType,
      PreQuery.Action
Here's my link to SQLFiddle sample
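To make the sequence-number idea easier to follow outside of MySQL, here is a minimal Python sketch of the same logic. It is not part of the query above; the sample rows and the function names `assign_sequence` and `group_consecutive` are made up for illustration, and the tuple layout (postid, posttype, action, date, content) is an assumption.

```python
from itertools import groupby

# Hypothetical sample rows: (postid, posttype, action, date, content)
rows = [
    (1, "text", "like", 1, "a"),
    (1, "text", "like", 2, "b"),
    (1, "text", "comment", 3, "c"),
    (1, "text", "like", 4, "d"),
    (2, "photo", "like", 5, "e"),
]

def assign_sequence(rows):
    """Yield (seq, row) pairs; seq increases whenever postid, posttype or action changes."""
    last_key = None
    seq = 0
    # Same ordering as the inner query: by postid, then by date.
    for row in sorted(rows, key=lambda r: (r[0], r[3])):
        key = (row[0], row[1], row[2])
        if key != last_key:
            seq += 1
            last_key = key
        yield seq, row

def group_consecutive(rows):
    """Collapse each run of rows sharing a sequence number into one summary dict."""
    result = []
    for _, pairs in groupby(assign_sequence(rows), key=lambda pair: pair[0]):
        members = [row for _, row in pairs]
        first = members[0]
        result.append({
            "postid": first[0],
            "posttype": first[1],
            "action": first[2],
            "first_date": min(r[3] for r in members),
            "last_date": max(r[3] for r in members),
            "entries": len(members),
            "content": ",".join(r[4] for r in members),
        })
    return result

groups = group_consecutive(rows)
```

With this sample data the two "like" rows on post 1 collapse into one group, the "comment" row breaks the run, and the later "like" on the same post starts a new group rather than joining the first one.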
For the title, you might want to adjust the line...
group_concat( distinct PreQuery.Title ) as Titles,
At least this will give the DISTINCT titles concatenated. It is much tougher to get only the one title associated with the max date of each group; that would mean nesting this entire query one more level to carry the max date and the other grouping elements along.
There is no primary key in your table, so for my example I used date. You should create an auto increment value and use that instead of the date in my example.
This is a solution (view on SQL Fiddle):
SELECT
    postid,
    posttype,
    target,
    action,
    COALESCE((
        -- highest date among the following rows of the same consecutive run;
        -- MAX keeps the subquery scalar when a run has three or more rows
        SELECT MAX(date)
        FROM t t2
        WHERE t2.postid = t.postid
        AND t2.posttype = t.posttype
        AND t2.action = t.action
        AND t2.date > t.date
        AND NOT EXISTS (
            -- no row with a different postid, posttype or action in between
            SELECT TRUE
            FROM t t3
            WHERE t3.date > t.date
            AND t3.date < t2.date
            AND (t3.postid != t.postid OR t3.posttype != t.posttype OR t3.action != t.action)
        )
    ), t.date) AS group_criterion,
    MAX(title),
    GROUP_CONCAT(content)
FROM t
GROUP BY 1,2,3,4,5
ORDER BY group_criterion
It basically reads:
For each row, create a group criterion and, in the end, group by it.
This criterion is the highest date among the rows that follow the current one and have the same postid, posttype and action as the current one, provided that no row with a different postid, posttype or action lies between them. If there is no such row, the criterion is the row's own date.
In other words, the group criterion is the highest occurring date in a group of consecutive entries.
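As a rough illustration of that criterion, the following Python sketch computes, for each row, the highest date of its consecutive run. It is not the SQL from this answer; the sample data and the names `run_key` and `group_criteria` are hypothetical, and it assumes every row has a unique date (which is why an auto increment column would be the safer ordering key).

```python
# Hypothetical sample rows; dates are assumed to be unique.
rows = [
    {"postid": 1, "posttype": "text", "action": "like", "date": 1},
    {"postid": 1, "posttype": "text", "action": "like", "date": 2},
    {"postid": 2, "posttype": "text", "action": "like", "date": 3},
    {"postid": 1, "posttype": "text", "action": "like", "date": 4},
    {"postid": 1, "posttype": "text", "action": "like", "date": 5},
]

def run_key(row):
    """The columns that must stay equal for rows to belong to the same run."""
    return (row["postid"], row["posttype"], row["action"])

def group_criteria(rows):
    """Return the rows ordered by date and, for each, the last date of its consecutive run."""
    ordered = sorted(rows, key=lambda r: r["date"])
    criteria = [None] * len(ordered)
    # Walk backwards: a row inherits the criterion of the next row if both
    # share the same key; otherwise it is the last row of its run.
    for i in range(len(ordered) - 1, -1, -1):
        is_last = i == len(ordered) - 1
        if not is_last and run_key(ordered[i + 1]) == run_key(ordered[i]):
            criteria[i] = criteria[i + 1]
        else:
            criteria[i] = ordered[i]["date"]
    return ordered, criteria

ordered, criteria = group_criteria(rows)
```

Rows that end up with the same criterion (and the same postid, posttype and action) form one group, which is what the GROUP BY on the fifth column relies on.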
If you use proper indexes it shouldn't be terribly slow, but if you have a lot of rows you should think about caching this information.