Note: you can find my previous question and its answer here - MySQL: Writing a complex query
I have 3 tables.
Table Words_Learned
contains
I would be tempted to have a sub query that gets all the words a person has learned and join that against itself, with the words GROUP_CONCAT together along with a count. So giving:-
Octopus, NULL, 0
Dog, "Octopus", 1
Spoon, "Octopus,Dog", 2
So the sub query would be something like:-
SELECT sub0.idwords, GROUP_CONCAT(sub1.idwords) AS excl_words, COUNT(sub1.idwords) AS older_words_cnt
FROM words_learned sub0
LEFT OUTER JOIN words_learned sub1
ON sub0.userId = sub1.userId
AND sub0.order_learned < sub1.order_learned
WHERE sub0.userId = 1
GROUP BY sub0.idwords
giving
idwords excl_words older_words_cnt
1 NULL 0
2 1 1
3 1,2 2
Then join the results of this against the other tables, checking for articles where the main idwords matches but none of the others are found.
Something like this (although not tested as no test data):-
SELECT sub_words.idwords, words_inc.idArticle
(
SELECT sub0.idwords, SUBSTRING_INDEX(GROUP_CONCAT(sub1.idwords), ',', 10) AS excl_words, COUNT(sub1.idwords) AS older_words_cnt
FROM words_learned sub0
LEFT OUTER JOIN words_learned sub1
ON sub0.userId = sub1.userId
AND sub0.order_learned < sub1.order_learned
WHERE sub0.userId = 1
GROUP BY sub0.idwords
) sub_words
INNER JOIN words words_inc
ON sub_words.idwords = words_inc.idwords
LEFT OUTER JOIN words words_exc
ON words_inc.idArticle = words_exc.idArticle
AND FIND_IN_SET(words_exc.idwords, sub_words.excl_words)
WHERE words_exc.idwords IS NULL
ORDER BY older_words_cnt
LIMIT 100
EDIT - updated to exclude articles with more than 10 words that are not already learned.
SELECT sub_words.idwords, words_inc.idArticle,
sub2.idArticle, sub2.count, sub2.content
FROM
(
SELECT sub0.idwords, GROUP_CONCAT(sub1.idwords) AS excl_words, COUNT(sub1.idwords) AS older_words_cnt
FROM words_learned sub0
LEFT OUTER JOIN words_learned sub1
ON sub0.userId = sub1.userId
AND sub0.order_learned < sub1.order_learned
WHERE sub0.userId = 1
GROUP BY sub0.idwords
) sub_words
INNER JOIN words words_inc
ON sub_words.idwords = words_inc.idwords
INNER JOIN
(
SELECT a.idArticle, a.count, a.content, SUM(IF(c.idwords_learned IS NULL, 1, 0)) AS unlearned_words_count
FROM Article a
INNER JOIN words b
ON a.idArticle = b.idArticle
LEFT OUTER JOIN words_learned c
ON b.idwords = c.idwords
AND c.userId = 1
GROUP BY a.idArticle, a.count, a.content
HAVING unlearned_words_count < 10
) sub2
ON words_inc.idArticle = sub2.idArticle
LEFT OUTER JOIN words words_exc
ON words_inc.idArticle = words_exc.idArticle
AND FIND_IN_SET(words_exc.idwords, sub_words.excl_words)
WHERE words_exc.idwords IS NULL
ORDER BY older_words_cnt
LIMIT 100
EDIT - attempt at commenting the above query:-
This just selects the columns
SELECT sub_words.idwords, words_inc.idArticle,
sub2.idArticle, sub2.count, sub2.content
FROM
This sub query gets each of the words learnt, along with a comma separated list of the words with a larger order_learned. This is for a particular user id
(
SELECT sub0.idwords, GROUP_CONCAT(sub1.idwords) AS excl_words, COUNT(sub1.idwords) AS older_words_cnt
FROM words_learned sub0
LEFT OUTER JOIN words_learned sub1
ON sub0.userId = sub1.userId
AND sub0.order_learned < sub1.order_learned
WHERE sub0.userId = 1
GROUP BY sub0.idwords
) sub_words
This is just to get the articles the words (ie, the words learned from the above sub query) are used in
INNER JOIN words words_inc
ON sub_words.idwords = words_inc.idwords
This sub query gets the articles which have a less than 10 words in them that are not yet learnt by the particular user.
INNER JOIN
(
SELECT a.idArticle, a.count, a.content, SUM(IF(c.idwords_learned IS NULL, 1, 0)) AS unlearned_words_count
FROM Article a
INNER JOIN words b
ON a.idArticle = b.idArticle
LEFT OUTER JOIN words_learned c
ON b.idwords = c.idwords
AND c.userId = 1
GROUP BY a.idArticle, a.count, a.content
HAVING unlearned_words_count < 10
) sub2
ON words_inc.idArticle = sub2.idArticle
This join is to find articles that have words in the comma separted list from the 1st sub query (ie words with a larger order_learned). This is done as a LEFT OUTER JOIN as I want to exclude any words that are found (this is done in the WHERE clause by checking for NULL)
LEFT OUTER JOIN words words_exc
ON words_inc.idArticle = words_exc.idArticle
AND FIND_IN_SET(words_exc.idwords, sub_words.excl_words)
WHERE words_exc.idwords IS NULL
ORDER BY older_words_cnt
LIMIT 100