问题
I have a MySQL 5.6 database and fail to make a complex query, which (i think) involve multiple joins.
The first table (T1) is Hobby_ideas_articles
and records looks like this:
hobby_idea_article_id= 1,
hobby_id = 6
url= 'http://exo.example.com',
author = 'john@example.com'
hobby_idea_article_id= 2,
hobby_id = 3
url= 'http://exo.example2.com',
author = 'john@example.com'
hobby_idea_article_id= 3,
hobby_id = 6
url= 'http://exo.example3.com',
author = 'eric@example.com'
hobby_idea_article_id= 4,
hobby_id = 7
url= 'http://exo.example4.com',
author = 'john@example.com'
hobby_idea_article_id= 5,
hobby_id = 6
url= 'http://exo.example5.com',
author = 'nestor@example.com'
hobby_idea_article_id= 7,
hobby_id = 6
url= 'http://exo.example6.com',
author = 'ginger@example.com'
hobby_idea_article_id= 8,
hobby_id = 6
url= 'http://exo.example8.com',
author = 'derek@example.com'
hobby_idea_article_id= 9,
hobby_id = 6
url= 'http://exo.example9.com',
author = 'derek@example.com'
The second table (T2) called Past_Customer_sent_messages
lists ALL messages from the past, notably with the information of who sent it (customer_id
-> pointing to another table called Customers
which is not needed in this current question) and to whom (recipient
), and records look like this
past_customer_sent_message_id = 5
hobby_id = 7,
customer_id = 4,
recipient = "john@example.com",
sent_at= "2019-09-10 00:00:00"
past_customer_sent_message_id = 6
hobby_id = 999,
customer_id = 4,
recipient = "eric@example.com",
sent_at= "2019-09-18 00:00:00"
past_customer_sent_message_id = 7
hobby_id = 999,
customer_id = 4,
recipient = "nestor@example.com",
sent_at= "2019-07-18 00:00:00"
Even if there is currently no foreign key between T1 and T2, one critical information is that
the T1 'author'
is the 'recipient"
that will appear on T2 if a customer sends a message to him.
I'm not sure if it matters for the present question, but there is a Customer
table (a past_customer_sent_messages
belongs to a customer
, a customer
has multiple past_customer_sent_messages
) and a Hobbies
table (a Hobby_ideas_article
belongs to a hobby
but a hobby
can have multiple Hobby_ideas_articles
).
In my javascript file, before writing the SQL statement, I have 2 inputs with already given values I'll need to use at some point in the SQL statement:
hobby_id = HOBBY_ID_INPUT_I_HAVE , for example
6
customer_id = CUSTOMER_ID_INPUT_I_HAVE, for example
4
What I want is to retrieve ALL the records from the Hobby_ideas_articles
table (T1) where hobby_id = HOBBY_ID_INPUT_I_HAVE
with 2 conditions/restrictions:
condition1: the recipient (ex:
'john@example.com
'), no matter WHAT (i.e no matter thehobby_id-S = plural
it's connected to via'author'
column on T1, and no matter thecustomer_id-S = plural
it's connected to via'recipient'
on T2), has never been sent an message (sent_at
column on T2) in the past 3 hours. Again, this applies across all past messages to this'recipient'
(very important it means not only on the T1 'authors' that would matchhobby_id = HOBBY_ID_INPUT_I_HAVE = 6
).condition2: the author has not been messaged in the past 30 days by the customer who has
customer_id = CUSTOMER_ID_INPUT_I_HAVE
. Here it applies to this specific customer (but across any hobby!).
A part I find particularly challenging intellectually as I would be tempted to put as a WHERE hobby_id = HOBBY_ID_INPUT_I_HAVE
(see below) but it feels wrong because then, how to do I implement the condition1 which applies to ALL the messages, and should not be applied to only those with hobby_id = HOBBY_ID_INPUT_I_HAVE
...hard! Same kind of difficulty for condition2
Maybe I could/should add a new column called hobby_idea_id
that I would put on T2 which would be a foreign key between the 2 tables as a Past_Customer_sent_message
belongs in a way to a Hobby_ideas_article and a Hobby_ideas_article has many Past_Customer_sent_messages ?
Today I have this but I don't even know on what to put the 'ON' clause of the join...and can't set the correct statement.
SELECT
hia.query_result_id,
hia.url,
hia.title,
hia.article_publication_day,
hia.media_name,
hia.author_full_name,
hia.author_first_name,
hia.author_last_name,
hia.author_email,
ces.sent_at
FROM
Hobby_ideas_articles hia
LEFT JOIN
Past_Customer_sent_messages ces
ON
# WHAT CLAUSE ???
WHERE
hia.hobby_id = HOBBY_ID_INPUT_I_HAVE AND
hia.author IS NOT NULL
# more conditions but how to assert them ??
How to achieve this ?
The desired output would be:
hobby_idea_article_id= 5,
hobby_id = 6
url= 'http://exo.example5.com',
author = 'nestor@example.com'
last_email_sent_at = "2019-07-18 00:00:00"
hobby_idea_article_id= 7,
hobby_id = 6
url= 'http://exo.example6.com',
author = 'ginger@example.com'
last_email_sent_at = null
hobby_idea_article_id= 8,
hobby_id = 6
url= 'http://exo.example8.com',
author = 'derek@example.com'
last_email_sent_at = null
As you can see with my inputs values given as examples (hobby_id = 6 , customer_id = 4),
'john@example.com' and any hobby_ideas related to him are not part of the output because john has been messaged by customer_id = 4 in the past 30 days already
'Eric' and the hobby_ideas related to him could have been part of the output because Eric was messaged by customer_id=6 8 days ago (and not customer_id=4 which is the value we target), BUT unfortunately it was excluded/filtered out because of condition1 : he was messaged at some point 2 hours ago by another customer, so he's out too.
nestor and the hobby_idea related to him are inside the output because he's unaffected by both because he indeed is related to hobby_id= 6, but his only mail was sent a few months ago.
ginger and the hobby_idea related to her are inside the output because condition1 and2 are ok given that ginger never emailed anybody: ther's no message at all where she is the recipient
Also note that if an author would be in the output but has multiple hobby_ideas related to him, then i only want one, that's why derek only appear once with let's say the FIRST hobby_idea but the second one he's one is removed from the output
Disclaimer: I come from a Rails background where I used to have ORM (Active Record) making all the chains/joins/ easier more automagic and am a bit lost here with the present SQL statement I must write inside my javascript aws lambda file.
回答1:
Use recipient NOT IN (SELECT ...)
and author NOT IN (SELECT ...)
to exclude the participants that match your conditions 1 and 2.
SELECT
hia.query_result_id,
hia.url,
hia.title,
hia.article_publication_day,
hia.media_name,
hia.author_full_name,
hia.author_first_name,
hia.author_last_name,
hia.author_email,
ces.sent_at
FROM
Hobby_ideas_articles hia
LEFT JOIN
Past_Customer_sent_messages ces
ON
hia.author = ces.recipient
AND ces.recipient NOT IN (
SELECT recipient
FROM Past_Customer_sent_messages
WHERE sent_at > DATE_SUB(NOW(), INTERVAL 3 HOUR)
)
WHERE
hia.hobby_id = HOBBY_ID_INPUT_I_HAVE AND
hia.author IS NOT NULL
AND hia.author NOT IN (
SELECT recipient
FROM Past_Customer_sent_messages
WHERE customer_id = CUSTOMER_ID_INPUT_I_HAVE
AND sent_at > DATE_SUB(NOW(), INTERVAL 30 DAY)
)
The condition on ces
has to be in the ON
clause so that you'll get sent_at = NULL
for articles that don't have any matching rows.
来源:https://stackoverflow.com/questions/58001908/mysql-select-data-with-a-join-but-with-where-clauses-applying-to-complex-and-d