Question
Thanks for your help; I'm stuck on this problem.
Let me explain: I have this kind of table:
| domain | creationdate | value 1 | value 2 |
|--------|---------------------|---------|---------|
| abc | 2013-05-28 15:35:01 | value 1 | value 2 |
| abc | 2013-04-30 12:10:10 | value 1 | value 2 |
| aaa | 2011-04-02 13:10:10 | value 1 | value 2 |
| bbb | 2012-02-12 10:48:10 | value 1 | value 2 |
| bbb | 2013-04-15 07:15:23 | value 1 | value 2 |
And I want to select (with subqueries) this:
| domain | creationdate | value 1 | value 2 |
|--------|---------------------|---------|---------|
| abc | 2013-04-30 12:10:10 | value 1 | value 2 |
| aaa | 2011-04-02 13:10:10 | value 1 | value 2 |
| bbb | 2012-02-12 10:48:10 | value 1 | value 2 |
I tried a combination of subqueries with IN/NOT IN in the WHERE clause and GROUP BY/HAVING, but I can't get a proper result.
I also have another question: if someone has already faced this kind of problem, I would be glad to hear how they solved it.
The records in the first table above are frequently (every ten minutes) deleted and re-inserted. My aim is to make a copy (or maybe a view) of the result, without the duplicate entries, which will be used 24/7 by a Postfix mail server. I have heard that big views (with many subqueries) hurt performance, which would make a table the preferable option. The problem is that if I have to build a new table every ten minutes, there will be a short downtime during which Postfix cannot read the table.
Thanks in advance for your advice.
EDIT:
Based on @Ed Gibbs' answer, here is a better sample:
Source table:
| domain | creationdate | value 1 | value 2 |
|------------|---------------------|---------|---------|
| google.com | 2013-05-28 15:35:01 | john | mary |
| google.com | 2013-04-30 12:10:10 | patrick | edward |
| yahoo.fr | 2011-04-02 13:10:10 | britney | garry |
| ebay.com | 2012-02-12 10:48:10 | harry | mickael |
| ebay.com | 2013-04-15 07:15:23 | bill | alice |
With your query, the result is identical to the source table.
Desired result:
| domain | value 1 | value 2 |
|------------|---------|---------|
| google.com | patrick | edward |
| yahoo.fr | britney | garry |
| ebay.com | harry | mickael |
I want to keep the oldest row for each domain (the one with the minimum creation date), along with its own value 1 and value 2.
New question!
I made a view of the desired result based on your answer.
The result looks like this:
| domain | value 1 | foreign_key |
|------------|---------|-------------|
| google.com | patrick | X |
| yahoo.fr | britney | Y |
| ebay.com | harry | Z |
I also have a table with entries like these:
| email | value 1 | foreign_key |
|--------------------|---------|-------------|
| john@google.com | patrick | X |
| john@google.com | britney | Y |
| harry@google.com | mary | X |
| mickael@google.com | jack | X |
| david@ebay.com | walter | Z |
| alice@yahoo.com | brian | Y |
Assume that, in this sample, %@google.com emails with foreign key Y are not valid records: only %@google.com emails with foreign key X are valid, because X is the key of the google.com row I selected by creation date. How can I select only the emails whose domain and foreign key are referenced in my new view?
Desired result:
| email | value 1 | foreign_key |
|--------------------|---------|-------------|
| john@google.com | patrick | X |
| harry@google.com | mary | X |
| mickael@google.com | jack | X |
| david@ebay.com | walter | Z |
| alice@yahoo.com | brian | Y |
I tried a join on CONCAT('%','@',domain) and foreign_key=foreign_key, but it doesn't give me what I want.
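One possible way to express this filter, as a rough sketch only: drop an email when its domain appears in the view under a different foreign key. The table name emails, the view name domain_view, and the column name value1 are hypothetical, and CONCAT is assumed to be available as in the attempt above. With the sample rows as written (including alice@yahoo.com, whose domain is not in the view), this reproduces the desired result table.

```sql
-- Hypothetical names: emails (the email table), domain_view (the view
-- holding one row per domain with its chosen foreign_key).
SELECT e.email, e.value1, e.foreign_key
FROM emails AS e
WHERE NOT EXISTS (
    -- A view row for the same domain but a different foreign key
    -- marks the email as a bad record.
    SELECT 1
    FROM domain_view AS v
    WHERE e.email LIKE CONCAT('%@', v.domain)
      AND v.foreign_key <> e.foreign_key
);
```

If only emails whose domain and foreign key both appear in the view should be kept, the condition would instead be an EXISTS with v.foreign_key = e.foreign_key; on the sample above that variant would also drop alice@yahoo.com, because the view lists yahoo.fr. Note that LIKE treats _ and % inside a domain as wildcards.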
Answer 1:
Based on your sample data and results, a GROUP BY will give you the results you're after:
SELECT
    domain,
    MIN(creationdate) AS creationdate,
    value1,
    value2
FROM mytable
GROUP BY domain, value1, value2
Addendum: @Arka provided updated sample data where the value 1 and value 2 columns have different values (in the original they were the same). That changes the query to this:
SELECT domain, creationdate, value1, value2
FROM mytable
WHERE (domain, creationdate) IN (
    SELECT domain, MIN(creationdate)
    FROM mytable
    GROUP BY domain)
The subquery gets a list of the earliest creationdate for each domain, and the outer query only selects rows where the domain and creationdate match the subquery values.
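To check this against the updated sample data, here is a minimal sketch. It assumes the table is named mytable with columns value1 and value2, as in the queries above; the column types are guesses. It also assumes a database that accepts row-value comparisons of the form (a, b) IN (...), which MySQL, PostgreSQL, and recent SQLite do, while SQL Server does not. The ORDER BY is added only to make the output order predictable.

```sql
-- Hypothetical schema; the column types are assumptions.
CREATE TABLE mytable (
    domain       VARCHAR(255) NOT NULL,
    creationdate TIMESTAMP    NOT NULL,
    value1       VARCHAR(255),
    value2       VARCHAR(255)
);

-- The updated sample rows from the question.
INSERT INTO mytable (domain, creationdate, value1, value2) VALUES
    ('google.com', '2013-05-28 15:35:01', 'john',    'mary'),
    ('google.com', '2013-04-30 12:10:10', 'patrick', 'edward'),
    ('yahoo.fr',   '2011-04-02 13:10:10', 'britney', 'garry'),
    ('ebay.com',   '2012-02-12 10:48:10', 'harry',   'mickael'),
    ('ebay.com',   '2013-04-15 07:15:23', 'bill',    'alice');

-- Keep, for each domain, the row carrying its earliest creationdate.
SELECT domain, value1, value2
FROM mytable
WHERE (domain, creationdate) IN (
    SELECT domain, MIN(creationdate)
    FROM mytable
    GROUP BY domain)
ORDER BY domain;
-- ebay.com   | harry   | mickael
-- google.com | patrick | edward
-- yahoo.fr   | britney | garry
```

If two rows of the same domain share the same earliest creationdate, both are returned, so an extra tie-breaker would be needed in that case.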
Source: https://stackoverflow.com/questions/16799799/select-duplicate-and-keep-the-oldest-not-based-on-id