MySQL: getting results from two identical tables

Question

I have two identical tables. I want to compare these two tables and build a result from the comparison. The conditions are:

Each group of records in TABLE1 (grouped by TID) is compared with every group of records in TABLE2 (also grouped by TID).
If a TABLE1 group is found in the TABLE2 groups as many as N times (N is a user-supplied value), the records of that group are inserted into a new table.
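One possible reading of these conditions can be sketched as follows. This is only a sketch under assumptions that are not stated in the question: a TABLE1 group counts as "found" in a TABLE2 group when every item of the TABLE1 group also appears under that TABLE2 TID, "as many as N" is taken to mean at least N, the new table is called table3, and an in-memory SQLite database (driven from Python) stands in for MySQL. The sample rows are made up.

```python
import sqlite3


def build_itemset_demo_db():
    """Create an in-memory database with small, made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (tid INTEGER, item TEXT);
        CREATE TABLE table2 (tid INTEGER, item TEXT);
        CREATE TABLE table3 (tid INTEGER, item TEXT);  -- assumed name for the new table
        INSERT INTO table1 VALUES (1,'A'), (1,'B'), (2,'C'), (2,'F'), (2,'A');
        INSERT INTO table2 VALUES
            (10,'C'), (10,'F'), (10,'A'), (10,'X'),
            (11,'A'), (11,'C'), (11,'F'),
            (12,'F'), (12,'A'), (12,'C'),
            (13,'A'), (13,'B'),
            (14,'C'), (14,'F');
    """)
    return conn


def copy_groups_found_n_times(conn, n):
    """Copy every table1 group that is fully contained in at least n table2 groups."""
    conn.execute("""
        INSERT INTO table3 (tid, item)
        SELECT t1.tid, t1.item
        FROM table1 AS t1
        WHERE t1.tid IN (
            SELECT tid1
            FROM (
                -- one row per (table1 group, table2 group) pair in which
                -- all items of the table1 group were matched
                SELECT a.tid AS tid1, b.tid AS tid2
                FROM table1 AS a
                JOIN table2 AS b ON b.item = a.item
                JOIN (SELECT tid, COUNT(DISTINCT item) AS n_items
                      FROM table1 GROUP BY tid) AS s ON s.tid = a.tid
                GROUP BY a.tid, b.tid, s.n_items
                HAVING COUNT(DISTINCT a.item) = s.n_items
            ) AS matches
            GROUP BY tid1
            HAVING COUNT(*) >= ?   -- assumption: "as many as N" means at least N
        )
    """, (n,))
    return conn.execute(
        "SELECT tid, item FROM table3 ORDER BY tid, item"
    ).fetchall()


if __name__ == "__main__":
    print(copy_groups_found_n_times(build_itemset_demo_db(), 3))
```

With the sample data, the items C, F, A under TID 2 appear together under three TIDs of table2, so only that group is copied when N is 3. The comparison is done by the database in one statement instead of nested loops over arrays.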

For example, as in the screenshot below, the ITEMs C-F-A grouped under TID 2 have 3 occurrences in table2, so they will be inserted into the new table:

I've already written code for this in VB.NET and it works, but the program takes a ridiculous amount of time to complete. The main cause is that I'm processing a huge database.

My approach in the program is to load the two tables into 2D arrays, then assign values to an array while comparing pairs of elements with an if clause.

Below is the 2D array that I've created:

But this method is really expensive. In my real database (pictured above), the first 2D array has 2k records and the second has 800 records, and when I tried to estimate the time needed for the run to complete, it showed a fantastic number: about 16 hours. Gosh!

So I was wondering whether this problem can be solved with a MySQL query, or with some other method that is more efficient than what I have done?

Answer 1:

    INSERT INTO tbl3
      SELECT tbl1.TID, tbl1.ITEM
      FROM tbl1
        JOIN tbl2 ON tbl2.TID = tbl1.TID AND tbl2.ITEM = tbl1.ITEM;

This will insert a record into tbl3 for each record in tbl1 that has a corresponding record in tbl2 identified by TID and ITEM.

This assumes that the TID/ITEM pair is covered by a unique index in both tbl1 and tbl2; otherwise duplicate pairs would multiply the inserted rows.
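To see what this join does on concrete rows, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for MySQL, and the sample rows and the function name run_join_demo are made up for illustration. Note that the statement pairs rows that share the same TID and ITEM in both tables; it does not apply the N threshold from the question.

```python
import sqlite3


def run_join_demo():
    """Run the INSERT ... SELECT join on made-up data and return tbl3's rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- UNIQUE (TID, ITEM) mirrors the uniqueness assumption of this answer
        CREATE TABLE tbl1 (TID INTEGER, ITEM TEXT, UNIQUE (TID, ITEM));
        CREATE TABLE tbl2 (TID INTEGER, ITEM TEXT, UNIQUE (TID, ITEM));
        CREATE TABLE tbl3 (TID INTEGER, ITEM TEXT);
        INSERT INTO tbl1 VALUES (1,'A'), (1,'B'), (2,'C');
        INSERT INTO tbl2 VALUES (1,'A'), (2,'C'), (2,'D'), (3,'B');
    """)
    conn.execute("""
        INSERT INTO tbl3 (TID, ITEM)
        SELECT tbl1.TID, tbl1.ITEM
        FROM tbl1
        JOIN tbl2 ON tbl2.TID = tbl1.TID AND tbl2.ITEM = tbl1.ITEM
    """)
    return conn.execute(
        "SELECT TID, ITEM FROM tbl3 ORDER BY TID, ITEM"
    ).fetchall()


if __name__ == "__main__":
    print(run_join_demo())
```

In this sample, (1, 'B') is not copied because tbl2 only has 'B' under TID 3, so the pair does not match on both columns.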

Answer 2:

Ok, here's a wild, untested, guess (WUG).

The approach goes like this:

You need a list of TIDs from table1, so you build a distinct list (the innermost query).
You use that list in a WHERE clause when selecting from table2, so that you only get rows whose TIDs appear in table1. You group that query by TID and use HAVING to limit the result to only those TIDs with a count > X.
Now you have a list of TIDs that match those in table1 and have more than X entries in table2. You select those rows from table2.
Those rows are used as the source of an INSERT statement into the new table (MySQL does not allow inserting into table1 while a subquery of the same statement reads from it).

The SQL might look something like:

    -- table3 stands in for the new table; 10 stands in for X
    insert into table3
      select * from table2 where tid in
        (select tid
            from table2
            where tid in (select distinct tid from table1)
            group by tid
            having count(*) > 10);

I doubt the syntax is correct (I can't remember the exact syntax for an insert from a select), and I make no claim it will work off the bat, but it's what my first shot would be if I wanted to do it all in one query.
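The steps described in this answer can be tried out with a small sketch in Python. The assumptions here are mine, not the answer's: an in-memory SQLite database stands in for MySQL, the target is a separate table named table3, the threshold X is passed as a bound parameter, and the sample rows are made up.

```python
import sqlite3


def copy_rows_for_frequent_tids(threshold):
    """Copy table2 rows whose TID is in table1 and occurs more than `threshold` times."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (tid INTEGER, item TEXT);
        CREATE TABLE table2 (tid INTEGER, item TEXT);
        CREATE TABLE table3 (tid INTEGER, item TEXT);  -- assumed target table
        INSERT INTO table1 VALUES (1,'A'), (2,'B');
        INSERT INTO table2 VALUES
            (1,'A'), (1,'B'), (1,'C'),   -- TID 1: three rows
            (2,'B'),                     -- TID 2: one row
            (3,'X'), (3,'Y'), (3,'Z');   -- TID 3: not present in table1
    """)
    conn.execute("""
        INSERT INTO table3 (tid, item)
        SELECT tid, item
        FROM table2
        WHERE tid IN (
            SELECT tid
            FROM table2
            WHERE tid IN (SELECT DISTINCT tid FROM table1)
            GROUP BY tid
            HAVING COUNT(*) > ?   -- the threshold X from the description
        )
    """, (threshold,))
    return conn.execute(
        "SELECT tid, item FROM table3 ORDER BY tid, item"
    ).fetchall()


if __name__ == "__main__":
    print(copy_rows_for_frequent_tids(2))
```

With a threshold of 2, only the rows of TID 1 are copied: TID 2 has too few rows in table2, and TID 3 does not occur in table1. This counts rows per TID only; it does not check which items the two tables have in common.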

Source: https://stackoverflow.com/questions/21316034/mysql-getting-result-from-two-identical-table

Tags

mysql

performance

compare