I have 2 tables, say table A and table B and I want to perform a join, but the matching condition has to be where a column from A \'is like\' a column from B meaning that an
Using conditional criteria in a join is definitely different than the Where clause. The cardinality between the tables can create differences between Joins and Where clauses.
For example, using a Like condition in an Outer Join will keep all records in the first table listed in the join. Using the same condition in the Where clause will implicitly change the join to an Inner join. The record has to generally be present in both tables to accomplish the conditional comparison in the Where clause.
I generally use the style given in one of the prior answers.
tbl_A as ta
LEFT OUTER JOIN tbl_B AS tb
ON ta.[Desc] LIKE '%' + tb.[Desc] + '%'
This way I can control the join type.
If this is something you'll need to do often...then you may want to denormalize the relationship between tables A and B.
For example, on insert to table B, you could write zero or more entries to a juncion table mapping B to A based on partial mapping. Similarly, changes to either table could update this association.
This all depends on how frequently tables A and B are modified. If they are fairly static, then taking a hit on INSERT is less painful then repeated hits on SELECT.
When writing queries with our server LIKE or INSTR (or CHARINDEX in T-SQL) takes too long, so we use LEFT like in the following structure:
select *
from little
left join big
on left( big.key, len(little.key) ) = little.key
I understand that might only work with varying endings to the query, unlike other suggestions with '%' + b + '%', but is enough and much faster if you only need b+'%'.
Another way to optimize it for speed (but not memory) is to create a column in "little" that is "len(little.key)" as "lenkey" and user that instead in the query above.
Using INSTR:
SELECT *
FROM TABLE a
JOIN TABLE b ON INSTR(b.column, a.column) > 0
Using LIKE:
SELECT *
FROM TABLE a
JOIN TABLE b ON b.column LIKE '%'+ a.column +'%'
Using LIKE, with CONCAT:
SELECT *
FROM TABLE a
JOIN TABLE b ON b.column LIKE CONCAT('%', a.column ,'%')
Mind that in all options, you'll probably want to drive the column values to uppercase BEFORE comparing to ensure you are getting matches without concern for case sensitivity:
SELECT *
FROM (SELECT UPPER(a.column) 'ua'
TABLE a) a
JOIN (SELECT UPPER(b.column) 'ub'
TABLE b) b ON INSTR(b.ub, a.ua) > 0
The most efficient will depend ultimately on the EXPLAIN plan output.
JOIN
clauses are identical to writing WHERE
clauses. The JOIN
syntax is also referred to as ANSI JOINs because they were standardized. Non-ANSI JOINs look like:
SELECT *
FROM TABLE a,
TABLE b
WHERE INSTR(b.column, a.column) > 0
I'm not going to bother with a Non-ANSI LEFT JOIN example. The benefit of the ANSI JOIN syntax is that it separates what is joining tables together from what is actually happening in the WHERE
clause.
In MySQL you could try:
SELECT * FROM A INNER JOIN B ON B.MYCOL LIKE CONCAT('%', A.MYCOL, '%');
Of course this would be a massively inefficient query because it would do a full table scan.
Update: Here's a proof
create table A (MYCOL varchar(255));
create table B (MYCOL varchar(255));
insert into A (MYCOL) values ('foo'), ('bar'), ('baz');
insert into B (MYCOL) values ('fooblah'), ('somethingfooblah'), ('foo');
insert into B (MYCOL) values ('barblah'), ('somethingbarblah'), ('bar');
SELECT * FROM A INNER JOIN B ON B.MYCOL LIKE CONCAT('%', A.MYCOL, '%');
+-------+------------------+
| MYCOL | MYCOL |
+-------+------------------+
| foo | fooblah |
| foo | somethingfooblah |
| foo | foo |
| bar | barblah |
| bar | somethingbarblah |
| bar | bar |
+-------+------------------+
6 rows in set (0.38 sec)