问题
In essence I want to pick the best match of a prefix from the "Rate" table based on the TelephoneNumber field in the "Call" table. Given the example data below, '0123456789' would best match the prefix '012' whilst '0100000000' would best match the prefix '01'.
I've included some DML with some more examples of correct matches in the SQL comments.
There will be circa 70,000 rows in the rate table and the call table will have around 20 million rows. But there will be a restriction on the Select from the Call table based on a dateTime column so actually the query will only need to run over 0.5 million call rows.
The prefix in the Rate table can be up to 16 characters long.
I have no idea how to approach this in SQL, I'm currently thinking of writing a C# SQLCLR function to do it. Has anyone done anything similar? I'd appreciate any advice you have.
Example Data
Call table:
Id TelephoneNumber
1 0123456789
2 0100000000
3 0200000000
4 0780000000
5 0784000000
6 0987654321
Rate table:
Prefix Scale
1
01 1.1
012 1.2
02 2
078 3
0784 3.1
DML
create table Rate
(
Prefix nvarchar(16) not null,
Scale float not null
)
create table [Call]
(
Id bigint not null,
TelephoneNumber nvarchar(16) not null
)
insert into Rate (Prefix, Scale) values ('', 1)
insert into Rate (Prefix, Scale) values ('01', 1.1)
insert into Rate (Prefix, Scale) values ('012', 1.2)
insert into Rate (Prefix, Scale) values ('02', 2)
insert into Rate (Prefix, Scale) values ('078', 3)
insert into Rate (Prefix, Scale) values ('0784', 3.1)
insert into [Call] (Id, TelephoneNumber) values (1, '0123456789') --match 1.2
insert into [Call] (Id, TelephoneNumber) values (2, '0100000000') --match 1.1
insert into [Call] (Id, TelephoneNumber) values (3, '0200000000') --match 2
insert into [Call] (Id, TelephoneNumber) values (4, '0780000000') --match 3
insert into [Call] (Id, TelephoneNumber) values (5, '0784000000') --match 3.1
insert into [Call] (Id, TelephoneNumber) values (6, '0987654321') --match 1
Note: The last one '0987654321' matches the blank string because there are no better matches.
回答1:
Since this is based on partial matching, a subselect would be the only viable option (unless, like LukeH assumes, every call is unique)
select
c.Id,
c.TelephoneNumber,
(select top 1
Scale
from Rate r
where c.TelephoneNumber like r.Prefix + '%' order by Scale desc
) as Scale
from Call c
回答2:
SELECT t.Id, t.TelephoneNumber, t.Prefix, t.Scale
FROM
(
SELECT *, ROW_NUMBER() OVER
(
PARTITION BY c.TelephoneNumber
ORDER BY r.Scale DESC
) AS RowNumber
FROM [call] AS c
INNER JOIN [rate] AS r
ON c.TelephoneNumber LIKE r.Prefix + '%'
) AS t
WHERE t.RowNumber = 1
ORDER BY t.Id
回答3:
Try this one:
select Prefix, min(c.TelephoneNumber)
from Rate r
left outer join Call c on c.TelephoneNumber like left(Prefix + '0000000000', 10)
or c.TelephoneNumber like Prefix + '%'
group by Prefix
回答4:
You can use a left join to try to find a "better" match, and then eliminate such matches in your where clause. e.g.:
select
*
from
Call c
inner join
Rate r
on
r.Prefix = SUBSTRING(c.TelephoneNumber,1,LEN(r.Prefix))
left join
Rate r_anti
on
r_anti.Prefix = SUBSTRING(c.TelephoneNumber,1,LEN(r_anti.Prefix)) and
LEN(r_anti.Prefix) > LEN(r.Prefix)
where
r_anti.Prefix is null
来源:https://stackoverflow.com/questions/3709323/what-is-a-good-approach-in-ms-sql-server-2008-to-join-on-a-best-match