I'm OK with either a PL/SQL solution or a VBA one (Access VBA is preferred over Excel VBA). So PL/SQL is the first choice, Access VBA is second a...
This gets you most of the way there in standard SQL. It's not quite perfect, and I expect that the MODEL clause is what would work best, but...
What this does is:
all_possible: work out every possible combination.
some_counting: pivot this round and count the number of unique otherids per fax. We can also restrict this to 6 here, so that we exclude any faxes which are never going to qualify.
uniquify: use row_number() to ensure that we can split records that have the same number of otherids per fax later, and also work out the greatest. If this is 6 then you've got a simple win.
cumulative_sum: work out the running sum of the number of otherids per fax. The trick here is the order in which you do it. I've chosen to pick the greatest first and then add in the smaller ones. I'm sure there's a cleverer way to do this... I did this because if the greatest is 6, you win. If it's 4, say, then you can fill it in with 2 faxes which only have 1 associated otherid, etc.
Assuming a table as follows, filled with your data:
create table tmp_table (
r number
, otherid number
, fax number
);
the code would look like this:
with all_possible as (
select t.r as t_r, t.otherid as t_otherid, t.fax as t_fax
, u.r as u_r, u.otherid as u_otherid, u.fax as u_fax
from tmp_table t
left outer join tmp_table u
on t.fax = u.fax
and t.r <> u.r
)
, some_counting as (
select fax
, count(distinct otherid) as no_o_per_fax
from all_possible
unpivot ( (r, otherid, fax)
for (a, b, c)
in ( (t_r, t_otherid, t_fax)
, (u_r, u_otherid, u_fax)
))
group by fax
having count(distinct otherid) <= 6
)
, uniquify as (
select c.*
, row_number() over (order by no_o_per_fax asc) as rn
, max(no_o_per_fax) over () as m_fax
from some_counting c
)
, cumulative_sum as (
select u.*, sum(no_o_per_fax) over (order by case when no_o_per_fax = m_fax then 0 else 1 end
, no_o_per_fax asc
, rn ) as csum
from uniquify u
)
, candidates as (
select a.*
from cumulative_sum a
where csum <= 6
)
select b.*
from tmp_table a
join candidates b
on a.fax = b.fax
I make extensive use of common table expressions here to keep the code readable.
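The ordering trick in cumulative_sum may be easier to see outside SQL. The following is a minimal Python sketch, not a translation of the query: it assumes the data is a list of (otherid, fax) pairs, and the names pick_faxes, rows and limit are made up for illustration. Ties are broken by fax value here, whereas the query relies on an arbitrary row_number().

```python
from collections import defaultdict


def pick_faxes(rows, limit):
    """Return (fax, count, running_sum) for the faxes kept by the running sum."""
    # Count distinct otherids per fax (the some_counting step).
    ids_per_fax = defaultdict(set)
    for otherid, fax in rows:
        ids_per_fax[fax].add(otherid)

    # Drop faxes that could never qualify on their own.
    counts = {fax: len(ids) for fax, ids in ids_per_fax.items() if len(ids) <= limit}
    if not counts:
        return []

    # Greatest count first, then the smaller counts in ascending order.
    greatest = max(counts.values())
    ordered = sorted(
        counts.items(),
        key=lambda item: (0 if item[1] == greatest else 1, item[1], item[0]),
    )

    # Keep every fax whose running sum is still within the limit.
    picked, running = [], 0
    for fax, count in ordered:
        running += count
        if running <= limit:
            picked.append((fax, count, running))
    return picked


rows = [(1, "A"), (2, "A"), (3, "A"), (4, "A"),
        (5, "B"), (6, "C"), (7, "D"), (8, "D")]
print(pick_faxes(rows, 6))
```

With a limit of 6, fax A (4 otherids) is taken first and the two single-otherid faxes fill the rest, which is the "fill it in" case described above. Like the query, the sketch adds up per-fax counts, so an otherid shared by two faxes is counted once for each.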
This is not a full answer, but I don't want to write a lot of queries in comments.
Your main goal is to send information to people while avoiding the situation where one person receives the fax twice. So first you need a list of unique recipients, like this:
select distinct otherid
from NR_PVO_120
If one person has two fax numbers, you need to decide which one to choose:
select otherid, fax
from (select otherid, fax, row_number() over (partition by otherid order by <choosing rule>) rn
from NR_PVO_120)
where rn = 1
(You already have all of this in the answers to your previous question.)
If you take this list of fax numbers, all of your recipients receive the fax, and each person receives it only once. But some fax numbers will not be used. You can easily find them:
select otherid, fax
from (select otherid, fax, row_number() over (partition by otherid order by <choosing rule>) rn
from NR_PVO_120)
where rn > 1
If you send the fax to any of these numbers, some people will get the fax twice.
English is not my native language, so I don't understand what you mean by "without breaking up fax numbers". As far as I can see from your question, you may want to use the order of the fax numbers in your table as their priority (the higher a number sits in the table, the more likely it is to be used). It seems you can use the following:
select otherid, fax
from (select otherid, fax, row_number() over (partition by otherid order by row) rn
from NR_PVO_120)
where rn = 1
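Here, row in the order by clause is the Row column from your example table. Note that ROW is a reserved word in Oracle, so the real column will probably need a different name or a quoted identifier.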
UPD
P.S. About my last query: we have a table with a certain order, and the order is important. We take the rows of the table one by one. Take the first row and put its otherid and fax into the result table. Then take the next row. If it contains another fax number and another otherid, we take it; if its otherid is already in our result table, we skip it. Is this the algorithm you were asking for?
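The row-by-row procedure described above can be sketched in a few lines of Python. This is only an illustration under the assumption that the table is available as a list of (otherid, fax) pairs already in priority order; split_by_first_fax is a made-up name. The first list corresponds to the rn = 1 query and the second to the rn > 1 query.

```python
def split_by_first_fax(rows):
    """Keep the first fax seen for each otherid and report the rows that were skipped."""
    chosen = {}    # otherid -> fax, in order of first appearance
    skipped = []   # rows whose otherid was already in the result
    for otherid, fax in rows:
        if otherid in chosen:
            skipped.append((otherid, fax))
        else:
            chosen[otherid] = fax
    return list(chosen.items()), skipped


rows = [(1, "111"), (2, "111"), (1, "222"), (3, "333")]
chosen, skipped = split_by_first_fax(rows)
print(chosen)
print(skipped)
```

Here otherid 1 keeps fax 111 because that row comes first, and the later row with fax 222 ends up in the skipped list.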
I'm not sure about your requirement, but this is how I understood your question. The code first sorts the data on Fax and then extracts one ID for each distinct Fax. Because of the data, that list can still contain duplicate IDs, so it is sorted again and the duplicates are removed.
Sub Unique_fax()
' Finding the last row so that loop can run that many times
lastrow = Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Row
' Copying the data to new rows so that your original data remains intact
For i = 1 To lastrow
Worksheets("Sheet1").Cells(i, 5).Value = Trim(Worksheets("Sheet1").Cells(i, 1))
Worksheets("Sheet1").Cells(i, 6).Value = Trim(Worksheets("Sheet1").Cells(i, 2))
Worksheets("Sheet1").Cells(i, 7).Value = Trim(Worksheets("Sheet1").Cells(i, 3))
Next
' Sorting the data based on Fax
Range("E1:G" & lastrow).Select
Selection.Sort Key1:=Range("G1"), Order1:=xlAscending, _
Header:=xlNo, OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom
' Copying the IDs where the Fax is different to a new row
x = 1
For i = 1 To lastrow
If Cells(i, 7) <> Cells(i + 1, 7) Then
Cells(x, 9) = Cells(i, 6)
x = x + 1
End If
Next
' Sorting the list of IDs and removing duplicates
lastrowUnq = Worksheets("Sheet1").Cells(Rows.Count, 9).End(xlUp).Row
Range("I1:I" & lastrowUnq).Select
Selection.Sort Key1:=Range("I1"), Order1:=xlAscending, _
Header:=xlNo, OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom
y = 1
For j = 1 To lastrowUnq
If Cells(j, 9) <> Cells(j + 1, 9) Then
Cells(y, 11) = Cells(j, 9)
y = y + 1
End If
Next
End Sub
Columns A, B, C hold your original data. Columns E, F, G hold the data sorted on Fax. Column I contains the IDs picked for each distinct Fax. Column K contains the final list of IDs (as required).
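For readers who do not use Excel, the same steps can be sketched in Python. This is a rough equivalent under the assumption that the data is a list of (otherid, fax) pairs; ids_at_fax_boundaries is a made-up name. Like the macro, it keeps the ID on the last row of each run of equal Fax values and then de-duplicates the result.

```python
def ids_at_fax_boundaries(rows):
    """Sort on fax, take the ID at the end of each fax group, then de-duplicate."""
    ordered = sorted(rows, key=lambda row: row[1])  # sort on fax only
    picked = []
    for index, (otherid, fax) in enumerate(ordered):
        is_last_of_group = index + 1 == len(ordered) or ordered[index + 1][1] != fax
        if is_last_of_group:
            picked.append(otherid)
    # Sorting and removing duplicates, as the macro does in columns I and K.
    return sorted(set(picked))


rows = [(10, "B"), (11, "A"), (12, "A"), (10, "C")]
print(ids_at_fax_boundaries(rows))
```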
If I understand the requirements correctly, this should do it.
EDIT: I missed the uniqueness requirement. So, I've updated the code to account for that.
EDIT2: Added fax to the output, using a record type.
declare
input_number int := 6;
cursor get_faxes is
select fax, count(*) num_ids from listofids
group by fax
order by fax;
cursor get_ids (p_fax in int) is
select otherid from listofids
where fax = p_fax;
type idrec is record(id listofids.otherid%type, fax listofids.fax%type);
type idlist is table of idrec;
output_list idlist := idlist();
v_memberof boolean;
begin
for fax_rec in get_faxes loop
if output_list.count + fax_rec.num_ids <= input_number then
for id_rec in get_ids(fax_rec.fax) loop
v_memberof := False;
for i in 1..output_list.count loop
if output_list(i).id = id_rec.otherid then
v_memberof := true;
end if;
end loop;
if not v_memberof then
output_list.extend(1);
output_list(output_list.count).id := id_rec.otherid;
output_list(output_list.count).fax := fax_rec.fax;
end if;
end loop;
end if;
end loop;
for i in 1..output_list.count loop
dbms_output.put_line('id: ' || output_list(i).id || ' fax:' || output_list(i).fax);
end loop;
end;
This now returns the following:
id: 11098554 fax:2063504752
id: 56200936 fax:2080906666
id: 56166614 fax:7180930966
id: 56159509 fax:7180930966
id: 25138850 fax:7182160901
id: 56148974 fax:7182232046
If you actually need a random selection, you can change the order by to use dbms_random.random instead of fax.
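The logic of the block above, restated as a small Python sketch for readers who want to try it without a database. It assumes the rows are (otherid, fax) pairs; select_by_fax and limit are made-up names standing in for the anonymous block and input_number.

```python
def select_by_fax(rows, limit):
    """Walk the faxes in order and add a whole fax only while it still fits."""
    ids_by_fax = {}
    for otherid, fax in rows:
        ids_by_fax.setdefault(fax, []).append(otherid)

    output = []   # (otherid, fax) pairs, like output_list
    seen = set()  # otherids already in the output
    for fax in sorted(ids_by_fax):
        ids = ids_by_fax[fax]
        # Same check as the PL/SQL: current size plus the row count of this fax.
        if len(output) + len(ids) <= limit:
            for otherid in ids:
                if otherid not in seen:
                    seen.add(otherid)
                    output.append((otherid, fax))
    return output


rows = [(1, "100"), (2, "100"), (2, "200"), (3, "200"),
        (4, "300"), (5, "300"), (6, "300")]
print(select_by_fax(rows, 4))
```

In this sample, fax 300 is skipped because its three rows would push the list past 4, and otherid 2 is listed only once even though it appears under two faxes.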
EDIT 2/13/2015: After using the accepted answer for a few months I came across a scenario that hadn't happened before and realized that his solution only works if I need a number that's not too close to the total. For example, if my total number of records is 15000 and I'm asking for 12000, then his code will give 10 or 11k. If I ask for 8k then I will probably get the 8.
I don't understand what his code does and he never replied, so I can't explain why this is happening. My guess is that he's taking the counts in a certain order, and since the results depend on the order the faxes are sorted in, he won't necessarily get the best result every time. When there's enough room (asking for 8k out of 15k), any combination yields an acceptable result, but once you ask for a tighter number (12k out of 15k) he's locked into his order and runs out of acceptable counts quickly.
So this is the code that will give the correct result no matter what. It's not nearly as elegant and is extremely slow, but it works.
12/13/14: I think I got it. PL/SQL, not the best solution by far, but it gives better results than what they currently get by hand. Actually, I would be really interested to hear about possible problems.
12/13/14 EDIT: The accepted answer is the way to do it. I'm only leaving this here for contrast, so people can see how not to code lol.
DECLARE
CountsNeededTotal NUMBER;
CountsNeededRemaining NUMBER;
CurCountsTotal NUMBER;
CurFaxCount NUMBER;
CurFaxCountPicked NUMBER;
BEGIN
CountsNeededTotal := 420;
CurCountsTotal := 0;
CurFaxCount := 0;
CountsNeededRemaining := CountsNeededTotal - CurCountsTotal;
EXECUTE IMMEDIATE 'TRUNCATE TABLE NR_PVO_121';
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--START BLOCK
--this block just gets the first fax, the fax with the largest number of people
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--get the first fax with the most people as long as that number isn't larger than the number needed
SELECT MAX(CountOfPeople) CountOfPeople
INTO CurFaxCount
FROM (SELECT fax
,COUNT(1) CountOfPeople
FROM NR_PVO_120
GROUP BY Fax
HAVING COUNT(1) <= CountsNeededRemaining);
COMMIT;
--if there is a number that's not larger then add to the table and keep looping
--if there isn't then there's no providers from this campaign that can be used
IF CurFaxCount >= 0 THEN
--insert into the 121 table (final list of faxes)
INSERT INTO NR_PVO_121
SELECT fax
,COUNT(1) CountOfPeople
FROM NR_PVO_120
HAVING COUNT(1) = (SELECT MAX(CountOfPeople) CountOfPeople
FROM (SELECT fax
,COUNT(1) CountOfPeople
FROM NR_PVO_120
GROUP BY Fax
HAVING COUNT(1) <= CountsNeededTotal))
GROUP BY Fax;
COMMIT;
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--START BLOCK
--this block loops through remaining faxes
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
SELECT SUM(CountOfPeople) INTO CurCountsTotal FROM NR_PVO_121;
IF CurCountsTotal < CountsNeededTotal THEN
CountsNeededRemaining := CountsNeededTotal - CurCountsTotal;
--loop until counts needed remaining is 0 or as close as 0 as possible without going in the negative
WHILE CountsNeededRemaining >= 0 LOOP
--clear 122 table
EXECUTE IMMEDIATE 'TRUNCATE TABLE NR_PVO_122';
--loop through all faxes in 120 table MINUS the ones in the 121 table
DECLARE
CURSOR CurRec IS
SELECT DISTINCT Fax
FROM NR_PVO_120
WHERE Fax NOT IN (SELECT Fax FROM NR_PVO_121);
PVO CurRec%ROWTYPE;
BEGIN
OPEN CurRec;
LOOP
FETCH CurRec INTO PVO;
--stop as soon as the fetch returns no row, otherwise the last fax is processed twice
EXIT WHEN CurRec%NOTFOUND;
SELECT DISTINCT COUNT(OtherID) CountOfPeople
INTO CurFaxCount
FROM NR_PVO_120
WHERE Fax = PVO.fax
AND OtherID NOT IN (SELECT DISTINCT OtherID
FROM NR_PVO_120
WHERE fax IN (SELECT Fax FROM NR_PVO_121));
-- DBMS_OUTPUT.put_line('CurFaxCount ' || CurFaxCount);
-- DBMS_OUTPUT.put_line('CountsNeededRemaining ' || CountsNeededRemaining);
IF CurFaxCount <= CountsNeededRemaining THEN
--record their unique counts in 122 table IF THEY'RE NOT LARGER THAN CountsNeededRemaining
INSERT INTO NR_PVO_122
SELECT PVO.fax
,CurFaxCount
FROM DUAL;
COMMIT;
END IF;
--end fax loop
END LOOP;
CLOSE CurRec;
END;
--pick the highest count from 122 table
SELECT MAX(CountOfPeople) CountOfPeople INTO CurFaxCountPicked FROM NR_PVO_122;
--add this fax to the 121 table
INSERT INTO NR_PVO_121
SELECT MIN(Fax) Fax
,CurFaxCountPicked
FROM NR_PVO_122
WHERE CountOfPeople = CurFaxCountPicked;
COMMIT;
--add the counts to the CurCountsTotal
CurCountsTotal := CurCountsTotal + CurFaxCountPicked;
--recalc CountsNeededRemaining
CountsNeededRemaining := CountsNeededTotal - CurCountsTotal;
--
-- DBMS_OUTPUT.put_line('CurCountsTotal ' || CurCountsTotal);
-- DBMS_OUTPUT.put_line('CurFaxCountPicked ' || CurFaxCountPicked);
-- DBMS_OUTPUT.put_line('CurFaxCount ' || CurFaxCount);
-- DBMS_OUTPUT.put_line('CountsNeededRemaining ' || CountsNeededRemaining);
-- DBMS_OUTPUT.put_line('CountsNeededTotal ' || CountsNeededTotal);
--clear 122 table
EXECUTE IMMEDIATE 'TRUNCATE TABLE NR_PVO_122';
--end while loop
END LOOP;
END IF;
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--END BLOCK
--this block loops through remaining faxes
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
END IF;
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--END BLOCK
--this block just gets the first fax, the fax with the largest number of people
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
--############################################################################################
END;
Here's a better version, MUCH faster than the above, but it probably won't return perfect results in some cases. I wasn't able to get wrong results while testing, but there is a possibility because I'm not trying every possible combination (as in the first version), which takes days to finish for a dataset of 20K records.
DECLARE
CountsNeededTotal NUMBER;
CountsNeededRemaining NUMBER;
CurCountsTotal NUMBER;
BEGIN
CurCountsTotal := 0;
SELECT NoOfProvToKeep INTO CountsNeededTotal FROM NR_PVO_121;
CountsNeededRemaining := CountsNeededTotal - CurCountsTotal;
EXECUTE IMMEDIATE 'TRUNCATE TABLE nr_pvo_122';
COMMIT;
IF CurCountsTotal <= CountsNeededTotal THEN
--loop until counts needed remaining is 0 or as close as 0 as possible without going in the negative
WHILE CountsNeededRemaining > 0 LOOP
--add the fax with the most not-yet-selected people that still fits in CountsNeededRemaining
INSERT INTO NR_PVO_122
SELECT Fax
,CountOfPeople
FROM (SELECT DISTINCT COUNT(OtherID) CountOfPeople
,Fax
FROM NR_PVO_120
WHERE OtherID NOT IN (SELECT DISTINCT OtherID
FROM NR_PVO_120
WHERE fax IN (SELECT Fax FROM NR_PVO_122))
HAVING COUNT(1) <= CountsNeededRemaining
GROUP BY fax
ORDER BY 1 DESC)
WHERE ROWNUM = 1;
--stop when no remaining fax fits, otherwise the total never changes and the loop never ends
EXIT WHEN SQL%ROWCOUNT = 0;
SELECT SUM(CountOfPeople) INTO CurCountsTotal FROM NR_PVO_122;
COMMIT;
--recalc CountsNeededRemaining
CountsNeededRemaining := CountsNeededTotal - CurCountsTotal;
--
--DBMS_OUTPUT.put_line('CurCountsTotal ' || CurCountsTotal || ', CountsNeededRemaining ' || CountsNeededRemaining);
--end while loop
END LOOP;
END IF;
DELETE FROM NR_PVO_112
WHERE NVL(Fax, '999999999999') NOT IN (SELECT Fax FROM NR_PVO_122);
END;
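The greedy idea behind this faster version can be sketched in Python as follows. This is an illustration only, assuming (otherid, fax) pairs as input; greedy_pick and needed are made-up names, and the sketch counts distinct otherids with sets, which may differ from COUNT(OtherID) if the table holds duplicate rows. It also stops explicitly when no remaining fax fits.

```python
from collections import defaultdict


def greedy_pick(rows, needed):
    """Repeatedly take the fax that adds the most new people without overshooting."""
    ids_by_fax = defaultdict(set)
    for otherid, fax in rows:
        ids_by_fax[fax].add(otherid)

    chosen, covered = [], set()
    while len(covered) < needed:
        remaining = needed - len(covered)
        best_fax, best_new = None, set()
        for fax in sorted(ids_by_fax):
            new_ids = ids_by_fax[fax] - covered
            # Must add at least one person, fit in what is left, and beat the best so far.
            if len(best_new) < len(new_ids) <= remaining:
                best_fax, best_new = fax, new_ids
        if best_fax is None:
            break  # nothing fits any more, so stop instead of looping forever
        chosen.append(best_fax)
        covered |= best_new
    return chosen, len(covered)


rows = [(1, "A"), (2, "A"), (3, "A"), (3, "B"), (4, "B"),
        (5, "C"), (6, "D"), (7, "D"), (8, "D"), (9, "D")]
print(greedy_pick(rows, 5))
```

Because each round commits to the locally best fax, the result depends on the data and can fall short of the requested number, which matches the caveat above about not trying every combination.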
Data Tested at Beginning. Note OtherID is in Col A and Fax in Col B:
First we are going to find the number of unique IDs you want. NOTE: YOU WILL NEED A NEW SHEET CALLED "Use Me". We will need a custom function for this. This function can be run as a cell formula with the syntax =UniqueItems(B2:D5), but we are going to use it in our Sub:
Function UniqueItems(ArrayIn, Optional Count As Variant) As Variant
' Accepts an array or range as input
' If Count = True or is missing, the function returns the number of unique elements
' If Count = False, the function returns a variant array of unique elements
Dim Unique() As Variant ' array that holds the unique items
Dim Element As Variant
Dim i As Long
Dim NumUnique As Long ' number of unique elements found so far
Dim FoundMatch As Boolean
' If 2nd argument is missing, assign default value
If IsMissing(Count) Then Count = True
' Counter for number of unique elements
NumUnique = 0
' Loop thru the input array
For Each Element In ArrayIn
FoundMatch = False
' Has item been added yet?
For i = 1 To NumUnique
If Element = Unique(i) Then
FoundMatch = True
Exit For '(exit loop)
End If
Next i
AddItem:
' If not in list, add the item to unique list
If Not FoundMatch And Not IsEmpty(Element) Then
NumUnique = NumUnique + 1
ReDim Preserve Unique(NumUnique)
Unique(NumUnique) = Element
End If
Next Element
' Assign a value to the function
If Count Then UniqueItems = NumUnique Else UniqueItems = Unique
End Function
Here is the Sub you need to find your unique IDs and copy them over to the sheet "Use Me":
Sub FaxesToUse()
Dim LastRow As Long, CurRow As Long, UniqueTotal As Long, SubTotal As Long
UniqueTotal = InputBox("How Many Unique OtherIDs is Max?")
If Not UniqueTotal > 0 Then
Exit Sub
End If
LastRow = Range("A" & Rows.Count).End(xlUp).Row
SubTotal = 0
For CurRow = 2 To LastRow
SubTotal = UniqueItems(Range("A2:A" & CurRow))
If SubTotal > UniqueTotal Then
SubTotal = UniqueItems(Range("A2:A" & CurRow - 1))
Range("A1:B" & CurRow - 1).Copy
Sheets("Use Me").Cells.Clear
Sheets("Use Me").Range("A1").PasteSpecial xlPasteValues
Sheets("Use Me").Activate
MsgBox "Use Me Sheet rows contain " & SubTotal & " Unique OtherIDs"
Exit Sub
End If
Cells(CurRow, 1).EntireRow.Interior.Color = RGB(255, 255, 0)
Next CurRow
End Sub
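What FaxesToUse computes is the longest run of rows from the top whose unique OtherID count stays within the maximum. A minimal Python sketch of that idea, assuming (otherid, fax) pairs in sheet order; longest_prefix_within and max_unique are made-up names. Unlike the macro, which copies nothing when the maximum is never exceeded, the sketch returns all rows in that case.

```python
def longest_prefix_within(rows, max_unique):
    """Return the leading rows that hold at most max_unique distinct otherids."""
    seen = set()
    for index, (otherid, fax) in enumerate(rows):
        # A new otherid that would exceed the cap ends the prefix just before this row.
        if otherid not in seen and len(seen) == max_unique:
            return rows[:index], len(seen)
        seen.add(otherid)
    return rows, len(seen)


rows = [(1, "a"), (2, "a"), (1, "b"), (3, "c"), (4, "d")]
print(longest_prefix_within(rows, 3))
```

Tracking the seen set as it goes avoids recounting the whole range on every row, which is what the repeated UniqueItems calls do.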
That will get you a page that looks like this:
Now we just need to remove all the duplicate faxes using this macro:
Sub RemoveDups()
Dim CurRow As Long, LastRow As Long, LastCol As Long, DestLast As Long, DestRng As Range, ws As Worksheet
Set ws = Sheets("Use Me")
LastRow = ws.Range("A" & Rows.Count).End(xlUp).Row
For CurRow = LastRow To 3 Step -1
Set DestRng = ws.Range("B2:B" & CurRow - 1).Find(ws.Range("B" & CurRow).Value, LookIn:=xlValues, LookAt:=xlWhole, SearchDirection:=xlNext)
If DestRng Is Nothing Then
'Do Nothing
Else
DestLast = ws.Cells(DestRng.Row, Columns.Count).End(xlToLeft).Column + 1
ws.Cells(DestRng.Row, DestLast).Value = ws.Cells(CurRow, 1).Value
ws.Cells(CurRow, 1).EntireRow.Delete xlShiftUp
End If
Next CurRow
ws.Columns("B:B").Cut
ws.Columns("A:A").Insert Shift:=xlToRight
Application.CutCopyMode = False
LastRow = ws.Range("A" & Rows.Count).End(xlUp).Row
LastCol = 0
For CurRow = 2 To LastRow
If ws.Cells(CurRow, Columns.Count).End(xlToLeft).Column > LastCol Then
LastCol = ws.Cells(CurRow, Columns.Count).End(xlToLeft).Column
End If
Next CurRow
MsgBox "Use Me Sheet Rows contain " & UniqueItems(ws.Range(ws.Cells(2, 2), ws.Cells(LastRow, LastCol))) & " Unique OtherIDs"
End Sub
That leaves you with this:
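In effect, RemoveDups turns the sheet into one row per fax with that fax's OtherIDs spread across the columns. The grouping itself can be sketched in Python as below, assuming (otherid, fax) pairs; group_ids_by_fax is a made-up name, and the order of IDs within a fax may differ from what the macro produces.

```python
def group_ids_by_fax(rows):
    """Collect the otherids of each fax into one list, one entry per fax."""
    grouped = {}
    for otherid, fax in rows:
        grouped.setdefault(fax, []).append(otherid)
    return grouped


rows = [(1, "111"), (2, "111"), (3, "222"), (4, "111")]
for fax, ids in group_ids_by_fax(rows).items():
    print(fax, ids)
```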