问题
I'm creating an SSIS package wherein a user needs to be notified if there is a duplicate productcode for one productlabel. We retrieve the products thru a csv flat file being sent to us in a shared location.
In this example I have Productcode = 1a for both productlabel Physio Ormix and Nixen.
So basically the productcode = 1a was a duplicate because it was also used by the productlabel=Nixen. Hence, notify users for the duplicate ProductCode and Productlabel used. I tried creating an aggregate that performs group by ProductCode and Counts it.
Can someone give me tips on how to do this?
回答1:
I think you can use a script component and a conditional split to get the duplicates without all this logic:
- Inside the data flow task add a Script Component
- Add an output column of type
DT_BOOL
(example name isFlag
) Inside the script component write a similar script:
using System.Collections.Generic; public class ScriptMain: UserComponent { List<string> lstKey = new List<string>; List<string> lstKeylabel = new List<string>; public override void Input0_ProcessInputRow(InputBuffer0 Row) { if(!lstKey.Contains(Row.ProductCode){ lstKey.Add(Row.ProductCode); lstKeylabel.Add(Row.ProductCode + ";" + Row.ProductLabel); Row.Flag = true; }else if(lstKeylabel.Contains(Row.ProductCode + ";" + Row.ProductLabel)) { Row.Flag = true; }else{ Row.Flag = false; } } }
Add a conditional split after the script component with a similar expression:
[Flag] == true
All records that are passed thru the true path are unique, all rows passed in the false path are the duplicates.
来源:https://stackoverflow.com/questions/55036780/how-to-get-desired-output-in-ssis-by-grouping-by-and-count