How to get desired output in SSIS by grouping by and count?

问题

I'm creating an SSIS package wherein a user needs to be notified if there is a duplicate productcode for one productlabel. We retrieve the products thru a csv flat file being sent to us in a shared location.

In this example I have Productcode = 1a for both productlabel Physio Ormix and Nixen.

So basically the productcode = 1a was a duplicate because it was also used by the productlabel=Nixen. Hence, notify users for the duplicate ProductCode and Productlabel used. I tried creating an aggregate that performs group by ProductCode and Counts it.

Can someone give me tips on how to do this?

回答1:

I think you can use a script component and a conditional split to get the duplicates without all this logic:

Inside the data flow task add a Script Component
Add an output column of type DT_BOOL (example name is Flag)

Inside the script component write a similar script:

using System.Collections.Generic;

public class ScriptMain:  
    UserComponent  

{  

    List<string> lstKey = new List<string>;
    List<string> lstKeylabel = new List<string>;

    public override void Input0_ProcessInputRow(InputBuffer0 Row)  
    {  

        if(!lstKey.Contains(Row.ProductCode){

            lstKey.Add(Row.ProductCode);
            lstKeylabel.Add(Row.ProductCode + ";" + Row.ProductLabel);
            Row.Flag = true;

        }else if(lstKeylabel.Contains(Row.ProductCode + ";" + Row.ProductLabel)) {

            Row.Flag = true;

        }else{

            Row.Flag = false;

        }

    }  

}

Add a conditional split after the script component with a similar expression:
```
[Flag] == true
```
All records that are passed thru the true path are unique, all rows passed in the false path are the duplicates.

来源：https://stackoverflow.com/questions/55036780/how-to-get-desired-output-in-ssis-by-grouping-by-and-count

标签

sql

sql-server

ssis

ssis-2012

ssis-2008