Question
I perform some transformations and validation on a flat .CSV file to import data. I'd like to add a column containing the number of times each employee appears in the flat file, for example:
Input Data Flow :
Output Data Flow :
I don't know how to transform my data flow to do this. Does anyone have an idea?
Answer 1:
This is how I would do it:
- If your data is not already sorted, sort it on Employee_Id (Merge Join requires its inputs to be sorted on the join key).
- Use a Multicast to split your data flow into two streams.
- In one of the streams, add an Aggregate transformation that groups by Employee_Id and adds a new Count column containing COUNT(*) for each Employee_Id. The Time column is not needed in this stream and is dropped.
- Merge Join the two streams back together on Employee_Id, keeping only the Count column from the aggregated stream.
This should leave you with the desired output of one row for every row in the source data, but with the Count per Employee_Id on each row.
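To make the logic of the steps above concrete outside the SSIS designer, here is a minimal Python sketch of the same idea: count the rows per Employee_Id in one pass (the Aggregate branch), then attach that count to every original row (the Merge Join). The function name `add_employee_count` and the sample rows are hypothetical and only for illustration; Employee_Id, Time and Count are the column names used above. This is not SSIS code, just the equivalent data transformation.

```python
from collections import Counter


def add_employee_count(rows):
    """Return a copy of each row with a Count column added.

    Hypothetical helper that mirrors the Multicast / Aggregate /
    Merge Join pattern described in the answer.
    """
    # "Aggregate" branch: group by Employee_Id and count rows per group.
    counts = Counter(row["Employee_Id"] for row in rows)
    # "Merge Join" step: attach the per-employee count to every original row.
    return [{**row, "Count": counts[row["Employee_Id"]]} for row in rows]


# Hypothetical sample data, already sorted on Employee_Id.
rows = [
    {"Employee_Id": "E1", "Time": "08:00"},
    {"Employee_Id": "E1", "Time": "12:00"},
    {"Employee_Id": "E2", "Time": "09:00"},
]

result = add_employee_count(rows)
for row in result:
    print(row)
```

Every input row is kept, and rows that share an Employee_Id carry the same Count value, which is the shape of output the question asks for.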
Source: https://stackoverflow.com/questions/33696647/ssis-perform-group-by-and-count-on-flat-file