Hi,
I have created a SSIS package which includes a Data Flow that will transform my data. The Data Flow Source for this Data Flow is a SQL Server table with approx. 60 mill. rows. Theses rows ought to be unique but isn't quite, so I would like to ensure that only one of each row makes it throug to my Data Flow Destination, another SQL Server table.
What is the most efficient way to ensure this - is there some Data Flow Transformation that can assure this, should I write a SQL statement that deletes all duplicates before the Data Flow or should I take a third approch
- Sune

Only transform unique rows
Andria1974
1. OLE DB source -> Sort -> OLE DB dest
Configure Sort transform to "Remove rows with duplicate sort values".
2. OLE DB source -> Aggregate -> OLE DB dest
Configure Aggregate transform to group by all columns.
No.2 might be more efficient. Would you like to try and tell us the results I'm very interested to know.
S. Khawar
Thanks,
Matt
Glenn0305
I hope to get around to make some benchmarks during the weekend and promise to post the results.
Thanks for your help so far.
- Sune