Data Flow limitation?

I have a unique situation where I need to process 40 mb (100,000 rows) worth of data from a flat file and order the data by a key and place it in SQL server table. My problem is that SSIS data flow processes data in batches defined by BufferSize and BufferMaxRows and I cannot seem to process 40 mb worth of data ALL AT Once.

Ie. My flat file has data that is in fixed width. First column includes id such as AA, BB, CC. These rows are not orded in the flat file. So I use Conditional Split to order the rows, however, since SSIS buffer is not large enough to process all the rows at once, it takes first 10,000 rows orders them, then next 10k orders them, etc. How do I force it so that all 40mb (100,000 rows) are read at once before leaving conditional split



Answer this question

Data Flow limitation?

  • Joel Clipperton

    I am not sorting data ascending or descendingly, I am using conditional split to group data together then provide order based on requirements.

    I am looking for a data flow solution that will allow me to extract ALL data from my source before going into transformation phase. This is because I need to have all rows from the file into memory before conditionally splitting rows. I figured if I increase the buffer it would do that for me since the max file size is 50 MB. If I cannot get ALL data before entering transformation, I will have to create a for loop in control flow and call the data flow several times to process one file which I want to avoid if I can process the whole file in one data flow.

  • R4BI

    Trying to control the buffer size is a non-starter, but if you want to sort the data, why not use the Sort Transform, after that is what it is all about.

  • aJoe

     N.S. wrote:
    This is because I need to have all rows from the file into memory before conditionally splitting rows.

    Why do you need this Are you writing and reading to/from the same file in one data flow

    The conditional split processes rows one by one, regardless of buffer. The buffer size does not affect output of conditional split, only the timing between source and transform, so I don't understand why would you need this.

    Can you give even bigger picture

  • Data Flow limitation?