Here's the senario:
We have 3000 stores each sending detail sales data polling files to our FTP site. Most of these files arrive between 10 pm and 4:00 am, with stragglers coming in all day. We receive several different files from each store, containing different types of data (payroll, order header, order detail etc...). Detail files could be up to 5000 rows per store.
We have built several test SSIS packages that will process these files using multiple flat file connectors to load each file thru to one or more destination tables in our data warehouse.
My question: What is the best way to execute these packages to minimize the latency between the file arriving at our FTP site and it's delivery to the final destination tables
Option One: Use SQL job scheduler to execute the package every x minutes, processing all files that are found.
Option two: Have the package execute continuously using a loop, so that it will pickup any files as they arrive.
Option Three: Have a VB.net application watch the FTP site and execute the appropriate package using DTExec for each file, passing the file name to the package as a parameter.
I'm tending to lean towards option one. Does anyone have any feedback on the pros and cons of these options, or suggestions of other ways to address
Thanks, Mike

Watching an incoming directory for files to process
Sean Connolly
With option 3, what is the practical limit to how many packages I could spawn at one time We've written the vb.net component to monitor our ftp site and copy files to a working directory. We could use this routine to execute the appropriate package. However, would it need to monitor the number of active packages, compare to some preset limit, and only spawn a new one when system capacity would allow
Thanks, Mike
Keith Henkel
Mike,
No, it can't be used in a .net app. Its SSIS only.
Its Darren's baby so he can tell you more about it!!
-Jamie
Znarkus
Srinivasalu
1) Use SQLIS.com's File Watcher task: http://www.sqlis.com/default.aspx 23
2) Use WMI Event Watcher Task as described by Kirk: http://www.sqljunkies.com/WebLog/knight_reign/search.aspx q=wmi&p=1
-Jamie
Chris Jiang
Darren,
I thought there might be some restriction on the number of packages that could be simultaneously running from an SSIS architecture perspective. Obviously we'll need to determine from a pure system resources viewpoint if we need to throttle the process or not.
Is there any material performance difference between running n files thru a single execution, i.e. Option 1, vs. spawning n simultaneous packages running one file each, i.e. Option 3 Intuitively it seems like option 1 would be more efficient, especially as the number of files increases.
Perhaps a dual option would be the best solution for minimal latency. Option 1 in the early AM processing files in large gulps, Option 3 running thru the day to push straggler files all the way thru with minimum latency.
I play around and let you know what we find out.
Thanks for all the posts....
Mike
EricN
The problem is that once a change has been detected, it will not watch for any more files. This is by design, as I belive for high volumes of files, a sequential process, all inside one package would not work. Yes we have events, but the development bar gets higher to use it corrctly, so that is why I think for high volume you woudl be better off with a custom application external from SSIS. High volume, higher complexity, was something I decided would be too specific for me to implement in a simple and generic task. From a process point of view I think it would be wrong to architect this inside a package anyway.
Jim Fox
Can the file watcher component be used within a vb.net application We've had some issues in our .net "watcher" application with files still being written on the FTP site.
We now have the watcher logging the files in MSMQ with a timestamp, then a separate "Getter" monitors the queue and picks up files after a preset delay. This seems to work, but it's not foolproof and adds to the latency....
Sounds like the file watcher component you referenced deals with the issue of the file being "available".
Mike
joostvaningen
If resources are an issue for concurrent execution, then you may need to implement a queue mechanism to throttle the concurrent threads, and spawn when capacity allows as you mention.