Threading in Custom Components and Script Components

OK, so it's really easy to use threading in .NET. But what about doing it properly, especially in SSIS?

My questions are:

Are there any implications of adding rows to a buffer from multiple threads?

How can script components be written to close any spawned threads when running in dtexec? It seems that only the top-level thread is aborted, leaving spawned threads running. Is this an issue with dtexec?

Which of the core components use multiple threads? For example, does the multi flat file? I understand the Sort does.

Cheers

Simon


  • Diogo Mendonca Verissimo

    Hi Matt
    The reason I want threading is that I am waiting on a resource that is not at its limit while my machine's processing is not at its limit either, i.e. I am waiting on IO, memory, network, etc. I therefore want to use threading to simplify the design without needing multiple components. In this situation I believe threading is the best way to achieve the best performance, though I am happy to be proved wrong.

    I agree that using multiple threads is easy but writing a good multithreaded app is not (I think that is what you were getting at). I am therefore after guidance on best practices for achieving this in the SSIS framework.

    I will look into the CancelEvent information you mention.

    Any more information would be great.

  • Macolino

    Simon,

    Yes, that is exactly what I was getting at.  You can thread in a script component, but if you want to thread I would advise writing your own custom component, as you have more control over what is happening.  I understand threading can make certain tasks "easier".  However, remember that unless a processor is free to run the extra thread, threading makes an application slower, not faster, which is why I advise against it where possible.  Obviously, certain things just make sense to thread, and then you should by all means do so.  Your case does sound like one of them, since you are waiting on a resource and can do something else in the meantime.  You will have to track the CancelEvent and also ensure, via some synchronization object (mutex, semaphore, etc.), that only one thread at a time adds a row to the output buffer.  Other than that, normal good threading practices apply; there is nothing special for SSIS.

    HTH,
    Matt

  • __Jim__

    I think your statement is not precise.  It is easy to start a thread in .NET, but that doesn't mean threading is easy.  Threading should be avoided unless absolutely needed.

    Furthermore, you should not thread in a script component.  If you want to thread then you should write your own custom component.  That being said, if you thread (in a script component or otherwise) then you, not we, are responsible for shutting down your threads.  There is a system variable that contains a handle to an event (CancelEvent, I believe it is called), which gets signalled when the package is requesting a shutdown, and you have to use that to shut down any threads you spawn.  Several components use threading, such as Sort, Merge Join, etc., but in all cases only one thread adds rows to the output buffer, since the synchronization overhead would remove most of the perf gain and add significant complexity.
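
    As a rough illustration of the shutdown pattern described above, here is a minimal sketch in plain .NET. It is not SSIS API code: `CancellableWorkers`, `Start` and `Stop` are hypothetical names, and the `cancelEvent` parameter merely stands in for the event that the package signals. How the CancelEvent handle is wrapped into a `WaitHandle` inside a real component is left out and would need to be checked against the SSIS documentation.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading;

    // Hypothetical helper: 'cancelEvent' stands in for the event SSIS signals on shutdown.
    public static class CancellableWorkers
    {
        // Starts 'count' threads that call 'doWork' repeatedly until 'cancelEvent' is signalled.
        public static List<Thread> Start(int count, WaitHandle cancelEvent, Action doWork)
        {
            var threads = new List<Thread>();
            for (int i = 0; i < count; i++)
            {
                var t = new Thread(() =>
                {
                    // WaitOne(0) polls without blocking; true means shutdown was requested.
                    while (!cancelEvent.WaitOne(0))
                    {
                        doWork();
                    }
                });
                // Background threads do not keep the process alive if one is missed.
                t.IsBackground = true;
                t.Start();
                threads.Add(t);
            }
            return threads;
        }

        // Waits for each worker to exit; returns false if any thread outlives its timeout.
        public static bool Stop(List<Thread> threads, TimeSpan timeoutPerThread)
        {
            bool allStopped = true;
            foreach (var t in threads)
            {
                allStopped &= t.Join(timeoutPerThread);
            }
            return allStopped;
        }
    }
    ```

    In a test harness the event can be a `ManualResetEvent`: start the workers, call `Set()` on the event, then call `Stop` to confirm that every thread has exited. Each unit of work should be kept short (or should itself wait on the event with a timeout) so that a shutdown request is noticed promptly.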

    You can only add rows from multiple threads if you guarantee serialization, i.e. that one thread has completed a whole row before another thread calls AddRow.  This is because the buffer manager can push a buffer downstream on any AddRow call, and if that happens while another thread is still filling in a row then in the best case you will get an incomplete row and in the worst case you will cause an access violation (AV).
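
    The serialization requirement above can be sketched as follows. This is only an illustration under stated assumptions: `FakeBuffer` is a hypothetical stand-in for the output buffer (it is not the SSIS buffer class), and `SerializedWriter` and `SerializedWriterDemo` are made-up names. The point is that a single lock covers both the AddRow call and every column write for that row.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading;

    // Hypothetical stand-in for an output buffer: AddRow starts a row, SetValue fills it.
    public sealed class FakeBuffer
    {
        private readonly List<string[]> rows = new List<string[]>();
        public void AddRow() { rows.Add(new string[2]); }
        public void SetValue(int column, string value) { rows[rows.Count - 1][column] = value; }
        public IReadOnlyList<string[]> Rows { get { return rows; } }
    }

    public sealed class SerializedWriter
    {
        private readonly object gate = new object();
        private readonly FakeBuffer buffer;

        public SerializedWriter(FakeBuffer buffer) { this.buffer = buffer; }

        // The lock spans AddRow AND all column writes, so a row is complete
        // before any other thread can begin the next one.
        public void WriteRow(string key, string value)
        {
            lock (gate)
            {
                buffer.AddRow();
                buffer.SetValue(0, key);
                buffer.SetValue(1, value);
            }
        }
    }

    public static class SerializedWriterDemo
    {
        // Runs 'threadCount' threads that each write 'rowsPerThread' rows; returns the buffer.
        public static FakeBuffer Run(int threadCount, int rowsPerThread)
        {
            var buffer = new FakeBuffer();
            var writer = new SerializedWriter(buffer);
            var threads = new List<Thread>();
            for (int t = 0; t < threadCount; t++)
            {
                int id = t; // copy the loop variable so each closure sees its own value
                var thread = new Thread(() =>
                {
                    for (int r = 0; r < rowsPerThread; r++)
                    {
                        writer.WriteRow("t" + id, "row" + r);
                    }
                });
                threads.Add(thread);
                thread.Start();
            }
            foreach (var thread in threads) { thread.Join(); }
            return buffer;
        }
    }
    ```

    Any slow work (waiting on IO, the network, and so on) should happen outside the lock, so that the lock is held only for as long as it takes to write one finished row. Locking around AddRow alone would not be enough, because another thread could then start a new row while the first is still being filled in.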

    Matt
