Monday, March 10, 2014

Handling Multiple Errors in SSIS

One actual failure in SSIS can trigger a whole series of error messages. For example, failure to convert a column value from a string to an integer in a Derived Column transform generates the following messages:
[Data Conversion [70]] Error: Data conversion failed while converting column “Fiscal year” (18) to column “NumericFiscalYear” (83). The conversion returned status value 2 and status text “The value could not be converted because of a potential loss of data.”. 
[Data Conversion [70]] Error: SSIS Error Code DTS_E_INDUCEDTRANSFORMFAILUREONERROR. The “output column “NumericFiscalYear” (83)” failed because error code 0xC020907F occurred, and the error row disposition on “output column “NumericFiscalYear” (83)” specifies failure on error. An error occurred on the specified object of the specified component. There may be error messages posted before this with more information about the failure. 
[DTS.Pipeline] Error: SSIS Error Code DTS_E_PROCESSINPUTFAILED. The ProcessInput method on component “Data Conversion” (70) failed with error code 0xC0209029. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running. There may be error messages posted before this with more information about the failure. 
[DTS.Pipeline] Error: SSIS Error Code DTS_E_THREADFAILED. Thread “WorkThread0″ has exited with error code 0xC0209029. There may be error messages posted before this with more information on why the thread has exited. 
If you are logging errors to a flat file or an error logging table, then recording each error may be fine. However, if you’re writing the errors to the Windows event log, or sending them via email, you may not want to record multiple messages each time an error occurs. You might want to record only the first message, or you might want to group all the errors into a single log entry or email. Fortunately, the event model in SSIS allows you to easily customize how errors are handled.
I’ve put together a small sample package that shows how you might accomplish this. The package contains a single data flow that loads a text file, attempts to convert a column from string to numeric, and writes it to a Trash destination (see www.SQLIS.com to get this component).
The text file has an invalid value in one of the columns, which will cause the data flow to fail, and generate the four messages listed above. The package is set up to capture all of the error messages generated, store them in a collection, and concatenate them into a single string when the package is finished executing. Once that is done, the resulting string could be emailed or recorded to a log.
As mentioned, the data flow is very straightforward:

I’ve also created two variables at the package level: errorMessages as an Object, and emailText as a String. I’ll explain why later in the post.
The real work occurs in the event handlers. SSIS raises events for all executables(packages and tasks are both executables). The event we’re interested in is the OnError event, which is raised once for each error that occurs.
You get to the event handlers by selecting the Event Handlers tab in the SSIS designer. Once there, the Executable for which you want to capture events needs to be selected.

Since I want to handle errors for anything in the package, I’m setting the executable to CaptureErrors (the name of the package). By default, any event raised by a child executable (that is, an executable that is nested inside another executable) will also be raised in its parent. You can disable that behavior by setting the Propagate system variable, but that’s a topic for another post. I’m also using “OnError” from the list of events and have added a Script Task to the event handler.

The Script Task has two variables passed in: the System::ErrorDescription, which contains the text of the error message, and User:errorMessages, which I’ll use to keep track of all the error messages.

Here’s the script used in the Script Task:
Dim messages As Collections.ArrayList 
Try
    messages = CType(Dts.Variables(“errorMessages”).Value, Collections.ArrayList)
Catch ex As Exception
    messages = New Collections.ArrayList()
End Try 

messages.Add(Dts.Variables(“ErrorDescription”).Value.ToString()) 
Dts.Variables(“errorMessages”).Value = messages 
Dts.TaskResult = Dts.Results.Success
I’m first attempting to retrieve the ArrayList from the errorMessages variable. If the value of the variable can’t be cast to an ArrayList, it indicates that it hasn’t been initialized yet. Once that has been handled, the error description is added to the ArrayList. This handles capturing the list of all error messages.
The next step is to process all the messages in order to email or log them. Since I only want to do this once, I’m using the OnPostExecute event, which fires when the executable is finished running.

There is another Script Task present in this event handler. This one has the User::errorMessages and User:emailText variables passed in.

The script in this task is concatenating a long string based on the error messages captured and returning it in the emailText variable:
Dim errorDesc As String
Dim messages As Collections.ArrayList 

Try
    messages = CType(Dts.Variables(“errorMessages”).Value, Collections.ArrayList)
Catch ex As Exception
    ‘If there is an exception – the object was never initialized, so there were no errors
    Return
End Try 

For Each errorDesc In messages
    Dts.Variables(“emailText”).Value = Dts.Variables(“emailText”).Value.ToString + errorDesc + vbCrLf
Next 

Dts.TaskResult = Dts.Results.Success
Once that has been done, the resulting string could be emailed or logged as desired. Since SSIS can generate fairly verbose error messages, I chose to store the messages in an ArrayList object. I then use the ArrayList to build the actual string message.
Hopefully, this information is helpful to anyone who wants to customize SSIS event handling. I’ve attached the sample package, and the text file used in the sample. If you have any feedback or suggestions for improvement, please leave them in the comments.

http://agilebi.com/jwelch/2007/05/05/handling-multiple-errors-in-ssis/


Previously, I posted about how multiple errors in SSIS can be combined into a single unit for emailing or logging. Recently, a thread on the MSDN forums pointed out that the example wasn’t quite complete. While it does show how to collect all the error messages and get them into a single string, it doesn’t show how to email or log that string only once. So I’m giving it another shot, and show two options for handling this. This post does build on the previous one, so please read it before proceeding.

If you download the sample package in the previous post and examine it, you’ll see that there is a script task in the OnPostExecute event. This script task is where the error messages are assembled, and the code to deliver the bundled error message would be placed. Please note – the script does not actually email or log the message, that was left open for the reader to implement.

If you run the package, you’ll notice that the OnPostExecute event fires twice. It’s firing once for the data flow, and once for the package. Since the OnPostExecute event handler is where the script is located, this causes the message to be delivered twice – not what we want. To work around this, we need to make sure the OnPostExecute only fires once. One way to handle this is to set the Propagate system variable to false. The Propagate variable controls whether the event is raised to the next container. In this case, if Propagate is set to false on the data flow’s OnPostExecute event, the package’s OnPostExecute will only be fired once.

image

However, this approach requires that all tasks in the package need to have the Propagate variable set to false in their OnPostExecute event handlers. To work around this, I recommend incorporating a single Sequence Container, with the Propagate variable set to false on it’s OnPostExecute. Any tasks in the package should be added inside the Sequence Container. If you do this, the Propagate variable only needs to be set once.

image

Another approach is to eliminate the use of the OnPostExecute event altogether. To do this, the Script Task needs to be moved to the control flow, and the OnPostExecute event handler can be deleted altogether. A Sequence Container should still be used to hold all of the tasks that need to be grouped for error handling. The Script Task should be connected to the Sequence Container with a Failure constraint.

image

This approach also ensures that the Script Task will only execute once, if there is a failure in the Sequence Container.

Hopefully, this helps clarify how to leverage collecting the error messages to only send a single error result. I’ve uploaded the samples here, so please feel free to take a look at them and let me know if you have questions.

http://agilebi.com/jwelch/2008/01/15/handling-multiple-errors-in-ssis-revisited/

No comments: