Skip Navigation LinksHome > View Post

SSIS Xmlify Data Flow Task

UPDATE: This has now been updated by the legendary Jamie Thomson and is available on codeplex at: http://xmlify.codeplex.com/

"What?" I hear you say. This is actually my favourite custom task to date (after the recent Custom Logging, Set Variable and Compress File Tasks) and a big thanks has to go to Steve McHugh for providing the inspiration for this.

The basic idea is simple. The Xmlify component takes a series of columns and turns them into a single XML column, that looks like this:

<row>
    <col name="Column 1">Value 1</col>
    <col name="Column 2">Value 2</col>
    <col name="Column 3" null="true" />
</row>

This is useful if you want a consistent way of logging error rows for analysis later. Take this sample data flow which reads from a data file and converts the types of some of the columns.

Xmlify Data Flow

If, for example, we have a string column in the source that we want to convert into a datetime then there is always the possibility of some rogue data that will make the conversion fail. However, we don't always want this to make our package fail, we may want to store the failed input for analysis later and let everything else continue.

The Xmlify task allows you to achieve this by creating a simple table which could be used by all tasks in all packages:

CREATE TABLE FailedRows(
    errorCode varchar(50),
    errorColumn varchar(2048),
    [xml] xml NULL -- we use the xml datatype but you could use text if you prefer
)

(Obviously, I'd suggest you enhance this table with more columns such as a timestamp, execution instance id and whatever else takes your fancy...)

Now we simple configure the component to redirect errors rather than fail the component. Simply open the editor for the data flow task in question and click 'Configure Error Output'.

Configure Error Output

To change all dropdowns to 'Redirect row' in one go, click the top-left dropdown and then shift+click the bottom-right dropdown (highlighted in blue in the screenshot above). Now choose 'Redirect row' in the 'Set this value to selected cells' dropdown and click 'Apply'. Easy, your task is now ready to redirect failures down the red line toward the Xmlify task.

In the Xmlify task editor you simply need to check the columns that you want Xmlified. Whenever you use a red error pipe two columns, ErrorCode and ErrorColumn, are automatically added to your flow. Since we have a space for these in our table there's no need to Xmlify them...

Choose columns to be Xmlified

Finally you just point the Xmlify output at a suitable destination and map the columns. Remember, since the data shape is always the same you can use a Union to combine errors from multiple tasks (as described in this post).

Xmlify Data Flow

The msi installer is available for download. Looking at our server logs, the source code hasn't been very popular so far so I won't post it this time. However, if you want it just drop me a line.

As usual, this task is provided as is and without any warranty - you use it at your own risk. Be sure to let me know if you have any comments or suggestions - we love feedback. Enjoy.

Update: Due to popular demand, here is the full source code for download. In future, I'll endeavour to share the source to start with.

Tags: SQL SSIS

 
Josh Post By Josh Twist
6:11 AM
22 Jun 2006

» Next Post: DataBinding on SelectedIndexChanged in WinForms.NET
« Previous Post: Fun with SQL joins and predicates

Comments are closed for this post.

Posted by Gerald @ 04 Jul 2006 6:41 AM
Hi Josh,

what a great component I have found today :-)

There's just one thing confusing me: why are the error columns in the table FailedRows of type varchar while the Xmlify's error columns are of type four byte signed integer?

What about a feature for adding additional columns having a fixed value? So I can add an error source field to each Xmlify component and later easily filter the log table.

Of course now I can do this using the derived column component, but it would be more convenient in the Xmlify Task.

kind regards,
Gerald

Posted by Josh @ 04 Jul 2006 2:31 PM
Hi Gerald,

Glad you like it, and great question:

"why are the error columnsin the table FailedRows of type varchar while the Xmlify's error columns are of type four bytes signed integer?"

It's not in fact the Xmlify's error columns that are four byte integer - it's the red error path that adds these columns and they just pass through the Xmlify component. However, I thought that you may want to use the Xmlify component without a red error path (as a result of a certain conditional split for example) and then you could populate the error column in the FailedRows table with anything you like (using a derived column :)

I agree that extra columns might be useful and would be pretty easy to add. I'll add it to the wishlist for now when I get some time...

Posted by Jamie Thomson @ 12 Jul 2006 1:07 AM
Josh,
An idea for you. How about, when you post your custom components and tasks, post a package as well that uses the task so as to demo what it does!!

By the way - I'm just checking out XMLify now. Finally!!

-Jamie

Posted by Josh @ 12 Jul 2006 1:21 AM
Thanks Jamie and good idea, I shall endeavour to do that. If I can find time I'll try and post one for the Xmlify component.

Josh

Posted by Kristian Wedberg @ 03 Sep 2006 11:00 AM
Hi,

I've tried out your spiffy XMLify task, but quickly found a showstopper. Setup & symptoms:

A script component generates records. After a few Derived Column, Conditional
Split & Union All, XMLify gets some records and transforms two columns into XML,
the output goes directly into an OLEDB destination which does a bulk insert into
a local sql server.

* Processing 20000 records without steering any through XMLify takes less than a second

* Processing 20000 records and steering 400 through XMLify takes less than a second if
DefaultBufferMaxRows=50000 (i.e. a single buffer reaches XMLify)

* Processing 20000 records and steering 400 through XMLify takes _30_ seconds, if
DefaultBufferMaxRows=10000 (i.e. multiple buffers reaches XMLify)


- Have you seen anything like this before?

- Any ideas on what's happening?


Thanks!
Kristian

Posted by Kristian Wedberg @ 03 Sep 2006 11:07 AM
Btw, I should add:

When it takes 30 seconds, it seemingly does nothing for all that time:

- No cpu or memory consumption
- No pipeline events generated

And it's the same both in VS and with dtexec.

Cheers/Kristian

Posted by Josh @ 05 Sep 2006 8:56 AM
Hi Kristian,

Thanks for your feedback. I'm on vacation at the moment but i will have a look at this when I get back. I haven't seen it before though. Thanks again for the detailed info.

Josh

Posted by Bruno Figueiredo @ 27 Mar 2008 5:21 AM
Hi Josh. Can you help me out...I need to have two input collections (like for instance the union all). I started by creating a new IDTSInput90 input2 = ComponentMetaData.InputCollection.New(); and then I went right over to the package and when I tried to use it I got an error... Can you help me out?

Posted by Josh @ 27 Mar 2008 8:31 AM
Hi Bruno,

I'm not sure I fully understand the issue. Also, it's been a while since I looked at this stuff so not sure how quickly I'll be able to respond.

Josh

Posted by RA @ 23 Apr 2008 6:56 AM
hello josh,
can you please mail me the entire code /zipped folder at this address :-- rubyengg1@yahoo.com

Regards
Ruby

Posted by Josh @ 23 Apr 2008 7:24 AM
Hi Ruby,

You can download the full source at the link above (or this url: http://www.thejoyofcode.com/uploads/XmlifyTask_source.zip).

Josh

Posted by surender @ 25 Aug 2008 6:27 AM
hai

from xml file data can be insert into table.but i have tried its given only one column in ssis xml source component...please give me an idea how to retrieve the values for total data

Posted by Todd McDermid @ 19 Jan 2009 2:02 PM
I've included your task in a directory of custom tasks and components for SSIS on CodePlex. Hopefully more people can benefit from the effort you've put in. Thanks for contributing!

Please visit the SSIS Community Tasks and Components project at http://www.codeplex.com/ssisctc.

PS: I hope you can find the time to put up your components on an open-source host like CodePlex. I think it might expose your contribution to more collaboration, and ease source/binary management issues.

Posted by CozyRoc @ 27 Jan 2010 11:11 AM
There is a third-party solution available, which can generate arbitrary type of text documents. It is called CozyRoc Template Task. The setup is similar to the creation of ASP or JSP page where you create text template and then you can connect variables, files and data flows as inputs. You can find more information about it here: http://www.cozyroc.com/ssis/template-task

Posted by Rey @ 10 Feb 2012 1:18 PM
Great inimroatfon. Looks like just what I was looking for.I would like to add the url to my user list table but not sure how to do it. I am new to coding and do not want to mess anything up. Could you email me some detailed instructions on what, where and how to do this, or direct me to more inimroatfon.Thanks

© 2005 - 2014 Josh Twist - All Rights Reserved.