Geeks With Blogs

Arthur Zubarev Compudicted

Microsoft has just released a new SSIS 2008 transform component called SQL Server Integration Services Balanced Data Distributor.

As the download site states “this transform takes a single input and distributes the incoming rows to one or more outputs uniformly via multithreading”.

Sounds like we got a component to evenly distribute some data across multiple destinations. Frankly, I though we could do this even before using either a Script Component or the Conditional Split, alas this component has a different logic.

Unlike anything before, after a quick download and an interaction with a license agreement Word document I get a link to the actual component download, install and I am finally ready to take a closer look at it after I added it to my Data Flow Destination items.

In my experiments I decided to re-use one of my relatively simple sandbox SSIS packages. As a side note, this component works with SSIS 2008 and SSIS 2008 R2 versions only.

The properties of this new component look rather spartan and do not offer customizations of any kind other than its name, internal metadata validation and description. I was a bit surprised to see such a limited set of properties exposed.

I decided to capture some data from the Adventure Works database and distribute it to several flat files. Balanced Data Distributor to the rescue!

Configuring this transform is a simple drag and drop one source output to a destination then another becomes available, simple!

Next I loaded a subset of data (rows-wise) from the Adventure Works database Sales.OrderDetail table into my sandbox table and fired the package. I got a result that I almost expected:

Buffer10K

Well… I did not expect to see so few rows in the left file. The a question popped up in my head: what will happen if I reduce the number of rows even further?

Here is the result:

Buffer9K

No rows were loaded into the 2nd destination!

This made me realize that the component is using an internal buffer of 10,000 rows and it is pre-set. No property page to override it. Changing the row width does not seem to affect this behavior either. At least in my experiments.

So the conclusion is component has some pre-defined logic how to distribute rows.

As a closure, a word of caution, making this component input data onto the same flat file destination does not make sense because the file will be locked and the package will fail after a warning message: “[… File Destination [nnn]] Warning: The process cannot access the file because it is being used by another process.“

Related: if you want to see how this new component performs in terms of speed you can visit Boyan Penev’s post here.

Posted on Saturday, May 21, 2011 12:25 AM | Back to top


Comments on this post: Playing Around With the New SSIS Balanced Data Distributor Transform Component

# re: Playing Around With the New SSIS Balanced Data Distributor Transform Component
Requesting Gravatar...
You actually need to change the DefaultBufferSize on the Dataflow - reduce it so that it is full on Ole DB Source and sends to distributor many chunks to send to different paths.

Reduce DefaultBufferSize
Left by Kgopotso on Feb 25, 2014 8:30 AM

# re: Playing Around With the New SSIS Balanced Data Distributor Transform Component
Requesting Gravatar...
This is absolutely correct Kgopotso.

Thank you for your comment.
Left by Arthur Zubarev on Feb 25, 2014 3:03 PM

Your comment:
 (will show your gravatar)


Copyright © Compudicted | Powered by: GeeksWithBlogs.net