Thursday, July 9, 2009

Presenting at Vancouver's PASS Chapter in July

A few weeks back Richard Baumet, the leader of my local PASS chapter, asked if I was interested in presenting something on SSIS in October.  I was just putting together some thoughts on topics when he realized his scheduled presentation to the chapter in July conflicted with another engagement.  He put out the emergency call, and since that night happened to be free for me, I offered to present.  Fortunately, Richard allowed me to do my presentation on SSIS - not FILESTREAM as he had been planning.  A win-win in my books - as I know nothing of it!  Hopefully the chapter thinks so as well, and I'm not left talking to crickets.
The Topic
I'll be doing a variation on a topic I presented at DevTeach/SQLTeach in June, and that I just happened to blog about recently.  It was well-received at the conference, but I still think I tried to cover too much - talking about custom tasks and data flow components.  (That does seem to be a problem for me!)  Since I already have those resources under my belt, I thought I'd stick my neck out a little bit further on this one..
Are you developing and supporting Integration Services packages? Five bucks says you have almost identical Script Tasks littered among your packages – they’re the “quick and easy” solution to a lot of common file handling, auditing, and other needs. But just like most “quick and easy” solutions, maintenance can quickly get out of hand – there’s no nice way to version Scripts, know which Scripts are “up to date”, or even know where a particular Script is being used so that you can update it. Every time you create a new package, you find yourself copying and pasting that same Script – and then modifying code and variable access to it. There is an alternative to that mess – Custom Tasks – and it’s really not as difficult as you might think.
Todd will lead you through the process of creating a Custom Task - turning your copy-paste-then-edit-script drudgery into drag-and-drop bliss. It’s no harder than the copy-paste-edit cycle you already go through – except you only have to do it once. In addition to finding out exactly how to do it yourself, Todd is inviting attendees (actual, virtual, or otherwise) to send in their favourite Script Task to be converted during the presentation! You could walk in to work tomorrow with a drag and drop solution for your Script maintenance nightmare...
What a deal!  You come for pizza, learn some stuff, AND go away with something you can take to the (time)bank!  I'll be trying to keep the opening slides to under ten minutes - hopefully that leaves enough time to get two or more custom Script Tasks built for attendees.  They'll all be available after the presentation - definitely on my SkyDrive, and maybe even on CodePlex in the handy SSIS Custom Tasks and Components directory.
When and Where
If you're in Vancouver on July 23rd, RSVP to Richard (rbaumet "@t" shaw.ca), and come to Sierra Systems' offices at 5:30.  The raucous party is on the 25th floor of 1177 West Hastings.  Pizza and drinks are being graciously provided by Idera and Sierra Systems.  (That reminds me that I have to try Idera's SharePoint toolset - SharePoint's built in backup/restore bites.)

Saturday, July 4, 2009

Using DtsMappingSurfaceControl in your Custom SSIS Component

As one of the "wishes" I'd had when creating my custom Kimball Method SCD component for SSIS, I'd wanted to give myself the same kind of column mapping interface that Microsoft had given other components - like the Merge Join or the OLE DB Destination components.  It's a relatively elegant interface (even if somewhat prone to drag and drop misses) providing a decent view of the columns involved, and an easy way to link them up.
I had posted a question in the forums about how to go about doing that, and hadn't gotten much response until KimYda did a little more digging than I had, and came up with a gem.  The control was "hidden" away in a Microsoft assembly they used internally.  Kim didn't provide any other details, so I had to plug away at it a bit until I felt satisfied that I could reasonably use the control.  Well, I can only hope I'm doing some extra legwork because I don't have access to internal Microsoft documentation and/or object model - because if not, I truly pity the developers on the SSIS team that have to wire up this control manually.  It's way too involved, and I wrote a wrapper even though I only use it in two places!  (You'll find the link below.)
The Short Version
You'll need to add six assembly references, add the mapping control to the toolbox, add the control to your design surface, and add a tooltip control to your UI.  In the code, you'll have to add drag and drop events and handling code, and should add handling for tooltips for the all important metadata.  Last but not least is translating "mappings" to and from the control so you can persist them in your component.
If you're looking for a shortcut, you can download two C# source files from my SkyDrive.  I've extracted the wrapper from my Kimball Method SCD project, dumping some XmlSerialization code out.  The SkyDrive version ought to work, but the "latest" bugfixed/new featured version will always be in the Kimball Method SCD project source - it's in there in the "SSIS Custom Component Utilities" project, in "DtsMappingSurfaceControlWrapper.cs" and "ColumnInformation.cs" source files.  (At least, it will be once I release v1.5...)  It's quite involved (read on to see why), and you'll have to remove some code if you don't want to use my property persistence framework.
The Details
Reference The Assemblies
KimYda noted that the "Microsoft.DataWarehouse" assembly contained the controls I was looking for.  Over the course of actually making this thing work, I found five other assemblies which are essential.  The six assemblies you'll need to add as references are:
  • Microsoft.DataTransformationServices.Controls* (in the GAC)
  • Microsoft.DataTransformationServices.Design (in \Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\)
  • Microsoft.DataWarehouse (in \Program Files (x86)\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies\)
  • Microsoft.DataWarehouse.Interfaces (in \Program Files (x86)\Microsoft SQL Server\100\SDK\Assemblies\)
  • Microsoft.SqlServer.DlgGrid* (in the GAC)
  • Microsoft.SqlServer.GridControl  (in \Program Files (x86)\Microsoft SQL Server\90\Tools\Publishing\1.2\)
Four of those (non-starred) you should be able to find via the standard "add references" method.  Two of them you'll have to go to extra lengths with.  Keep in mind that I'm running Windows 7 x64 - that's why there's references in the paths above to Program Files (x86).  If you're running a 32-bit Windows, just refer to Program Files, or use your favourite file search facility.
Microsoft.DataTransformationServices.Controls is in the GAC... but you can't see it (at least I couldn't) using Windows Explorer or Windows 7's integrated file search.  You'll have to fire up an (Administrator-privileged) command line, manually navigate to C:\Windows\Assembly, and down into GAC_MSIL.  There, you'll see a folder with the name you're looking for - now you just have to (painfully) drill down in there until you uncover the DLL.  Now you'll have to copy the DLL out of that folder into another location and reference it from there - because even though you've located it, the Add Reference browse won't see subfolders in the Assembly folder like the command line will.  I copied it right out to my source folder so I wouldn't have to look very far in the future.
The same song and dance occurs with Microsoft.SqlServer.DlgGrid - it's in the GAC, but inaccessible to the Add Reference wizard.  The oddity here is that it's in the "90" subfolder, even though I don't have SQL 2005 installed.  Well, I'm sure there's a reason - but I have no clue why it's there.
Add the DtsMappingSurfaceControl To Toolbox
Open up your VStudio toolbox, right-click and "Choose Items..."  Scroll around and find DtsMappingSurfaceControl and check the box to the left.  You'll get an inspired looking gear icon in your toolbox.  (They never intended these tools for public consumption, so why waste time on icons?)
Add a DtsMappingSurfaceControl To Your UI
Drag that new control onto your design surface.  It doesn't look like much - but you should see the a scrollable pane with a "subwindow" in it titled "Available Input Columns".  Excellent - we're almost 10% of the way there... ;)
Add a Tooltip Control
I recommend you add a tooltip control to your form to supply the metadata info (column data type, length) to the user when they float the mouse over them.  It's optional, but remember how irritating the lack of that feature is on the Derived Column component!
Initialize the DtsMappingSurfaceControl
Time to get into code!  From here on, there will be a lot of use of the words "first" and "second" when referring to the column lists and parts of the surface mapping control.  The "first" table/control/grid refers to the one that shows up to the left of the surface mapping control, and the "second" table/control/grid refers to the one that shows up on the right hand side of the control.  Well, initially anyway - the user can move them around wherever they like - that's probably why the SSIS team refers to them as "first" and "second". 
When I'm exposing what code to use, I'll be mentioning "FirstMovableGridControl" and "SecondMovableGridControl" a lot.  To keep things short, when possible, I'll refer to "XMovableGridControl" when what I'm talking to applies to either of those properties/objects.
Step one, set up some properties on the control.  For my purposes, I wanted to show two tables to link columns between them, so I set:
  • TwoTables property to true
  • XMovableGridControl.GridLayout to Check_Name
  • XMovableGridControl.Text to the first and second table names, respectively.
You'll have to supply the control with the column names for the first and second tables by constructing an array of ColumnInfo structures, and passing those arrays to InitializeFirstTable and InitializeSecondTable.  It's unfortunate that the ColumnInfo structure isn't more comprehensive and contain data type information - if it did, then the control could handle tooltip events itself.
Add Drag/Drop Events and Handlers
Next, you'll have to set up the drag and drop event ecosystem.  This was the difficult part (for me) as I'd never used drag and drop extensively, AND the sub-controls involved in this composite control were not consistent.  I'll try to highlight the inconsistency as we go. 
In broad strokes, you'll need to enable dragging and dropping, add a MouseDown, MouseUp, and MouseMove for each "side" to handle the "start" of the drag, and add a DragOver and DragDrop for each side to handle the "drop" part of the drag.  Almost all of those event handlers get placed on the GridControl of the XMovableGridControl, like this:
dtsMappingSurfaceControl.FirstMovableGridControl.GridControl.MouseDown += new MouseEventHandler(YourEventHandlerHere)
I say "almost all" because the DragOver and DragDrop event handlers for the FirstMovableGridControl are tied directly to that control, not to the GridControl property of that control.  To restate, add event handlers for all the above mentioned events to the XMovableGridControl.GridControl objects except for FirstMovableGridControl's DragOver and DragDrop.  Add those event handlers directly to FirstMovableGridControl, not to its GridControl property.
Now, in order to determine if the user is dragging instead of just moving the mouse around, we'll need to keep track of where the user was when they initially clicked the mouse.  So in the MouseDown event handler, store the mouse location (supplied in the MouseEventArgs) into a class-level variable.  The MouseUp event handler should clear that variable (by setting it to Point.Empty).
The (first) complicated part occurs in the MouseMove event handler.  Here, you'll have to first determine if the mouse has moved sufficiently to initiate a drag/drop operation.  You'd do that by comparing the location you stored in the MouseDown event to the location passed in the MouseEventArgs of the MouseMove, and seeing if that wanders outside the box prescribed by SystemInformation.DragSize.
If you have moved enough with the mouse button down to start dragging, you need to know what you're dragging.  Call the XMovableGridControl.GridControl.HitTest() method to get back a HitTestInfo structure.  One of the properties on that structure is the RowIndex.  If that property is zero or greater, the drag started on a valid column (row? :) ).  If it's a negative value, the user tried to drag the column header or other part of the control.  Use the RowIndex to retrieve the column name by using the XMovableGridControl.GridControl.GetCellString method, passing the RowIndex as well as the column index for the column name.  Use XMovableGridControl.ColumnNameIndex to supply that index.  Doing all that should have retrieved the column name that the user's started to drag.  Next you need to call XMovableGridControl.DoDragDrop() to start the drag operation.
Great!  You've got the drag started, so it's time to code the event(s) to handle the drop.  The DragOver event changes the icon to the appropriate "can't drop here" or "drop here" icon.  I've chosen to set the DragEventArgs.Effect to DragDropEffects.None when being dragged over anything but the destination control.  (Watch out here for the inconsistency between First and Second.)
The important part - and second complex part in this drag/drop dance - is the destination control's DragDrop event handler.  First, extract the source column name from the drag/drop operation by calling DragEventArgs.Data.GetData(typeof(System.String)).  Second, we have to find out where the user's decided to drop the link.  Do that by interpreting the drop location using XMovableGridControl.PointToClient.  (We haven't had to interpret locations like this anywhere else before - so don't think I missed this step elsewhere.)  Then perform the same HitTest method call as you did in the MouseMove event handler - except on the receiving XMovableGridControl.  The HitTestInfo structure returned will tell you your destination column name.
Now that you have the source column name and the destination, you can instruct the control to draw the "mapping" line between those two items.  You can do that by using the DtsMappingSurfaceControl's AddConnector method.  There are two things to note with this:
  1. You need to supply the arguments to the AddConnector method in the "first", "second" order of the tables, not the "source", "destination" order of the drag/drop operation.  To restate, if the user dragged a column from the "right" to the "left", you would have to reverse the arguments to the AddConnector method.
  2. If the second table's column is involved in any other connection, the AddConnection call will (silently) fail.  You need to call the DeleteConnectors method first - but to do that, you need to know what "first" column the second column is connected to, which means you need to find it in the Connections collection.
Whew.  That was a moutful - and a handful of work for you to do.
Add Tooltip Handlers
I've set up a tooltip object in my wrapper class.  If you choose to implement tooltips for your control, there are a few issues you'll have to deal with:
  • You'll need to dynamically set the tooltip message in your MouseMove event.
  • You'll have to trigger the tooltip to display "manually", because it only displays itself automatically when you move from control to control... which doesn't include moving from row to row (column to column?) within the surface mapping control.
  • You'll have to supply the metadata information, because the surface mapping control doesn't have it.
Add Mapping Code
At this point, you should have something you can "play" with at design-time in SSIS - but you don't have any way to set the connections before you show the control, nor retrieve the connections the user has set.
To set up the connections before the control is shown, call the control's AddConnector method, passing it column names.
To retrieve the columns involved in connections after the user has used the control, iterate over the DtsSurfaceConnection objects in the Connections collection of the control.  You're interested in the SourceConnectorObject and DestinationConnectorObject.  The "source" and "destination" here do not refer to any directionality in the mapping (to my knowledge) - the "source" refers to the "first" table, and the "destination" refers to the "second".
I'm Done!
Wow - you read all of that and built your own "mapping" interface?  Congratulations are in order - or maybe we can commiserate with each other on how much punishment we're willing to inflict on ourselves in order to get this far.  You could have always taken the short-cut of downloading the source from my SkyDrive or CodePlex... I wish I could have!
Using this, I was able to turn two tabs in my component editor that looked like this:
 To something I think looks (and operates) a lot nicer:
Have fun with your component!

Tuesday, June 30, 2009

Converting Your Script Task Into a Custom Task, Part III - Create Your UI

Spectacular.  If you've made it this far, then you've read Part One and agreed you felt too much pain from maintaining copy-and-paste Scripts.  You've said enough is enough, taken one last copy of your Script code and pasted it into Part Two's custom task harness, and massaged it to get a shiny new tool in your SSIS Control Flow toolbox.
This is the "last mile" and the place where you can change your workmanlike custom task into something that helps you get your packages designed properly, the first time, every time.  (Freebies at the bottom...)
UI Choice
Those crafty developers on the SSIS team gave us two very good ways of hanging a UI on a Task.  There's the completely custom way of creating your own Windows Forms interface - that's what I'm going to explain here.  There's also a neat way of extending built-in SSIS classes to get a very "stock" flavour to your UI.  Matt Masson does a great job detailing how to do the latter on the SSIS Team Blog: Creating a custom task with a default UI
Personally, I like the completely custom way of presenting the UI.  Despite not getting the expression editor UI included for "free," you do get a ton of control over how properties are presented and interact with each other - as they tend to do.  And really, it's not all that hard - the "hard" part is coding the UI, but if you've ever coded a Windows Forms app, that's a piece of cake.  (I'm not going to detail that here.)
The Parts of a Custom UI
In order to construct and attach a custom UI that BIDS knows how to use, you have to follow some rules.  First, you need to create a class that implements the IDtsTaskUI interface.  Second, mark up your component's class attributes to tell SSIS you've created that interface implementation.  Third, construct your UI class(es).
The Interface Class
The IDtsTaskUI interface describes a contract between your component and BIDS.  It tells BIDS how it must interact with your component at design time when the user edits your component, and how your component is expected to respond to those requests.  In order to provide an object for BIDS to communicate with using this interface, you need to define a new class that will implement your Task's handling of that interface.  This may get a little confusing - because we're talking about a C# "interface" construct here (the "contract" definition), and the class we're going to be creating to implement that is going to be called the "interface class".  I'll do my best to separate the two.
To get at the IDtsTaskUI interface definition, you have to include "Microsoft.SqlServer.Dts.Runtime.Design" in your "using" statements at the top of your code file, as well as "Microsoft.SqlServer.Dts.Runtime".  In order to get at those, you'll need to have added references to "Microsoft.SQLServer.ManagedDTS" (you should already have that reference from constructing your Task), and "Microsoft.SqlServer.Dts.Design".
The IDtsTaskUI interface requires that you implement four methods in your class: Initialize, GetView, New, and Delete.  All of them are very bare, if not empty.  The key ones are Initialize and GetView.  Initialize passes you references to the metadata of your Task so that you know what you're editing.  GetView launches your editor.
Since Initialize and GetView are called separately, the TaskHost and IServiceProvider arguments that Initialize gets passed to it aren't available in the GetView method.  And it's this GetView method that actually launches your editor.  In order to handle that, you need to store the TaskHost and IServiceProvider arguments (or parts of them) inside the interface class, and pass them on to your editor class instance you create in the GetView method.  Declare these at the top of your class:
private TaskHost _taskHost;

private Connections _connections;
Then place this inside your Initialize method:
this._taskHost = taskHost;

IDtsConnectionService cs = serviceProvider.GetService(typeof(IDtsConnectionService)) as IDtsConnectionService;
this._connections = cs.GetConnections();
And this inside your GetView method:
return new YourTaskEditor(this._taskHost, this._connections);
Take a peek at the template sample to get this spelled out exactly.
Decorating Your Task To Call Your UI
By default, the Task you created will tell BIDS that it has no editor, and will only be able to be configured using the Properties.  In order to tell BIDS that your Task has a full editor, you need to "decorate" your Task's class with attribute values.
Open your Task class and look at the DtsTask attribute that's applied to it.  You probably have a few arguments specified to that attribute, such as DisplayName and TaskType.  We need to add a new UITypeName argument, and specify what it is.  This argument is a string type, and it needs to specify the fully qualified assembly name of your interface class - this consists of five parts: Your full interface class name, the class name of the Task itself, the assembly version, culture, and public key token.  Here's how you find all that information.
The Full Interface Class Name
In your interface class, you'll have a "namespace" statement just under the "using" statements.  That's the first part of your class' full name.  Then you'll see your class' name right after the "class" statement.  Here's an example:
namespace Sample.SSIS.Namespace
{
    class YourTaskInterface : IDtsTaskUI
    {
From the above, you would get this as your full interface class name: Sample.SSIS.Namespace.YourTaskInterface
The (Short) Class Name of Your Task
This is the name of the class that you've converted your Script Task into, simple as that.  As an example:
namespace Sample.SSIS.Namespace
{
    [DtsTask(DisplayName = "MyTask", TaskType = "General")]
    class YourTask : Task
    {
From the above, you would get your Task class name as: YourTask
Assembly Version
Most likely (especially if you're working from the template), this is "1.0.0.0" - and it's typically a "bad idea" to change your assembly version - it will break (beyond automatic upgrading) any packages that used a different version of your Task.
You can find this value by opening the project properties of your Task and looking at the Application tab.  Once there, press the "Assembly Information" button, and copy the "Assembly Version" value.  You can also locate this value by navigating to the Windows GAC (Global Assembly Cache), finding your installed Task, and looking at the properties there.
Again, the most likely value is: 1.0.0.0
Culture
This part is also most likely set to "Neutral" - but you can check to make sure by looking at your Task DLL's properties in the GAC.
The most likely value is: Neutral
Public Key Token
This is the public half of the asymmetric key you used to sign your DLL.  You may have it recorded somewhere - but you can also find this easily by looking at your Task DLL's properties in the GAC.
Tasks signed with my key have this public key token: 8b0551303405e96c
All Together Now - the Fully-Qualified UI Type Name
Now that you've located all that, string it together separated by commas, like this (taken from the above example values) and add it in to the DtsTask attribute argument list:
[DtsTask(DisplayName = "MyTask", TaskType = "General",
    UITypeName = "Sample.SSIS.Namespace.YourTaskInterface,YourTask,Version=1.0.0.0,Culture=Neutral,PublicKeyToken=8b0551303405e96c")]
Your UI
This one is (almost) all up to you - use Windows Forms to design your interface.  You can get as elaborate as you desire - have multiple windows, make a "wizard" style interface, use a tabbed window - anything.  Your only requirement is to have a constructor that accepts a TaskHost and IServiceProvider (or just a Connections list, as I've done), as shown in the GetView method of the interface class.
In order to accomplish this from scratch, you need to add a reference to System.Windows.Forms, and place "System.Windows.Forms" in your using statements.  You'll also (likely) need to add "Microsoft.SqlServer.Dts.Runtime" in your using statements in order to accept and use the TaskHost and Connections arguments you're passing your constructor.
Wrapping It Up
Again, lots of windbag explanations for something that turns out to be fairly simple.  Add an argument to your DtsTask attribute, create a virtually empty "interface" class, then code your UI however you want.
Download the EmptyTaskWithUITemplate project from my SkyDrive to get started, or have a look at the multitudes of samples available on CodePlex.

Monday, June 22, 2009

Converting Your Script Task Into a Custom Task, Part II - How?

In Part One of this three-part series, we covered the background on why you might want to (need to) construct your own Custom Task in SQL Server Integration Services.  Unless you've already read that and come back here to refer to these instructions, I strongly suggest you read over Part One to put the following in context.
This post covers all the steps and issues you'll need to cover off to translate your Integration Services Script Task into a Custom Task.  I swear that with these instructions, it should take about ten minutes for you to convert your Script Task to a Custom Task.  Part Three will cover the steps required to extend your task with a Custom UI - which also isn't as difficult as you may think...
The Tools You'll Need
In order to build your own Custom Task from the contents of your Script Task, you need to know SSIS - at least enough to have coded scripts and feel the pain of maintaining them. You'll also need access to a full Visual Studio install - I believe that Standard edition is sufficient, but I use Professional, so don't take that as gospel. You'll also need to know your way around Visual Studio and the .Net Framework - but I'm assuming that if you actually coded your Script, you'll be able to get by. You'll need administrator access to your development machine - we'll be adding things to the GAC. Finally, you'll need the SQL Server Integration Services SDK installed on your development machine - it comes on the SQL install media.
Understanding The Differences
Life in the world of Custom Tasks is a barren wilderness... you can get started by reading MSDN (Books Online), and by downloading samples (see the Tasks section)... but those can be a little inaccessible with no firm guarantee of understanding what's going on, or when you'll "be done".  That's what this series aims to provide, and one of the essentials is knowing what's different in a Custom Task from a Script Task.
Code Harness
The SSIS team did a great job of building that "harness" for the Script Task - the behind the scenes code and comments that help you get started with the Script Task.  There's no such thing for a Custom Task... until now.  I've created an uploaded an "empty" Visual Studio 2008 C# project that has much of what I'm going to tell you all done, ready to get started with your itchy copy and paste fingers.  Download the EmptyTaskTemplate from my SkyDrive for the quick start.
Variable Access Syntax
One of the things your Script Task is all but guaranteed to do is reference SSIS variables.  You're likely passing in a file name, or a SQL statement, or text for an email, and might be passing some kind of a result back out.  The syntax for accessing and updating those variables is different in a Custom Task - but only slightly.
In the Script Task, you had to tell the Script Editor UI which variables you wanted access to - before you got into the script itself.  That was to enable the SSIS dev team to create that harness for you.  They found those variables using the COM API, and locked them for your use in the Script.  "Locked" meaning it informed SSIS that your Script was going to read or write to them, so it shouldn't allow any other Task to do the same at the same time - or you might get some corruption or other very weird issues cropping up.  Now that you're in a Custom Task - you need to take care of that for yourself, but it's not all that hard.
Instead of adding a variable to the ReadOnly list in the Script Task Editor UI, then referring to it via code like this:
string fileName = (string)Dts.Variables["FileName"].Value;
Dts.Variables["ModifiedDate"].Value = DateTime.Now;
You have to do a few steps (all in code), like this:
Variables vars = null// creates a repository for us to store references to the variable(s) we want
variableDispenser.LockForRead("User::FileName"); // adds a variable to the list we want SSIS to lock as ReadOnly
variableDispenser.LockForWrite("User::ModifiedDate"); // adds a variable to the list we want SSIS to lock as ReadWrite

variableDispenser.GetVariables(ref vars); // instructs SSIS to lock the variables in our list, and retrieve references to them

string fileName = (string)vars["User::FileName"].Value; // reads the value of a variable
vars["User::ModifiedDate"].Value = DateTime.Now; // writes to a variable

vars.Unlock(); // releases the variables so other Tasks can use them
There's a new object in play in the above sample code - variableDispenser.  This object is passed in to the Execute method - you'll see that later.
Connection Manager Access Syntax
Sometimes, you'll need your Task to communicate with your database in a way that SSIS might not support elegantly, so you'll decide to do so in a Script.  Hopefully, you don't hardcode the connection into your Script, and you use a Connection Manager to get at the connection.  In much the same vein as for variables, the syntax used to access those changes in a Custom Task - but locking isn't an issue here.  Instead of:
ConnectionManager cm = Dts.Connections["MyConnection"];
You'd say:
ConnectionManager cm = connections["MyConnection"];
Again, like the variables, the connections object is a parameter to the Execute method seen later.
Properties
One of the new capabilities you're sure to use in a Custom Task is the possibility of using properties.
Remember in the last article, I said that the only way a Script Task had to communicate with the package containing it was with variables?  If you wanted to make your Script reusable, you'd have to make it expect and respect directives that could only be given to it via passed in variables.  In the above example of retrieving variable values, I referred to a variable name directly - it was hardcoded into the custom task.  That's a little unrealistic - you can't expect the person using your task (even if it's you) to remember exactly what variables the task is expecting to use, and the precise spelling and case of the variable.
Custom Tasks give you a much better method of making your Task easy to configure.  Properties are drop-dead simple - just make a public property get/set pair in your class, and you're done.  The only caveat: the type of the property has to be a simple type - integer, numeric, or string.  If you want to expose an object as a property, you need to jump through a few more hoops - you'll need to provide a custom UI to edit it, and override how the Task saves its metadata to the package DTSX file.
It's quite likely that most of the variable usages in your Script Task will be replaced with properties, making configuration that much easier, and making your packages that much cleaner too.  When you have properties on your Task, you can specify the property explicitly, or use property expressions to fill it - it's up to the user of your Task.  Adding a custom UI (see part three) delivers even greater ease of use by allowing you to present the user with choices for some of your "restricted" properties - like an "action" property that only has a few acceptable values.
Validation
This concept is probably something that your Script Task never considered, but can be very important for a Custom Task.  Script Tasks I've seen in the wild rarely check that expected variables have been placed in the ReadOnly or ReadWrite lists.  Even more rare are Scripts that check to ensure that the variables are the correct data type.  Why is this?  It probably boils down to one reason - laziness.  Although it does help assuage your concience to realize you can rationalize your decision by recognizing that it wouldn't improve the experience much anyway.
I say laziness accounts for lack of validation in Script Tasks because it happens to me too.  You test your script code with the one condition you expect, and that's enough.  It's not too cumbersome to do some validation on your Script - but what value would that add to your Script Task?  If you checked for existence and data type of passed-in variables, what value could you (the coder) give to the experience?  Almost nothing except for a clearer error message at runtime.  There's absolutely nothing you can do in your Script Task to assist the user at design-time.  All of the variables and connections are specified by literal string - and the compiler can't check that those exist at design time.  Only at runtime can your Task do any checking - and by then it's too late.  Your only option is to fail the Script Task explicitly (by returning ScriptResults.Failure), or implicitly (via an unhandled exception which automatically fails the Task).  So, in the end, your laziness was prescient.  There was nothing you could do, so why try?
Fortunately for you, there is quite a bit of value you can add with a Custom Task.  First, the environment of use is different, which will probably dispel some of that laziness factor.  With the Script, you knew everything about the context that the Task would run in.  With a Custom Task, you have no idea what situations it will be placed in - so you'd better make it bulletproof.  (And you never know what morons will be using it either.  Oops - did I say that out loud?)  Add to that that you can validate your Custom Task at design time - which means that any problems with the Task configuration can be communicated to the package designer as warnings or errors before the package is run.  If you signal an error condition with your Custom Task - you'll actually prevent the package from being run entirely, just like OotB Tasks.  That's a fantastic benefit, even for tasks that you intend to use only for yourself.
The Custom Task Coding Environment
With Script Tasks, you placed your code in one method - Main().  You could construct as elaborate a script as you wanted - but SSIS called Main to execute your task, and that's the end of it.
A Custom Task inherits from the Microsoft.SqlServer.Dts.Runtime.Task, which has several overrideable methods: InitializeTask, Validate, and Execute.  Up at the top (as seen in the sample EmptyTaskTemplate) there are several attributes of the Task class that you can specify to give SSIS some valuable metadata for its toolbox.
InitializeTask
This method is rarely overridden in a Custom Task.  It can set default values for your properties (although you can also set those in the declarations), and can declare custom events for your Task.
Validate
Overriding this method is completely optional - but would be a big mistake to pass over.  This method gets called quite frequently at design time, and at least once at runtime.  It provides you (the coder) the opportunity to examine the values the package designer has set in your properties, check whatever SSIS variables and connections you have access to, and report back a status to SSIS and the user.  More importantly, you can report verbose errors and warnings using the passed in IDTSComponentEvents object.  Those warnings and errors are seen by the package designer in the output and error windows, and give invaluable information as to what they need to fix in order for your Task to operate correctly at runtime.
For example, suppose you had a string property that expected the package designer to specify an SSIS string variable.  You could check that the property was not empty, that it referenced an SSIS variable that actually existed, and that the variable was of the correct type.  Here's a sample of doing just that:
if (this._fileNameVariable == null)

{
    componentEvents.FireError(0, "", "No variable is specified for the file name.", "", 0);
    result = DTSExecResult.Failure;
}
else if (!variableDispenser.Contains(this._fileNameVariable))
{
    componentEvents.FireError(0, "", "The variable '" + this._fileNameVariable + "' does not exist.", "", 0);
    result = DTSExecResult.Failure;
}
else
{
    variableDispenser.LockForRead(this._fileNameVariable);
    Variables vars = null;
    variableDispenser.GetVariables(ref vars);
    if (vars[0].DataType != TypeCode.String)
    {
        componentEvents.FireError(0, "", "The variable '" + this._fileNameVariable + "' is not a String type.", "", 0);
        result = DTSExecResult.Failure;
    }
    vars.Unlock();
}
Instead of the task simply failing at runtime, the user gets some very informative messages as to why the task will fail, before they even try to execute it.
Execute
This is the meat of the Task - the runtime method that gets the work done.  Instead of the interaction with the outside SSIS environment being accomplished with the Dts object (as in Dts.Variables), those points of interaction are passed in as parameters to the method.
For the most part, this is the place where you'll paste the bulk of your code from the Script Task you're converting.  After you paste, you'll clean up the SSIS variable and connection manager accesses.  After that, you'll convert variable accesses into properties of the task.  And after that, you'll likely do some more work in Validate that didn't exist at all before.  Your Execute method will look a little cleaner and simpler than it did in the Script Task - hopefully meaning it's easier to debug and maintain.
Compiling, Debugging, and Distributing Your Task
Compiling
The next step is to compile your Task into a DLL.  If you've used the sample EmptyTaskTemplate project, all you should need to do is change the assembly name in the project properties, sign it with a key file, and compile.  If you've launched Visual Studio in administrator mode (in Vista/Windows 7) then the post-build event will automatically copy the DLL to the required folders and register it in the GAC.  (For those of you that run 32-bit OSes, you'll need to remove one line in that post-build event - the one that refers to "%ProgramFiles(x86)%".)
Debugging
You can now launch another instance of Visual Studio (BIDS, actually) and open an IS project.  In the toolbox, right-click and Choose Items.  Your new Task should be listed there along with all the other OotB and third party Tasks.  Drag a new instance on to your Control Flow, and design away!
If you want to debug the InitializeTask and/or Validate methods in your Task, switch to the first instance of Visual Studio, and select "Attach to Process" in the debug menu.  Pick the instance of "devenv.exe" that shows the title of your IS package.  You can now place breakpoints in your Validate and InitializeTask methods, and debug as normal.
To debug the Execute event, you need to place a breakpoint in the IS package on your Task (on the OnPreExecute event, to be precise - but that's the default).  Run your package, and when the breakpoint is hit, use the first instance of Visual Studio to attach to the instance of DtsDebugHost.exe that shows a Type of "T-SQL, Managed, ..."
Distributing
To distribute your Task to production servers or other developers' machines, you basically need to follow the steps in the post-build event of your project, or you could follow the relatively easy directions on the SSIS Team blog about creating an installer.
That's It?
Um.  Yes.  Pretty much.  If you think I missed something, please post a comment.
But "in for a penny, in for a pound," or "go big or go home!"  Read on to part three - creating a custom task UI.

Monday, June 15, 2009

Converting Your Script Task Into a Custom Task, Part I - Why?

I said I'd post about this about a month ago (maybe longer) and here it is.  I constructed a lot of this content to accompany a presentation I did at DevTeach Vancouver in June '09 - it should go a long way to helping you make your Task a drag-and-drop affair on to the Control Flow surface.  (BTW - there was more content at the conference - kick yourself for missing it, and the interactivity of my session.)
I was going to post this all in one shot, but that would have been far too lengthy.  So I've decided to divide it into three parts: why you should consider a Custom Task, how to make one, and adding a custom UI to your Custom Task.  Those ought to give you all the steps to take the contents of your Script Task, turn it into a Visual Studio project to compile a custom DLL, add that to BIDS, drop it on to your Control Flow surface, and merrily design packages with reuse confidence.
Out-of-the-box Tasks
The tasks that ship with SQL Server Integration Services (sometimes referred to as those that come "out of the box") are native (unmanaged) code.  They cover a wide range of capabilities, from database commands to file system and internet service interaction.  They include flow-control tasks such as a Foreach Container and Sequence Containers to manage encapsulation and concurrency.
What's a Script Task?
Script tasks are SQL Server Integration Services' primary vehicle for making the Control Flow of SSIS packages extensible.  If what you want done isn't available out-of-the-box, they wanted to provide some method for you to still accomplish your task.  (I have heard that this type of design decision is not common among ETL tool vendors - that's a shame, IMO.) 
In SSIS 2005, the Script Task allowed you to roll your own code in Visual Basic.Net (VBA to be precise).  SSIS 2008 extended this capability by replacing the VBA engine with VSTA, permitting VB.Net or C# code in the Script.  Both versions allowed you to do almost anything a full-fledged .Net framework application could do - on top of the ability to interact with the currently running SSIS package.  You could reference external assemblies, perform complex logic, ...
BIDS (Business Intelligence Development Studio - another name for Visual Studio when skinned for BI projects) allowed you to supply .Net code to accomplish the tasks you wanted to, and provided a neat harness that removed a lot of the complexity of plumbing into the COM interop you would otherwise need to do in order to execute in an IS package and communicate with it.
What's a Custom Task, and Why Do I Care?
A Custom Task is the next step in extensibility.  It's essentially a .Net Framework DLL containing code that conforms to a predetermined interface.  If you use a lot of Script Tasks to extend your packages' capabilities, you're probably quite aware of the Script Task's Achilles heel, and are looking for some alternatives.
Why Choose One Over The Other?
If at all possible, use an OotB Task.  The tasks that come with SSIS are very reliable and accomplish their jobs well.
That said, sometimes there isn't anything in that toolbox that does the job you're looking for.  A very common tool that seemed to have missed Microsoft's cut (or vision) was a task that would detect file existence, and/or report on file properties such as creation date or modification date.  In these cases, the Script Task is the tool of choice to add those capabilities.
So what space is left for Custom Tasks?  The best answer to that is reusability.
Reusability of Scripts
"Sure, Scripts are reusable - doesn't your copy and paste work?" - Captain Obvious
Of course they do - and of course you can reuse the code in a Script Task by copying and pasting into another Script Task.  In fact, you can copy the Task itself (not just the code) and paste it.  (You will have to edit it in order to get it to recompile, though - the paste doesn't trigger that step.)
The biggest problem with that statement, Captain, is that it's reusable at a cost.  The costs?
  • Some difficulty referencing external assemblies in BIDS 2005.
  • Script Tasks are rarely pure copy-and-paste reusable.  Variable names change from package to package, and there are always other small variations.
  • Fixing bugs means fixing bugs in each Script Task you've ever used that code in... and there's no way to find out where you've used it.
  • The only way to communicate "parameters" (or customize one use from another) in your Script is to hard-code them (not very flexible) or pass in variables (which can get tiresome).
  • There's no assistance provided to the package designer using the Script as to how they should use it.  Unless they coded the script, they won't know how many or which variables to pass in or out.
The Argument To Go Custom
Pretty much the only "argument" (reason) you can have to move to a Custom Task is that you've gone the Script Task route already, and have felt the burden of maintaining that codebase.  If you're looking to make your Task run faster - don't go custom - it won't help.  If you're looking for an easier debugging or coding experience - you won't find it here.
Take the Next Step!
If you've decided that now is the time to end the insanity of maintaining a reused Script Task, there may still be one option you should consider before you pull the trigger on converting to a Custom Task.  Ivan Peev has created quite a few Custom Tasks under the umbrella of CozyRoc.  One of those Tasks is the Script Task Plus, which does away with the primary pain point of reusability.  You owe it to yourself to give it a shot - it's free for eval.
If that's not for you, or if you're just plain code-crazy and would love to give this a shot, read on to part two and find out how to actually get it done.