Tuesday, June 30, 2009

Converting Your Script Task Into a Custom Task, Part III - Create Your UI

Spectacular.  If you've made it this far, then you've read Part One and agreed you felt too much pain from maintaining copy-and-paste Scripts.  You've said enough is enough, taken one last copy of your Script code and pasted it into Part Two's custom task harness, and massaged it to get a shiny new tool in your SSIS Control Flow toolbox.
This is the "last mile" and the place where you can change your workmanlike custom task into something that helps you get your packages designed properly, the first time, every time.  (Freebies at the bottom...)
UI Choice
Those crafty developers on the SSIS team gave us two very good ways of hanging a UI on a Task.  There's the completely custom way of creating your own Windows Forms interface - that's what I'm going to explain here.  There's also a neat way of extending built-in SSIS classes to get a very "stock" flavour to your UI.  Matt Masson does a great job detailing how to do the latter on the SSIS Team Blog: Creating a custom task with a default UI
Personally, I like the completely custom way of presenting the UI.  Despite not getting the expression editor UI included for "free," you do get a ton of control over how properties are presented and interact with each other - as they tend to do.  And really, it's not all that hard - the "hard" part is coding the UI, but if you've ever coded a Windows Forms app, that's a piece of cake.  (I'm not going to detail that here.)
The Parts of a Custom UI
In order to construct and attach a custom UI that BIDS knows how to use, you have to follow some rules.  First, you need to create a class that implements the IDtsTaskUI interface.  Second, mark up your component's class attributes to tell SSIS you've created that interface implementation.  Third, construct your UI class(es).
The Interface Class
The IDtsTaskUI interface describes a contract between your component and BIDS.  It tells BIDS how it must interact with your component at design time when the user edits your component, and how your component is expected to respond to those requests.  In order to provide an object for BIDS to communicate with using this interface, you need to define a new class that will implement your Task's handling of that interface.  This may get a little confusing - because we're talking about a C# "interface" construct here (the "contract" definition), and the class we're going to be creating to implement that is going to be called the "interface class".  I'll do my best to separate the two.
To get at the IDtsTaskUI interface definition, you have to include "Microsoft.SqlServer.Dts.Runtime.Design" in your "using" statements at the top of your code file, as well as "Microsoft.SqlServer.Dts.Runtime".  In order to get at those, you'll need to have added references to "Microsoft.SQLServer.ManagedDTS" (you should already have that reference from constructing your Task), and "Microsoft.SqlServer.Dts.Design".
The IDtsTaskUI interface requires that you implement four methods in your class: Initialize, GetView, New, and Delete.  All of them are very bare, if not empty.  The key ones are Initialize and GetView.  Initialize passes you references to the metadata of your Task so that you know what you're editing.  GetView launches your editor.
Since Initialize and GetView are called separately, the TaskHost and IServiceProvider arguments that Initialize gets passed to it aren't available in the GetView method.  And it's this GetView method that actually launches your editor.  In order to handle that, you need to store the TaskHost and IServiceProvider arguments (or parts of them) inside the interface class, and pass them on to your editor class instance you create in the GetView method.  Declare these at the top of your class:
private TaskHost _taskHost;

private Connections _connections;
Then place this inside your Initialize method:
this._taskHost = taskHost;

IDtsConnectionService cs = serviceProvider.GetService(typeof(IDtsConnectionService)) as IDtsConnectionService;
this._connections = cs.GetConnections();
And this inside your GetView method:
return new YourTaskEditor(this._taskHost, this._connections);
Take a peek at the template sample to get this spelled out exactly.
Decorating Your Task To Call Your UI
By default, the Task you created will tell BIDS that it has no editor, and will only be able to be configured using the Properties.  In order to tell BIDS that your Task has a full editor, you need to "decorate" your Task's class with attribute values.
Open your Task class and look at the DtsTask attribute that's applied to it.  You probably have a few arguments specified to that attribute, such as DisplayName and TaskType.  We need to add a new UITypeName argument, and specify what it is.  This argument is a string type, and it needs to specify the fully qualified assembly name of your interface class - this consists of five parts: Your full interface class name, the class name of the Task itself, the assembly version, culture, and public key token.  Here's how you find all that information.
The Full Interface Class Name
In your interface class, you'll have a "namespace" statement just under the "using" statements.  That's the first part of your class' full name.  Then you'll see your class' name right after the "class" statement.  Here's an example:
namespace Sample.SSIS.Namespace
{
    class YourTaskInterface : IDtsTaskUI
    {
From the above, you would get this as your full interface class name: Sample.SSIS.Namespace.YourTaskInterface
The (Short) Class Name of Your Task
This is the name of the class that you've converted your Script Task into, simple as that.  As an example:
namespace Sample.SSIS.Namespace
{
    [DtsTask(DisplayName = "MyTask", TaskType = "General")]
    class YourTask : Task
    {
From the above, you would get your Task class name as: YourTask
Assembly Version
Most likely (especially if you're working from the template), this is "1.0.0.0" - and it's typically a "bad idea" to change your assembly version - it will break (beyond automatic upgrading) any packages that used a different version of your Task.
You can find this value by opening the project properties of your Task and looking at the Application tab.  Once there, press the "Assembly Information" button, and copy the "Assembly Version" value.  You can also locate this value by navigating to the Windows GAC (Global Assembly Cache), finding your installed Task, and looking at the properties there.
Again, the most likely value is: 1.0.0.0
Culture
This part is also most likely set to "Neutral" - but you can check to make sure by looking at your Task DLL's properties in the GAC.
The most likely value is: Neutral
Public Key Token
This is the public half of the asymmetric key you used to sign your DLL.  You may have it recorded somewhere - but you can also find this easily by looking at your Task DLL's properties in the GAC.
Tasks signed with my key have this public key token: 8b0551303405e96c
All Together Now - the Fully-Qualified UI Type Name
Now that you've located all that, string it together separated by commas, like this (taken from the above example values) and add it in to the DtsTask attribute argument list:
[DtsTask(DisplayName = "MyTask", TaskType = "General",
    UITypeName = "Sample.SSIS.Namespace.YourTaskInterface,YourTask,Version=1.0.0.0,Culture=Neutral,PublicKeyToken=8b0551303405e96c")]
Your UI
This one is (almost) all up to you - use Windows Forms to design your interface.  You can get as elaborate as you desire - have multiple windows, make a "wizard" style interface, use a tabbed window - anything.  Your only requirement is to have a constructor that accepts a TaskHost and IServiceProvider (or just a Connections list, as I've done), as shown in the GetView method of the interface class.
In order to accomplish this from scratch, you need to add a reference to System.Windows.Forms, and place "System.Windows.Forms" in your using statements.  You'll also (likely) need to add "Microsoft.SqlServer.Dts.Runtime" in your using statements in order to accept and use the TaskHost and Connections arguments you're passing your constructor.
Wrapping It Up
Again, lots of windbag explanations for something that turns out to be fairly simple.  Add an argument to your DtsTask attribute, create a virtually empty "interface" class, then code your UI however you want.
Download the EmptyTaskWithUITemplate project from my SkyDrive to get started, or have a look at the multitudes of samples available on CodePlex.

Monday, June 22, 2009

Converting Your Script Task Into a Custom Task, Part II - How?

In Part One of this three-part series, we covered the background on why you might want to (need to) construct your own Custom Task in SQL Server Integration Services.  Unless you've already read that and come back here to refer to these instructions, I strongly suggest you read over Part One to put the following in context.
This post covers all the steps and issues you'll need to cover off to translate your Integration Services Script Task into a Custom Task.  I swear that with these instructions, it should take about ten minutes for you to convert your Script Task to a Custom Task.  Part Three will cover the steps required to extend your task with a Custom UI - which also isn't as difficult as you may think...
The Tools You'll Need
In order to build your own Custom Task from the contents of your Script Task, you need to know SSIS - at least enough to have coded scripts and feel the pain of maintaining them. You'll also need access to a full Visual Studio install - I believe that Standard edition is sufficient, but I use Professional, so don't take that as gospel. You'll also need to know your way around Visual Studio and the .Net Framework - but I'm assuming that if you actually coded your Script, you'll be able to get by. You'll need administrator access to your development machine - we'll be adding things to the GAC. Finally, you'll need the SQL Server Integration Services SDK installed on your development machine - it comes on the SQL install media.
Understanding The Differences
Life in the world of Custom Tasks is a barren wilderness... you can get started by reading MSDN (Books Online), and by downloading samples (see the Tasks section)... but those can be a little inaccessible with no firm guarantee of understanding what's going on, or when you'll "be done".  That's what this series aims to provide, and one of the essentials is knowing what's different in a Custom Task from a Script Task.
Code Harness
The SSIS team did a great job of building that "harness" for the Script Task - the behind the scenes code and comments that help you get started with the Script Task.  There's no such thing for a Custom Task... until now.  I've created an uploaded an "empty" Visual Studio 2008 C# project that has much of what I'm going to tell you all done, ready to get started with your itchy copy and paste fingers.  Download the EmptyTaskTemplate from my SkyDrive for the quick start.
Variable Access Syntax
One of the things your Script Task is all but guaranteed to do is reference SSIS variables.  You're likely passing in a file name, or a SQL statement, or text for an email, and might be passing some kind of a result back out.  The syntax for accessing and updating those variables is different in a Custom Task - but only slightly.
In the Script Task, you had to tell the Script Editor UI which variables you wanted access to - before you got into the script itself.  That was to enable the SSIS dev team to create that harness for you.  They found those variables using the COM API, and locked them for your use in the Script.  "Locked" meaning it informed SSIS that your Script was going to read or write to them, so it shouldn't allow any other Task to do the same at the same time - or you might get some corruption or other very weird issues cropping up.  Now that you're in a Custom Task - you need to take care of that for yourself, but it's not all that hard.
Instead of adding a variable to the ReadOnly list in the Script Task Editor UI, then referring to it via code like this:
string fileName = (string)Dts.Variables["FileName"].Value;
Dts.Variables["ModifiedDate"].Value = DateTime.Now;
You have to do a few steps (all in code), like this:
Variables vars = null// creates a repository for us to store references to the variable(s) we want
variableDispenser.LockForRead("User::FileName"); // adds a variable to the list we want SSIS to lock as ReadOnly
variableDispenser.LockForWrite("User::ModifiedDate"); // adds a variable to the list we want SSIS to lock as ReadWrite

variableDispenser.GetVariables(ref vars); // instructs SSIS to lock the variables in our list, and retrieve references to them

string fileName = (string)vars["User::FileName"].Value; // reads the value of a variable
vars["User::ModifiedDate"].Value = DateTime.Now; // writes to a variable

vars.Unlock(); // releases the variables so other Tasks can use them
There's a new object in play in the above sample code - variableDispenser.  This object is passed in to the Execute method - you'll see that later.
Connection Manager Access Syntax
Sometimes, you'll need your Task to communicate with your database in a way that SSIS might not support elegantly, so you'll decide to do so in a Script.  Hopefully, you don't hardcode the connection into your Script, and you use a Connection Manager to get at the connection.  In much the same vein as for variables, the syntax used to access those changes in a Custom Task - but locking isn't an issue here.  Instead of:
ConnectionManager cm = Dts.Connections["MyConnection"];
You'd say:
ConnectionManager cm = connections["MyConnection"];
Again, like the variables, the connections object is a parameter to the Execute method seen later.
Properties
One of the new capabilities you're sure to use in a Custom Task is the possibility of using properties.
Remember in the last article, I said that the only way a Script Task had to communicate with the package containing it was with variables?  If you wanted to make your Script reusable, you'd have to make it expect and respect directives that could only be given to it via passed in variables.  In the above example of retrieving variable values, I referred to a variable name directly - it was hardcoded into the custom task.  That's a little unrealistic - you can't expect the person using your task (even if it's you) to remember exactly what variables the task is expecting to use, and the precise spelling and case of the variable.
Custom Tasks give you a much better method of making your Task easy to configure.  Properties are drop-dead simple - just make a public property get/set pair in your class, and you're done.  The only caveat: the type of the property has to be a simple type - integer, numeric, or string.  If you want to expose an object as a property, you need to jump through a few more hoops - you'll need to provide a custom UI to edit it, and override how the Task saves its metadata to the package DTSX file.
It's quite likely that most of the variable usages in your Script Task will be replaced with properties, making configuration that much easier, and making your packages that much cleaner too.  When you have properties on your Task, you can specify the property explicitly, or use property expressions to fill it - it's up to the user of your Task.  Adding a custom UI (see part three) delivers even greater ease of use by allowing you to present the user with choices for some of your "restricted" properties - like an "action" property that only has a few acceptable values.
Validation
This concept is probably something that your Script Task never considered, but can be very important for a Custom Task.  Script Tasks I've seen in the wild rarely check that expected variables have been placed in the ReadOnly or ReadWrite lists.  Even more rare are Scripts that check to ensure that the variables are the correct data type.  Why is this?  It probably boils down to one reason - laziness.  Although it does help assuage your concience to realize you can rationalize your decision by recognizing that it wouldn't improve the experience much anyway.
I say laziness accounts for lack of validation in Script Tasks because it happens to me too.  You test your script code with the one condition you expect, and that's enough.  It's not too cumbersome to do some validation on your Script - but what value would that add to your Script Task?  If you checked for existence and data type of passed-in variables, what value could you (the coder) give to the experience?  Almost nothing except for a clearer error message at runtime.  There's absolutely nothing you can do in your Script Task to assist the user at design-time.  All of the variables and connections are specified by literal string - and the compiler can't check that those exist at design time.  Only at runtime can your Task do any checking - and by then it's too late.  Your only option is to fail the Script Task explicitly (by returning ScriptResults.Failure), or implicitly (via an unhandled exception which automatically fails the Task).  So, in the end, your laziness was prescient.  There was nothing you could do, so why try?
Fortunately for you, there is quite a bit of value you can add with a Custom Task.  First, the environment of use is different, which will probably dispel some of that laziness factor.  With the Script, you knew everything about the context that the Task would run in.  With a Custom Task, you have no idea what situations it will be placed in - so you'd better make it bulletproof.  (And you never know what morons will be using it either.  Oops - did I say that out loud?)  Add to that that you can validate your Custom Task at design time - which means that any problems with the Task configuration can be communicated to the package designer as warnings or errors before the package is run.  If you signal an error condition with your Custom Task - you'll actually prevent the package from being run entirely, just like OotB Tasks.  That's a fantastic benefit, even for tasks that you intend to use only for yourself.
The Custom Task Coding Environment
With Script Tasks, you placed your code in one method - Main().  You could construct as elaborate a script as you wanted - but SSIS called Main to execute your task, and that's the end of it.
A Custom Task inherits from the Microsoft.SqlServer.Dts.Runtime.Task, which has several overrideable methods: InitializeTask, Validate, and Execute.  Up at the top (as seen in the sample EmptyTaskTemplate) there are several attributes of the Task class that you can specify to give SSIS some valuable metadata for its toolbox.
InitializeTask
This method is rarely overridden in a Custom Task.  It can set default values for your properties (although you can also set those in the declarations), and can declare custom events for your Task.
Validate
Overriding this method is completely optional - but would be a big mistake to pass over.  This method gets called quite frequently at design time, and at least once at runtime.  It provides you (the coder) the opportunity to examine the values the package designer has set in your properties, check whatever SSIS variables and connections you have access to, and report back a status to SSIS and the user.  More importantly, you can report verbose errors and warnings using the passed in IDTSComponentEvents object.  Those warnings and errors are seen by the package designer in the output and error windows, and give invaluable information as to what they need to fix in order for your Task to operate correctly at runtime.
For example, suppose you had a string property that expected the package designer to specify an SSIS string variable.  You could check that the property was not empty, that it referenced an SSIS variable that actually existed, and that the variable was of the correct type.  Here's a sample of doing just that:
if (this._fileNameVariable == null)

{
    componentEvents.FireError(0, "", "No variable is specified for the file name.", "", 0);
    result = DTSExecResult.Failure;
}
else if (!variableDispenser.Contains(this._fileNameVariable))
{
    componentEvents.FireError(0, "", "The variable '" + this._fileNameVariable + "' does not exist.", "", 0);
    result = DTSExecResult.Failure;
}
else
{
    variableDispenser.LockForRead(this._fileNameVariable);
    Variables vars = null;
    variableDispenser.GetVariables(ref vars);
    if (vars[0].DataType != TypeCode.String)
    {
        componentEvents.FireError(0, "", "The variable '" + this._fileNameVariable + "' is not a String type.", "", 0);
        result = DTSExecResult.Failure;
    }
    vars.Unlock();
}
Instead of the task simply failing at runtime, the user gets some very informative messages as to why the task will fail, before they even try to execute it.
Execute
This is the meat of the Task - the runtime method that gets the work done.  Instead of the interaction with the outside SSIS environment being accomplished with the Dts object (as in Dts.Variables), those points of interaction are passed in as parameters to the method.
For the most part, this is the place where you'll paste the bulk of your code from the Script Task you're converting.  After you paste, you'll clean up the SSIS variable and connection manager accesses.  After that, you'll convert variable accesses into properties of the task.  And after that, you'll likely do some more work in Validate that didn't exist at all before.  Your Execute method will look a little cleaner and simpler than it did in the Script Task - hopefully meaning it's easier to debug and maintain.
Compiling, Debugging, and Distributing Your Task
Compiling
The next step is to compile your Task into a DLL.  If you've used the sample EmptyTaskTemplate project, all you should need to do is change the assembly name in the project properties, sign it with a key file, and compile.  If you've launched Visual Studio in administrator mode (in Vista/Windows 7) then the post-build event will automatically copy the DLL to the required folders and register it in the GAC.  (For those of you that run 32-bit OSes, you'll need to remove one line in that post-build event - the one that refers to "%ProgramFiles(x86)%".)
Debugging
You can now launch another instance of Visual Studio (BIDS, actually) and open an IS project.  In the toolbox, right-click and Choose Items.  Your new Task should be listed there along with all the other OotB and third party Tasks.  Drag a new instance on to your Control Flow, and design away!
If you want to debug the InitializeTask and/or Validate methods in your Task, switch to the first instance of Visual Studio, and select "Attach to Process" in the debug menu.  Pick the instance of "devenv.exe" that shows the title of your IS package.  You can now place breakpoints in your Validate and InitializeTask methods, and debug as normal.
To debug the Execute event, you need to place a breakpoint in the IS package on your Task (on the OnPreExecute event, to be precise - but that's the default).  Run your package, and when the breakpoint is hit, use the first instance of Visual Studio to attach to the instance of DtsDebugHost.exe that shows a Type of "T-SQL, Managed, ..."
Distributing
To distribute your Task to production servers or other developers' machines, you basically need to follow the steps in the post-build event of your project, or you could follow the relatively easy directions on the SSIS Team blog about creating an installer.
That's It?
Um.  Yes.  Pretty much.  If you think I missed something, please post a comment.
But "in for a penny, in for a pound," or "go big or go home!"  Read on to part three - creating a custom task UI.
And hey - if you've gotten this far and created your Custom Task - post it up on SkyDrive, CodePlex, anywhere, and let me know about it so I can add it to my directory of Custom Tasks and Components.

Monday, June 15, 2009

Converting Your Script Task Into a Custom Task, Part I - Why?

I said I'd post about this about a month ago (maybe longer) and here it is.  I constructed a lot of this content to accompany a presentation I did at DevTeach Vancouver in June '09 - it should go a long way to helping you make your Task a drag-and-drop affair on to the Control Flow surface.  (BTW - there was more content at the conference - kick yourself for missing it, and the interactivity of my session.)
I was going to post this all in one shot, but that would have been far too lengthy.  So I've decided to divide it into three parts: why you should consider a Custom Task, how to make one, and adding a custom UI to your Custom Task.  Those ought to give you all the steps to take the contents of your Script Task, turn it into a Visual Studio project to compile a custom DLL, add that to BIDS, drop it on to your Control Flow surface, and merrily design packages with reuse confidence.
Out-of-the-box Tasks
The tasks that ship with SQL Server Integration Services (sometimes referred to as those that come "out of the box") are native (unmanaged) code.  They cover a wide range of capabilities, from database commands to file system and internet service interaction.  They include flow-control tasks such as a Foreach Container and Sequence Containers to manage encapsulation and concurrency.
What's a Script Task?
Script tasks are SQL Server Integration Services' primary vehicle for making the Control Flow of SSIS packages extensible.  If what you want done isn't available out-of-the-box, they wanted to provide some method for you to still accomplish your task.  (I have heard that this type of design decision is not common among ETL tool vendors - that's a shame, IMO.) 
In SSIS 2005, the Script Task allowed you to roll your own code in Visual Basic.Net (VBA to be precise).  SSIS 2008 extended this capability by replacing the VBA engine with VSTA, permitting VB.Net or C# code in the Script.  Both versions allowed you to do almost anything a full-fledged .Net framework application could do - on top of the ability to interact with the currently running SSIS package.  You could reference external assemblies, perform complex logic, ...
BIDS (Business Intelligence Development Studio - another name for Visual Studio when skinned for BI projects) allowed you to supply .Net code to accomplish the tasks you wanted to, and provided a neat harness that removed a lot of the complexity of plumbing into the COM interop you would otherwise need to do in order to execute in an IS package and communicate with it.
What's a Custom Task, and Why Do I Care?
A Custom Task is the next step in extensibility.  It's essentially a .Net Framework DLL containing code that conforms to a predetermined interface.  If you use a lot of Script Tasks to extend your packages' capabilities, you're probably quite aware of the Script Task's Achilles heel, and are looking for some alternatives.
Why Choose One Over The Other?
If at all possible, use an OotB Task.  The tasks that come with SSIS are very reliable and accomplish their jobs well.
That said, sometimes there isn't anything in that toolbox that does the job you're looking for.  A very common tool that seemed to have missed Microsoft's cut (or vision) was a task that would detect file existence, and/or report on file properties such as creation date or modification date.  In these cases, the Script Task is the tool of choice to add those capabilities.
So what space is left for Custom Tasks?  The best answer to that is reusability.
Reusability of Scripts
"Sure, Scripts are reusable - doesn't your copy and paste work?" - Captain Obvious
Of course they do - and of course you can reuse the code in a Script Task by copying and pasting into another Script Task.  In fact, you can copy the Task itself (not just the code) and paste it.  (You will have to edit it in order to get it to recompile, though - the paste doesn't trigger that step.)
The biggest problem with that statement, Captain, is that it's reusable at a cost.  The costs?
  • Some difficulty referencing external assemblies in BIDS 2005.
  • Script Tasks are rarely pure copy-and-paste reusable.  Variable names change from package to package, and there are always other small variations.
  • Fixing bugs means fixing bugs in each Script Task you've ever used that code in... and there's no way to find out where you've used it.
  • The only way to communicate "parameters" (or customize one use from another) in your Script is to hard-code them (not very flexible) or pass in variables (which can get tiresome).
  • There's no assistance provided to the package designer using the Script as to how they should use it.  Unless they coded the script, they won't know how many or which variables to pass in or out.
The Argument To Go Custom
Pretty much the only "argument" (reason) you can have to move to a Custom Task is that you've gone the Script Task route already, and have felt the burden of maintaining that codebase.  If you're looking to make your Task run faster - don't go custom - it won't help.  If you're looking for an easier debugging or coding experience - you won't find it here.
Take the Next Step!
If you've decided that now is the time to end the insanity of maintaining a reused Script Task, there may still be one option you should consider before you pull the trigger on converting to a Custom Task.  Ivan Peev has created quite a few Custom Tasks under the umbrella of CozyRoc.  One of those Tasks is the Script Task Plus, which does away with the primary pain point of reusability.  You owe it to yourself to give it a shot - it's free for eval.
If that's not for you, or if you're just plain code-crazy and would love to give this a shot, read on to part two and find out how to actually get it done.