In Part One of this three-part series, we covered the background on why you might want to (need to) construct your own Custom Task in SQL Server Integration Services. Unless you've already read that and come back here to refer to these instructions, I strongly suggest you read over Part One to put the following in context.
This post covers all the steps and issues you'll need to cover off to translate your Integration Services Script Task into a Custom Task. I swear that with these instructions, it should take about ten minutes for you to convert your Script Task to a Custom Task. Part Three will cover the steps required to extend your task with a Custom UI - which also isn't as difficult as you may think...
The Tools You'll Need
In order to build your own Custom Task from the contents of your Script Task, you need to know SSIS - at least enough to have coded scripts and feel the pain of maintaining them. You'll also need access to a full Visual Studio install - I believe that Standard edition is sufficient, but I use Professional, so don't take that as gospel. You'll also need to know your way around Visual Studio and the .Net Framework - but I'm assuming that if you actually coded your Script, you'll be able to get by. You'll need administrator access to your development machine - we'll be adding things to the GAC. Finally, you'll need the SQL Server Integration Services SDK installed on your development machine - it comes on the SQL install media.
Understanding The Differences
Life in the world of Custom Tasks is a barren wilderness... you can get started by reading MSDN (Books Online), and by downloading samples (see the Tasks section)... but those can be a little inaccessible with no firm guarantee of understanding what's going on, or when you'll "be done". That's what this series aims to provide, and one of the essentials is knowing what's different in a Custom Task from a Script Task.
Code Harness
The SSIS team did a great job of building that "harness" for the Script Task - the behind the scenes code and comments that help you get started with the Script Task. There's no such thing for a Custom Task... until now. I've created an uploaded an "empty" Visual Studio 2008 C# project that has much of what I'm going to tell you all done, ready to get started with your itchy copy and paste fingers. Download the EmptyTaskTemplate from my SkyDrive for the quick start.
Variable Access Syntax
One of the things your Script Task is all but guaranteed to do is reference SSIS variables. You're likely passing in a file name, or a SQL statement, or text for an email, and might be passing some kind of a result back out. The syntax for accessing and updating those variables is different in a Custom Task - but only slightly.
In the Script Task, you had to tell the Script Editor UI which variables you wanted access to - before you got into the script itself. That was to enable the SSIS dev team to create that harness for you. They found those variables using the COM API, and locked them for your use in the Script. "Locked" meaning it informed SSIS that your Script was going to read or write to them, so it shouldn't allow any other Task to do the same at the same time - or you might get some corruption or other very weird issues cropping up. Now that you're in a Custom Task - you need to take care of that for yourself, but it's not all that hard.
Instead of adding a variable to the ReadOnly list in the Script Task Editor UI, then referring to it via code like this:
string fileName = (string)Dts.Variables["FileName"].Value;
Dts.Variables["ModifiedDate"].Value = DateTime.Now;
You have to do a few steps (all in code), like this:
Variables vars = null; // creates a repository for us to store references to the variable(s) we want
variableDispenser.LockForRead("User::FileName"); // adds a variable to the list we want SSIS to lock as ReadOnly
variableDispenser.LockForWrite("User::ModifiedDate"); // adds a variable to the list we want SSIS to lock as ReadWrite
variableDispenser.GetVariables(ref vars); // instructs SSIS to lock the variables in our list, and retrieve references to them
string fileName = (string)vars["User::FileName"].Value; // reads the value of a variable
vars["User::ModifiedDate"].Value = DateTime.Now; // writes to a variable
vars.Unlock(); // releases the variables so other Tasks can use them
There's a new object in play in the above sample code - variableDispenser. This object is passed in to the Execute method - you'll see that later.
Connection Manager Access Syntax
Sometimes, you'll need your Task to communicate with your database in a way that SSIS might not support elegantly, so you'll decide to do so in a Script. Hopefully, you don't hardcode the connection into your Script, and you use a Connection Manager to get at the connection. In much the same vein as for variables, the syntax used to access those changes in a Custom Task - but locking isn't an issue here. Instead of:
ConnectionManager cm = Dts.Connections["MyConnection"];
You'd say:
ConnectionManager cm = connections["MyConnection"];
Again, like the variables, the connections object is a parameter to the Execute method seen later.
Properties
One of the new capabilities you're sure to use in a Custom Task is the possibility of using properties.
Remember in the last article, I said that the only way a Script Task had to communicate with the package containing it was with variables? If you wanted to make your Script reusable, you'd have to make it expect and respect directives that could only be given to it via passed in variables. In the above example of retrieving variable values, I referred to a variable name directly - it was hardcoded into the custom task. That's a little unrealistic - you can't expect the person using your task (even if it's you) to remember exactly what variables the task is expecting to use, and the precise spelling and case of the variable.
Custom Tasks give you a much better method of making your Task easy to configure. Properties are drop-dead simple - just make a public property get/set pair in your class, and you're done. The only caveat: the type of the property has to be a simple type - integer, numeric, or string. If you want to expose an object as a property, you need to jump through a few more hoops - you'll need to provide a custom UI to edit it, and override how the Task saves its metadata to the package DTSX file.
It's quite likely that most of the variable usages in your Script Task will be replaced with properties, making configuration that much easier, and making your packages that much cleaner too. When you have properties on your Task, you can specify the property explicitly, or use property expressions to fill it - it's up to the user of your Task. Adding a custom UI (see part three) delivers even greater ease of use by allowing you to present the user with choices for some of your "restricted" properties - like an "action" property that only has a few acceptable values.
Validation
This concept is probably something that your Script Task never considered, but can be very important for a Custom Task. Script Tasks I've seen in the wild rarely check that expected variables have been placed in the ReadOnly or ReadWrite lists. Even more rare are Scripts that check to ensure that the variables are the correct data type. Why is this? It probably boils down to one reason - laziness. Although it does help assuage your concience to realize you can rationalize your decision by recognizing that it wouldn't improve the experience much anyway.
I say laziness accounts for lack of validation in Script Tasks because it happens to me too. You test your script code with the one condition you expect, and that's enough. It's not too cumbersome to do some validation on your Script - but what value would that add to your Script Task? If you checked for existence and data type of passed-in variables, what value could you (the coder) give to the experience? Almost nothing except for a clearer error message at runtime. There's absolutely nothing you can do in your Script Task to assist the user at design-time. All of the variables and connections are specified by literal string - and the compiler can't check that those exist at design time. Only at runtime can your Task do any checking - and by then it's too late. Your only option is to fail the Script Task explicitly (by returning ScriptResults.Failure), or implicitly (via an unhandled exception which automatically fails the Task). So, in the end, your laziness was prescient. There was nothing you could do, so why try?
Fortunately for you, there is quite a bit of value you can add with a Custom Task. First, the environment of use is different, which will probably dispel some of that laziness factor. With the Script, you knew everything about the context that the Task would run in. With a Custom Task, you have no idea what situations it will be placed in - so you'd better make it bulletproof. (And you never know what morons will be using it either. Oops - did I say that out loud?) Add to that that you can validate your Custom Task at design time - which means that any problems with the Task configuration can be communicated to the package designer as warnings or errors before the package is run. If you signal an error condition with your Custom Task - you'll actually prevent the package from being run entirely, just like OotB Tasks. That's a fantastic benefit, even for tasks that you intend to use only for yourself.
The Custom Task Coding Environment
With Script Tasks, you placed your code in one method - Main(). You could construct as elaborate a script as you wanted - but SSIS called Main to execute your task, and that's the end of it.
A Custom Task inherits from the Microsoft.SqlServer.Dts.Runtime.Task, which has several overrideable methods: InitializeTask, Validate, and Execute. Up at the top (as seen in the sample EmptyTaskTemplate) there are several attributes of the Task class that you can specify to give SSIS some valuable metadata for its toolbox.
InitializeTask
This method is rarely overridden in a Custom Task. It can set default values for your properties (although you can also set those in the declarations), and can declare custom events for your Task.
Validate
Overriding this method is completely optional - but would be a big mistake to pass over. This method gets called quite frequently at design time, and at least once at runtime. It provides you (the coder) the opportunity to examine the values the package designer has set in your properties, check whatever SSIS variables and connections you have access to, and report back a status to SSIS and the user. More importantly, you can report verbose errors and warnings using the passed in IDTSComponentEvents object. Those warnings and errors are seen by the package designer in the output and error windows, and give invaluable information as to what they need to fix in order for your Task to operate correctly at runtime.
For example, suppose you had a string property that expected the package designer to specify an SSIS string variable. You could check that the property was not empty, that it referenced an SSIS variable that actually existed, and that the variable was of the correct type. Here's a sample of doing just that:
if (this._fileNameVariable == null)
{
componentEvents.FireError(0, "", "No variable is specified for the file name.", "", 0);
result = DTSExecResult.Failure;
}
else if (!variableDispenser.Contains(this._fileNameVariable))
{
componentEvents.FireError(0, "", "The variable '" + this._fileNameVariable + "' does not exist.", "", 0);
result = DTSExecResult.Failure;
}
else
{
variableDispenser.LockForRead(this._fileNameVariable);
Variables vars = null;
variableDispenser.GetVariables(ref vars);
if (vars[0].DataType != TypeCode.String)
{
componentEvents.FireError(0, "", "The variable '" + this._fileNameVariable + "' is not a String type.", "", 0);
result = DTSExecResult.Failure;
}
vars.Unlock();
}
Instead of the task simply failing at runtime, the user gets some very informative messages as to why the task will fail, before they even try to execute it.
Execute
This is the meat of the Task - the runtime method that gets the work done. Instead of the interaction with the outside SSIS environment being accomplished with the Dts object (as in Dts.Variables), those points of interaction are passed in as parameters to the method.
For the most part, this is the place where you'll paste the bulk of your code from the Script Task you're converting. After you paste, you'll clean up the SSIS variable and connection manager accesses. After that, you'll convert variable accesses into properties of the task. And after that, you'll likely do some more work in Validate that didn't exist at all before. Your Execute method will look a little cleaner and simpler than it did in the Script Task - hopefully meaning it's easier to debug and maintain.
Compiling, Debugging, and Distributing Your Task
Compiling
The next step is to compile your Task into a DLL. If you've used the sample EmptyTaskTemplate project, all you should need to do is change the assembly name in the project properties, sign it with a key file, and compile. If you've launched Visual Studio in administrator mode (in Vista/Windows 7) then the post-build event will automatically copy the DLL to the required folders and register it in the GAC. (For those of you that run 32-bit OSes, you'll need to remove one line in that post-build event - the one that refers to "%ProgramFiles(x86)%".)
Debugging
You can now launch another instance of Visual Studio (BIDS, actually) and open an IS project. In the toolbox, right-click and Choose Items. Your new Task should be listed there along with all the other OotB and third party Tasks. Drag a new instance on to your Control Flow, and design away!
If you want to debug the InitializeTask and/or Validate methods in your Task, switch to the first instance of Visual Studio, and select "Attach to Process" in the debug menu. Pick the instance of "devenv.exe" that shows the title of your IS package. You can now place breakpoints in your Validate and InitializeTask methods, and debug as normal.
To debug the Execute event, you need to place a breakpoint in the IS package on your Task (on the OnPreExecute event, to be precise - but that's the default). Run your package, and when the breakpoint is hit, use the first instance of Visual Studio to attach to the instance of DtsDebugHost.exe that shows a Type of "T-SQL, Managed, ..."
Distributing
To distribute your Task to production servers or other developers' machines, you basically need to follow the steps in the post-build event of your project, or you could follow the relatively easy directions on the SSIS Team blog about creating an installer.
That's It?
Um. Yes. Pretty much. If you think I missed something, please post a comment.
But "in for a penny, in for a pound," or "go big or go home!" Read on to part three - creating a custom task UI.
And hey - if you've gotten this far and created your Custom Task - post it up on SkyDrive, CodePlex, anywhere, and let me know about it so I can add it to my directory of Custom Tasks and Components.
Hi Todd, it's a very informative and niche article on Custom Components. Keep up the good work :)
ReplyDeleteSiddharth Mehta