Friday, February 12, 2010

My CodePlex Wishlist

Now that I've finally gotten version 1.5 of the Kimball Method Slowly Changing Dimension custom component for SQL Server Integration Services published, there will surely be some bugs and requests that need addressing.
The project is hosted on Microsoft's open source project site, CodePlex.  Now, I'm not an OSS veteran, so I can't compare CodePlex to SourceForge or any other OSS hosting site, because I've never used them.  To date, I'm very pleased with the CodePlex experience - the project pages are easy to set up and edit, the discussions and issues work well, the releases (downloads) function effectively, the source control was pretty easy to set up, and the statistics are neat.  That said, there's always room for improvement, and I'm glad that the CodePlex team has a feedback mechanism set up to process it, and follows an agile process to work through the suggestions.  For example, they've just added Mercurial support to CodePlex - which looks like a very cool thing for larger development teams.
Even so, there are a few "project management" things that I wish were easier for me to accomplish, so I could spend more time coding, and less time managing the project.  I've filed a few of them in the CodePlex issues list - and if you would like to help me (and, I think, other developers who use CodePlex too) then please sign on to CodePlex and vote up the suggestions below.  The CodePlex team has recently knocked off some "big" items that had a lot of votes - so much so that my requests are very near the top of the pile (when ordered by votes) - help me to push them to the top!
Release Management and Issue Reporting
To me, Issues and Releases go hand-in-hand.  Issues are bugs and feature requests that occur in one particular release and are addressed in another (or a revised) release.
Two things drive me bananas with bug reports in particular.  First, there's no facility for the person reporting the issue to indicate which version they observed the issue in, aside from the comments.  And we all know that users (yes, even I) don't always remember to include this essential piece of information in their bug report.  As you all probably know, it's a little hard to diagnose a problem when you don't know what version of the application the user is having an issue with!  The second thing is that I would really like to ask the user which version of SSIS, Windows, and bitness they're running.  Again, very key information that I would like to prompt from the user without having to hope they remember it being important.
The other side of the coin is my ability to report progress back to the people who filed the reports, or others who may be watching them.  There's the ability to indicate the issue is "fixed", and even to "close" the issue.  Fantastic.  But does that tell anybody anything useful?  What release did the fix make it into?  Does the release page show this issue as being addressed?  Does the release page show open issues related to this version?  No such luck - all that extremely valuable information has to be manually maintained by me. 
It would be great if the issue system would record a little more information (such as "the issue applies to this version" and "the issue is addressed in this version"), and then automatically mark up the release pages of both of those releases (which may be the same release!) with "known issues" and "resolved issues" sections.  CodePlex did just release the ability to tie source code changesets to releases - but that applies more to devs trying to figure things out - and it still doesn't tell them if an issue is actually fixed in the code or not.
If you like those ideas - or just want to make my life easier - see these issues on CodePlex:
Add More Properties to Issues - "Applies To Release"
Automatically Annotate Releases Based on Linked Issues
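To make the two suggestions above a bit more concrete, here is a minimal Python sketch of the annotation logic I have in mind. Nothing in it reflects CodePlex's actual data model or API: the Issue fields (applies_to, fixed_in), the annotate_releases helper, and the sample issues and version numbers are all hypothetical, made up purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Issue:
    title: str
    applies_to: str                 # hypothetical field: release the reporter saw the issue in
    fixed_in: Optional[str] = None  # hypothetical field: release that carries the fix, if any


def annotate_releases(issues):
    """Group issue titles into per-release 'known issues' and 'resolved issues' lists."""
    notes = defaultdict(lambda: {"known issues": [], "resolved issues": []})
    for issue in issues:
        # Every issue is a known issue of the release it was observed in.
        notes[issue.applies_to]["known issues"].append(issue.title)
        # A fixed issue is also a resolved issue of the release that fixed it
        # (which may be the same release, if that release was revised).
        if issue.fixed_in is not None:
            notes[issue.fixed_in]["resolved issues"].append(issue.title)
    return dict(notes)


# Made-up sample data, not real issues from any project.
issues = [
    Issue("Example bug A", applies_to="1.4", fixed_in="1.5"),
    Issue("Example bug B", applies_to="1.5"),
    Issue("Example bug C", applies_to="1.5", fixed_in="1.5"),
]
release_notes = annotate_releases(issues)
for release in sorted(release_notes):
    print(release, release_notes[release])
```

The point of the sketch is that two extra properties per issue would be enough to generate both sections of a release page automatically, with no manual bookkeeping by the project owner.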
Discussion Management
One part of CodePlex that I really do like is the facility for users to post discussions on possible problems, usage techniques, or anything else that relates to the project.  At the moment, though, it's quite unmanaged and can become a rat's nest of confusing and worthless babble.  I'm not saying that user comments aren't wanted - but I would like to keep them organized and on topic so that people other than the original poster (or additional posters) can get some value out of the process.
My wishes for the discussions are heavily influenced by what I can do as a moderator on the MSDN SSIS forums.  I'm not asking for anything quite so sophisticated (although others are) - but a few more manageability features to keep things clean and organized would be great:
Add Ability to "Sticky" or "Pin" a Discussion
Allow Discussion Threads to be "Split"
Thanks to the CodePlex Team
Despite those minor changes I wish for, CodePlex has been a great place to find samples, working code, and to share what I've made with the community.  A hearty thank you to Sara Ford (blog|twitter) for steering the ship to where it is today (she's just announced she's moving on) and to the CodePlex team who've worked very hard to make it as successful as it is!  Follow all their announcements about new features on their blog, and on Twitter.  I'll be seeing you at your session at the MVP Summit shortly...
My Projects On CodePlex
Just in case you're interested, I currently have six projects open on CodePlex - all custom objects for Integration Services.  In order of importance...
  1. SSIS Community Tasks and Components - This "project" actually hosts John Welch's (blog|twitter) Batch Update Destination for SSIS.  But we've made it do double-duty to really show off the community that's grown to extend Integration Services.  Over 100 addons and extensions are listed - most of them free or open-source.
  2. Kimball Method Slowly Changing Dimension - Frustrated with the SCD Wizard in SSIS?  You can roll your own, use Script, the T-SQL MERGE command, or the very useful TableDifference component.  But I believe the KSCD beats them all!
  3. File Properties Task - Something that should have been "in-the-box" from the get-go, it's a task that lets you see if an expected file exists, and more.
  4. Send HTML Mail Task - Allows you to send HTML formatted email, something you can't do with the stock Send Mail Task.
  5. HTML Table Destination - Takes a rowset in a Data Flow and stuffs it into an SSIS string variable as an HTML marked-up table.  (I use this a lot with the Send HTML Mail Task.)
  6. Pause Task - Something you may need from time to time to assist with coordinating processes in your ETL or DI, this task allows you to wait for a certain number of milliseconds, or until a specific time of day.

4 comments:

  1. Hi Todd, thanks for leaving comments on my blog article My First SSIS Custom Task (http://geekatwork.wordpress.com/2010/03/05/my-first-ssis-custom-task/) and thank you for sharing the CodePlex link with me. I found the link really helpful and your blogs educational. I found that database developers spend more time searching for a solution before deciding to build their own, and I believe there are quite a few things to do to make Integration Services better. I would like to know your purpose in keeping this blog, and I look forward to reading more great articles here in the future.

  2. Hey Todd, your custom tasks are great and add tons of extra functionality to BIDS. I was playing around with your Send HTML Mail task (2008) and noticed something weird. The BCc address value does not save when you enter it through the Editor. It only saves if you enter it through the Properties screen. Is there any chance you could release a fix to this?

  3. How does your CurrentMember work on your SCD 1.5 transformation? I tried associating the CurrentMember item to my CurrentItem field but it simply shows up as NULL. Thanks so much.

  4. The SCD2Current type of column has to have values populated in it already. I can't quite recall if v1.5 only supports the boolean data type - it may.
    So if your dimension table already has values in there (or is empty for your first load) then it should set that column to "true" for new records. If you have any further questions, please post to the Codeplex discussion forums. (Perhaps you already have!)
