Monday, August 29, 2011

Inserting Records AND Getting The Identity in SSIS - Part 2

Since I posted a ways back on retrieving the identity value for inserts in SQL Server Integration Services, I've learned many things - one of which is an inline solution for that problem I didn't know back then.  Check out that previous post for the business problem, then come back here for an "inline" solution using SSIS - without scripts - just regular SSIS tasks and components.
It's a Love/Hate Relationship
The OLE DB Command component in SSIS is a near-universally hated object due to it's inherent performance problems.  But it's so hard to hate when it can actually get the job done when nobody else can.
Yes, our old nemesis is back to solve a problem for us... at the same price it's always asked.  Do you want your packages to run fast, but be complicated?  Or do you want them to run slow, but be easy to understand?  Sadly, there is not much middle ground.
The Keys
It turns out to be pretty easy, and I should have got it into my thick skull before, from all the hints I'd read.  A stored procedure with an OUTPUT parameter, and the OLE DB Command are the keys.  For this example, assume we're using a table that has only two columns: ID and TEXTVALUE, with ID being the automatically generated identity column.
The Stored Procedure
Unfortunately, this doesn't use much of the GUI power of SSIS, it's old hand-coding.  For our simple table, this would be the stored procedure definition:

CREATE PROCEDURE InsertData
  
@textvalue AS CHAR(10),
  
@id AS INT OUTPUTAS
INSERT INTO
IdentityTable
  
(textvalue)
  
VALUES
  
(@textvalue)
SET
@id = SCOPE_IDENTITY()

The SSIS Data Flow
Use a Derived Column to generate the ID column with a NULL(DT_I4) value in it.  (We need somewhere to place our identity value in.)


Following that, we can work on the OLE DB Command component, using a statement like this:

EXEC InsertData ?, ? OUTPUT


Map the columns as you'd expect to map them.
If you'd like I've got an SSIS 2008 package showing this technique.
The Big Picture
Sure, it works - but here are the drawbacks:
It's going to perform badly. Horribly if the package is run on a different server from the database the stored procedure is on.  I can't stress that enough.  The OLE DB Command component is making a round-trip call to the database for each row.  Even on the same machine, that's an expensive operation.
You're doing an RBAR (row by agonizing row) process which is much, much slower than a set-based process that SQL Server is more suited for.  (SSIS is fine with RBAR, it's designed that way.)  Neither SQL Server process nor the Integration Services runtime will be working very hard during this part of the data flow - I don't think even one core would get utilized fully due to the wait-states involved.
If you're interested, here's a sample package to show this process.  It creates a database on your local machine, creates the sample table, generates one row, performs the insert (and shows the result to you), then tears the database down.
Can we mitigate this RBAR flaw?  Stick around or subscribe your RSS reader to find out in an upcoming post.

Monday, August 22, 2011

Anyone Want to Work With Me?

Let's see... my last post was almost a month ago, and it was a pretty small one.  I've been busy!
But it's good to be busy, especially when the housing market has looked so bad in the past few years, and you work for a company that directly serves that industry.  My guerilla war about data as (apparently) been successful enough that I'm now looking for someone to help us out, full time.
What Do I Want?
I want your job.  Seriously.  I'm of two minds on getting another body in here.  On the one hand, it should make things happen faster - but on the other, that's work I won't get to do and things I won't get to learn.  If I could clone myself, or get by with only a few hours sleep a night like Buck Woody, I'd do it.  (If you know me personally, that last sentence is pretty funny.)
What's the Job?
I need some help working on our data warehouse.  I feel pretty comfortable with the "first half" of the technical aspects of it - sourcing the data and transforming it into an appropriate dimensional model inside a relational store.  What I don't know well is the "last half" - working Analysis Services, a reporting tool (RS, Crescent, Tableau, QlikView, ...), and SharePoint.  We have a small team of hardware/OS guys, but they don't know this stuff either, and don't have a massive budget to address our coding inefficiencies.
This Isn't a Backroom Job
If you're not very comfortable talking to "suits," you won't enjoy this job.  Without a doubt, you'll be working with C-level executives and VPs from time to time.  Even though you may feel most at home in a cave gently washed by LED backlighting, you'll need fantastic communication skills in email, phone, and face-to-face situations all over North America.  It won't be the majority of your work, but you'll have to be comfortable with teaching "end users," evangelizing our capabilities, and listening very hard for opportunities and difficulties with data.
Self-Starters Only
Another reason I'm hesitant to move in this direction is that I loathe having to monitor other people... in part because I get distracted by work myself.  If you need constant feeding, at some point you'll starve to death because I've unintentionally ignored you.  You need to find your own work and check in with others from time to time to make sure we're all part of the same pack, pulling in the same direction.
I'm looking for a flexible mind, a problem solver, and a committed learner.  Someone who isn't a 9 to 5er, but understands that's when they'll get called for information.  You will also have to feed my desire to learn from you.  I want to understand the technology so we can offer the guys who do real work at this company the data and insights they need to keep us on top.  I'll reciprocate gladly - anything you want to know about anything you think I know.
Benefits Are Great
I'm not going to go into a lot of details here - because there's only one that should get me the applicants I want, the rest should take care of itself.  I like working here because I get to do what I love, and work with people that like what they do.  If you're passionate about fact-based decision making, and want to help provide it to a company that's waking up to how valuable that can be, contact me.