It's quite common to have to take some kind of a string representation of a date and convert it to a true date type, or convert a numeric date representation into a date - but I've recently had the requirement to process UTC dates in my packages. In order to process this information in SSIS, the DT_DBTIMESTAMPOFFSET data type is provided, which stores a high-precision date and time as well as a time zone offset.
So how do you work with a this type of data? Here are a couple considerations to be aware of.
Know What You're Looking At
One of the most common sources for data in Integration Services is your plain flat file - and in my case, a recent one included UTC date information. Plain text files don't have unambiguous formats! The trick here is understanding exactly how the author of the file encoded the time zone information. This isn't much different from understanding a localized date/time string - except that we typically don't "think" in time-zone offsets, so they can become confusing pretty fast.
Convert Early!
I highly recommend performing all conversions as soon as possible, and VERY explicitly. I always hated VB6 because it allowed so many implicit conversions without any warnings - and most of your destination components will allow this too. So if you're storing UTC data in your destination, be sure to convert it in your data flow to a DT_DBTIMESTAMPOFFSET as soon as possible, using the format described below. At least then you'll be able to see any odd conversion behaviour in a data viewer, with Profiler, or whatever tools you have at your disposal. If you push a string-typed column straight through to the destination, problems with your conversion format will appear out of nowhere.
Converting String UTC Dates
Converting string typed data to a real UTC date type is just as finicky as converting strings to regular dates. Any ambiguity in the format can cause SSIS (or any other tool) to misinterpret your conversion and you'll end up with an error, or worse - silent bad data. Read over my previous article on converting strings to dates to get my recommendations for the basic format and tools.
To convert to a DT_DBTIMESTAMPOFFSET, we just need to build a little on the foundation laid there. The unambiguous format for converting to an SSIS UTC datetime is:
yyyy-MM-ddThh:mm:ss.fffffff+hh:mm
or
yyyy-MM-ddThh:mm:ss.fffffff-hh:mm
The number of "f"s in that format (the fraction of seconds) is flexible. You can specify none, or up to seven decimal places. In order to convert that to a UTC date in SSIS, you need to perform a regular cast operation, but you also have to tell SSIS how many decimal places of fractional seconds you want it to store as well. So something like this will work, as it specifies three decimal places for seconds:
(DT_DBTIMESTAMPOFFSET, 3)"2010-03-05T16:46.154-08:00"
Storing UTC Dates
Now that you have a UTC date type, you probably need to store it somewhere. In SQL Server, the appropriate column data type for this SSIS type is datetimeoffset. Sure, you could store it in a regular datetime column - but you just put in a bunch of work to convert it to the most accurate and appropriate data type - keep that work intact!
Thanks for the article. I am dealing with SalesForce DateTime formats using an SSIS plugin. The transformation interface does not appear to provide a means of customizing the output format to something like... yyyy-MM-dd'T'HH:mm:ss.SSSXXX
ReplyDeleteAs long as the plugin is providing you a column that's a datetime type - not a string - you're OK. Don't worry what format you see if it's actually a datetime type, the data's there and you can work with it.
ReplyDelete