While not as frequent as questions about converting strings to dates, this is a common request at the SSIS forums. The difference here is that the incoming number can represent quite a wide variety of interpretations of the date value - although integers can be easier to work with.
Unlike the post on converting strings to dates, this post is less about the tools (functions) you'll use to get the job done, and more about the specific number formats I've seen questions about. But to start, we'll cover the tools.
Basic Tools for Converting Numbers to Dates
Casting to String
Typically you'll cast to a Unicode string type (DT_WSTR) since it's only an intermediate type (before you do your final cast to a date type) and fussing with code pages in DT_STR is unnecessary.
Check the samples below, as well as the post on converting strings to dates in order to understand the format you're going to need to cast the number to, or what you have to do with your string after you've casted it.
A cast from a number to a string is straightforward - the only possible tricky part is making sure you've given enough room (characters) to your cast command, such as:
(DT_WSTR, 10)
DATEADD
The DATEADD function works great for building a date from the ground up. The arguments might be different from what you might be used to in other languages, so pay attention! The first argument is the date part; You'll be using "Year", "Month", and "Day". The second argument is the number of those elements to add to the third argument: the base date. Unfortunately, SSIS doesn't have a literal character set to denote a typed-in date (like VBA does, for example) - so you'll have to start with a string, casted to a date. Again, check the post about strings to dates for the reasons why the date format is important in these examples. You will likely always use the following for your "base" date:
(DT_DBDATE)"1900-01-01"
Breaking Up the Date
It's also quite common to have your incoming date be a "concatenated" format, like YYYYMMDD. That kind of format can be quite convoluted to deal with in a single expression in a Derived Column transformation. You might be more comfortable dealing with concatenated integer values in two steps: breaking the number down, then converting it. That requires two Derived Column components to accomplish, but adds the debugging capability of using a Data Viewer between them to examine the intermediate "decomposed" values.
Conversion Examples
Integer in YYYYMMDD Format
The following examples will use a column called [DateYYYYMMDD] to generate a date type.
Convert Directly to String, Then Date
First, deconstruct your single integer value into its components, then build up a string to cast to the date type of your choice:
(DT_DBDATE)((DT_WSTR, 4)(ROUND([DateYYYYMMDD] / 10000, 0) + "-" + (DT_WSTR, 2)(ROUND([DateYYYYMMDD] / 100, 0) % 100) + "-" + (DT_WSTR, 2)([DateYYYYMMDD] % 100))
Create Date from Scratch
DATEADD("Year", ROUND([DateYYYYMMDD] / 10000, 0), DATEADD("Month", ROUND([DateYYYYMMDD] / 100, 0) % 100), DATEADD("Day", [DateYYYYMMDD] % 100, (DT_DBDATE)"1899-12-31")))
Or, Decompose the Integer, Then Convert or Create
Place two Derived Column components in your flow. The first one will create the following columns:
[DateYear]: ROUND([DateYYYYMMDD] / 10000, 0)
[DateMonth]: ROUND([DateYYYYMMDD] / 100, 0) % 100
[DateDay]: [DateYYYYMMDD] % 100
Which will then be used in the next Derived Column component to get the date by conversion to a string first, or by using DATEADD - your preference:
(DT_DBDATE)((DT_WSTR, 4)[DateYear] + "-" + (DT_WSTR, 2)[DateMonth] + "-" + (DT_WSTR, 2)[DateDay])
or
DATEADD("Year", [DateYear], DATEADD("Month", [DateMonth], DATEADD("Day", [DateDay], (DT_DBDATE)"1899-12-31")))
Serial Date Numbers from Excel (or Other Sources)
Dates from Excel (or other sources) can arrive at your package as a serial number that is "interpreted" by Excel (or the other source) as a date. You'll have to research other source's interpretations for yourself, but they're likely to match Excel's (and SQL's) fairly closely. Excel stores dates as floating point numbers - the integral part is the number of days past Dec 31, 1899, and the fractional part is the fraction of the day (see How to use Dates and Times in Excel). SQL stores dates similarly, with the first two bytes containing the number of days since the base date of Jan 1, 1900, and the second two bytes containing the number of milliseconds past midnight (see datetime and smalldatetime).
That was long-winded... the short form is this:
DATEADD("Day", [SerialDateFromExcel], (DT_DBDATE)"1899-12-31")
Questions?
If you have any questions as to how to convert any particular number format into a date, please post here, or over at the MSDN SSIS Forum. I'll update this post if any other notable conversion information comes my way.
Hi Todd,
ReplyDeleteThanks for explaining this in detail.
However, I just needed a clarification with this.
If my source is a flat file or a raw file, then ofcourse Derived Column is my best bet.
But, what if my source is not a flat file, a raw file or an xml source. I have one more option available in that case, that of writing a sql query by selecting "SQL Command" as the access mode in the source component and have a calculated column for converting the string value to date in the sql query.
Eg. Select MyColumn1, MyColumn2, MyColumn3, Substring (TempDate,1,2) + '/'+ Substring (TempDate,3,2) + '/' + Substring (TempDate,5,4) From MyTable
Which method is preferable in such scenario, using the Derived Column or using a sql query as the source?
Thanks Again,
Shalin
In general, I would say two things:
ReplyDelete1. Converting using the RDBMS query in the Source will likely result in best performance. Relational sources are usually better performing than SSIS on conversions (all other things being equal).
2. Converting using a Derived Column component will likely result in the best transparency. SSIS doesn't make source queries very "visible", but a Derived Column transform can be labeled appropriately. The SSIS expression may be just as cryptic as the SQL expression - but it's obvious that a "transformation" of source data is occurring.
Those are general comments - as usual, it depends on your environment and anticipated skill sets of the SSIS users.
Yup, correct. Understood both the points.
ReplyDeleteThanks for the clarification.
Shalin
This was a very useful post. It answered my question on how to convert an excel integer "date" to a regular date in SSIS. I had to make one modification to make it work though:
ReplyDeleteDATEADD("Day",(DT_I4)Days,(DT_DBTIMESTAMP)"1899-12-31")
(days was converted to string because of the IMEX=1 property and I had to explicitly convert the date in the 3rd argument)
Your formula works well for my SSIS package converting Excel integer date string to SSIS date. Thanks for sharing.
DeleteConverting ExcelSerailDate to number is the most useful information. Thank you very much you saved my day!!
ReplyDelete