First - it's been a while since I've posted. I have quite a few articles in the pipe, but they're "heavier", so take longer to flesh out and publish. I thought I'd start the ball rolling by commenting on the newly released Data Explorer Preview for Excel. If you are an "information worker" that is copying and pasting data into Excel, then cleaning it up, download and install this now. It definitely has the potential to save you tons of time redoing that work, or remembering what you did, or where you got the data later. There are plenty of resources to help you get started with Data Explorer, with walkthroughs and samples - Bingle to your heart's content. I'll wait.
The Commentary
As most of you know, I'm a DW and SSIS guy, so I'm interested in "enterprise" tools for ETL - things that can be managed, scheduled, scaled, and monitored. So why am I commenting on Data Explorer? First, it is very cool. But my real intention is to question and to elicit some reaction as to the future of Data Explorer. Data Explorer is currently a "self-service" BI tool just like PowerPivot was, but PowerPivot has matured somewhat since v1. PowerPivot workbooks can be transferred (by two methods) from the wild west of self-service to the Victorian rigour of IT governance.
The first method is pretty slick, transparent to business users, and pretty effective. IT simply has to set up their SharePoint farm appropriately with PowerPivot for SharePoint. Business users then simply take advantage of SharePoint's document repositories to collaborate and share. They'll do this of their own accord, because it's much more convenient than email attachments, and because you've berated them often enough about clogging your Exchange server with dozens of copies of a very large workbook distributed to a large number of recipients. What happens in the background is that SharePoint strips out the PowerPivot model from the Excel workbook and loads it into the PowerPivot for Excel shared service. In addition to the sharing benefits (not needing Excel or PowerPivot addin installed), IT immediately gets visibility and manageability of the underlying SSAS tabular model.
The second method is a manual, deliberate process. An IT resource can open the workbook in Visual Studio, extract the PowerPivot tabular model (just like happens in SharePoint), work with it (to extend and/or validate it), then deploy it to an IT-managed SSAS server.
Data Explorer Growth Options
What's the story for Data Explorer? Right now, there doesn't appear to be one, but I'm sure hoping (and asking) Microsoft to think about one. The benefits are quite similar to those achieved by the PowerPivot example. If we can "grow up" the Data Explorer transforms, IT can validate them, stabilize them, and manage data lineage concerns.
Similarly to PowerPivot, I can see a couple options here. First, they could simply create a "Data Explorer for SharePoint" complement to the "Data Explorer for Excel" shared service. When you uploaded an Excel workbook that uses Data Explorer, SharePoint would extract the DE part, store it in some kind of new DE repository/server, and data would be funneled through that server when the workbook was refreshed as an Excel Service. Monitoring and connection management are natural SharePoint extensions. Being able to view and/or edit the transform in a "management portal"... less likely - just open the workbook. Share a DE implementation between workbooks, or surface the DE result as a business data connection? I can see the appeal.
But that first option is likely to be unappealing to a good portion of IT shops, for much the same reason that the first PowerPivot option is. It's not using tooling they're already familiar with and skilled up for. It's not "enterprise-grade". And it's baked into SharePoint, which has a particularly bad stink to SQL Server professionals.
What I'd like to see is the ability to extract the Data Explorer bits from a workbook into SSIS. What would this look like? I can see two options - both of which would have to be a ways off in development, simply because Data Explorer isn't done yet.
First, I could see a "translation" layer, where the DE "steps" are converted into native SSIS data flow components. Some of these components may not exist now (depending on what DE steps ultimately get shipped). There would be some development required on the SSIS side to address that.
Second, I could see new DE components created for SSIS - possibly replacing or duplicating current components. That's a bigger job (I think), but may give Microsoft the opportunity to rationalize both tools at the same time. I don't know if they're interested in doing that.
Final Thoughts
One thing that all of the above still doesn't solve - and this is a little off-topic - is custom transforms. I don't currently see a method in DE to create your own transform. That's probably a little beyond your average information worker. But it's not beyond "the ecosystem". I can definitely see third parties wanting to create "address parsing" transforms and the like. I think this is a great opportunity to add that kind of curated "store" concept to DE/SSIS, to take over from amateurish efforts like mine.
So when you see a demo of Data Explorer given, raise your hand as a data professional and ask the presenter about the enterprise story. That feedback will make it back to the product team, and make us all happier.
You're not the only one on this. I actually sent feedback to the MSFT DE team and got no response back...that could mean 2 things:
ReplyDelete1. They are tired of me bugging them about the errors and bugs that I've encountered
2. They have something in their hands that they are not prepared to let us know yet (there's hope!)
Who knows? maybe in the next summer they'll release a CTP with some new cool features?