This article discusses Microsoft Codename “Data Explorer.” All information is subject to change.
I have a client for whom I develop many in-house business applications. The company also has a resident geek, a clever young guy who leverages his Web 2.0 background to create some in-house solutions as well. I code for them using Microsoft .NET Framework tools against a SQL Server 2008 database. He develops PHP apps that use an online MySQL database. A few years ago he showed me what he was working on and I noticed dropdown lists containing data that duplicated what my apps create and consume in SQL Server. WCF Data Services was still in its early stages (known then as “Astoria”), but I saw a great opportunity to try this technology out as a bridge between our development projects.
I proposed creating a read-only service from the SQL Server data that he could consume in his applications, eliminating the need to maintain the duplicate data in the MySQL database. He loved the idea. A few days later, Microsoft released its first PHP toolkit for consuming Astoria feeds. This developer was one of the first toolkit users and we had our collaborative solution up and running very quickly.
Fast forward … Astoria is now WCF Data Services, its output has been encapsulated in a specification called OData (odata.org) and there’s an OData SDK for PHP (odataphp.codeplex.com/). My client’s service has evolved after numerous requests that I expose more data from the SQL Server database for his applications. But there’s a problem. Each time my client needs something new added to the service, I have to create an appropriate view in the database, open Visual Studio, modify the data model to include an entity that maps to this view, then compile and redeploy this new service over the VPN to his Web server. The process is fairly time-consuming and can be a bit of a drag—especially if I’m getting close to a Data Points column deadline!
At the SQLPass Summit in October 2011, Microsoft announced a new tool under development that’s currently called Microsoft Codename “Data Explorer.” Here, I’ll simply call it Data Explorer.
Data Explorer allows users to access a variety of data sources (SQL Azure, SQL Server, HTML from Web pages, OData feeds, text files and Excel files, to name a few) and build views from them using a relatively simple interface. The UI lets you slice, dice, filter, rename and combine data. You can mash up data from disparate file types to, for example, create a view that joins your SQL Server data to statistics listed on a Web page, and then present the results as an OData feed, an Excel file or any of a number of other formats.
You can find lots of walk-throughs and videos of Data Explorer features on the team’s blog (blogs.msdn.com/dataexplorer). There’s a Web-based client (referred to as a cloud service) and a desktop client. The cloud service currently lives in the SQL Azure Labs as a service and you must sign up for access (at bit.ly/pI8Oug). You can download the current version of the desktop app from http://www.microsoft.com/en-us/sqlazurelabs/default.aspx. As of this writing, these tools are still in early development stages, and I anticipate many improvements and benefits to come.
One of the Data Explorer features rang out as a way to simplify my process of adding more data to the Data Service I maintain for this client—the ability to publish the results of the slicing and dicing as an OData feed. Today (early January 2012, as I write this column) publishing is supported by the cloud service but not yet by the desktop client. While my eventual goal is to do all of this work on my client’s internal network using its SQL Server database and Web service, I’ve explored the possibility of using the cloud service and a SQL Azure database. This is what I’ll share with you. I won’t bother with a step-by-step walk-through, as that’s readily available in the resources I mentioned earlier.
Your workspace can contain multiple mashups and each one can contain a number of resources. Figure 1 shows four resources I created in a mashup called Mashup1. Three of the resources expose a view of Customers, a list of Beer data and a list of Breweries, while the final resource (Beer List Merged with Breweries) exposes a merged and flattened list of the breweries along with the beer each produces. I had to create the Beer List and Breweries resources in order to merge them into yet another resource. The state of the “check mark” icon to the right indicates what will and won’t be exposed in the published data. The dimmed check marks next to Breweries and Beer List indicate that these will not be exposed. As you can see in Figure 1, I plan to publish the Customers and the Beer List Merged with Breweries, but not the building blocks I used to create the merged set.
Figure 1 Four Data Sets that Have Been Created in Mashup1
If you look at the mashup’s overview in my Workspace, as shown in Figure 2, you can see that Mashup1 has two outputs, representing the two resources I’ve chosen to expose.
Figure 2 Overview of Mashup1 in My Workspace
When I publish Mashup1 using the Data Explorer tools and then browse the resulting OData feed, you can see that the service does indeed expose those two resources and nothing more (see Figure 3).
Figure 3 Two Data Sets Exposed as Collections in My Published Mashup
<?xml version="1.0" encoding="UTF-8"?>
<service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom" xml:base="https://ws39047873.dataexplorer.sqlazurelabs.com/Published/CustomersAndBeerData/Feed/">
<collection href="Beer List Merged with Breweries">
<atom:title type="text">Beer List Merged with Breweries</atom:title>
The Data Explorer UI enables you to build up complex queries visually, but because I’ll be building this data feed, it’s easier for me to just create a view in the database. So my plan is that each time my power user wants access to more data in his service, I’ll create a database view, add it as a resource in my mashup and republish the feed.
In this case, because I want my OData feed to just reflect the changes in the schema on demand, I can simply do that. But you need to be careful because apps that are relying on the feed might break if the feed introduces non-additive or certain other changes.
As another approach, however, Data Explorer also lets me shield my API/feed surface from any changes in the schema. I simply have to update the mashup to account for the schema changes, while keeping the shape of my outputs unchanged.
I built these resources directly from my SQL Azure database. My power user could do the same, but I don’t want him to directly access the database. Instead, I’ll let him have his own Workspace and build his own mashup from the one I published.
Figure 4 shows all the types of data you can use to create resources in a mashup. SQL Databases can be either a SQL Server database or a SQL Azure database.
Figure 4 Options for Adding Data to a Mashup
Among the possible data sources are an OData Feed and an existing mashup. These make it possible for the power user to build a mashup from my published feed.
He can add a dataset based on my OData feed to a mashup in his own workspace and then massage that view for his own needs. Data Explorer lets him hide columns he’s not interested in, build concatenated fields, change how his service displays field names and so much more. If, for example, he initially thinks he doesn’t need a column and later decides he does want access to it, he can simply modify his own mashup without worrying about whether I’m abroad at a conference or out on a bike ride.
The database stays secure, but what about the data feed I’m creating? Data Explorer already has a number of security features built in, and Microsoft is working on enriching security even more. One security path is to grant permissions to a user with an existing Data Explorer account to access your workspace as an Owner, an Author or a Consumer in the workspace. Were I to grant my power user Author permissions, in his own workspace he could then create and publish his own mashups based on my feed.
I could also provide the power user with a Feed Key. Each workspace has a single Feed Key that lets anyone access any feed published from a workspace. When the power user accesses my OData feed to build his own mashup, he’ll be required to supply whatever credentials I’ve specified. Figure5 shows the credentials being requested when the user tries to add my CustomersAndBeerData feed to his own.
Figure 5 Adding an OData Feed to a Mashup
Once he’s pulled my feed into his own mashup, he can reshape the output of the feed, even combining it with other data sources if he wishes. When he’s satisfied, he can publish his results to an OData feed of his own and consume it from his application using credentials that he specifies in his own workspace.
In the end, when the power user wants additional OData for his application, my involvement is greatly reduced. I still need to add a view in the database, then add a new resource to my mashup online, then republish. But this workflow, visualized in Figure 6, is much more appealing to me than having to open the project in Visual Studio, update the data model, rebuild, log in to the VPN and, finally, push the new assembly to my client’s server.
Figure 6 Workflow from Database to Application
Data Explorer is not yet a released product and will continue to evolve. But having worked through this scenario using the currently available preview, I’m truly looking forward to shifting to this solution when Data Explorer is officially released. And while I’m currently focused on a very narrow use case, I’ll be keeping an eye on blogs.msdn.com/dataexplorer for updates.
By Miguel Llopis and Faisal Mohamood
Microsoft Codename “Data Explorer” is a new tool that enables self-service Extract, Transform and Publish (ETP). Data Explorer simplifies the creation of data mashups from a combination of different data sources, such as Excel, Access, SQL Server, the Web (HTML, Web APIs, OData) and more. The main goal for Data Explorer is to make the acquisition, transformation and composition of data easy and seamless in order to publish the data into end-user tools such as Excel/PowerPivot or to power custom applications.
Data Explorer is currently available as a lab at SQL Azure Labs and supports both on-premises as well as cloud data sources. Visit dataexplorer.sqlazurelabs.com to learn more and try out this innovative Azure service.
Julie Lerman is a Microsoft MVP, .NET mentor and consultant who lives in the hills of Vermont. You can find her presenting on data access and other Microsoft .NET topics at user groups and conferences around the world. She blogs at thedatafarm.com/blog and is the author of “Programming Entity Framework” (O’Reilly Media, 2010) and “Programming Entity Framework: Code First” (O’Reilly Media, 2011). Follow her on Twitter at twitter.com/julielerman.
Thanks to the following technical experts for reviewing this article: Miguel Llopis and Faisal Mohamood
More MSDN Magazine Blog entries >
Browse All MSDN Magazines
Subscribe to MSDN Flash newsletter
Receive the MSDN Flash e-mail newsletter every other week, with news and information personalized to your interests and areas of focus.