August 2009

Volume 24 Number 08

.NET Visualization - Visualizing Information with .NET

By Laurence Moroney | August 2009

This article discusses:

  • Data visualization
  • Building data-agnostic services
  • Building a visualization server
This article uses the following technologies:
C#, ASP.NET, XML

Contents

Getting Started
Building a Data-Agnostic Service That Uses This Data
Building a Data-Agnostic Visualization Server
Next Time

Information visualization has been around for a long time, but ask different people what it means, and you'll likely get many different answers—for example, charting, innovative animated images, or computationally intensive representations of complex data structures. Information visualization encapsulates all of these answers, and an information visualization platform is one that can support each of these scenarios.

From a scientific perspective, information visualization usually refers to the study of the visual representation of large-scale collections of information that is not necessarily numeric in nature, and to the use of graphical representations of this data so that it can be analyzed and understood. From a business perspective, information visualization is about deriving value from data by rendering it graphically, using tools that let end users interact with the data to find the information they need.

Of course, having just the capability to draw these pictures usually isn't enough for a good information visualization platform; there are also other levels of functionality that need to be addressed, such as:

  • Interactivity. Interactivity can vary from animating the movement of slices in and out of a pie chart to providing users with tools for data manipulation, such as zooming in and out of a time series.
  • Generating related metadata. Many charts have value added to them through related contextual metadata. For example, when you view a time-series chart, you might want to generate a moving average and tweak the period for this moving average or experiment with what-if scenarios. It's not feasible to expect a data source to generate all of these data views for you. Some form of data manipulation is necessary at the presentation layer.
  • Overlaying related data. A common requirement for charting is to take a look at other stimuli that might affect the data and have the visualization reflect this. Consider a time series showing a company's stock value and a feed of news stories about that particular stock. Real value can be added to the chart by showing how the news affected the value. "Good" news might make it go up, "bad" news might make it go down. Being able to add this data to your time-series chart turns it from a simple chart into information visualization.
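To make the moving-average example above concrete, here is a minimal sketch of the kind of presentation-layer calculation I have in mind. The class and method names are my own for illustration and are not part of any charting API; the period is a parameter so that a user could tweak it without asking the data source for a new view.

```csharp
using System;
using System.Collections.Generic;

public static class MovingAverageSketch
{
    // Returns the simple moving average of 'values' over a sliding window
    // of 'period' points. The result has values.Count - period + 1 entries
    // and is empty when there are fewer than 'period' points.
    public static List<double> SimpleMovingAverage(IList<double> values, int period)
    {
        if (values == null) throw new ArgumentNullException("values");
        if (period <= 0) throw new ArgumentOutOfRangeException("period");

        List<double> result = new List<double>();
        double windowSum = 0.0;
        for (int i = 0; i < values.Count; i++)
        {
            windowSum += values[i];
            // Once the window is full, drop the point that just left it
            if (i >= period) windowSum -= values[i - period];
            if (i >= period - 1) result.Add(windowSum / period);
        }
        return result;
    }
}
```

The output could be plotted as a second series alongside the raw data, which is the sort of related metadata a data source should not be expected to supply for every possible period.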

The key to building a visualization platform that can enable all of this is flexibility, so that you can render any data in any way at any time. This is a huge and generally specialized effort, but one technique that can ease it is called data agnosticism.

Data agnosticism arises when you define an architecture for visualizing your data that isn't dependent on the data itself. Consider a time-series chart that provides related metadata: it's quite easy to program an application to read the time-series data and the related metadata (such as a news feed) and to write the data onto the screen using a charting engine. However, once you've done this, your effort is good for this representation and this representation alone. The application you've written is tightly bound to the data itself.

The principle of data agnosticism allows you to pick a data source, define the data you want, and then tell the visualization engine to go and draw it however you want it to. We'll take a look at how to build a simple version of this engine in this article.

Getting Started

As with anything else, it's good to start with the data. In this section, I'll give a brief overview of a simple XML-over-HTTP service that serves time-series data obtained from Yahoo Financial Services.

The Yahoo time-series service returns a CSV file containing basic time-series data with the following fields: Date, Opening Price, Closing Price, High, Low, Volume, and Adjusted Close. The API to call it is very simple:

ichart.finance.yahoo.com/table.csv

You use the following parameters:

Parameter   Value
s           Stock Ticker (for example, MSFT)
a           Start Month (0-based; 0=January, 11=December)
b           Start Day
c           Start Year
d           End Month (0-based; 0=January, 11=December)
e           End Day
f           End Year
g           Always use the letter d
ignore      Always use the value '.csv'

To get the time-series data for Microsoft (MSFT) from January 1, 2008, to January 1, 2009, you use the following URL:

https://ichart.finance.yahoo.com/table.csv?s=MSFT&a=0&b=1&c=2008&d=0&e=1&f=2009&g=d&ignore=.csv

Figure 1 shows a C# function that takes string parameters for ticker, start date, and end date and builds this URI.

Figure 1 A C# Function That Builds a URI to Capture Data

// Requires: using System; using System.Text;
public string BuildYahooURI(string strTicker, string strStartDate, string strEndDate)
{
    string strReturn = "";
    DateTime dStart = Convert.ToDateTime(strStartDate);
    DateTime dEnd = Convert.ToDateTime(strEndDate);

    // The service expects zero-based months, so subtract 1
    string sStartDay = dStart.Day.ToString();
    string sStartMonth = (dStart.Month - 1).ToString();
    string sStartYear = dStart.Year.ToString();
    string sEndDay = dEnd.Day.ToString();
    string sEndMonth = (dEnd.Month - 1).ToString();
    string sEndYear = dEnd.Year.ToString();

    StringBuilder sYahooURI =
        new StringBuilder("https://ichart.finance.yahoo.com/table.csv?s=");
    sYahooURI.Append(strTicker);
    sYahooURI.Append("&a=");
    sYahooURI.Append(sStartMonth);
    sYahooURI.Append("&b=");
    sYahooURI.Append(sStartDay);
    sYahooURI.Append("&c=");
    sYahooURI.Append(sStartYear);
    sYahooURI.Append("&d=");
    sYahooURI.Append(sEndMonth);
    sYahooURI.Append("&e=");
    sYahooURI.Append(sEndDay);
    sYahooURI.Append("&f=");
    sYahooURI.Append(sEndYear);
    sYahooURI.Append("&g=d");
    sYahooURI.Append("&ignore=.csv");

    strReturn = sYahooURI.ToString();
    return strReturn;
}

Now that you have the URI for the data, you need to read it and to use it. In this case, I'll convert the CSV data to XML. A function that can do this is shown in Figure 2.

Figure 2 Converting CSV Data to XML

// Requires: using System.Data; using System.IO; using System.Net; using System.Xml;
public XmlDocument getXML(string strTicker, string strStartDate, string strEndDate)
{
    XmlDocument xReturn = new XmlDocument();
    DataSet result = new DataSet();

    // Build the service URI and open the CSV response as a stream
    string sYahooURI = BuildYahooURI(strTicker, strStartDate, strEndDate);
    WebClient wc = new WebClient();
    Stream yData = wc.OpenRead(sYahooURI);

    // Parse the CSV into a DataSet, then serialize the DataSet as XML
    result = GenerateDataSet(yData);
    StringWriter stringWriter = new StringWriter();
    XmlTextWriter xmlTextwriter = new XmlTextWriter(stringWriter);
    result.WriteXml(xmlTextwriter, XmlWriteMode.IgnoreSchema);

    XmlNode xRoot = xReturn.CreateElement("root");
    xReturn.AppendChild(xRoot);
    xReturn.LoadXml(stringWriter.ToString());
    return xReturn;
}
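The GenerateDataSet helper that getXML calls isn't listed in this article. As a rough sketch of what such a helper might look like, the following code reads a CSV stream with a header row into a DataSet. Everything here is an assumption for illustration: the class name, the choice to drop spaces from column names, and the lack of support for quoted fields. The DataSet and table names are picked so that the serialized XML lines up with the /NewDataSet/TimeSeries/Close path used later in this article.

```csharp
using System;
using System.Data;
using System.IO;

public static class CsvToDataSetSketch
{
    // Hypothetical stand-in for the GenerateDataSet helper. Assumes a
    // header row followed by simple comma-separated rows with no quoted
    // fields.
    public static DataSet GenerateDataSet(Stream csvStream)
    {
        DataSet result = new DataSet("NewDataSet");
        DataTable table = result.Tables.Add("TimeSeries");

        using (StreamReader reader = new StreamReader(csvStream))
        {
            string header = reader.ReadLine();
            if (header == null) return result;   // empty response: no rows

            foreach (string name in header.Split(','))
            {
                // Drop spaces so every column becomes a plain XML element name
                table.Columns.Add(name.Trim().Replace(" ", ""), typeof(string));
            }

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Trim().Length == 0) continue;              // skip blank lines
                string[] fields = line.Split(',');
                if (fields.Length != table.Columns.Count) continue; // skip malformed rows
                table.Rows.Add(fields);
            }
        }
        return result;
    }
}
```

Keeping every column as a string leaves interpretation of the values to whatever consumes the XML, which suits a service that is meant to know as little as possible about its data.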

I put these functions into a class called HelperFunctions and added the class to an ASP.NET Web project. To this, I added an ASP.NET Web Form (ASPX) called GetPriceHistory and edited the ASPX page to remove the HTML markup so that it looks like this:

<%@ Page Language="C#" AutoEventWireup="true" CodeBehind="GetPriceHistory.aspx.cs" Inherits="PriceHistoryService.GetPriceHistory" %>

The nice thing about this approach is that you can now write code that writes directly to the response buffer and set the response type so that you can write XML over HTTP.

Because the helper functions take strings for the ticker and for the start and end dates, you can use them as parameters to the ASPX. You can then pass them to the helper functions to generate XML, which you then write out to the response buffer. In addition, the MIME type needs to be set to "text/xml" so that any reader sees it as XML and not text.

Figure 3 shows the code to do that. Remember that HelperFunctions is the name of a class containing the functions that build the Yahoo URI and that read it and convert the CSV data to XML.

Figure 3 Code for the Helper Functions

HelperFunctions hlp = new HelperFunctions();

protected void Page_Load(object sender, EventArgs e)
{
    string strTicker, strStartDate, strEndDate;

    if (Request.Params["ticker"] != null)
        strTicker = Request.Params["ticker"].ToString();
    else
        strTicker = "MSFT";

    if (Request.Params["startdate"] != null)
        strStartDate = Request.Params["startdate"].ToString();
    else
        strStartDate = "1-1-2008";

    if (Request.Params["enddate"] != null)
        strEndDate = Request.Params["enddate"].ToString();
    else
        strEndDate = "1-1-2009";

    XmlDocument xReturn = hlp.getXML(strTicker, strStartDate, strEndDate);
    Response.ContentType = "text/xml";
    Response.Write(xReturn.OuterXml);
}

You now have a simple XML-over-HTTP service that returns time-series data. Figure 4 shows an example of it in action.

Figure 4 A Simple XML-over-HTTP Service

Building a Data-Agnostic Service That Uses This Data

With server-generated visualization, a client renders an image, and all processing is done on the server. Some very smart visualization engines provide code that can post back to the server to provide interactivity by using image maps in the image that is rendered back, but this is extremely complex to generate, and the functionality can be limited. This approach is useful if you want to generate static charts that require no end-user runtime because the browser can render the common image formats. Figure 5 shows a typical architecture for this approach.

Figure 5 Typical Server-Rendered Visualization Architecture

When you build this architecture, you usually write server code that understands the data. In the previous case, for example, if you're writing a time-series chart that is plotting the Close value, you would write code that reads in the XML and takes the Close data and loads it into a series on the chart so that it can be plotted.

If you are using the Microsoft ASP.NET charting engine (which is freely downloadable; see the link later in this article), you'd typically define a chart like this:

<asp:Chart ID="Chart1" runat="server">
  <Series>
    <asp:Series Name="Series1">
    </asp:Series>
  </Series>
  <ChartAreas>
    <asp:ChartArea Name="ChartArea1">
    </asp:ChartArea>
  </ChartAreas>
</asp:Chart>

This approach, however, usually limits you to charting rather than visualization because the ability to provide interactivity is limited. The ability to generate related metadata is also limited in this scenario because all requests require a post-back to the server to generate a new chart and would be limited to the functionality that is provided on the server. The ability to overlay related metadata is also limited for the same reasons.

However, the important capabilities of data agnosticism can be enabled by this scenario. It's relatively easy for you to configure metadata about your data source and where in the data source you can find your data series and data categories. An engine can process this metadata and turn it into the series and categories that the server can render, making it easy to add new visualizations without a lot of extra programming.

Building a Data-Agnostic Visualization Server

There are a number of server-side charting technologies available, and the programming APIs change across them, but the principles that I discuss here are similar across all of them. In this section, I'll look at the free ASP.NET charting engine from Microsoft. You also need the Visual Studio add-ins for the Charting server.

Let's look at what it takes to build a pie chart with this charting engine. The code is very simple. First, add an instance of the chart control to an ASPX Web form. You'll see something like this in the code view:

<asp:Chart ID="Chart1" runat="server">
  <Series>
    <asp:Series Name="Series1">
    </asp:Series>
  </Series>
  <ChartAreas>
    <asp:ChartArea Name="ChartArea1">
    </asp:ChartArea>
  </ChartAreas>
</asp:Chart>

Then write code like the following to render some data in the chart control:

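// Requires: using System.Web.UI.DataVisualization.Charting;
double[] yValues = { 20, 10, 24, 23 };
string[] xValues = { "England", "Scotland", "Ireland", "Wales" };

// Bind the category names and values to the first series and draw it as a pie
Series mySeries = Chart1.Series[0];
mySeries.Points.DataBindXY(xValues, yValues);
mySeries.ChartType = SeriesChartType.Pie;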

In this case, I've hard-coded the values, but you would usually read them from a database or from a service and then load them into the arrays before using them to generate the chart. Of course, the reuse of this code becomes difficult, and any changes in the data source can break it, so let's take a look at writing something that doesn't need to be bound to the data type.

The nice thing about representing the data in XML is that I can use the XPath language to define where in the XML document the data I want to plot will come from. For the data shown in Figure 4, the XPath statement that defines the location of the Close prices looks like this:

/NewDataSet/TimeSeries/Close
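As a quick illustration of what that XPath statement buys you, the sketch below pulls a numeric series out of an XML document using nothing but the path. The class name, method name, and sample values are made up for the example; the element names assume the NewDataSet/TimeSeries layout implied by the XPath above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Xml;

public static class XPathSeriesSketch
{
    // Made-up sample in the assumed shape of the service output
    public const string SampleXml =
        "<NewDataSet>" +
        "<TimeSeries><Date>2008-01-02</Date><Close>10.5</Close></TimeSeries>" +
        "<TimeSeries><Date>2008-01-03</Date><Close>11.25</Close></TimeSeries>" +
        "</NewDataSet>";

    // Selects every node matched by 'xPath' and parses its text as a double.
    // InvariantCulture keeps "10.5" from being misread on servers whose
    // locale uses a comma as the decimal separator.
    public static List<double> ReadSeries(XmlDocument data, string xPath)
    {
        List<double> values = new List<double>();
        foreach (XmlNode node in data.SelectNodes(xPath))
        {
            values.Add(double.Parse(node.InnerText.Trim(), CultureInfo.InvariantCulture));
        }
        return values;
    }
}
```

Loading SampleXml into an XmlDocument and calling ReadSeries with /NewDataSet/TimeSeries/Close returns the two Close values; passing /NewDataSet/TimeSeries/Date instead would select the dates, with no change to the selection code (though the dates would need a different parser).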

Now, if you think about it, instead of writing code that contains the definitions for your chart, you can externalize it as a configuration. Imagine a configuration file like the one shown in Figure 6.

Figure 6 A Configuration File That Defines a Chart

<root>
  <Chart Name="PriceHistory1">
    <Uri>
      <Path>https://localhost/PriceHistoryService/GetPriceHistory.aspx</Path>
      <Param Name="ticker">MSFT</Param>
      <Param Name="startdate">1-1-2008</Param>
      <Param Name="enddate">1-1-2009</Param>
    </Uri>
    <Data>
      <SeriesDefinitions>
        <Series id="ClosePrice">
          <Data>/NewDataSet/TimeSeries/Close</Data>
          <Type>Line</Type>
        </Series>
      </SeriesDefinitions>
    </Data>
  </Chart>
</root>

You're now defining a chart called PriceHistory1 that takes its data from the given URL, appending parameters with the given names and given values. The values are hardcoded in this case, but there's nothing to stop you from writing code that uses parameters generated by an end user.

Additionally, the Series Definitions section defines a number of series with an XPath statement indicating where the data comes from and how to draw it. Right now it uses a simple definition of a chart type, but you could include extra parameters here for color or other elements or for defining multiple series (it's XML after all, so it's easy to add extra nodes) as well as categories, labels, or other such metadata. For this example I've kept it simple.
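To show how multiple series definitions might be consumed, here is a small sketch that reads every Series node for a named chart into a simple holder object. The SeriesDefinition class and the method name are hypothetical, and falling back to "Line" when no Type node is present is my own assumption rather than anything the configuration format requires.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical holder for one <Series> entry in the chart configuration
public class SeriesDefinition
{
    public string Id;
    public string XPath;
    public string ChartType;
}

public static class SeriesConfigSketch
{
    // Reads all series definitions for the named chart. 'chartName' is
    // assumed to be a trusted value because it is placed directly into
    // the XPath query.
    public static List<SeriesDefinition> ReadSeriesDefinitions(XmlDocument config, string chartName)
    {
        List<SeriesDefinition> definitions = new List<SeriesDefinition>();
        string query = "/root/Chart[@Name='" + chartName + "']/Data/SeriesDefinitions/Series";

        foreach (XmlNode seriesNode in config.SelectNodes(query))
        {
            XmlNode dataNode = seriesNode.SelectSingleNode("Data");
            XmlNode typeNode = seriesNode.SelectSingleNode("Type");
            XmlAttribute idAttr = seriesNode.Attributes["id"];
            if (dataNode == null) continue;   // no XPath, so nothing to plot

            SeriesDefinition def = new SeriesDefinition();
            def.Id = idAttr != null ? idAttr.Value : "";
            def.XPath = dataNode.InnerText.Trim();
            def.ChartType = typeNode != null ? typeNode.InnerText.Trim() : "Line";
            definitions.Add(def);
        }
        return definitions;
    }
}
```

In a charting page, the ChartType string could then be mapped onto the chart control's SeriesChartType enumeration (for example with Enum.Parse), so that adding a second series or switching from a line to a column chart becomes a configuration change.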

Now your charting-engine code will look vastly different. Instead of writing code that reads the data, parses the data, and loads it directly into the chart, you can write code that reads the configuration, builds the service call URI from the configuration data, calls the service, gets the returned XML, and uses the XPath variables in the configuration to get the data series you want.
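The first of those steps, building the service call URI from the configuration, can be sketched on its own without any charting code. The class and method names below are illustrative, and the sketch makes two choices of its own: it URL-encodes parameter names and values, and it trims whitespace around values read from the configuration.

```csharp
using System;
using System.Text;
using System.Xml;

public static class ServiceUriSketch
{
    // Builds the data-service URI for the named chart from its <Uri> node:
    // the <Path> text, followed by each <Param> as name=value.
    // 'chartName' is assumed to be a trusted value.
    public static string BuildServiceUri(XmlDocument config, string chartName)
    {
        XmlNode uriNode = config.SelectSingleNode("/root/Chart[@Name='" + chartName + "']/Uri");
        if (uriNode == null) throw new ArgumentException("No chart named " + chartName);

        XmlNode pathNode = uriNode.SelectSingleNode("Path");
        if (pathNode == null) throw new ArgumentException("Chart has no Path node");

        StringBuilder uri = new StringBuilder(pathNode.InnerText.Trim());
        char separator = '?';   // first parameter uses ?, the rest use &
        foreach (XmlNode param in uriNode.SelectNodes("Param"))
        {
            // XML attribute names are case-sensitive, so accept both spellings
            XmlAttribute nameAttr = param.Attributes["Name"] ?? param.Attributes["name"];
            if (nameAttr == null) continue;

            uri.Append(separator);
            uri.Append(Uri.EscapeDataString(nameAttr.Value));
            uri.Append('=');
            uri.Append(Uri.EscapeDataString(param.InnerText.Trim()));
            separator = '&';
        }
        return uri.ToString();
    }
}
```

For the configuration in Figure 6, this yields the GetPriceHistory.aspx address followed by ?ticker=MSFT&startdate=1-1-2008&enddate=1-1-2009.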

Under these conditions, your architecture can be much more robust. Consider, for example, what happens if the data source changes its XML tag from Close to ClosingPrice. You wouldn't have to edit or recompile your code; you'd simply edit the XPath variable in the chart definition.

It's not much of a stretch to think about how you would edit this to connect to different types of data sources, such as database connections or Web services. Figure 7 shows the code that plots the time-series data on an ASP.NET chart.

Figure 7 Plotting the Time-Series Data on an ASP.NET Chart

// Requires: using System.IO; using System.Net; using System.Text;
// using System.Xml; using System.Web.UI.DataVisualization.Charting;
protected void Button1_Click(object sender, EventArgs e)
{
    // Variable declarations
    StringBuilder dataURI = new StringBuilder();
    WebClient webClient = new WebClient();
    XmlDocument xmlChartConfig = new XmlDocument();
    XmlDocument xmlData = new XmlDocument();

    // Get the chart config
    Uri uri = new Uri(Server.MapPath("ChartConfig.xml"), UriKind.RelativeOrAbsolute);
    Stream configData = webClient.OpenRead(uri);
    XmlTextReader xmlText = new XmlTextReader(configData);
    xmlChartConfig.Load(xmlText);

    // I'm hard coding to read in the chart called 'PriceHistory1'. In a
    // 'real' environment my config would contain multiple charts, and I'd
    // pass the desired chart (along with any parameters) in the request
    // string. But for simplicity I've kept this hard coded.
    XmlNodeList lst =
        xmlChartConfig.SelectNodes("/root/Chart[@Name='PriceHistory1']/Uri/*");

    // The first child contains the root URI
    // (Trim() guards against whitespace around values in the config)
    dataURI.Append(lst.Item(0).InnerText.Trim());

    // The rest of the children of this node contain the parameters
    // the first parameter is prefixed with ?, the rest with &
    // i.e. https://url?firstparam=firstval&secondparam=secondval etc
    for (int lp = 1; lp < lst.Count; lp++)
    {
        if (lp == 1)
            dataURI.Append("?");
        else
            dataURI.Append("&");

        // In this case the desired parameters are hard coded into the XML.
        // in a 'real' server you'd likely accept them as params to this page
        dataURI.Append(lst.Item(lp).Attributes.Item(0).Value.ToString());
        dataURI.Append("=");
        dataURI.Append(lst.Item(lp).InnerText.Trim());
    }

    // Now that we have the URI, we can call it and get the XML
    uri = new Uri(dataURI.ToString());
    Stream phData = webClient.OpenRead(uri);
    xmlText = new XmlTextReader(phData);
    xmlData.Load(xmlText);

    // This simple example is hard coded for a particular chart
    // ('PriceHistory1') and assumes only 1 series
    lst = xmlChartConfig.SelectNodes(
        "/root/Chart[@Name='PriceHistory1']/Data/SeriesDefinitions/Series/Data");

    // I'm taking the first series, because I only have 1
    // A 'real' server would iterate through all the matching nodes on the
    // XPath
    string xPath = lst.Item(0).InnerText.Trim();

    // I've read the XPath that determines the data location, so I can
    // create a nodelist from that
    XmlNodeList data = xmlData.SelectNodes(xPath);
    Series series = new Series();

    // I'm hard coding for 'Line' here -- the 'real' server should
    // read the chart type from the config
    series.ChartType = SeriesChartType.Line;
    double nCurrent = 0.0;

    // I can now iterate through all the values of the node list, and
    foreach (XmlNode nd in data)
    {
        // .. create a DataPoint from them, which is added to the Series
        DataPoint d = new DataPoint(nCurrent, Convert.ToDouble(nd.InnerText));
        series.Points.Add(d);
        nCurrent++;
    }

    // Finally I add the series to my chart
    Chart1.Series.Add(series);
}

The results are shown in Figure 8. No configuration has been done on the chart, and it's using the default configuration values, but the data is being read and being plotted.

Figure 8 The Results Generated by the Time-Series Data

A small tweak to the configuration file to give me volume and a different set of dates (1-1-1980 to 1-1-1990) provides the view in Figure 9—without changing a line of code in the charting service because it is data-agnostic.

Figure 9 New Results After Tweaking the Configuration File

Documenting how to build a data-agnostic server would take an entire book in its own right, and here I've just skimmed the surface. The principles you've seen will apply to most APIs you work with, and the majority of the code you write should be for managing the external chart configuration to get the data, making what you've seen here very portable.

Next Time

In this article, I looked at one of the main principles of building data visualizations: providing a way to render your data in a data-agnostic manner. In a future article, I will explore using rich technologies on the client side to provide the ability to interact with your data and to smartly aggregate disparate data sources. The power of the .NET platform is now available in the browser using Microsoft Silverlight, so we will use it to demonstrate these principles.

Laurence Moroney is a senior technology evangelist with Microsoft, specializing in Silverlight. He is the author of many books on computing topics, including Silverlight, AJAX, interoperability, and security. You can find Laurence's blog at blogs.msdn.com/webnext.