Web Queries and Dynamic Chart Data in Excel 2002

This content is no longer actively maintained. It is provided as is, for anyone who may still be using these technologies, with no warranties or claims of accuracy with regard to the most recent product version or service release.

 

John R. Durant
Microsoft Corporation

June 2003

Applies to:
    Microsoft® Excel 2002
    Microsoft Office Excel 2003

Summary: Learn how to use advanced Web query techniques to streamline data feeds into Microsoft Excel 2002. This article describes how to use some of the lesser-known abilities of the QueryTable object in Excel such as how to use it for importing data while sending parameters to the target Web server and how to cause a chart to change its data source after importing a new payload by using Web queries. (15 printed pages)

Contents

Business Scenario
Excel Web Queries
Dynamic Chart Data
Final Touches
Conclusion

Business Scenario

Before diving into the technical details, let's take just a moment to describe exactly what kind of problem Web queries and dynamic chart sizing can solve. Imagine a sales manager who regularly looks at sales data. She wants to specify a range of dates and then get aggregate sales data. In particular, she wants to see sales figures summarized by day within the specified range. She wants to see the raw data as well as a chart showing the summary sales trend for the specified date range.

It is not practical to create explicit reports for each of her requests. To do this, someone must sift through input data and create separate spreadsheet reports for the vast number of scenarios that could exist in the organization. The burden on the developers and IT staff is too large, especially when considering that the sales manager for one department is not the only person in the organization that has a request of this nature. A better possibility is to have an Excel spreadsheet that could adapt itself to deliver the various reports the sales manager needs as well as one that other departments could reuse and adjust for their similar needs.

This article shows you how to use features in Microsoft Excel 2002 for a simple, yet complete business solution, shown in Figure 1, that meets these needs.

Aa140050.odc_webqueries01(en-us,office.10).gif

Figure 1. Identify criteria and generate a dynamic report

This article discusses how to use Web queries as the primary vehicle for data delivery. You will also employ a few controls in an attractive user interface to make it easier for the user to initiate the report. Finally, you will see how to create a chart that automatically responds to changes in its source data so that it is a dynamic, living report.

Excel Web Queries

Web queries are not new to Excel 2002 or to Microsoft Office Excel 2003. Actually, they have been around since Excel 97, but with Excel 2002, they became more obvious, more powerful, and easier to use. The basic principle behind Web queries is to allow Excel seamlessly use data that is retrieved from a URL. The data can be simple HTML pages, document content, XML, or any text-based data in a URL-addressable location. What this means is that, although a Microsoft ActiveX® server control or other complex binary object on a page cannot provide its internal data directly to Excel, virtually any other URL that offers data to your organization is a candidate for direct import into Excel for reporting needs. Users can employ the Excel user interface to initiate a Web query. However, what makes things intriguing for developers is that you can also do these activities programmatically, thus allowing Web queries to become part of a more complete and reusable solution that you develop.

In our business solution, the user opens the Web query dialog box and types a URL to retrieve the target data, but we want a solution that makes things easier for the user. The focus for the user is on evaluating the data, not in knowing how to get the data. To understand how it works, look at Figure 2 that shows the application flow.

Aa140050.odc_webqueries02(en-us,office.10).gif

Figure 2. Retrieve data from a Web page based on specific criteria

The focus of the code, then, is where the Web query retrieves the data from a URL and places it into the spreadsheet. While we use the term Web query to refer to this technique, in the object model a QueryTable object represents a Web query, so named because it actually represents a table of data returned from a variety of external sources. A Web query is a type of QueryTable, whose source is an addressable URL. A workbook organizes by group all of these together in a QueryTables collection, so the lifetime of a QueryTable is bounded by the lifetime of a workbook. In other words, there is no way to make Web queries work in the background without an open workbook.

QueryTable Object Overview

To begin to create a Web query programmatically, you must set a couple of properties of the QueryTable object and add it to the QueryTables collection. This process is completed rather economically in a single step by calling the Add method of the QueryTables collection. The Add method returns a QueryTable object that is the new Web query. The syntax for the Add method is as follows:

QueryTables.Add(Connection,Destination)

The Connection parameter accepts a reference to whatever is going to serve up your data. If the method returns a QueryTable used to retrieve Web-based data, then the connection is a URL. Otherwise, this can be a database connection string, an ActiveX Data Objects (ADO) Recordset object, a reference to a text file, or an instance of a QueryTable whose properties are already set. Because the function of a QueryTable is to import tabular data from an external source, it needs somewhere a location to place the resulting payload. The Destination parameter is for this purpose. This parameter is a reference to a cell range where the imported data is placed. Because a QueryTable is tied to a worksheet, the range must be within the same sheet as where the QueryTable exists. The final parameter, Sql, is optional and designates a SQL-syntax query string when creating a QueryTable to an ODBC source. Web queries do not use this parameter.

These values are the only ones that are absolutely required to set any Web query. The actual connection, however, is made when calling the Refresh method of the QueryTable object. When this method is called, Excel connects to the source specified for the Connection property of the QueryTable. Keep in mind that Excel cannot broker complicated connectivity scenarios for the user. For example, if the URL to access requires a user to logon with specific credentials, the Web server prompts the user. However, Excel cannot negotiate this interaction, and the Web query fails. You should handle such results in your code rather than having various alerts unexpectedly appear to the user. The Application object has a DisplayAlerts property that can be set to FALSE during the time you are attempting to run the Web query to avoid these unexpected alerts. If the QueryTable successfully connected to the target source and retrieved data, it returns a value of TRUE, otherwise it returns FALSE. Sometimes, the Web query may retrieve more than the limit of rows allowed in an Excel spreadsheet. To determine whether this error occurred, look at the FetchedRowOverflow property of the QueryTable object. If the number of rows was too large, the property value will be TRUE. In the following example, you can see how the QueryTable is created, and how things are handled if too many rows are returned.

Dim qtsQueries as QueryTables
Dim qtQuery as QueryTable
Set qtsQueries =ActiveSheet.QueryTables
Set qtQuery = qtsQueries.Add _
  ("URL;http://server_name/site_name/my_data.htm", _      
  Application.Range("C11:D11"))
With qtQuery
  .Refresh
  If .FetchedRowOverflow Then
    MsgBox "Your request returned too many results. " _
    & "Please refine your search.", vbInformation, "Result Error"
  End If
End With

You should configure other settings on the QueryTable object so that the data imports in an expected manner. While you should consult the object model documentation for all of the properties and methods of the QueryTable object, some more commonly used members are worth noting here. To make things easier, the table organizes the members into groups for reference. The first group is a list of members that connect to data, identify which data to import, and reconnect to data sources refresh data. The second group is a list of members that format imported data.

Connections and Identifying Target Data
Type Name Description
Property BackgroundQuery Set to TRUE to allow the Web query to run periodically and refresh the data import in the background.
Property QueryType Use to determine the type of QueryTable object used. Web queries only use xlWebQuery
Property WebDisableRedirections Set to FALSE by default to allow Web pages to redirect the browser to a more Web query-friendly version so that non-Web query requests can have more complex markup in the page.
Property WebSelectionType Set to determine whether an entire Web page, all tables on the Web page, or only specific tables on a Web page are imported into a query table. The possible values are: xlEntirePage, xlAllTables (default), xlSpecifiedTables
Property WebTables Set if the target Web page contains tables. To retrieve only selected tables set this value as a comma-delimited list of table names or table index numbers such as: "1,2,3" or "FirstTable, SecondTable, ThirdTable"
Property PostText Set to specify a string containing name/value pairs that can be accessed by code on the Web server.
Property Refreshing Gets TRUE if the Web query is running as a background process and is in process.
Property RefreshOnFileOpen By default, set to FALSE. Use to configure the Web query to reconnect to the target source and refresh the spreadsheet with data when the host workbook is opened.
Property RefreshPeriod Gets or sets an integer from 0 through 32767 that represents a value in the number of minutes. If BackgroundQuery value is set to TRUE, use this property to set the time interval between refreshes.
Property RefreshStyle Get or set to indicate how the worksheet should accommodate changes. Possible values are:

xlInsertDeleteCells- Insert or delete partial rows to match the exact number of rows required for the new recordset.

xlOverwriteCells- Do not add new cells or rows to the worksheet. Overwrite data in surrounding cells to accommodate any overflow.

xlInsertEntireRows- Insert entire rows, if necessary, to accommodate any overflow. Delete no cells or rows from the worksheet.

When periodically refreshing a Web query in the background, the result of the import may be different. You can add new rows, or delete existing ones.

Method Refresh Call this method to trigger the connection and data import process. Returns TRUE if successful; otherwise, FALSE. Only one parameter is passed to this method: BackgroundQuery, a Boolean value. Always set to FALSE for Web queries, as it is designed for query tables that connect to database source.
Method CancelRefresh Call to cancel connections to data sources. Use the Refreshing property to determine if the query is in process.
Event AfterRefresh Fires after the query table refreshes or is cancelled.
Event BeforeRefresh Fires before any refreshes of the query table, irrespective of how the refresh was initiated.
Formatting
Type Name Purpose
Property AdjustColumnWidth Read/write Boolean. Set to TRUE if the column widths automatically adjust for the best fit each time you refresh a query table. FALSE if the column widths do not automatically adjust with each refresh. The default value is TRUE.
Property PreserveFormatting Set to TRUE when any formatting common to the first five rows of data are applied to new rows of data in the query table; otherwise FALSE.
Property FillAdjacentFormulas Set to TRUE to copy formulas in columns to the right of the range where the target data is placed; otherwise FALSE. If the range shrinks because of a data refresh, any unused formulas to the right of empty cells are removed.
Property ResultRange Gets a Range object that represents the area occupied by the results of the Web query. Use this Range to gain programmatic access to the columns and rows of imported data quickly.
Property WebDisableDateRecognition Set to disable data recognition to eliminate ambiguity if imported data is ambiguous in that it could be a date or some other meaningful element, such as "1/3."
Property WebFormatting Get or set to indicate how much formatting to retain from a Web page when importing the page into a spreadsheet. Possible values include:

xlWebFormattingAll

xlWebFormattingRTF

xlWebFormattingNone (default)

Property WebPreFormattedTextToColumns Get or set to indicate whether data contained within HTML <PRE> tags in the Web page is parsed into columns when imported. Set to TRUE by default. This property applies to Web queries when importing from HTML.

Be assured that it is unlikely you will need to use all of these properties, methods, and events in every Web query. The QueryTable object has several additional properties and methods that are not relevant to Web queries, so they are not included here.

Returning to our solution, the target data are contained on an ASP-based Web page. That page actually retrieves the data from SQL Server based on the authenticated identity of the user. The retrieved data are actually in XML format and then transformed using XSL to become simple HTML. You can further restrict the results retrieved from SQL Server by using values in posted form variables sent to the ASP page. These values are the date range values the user supplies in the spreadsheet. The internal workings of these Web and database operations are beyond the scope of this paper. However, the fact that the Web page expects form variables means that the VBA code must accommodate them in the QueryTable definition before it attempts to connect to the data source.

Web Query Setup

The two main things you need to specify for any Web query are where to look for source data, and where to place the data in the spreadsheet after retrieval. Our scenario adds one other bit of complexity by requiring that the Web query provide some parameters, in this case a beginning and an ending date, so that it can refine scope of retrieved data. In our VBA IDE, we place our code in the Click Event procedure for a button embedded in the worksheet, cmdGetData.

This procedure launches the main code sequence to deliver the business solution. At the outset, the procedure contains a series of variable declarations and some code that defines the source of the query, the destination range, and the parameters posted to the Web page.

  Dim qtQuery As QueryTable
  Dim qtsQueries As QueryTables
  Dim nmDefinedNames As Names
  Dim rngQueryParams As Range
  Dim strPost As String
  Dim strParamName As String
  Dim i As Integer
  Dim strTargetURL As String
  
  Set qtsQueries = ActiveSheet.QueryTables()
  Set nmDefinedNames = ActiveWorkbook.Names
  strTargetURL = "URL;" & nmDefinedNames("Server").RefersToRange.Text & _
    cboReportSource.Text & nmDefinedNames("Page").RefersToRange.Text
  Set qtQuery = qtsQueries.Add(strTargetURL, _
    Application.Range("ReportSheet!C11:D11"))
  Set rngQueryParams = nmDefinedNames.Item("Parameters").RefersToRange
  For i = 0 To rngQueryParams.Rows.Count - 1
    strParamName = rngQueryParams.Rows(i)
    If i = 0 Then
      strPost = strParamName & 
        nmDefinedNames.Item(strParamName).RefersToRange.Text
    Else
      strPost = strPost & "&" & strParamName & 
        nmDefinedNames.Item(strParamName).RefersToRange.Text
    End If
  Next i

The first variable used, qtsQueries, acquires a reference to QueryTables collection from the worksheet like this: Set qtsQueries = ActiveSheet.QueryTables(). Then, add a new QueryTable instance to the collection and its reference is returned to the qtQuery object variable. In so doing, two parameters are passed to the Add method of the QueryTables collection, the target URL, contained in a variable, and a hard-coded reference to the destination range. The strTargetURL variable is constructed by concatenating values from the user interface and from named ranges on a hidden worksheet. Figure 3 shows the hidden worksheet with its named ranges. Storing information in this way is a common technique in Excel programming, and its principal benefit is that by adding our change settings here, the overall solution can adapt to changing needs.

Aa140050.odc_webqueries03(en-us,office.10).gif

Figure 3. Use a hidden worksheet to hold settings for the Web query

To make it easier to gain access to the defined names in the workbook, the nmDefinedNames variable gains a reference to the Names collection in this way:

Set nmDefinedNames = ActiveWorkbook.Names

The target URL is comprised of the name of a Web server, retrieved from a named range, the name of a Web site, taken from user selection in a combo box embedded in the user's worksheet, and the name of a Web page, also taken from a named range on the hidden worksheet.

The next thing to do is to load the QueryTable definition with a list of parameters in a named/value pair format. This is done by looking at another named range on the hidden worksheet. This named range, called Parameters, can have no rows or many rows containing the names of parameters to pass to the Web page. Because the number and names of parameters can change, loop through the collection of rows in the range to get the names of parameters. Each parameter is listed as text, not as its own named range, and all of the rows in the list are part of the Parameters range. These text names are matched by actual named cells back on the user's worksheet. For example, the Web page expects a BeginDate parameter. We also find a named range with the same name. By separating them in this way, we can distinguish named ranges that refer to parameters for the Web query and those that serve other purposes in the solution. Each of the names and their corresponding values in the user's worksheet are placed in a concatenated string that becomes the value of the PostText property setting of the QueryTable object.

Set rngQueryParams = nmDefinedNames.Item("Parameters").RefersToRange
  For i = 1 To rngQueryParams.Rows.Count
    strParamName = rngQueryParams.Rows(i)
    If i = 1 Then
      strPost = strParamName & 
        nmDefinedNames. Item(strParamName).RefersToRange.Text
    Else
      strPost = strPost & "&" & strParamName & 
        nmDefinedNames. Item(strParamName).RefersToRange.Text
    End If
  Next i

With the basic information provided for the QueryTable definition, you can apply further settings. Because we are going to configure multiple properties of the same object, it is best to place the settings in a With . . . End With block to avoid repeatedly calling the object.

With qtQuery
    .Name = "ReportData"
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .PostText = strPost
    .AdjustColumnWidth = True
    .WebSelectionType = xlEntirePage
    .WebFormatting = xlWebFormattingNone
    .WebSingleBlockTextImport = True
    .WebDisableDateRecognition = False
    .Refresh BackgroundQuery: = False
    If .FetchedRowOverflow = True Then
      Err.Raise vbObjectError + 1001, "QueryTable", "Too many rows"
    End If
  End With

The properties specified in this example are explained earlier in this article. The PostText property is set using the string that was assembled with the parameter names and values from the worksheet. In addition, the FetchedRowOverflow property is used. If the results are too large, a custom error is raised so that the problem can be properly handled.

Now that the mechanics of retrieving the data are in place, you must place the data in the spreadsheet. However, the dynamically retrieved data also becomes the source for a chart on the user's worksheet, a chart that is equally dynamic in automatically expanding, contracting, and in general responding to changes in the source data.

Dynamic Chart Data

A chart in Excel uses the data points from other cells as its source, some for the X-axis, and some for the Y-axis, as with a two-dimensional line chart includes as part of this business solution. A dynamic chart allows you to specify named ranges for source data rather than specific cells. Those named ranges refer to cells that contain data, excluding those that do not contain data. Thus, as users add new rows, the array of values from the named range reflects the changes. As users remove rows of data, the array of values reflects this change as well.

To make all of this work, you must use several key Excel functions as shown an explained in the following table.

Name Purpose
OFFSET This function returns a value from a cell whose location is in relation to another cell. You pass to this function the location of the base cell whose location is known and then specify how many rows and columns away the target cell is. =OFFSET(ReportSheet!$E$11,1,0,COUNT(ReportSheet!$E$12:$E$2007),1)
COUNT This function counts the number of cells that contain numbers. You pass to the function a range of cells that you want searched.
TEXT This function converts a value to text. You pass to the function the value you would like converted, and you can optionally specify number formats for the conversion if desired.
NOW This function returns the current date and time.

In our business solution, two named ranges in the spreadsheet refer to cells containing data points for the chart. One, called Chart_X_Values, is used to refer to data points on the X-axis, and the other, Chart_Y_Values, is used for data points on the Y-axis. The definition of these two defined names is as follows:

Name Formula
Chart_X_Values =OFFSET(ReportSheet!$C$11,1,0,COUNT(ReportSheet!$C$12:$C$2007),1)
Chart_Y_Values =OFFSET(ReportSheet!$D$11,1,0,COUNT(ReportSheet!$D$12:$D$2007),1)

Because the sample Web page returns two columns, one with dates and the other with sales data for each date, these columns provide data points to the chart. The chart begins using data one row down from the titles of the column names, which are in row 11 and in columns C and D. The OFFSET function sets the starting point at cells C12 and D12 respectively for dates and sales. Then, the COUNT function gathers all rows in the respective columns that have values. The solution arbitrarily sets a maximum of 2007 rows for the data. Because the solution requires that users restrict date ranges in the worksheet, you can adjust these values as needed.

Programmatically, the two named ranges that are configured in the worksheet are set for the chart after you import the data using the QueryTable object. Before doing so, you should name the chart. While you could hard-code the name of the chart, to make the solution more adaptable, you can reference a cell that contains the name to use. The cell can use a dynamically assigned name or a simple value. To set a value, select a target cell that includes a column heading for data returned from the Web query. Then, set a value in the cell with this formula: ="Sales"&TEXT(NOW(),""). By adding this formula, the chart updates each time the sheet recalculates. By using the name here as the name for the chart, the name of the chart dynamically changes according to the value in the cell. The code for setting up the chart is as follows:

    Range("ReportSheet!D11").Select
    ActiveCell.FormulaR1C1 = "=""Sales""&TEXT(NOW(),"""")"
    ActiveSheet.ChartObjects(1).Activate
    With ActiveChart
      .ChartType = xlLine
      'Set dynamic retention values
      .SeriesCollection.NewSeries
      .SeriesCollection(1).XValues = "=ReportTool.xls!Chart_X_Values"
      .SeriesCollection(1).Values = "=ReportTool.xls!Chart_Y_Values"
      .SeriesCollection(1).Name = "=ReportSheet!R11C4"
    End With

This code activates a chart already embedded on the page. It also sets the type of chart and identifies the data points for the X and Y values. Finally, the code specifies a name for the chart.

Final Touches

To make the solution more usable, you can also add some basic error handling. The following is a truncated example of the entire event procedure to retrieve the data and configure the chart, with error handling:

Private Sub cmdGetData_Click()
 On Error GoTo HandleError
  Dim qtQuery As QueryTable
  . . .
ExitFinally:
    Set qtQuery = Nothing
    Set qtsQueries = Nothing
    Set nmDefinedNames = Nothing
    Set rngQueryParams = Nothing
    Exit Sub
HandleError:
  If Err.Number = vbObjectError + 1001 Then
    MsgBox "Your request returned too many results." _
      & "Please refine your search.", vbInformation, "Result Error"
      Err.Clear
  ElseIf Err.Number <> 0 Then
    MsgBox "There was a problem retrieving the data you requested from the following URL:" & vbCrLf _
      & strTargetURL & vbCrLf _
      & "Please contact your systems administrator.", vbInformation, "Data retrieval error"
      Err.Clear
  Else
    GoTo ExitFinally
  End If
End Sub

Another useful measure is to let the user clear the report information so that the user can generate an entirely new report. To do this, you want to introduce another embedded button on the worksheet, cmdClearCriteria. The Click Event procedure for this button clears all of the report data, clears dates from the search criteria, removes any chart settings, and finally removes the QueryTable definition from the worksheet. The following code example demonstrates this:

Private Sub cmdClearCriteria_Click()
  Dim strEnd As String
  Dim lngLastRow As Long
  Dim nmDefinedNames As Names
  Dim rngQueryParams As Range
  Dim strParamName As String
  Dim i As Integer

  lngLastRow = ActiveSheet.Range("D65536").End(xlUp).Row
  strEnd = "C11:D" & CStr(lngLastRow)    
  ActiveSheet.ChartObjects(1).Activate
  ActiveChart.SeriesCollection(1).Delete
  Application.Range(strEnd).ClearContents
  Set nmDefinedNames = ActiveWorkbook.Names
  Set rngQueryParams = nmDefinedNames.Item("Parameters").RefersToRange
  For i = 1 To rngQueryParams.Rows.Count
    strParamName = rngQueryParams.Rows(i)
      nmDefinedNames.Item(strParamName).RefersToRange.Value = ""
  Next i
  For i = 1 To ActiveSheet.QueryTables.Count
    ActiveSheet.QueryTables(i).Delete
  Next i
End Sub

After the obligatory variable declarations, this code example determines the number of rows that are populated in the worksheet because of the Web query. After determining the last row, the code example activates the chart and deletes the Series instance from the SeriesCollection object. The example then uses the discovery of the last row to clear the contents of all cells containing imported data. After this, it clears the cells where the user specified a date range and then removes the QueryTable definition from the worksheet. This process prepares the worksheet for the next data request, and allows the Web query process to begin anew.

Conclusion

One of the most common requests any Office developer receives is to help in the creation of reports in Excel. To ease the burden, it is wise to find ways of getting data into Excel as efficiently as possible. You can import existing Web pages rich with data or raw XML directly into Excel using its powerful Web query feature. This feature, represented by the programmable QueryTable object, not only can ease the task for users in creating ad hoc reports, but it can also be part of more planned business solution you develop.

This article described the properties, methods, and events of the QueryTable object and then showed how can be used to deliver a more intelligent solution for users. This solution also described how to let the powerful charting capabilities of Excel benefit from the dynamic power that Web queries offer. Besides enabling users re-query and target different data, this type of solution lends itself very well for adaptation to other needs and reports.

One of the limitations of this solution is its ability to consume unformatted XML. A forthcoming companion paper to this one discusses further customizations to allow you to achieve the same functional results demonstrated here with unformatted XML with the target application Excel 2003, part of the new Microsoft Office 2003 System.