This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.
MIND

Accessing Recordsets over the Internet
Dino Esposito

Code for this article: Cutting0300.exe (39KB)

I
n my humble opinion, recordsets are one of the best software inventions of the last millennium. In general terms, a recordset is a sort of abstract data type, holding data as well as functions to deal with them. Other data access technologies use slightly different names for recordset-like objects, such as the RDO resultset, the OLE DB rowset, or the dynaset used by DAO and Oracle. Whatever you call it, in every case you have a data type implemented through a collection of COM interfaces. This object encapsulates the records and provides a powerful programming interface for manipulating them.
      Until the recent introduction of ActiveX® Data Objects (ADO) 2.x, a recordset was bound to the records output by a database query. With the release of Microsoft® Data Access Components (MDAC) 2.0 over a year ago, OLE DB was implemented as the core technology for data access in the Microsoft model. ADO became the principal object model for making OLE DB accessible from within COM automation and script environments such as Visual Basic®, HTML, ASP, and Windows® Script Host.
      With ADO recordsets you can achieve the same results (and more) that you used to get with RDO resultsets or DAO dynasets, but not necessarily in the same way. RDO and ADO are similar object models; the differences are mostly concentrated in the recordset object itself. ADO is less hierarchical than RDO and puts more objects at the same logical level in the hierarchy. For example, with ADO you don't need to explicitly create a connection before issuing a command to a data source (as required by RDO).
      There are deeper differences between ADO and DAO. For more information on this topic, have a look at "Porting DAO Code to ADO with the Microsoft Jet Provider," by Alyssa Henry, in the Technical Articles section of MSDN Online (http://msdn.microsoft.com/library/techart/daotoado.htm).
      In this column, I'll explain why recordsets can be considered a general-purpose data container for any application and how you can take advantage of recordsets over the Web. In particular, I'll follow up on some issues with Remote Scripting (RS) that I discussed in the January 2000 issue of Microsoft Internet Developer (http://www.microsoft.com/mind/0100/cutting/cutting0100.asp). I'll show you how to convert a recordset to a JavaScript object, which is much easier to use than an array and can be sent back to the client through RS. Then I'll focus on specific features provided by Microsoft Internet Explorer 4.0 and greater that help you manage recordsets between the browser and the Web server. These technologies include Remote Data Services (RDS) and data binding. See the sidebar, " VBScript and Remote Scripting," for details about using VBScript with RS.

Creatable Recordsets


      ADO 2.x provided two notable new data access features: full support for any sort of data provider and the introduction of the principle that recordsets are creatable objects. ADO 2.x is built on top of OLE DB and is capable of accessing any sort of data that is wrapped and exposed by an appropriate OLE DB provider. The relationship between ADO and OLE DB has an important consequence: through ADO you can access any sort of structured (and with MDAC 2.5, semistructured) information including e-mail, directory listings, LDAP directories, and custom data. Whenever you employ ADO, you're using recordsets to carry data back and forth. Recordsets always contain data, but this data does not necessarily match a table's records.
      Making the recordset structure externally creatable means that you can create a new recordset object anywhere in your code, and you can use it without a connection to a database. I'm not just talking about disconnected recordsets; I mean connectionless, custom-shaped recordsets. A disconnected recordset is just what the name implies: a recordset that you disconnected from a data source. A disconnected recordset supports a static, client-side cursor that automates downloading the records on the client side. With RDO you can have disconnected recordsets, but you can't have connectionless recordsets.
      A connectionless (or custom) recordset is a recordset whose fields have been defined on the fly by the application to match the structure of the information you want it to manage. Previously this capability was reserved for the data object model, such as ADO 1.x, RDO, or DAO.
Figure 2 Recordset Output
Figure 2 Recordset Output

       Figure 1 shows some VBScript code that creates a brand new recordset that has no relationship to an OLE DB data source. This code generates a recordset that reads drive information through the FileSystemObject scripting object. Figure 2 shows the output.

A General-purpose Data Type


      You can consider an ADO recordset a sort of super array, a real general-purpose data container. A recordset is much more manageable than an array since it provides methods to scroll, filter, bookmark, and sort data. In addition, a recordset can be persisted to and from disk in several formats including the native datagram format (ADTG) and, as of ADO 2.1, XML.
      When transmitting recordsets over a network you can optimize the operation to save time and minimize trips. Let's assume that both the client and the server know how to handle the recordset object. Is there something that you can do to optimize the exchange of a recordset between two COM business objects? Recordsets support custom marshaling, which means that ADO defines a custom proxy/stub pair for the recordset object. When you have to transmit a recordset from the client to the server through the Update method, you can use the recordset's MarshalOptions property to tune how the marshaling will work. By default, all the records will be sent back. However, by setting


rst.MarshalOptions = adMarshalModifiedOnly
rst.Update
before calling Update, you limit the number of records transmitted to those that have been modified locally.
      Another property to consider carefully if you want to get the most out of your recordsets is CacheSize, which defines the number of rows that ADO will fetch and cache. If the required record is within the range of cached rows, reading it is much faster. The buffer is updated as soon as you move out from the current cache while scrolling the recordset. Choosing the right size for the cache is highly subjective. The default value is 1. You almost definitely want to increase this value to a level that makes sense for the data you're transmitting, proportional to the actual speed you experience between your client and server.

Recordsets and Remote Calls


      In most Web scenarios, you have the browser collect some data from the user and send it to the Web server. An ASP page on the server extracts the data from the Request object and performs any necessary data access. In this case, you aren't sending a recordset object back to the client; you're using its content to serve a brand new page to the browser. What travels over the network is just the contents of the recordset, not the object itself. When you need to transmit the recordset itselfâ€"in other words, when you want to marshal itâ€"both proxy and stub modules on each end of the connection must know how to handle recordsets. This will not be a problem if you use a Windows DNA schema and take advantage of MDAC 2.x to architect the data access procedures.
      Transmitting a recordset over the network becomes a problem if you employ cross-browser remote call schemes between the browser and the Web server. In my January 2000 Cutting Edge column in MIND, I explained how RS can be considered a (relatively) cross-browser solution that allows the client to get data from the server. Recordsets, however, aren't the type of data you can send back through an RS call for several reasons. First, you need the browser to support COM and, more importantly, you need MDAC 2.x to be installed locally. This isn't a problem with Internet Explorer, but what about clients running Netscape Navigator or HotJava? This is definitely a problem if you have to support multiple browsers, whether on the Internet or an intranet or even an extranet.
      The second, even more fundamental reason is that RS doesn't support objects other than ECMAScript objects, and recordsets are complex COM objects. It doesn't make sense at all to have an RS stub module capable of marshaling recordsets if the browsers don't support COM and ADO.
      The third reason is security. RS provides the same level of security as Java-language applets or <iframe> tags. That's why the implementation doesn't support compound objects, including COM objects. In addition, the RS server must be the same server that previously sent the page calling the remote object.

A JavaScript Recordset Object


       Figure 3 shows an ASP include file that transforms a recordset into a valid JavaScript object. If you import that file into a server-side ASP page whose methods are invoked through remote scripting, you can easily convert any ADO recordset into a JavaScript object. The clear advantage is that a JavaScript object can be smoothly serialized over the network by the RS stub and reach the browser safe and sound.
      In my January 2000 Cutting Edge column, I demonstrated how to convert a recordset into a rough array. JavaScript objects are much better than arrays since you can define methods and properties and model the recordset's fields as properties of the object. In Figure 3, the recordset is represented by an array of objects called Record. Each record has as many properties as the Fields collection of the original recordset object. All the records are grouped in the new Recordset object through a property called Items. As a result, a syntax like the following is perfectly valid:


empName.innerText = oRS.Items[0].LastName;
Here, oRS is an instance of this JavaScript object, and LastName is the LastName field of the first record. Let's see how to rewrite my sample applications to take this into account.
Figure 4 Record Display
Figure 4 Record Display

In Figure 4 you can see the new version of one of the sample apps. The employee information is now obtained and displayed through the code shown in Figure 5. Notice that the line



oRS = co.return_value;
automatically instantiates the new JavaScript object containing the recordset returned by the GetEmployeeInfo remote method. The server-side code that creates and sends the JavaScript's recordset is shown in Figure 6.
      The syntax to convert an ADO recordset to a JavaScript recordset is just as simple. It can be taken care of within the JavaScript object's initialization:


rst.Open(sql, "NW");
oRS = new Recordset(rst);
rst.Close();
return oRS;
The recordset is defined in the JSRECORDSET.ASP file included at the top of the page. Since the recordset is now a regular JavaScript object, the RS stub knows how to serialize it over the network.
      Of course, I've just reproduced the main features of an ADO recordset. No additional features, such as bookmarks, sorting, or filters, have been coded.
      It goes almost without saying that you can pack any ADO recordset this way no matter which data provider produced it. You can also employ custom OLE DB providers (such as the one that the Visual C++® 6.0 ATL wizard generates for you) to return a recordset of all the files in a given folder:


rst = new ActiveXObject("ADODB.Recordset");
rst.CursorLocation =  adUseClient;
rst.Open("c:\\*.*", "provider=FS.FileSystem.1");
oRS = new Recordset(rst);
rst.Close();
If you want JavaScript to tell you about the fields that form a recordset, use code like this:


var buf = "";
for (fld in oRS.Items[0]) {
    buf += fld + "\n";
}
WScript.Echo(buf);
      You enumerate all the properties of any of the records in the Items array. Working with this object, I noticed that the RS stub module has problems when it serializes the content of binary and memo fields. I don't know whether this is due to bugs in the RS stub or my code. For example, use the Northwind database from Microsoft Access to issue a "select * from employees" command. Among the returned fields you'll also have Photo and Notes, whose ADO types are adLongVarBinary and adLongVarChar, respectively. In this case, the RS engine returns a generic failure error to the effect that something went wrong during the trip to the client.
      Stepping through the RS code, I figured out that the problem was detected in the proxy module (rs.htm). However, this is not very informativeâ€"it's not clear whether the error was raised because something was corrupted or unintelligible for the proxy or because a wrong action was taken locally by the proxy itself. Then I remembered the golden rule: selecting all fields with * instead of using the fields list can slow the application. So I simply avoided loading Photo and Notes! You can see the new code in Figure 6.
      You can always extract the content of a memo field and return it as text. However, returning an image instead is another story. I think that the best solution for this would be to create a temporary GIF or JPEG file on the server and return the name of this file through the specific recordset field. On the client, you only need to set the src property of the <img> tag with the name of the server-side file. In a nutshell, you'll find that memos and pictures still require special treatment.

Remote Scripting, Frames, and Netscape


      In my January 2000 Cutting Edge column, I also presented a frame-based application using RS without Dynamic HTML to update the local page. Specifically, all portions of the UI that could change due to the action of RS were isolated into frames and updated separately. If you plan to use RS and recordsets to do this, there's an additional problem: how do you transmit the recordset from frame to frame?
Figure 7 Recordsets in frames work in both Netscape Communicator and Microsoft Internet Explorer
Figure 7 Recordsets in frames work in both Netscape Communicator and Microsoft Internet Explorer

       Figure 7 shows the user interface of the application. It has three different horizontal frames above and below the central yellow frame. The top frame, which includes the input form, invokes a remote method. This page receives the recordset asynchronously. However, the yellow page, rs_info.asp, is the real page that's refreshed with this data:



function refreshPage(co) {
  if (co.status == 0) 
    top.info.location = "rs_info.asp?info=" + 
                         escape(co.data);
}
How do you pass the recordset? You need to serialize it once more to put all the content into a string. This isn't particularly difficult, but the target page has to parse it and extract the information.
      While I was considering this, I remembered the co.data field. The Call object that RS returns to the client contains a ready-to-use version of the return value (the return_value field) as well as its original form (the data field). By original form I mean the string that the RS stub prepared and sent over the network. For a recordset this means exactly the string that represents itâ€"the same string the proxy evaluated through eval to produce an instance of the Recordset. Passing this co.data field to the frame is the most efficient way to transfer this sort of information.
      There's a pitfall, though. The co.data string is a pretty simple XML string with the following schema:


<RETURN VALUE>
<METHOD>
���
</METHOD> 
<RETURN_VALUE>
The body of the <METHOD> tag is the string you must evaluate to reproduce the recordset in the yellow frame. The RS proxy analyzes such a string without employing a full-fledged XML parser. It's quite simple after all. There's a function in Figure 3 called EvalRS that does this for you. The following code snippet shows how the rs_info.asp page processes its input and gets its own copy of the original recordset:


<%
  strText = Request.QueryString("info");
  if (strText == "") Response.End();
  oRS = EvalRS(strText);
  if (oRS.length == 0) {
     Response.Write ("No information available.");
     Response.End();
  }
%>
Of course, you need to import the JSRECORD.ASP file once more.


<!--#INCLUDE VIRTUAL="/ScriptLibrary/JSRECORD.ASP.ASP"-->
      With this trick you can pass the JavaScript Recordset object from frame to frame using RS. Figure 7 shows an application that uses RS to retrieve a recordset from the Northwind database and update its frames. It works fine under Netscape Communicator and Internet Explorer.

Remote Data Services


      If you can be sure that your users are running only Internet Explorer 4.0 or higher, you can take advantage of RDS, now part of MDAC 2.x. Basically, RDS is a collection of COM objects you can use to communicate with other COM objects via the HTTP protocol. Among other things, this enables you to invoke the methods of a COM object located on the Web server and establish a direct binding between tags on your page and fields of a recordset generated on the server. Invoking methods of a COM object on the server is a technique that looks like RS, but is completely based on COM. Binding tags to recordset fields is an old technique known as data binding. This technology is now fully integrated into the RDS package.

Calling a COM Object over the Web


      RS can call a JavaScript object. RDS, on the other hand, can call remote COM objects through HTTP and DCOM. Of course, the hottest part of this new technology is the ability to pass calls through HTTP (see Figure 8).
Figure 8 RDS Architecture
Figure 8 RDS Architecture

      From your client page you create an instance of the RDS.DataSpace object, then call its CreateObject method to obtain a pseudo-reference to a remote object located on the specified Web server.



rds = new ActiveXObject("RDS.DataSpace");
obj = rds.CreateObject("Object.ProgID", 
                       "http://expoware");
Actually, the reference you obtain is relative to a local object that RDS creates for you. This local object mirrors the functionality of the remote object. Each call you make to this proxy objectâ€"it's actually a proxy for the server-side business object you want to callâ€"originates a new HTTP connection, sets up a new instance of the real COM object, executes the method, and gets destroyed. All the necessary marshaling is handled by the DataSpace object.
      The nature of the mechanism for marshaling data, as well as the nature of the reference that DataSpace's CreateObject returns, depends on the argument you pass as the server. You can indicate not just an HTTP or HTTPS server, but also a LAN-connected machine or even the local machine. If you specify a machine name, DCOM will be the transportation layer for the calls you make. If the object you want to talk to is local, then no proxy or stub are created; you just get a pointer to the object. This doesn't affect the code you write.
      It's important to note that RDS produces a stateless model, much like HTTP. A business object that needs to be invoked remotely through RDS must be stateless because each call creates a new instance of the same component. For example, if you set a property before you call a method, you're actually talking to two different instances of the same object. The first instance is destroyed immediately after setting the property, so when you call the method you'll end up working in a brand new environment with no notion of the previous assignment and, more importantly, where the property doesn't have a default value. A golden rule for stateless programming is to avoid properties altogether, using only methods (with parameters, if necessary), and taking advantage of persistent media (files or databases) to store information that absolutely needs to survive the current session.

Getting Recordsets with RDS


      There are basically two ways to get recordsets with RDS. You can either use the system-provided RDSServer.DataFactory object or you can write your own full-blown custom business objects that return ADO recordsets. RDSServer.DataFactory is a general-purpose object that takes a connection string and command text and returns a disconnected recordset.


rds = new ActiveXObject("RDS.DataSpace");
df = rds.CreateObject("RDSServer.DataFactory", 
                      "http://expoware");
rs = df.Query("DSN=Northwind", sql);
At this point you can use the recordset to update the current page. If you need to submit changes, just use the SubmitChanges method. The recordset is sent back to the server where the stub sets up a new connection with the data source and submits the changes.
      The DataFactory issues a command to the data source and returns the resulting recordset. If you need to employ more complex logic, write your own components. Your custom business objects must be registered on the server via the following registry key:


HKEY_LOCAL_MACHINE
  \System
    \CurrentControlSet
      \Services
        \W3SVC
          \Parameters
            \ADCLaunch
              \YourProgID
Just create a new key under ADCLaunch using the progID of your business object as its name.

Data Binding


      Data binding in Internet Explorer 4.0 and above is the Web counterpart of data-bound controls in Visual Basic. Through data binding you can bring a recordset to the client and associate a specific HTML tag with a specific field of the recordset. To make it even more interesting, you can ask the browser to treat the value of the field as raw text or HTML. See the article "Data Binding in Dynamic HTML," by Rich Rollman (MIND, July 1997, or at http://www.microsoft.com/mind/0797/databinding.htm), for more information.
      With Internet Explorer 4.0, enabling data binding meant inserting an ActiveX control into your page and setting it with parameters like the HTTP server, the connection string, and the command text to execute. Today, data binding has been integrated with RDS and ends up being just another way of getting a recordset down to the client. You still have to use an ActiveX control if you want to link recordset fields with HTML tags.


<OBJECT style="display:none" id="rds" 
    classid="clsid:BD96C556-65A3-11D0-983A-00C04FC29E33">
</OBJECT>
      The control's progID is now RDS.DataControl. Its ID plays a central role in the linking process. In fact, nothing prevents you from getting a recordset this way without automatically binding tags with fields. You may use just one control to bind multiple tags.


<TR>
<TD><B>Title</B></TD>
<TD><SPAN DATASRC=#rds DATAFLD=Title></SPAN></TD>
</TR>
The DATASRC and the DATAFLD attributes define the source of the data for the specific tag.
      DATAFORMATAS specifies how to interpret the text. Possible values are "text" and "html". In MDAC 2.x, data binding uses the DataFactory control for the underlying data access and is fully compatible with any OLE DB provider. This means that you can also update the data source through data binding.

Which Technique is Best?


      There are similarities between RS and the RDS architecture. Both use a proxy and a stub, and both use an intermediate object to package and marshal the data over the Web. Each has a server-side object with a well-known layout that works in a stateless fashion. Apart from this skeleton, however, the internals are completely different. RDS is completely COM-based. COM is the component technology employed by remoteable components (as opposed to ECMAScript objects); a COM component is the intermediate DataSpace object (versus the RS Java-language applet); and COM is behind the marshaling techniques used to send and review data and recordsets.
      RS is the more portable technique for invoking remote functions. It is not portable to all possible platforms and browsers, but it does have fewer constraints than RDS. It just needs a browser with full Java language and ECMAScript support and a Web server running ASP. Also, it's distributed in source code (with the exception of the Java-language applet), and this is a great opportunity for you to improve or customize it if necessary.
      Data binding is the most specific technique of all. The number of things you can do with data binding is limited: you can bind an HTML tag with a field. It supports tables, but I don't think a true geek would use data binding on a regular basis because it doesn't give you the best control over recordset manipulation. I suggest that you consider data binding to write enhanced table controls (say, through Dynamic HTML scriptlets or behaviors) that occupy a well-known space on the page, fill up asynchronously, and can dynamically hide the complexity of the code that is used to define their content.
      The RDS DataSpace object is not asynchronous like RS, and is completely based on COM. This means that Internet Explorer 4.0 or above and MDAC 2.x are absolutely necessary to make things run smoothly.

Dino Esposito is a senior trainer and consultant based in Rome. He has recently written Windows Script Host Programmer's Reference (WROX, 1999). You can reach Dino at desposito@vb2themax.com.


From the March 2000 issue of MSDN Magazine.

MSDN Magazine Blog

MSDN Magazine Right Rail

14 Top Features of Visual Basic 14: The Q&A
Leading off the feature in the January issue of MSDN Magazine is Lucian Wischik’s fantastic look at Visual Basic .NET 14. As Wischik writes, the newes... More...
Wednesday, Jan 7
Big Start to the New Year at MSDN Magazine
Folks, things are hopping over here at MSDN Magazine. We are kicking off the new year with a pair of issues: Our regularly scheduled January issue and... More...
Friday, Jan 2

More MSDN Magazine Blog entries >


Receive the MSDN Flash e-mail newsletter every other week, with news and information personalized to your interests and areas of focus.