Netting C++

Describing the EEK!-osystem

Stanley B. Lippman

Contents

Hazards and Rewards in the Environment
Laying Out the World
Summing Up

In the previous installment of this column, I began building my mouse and mouse environment simulation called EEK! I discussed the two primary methods of parsing the environment XML file: the firehose and Document Object Model (DOM). This month I'll begin by crafting the XML document, then I'll show you another way of incorporating an XML file into the program: using a DataSet, found within the System::Data namespace of the .NET Foundation Class Library. The DataSet leverages the document as a set of relational tables that you can traverse using unique key values.

The first order of business is generating a description of the simulated environment to allow the program to reconstruct and manipulate that world in the course of the application. To do this, I need to determine how best to design the XML document to represent the terrain the mouse will inhabit in the simulation.

The environment I've created (if you're not familiar with it, please read the June 2007 column at msdn.microsoft.com/msdnmag/issues/07/06/NettingC) has a unique constraint. I need to be able to represent it at both macro- and micro-scales and to be able to slide up and down examining the state of each simultaneously—that is, within the Tick of the simulator. For example, at the micro level, I want to track the movement of chemical molecules within the environment (the smell of different entities, such as food or a cat). At the macro level, I need to track the resulting progress and actions of the mouse. A good part of a mouse's behavior is driven by chemical molecules that trigger associated program functions because, unlike humans, it depends on smell, not sight, for exploring its environment. I call this a Scalable Environment Map, or SEM.

Hazards and Rewards in the Environment

The simulated environment itself is just a mapping, although it contains both hazards and rewards. It represents a contiguous area that the mouse moves through, encountering memorable entities such as nesting sites, food, burrs, electrical shocks (e-shocks), scents, and so on (see Figure 1). These encounters between the mouse and the environment literally color the environment.

Figure 1 Neutral World Tile Attributes

Figure 1** Neutral World Tile Attributes **(Click the image for a larger view)

In the simulation, the experience of the mouse in exploring its environment is "remembered" by maintaining a personal map. In that map, which begins as a blank, a color represents how the mouse experienced that portion of the environment (its interactions with the entities it finds) and allows it to identify and value subsequent areas of the environment by recognizing similar patterns. There are five colors associated with four valuations.

Green and dark blue indicate a rewarding experience—food and drink, a nesting site, and so on. Bright blue denotes fear—some terrifying encounter (in this case represented by simulated electrical shock). Red signifies anger or frustration—some obstruction to the mouse's goal-seeking (embodied in what I call burrs). And orange represents a neutral experience.

One panel of the simulation's display shows the real-time coloring of the environment as it unfolds. Because the environment is deliberate, the resulting maps and ability of the mouse to reach its goals should be predictable based on its configured nature.

In practical terms, this means there is one "truth" map and one subjective map for each moving (sentient) entity in the simulation. The truth map is built from the XML document and represents the physics of the simulated environment. Each subjective map is built up by exploration and colored by personal experience. It is always a subset of the "truth" map. The initial quest driving the mouse is to mark off a personal territory through its subjective map, which contains food and water and a nesting site, and which it can defend. Until I populate the simulation with multiple sentient entities competing for territory, the burrs and e-shocks of the environment will represent hazards and constraints. Figure 1 gives a sense of how a world tile of the environment is populated.

The environment is separated into tiles that are connected by from one to four "doortos" that represent entrances and exits available to the mouse. The simulation begins in an initial chamber that has three doortos representing alternative paths. This chamber is "unpleasant" to a mouse: it is without ground cover, totally exposed with an intense white light. For performance reasons, the simulation only keeps the tile the mouse is currently exploring plus all tiles connected to that tile by a doorto. This becomes the working set (see Figure 2).

Figure 2 World Tiles in Memory

Figure 2** World Tiles in Memory **

Laying Out the World

The elements that need to be represented within the XML document consist of the set of titles, the doortos that connect them, and the entities that occur within them:

<element name="tile" ...
<element name="doorto" ...
<element name="burr" ...
<element name="e-shock" ...

The first question is to determine how to relate the elements to each other. Each element of its kind is unique in its attributes, yet echoes aspects of earlier instances the mouse has possibly encountered. (Each time it encounters an element that "reminds" the mouse of an earlier experience, the color of the earlier experience transfers to the new element. This speeds up the mouse's reactions to its environment over time.)

One way to organize the XML is to nest each element description within the tile it occurs in. In Figure 1, this would mean the two burrs, the two scents, the e-shock, the nesting site, and the food and water would all be described within the tile description. This way, by parsing the tile definition, you have a complete description of all the entities within it as well. It also makes the description complicated and the document itself difficult to apprehend by a human reader.

Instead, the document groups each separate entity—first, all the tiles, then doortos, then burrs, and so on. Each entity within the tile is represented as a unique ID, plus a location within the tile. That is, the elements are presented within the document as a set of tables associated by keys.

A second motivation for not defining elements within the tile they occur in is the dual citizenship, so to speak, of doortos. A doorto connects two (and only two) tiles. It does not really belong inside either. It's similar to a design problem that partially motivated the friend mechanism in C++. I create a Matrix class and a Vector class. Now I create a multiplication operator to multiply a Matrix by a Vector. Where do I place it? It doesn't belong in either; however, for performance reasons, I make it a friend to both. That's how I think of doortos; they don't really belong to either of the tiles they connect.

As with most design solutions, this one presents a problem: how to connect a doorto to a tile, or a burr, or scent, or any other entity. I do this by matching uniqueIDs across the two sets of entries. As it happens, this mirrors the representation of a relational database, and I can leverage the database support within the System::Data namespace of the Foundation Class Libraries of the Microsoft® .NET Framework, as I indicated earlier.

A DataSet represents a collection of one or more DataTable instances. In this case, the set of tiles represents the main table. Each environment entity, such as burr or doorto, is stored as an additional DataTable. For each attribute within the tile, I retrieve its uniqueID and cross-reference to the appropriate associated table.

Converting the XML representation into a DataSet is actually quite simple. An XML document and a relational database share a special relationship, in a sense similar to the relationship between water and ice. You can think of them as alternative forms of the exact same data, each suitable for different activities.

A DataSet has a ReadXml method that takes an XML document path name and populates the DataSet object. For example,

DataSet^ ds = gcnew DataSet;
ds->ReadXml( eek_world_xml_path_string );

Although the DataSet object now holds the XML document as a set of DataTables, it does not as yet recognize the relationship between the tables. It needs to have that spelled out. This is done by adding a DataRelation property to the DataSet. The DataRelation defines a relationship between two tables based on a shared column, which represents the element's unique ID, or key.

The Tables property of the DataSet class returns a DataTableCollection object that contains the set of DataTable objects associated with the DataSet. To access a specific instance, you can either index it positionally,

enum class tableIndex { 
    tile, doorto, burr, e-shock, ...
};

DataTable^ world_tile = ds->Tables[tile];

or index it associatively using a string representation of the table you want to access:

DataTable^ world_tile = ds->Tables[tile];
world_tile = ds->Tables[ worldTile_table_string ];

There are a number of properties associated with the DataTable, such as the TableName, Rows, Columns, and PrimaryKey:

// returns the name of the table ...
String^ tabName = world_tiles->TableName;
DataRowCollection^    drows = world_tiles->Rows;
DataColumnCollection^ ccols = world_tiles->Columns;
int countRows = drows->Count;
int countCols = ccols->Count;
array< DataColumn^ >^ keys = world_tile->PrimaryKey;

A primary key is a column guaranteed to contain a unique value for each record entry, which also serves as the identity of the record within the table. (A primary key can also be a set of columns. This is why the PrimaryKey property returns an array of DataColumn objects rather than just one.)

In the tile table, for example, the doorto attribute holds the uniqueID of the doorto. In the doorto table, the primary key for organizing the doorto entries is that key.

I identify this relationship by defining a DataRelation class object linking the columns of the two tables that hold key ID value:

DataRelation^ doorto = gcnew DataRelation("UniqueID",
    ds->Tables["worldTile"]->Columns[ doortoID_string ],
    ds->Tables["doorto"]->Columns[ doortoID_key ]);

The first column is referred to as the parent. Its value is used to retrieve the matching entries in the second column, which is referred to as the child. We add the relationship to the DataSet:

ds->Relations->Add(doorto);

In this way, rather than explicitly searching the doorto table for the explicit ID, I instead use the DataRelation object to perform the retrieval.

Summing Up

One of the most productive aspects of .NET is its rich toolset. If I become space-constrained as our environment grows in size, I am best served by the firehose XML support. Otherwise, DOM provides the most data-rich XML representation. In this case, I exploit the relational aspect of the document's layout by reading it into a DataSet. And these solutions work across all .NET-compliant languages. As I've repeatedly said, it's not about languages anymore; it's about libraries. In the next column, I'll begin implementing the mouse.

Send your questions and comments for Stanley to purecpp@microsoft.com.

Stanley B. Lippman is best known for his work on C++ in the 1980s and 1990s with its inventor, Bjarne Stroustrup, at Bell Laboratories, together with his introductory textbook, C++ Primer (Addison-Wesley, 2005), now in its 4th Edition. Stan also worked as Architect with the Visual C++ team at Microsoft in the invention of C++/CLI, which is the focus of this column.