
Architecting Server Resources for Efficient Smart Client Consumption

.NET Compact Framework 1.0
 

Jim Wilson
JW Hedgehog, Inc.

February 2004

Applies to:
   Microsoft® .NET Compact Framework 1.0
   Microsoft Visual Studio® .NET 2003
   Microsoft SQL Server™ CE
   Microsoft SQL Server 2000

Summary: As anyone who has ever built a smart client system knows, the code running on the smart client device is only part of the overall system. Much of the work actually occurs on the servers that support the system. This article presents solutions and guidelines for successfully building server systems that efficiently support smart client applications. (15 printed pages)


Download 1-2004 Efficient Server Resources.msi from the Microsoft Download Center.

Contents

Introduction
Server Resource Accessibility Considerations
Efficient Database Design
Efficient Server Processing
Conclusion

Introduction

As anyone who has ever built a smart client system knows, the code running on the smart client device is only part of the overall system. Much of the work actually occurs on the servers that support the system. Whether as a database or as a processing engine, servers must behave and communicate efficiently. Designing a successful smart client system requires giving as much attention to the server design as to the smart client itself.

Server Resource Accessibility Considerations

Obviously, there is no single set of rules for building an efficient server system, but there is a common set of considerations: network connectivity, data access patterns, and the APIs used to access the server.

Network Connectivity

For smart clients, connectivity will virtually always be via wireless network. Whether by WiFi (802.11b), CDMA, or GPRS, wireless networks provide notably less throughput than traditional 100 Mb wired networks, with WiFi operating at about 10% of the throughput of a wired network, CDMA at about 0.6% (bursts may reach about 2%), and GPRS providing even less at just under 0.2%. Wireless connectivity is dependent on availability of the signal and therefore may be unavailable for long periods of time and is subject to abrupt termination of service.

What this means to our server design is that resources must be accessible via brief and efficient communication. The amount of data passed back and forth must be minimized, and systems must be architected to service requests in a single round-trip, avoiding multi-step, stateful conversations.

Data Access Patterns

Memory and resource limitations normally require that smart client applications carry only a subset of the data that may be available. These applications need to efficiently retrieve only that information in which the user is interested. This will normally be data for a given geographic region or timeframe.

Smart client applications frequently spend large periods of time disconnected from the server. Server resources must be organized so that clients can efficiently identify that data which is either new or has changed.

APIs

Most smart client applications will use either Remote Data Access (RDA) or Web Services to access the server, as both are firewall friendly and work effectively in limited-connectivity environments. Resources should be organized to work efficiently with these APIs.

Caution   The .NET Compact Framework offers a SQL Server ADO.NET Provider that gives smart clients full-featured access to SQL Server. It is, however, of limited utility, as it relies on the TDS protocol, which is not firewall friendly and performs poorly in loosely connected environments. Designers of server resources should avoid assuming that smart clients will be able to reliably exercise this level of access.

Efficient Database Design

Smart clients provide mobile, untethered access to central data resources. These systems depend on the ability to efficiently download meaningful subsets of enterprise data. Properly organized, servers are able to provide this data easily and efficiently. Improperly organized, servers can easily become a bottleneck, making it difficult to successfully deploy smart client applications.

Track and Index Geographic and Time Data

During database table design, the focus is of course on business use cases and the associated access patterns. Smart client applications add two additional dimensions to the problem: locality and time.

The SQL Server "Pubs" sample database is an excellent example of a database that, although well architected for the business problems for which it was designed, lacks effective smart client access.

Note   For a listing of the tables in the Pubs database, see the Pubs Sample Database.

This is a problem best demonstrated by example. Consider two smart client applications: A publisher's mobile sales force application and a human resources application used to track new publisher employees.

The mobile sales force application represents a common problem of needing to carry only that data which applies to the user. A particular sales representative is responsible for a fixed geographic region (probably a list of zip codes). The application runs on a Pocket PC and manages the list of stores to visit.

The human resources application represents the problem of needing only data for a small timeframe. In this case, a Pocket PC application is used by a member of the human resources staff who is responsible for periodically traveling to the various publishers and training any new employees hired since the previous visit.

The Mobile Sales Force Problem

In designing the mobile sales force application, the "stores" table seems to have everything the application needs: store id, name, address, city, state and zip. Each salesman's Pocket PC application need only download the appropriate list of stores by running a query similar to the following.

Select * from stores where zip in ('98056','90019')

With the "stores" table's current configuration, chances are the sales representatives will start complaining almost immediately that initializing the application is too slow. The problem is the "stores" table index. The table's only index is on store id. As a result, every salesman's query triggers a sequential scan of the entire stores table. An index on the zip code column eliminates the problem.
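The fix is a single statement against the standard Pubs schema (the index name here is illustrative):

```sql
-- Index the zip column so each sales rep's query becomes an
-- index seek rather than a full scan of the stores table.
CREATE INDEX ix_stores_zip ON stores (zip)
```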

The New Hire Problem

In this application, the only data required is the list of employees who have been recently hired. Again the database already has all of the data we need, this time in the "employee" table: employee id, name, job information, publisher foreign key and employee hire date. To download the new employee information, the smart client application executes a query like the following.

Select * from employee where hire_date > '01-01-1994'

Again, the application is slow to initialize and the more employees the publishers hire, the slower initialization becomes. Again the problem is the index. The only indexes are on the employee id and name columns, so we again have a sequential scan of the entire table. Adding an index to the hire date column makes the query efficient.

Unfortunately in this case, adding an index doesn't fix all of our problems. The ultimate goal of this Pocket PC application is to provide human resources with the information necessary to visit all publisher new hires. We don't have the necessary data! The "publishers" table doesn't provide street address or zip code information; only publisher id, name, city, state and country.

The user is either going to have to use some other system to get this information (like the local phone book), or the "publishers" table will need to be expanded and have this additional data loaded, a process that would have been much easier at the time the database was originally developed.

A Common Scenario

Both of these examples demonstrate a very simple, yet important issue: smart client systems rely on time and location information. Server systems must carry complete time and location information and that information must be organized for efficient search and access.

Make Identifying New and Changed Records Easy

By their nature, smart client apps spend only a fraction of the time connected to the server. When connected, the application needs to: 1) easily determine if there is any new or changed server data and 2) efficiently access the new or changed server data.

Note   SQL Server Merge Replication provides the ability to automatically keep SQL Server and SQL Server CE tables synchronized, which in some cases is preferable to manually managing data synchronization. It is a powerful tool that must be used appropriately to avoid data and scalability issues. The specifics of architecting a Merge Replication solution will be discussed in an upcoming article.

Tracking changes requires that each row carry the information necessary to know when it was created or changed. This data must be easily searched and always up-to-date. There are three column types that may provide a solution.

  1. Identity columns
  2. DateTime column
  3. TimeStamp column

Identity Columns

An identity column provides every record with a guaranteed table-wide unique value. In the case of SQL Server, each newly added record is automatically assigned a new identity value when inserted. The value is guaranteed to be larger than that of all previously added records.

Note   For information on creating identity columns with SQL Server, see Identity Property.

Identity columns provide an easy and efficient mechanism for identifying newly added records. The following creates a table with an identity column named "idcolumn."

CREATE TABLE ExampleData (
   idcolumn int IDENTITY (1, 1) NOT NULL ,
   col1 nvarchar (50) NOT NULL ,
   col2 decimal(18, 0) NOT NULL 
) 

The first record inserted will have an idcolumn value of 1. Each successively added record will have a value 1 greater than the previously added record. Clients can access all newly added records with the following stored procedure.

CREATE PROCEDURE get_newrows_exampledata 
   @previous_id as int  = 0
AS
SELECT idcolumn, col1, col2
   FROM exampledata
   WHERE idcolumn > @previous_id
   ORDER BY idcolumn

The client can download the complete list of rows by calling the stored procedure with no arguments. When the server is next accessed, the smart client need only call the procedure with the largest idcolumn value it received on the previous call. The server should, of course, index the identity column.

Identity columns are especially useful in cases where the client simply needs to know whether new data has been added. SQL Server automatically tracks the current identity value for each table, which can be accessed with the following query.

Select IDENT_CURRENT('exampledata')

If the value returned is larger than the smart client's largest stored identity value, new records have been added.

DateTime Column

The limitation of an identity column is that it is only useful for identifying newly added records. Tracking updates to existing records requires adding a DateTime column to the table.

CREATE TABLE ExampleData (
   idcolumn int IDENTITY (1, 1) NOT NULL ,
   col1 nvarchar (50) NOT NULL ,
   col2 decimal(18, 0) NOT NULL, 
   last_modified datetime DEFAULT GETDATE()  
)

The last_modified column's DEFAULT GETDATE() automatically stores the current date and time when a new record is added. Unfortunately, it has no effect on updates. Keeping last_modified current requires explicitly modifying the column on each update; therefore, updates are best managed through a stored procedure.

CREATE PROCEDURE update_exampletable 
                 @col1_value as nvarchar(50), 
                 @col2_value as decimal(18, 0), 
                 @idcolumn_value as int
AS
UPDATE ExampleData 
    SET col1=@col1_value,
        col2=@col2_value,
        last_modified=GETDATE()
    WHERE idcolumn = @idcolumn_value

With the last_modified column properly maintained, the list of new and modified rows can then be retrieved by calling this stored procedure.

CREATE PROCEDURE get_modifiedrows_exampledata 
   @lasttime as DATETIME  = '19000101'
AS
SELECT idcolumn, col1, col2, last_modified
   FROM exampledata
   WHERE last_modified > @lasttime
   ORDER BY last_modified

The client initially calls the procedure with no arguments, thereby receiving all rows. On successive calls, the client application passes the largest last_modified value from the last row of the previous call. Again, for efficiency, the last_modified column needs an index.
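The index supports the range query directly. As a safety net, if any application updates the table without going through update_exampletable, an UPDATE trigger can keep last_modified accurate. The following is a sketch; the index and trigger names are illustrative, and the trigger assumes recursive triggers are disabled (the SQL Server default):

```sql
-- Support the "WHERE last_modified > @lasttime" range query.
CREATE INDEX ix_exampledata_last_modified ON ExampleData (last_modified)
GO
-- Safety net: refresh last_modified even when an update
-- bypasses the update_exampletable stored procedure.
CREATE TRIGGER trg_exampledata_touch ON ExampleData
AFTER UPDATE
AS
UPDATE e
   SET last_modified = GETDATE()
  FROM ExampleData e
  JOIN inserted i ON e.idcolumn = i.idcolumn
```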

TimeStamp

At first glance, the TimeStamp data type seems like it might be a better solution than manually managing a DateTime column. A TimeStamp column is automatically updated each time any data in the row changes. There's just one problem: the TimeStamp data type is not actually related to time. Rather, it is simply a value guaranteed to be unique throughout the database. A TimeStamp column can be relied upon to determine whether the data in a particular record has changed, but unfortunately there is no efficient way to locate records with a modified timestamp without individually checking each record.

Divide Data into Static and Dynamic Segments

When working with very wide tables or in cases where only a subset of the columns in a table change frequently, it may be most efficient to separate the columns into two separate tables: one table containing the frequently updated columns and one table containing the more stable columns.

Consider the following product_inventory table.

CREATE TABLE product_inventory (
   idcolumn int IDENTITY (1, 1) NOT NULL , -- identity
   last_modified datetime NOT NULL ,       -- last update
   prod_code char (10) NOT NULL ,          -- product id
   prod_desc nvarchar (50) NOT NULL ,      -- text description
   supplier_code char (10) NOT NULL ,      -- id of product supplier
   qty_on_order int NOT NULL ,             -- qty currently on order with supplier
   date_ordered datetime NOT NULL ,        -- date order placed with supplier
   price decimal(8, 2) NOT NULL ,          -- current price of item
   qty_on_hand int NOT NULL ,              -- qty available to sell
   qty_reserved int NOT NULL               -- if item sold out, length of waiting list
) 

The product_inventory table contains product identification, ordering, and current availability information. It is well normalized and can be easily searched for additions and updates. Even with all of that, this table requires smart clients to download far more data than necessary.

If the store using this application is successful, the qty_on_hand and/or qty_reserved columns change frequently; with each change, the last_modified column is updated. On a busy day, these columns may change every few minutes. If the store sells just 50 items, the smart client will need to download 50 records, which is over 7.5 KB of data for this table.

Creating a separate table for the dynamic columns yields the following.

CREATE TABLE product_inventory_stable(
   idcolumn int IDENTITY (1, 1) NOT NULL , -- identity
   last_modified datetime NOT NULL ,       -- last update
   prod_code char (10) NOT NULL ,          -- product id
   prod_desc nvarchar (50) NOT NULL ,      -- text description
   supplier_code char (10) NOT NULL ,      -- id of product supplier
   qty_on_order int NOT NULL ,             -- qty currently on order with supplier
   date_ordered datetime NOT NULL ,        -- date order placed with supplier
   price decimal(8, 2) NOT NULL            -- current price of item
) 
CREATE TABLE product_inventory_dynamic (
   last_modified datetime NOT NULL ,       -- last update
   prod_code char (10) NOT NULL ,          -- product id
   qty_on_hand int NOT NULL ,              -- qty available to sell
   qty_reserved int NOT NULL               -- if item sold out, length of waiting list
)

This new structure allows smart clients to stay current on the dynamic data without the overhead of the more static data, reducing the data download size by 85%.
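With the split in place, the client's refresh query follows the same DateTime-column pattern shown earlier but touches only the narrow dynamic table (the procedure name here is illustrative):

```sql
-- Return only dynamic rows changed since the client's last sync.
CREATE PROCEDURE get_modifiedrows_inventory_dynamic
   @lasttime as DATETIME = '19000101'
AS
SELECT prod_code, qty_on_hand, qty_reserved, last_modified
   FROM product_inventory_dynamic
   WHERE last_modified > @lasttime
   ORDER BY last_modified
```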

Consider De-Normalizing Some Data

As noted earlier, smart clients are rarely able to take advantage of the full SQL Server capabilities of the SQL Server ADO.NET Provider. Instead, they must rely on Remote Data Access (RDA), which uses HTTP to retrieve SQL Server data and is therefore able to work effectively through firewalls and on loosely coupled networks. RDA can be used in a number of ways but is most commonly used to "Pull" a snapshot of SQL Server data into a local SQL Server CE table and later "Push" locally made changes back to the server.

Note   For more information on RDA, see Introducing RDA, RDA Samples, and SqlCeRemoteDataAccess Class.

This Pull/Push behavior is both powerful and efficient. It does, however, have limitations:

  1. The query executed by the Pull cannot be a multi-table select.
  2. RDA does not track the pre-change version of modified data; therefore the Push operation applies changes to the server indiscriminately.

These limitations have notable implications:

  1. Any SQL Server CE table created by RDA with change tracking enabled must map to a single table on the server.
  2. Each record tracked by RDA must be uniquely updatable.

In many systems, the data naturally conforms to these limitations and can be used with RDA directly; for example, a table listing delivery drivers and the inventory contained in each delivery truck. As each driver makes deliveries, she modifies only her own inventory. Because the driver is the only one with access to the contents of the truck, it is unlikely that anyone else will take an action that modifies the records for that driver.

In cases where the data does not naturally conform, creating a slightly de-normalized version of the data required by smart clients can overcome these limitations. An example is a table listing deliveries to a particular customer: a customer who has placed a very large order may require two or three drivers to each make deliveries, and each driver would then need to modify that customer record to update the quantity of items delivered.

Note   As noted earlier, SQL Server Merge Replication provides an alternative for table synchronization. In some cases it may be preferable to RDA. The specifics of architecting a Merge Replication solution will be discussed in an upcoming article.

Single Table Select

This one is pretty straightforward; RDA requires that a tracked SQL Server CE table map directly to a single table on the server. The only way to use RDA change tracking on data that requires a JOIN is to create a table on the server containing a de-normalized version of the data with the JOIN pre-executed. The smart client then performs the RDA Pull/Push operations against the de-normalized table.
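As a sketch of this approach, the following pre-executes a hypothetical customer/delivery JOIN into a server-side table that RDA can track. All table and column names here are illustrative assumptions, not part of the Pubs schema:

```sql
-- De-normalized copy of a two-table JOIN;
-- RDA Pull/Push operations target this table.
CREATE TABLE customer_deliveries_flat (
   cust_id int NOT NULL ,             -- from customers
   cust_name nvarchar (50) NOT NULL , -- from customers
   delivery_id int NOT NULL ,         -- from deliveries
   qty_delivered int NOT NULL         -- from deliveries
)
-- Pre-execute the JOIN to populate the flat table.
INSERT INTO customer_deliveries_flat
SELECT c.cust_id, c.cust_name, d.delivery_id, d.qty_delivered
   FROM customers c JOIN deliveries d ON d.cust_id = c.cust_id
```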

Rather than carrying a de-normalized copy of the entire data set, in some cases it may be worthwhile to create a stored procedure. The smart client executes the Pull using the stored procedure, which runs a query that moves the required data into the de-normalized version of the table and then selects those records, returning them to the smart client. The smart client later uses the Push to apply changes back to this new table, with a server process merging the changes back into the original table. In general, the changes should be merged back to the original table as soon after the Push as reasonably possible, to make the smart client modifications available to other systems relying on the original table.

Unique Updatability

In many cases, the data accessed by smart client applications is naturally segmented and therefore non-overlapping. This is especially the case in situations where the smart client application is used by delivery drivers or salespeople where each has a distinct route with no overlapping data. In these situations, RDA safely manages smart client updates to data.

If data is not naturally segmented, changes made by one RDA client may overwrite changes made by another RDA client (or any other system). The only way to ensure that this does not happen is to make each record unique to a client. This can be achieved by doing the following:

  1. Create a new table containing the same columns as the original.
  2. Add a column that represents the smart client's user id or other identifying value.
  3. Copy the data from the original table into the new table with one copy of each record for each user who may access or modify the record.
  4. For efficiency create a compound primary key containing the user id along with the data's natural primary key.

The new table now contains uniquely identifiable records.
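The four steps above can be sketched as follows, using the ExampleData table from earlier. The ExampleDataByUser table name, the user_id column, and the user_list table are illustrative assumptions:

```sql
-- Steps 1, 2, and 4: same columns as the original plus a user id,
-- with a compound primary key for efficient per-user access.
CREATE TABLE ExampleDataByUser (
   user_id nvarchar (20) NOT NULL ,
   idcolumn int NOT NULL ,
   col1 nvarchar (50) NOT NULL ,
   col2 decimal(18, 0) NOT NULL ,
   CONSTRAINT pk_exampledatabyuser PRIMARY KEY (user_id, idcolumn)
)
-- Step 3: one copy of each record per user. For simplicity every
-- user gets every record here; in practice the join would include
-- only users entitled to access or modify each record.
INSERT INTO ExampleDataByUser (user_id, idcolumn, col1, col2)
SELECT u.user_id, e.idcolumn, e.col1, e.col2
   FROM ExampleData e CROSS JOIN user_list u
```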

Although the new table on the server contains duplicate records, it is unlikely that the table is substantially larger than the original. Modifying data in a smart client application normally relates to a user performing some task related to that data. There are few situations where a large number of users perform the same task on the same data. This is equivalent to two drivers working for the same delivery company having the same delivery route, delivering the same product to the same customer on the same day. It's a situation that might exist, but only in a small percentage of the overall cases.

Once the table is created, smart clients can perform a Pull from the new table for the records containing their user id; the returned records are unique. Being unique, RDA can Push the changes back to the de-normalized table without concern for overwriting other changes. At the end of the day, the server can sweep through the de-normalized table, applying changes back to the main table. This sweep has the added benefit of providing an opportunity to apply business logic against any duplicated records to ensure that multiple users modifying the same data have done so in a valid way.
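The end-of-day sweep might look like the following, assuming a per-user de-normalized copy of the ExampleData table (here called ExampleDataByUser, with a user_id column; both names are illustrative). A real system would first apply business rules to reconcile conflicting per-user copies:

```sql
-- Sweep per-user changes back into the original table.
-- Conflict resolution between users' copies is omitted for clarity.
UPDATE e
   SET col1 = b.col1,
       col2 = b.col2
  FROM ExampleData e
  JOIN ExampleDataByUser b ON b.idcolumn = e.idcolumn
```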

Note   A complete example of using de-normalization to avoid overwriting updates is included in the accompanying download.

De-normalization Wrap Up

The prospect of duplicating data is rarely an inviting one, but it does sometimes have its place. RDA provides an easy-to-use and extremely efficient mechanism for communicating with SQL Server from smart client applications. Even with the hassle of creating duplicate versions of some tables, RDA provides a much simpler solution than attempting to build one manually, and in scenarios with large numbers of smart clients or high data volumes it may be more efficient than Merge Replication.

Efficient Server Processing

For many organizations, smart client systems are portable versions of existing systems. As a result, much of the server processing required is already built. As is always the case, new systems should take advantage of existing code as much as possible. However, when it comes to building smart client systems, a lot can be gained by wrapping existing code in smart client aware wrappers.

Note   This section examines two important guidelines for exposing efficient Web Services. This is in no way meant to be an exhaustive list but rather some general guidelines. Web Service development is a complex topic involving many issues which are outside the scope of this article. For those looking for detailed information on web service development, please see the MSDN Web Services Developer Center and Aaron Skonnard's The XML Files column.

Can't I Just Make It a Web Service?

By far the easiest way to expose existing .NET components to smart clients is through Web Services. The ease of web service development is both good and bad. Obviously it's good that exposing server functionality is easy. Unfortunately the ease of development often causes developers to overlook the need to properly design server components for efficient smart client access. Consider the following two classes.

public class StudentManager
{
  // Create a first year student with no grades
  public Student CreateStudent(string argName)
  {
    Student s = new Student();
    s.Name = argName;
    s.GPA = 0.0;
    s.Year = 1;
    return s;
  }
  // If business rules are met, advance the student to the
  //  next year; otherwise leave the year unchanged
  public void AdvanceYear(ref Student argStudent)
  {
    // Apply business rules
    if (argStudent.GPA >= 1.0)
      argStudent.Year += 1;
  }
}
  public class Student
  {
    public string Name;
    public double GPA; // 0.0 thru 4.0
    public int Year;   // 1,2,3 or 4
    // Other members elided for clarity
  }

Creating a Web Service with No Code

These classes can be exposed via a web service, and therefore made smart client accessible, without writing a single line of new .NET code.

Note   What follows is a summary of the steps required to expose the StudentManager methods as a web service - for a detailed discussions see Code Model for XML Web Services in Managed Code.
  1. Annotate the methods we would like web service callable with the WebMethod attribute.
  2. Create a virtual directory ( "StudentLibWS")
  3. Create a subfolder named "bin"
  4. Copy the assembly containing the classes into the bin folder
  5. Create an ".asmx" file pointing to the StudentManager class ("Manager.asmx")

So the StudentManager class now looks like this.

public class StudentManager
{
  [WebMethod()]
  public Student CreateStudent(string argName)
  {
    // code is unchanged
  }
  [WebMethod()]
  public void AdvanceYear(ref Student argStudent)
  {
    // code is unchanged
  }
}

…and Manager.asmx contains the following line:

<%@ Webservice language='C#' Class='StudentLib.StudentManager' %>

Consuming the Web Service from a Smart Client

The StudentManager class can now be used by smart clients to create and manage instances of the Student by simply adding a Web Reference and pointing to this URL.

http://<yourserver>/StudentLibWS/Manager.asmx?WSDL

This technique makes the server components accessible to smart clients and does so using only existing code. Unfortunately being smart client accessible does not equate to being smart client efficient.

Creating an Efficient Web Service

A common mistake when architecting a Web Service is failing to recognize the cost of communication. The following are a few simple guidelines for creating smart client aware web services.

Make Trips to the Server Count

First we'll look at the CreateStudent function. With our web service in place, calling CreateStudent from a C# smart client application takes just a few lines.

StudentManager mgr = new StudentManager();
mgr.Url = webServiceURL;
Student s1 = mgr.CreateStudent("Fred");

The problem with our web service is that if we need to create several students, we must call CreateStudent separately for each. This of course incurs a separate network round-trip for each call.

In the case of a relatively simple function like CreateStudent, the network overhead takes more time than the actual server processing; this tends to be true a great deal of the time. Creating 100 students with the CreateStudent method demonstrates this. Using a 400 MHz Pocket PC 2003 communicating over an 802.11b WiFi network, calling CreateStudent 100 times consistently takes between 25 and 30 seconds.

Delays like this are what often lead to the failure of smart client systems. Every trip to the server should perform as much work as reasonably possible. Avoid exposing server functionality as single steps; focus on creating meta-operations. In the case of our StudentManager, adding a simple helper makes student creation much more efficient.

[WebMethod()]
public Student[] CreateStudentList(string[] argName)
{
  Student[] s = new Student[argName.Length];
  for (int i = 0; i < argName.Length; i++)
  {
    s[i] = CreateStudent(argName[i]);
  }
  return s;
}

Using CreateStudentList, those same 100 students that took 25 seconds to create now take less than 1 second.

Beware of Objects

Objects are the foundation of modern software development. Used correctly, they simplify the development of complex systems and provide for more manageable and maintainable code.

It's difficult to disagree with the elegance of passing an object to a method and then having the object returned in its new-and-improved state. Consider the StudentManager AdvanceYear method, which examines a student's current grade-point average and decides whether the student should be promoted to the next year.

public void AdvanceYear(ref Student argStudent)
{
  // Apply business rules
  if (argStudent.GPA >= 1.0)
    argStudent.Year += 1;
}

As a local method, this call is very efficient and elegant. The method is efficient as the only data copied from the caller is a simple reference to the Student object memory (technically it's a reference to the Student reference). It is elegant because the caller is completely insulated from the details. The involved data members and associated algorithm are completely encapsulated.

Moving this method to a web service changes everything; this very efficient method becomes equally inefficient. Web services are message-based rather than object-based. There is no concept of passing memory references; rather, everything is passed via messages. Each call involves two messages: one containing the input parameters and one containing the return parameters.

In the case of the AdvanceYear web service method, passing the Student object by reference means that the complete Student class is contained in both messages. The Student object is copied in its entirety from the client to the server and again copied in its entirety back to the client. This is inefficient in any case, but it is of even greater concern for smart clients, which must work within the limitations of a wireless network.

Making this method smart client efficient means thinking in terms of messages rather than objects. The AdvanceYear method only operates on two pieces of data: GPA and Year. It modifies only one piece of data: Year. The messages should reflect that; modifying the messages means modifying the web service parameters, as in the AdvanceYear2 function.

public void AdvanceYear2(double argGPA, ref int argYear)
{
  if (argGPA >= 1.0)
    argYear += 1;
}

AdvanceYear2 breaks out the required data members into individual parameters. As an input parameter (ByVal in Visual Basic .NET terms) GPA is contained in only the input message and as a reference parameter (ByRef in Visual Basic .NET terms), Year is contained in both the input and return messages. This simple change reduces two very large messages containing the full object description into two much smaller messages containing only the required data: GPA and Year in the input; Year in the return.

Conclusion

Well architected server resources are at the heart of any smart client system. Developing these systems requires careful planning and a willingness to rearrange the data. In this article we looked at a number of specific techniques for managing, organizing and rearranging server resources for smart clients. In each of these cases and in virtually any case you're likely to encounter, the answer comes down to maintaining brief, efficient communication and arranging resources so that they can be easily reduced into subsets based on place, time or frequency of access.
