Pragmatic Architecture: User Interface
Summary: Moore's Law has been responsible for a steadily growing range of power and capability in the user interfaces of the applications that we build. While this power brings to us a tremendous flexibility in the technologies for building applications, that flexibility also brings with it a rather daunting and weighty responsibility: We must choose which one to use. (6 printed pages)
"…And, I'm telling you, this AJAX stuff is hot! It's going to completely reinvent the entire user experience! It's Web 2.0, and if you're not building an AJAX app, you're building legacy code!"
Wearily, I lean back and sneak a glance around at the others at the monthly meeting. Some of the other admitted architects look eager and excited, some skeptical, some downright disgusted. This is the third discussion that we've had in as many months on a user-interface technology, and no matter what the technology—WPF, Flex, AJAX, rich clients, even traditional Web applications using Plain Old Ordinary HTML (idly, I wonder if anybody's coined the "POOH" acronym yet)—it always seems to be accompanied by claims of being the "wave of the future."
Sometimes, I just wish that the future would hurry up and finally arrive, so that I could finally know which one of these to pick and be done with it and move on to more important discussions, like… well, like anything else but this.
Moore's Law giveth, and Moore's Law taketh away. In this particular case, Moore's Law—which states, roughly, that the number of transistors on a given chip doubles every 18 months, and has become synonymous with saying that computing power doubles every 18 months—has been responsible for a steadily growing range of power and capability in the user interfaces of the applications that we build. While this power brings to us a tremendous flexibility in the technologies for building applications, that flexibility also brings with it a rather daunting and weighty responsibility: Now, we must choose which one to use.
This decision is not an easy one to consider, because so much of the application will be coded specifically to that choice. Certainly, we can talk about modularization of components and n-tier systems and all, but when push comes to shove, even if you've neatly separated your business objects away from your presentation layer, you still have to write the presentation layer and the user interaction therein. That's not an insignificant bit of code, particularly if you guess wrong and find yourself recoding the entire thing in something different just a couple of years later. No other change, short of changing your programming language or platform, represents such a drastic pitching of code. (Don't believe me? Look at your current project, be it a WinForms or ASP.NET project. If you had to "refactor" the entire thing from WinForms to ASP.NET or vice versa, how much work would that entail? Even in the best scenarios, with perfectly separated presentation and business layers, probably close to 50 percent of the entire project would have to be redeveloped. Ouch.)
Feeling apprehensive yet? Let's try to put some perspective on the UI space and make it a bit easier to navigate. Fundamentally, any user interface is going to be made up of five parts: style, implementation, perspectives, cardinality, and locality.
By style, essentially we're describing how the user interface looks to the user. In today's technology space, that means one of three things: graphical, command-line, or none. The graphical UI is rather easy to understand, because it's most common in today's GUI-based world.
The command-line UI is not something that is traditionally considered, but many applications are, in fact, better off as a command-line UI than as a graphical one. Particularly if the application in question is intended to be used repeatedly, command-line applications are much easier for users (in particular, power users) to script together by using tools like PowerShell, Ruby, Perl, or even the ubiquitous "Command Shell" batch language.
The last possibility—none—is the one that will take a lot of people off-balance; it's hard to imagine an application with no UI whatsoever. However, "headless" applications are more common than first assumed. For example, consider the traditional Unix daemon process, which runs in the background. Its closest Microsoft Windows equivalent is the Windows service (not Web service, mind you, but those programs that are managed by the Services console under Administrative Tools). These have zero interaction with the user, except through very indirect means, such as the Services console.
In implementation, we turn to how the UI is described by the programmer to the technology, and here we have two possibilities: code and markup.
A code-based UI is one in which the UI elements are created and manipulated through traditional programming code of some form. In the .NET environment, the cleanest example of this is the WinForms API, in which each control is constructed by simply "new"ing the desired object and programmatically setting properties to get the desired look. (The curious Microsoft Visual Studio developer has by this point already noticed that dragging and dropping WinForms controls onto the form simply adds C#/VB/whatever statements to the InitializeComponents method inside the form-derived class.)
A markup-based UI, on the other hand, uses some form of "data-like" language to describe the user interface, which is then usually interpreted at run time by a third-party entity to present the user interface in question. The canonical example here, of course, is HTML.
By perspectives, I mean that the UI presents different views to different users. In other words, the user's view of the application can (and probably should) be different from that of the administrator, even for those situations in which a user and an admin are the same person. Keeping the two UIs different helps avoid "oops" kinds of mistakes, such as when an admin accidentally deletes a customer (whereas a normal user would not be allowed to do this). This is known in security circles as the principle of least privilege. There, this principle is necessary to build layers of defense, whereas here, it's just to avoid mistakes. Other perspectives include reporting (which might very well be done better using a UI technology different from the data-entry portions, for example) or monitoring the application's performance.
With cardinality, we enter into a new dimension of the user interface—that of what some are calling the composite application or, to use the hip Web 2.0 terminology, the mash-up. (Who says that architects aren't cool anymore?) In a composite application, the user interface actually aggregates several other applications underneath one visual umbrella—such as how Microsoft Visual Studio 2005 is now a composite application for Microsoft Visual Basic, Microsoft Visual C#, Microsoft Visual C++, IronPython, and several other tools. In the traditional mash-up or other portal-style Web application, several Web applications are brought together (usually through creative use of IFRAME elements) to unite them in a single visual "space." Thus, the key consideration here is whether the UI will be a "singular" interface or whether it will intend to aggregate several other applications/components within its own interface.
By locality, we address the basic issue surrounding the relative proximity of the UI's processing to the user's fingers. Is it something that is on the user's machine, something sent to and from another machine on the network, or some hybrid of the two? UIs that are sent from a remote location (typically, a shared server) are easier to deploy to users, because the user, by definition, is polling the remote location for the latest and greatest, so that updating the application is as easy as putting the new code on the shared server. By contrast, of course, sending UI code or data across the wire can take non-trivial amounts of time, which results in reduced performance in the one area in which it will be most keenly felt.
(Numerous UI studies have found that a half-second lag in UI reaction, such as when opening a menu, makes the application feel "sluggish" and "slow" to users, even if the rest of the application is twice as fast as its competitor without the half-second lag. For all things UI, perception is everything.)
In the parlance of the day, the canonical distributed user interface is the Web application, where HTML is fetched by a user by way of a TCP/IP connection and interpreted inside of the HTML browser. Subsequent actions by the user generate new requests, which return new HTML to be interpreted, and so on. The canonical local user interface is the so-called "fat client," such as the traditional installer-based application. Of course, hybrids are fast becoming more popular—ranging from AJAX, which uses asynchronous calls back to the HTML server to do processing without requiring the user to stare at an empty browser; to "smart client" applications, which execute on the user's local machine, but are initially downloaded from a shared server and re-fetched when updates are available; to "rich client" applications, which automatically and silently fetch updates when they are made available from a shared server. (Note that the distinctions and nomenclature between each are entirely up for debate.)
Thus, it seems, a UI is made up of style, implementation, perspective, cardinality, and locality. Which gives us… What?
Well, a chart, for one. (Architects love charts.) Notice that this chart lists different presentation technologies and their values among three of the five points—perspective and cardinality being questions of usage, not of technology.
Table 1. UI technologies mapped to fundamentals
|ClickOnce/WinForms||Graphical||Code||Remote-fetched, locally executed|
|Adobe Acrobat||Graphical||Code||Remote-fetched, locally executed|
|Adobe Flex||Graphical||Mixed||Remote-fetched, locally executed|
*WPF either can be run locally, or the XAML and associated code can be downloaded over HTTP in what the WPF docs call a "browsable application."
**With Crystal Reports, the data will typically (although not always) live in a remote RDBMS, and the report processing will typically (although, again, not always) be done on the local tier. Nominally, this puts it into the remote category.
Certain generalizations are being made here. For example, notable Ajax presenters have gone to great lengths to point out that Ajax applications are fully capable of being executed entirely locally and offline. And HTML itself, as a display technology, is a product of the Web browser, and so it also could be executed locally. In some cases, the distinction between "graphical" and "console" is one of choice and not technology. For example, although PowerShell scripts will most often be done from a command-line window, nothing prevents a cmdlet from displaying results in a graphical fashion—such as the PowerGadgets set of cmdlets that provide a number of different "gauge" UIs for displaying the stream of objects that are passed between cmdlets. Therefore, such distinctions in the chart above are not intended to be ironclad; they're just guidelines as to the intended and/or most frequent use of a UI technology.
From this breakdown, it should be clear that choosing a UI is not just as simple as choosing between "Web or client" application technology. There are a fair number of options, even before considering some of the "legacy" technologies that have been introduced some years ago (Microsoft Visual Basic 6 or MFC applications) and that are still viable options under certain circumstances. And herein do we face the crux of the UI architect's decision-making process: Choosing a UI technology is a matter of finding the best tool to suit the problem at hand.
Implementation plays a factor here, too, in that if the UI technology supports some kind of markup option, it becomes possible—or, at least, reasonable—to have a non-programmer build the "look and feel" of the UI. This has a couple of benefits, one being that now a programmer can be released to work on other parts of the system, and another being that a non-programmer is building the UI itself, which gives it a better chance of being considered "usable" by the users. (Remember, folks: Programmers are the ones who invented the BLINK tag. Unless you've explicitly researched the subject, assume that you are a complete "n00b" when it comes to UI, and you probably won't be far off the mark. Read the opening chapter of The Inmates Are Running the Asylum, if you need further convincing; and, as homework, explain why the iPod's user interface is considered intuitive, while the Zune's is not. This will be one-quarter of your grade. Due by Friday.) For some code-based UIs, a visual layout tool—such as the WPF Designer or Visual Studio itself—can generate the code, which again puts into play the idea of a non-programmer generating the layout, assuming that they have access to your source-code control system.
Locality typically comes up next. Distribution of any form typically has the advantage of easier deployment, and it helps skew the distribution of resources that are required to process the UI in your favor. When dealing with low-power devices on the client (when the target audience is still running Microsoft Windows 95 on its 486 PCs, for example), putting the majority of the processing on a server helps make the system more responsive on those clients. That said, it also incurs two potentially significant penalties: one, the application is now intrinsically tied to the network, which means that it cannot execute offline; and two, the application is now at least partly running over the network, which means that it takes a performance hit every time it makes a round-trip between the two tiers. This will be true for any UI layer that involves distribution of some form—AJAX, HTML/ASP.NET, whatever. Treat this concern with some care, as excessive round-trips across the network have killed more applications than all the viruses, Trojan horses, and spam put together. (Don't take my word for it; to see what I mean, walk up to any senior Java programmer and say, "EJB Entity Beans.")
After you have reviewed the technology choices, it's paramount to remember that UI is about usage. A UI that does not address these fundamentals, no matter how technologically slick, is still going to be an immediate candidate for refactoring.
Consider a prototypical internal application, for example. Regardless of the application's business domain, there are typically at least three UI perspectives that must be considered: the user's, the system admin's, and any reporting that has to go on. (In fact, reporting can often be broken into multiple perspectives, too, because security concerns often restrict who gets to see what data.) Obviously, for most users, a graphical UI is going to be preferable; although at times your user community will be sophisticated enough to want something that's easily hooked into a scripting language, which necessitates either a console UI or a cmdlet interface.
For your administrators, however, carrying out commonly executed tasks—such as adding or removing users, changing passwords, and so on—begs to be scripted in some fashion; so, even if the admins ask for a graphical UI, consider building a "hybrid" UI, instead. (Even Microsoft Visual Studio 2003 does this. If you run "devenv.exe MySolution.sln" from the command line, the GUI shell never comes up; it builds the solution, then quits.) This allows them to string tasks together in ways that your analysis and requirements could never anticipate. Reports do not always have to be visible, by the way. A number of situations arise in which a report must be generated, but not made visible, to the user or system triggering the report's generation. Batch systems frequently create reports for later viewing, for example.
Cardinality does not often come up explicitly in UI discussions, but it's always worthwhile to consider. For example, many internal systems want to display all of their various UI perspectives in the same basic shell, which raises all sorts of security-authorization issues: If the user is also an admin, do they get all the admin functionality in their user UI? Or do they have to log on under separate credentials to be an admin? If they log on under separate credentials, how long do they remain logged on, and does that somehow propagate to the rest of their UI? Remember, too, that it might be a convenience to them to allow both admin and UI functionality with a single logon, but it also opens up the possibility of "oops" kinds of mistakes that cannot be easily recovered. If you explicitly assume that running admin functionality will require a separate UI, none of these concerns applies to building the user UI, which simplifies your development life.
Unfortunately, careful analysis of your user-interface options still doesn't guarantee that you won't end up building tomorrow's "legacy" application that has to be rewritten in the technology of the day—simply because that technology is hip and hot and happening. In fact, nothing can prevent that from happening, if only because far too many developers and managers are overly open to rewriting applications in the "Hot New Thing" of the day. When asked, of course, they'll say that it's because the "Hot New Thing" (HNT, for you acronym watchers) will "revolutionize user interface" and "make your apps so much easier to use." And… well… because it's the HNT, so who wouldn't want to use it? In the final accounting, remember that building an application has first to be usable, and a large part of that means a deep understanding of the tools that are available to you, and which of those tools will yield the kind of UI that you'd be proud to sign your name to.
About the author
Ted Neward is an independent consultant who specializes in high-scale enterprise systems, working with clients who range in size from Fortune 500 corporations to small 10-person shops. He is an authority in Java and .NET technologies, particularly in the areas of Java/.NET integration (both in-process and by way of integration tools like Web services), back-end enterprise software systems, and virtual machine/execution engine plumbing.
He is the author or coauthor of several books, including Effective Enterprise Java, C# in a Nutshell, SSCLI Essentials, and Server-Based Java Programming, as well as a contributor to several technology journals. Ted also is a Microsoft MVP for Architecture, BEA Technical Director, INETA speaker, Pluralsight trainer, frequent conference speaker, and a member of various JSRs. He lives in the Pacific Northwest with his wife, two sons, two cats, and eight PCs.