November 2011

Volume 26 Number 11

UI Frontiers - Finishing the E-Book Reader

By Charles Petzold | November 2011

Before there were Kindles and Nooks and iPads and smartphones; before there was HTML, PDF, XPS, EPUB, MOBI and Plucker; before there were even computers owned by individuals, there was Project Gutenberg. Founded in 1971 by Michael S. Hart (who recently died at the age of 64), Project Gutenberg is easily the oldest collection of digitized public-domain books. While its inventory of about 35,000 books seems quaint when compared to the 15 million or so texts accumulated by Google Books, Project Gutenberg remains an invaluable resource for accessing the classics.

Project Gutenberg books are now available in several rich-text formats, but for many years the focus was entirely on plain text, the reasonable assumption being that rich-text formats come and go but plain text is forever. Five months ago, when I needed a simple but substantial text file for a column demonstrating how to write pagination logic for Windows Phone, I turned to Project Gutenberg.

As I began progressively enhancing this code, the project took on a life of its own. The program has now evolved into a full-fledged e-book reader for Windows Phone that gives you access to the Project Gutenberg library. I call it Phree Book Reader (pronounced “free book reader” and rhyming with “e-book reader,” but spelled with a “ph” for “phone”), and it’s available as a free download from the Windows Phone marketplace. As usual, you can also download the program’s source code from the MSDN Magazine Web site.

Catalogs and Web Services

There are many ways to build a large application. Some developers like a top-down approach that begins with the overall structure and gradually implements more detailed code. Others prefer a bottom-up process starting with the low-level routines that are combined into more powerful assemblies.

I tend to mix the two approaches with the primary goal of getting something—anything—working as quickly as possible. Even a skeleton program that’s only a tiny bit functional gives me the essential positive feedback that keeps me going, and once I’m using the program, the necessary enhancements become obvious.

If you’ve been reading this column for the past five months, you know that my previous e-book readers were limited to just one book or, more recently, four books. These books were bundled into the application executable as content. This approach allowed me to focus entirely on the reading experience while avoiding the messy job of downloading books over the Web, and the even messier problem of allowing users to search for books by title and author. But I knew I’d have to meet these challenges eventually.

The Project Gutenberg Web site (gutenberg.org) tries to make it easy for users to search and download books, but it does not implement a public Web service so that other applications can perform similar searches.

Each book stored at Project Gutenberg is identified by an integer ID. If a program knows the ID number for a particular book, it can download an XML file at gutenberg.org/ebooks/N.rdf, where N is the ID number. This Resource Description Framework (RDF) file is often around 10KB in size, and contains the book’s title, author and other information, along with links to the book itself in several different formats. This file is great if you know the ID number of a desired book, but not so great otherwise.

Project Gutenberg also makes available a complete catalog of all the books in its collection at gutenberg.org/feeds/catalog.rdf.zip. It’s about 9MB in size and unzips to a 200MB XML file. The information in this catalog is similar—but not identical—to the information in the individual RDF files. A new version of the catalog is created every day as new books are added to the Project Gutenberg inventory.

At first I thought my e-book reader could download the entire catalog to the phone and store it in isolated storage for searching purposes, but I was worried about the size. For example, the average word is five characters long and words are separated by spaces, so a 50,000-word book in plain-text format requires only about 300KB of storage (50,000 words at roughly six characters each). The unzipped 200MB catalog, in contrast, takes up the same amount of storage as more than 600 such books!

I then encountered another problem: I could open and read the catalog file with the regular .NET version of XmlReader by setting the ProhibitDtd property of the XmlReaderSettings to false. However, the Silverlight version of XmlReader choked on the file, and no setting of the DtdProcessing property of XmlReaderSettings—or anything else I tried—worked.
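For the desktop side, a minimal sketch of a DTD-tolerant reader might look like the following. It assumes .NET Framework 4 or later, where DtdProcessing.Parse takes the place of setting ProhibitDtd to false. The CatalogReader class and its method names are hypothetical, and nothing here addresses the Silverlight failure described above.

```csharp
using System.IO;
using System.Xml;

// Hypothetical helper; the names are illustrative, not from the article's code.
public static class CatalogReader
{
    // Creates a reader that accepts a DOCTYPE declaration instead of
    // throwing on it. DtdProcessing.Parse is the .NET 4+ counterpart
    // of ProhibitDtd = false.
    public static XmlReader Create(TextReader input)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse;
        settings.IgnoreWhitespace = true;
        return XmlReader.Create(input, settings);
    }

    // Counts elements with the given local name by streaming through
    // the document, so the whole file never has to sit in memory.
    public static int CountElements(TextReader input, string localName)
    {
        int count = 0;
        using (XmlReader reader = Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.LocalName == localName)
                {
                    count++;
                }
            }
        }
        return count;
    }
}
```

A forward-only reader like this suits a file of roughly 200MB, because only the current node is held at any moment.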

After much contemplation, I decided to write my own Web service hosted on my Web site. That Web service obtains the catalog file from the Project Gutenberg site, unzips it, opens it, parses it and then stores a stripped-down version locally in a “flat” format—one text line per book—for faster searching.

Of course, any time you’re dealing with somebody else’s data, you’re also dealing with their data structures. My Web service implements a method named Search with arguments specifying title and author words, and returns up to 25 instances of a type I called GutenbergBook. This class incorporates information about the book obtained from the catalog entry, including a title (and sometimes two titles), zero or more “creators” (author and possibly coauthors), and zero or more contributors (such as translators and editors).

Also included in the Gutenberg catalog is a “friendly title,” which generally incorporates both the title and the author, and which is limited to 50 characters. This friendly title seemed ideal for showing search results to the user, and for identifying the book when it’s being read.

But this friendly title is not always so friendly. It’s great for short titles, such as “Emma by Jane Austen,” but is often deficient for longer titles. For example, the Gutenberg catalog contains 12 entries for various editions and volumes of Edward Gibbon’s famous historical opus, and all 12 have the same friendly title truncated to 50 characters: “The History of the Decline and Fall of the Roman E.”

To me this meant that if I were going to use this friendly title to identify a downloaded book, I’d need to give the user some way to edit it to something more meaningful, such as “Decline and Fall of the Roman Empire, Volume 3.”

The Front-End Pivot

At the very least, the front end of an e-book reader needs to display a list of downloaded books and a search screen for downloading more books. It seemed obvious to me that these two items would be part of a Pivot control—a popular control for Windows Phone programs for presenting multiple screens in a format other than navigable pages.

The front-end Pivot control of Phree Book Reader has five items in the following order: bedside, library, search, request and about.

Although search is the third item in the Pivot, it’s where a new user will begin. As shown in Figure 1, you type in title words or author words, and it makes a call to the Web service. Up to 25 hits are returned and displayed in a ListBox. Each hit is identified by the ID number and friendly title from the Project Gutenberg catalog.

Figure 1 The Search Item of the Pivot Control

When the user taps one of these items, the program navigates to a download page, as shown in Figure 2. This page shows additional information from the catalog and lets you download the book. Notice the Contributor entry indicating the famous translator of Russian literature, Constance Garnett.

Figure 2 The Download Page Ready to Download

When you begin downloading a book, often you’ll see the filename suddenly change. The Project Gutenberg catalog contains filenames for the various formats available—including the preferred format for my purposes, plain text encoded in UTF-8—but I discovered that some of these files were empty or corrupted. The filenames in the individual RDF files were different and much more reliable. So before Phree Book Reader begins downloading a book, it downloads the RDF file and obtains a filename from that.

After the book is downloaded, the download page displays buttons that navigate to other pages. The first button lets the user change what the program refers to as the “display title.” This title is originally set to the friendly title from the Gutenberg catalog and is also limited to 50 characters.

The second button involves chapter breaks. Project Gutenberg books vary in the number of blank lines used to separate chapters. This option allows a user to change that criterion, and remove superfluous chapter breaks.
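The program's exact rule for chapter breaks isn't spelled out here, but the idea can be sketched under one assumption: a chapter begins at the first non-blank line that follows a run of at least some threshold number of blank lines. ChapterSplitter, FindChapterStarts and minBlankLines are hypothetical names.

```csharp
using System.Collections.Generic;

// Hypothetical helper illustrating a blank-line criterion for chapter breaks.
public static class ChapterSplitter
{
    // Returns the index of the first line of each chapter. A chapter
    // starts at the first non-blank line overall, and at any non-blank
    // line preceded by at least minBlankLines consecutive blank lines.
    public static List<int> FindChapterStarts(IList<string> lines, int minBlankLines)
    {
        List<int> starts = new List<int>();
        int blankRun = 0;       // length of the current run of blank lines
        bool seenText = false;  // becomes true at the first non-blank line

        for (int i = 0; i < lines.Count; i++)
        {
            if (lines[i].Trim().Length == 0)
            {
                blankRun++;
                continue;
            }

            if (!seenText || blankRun >= minBlankLines)
            {
                starts.Add(i);
            }

            seenText = true;
            blankRun = 0;
        }
        return starts;
    }
}
```

Raising minBlankLines is what removes superfluous breaks: a book that separates paragraphs with one blank line and chapters with four yields far fewer chapters at a threshold of 4 than at 1.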

The request item on the Pivot control is similar to search except that you simply type in a Project Gutenberg ID number for a book rather than search terms. The program then navigates to the download page.

When a book has been downloaded, the book joins the library, which is the second item on the Pivot control, and is shown in Figure 3.

Figure 3 The Library Item of the Pivot Control

This library view uses the title and author information from the catalog entry rather than the display title. Books are organized by author and title. Tap one of the titles to start reading that book. Tap the question mark to see the full catalog information (similar to the download page) and optionally edit the display title and the chapter breaks, or delete the book.

There was never any doubt in my mind that I wanted the Phree Book Reader library organized alphabetically by author. That’s exactly how my fiction shelves are arranged at home, except I’m not quite so compulsive about alphabetizing the titles. I suppose there might be users of Phree Book Reader who would prefer the library being arranged a little differently, but I wrote this program mostly for myself so I’m really not open to negotiation on this issue!

Displaying books by author and title is a great application of ListBox grouping. The Windows Presentation Foundation (WPF) version of ItemsControl has a terrific property called GroupStyle that lets you define a style for a grouping property of items in a CollectionView. In Silverlight, you’d use CollectionViewSource, and although it does support groups, the Silverlight ItemsControl doesn’t have a GroupStyle.

Instead, I used ItemsControl for the collection of authors, where the template for each item displays the author’s name followed by a non-scrollable ListBox for that author’s titles, as shown in Figure 4.

Figure 4 The Library Pivot Item

<ScrollViewer Name="libraryScrollViewer"
              VerticalScrollBarVisibility="Auto">
  <ItemsControl Name="libraryItemsControl">
                     
    <!-- Assumes ItemsSource = AppSettings.Library.Authors -->
    <ItemsControl.ItemTemplate>
      <DataTemplate>
        <StackPanel>
          <!-- Creator -->
          <TextBlock Text="{Binding}"
                     FontWeight="Bold"
                     Margin="0 6" />
                                     
          <!-- Books -->
          <ListBox ItemsSource="{Binding Titles}"
                   ItemContainerStyle="{StaticResource listBoxItemStyle}"
                   SelectionChanged="OnLibraryListBoxSelectionChanged">
                                     
            <!-- Prevent scrolling of ListBox -->
            <ListBox.Style>
              <Style TargetType="ListBox">
                <Setter Property="ScrollViewer.VerticalScrollBarVisibility"
                        Value="Disabled" />
              </Style>
            </ListBox.Style>
                                     
            <ListBox.ItemTemplate>
              <DataTemplate>
                <Grid>
                  <Grid.ColumnDefinitions>
                    <ColumnDefinition Width="Auto" />
                    <ColumnDefinition Width="*" />
                    <ColumnDefinition Width="Auto" />
                  </Grid.ColumnDefinitions>
 
                  <Grid Grid.Column="0"
                        Margin="12 6"
                        VerticalAlignment="Center">
                    <Polygon Fill="{StaticResource PhoneAccentBrush}"
                             Points="6 0, 47 0, 47 57, 41 63, 0 63, 0 6" />
                    <Image Source="Images/BookIcon.png" />
                  </Grid>
                                                 
                  <!-- Book title -->
                  <TextBlock Grid.Column="1"
                             Text="{Binding}"
                             FontSize="{StaticResource
                               PhoneFontSizeMediumLarge}"
                             VerticalAlignment="Center"
                             TextWrapping="Wrap" />
                                                  
                  <!-- Info Button -->
                  <Button Content=" ? "
                          Grid.Column="2"
                          Tag="{Binding ID}"
                          VerticalAlignment="Center"
                          Click="OnInfoButtonClick" />
                </Grid>
              </DataTemplate>
            </ListBox.ItemTemplate>
          </ListBox>
        </StackPanel>
      </DataTemplate>
    </ItemsControl.ItemTemplate>
  </ItemsControl>
</ScrollViewer>

Finding an alternative to this scheme is on my to-do list for Phree Book Reader. I’ve noticed that as I accumulate more books in the library, the initial load time starts to suffer, and I suspect that the nested ListBox controls are the reason why.
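The data shape that the Figure 4 markup expects can be sketched in plain C#. The markup binds each author item directly to a TextBlock, reads a Titles collection from it, and binds each title item both to a TextBlock and to an ID property. The classes below are hypothetical stand-ins that satisfy those bindings, and the ordinal, case-insensitive sort is an assumption rather than the program's actual ordering.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one downloaded book.
public class LibraryBook
{
    public int ID { get; set; }
    public string Author { get; set; }
    public string Title { get; set; }

    // Text="{Binding}" shows the item itself, so ToString supplies the title.
    public override string ToString() { return Title; }
}

// Hypothetical stand-in for one author and that author's books.
public class LibraryAuthor
{
    public string Name { get; set; }
    public List<LibraryBook> Titles { get; set; }

    public override string ToString() { return Name; }
}

public static class LibraryGrouping
{
    // Groups a flat list of books by author, sorting the authors and
    // then the titles within each author.
    public static List<LibraryAuthor> GroupByAuthor(IEnumerable<LibraryBook> books)
    {
        return books
            .GroupBy(b => b.Author)
            .OrderBy(g => g.Key, StringComparer.OrdinalIgnoreCase)
            .Select(g => new LibraryAuthor
            {
                Name = g.Key,
                Titles = g.OrderBy(b => b.Title, StringComparer.OrdinalIgnoreCase)
                          .ToList()
            })
            .ToList();
    }
}
```

Because the grouping happens in the data rather than in the control, the same list could feed a different presentation if the nested ListBox controls turn out to be the bottleneck.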

People sometimes refer to books they’re currently reading as “the books on my bedside table.” For that reason, the first Pivot item is labeled “bedside” and shows a maximum of six books, sorted in descending order of the date and time last read. Each book is identified with its display title.
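Selecting the bedside list is a short LINQ query. The sketch below uses hypothetical names (BedsideEntry, LastRead, MostRecent) and assumes that a book never opened has no last-read time and therefore stays off the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the per-book data the bedside list needs.
public class BedsideEntry
{
    public string DisplayTitle { get; set; }
    public DateTime? LastRead { get; set; }  // null until the book is first read
}

public static class Bedside
{
    public const int MaxBooks = 6;

    // Returns at most six books that have been read, most recent first.
    public static List<BedsideEntry> MostRecent(IEnumerable<BedsideEntry> library)
    {
        return library
            .Where(b => b.LastRead.HasValue)
            .OrderByDescending(b => b.LastRead.Value)
            .Take(MaxBooks)
            .ToList();
    }
}
```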

By the way, a Windows Phone is a great way to read in bed with the lights turned off.

The last Pivot item is labeled “about” and contains some help information about the program as well as links to my past MSDN Magazine columns about the e-book reader.

Pivoting Around the Pivots

As I was developing this front end, my biggest struggles involved the Pivot control itself, and navigation away from and back to the Pivot control.

Normally when the program starts up, the default Pivot item should be the bedside. The program should make it as easy as possible to resume reading the most recently read book. However, if the user hasn’t read any books yet, the bedside list will be empty so the library view should be first up. And if the user hasn’t downloaded any books—perhaps because the program is running for the first time—the search item should be the default.

Programmatically controlling a Pivot control is accomplished by setting the SelectedIndex property of Pivot. However, this property can’t be set before the Pivot control has loaded. But after the Pivot control has loaded, setting the property has the effect of visibly sliding the Pivot control to that item. I think I’d prefer a less-visible transition.
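The startup rule described above reduces to a small decision function. This is a sketch with hypothetical names. The index values follow the item order given earlier (bedside, library, search, request, about), and the two counts are assumed to be available when the main page loads.

```csharp
// Hypothetical helper for choosing the Pivot item shown at startup.
public static class StartupPivot
{
    // Positions of the first three items in the front-end Pivot.
    public const int Bedside = 0;
    public const int Library = 1;
    public const int Search = 2;

    // bedsideCount: books that have been read at least once.
    // libraryCount: books that have been downloaded.
    public static int ChooseIndex(int bedsideCount, int libraryCount)
    {
        if (bedsideCount > 0) return Bedside;  // resume reading
        if (libraryCount > 0) return Library;  // books exist but none read yet
        return Search;                         // nothing downloaded yet
    }
}
```

In the phone program the result would presumably be assigned to SelectedIndex from a Loaded handler, since the property can't be set any earlier.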

Logic involving the Pivot control gets messier when a book is downloaded. If the user navigates from the search item to the download page and then presses the phone’s Back key without downloading the book, navigation should go back to the search item. However, if the book is downloaded, then the download page should go back to the library Pivot item where the book is now listed. If the book is downloaded and the user chooses to jump right to the reading view, then navigating back to the home page should cause the bedside item to be visible, again to show the book just read.

I found myself using the State dictionary of the PhoneApplicationService to keep track of where the program has been and what it’s doing. The search and request Pivot items set a State dictionary entry named “downloadBook” before navigating to the download page, and that page sets a “successfulDownload” or “successfulDownloadAndRead” dictionary entry depending on whether the user has chosen to jump to the reading view or not. I’m not quite happy with the inelegance of this approach, and perhaps in the future I’ll find something that works a little better.
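One way to read those dictionary entries when the main page is navigated back to is sketched below. The key names come from the description above. ReturnNavigation and ResolveIndex are hypothetical, and removing the flags once they've been consumed is an assumption about how stale entries might be avoided.

```csharp
using System.Collections.Generic;

// Hypothetical helper; "state" stands in for PhoneApplicationService.Current.State.
public static class ReturnNavigation
{
    public const int Bedside = 0;
    public const int Library = 1;

    // Returns the Pivot item to show on return from the download page.
    // currentIndex is whichever item (search or request) started the
    // download, and is kept if no download succeeded.
    public static int ResolveIndex(IDictionary<string, object> state, int currentIndex)
    {
        // Remove returns true only when the key was present, so each
        // flag is both tested and cleared in one step.
        if (state.Remove("successfulDownloadAndRead"))
        {
            state.Remove("successfulDownload");
            return Bedside;  // show the book just read
        }

        if (state.Remove("successfulDownload"))
        {
            return Library;  // show the newly listed book
        }

        return currentIndex; // Back was pressed without downloading
    }
}
```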

As Usual, Tombstoning

Obviously the larger a program gets, the nastier the tombstoning issues become. Consider the search screen shown in Figure 1. I wrote the Web service to return only 25 hits, but also to allow a program to get an additional 25 with each additional call triggered by the bottom button on the screen. Eventually there could be hundreds—even thousands—of items in the ListBox, depending on the specificity of the search criteria.
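The 25-at-a-time behavior can be sketched as a skip-and-take query over the flat catalog. This is not the Web service's actual code. It assumes each catalog line contains the title and author text, merges title and author words into one list for brevity, and uses hypothetical names throughout.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of paged searching over a one-line-per-book catalog.
public static class CatalogSearch
{
    public const int PageSize = 25;

    // lines: the flat catalog, one book per line.
    // words: search words, all of which must occur in a matching line
    //        (compared case-insensitively).
    // skip:  the number of hits the caller has already received.
    public static List<string> Search(IEnumerable<string> lines,
                                      IEnumerable<string> words,
                                      int skip)
    {
        List<string> wanted = words.Where(w => !string.IsNullOrEmpty(w)).ToList();

        return lines
            .Where(line => wanted.All(
                w => line.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0))
            .Skip(skip)
            .Take(PageSize)
            .ToList();
    }
}
```

A client would pass 0 for the first call and the current ListBox item count for each subsequent call. A result shorter than 25 signals that no more hits remain.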

This is a good example of a tricky area of tombstoning. All the data in the ListBox could be regenerated by calling the Web service again, but that would take too much time. The items in the ListBox must be saved. But what you do not want to do is save and restore the contents in the OnNavigatedFrom and OnNavigatedTo overrides. This search control is part of a Pivot item on the program’s main page, and there’s a lot of navigation to and from this page that does not involve tombstoning. It’s fine to save and restore small objects in the navigation overrides, but not thousands of items in a ListBox. The ListBox contents should only be saved when the application is actually being tombstoned.

For this program, I experimented with a general-purpose technique for doing precisely this—in the App class I defined a property named TombstoneObjects:

public IDictionary<string, object> TombstoneObjects { set; get; }

Any class anywhere in the program can make use of this dictionary. The SearchControl class implements the ITombstonable interface that I discussed in last month’s column. In the SaveState method (which is called from the OnNavigatedFrom override of MainPage), the control copies the contents of the ListBox to a List object and saves that to the TombstoneObjects dictionary. In the RestoreState method, it restores the contents of the ListBox, but only if the ListBox is empty.

The App class is responsible for saving and restoring the contents of TombstoneObjects to the State dictionary of PhoneApplicationService because the App class has the power to do this intelligently. It knows when the program is being tombstoned because it has installed handlers for the PhoneApplicationService events. The result is that very little extraneous work occurs if the program is not actually being tombstoned.
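The division of labor can be modeled in plain C# without any phone types. Everything below is a sketch under stated assumptions: the interface is given a different name because its members are guessed rather than taken from last month's column, a List of strings stands in for the ListBox items, and the "tombstoneObjects" and "searchResults" keys are invented for illustration.

```csharp
using System.Collections.Generic;

// Assumed shape only; the real ITombstonable members may differ.
public interface ITombstonableSketch
{
    void SaveState(IDictionary<string, object> tombstoneObjects);
    void RestoreState(IDictionary<string, object> tombstoneObjects);
}

// Stand-in for the search control; Items plays the role of ListBox.Items.
public class SearchResultsSketch : ITombstonableSketch
{
    private const string Key = "searchResults";  // hypothetical key
    public List<string> Items = new List<string>();

    // Cheap: stores a copy of the list in memory, nothing is serialized.
    public void SaveState(IDictionary<string, object> tombstoneObjects)
    {
        tombstoneObjects[Key] = new List<string>(Items);
    }

    // Restores only when the list is empty, that is, only when the
    // page instance was really re-created.
    public void RestoreState(IDictionary<string, object> tombstoneObjects)
    {
        object saved;
        if (Items.Count == 0 && tombstoneObjects.TryGetValue(Key, out saved))
        {
            Items.AddRange((List<string>)saved);
        }
    }
}

// Stand-in for the App class; "state" represents PhoneApplicationService.State.
public class AppSketch
{
    private const string StateKey = "tombstoneObjects";  // hypothetical key

    public IDictionary<string, object> TombstoneObjects { set; get; }

    public AppSketch()
    {
        TombstoneObjects = new Dictionary<string, object>();
    }

    // Would be called from the Deactivated event handler.
    public void OnDeactivated(IDictionary<string, object> state)
    {
        state[StateKey] = TombstoneObjects;
    }

    // Would be called from the Activated event handler.
    public void OnActivated(IDictionary<string, object> state)
    {
        object saved;
        if (state.TryGetValue(StateKey, out saved))
        {
            TombstoneObjects = (IDictionary<string, object>)saved;
        }
    }
}
```

The point of the indirection is that ordinary page navigation touches only the in-memory dictionary. The potentially large list reaches the State dictionary only when the application is deactivated.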

Although I wrote Phree Book Reader for Windows Phone 7, at the time I’m writing this I’ve been working with beta versions of the next release. In Windows Phone 7.5, applications are tombstoned less frequently, so it’s doubly important for these applications to avoid a lot of unnecessary tombstoning work.

Speaking of the next version of Windows Phone, I’ve already accumulated a list of features I’d like to add to Phree Book Reader when I subject it to the upgrade. Perhaps we’ll be revisiting the program in future columns.          


Charles Petzold is a longtime contributing editor to MSDN Magazine. His recent book, “Programming Windows Phone 7” (Microsoft Press, 2010), is available as a free download at bit.ly/cpebookpdf.

Thanks to the following technical expert for reviewing this article: Richard Bailey