August 2012

Volume 27 Number 08

Touch and Go - Viewing a Virtual World from Your Windows Phone

By Charles Petzold | August 2012

Until the time of Copernicus—and for many years after—people believed the universe was constructed of a series of concentric celestial spheres surrounding the Earth. Although that model of the universe has been abandoned, it’s still convenient to employ the concept of a celestial sphere for identifying the location of objects in 3D space relative to ourselves as viewers.

A celestial sphere is particularly handy for programs that let you use a smartphone for viewing a world of virtual reality or augmented reality. With such programs, you hold the phone as if you’re taking a photograph or video through the camera lens, but what you see on the screen might not have anything to do with the real world.

Such a program needs to determine the phone’s orientation in 3D space so that, by sweeping the phone in arcs, the user can pan through this virtual world. With the Motion sensor I described in the last installment of this column, Windows Phone can provide the necessary orientation information.

How do we translate from the information provided by the Motion sensor to the celestial sphere? It’s all about the coordinate system.

We’re all familiar with geographic coordinates that allow us to describe a location on the surface of our planet. Any point on the surface of the Earth can be denoted by two numbers: latitude and longitude, both of which are angles with a vertex in the Earth’s center. Latitude is an angle relative to the equator: It’s positive for locations north of the equator and negative for locations south. The latitude of the North Pole is 90° and the latitude of the South Pole is -90°. Longitude involves angles between great circles that pass through the two poles, measured from the Prime Meridian, which is the line of longitude that passes through Greenwich, England.

We live not only on the surface of a sphere, but also at the center of a conceptual celestial sphere. Several coordinate systems can be used to denote locations on this celestial sphere, but the one I’ll be focusing on is called the horizontal coordinate system because it’s based on the horizon.

Horizontal Coordinates

Using your outstretched arm, point to any object you see around you. That object has a location on the celestial sphere. What is that location? Move your straight arm up or down so it becomes horizontal—that is, parallel to the surface of the Earth. The angle your arm swings through during this movement is called the altitude.

Positive values of altitude are above the horizon; negative values are below the horizon. An object located straight up from you has an altitude of 90°, also called the zenith, and an object straight down has an altitude of -90°, called the nadir.

Now swing your horizontal outstretched arm so it’s pointing north. The angle your arm swings during this movement is called the azimuth. The altitude and azimuth together constitute a horizontal coordinate.

Notice that the horizontal coordinate gives you no information about how far away something is. During a solar eclipse, the sun and moon have the same horizontal coordinate. With any type of celestial coordinate system, everything is assumed to be on the interior surface of the celestial sphere.

The azimuth must be relative to a particular point on the compass. Most often, the azimuth is set at 0° for north, with increasing angles moving eastward. However, astronomers tend to set 0° at the south with increasing angles moving westward; at least that’s how Jean Meeus sums it up in his classic book, “Astronomical Algorithms” (Willmann-Bell, 1998).

Horizontal coordinates are analogous to geographic coordinates, except the perspective is different. Instead of being on the surface of a sphere, you’re at the center looking out. The azimuth is comparable to the longitude and the altitude is comparable to the latitude. Like circles of longitude, circles of azimuth are always great circles passing through the poles. Like circles of latitude, circles of altitude are always parallel to each other. The horizon plays the same role in horizontal coordinates as the equator in geographic coordinates.

Now pick up your Windows Phone and hold it so you’re looking at the screen while the camera lens points away from you. The direction the camera lens is pointing has a particular altitude and azimuth. Although that horizontal coordinate is conceptually a location on the interior of the celestial sphere, it’s also a direction from the viewpoint of the camera lens—mathematically, a three-dimensional vector.

As I’ve discussed in previous columns, the phone has an implicit coordinate system, where the positive Z axis extends out from the screen. That means the camera lens on the other side points in the direction of the 3D vector (0, 0, –1). As I demonstrated in the previous installment of this column (msdn.microsoft.com/magazine/jj190811), the Motion sensor in Windows Phone lets you obtain a 3D rotation matrix that describes how the Earth is rotated relative to the phone. To obtain a matrix that describes how the phone is rotated relative to the Earth, the matrix obtained from the Motion sensor must be inverted:

matrix = Matrix.Invert(matrix);

Use this inverted matrix to rotate the (0, 0, –1) vector:

Vector3 vector = Vector3.Transform(new Vector3(0, 0, -1), matrix);

Now you have a 3D vector that describes the direction the camera lens is pointing. That vector needs to be converted to altitude and azimuth angles.

If the phone is held upright—that is, with the transformed vector parallel to the surface of the Earth—the Z component is 0, and the problem reduces to the well-known conversion from two-dimensional Cartesian coordinates to polar coordinates. In C#, it’s simply:

double azimuth = Math.Atan2(vector.X, vector.Y);

That’s an angle in radians. Multiply by 180 and divide by π to convert to degrees.

This formula implies that north has an azimuth of zero, and values increase in an eastward direction.

If you prefer south to be zero with increasing values in a westward direction, shift the result by 180° by changing the sign of the X and Y components.
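In code, that shift amounts to negating both arguments of the Atan2 call:

// Azimuth measured from 0° at south, increasing westward
double azimuth = Math.Atan2(-vector.X, -vector.Y);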

That formula for the azimuth is actually valid regardless of the Z component of the transformed vector.

That Z component is the sine of the altitude. Because the altitude ranges only between negative and positive 90°, it can be calculated using the inverse sine function:

double altitude = Math.Asin(vector.Z);

Again, multiply by 180 and divide by π to convert radians to degrees.

However, we’re still missing something, which you might recognize when you realize that we’ve translated a three-dimensional rotation matrix into a coordinate of only two dimensions, confined to the interior surface of the celestial sphere.

What happens when you aim the phone in a particular direction described by a 3D vector, and then rotate the phone around the vector like an axis? The vector doesn’t change, nor do the altitude and azimuth values, but the virtual reality scene on the phone’s screen should rotate relative to the phone.

This extra motion is sometimes referred to as tilt. It’s also an angle, but the calculation is a little more difficult than for altitude and azimuth.
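Here’s a sketch of one way such a tilt calculation can work (not necessarily how the downloadable code does it): transform the phone’s up axis by the same inverted matrix, remove the component along the viewing direction from the world’s up vector to get a “no tilt” reference, and measure the signed angle between the two around the viewing axis:

// One possible tilt calculation (a sketch; the downloadable
// HorizontalCoordinate structure and its sign convention may differ)
Vector3 view = Vector3.Transform(new Vector3(0, 0, -1), matrix);
Vector3 up = Vector3.Transform(new Vector3(0, 1, 0), matrix);
// "No tilt" reference: world up with its component along the
// viewing direction removed (undefined at the zenith and nadir)
Vector3 worldUp = new Vector3(0, 0, 1);
Vector3 reference = Vector3.Normalize(
  worldUp - Vector3.Dot(worldUp, view) * view);
// Signed angle between the two up vectors around the viewing axis
double tilt = Math.Atan2(
  Vector3.Dot(Vector3.Cross(reference, up), view),
  Vector3.Dot(reference, up)) * 180 / Math.PI;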

You can see that calculation in a HorizontalCoordinate structure I created that converts a Motion reading into altitude, azimuth and tilt, all in degrees. This structure is included in the AltitudeAndAzimuth project, which is among the downloadable code for this article. This program simply uses the Motion sensor to obtain the orientation of the phone, and then converts the information to horizontal coordinates. This project requires references to the Microsoft.Devices.Sensors assembly (for the Motion class) and the Microsoft.Xna.Framework assembly (for the 3D vector and matrix). The screen displays the transformed vector and the values from the HorizontalCoordinate structure. Figure 1 shows the phone held approximately upright with the lens pointed approximately east, and tilted clockwise a bit.

Figure 1 The AltitudeAndAzimuth Display

Getting the Big Picture

Suppose you want to view an image that’s much larger than the screen of your computer—or, in this case, your phone.

Traditionally, scrollbars have been involved. On a touchscreen, the scrollbars can be eliminated and the user can perform a similar scrolling operation using fingers.

But another approach is to conceptually wallpaper the interior of the celestial sphere with this large image and then view it by moving the phone itself. (Keep in mind that when you move the phone to view the image, you’re not moving the phone left and right or up and down in a plane. The movement has to be along arcs so that the altitude and azimuth are changing.)

How large can such an image be so that it pans across the screen in a natural way as the phone moves?

An average Windows Phone screen is probably about 2 inches wide and 3.33 inches tall. If you hold the phone 6 inches from your face, some simple trigonometry reveals that the screen occupies a field of view about 19° wide and 31° tall. Holding the phone in landscape mode, these two fields of view are slices from the total azimuth range of 360° and altitude range of 180°. Very roughly, then, the phone’s screen held 6 inches from your face in landscape mode occupies about 10 percent of the total field of view horizontally and vertically.
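The trigonometry in question is just the standard field-of-view calculation; for instance:

// Field of view subtended by the screen at a 6-inch viewing distance
double width = 2.0, height = 3.33, distance = 6.0;  // inches
double fovWidth = 2 * Math.Atan2(width / 2, distance) * 180 / Math.PI;   // about 19°
double fovHeight = 2 * Math.Atan2(height / 2, distance) * 180 / Math.PI; // about 31°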

Or think of it this way: If you want to use your phone in portrait mode to pan over the surface of a bitmap, that bitmap can be somewhere in the region of 8000 pixels wide and 4800 pixels high.

That’s the idea of the BigPicture project, which contains links for downloading eight images (a mix of paintings, photographs, documents and drawings, mostly from Wikipedia), the largest of which is 5649 pixels wide and 4000 pixels high. You can easily add other images by editing an XML file, but based on my experience using the PictureDecoder.DecodeJpeg method, you’re likely to encounter out-of-memory exceptions if you go much larger.
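If you do want to experiment with larger source files, the DecodeJpeg overload that accepts maximum pixel dimensions lets the decoder scale the image down as it loads. Here’s a sketch (the filename is hypothetical, and the call must be made on the UI thread):

// Sketch: decode a JPEG from isolated storage, capped at 4096 pixels
// in each dimension to limit memory use (filename is hypothetical)
using (IsolatedStorageFile storage =
  IsolatedStorageFile.GetUserStoreForApplication())
using (IsolatedStorageFileStream stream =
  storage.OpenFile("Images/bigpicture.jpg", FileMode.Open, FileAccess.Read))
{
  WriteableBitmap bitmap = PictureDecoder.DecodeJpeg(stream, 4096, 4096);
}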

Background File Transfers

Considering that most of the image files referenced by BigPicture are more than 2MB in size and one of them is 19MB, this seemed an ideal opportunity to make use of the facility added to Windows Phone to download files in the background.

In the BigPicture program, most of MainPage is devoted to maintaining a ListBox that lists the available files, and downloading them to isolated storage. Figure 2 shows the program with some images already downloaded (which are shown as thumbnails), one download in progress and others not yet downloaded.

Figure 2 The BigPicture Main Page

To use the background file transfer, you create an object of type BackgroundTransferRequest, passing to it the URL of the external file and the URL of a location in isolated storage within the /shared/transfers directory. You can then obtain changes in status and progress via events while the program is running, and you can enumerate the active requests when your program starts up again.
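In outline, the pattern looks something like this (a sketch rather than BigPicture’s actual code; the URLs and handler names are placeholders, and the classes come from the Microsoft.Phone.BackgroundTransfer namespace):

// Queue a background download (URLs and handler names are placeholders)
BackgroundTransferRequest request = new BackgroundTransferRequest(
  new Uri("http://www.example.com/images/bigpicture.jpg"),
  new Uri("shared/transfers/bigpicture.jpg", UriKind.Relative));
request.TransferStatusChanged += OnTransferStatusChanged;
request.TransferProgressChanged += OnTransferProgressChanged;
BackgroundTransferService.Add(request);
// On a later startup, re-attach handlers to any requests still pending
foreach (BackgroundTransferRequest pending in BackgroundTransferService.Requests)
{
  pending.TransferStatusChanged += OnTransferStatusChanged;
  pending.TransferProgressChanged += OnTransferProgressChanged;
}

A request stays in the service’s queue, counting against the per-application limit, until the program removes it with BackgroundTransferService.Remove.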

When the BigPicture program starts up, MainPage searches isolated storage for any images that might have been previously downloaded. I discovered that files are downloaded directly to the filename you specify, and not to a temporary file with some other filename. This means that my program was encountering files that had not yet been fully downloaded, or whose downloads might have been canceled. I fixed several bugs in my program by using the /shared/transfers directory only for downloading files and not for permanent storage. When a download completes, the program moves that file to another directory and creates a thumbnail in yet another directory. For convenience, all three files have the same name but are distinguished by the directory in which they’re found.
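The move itself is a small isolated-storage operation; here’s a sketch of that cleanup step, with hypothetical file and directory names:

// Sketch: move a completed download out of /shared/transfers
// (file and directory names are hypothetical)
using (IsolatedStorageFile storage =
  IsolatedStorageFile.GetUserStoreForApplication())
{
  if (!storage.DirectoryExists("Images"))
    storage.CreateDirectory("Images");
  storage.MoveFile("shared/transfers/bigpicture.jpg",
                   "Images/bigpicture.jpg");
}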

When a file has been downloaded by BigPicture, you can tap the item in the ListBox and the program navigates to ViewPage, which is the real heart of the program.

Viewing the Big Image

ViewPage has two viewing modes, which you can alternate between by tapping on the screen. An animation takes you from one mode to the other.

In the normal mode, shown in Figure 3, the image is displayed in its pixel size, conceptually stretched to the interior of a celestial sphere. You navigate around the image by changing the orientation of the phone, conceptually pointing the phone toward the area of the celestial sphere you want to view. (It might help if you stand up and turn your whole body in different directions, and point the phone up and down as well.)

Figure 3 BigPicture Showing One Small Part of a Large Painting

When you tap the phone, you shift to the zoom-out mode. The entire image is displayed unrotated in portrait mode, as shown in Figure 4. A rectangle displays the portion of the image that’s viewable in the normal mode. In this example, that rectangle is near the lower-right corner.

Figure 4 BigPicture Showing an Entire Large Painting

What happens at the edges? Because the bitmap is conceptually stretched to the interior of a celestial sphere, when you move the phone to the right beyond the right edge of the bitmap, you should then encounter the left edge. However, the layout system in Silverlight doesn’t wrap around in this way. If the program allowed opposite edges of a large bitmap to be visible, then two Image elements would be required. At the point where all four corners meet, four Image elements would be required.

I nixed that concept. Beyond the right edge of the bitmap is a gap equal to the maximum dimension of the phone’s display, and then the left edge appears. You’ll never see both edges in the display. This also solves the problem of what to do at the poles, where theoretically the top and bottom of the painting should be compressed to a point.

Figure 5 shows most of the XAML file for ViewPage. The Image element displays the bitmap itself, of course, and the None setting for the Stretch property indicates that it’s to be displayed in its pixel size. Normally a large image would be cropped by the layout system to the size of the display, and you wouldn’t be able to pan around the rest of the image. But putting everything inside a Canvas tricks the layout system into rendering the whole object. The Border with the embedded Rectangle is the rectangle visible in the zoomed-out mode, but it’s also visible hugging the inside of the screen in the normal mode. The CompositeTransform named imageTransform applies to both the Image and the Border. The other CompositeTransform named borderTransform applies only to the Border.

Figure 5 The XAML File for the BigPicture Image Viewing Page

<phone:PhoneApplicationPage ... >
  <Grid x:Name="LayoutRoot" Background="Transparent">
    <Canvas>
      <Grid>
        <Image Name="image" Stretch="None" />
        <Border Name="outlineBorder"
                BorderBrush="White"
                HorizontalAlignment="Left"
                VerticalAlignment="Top">
            <Rectangle Name="outlineRectangle"
                       Stroke="Black" />
            <Border.RenderTransform>
              <CompositeTransform x:Name="borderTransform" />
            </Border.RenderTransform>
        </Border>
        <Grid.RenderTransform>
          <CompositeTransform x:Name="imageTransform" />
        </Grid.RenderTransform>
      </Grid>
    </Canvas>
    <TextBlock Name="titleText"
               Style="{StaticResource PhoneTextNormalStyle}"
               Margin="12,17,0,28" />
    <TextBlock Name="statusText"
               Text="creating image..."
               HorizontalAlignment="Center"
               VerticalAlignment="Center" />
  </Grid>
</phone:PhoneApplicationPage>

The codebehind file starts the Motion sensor and then converts each rotation matrix into a HorizontalCoordinate object that it uses to set the properties of these two transforms. The ViewPage class also defines an InterpolationFactor dependency property that’s the target of an animation to transition between the two viewing modes. As InterpolationFactor is animated from 0 to 1, the view transitions from the normal mode to the zoom-out mode.
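That dependency property is registered in the usual Silverlight manner; here’s a minimal sketch, with the property-changed handler being the one shown in Figure 6:

public static readonly DependencyProperty InterpolationFactorProperty =
  DependencyProperty.Register("InterpolationFactor",
    typeof(double),
    typeof(ViewPage),
    new PropertyMetadata(0.0, OnInterpolationFactorChanged));

public double InterpolationFactor
{
  set { SetValue(InterpolationFactorProperty, value); }
  get { return (double)GetValue(InterpolationFactorProperty); }
}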

Figure 6 shows most of the math involved. One of the most important calculations occurs when the Motion sensor is updated: the calculation of the CenterX and CenterY properties of the CompositeTransform for the Image, and it’s where the altitude and azimuth come into play. This transform center is the point around which scaling and rotation occur; further calculations position this point at the center of the display in the normal viewing mode. The rectangular border is also aligned with this point.

Figure 6 Much of the Transform Math for BigPicture

public partial class ViewPage : PhoneApplicationPage
{
  ...
  void OnLoaded(object sender, RoutedEventArgs args)
  {
    // Save the screen dimensions
    screenWidth = this.ActualWidth;
    screenHeight = this.ActualHeight;
    maxDimension = Math.Max(screenWidth, screenHeight);
    // Initialize some values
    outlineBorder.Width = screenWidth;
    outlineBorder.Height = screenHeight;
    borderTransform.CenterX = screenWidth / 2;
    borderTransform.CenterY = screenHeight / 2;
    // Load the image from isolated storage
    ...
    // Save image dimensions
    imageWidth = bitmap.PixelWidth;
    imageHeight = bitmap.PixelHeight;
    ...
    zoomInScale = Math.Min(screenWidth / imageWidth, 
      screenHeight / imageHeight);
    UpdateImageTransforms();
    ...
  }
  ...
  void OnMotionCurrentValueChanged(object sender,
          SensorReadingEventArgs<MotionReading> args)
  {
    ...
    // Get the rotation matrix & convert to horizontal coordinates
    Matrix matrix = args.SensorReading.Attitude.RotationMatrix;
    HorizontalCoordinate horzCoord = 
      HorizontalCoordinate.FromMotionMatrix(matrix);
    // Set the transform center on the Image element
    imageTransform.CenterX = (imageWidth + maxDimension) *
      (180 + horzCoord.Azimuth) / 360 - maxDimension / 2;
    imageTransform.CenterY = (imageHeight + maxDimension) *
      (90 - horzCoord.Altitude) / 180 - maxDimension / 2;
    // Set the translation on the Border element
    borderTransform.TranslateX = 
      imageTransform.CenterX - screenWidth / 2;
    borderTransform.TranslateY = 
      imageTransform.CenterY - screenHeight / 2;
    // Get rotation from Tilt
    rotation = -horzCoord.Tilt;
    UpdateImageTransforms();
  }
  static void OnInterpolationFactorChanged(DependencyObject obj,
              DependencyPropertyChangedEventArgs args)
  {
    (obj as ViewPage).UpdateImageTransforms();
  }
  void UpdateImageTransforms()
  {
    // If being zoomed out, set scaling
    double interpolatedScale = 1 + InterpolationFactor * 
      (zoomInScale - 1);
    imageTransform.ScaleX =
    imageTransform.ScaleY = interpolatedScale;
    // Move transform center to screen center
    imageTransform.TranslateX = 
      screenWidth / 2 - imageTransform.CenterX;
    imageTransform.TranslateY = 
      screenHeight / 2 - imageTransform.CenterY;
    // If being zoomed out, adjust for scaling
    imageTransform.TranslateX -= InterpolationFactor *
      (screenWidth / 2 - zoomInScale * imageTransform.CenterX);
    imageTransform.TranslateY -= InterpolationFactor *
      (screenHeight / 2 - zoomInScale * imageTransform.CenterY);
    // If being zoomed out, center image in screen
    imageTransform.TranslateX += InterpolationFactor *
      (screenWidth - zoomInScale * imageWidth) / 2;
    imageTransform.TranslateY += InterpolationFactor *
      (screenHeight - zoomInScale * imageHeight) / 2;
    // Set border thickness
    outlineBorder.BorderThickness = 
      new Thickness(2 / interpolatedScale);
    outlineRectangle.StrokeThickness = 2 / interpolatedScale;
    // Set rotation on image and border
    imageTransform.Rotation = (1 - InterpolationFactor) * rotation;
    borderTransform.Rotation = -rotation;
  }
}

When the Azimuth is 0 (phone facing north) and the Altitude is 0 (upright), the CenterX and CenterY properties are set to the center of the bitmap. Notice the inclusion of the maxDimension value so that these CenterX and CenterY properties can be set to values outside the bitmap. This allows for the padding when you sweep past the edges.
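A quick substitution confirms this: with Azimuth at 0, CenterX becomes (imageWidth + maxDimension) * 180 / 360 - maxDimension / 2, which simplifies to imageWidth / 2; the same arithmetic with Altitude at 0 reduces CenterY to imageHeight / 2.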

Most of the remainder of the calculations occur during the UpdateImageTransforms method, which is called when the Motion sensor reports a new value, or when the InterpolationFactor property changes during transitions. Here’s where the scaling and translation of the Image transform occurs, as well as rotation.

If you’re interested in understanding the interaction of these transforms, you might want to clean them up by eliminating all the interpolation code. Examine the simplified formulas when InterpolationFactor is 0 and when it’s 1, and you’ll see that they’re actually quite straightforward.
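For example, when InterpolationFactor is 0, the scale is 1 and the translation reduces to TranslateX = screenWidth / 2 - imageTransform.CenterX (and likewise for Y), which simply moves the transform center to the middle of the screen. When InterpolationFactor is 1, the terms rearrange to TranslateX = (zoomInScale - 1) * imageTransform.CenterX + (screenWidth - zoomInScale * imageWidth) / 2, which is exactly the translation that centers the scaled-down image on the screen.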


Charles Petzold is a longtime contributor to MSDN Magazine, and is currently updating his classic book “Programming Windows” (Microsoft Press, 1998) for Windows 8. His Web site is charlespetzold.com.

Thanks to the following technical expert for reviewing this article: Donn Morse