April 2009

Volume 24 Number 04

Service Station - Creating And Consuming Web Feeds

By Jon Flanders | April 2009

Code download available

Contents

Web Feeds
WCF and Feeds
Feed structure
Exposing a Feed
Implementing the Feed
AtomPub
Consuming Feeds

In the January 2009 issue of MSDN Magazine, I discussed the basics of using Representational State Transfer (REST) as the architecture for building client-server applications. I also started looking at how you could build such applications using the Windows Communication Foundation (WCF) features in the Microsoft .NET Framework 3.5.

This month, I'm going to talk about some additional Web feed capabilities of WCF that are made possible by that base REST support. While reading the previous Service Station column will help you understand this one, you can safely read this one without doing any harm to yourself (although some knowledge of WCF REST basics is helpful).

Web Feeds

Web feeds have been around for almost a decade as a way for applications to pull information from the World Wide Web. Traditionally used to expose articles and information from blogs, feeds are now commonly used to expose other types of data as well because of the simplicity of the architecture.

Feeds are simply XML data exposed via an HTTP URI. A feed reader is responsible for polling the URI at some interval and then processing the retrieved data. The reader, when it is an application intended to expose the feed data to a human, will notify the user that new data is available. In the case of an unattended feed reader (which could be any application code), typically some application action will be triggered by the acquisition of new data.

Considering that many feeds expose blog data, why is this the topic of a column about building services? The simplicity of feed formats for exposing data, plus the ubiquity of feed readers (and libraries to parse feeds), has led to many other types of data being exposed as feeds. ADO.NET Data Services, for example, is a service-enabled data layer built completely using feed technology, specifically a standard built on top of Atom called the Atom Publishing Protocol (AtomPub), which I'll discuss later in this article. Microsoft Azure exposes much of its data functionality in this way, as do services from a number of other companies.

One reason for the success of feeds is that they use a well-known and mostly standardized XML format. Rather than using a custom format per data provider, the industry has embraced two common feed schemas: Really Simple Syndication (RSS) and Atom. RSS is the older of the two formats, and although Atom has emerged as the broadly adopted standard, many feeds still expose both RSS and Atom versions. If you are building a service, specifically a RESTful service, you would be well served to consider building it using a feed, using RSS or Atom if the service is read-only, or using AtomPub if the service is read/write.

WCF and Feeds

On top of its base support for building RESTful endpoints, WCF in the .NET Framework 3.5 also includes support for exposing feeds. You can create a WCF HTTP-based endpoint and expose feeds using either common feed format. WCF in the .NET Framework 3.5 SP1 adds additional support for feeds by enabling you to easily implement an endpoint that supports the AtomPub specification.

In my opinion, one of the nice features of the feed support in WCF is that it abstracts away the need to know the deep details of the Atom or RSS formats. WCF provides an object model that you use to build a generic object graph (the object model is more like Atom, since Atom has a somewhat richer model than RSS). You can then take the top level of this object graph and pass it to another framework-provided object known as a formatter. The formatter takes care of serializing the object graph data into the correct feed format. This is how you can easily expose both RSS and Atom from the same endpoint, and it also insulates you from future changes in feed formats, since all that will be needed is a new formatter. Your job is to fill in the object graph with data. The infrastructure takes care of converting your data into the correct feed format.
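To make the split between the object graph and the formatter concrete, here is a minimal sketch (not code from the download for this column). It assumes a reference to the assembly that contains System.ServiceModel.Syndication; the class name FeedFormatSketch, the example.org URIs, and the sample titles are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.ServiceModel.Syndication;
using System.Xml;

public static class FeedFormatSketch
{
    // Builds a tiny in-memory feed; titles and URIs are illustrative only.
    public static SyndicationFeed BuildFeed()
    {
        SyndicationFeed feed = new SyndicationFeed(
            "Sample feed",
            "A feed used only for this sketch",
            new Uri("http://example.org/feed"));

        List<SyndicationItem> items = new List<SyndicationItem>();
        items.Add(new SyndicationItem(
            "First item",
            "Plain-text content of the first item",
            new Uri("http://example.org/items/1")));
        feed.Items = items;
        return feed;
    }

    // Serializes the same object graph with whichever formatter is requested.
    public static string Render(SyndicationFeed feed, bool asRss)
    {
        SyndicationFeedFormatter formatter = asRss
            ? (SyndicationFeedFormatter)new Rss20FeedFormatter(feed)
            : new Atom10FeedFormatter(feed);

        StringWriter text = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(text))
        {
            formatter.WriteTo(writer);
        }
        return text.ToString();
    }

    public static void Main()
    {
        SyndicationFeed feed = BuildFeed();
        Console.WriteLine(Render(feed, false)); // Atom 1.0 document
        Console.WriteLine(Render(feed, true));  // RSS 2.0 document
    }
}
```

The only line that differs between the two outputs is the choice of formatter; the feed data itself is built once.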

Feed structure

Before I go further, let me talk a little about the structure of feeds and how it relates to the WCF object model. Feeds are a pretty simple way to expose a collection of data (see Figure 1). A feed (generically) includes a title, a description, a published date, a last-updated date/time, as well as a collection of items. Each of these items is also a fairly simple piece of data, with a published date, a title, and content. The content generally contains the important data, whether that is a blog entry, a newspaper article, or some other piece of information. Each item can have hyperlinks that can link to an individual item or to an alternate representation of the item (like a full HTML page of its data rather than just a summary).

Figure 1 Architecture of a Feed

Imagine I wanted to expose information about MSDN Magazine articles through a feed. First, I need to decide what data to expose via this feed. Although there are a few options, I decided to implement this example as a feed of all the issues and articles published in MSDN Magazine. When a feed reader gets an update, it will show each article published as a separate item, with a small summary of the article as the content.

For this example, I'm going to be extending the code from my last column, where I built a generic RESTful service endpoint to serve up the data from MSDN Magazine. Again, you don't necessarily have to go back and read that article to understand this one, but there is a purpose to my choosing this particular scenario as my example, and it actually applies to both RESTful and SOAP-based Web services. Often, a feed is a good implementation detail to add to an existing service. The nice thing about WCF is that it is straightforward to add a feed endpoint to just about any service you've built with WCF regardless of whether the whole service is RESTful.

Exposing a Feed

The first thing I'm going to do is extend my service contract and definition to include an operation that can expose a feed. To do this, I've added a new method named GetIssuesFeed to my IMSDNMagazineService interface (see Figure 2).

Figure 2 Adding a GetIssuesFeed Method

    [ServiceContract]
    public interface IMSDNMagazineService
    {
        [OperationContract]
        [WebGet(UriTemplate = "/")]
        IssuesCollection GetAllIssues();

        [OperationContract]
        [WebGet(UriTemplate = "/feed?format={format}")]
        [ServiceKnownType(typeof(Atom10FeedFormatter))]
        [ServiceKnownType(typeof(Rss20FeedFormatter))]
        SyndicationFeedFormatter GetIssuesFeed(string format);

        [OperationContract]
        [WebGet(UriTemplate = "/{year}")]
        IssuesData GetIssuesByYear(string year);

        //remainder of interface omitted for clarity ...
    }

There are a couple of things to note about this method. First, the return type is SyndicationFeedFormatter, part of the WCF feed support I referred to earlier generically as the formatter. It will reference the underlying object that I've filled with data, and it will generate the correct XML feed format depending on the type of the feed object.

The other thing to notice is the UriTemplate property of WebGetAttribute. This indicates to WCF what URI each method should respond to or handle. What is of interest in this case is that the GetIssuesFeed method has a URI template that is a literal: "/feed". GetIssuesByYear (which is the generic RESTful method that has nothing to do with feeds) has a variable path segment ("/{year}") in its URI. Since the URI parsing and routing mechanism turns everything into a string, it appears that these two UriTemplates would be in conflict.

The rules of UriTemplate parsing in WCF resolve this by preferring a literal match over a variable match. So if the path of the incoming URI is "/feed", the request is routed to GetIssuesFeed, and any other URI with a single path segment is routed to GetIssuesByYear. Note that I'm also using a query string to indicate which feed format the client is requesting. (I'll return RSS only if requested specifically; otherwise I'll return Atom.)
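One way to watch this precedence rule at work outside a running service is to load the same two templates into a UriTemplateTable and ask it for a match. This is only a sketch: it assumes the .NET Framework 3.5 System.ServiceModel.Web assembly (where UriTemplate and UriTemplateTable live), and the base address, the class name TemplateRoutingSketch, and the operation-name strings stored as data are mine, chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class TemplateRoutingSketch
{
    // Returns the label attached to whichever template matches, or null if none does.
    public static string Route(string relativeUri)
    {
        // Hypothetical base address; only the part after it is matched.
        Uri baseAddress = new Uri("http://localhost/Issues.svc/");
        UriTemplateTable table = new UriTemplateTable(baseAddress);

        // Same templates as the service contract, each tagged with its operation name.
        table.KeyValuePairs.Add(new KeyValuePair<UriTemplate, object>(
            new UriTemplate("/feed?format={format}"), "GetIssuesFeed"));
        table.KeyValuePairs.Add(new KeyValuePair<UriTemplate, object>(
            new UriTemplate("/{year}"), "GetIssuesByYear"));

        UriTemplateMatch match = table.MatchSingle(new Uri(baseAddress, relativeUri));
        return match == null ? null : (string)match.Data;
    }

    public static void Main()
    {
        foreach (string candidate in new string[] { "feed", "feed?format=rss", "2009" })
        {
            Console.WriteLine("{0} -> {1}", candidate, Route(candidate));
        }
    }
}
```

If the literal-first rule holds as described, the two "feed" candidates should report GetIssuesFeed and "2009" should report GetIssuesByYear.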

The third thing of interest on the GetIssuesFeed method is ServiceKnownTypeAttribute. This attribute is used by WCF to inform its serialization layer of any derived classes of the return value that might be passed as the actual return value (which it needs in order to correctly perform serialization). In this case, I potentially want to return both formats, and so I have two ServiceKnownTypeAttribute declarations, each referencing the appropriate derived class of SyndicationFeedFormatter: Atom10FeedFormatter and Rss20FeedFormatter. It is fairly obvious from the class names, but Atom10FeedFormatter returns a feed using the Atom 1.0 format, and Rss20FeedFormatter returns a feed using the RSS 2.0 format. This is just normal WCF infrastructure, but you may not have seen this functionality unless you've tried to do polymorphism with WCF return values before.

Implementing the Feed

Whew, that's all without getting to the implementation. The implementation is actually fairly easy since it just needs to take my data about issues and articles and transfer that data into the WCF object model. The object model isn't based around SyndicationFeedFormatter—remember that's the object that does the formatting. The object model centers around a type named SyndicationFeed (see Figure 3). A SyndicationFeed object can be used to get a valid SyndicationFeedFormatter, and it is the object that holds onto each feed item. The items are part of the SyndicationFeed.Items collection.

Figure 3 SyndicationFeed Object Model

The SyndicationFeed object model follows the idea of a generic feed, although its model is geared slightly towards Atom (with the Authors and Links collections, as well as the Content and Summary properties on SyndicationItem). Figure 4 shows the code that implements the GetIssuesFeed method on my service.

Figure 4 GetIssuesFeed

    public SyndicationFeedFormatter GetIssuesFeed(string format)
    {
        SyndicationFeedFormatter ret = null;
        SyndicationFeed myFeedData = new SyndicationFeed();
        myFeedData.Title = new TextSyndicationContent("MSDN Magazine feed");
        Articles articles = GetAllArticles();
        SyndicationItem sitem = null;
        //the list has to be attached to the feed explicitly
        List<SyndicationItem> list = new List<SyndicationItem>();
        myFeedData.Items = list;
        SyndicationLink altLink = null;
        foreach (var item in articles)
        {
            sitem = new SyndicationItem
            {
                Title = new TextSyndicationContent("MSDN Magazine Article: " + item.Title),
                Content = new XmlSyndicationContent(GenerateContent(item)),
                PublishDate = GetPublished(item),
                LastUpdatedTime = GetUpdated(item)
            };
            altLink = MakeLinkForArticle(item);
            sitem.Links.Add(altLink);
            list.Add(sitem);
        }
        //return RSS only when it is requested specifically; Atom is the default
        //(the query value "rss" is an assumption; the text doesn't name it)
        if (string.Equals(format, "rss", StringComparison.OrdinalIgnoreCase))
            ret = new Rss20FeedFormatter(myFeedData);
        else
            ret = new Atom10FeedFormatter(myFeedData);
        return ret;
    }

This is uncomplicated code and follows the same basic pattern as all feed code in WCF: create a SyndicationFeed object, set its properties as appropriate, create a list of SyndicationItems, associate the list with the SyndicationFeed object (this isn't done automatically), and populate the list with SyndicationItems containing your data. (You can download the full code from the MSDN Magazine Web site, so for brevity, I haven't shown all the helper methods here.)

Filling in the SyndicationItem properties does introduce a bit of complexity depending on whether you are planning on exposing RSS or Atom (or both). But essentially it's setting the Title, PublishDate, LastUpdatedTime, and Content properties.

Another thing to note about this code is that LastUpdatedTime and PublishDate are of type DateTimeOffset, a new type that was added in the .NET Framework 3.5 to simplify working with exact dates and times as well as time zones.

Title and Content are a little different than you might expect. Title is of type TextSyndicationContent, and Content is the base SyndicationContent type. These types (and the other main derived type, XmlSyndicationContent) are used to manage the different parts of feeds that have content, but they sometimes vary in the content they contain. For example, the SyndicationItem.Content property could have plain text, some form of XML (even XHTML or HTML), or a link to some sort of binary content (hence the UrlSyndicationContent type). SyndicationContent is a needed layer of indirection that supports all these different types of data inside feed items.
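The following sketch lines up the content types side by side so you can see what the Type property reports for each one. It is illustrative only: the class name ContentKindsSketch, the urn:example XML payload, and the picture URI are invented, and wrapping the XML in a SyndicationElementExtension is just one of several ways to construct an XmlSyndicationContent.

```csharp
using System;
using System.IO;
using System.ServiceModel.Syndication;
using System.Xml;

public static class ContentKindsSketch
{
    // Builds one sample of each kind of content a feed item can carry.
    public static SyndicationContent[] BuildSamples()
    {
        // Made-up XML payload standing in for real article data.
        XmlReader payload = XmlReader.Create(new StringReader(
            "<article xmlns=\"urn:example\"><issue>April 2009</issue></article>"));

        return new SyndicationContent[]
        {
            // Plain text (the default kind).
            new TextSyndicationContent("Just text"),
            // Text that should be interpreted as HTML markup.
            new TextSyndicationContent("<p>Some <b>HTML</b></p>", TextSyndicationContentKind.Html),
            // Arbitrary XML carried inside the content element.
            new XmlSyndicationContent("application/xml", new SyndicationElementExtension(payload)),
            // A pointer to binary content that lives elsewhere.
            new UrlSyndicationContent(new Uri("http://example.org/pictures/cover.png"), "image/png")
        };
    }

    public static void Main()
    {
        foreach (SyndicationContent content in BuildSamples())
        {
            Console.WriteLine("{0}: {1}", content.GetType().Name, content.Type);
        }
    }
}
```

Because SyndicationItem.Content is typed as the base class, any of these four objects can be assigned to it without the item needing to know which kind it holds.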

Once I have this feed up and running, I can make a user agent request for it, and Internet Explorer is happy to pick up the feed and allow the user to subscribe to it (as would any feed reader user agent).

AtomPub

Another reason that many sites and services are moving towards Atom is the Atom Publishing Protocol (AtomPub). Where Atom is a format for feeds, AtomPub is a specification for retrieving, creating, and updating resources. It's a specification built on top of the constraints of REST, so working with it is straightforward once you understand REST. Because most of the interaction is done with Atom instances, the format is also well known and understood.

Most of AtomPub centers around Atom feeds, but there are two new document types: service document and category. The .NET Framework 3.5 SP1 adds support for AtomPub via two new objects that support these new document types. (You could implement AtomPub with just the .NET Framework 3.5, but SP1 adds features that make your job much easier.)

An AtomPub service document is a metadata document that informs a user agent about what workspaces a particular AtomPub endpoint supports. A workspace is a named group of collections, each of which is really a hyperlink to a particular Atom feed of member resources. While the service document can also specify things such as what media types each collection accepts (like images or other binary files), all feeds are expected to support standard Atom entries.

The AtomPub specification also specifies how to use the uniform interface against each collection (feed) and its member resources. You can see this mapping in Figure 5.

Figure 5 AtomPub Application of the Uniform Interface
| Resource | Uniform interface method | Description |
| --- | --- | --- |
| Service Document | GET | Once the URI is known by a user agent, the Service Document can be retrieved via GET |
| Category Document | GET | Used to get the representation of the category |
| Collection | GET | Gets the representation (an Atom feed) |
| Collection | POST | Creates a new Atom entry |
| Member | GET | Gets an individual member, which can be an individual Atom entry or a binary file |
| Member | PUT | Modifies a member |
| Member | DELETE | Deletes a member |
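To show what the "Collection POST" row looks like from the client side, here is a hedged sketch that serializes a SyndicationItem as a standalone Atom entry and prepares (but does not send) the request. Note that the service in this column only implements GET, so the collection URI below is a placeholder borrowed from my sample endpoint, and the class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Net;
using System.ServiceModel.Syndication;
using System.Text;
using System.Xml;

public static class AtomPubClientSketch
{
    // Serializes a single item as a standalone Atom entry document in UTF-8.
    public static byte[] ToAtomEntry(SyndicationItem item)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = new UTF8Encoding(false); // no byte-order mark
        MemoryStream buffer = new MemoryStream();
        using (XmlWriter writer = XmlWriter.Create(buffer, settings))
        {
            item.SaveAsAtom10(writer);
        }
        return buffer.ToArray();
    }

    // Prepares, but does not send, the POST that asks a collection to create a member.
    public static HttpWebRequest PrepareCreate(Uri collectionUri)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(collectionUri);
        request.Method = "POST";
        request.ContentType = "application/atom+xml;type=entry";
        return request;
    }

    public static void Main()
    {
        SyndicationItem entry = new SyndicationItem();
        entry.Title = new TextSyndicationContent("A new entry");
        entry.Content = new TextSyndicationContent("Body of the new entry");
        entry.LastUpdatedTime = DateTimeOffset.UtcNow;

        byte[] body = ToAtomEntry(entry);
        // Placeholder collection URI; a real one comes from the service document.
        HttpWebRequest request = PrepareCreate(new Uri("https://localhost:1355/Issues.svc/feed"));

        Console.WriteLine("{0} {1}", request.Method, request.RequestUri);
        Console.WriteLine(Encoding.UTF8.GetString(body));
        // To send: write body to request.GetRequestStream(), then call request.GetResponse().
    }
}
```

PUT and DELETE against a member follow the same shape, with the member's own URI and a different Method value.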

The types added by SP1 are ServiceDocument (which plays a role similar to SyndicationFeed) and AtomPub10ServiceDocumentFormatter (which serves the same purpose as SyndicationFeedFormatter). There also is a type named ResourceCollectionInfo to represent each collection in the service document. This type has a relationship to ServiceDocument somewhat similar to the relationship between SyndicationItem and SyndicationFeed, since ServiceDocument holds onto lists of ResourceCollectionInfos (grouped into Workspace objects) to generate the main content of the service document itself.

To support AtomPub on my endpoint, first I need to add another method to my contract:

    [OperationContract]
    [WebGet(UriTemplate = "/servicedoc")]
    AtomPub10ServiceDocumentFormatter GetServiceDoc();

Then I can provide the implementation as shown in Figure 6. This will produce the service document shown in Figure 7.

Figure 6 AtomPub Service Document Implementation

    public AtomPub10ServiceDocumentFormatter GetServiceDoc()
    {
        OutgoingWebResponseContext ctx = WebOperationContext.Current.OutgoingResponse;
        ctx.ContentType = "application/atomsvc+xml";
        AtomPub10ServiceDocumentFormatter ret = null;

        //create the ServiceDocument type
        ServiceDocument doc = new ServiceDocument();
        IncomingWebRequestContext ictx = WebOperationContext.Current.IncomingRequest;
        //set the BaseUri to the current request URI
        doc.BaseUri = ictx.UriTemplateMatch.RequestUri;

        //create a Collection of resources
        List<ResourceCollectionInfo> resources = new List<ResourceCollectionInfo>();

        //create the Blog resource
        ResourceCollectionInfo mainBlog = new ResourceCollectionInfo(
            "MSDNMagazine", new Uri("feed", UriKind.Relative));
        //add the Accepts for this resource
        //remember this is the default if no accepts is present
        mainBlog.Accepts.Add("application/atom+xml;type=entry");
        resources.Add(mainBlog);

        //create the Pictures resource
        ResourceCollectionInfo mainPictures = new ResourceCollectionInfo(
            "Pictures", new Uri("pictures", UriKind.Relative));
        //add the Accepts for this resource
        mainPictures.Accepts.Add("image/png");
        mainPictures.Accepts.Add("image/jpeg");
        mainPictures.Accepts.Add("image/gif");
        resources.Add(mainPictures);

        //create the Workspace
        Workspace main = new Workspace("Main", resources);
        //add the Workspace to the Service Document
        doc.Workspaces.Add(main);

        //get the formatter
        ret = doc.GetFormatter() as AtomPub10ServiceDocumentFormatter;
        return ret;
    }

Figure 7 Defining the Service

    <?xml version="1.0" encoding="utf-8"?>
    <service xml:base="https://localhost:1355/Issues.svc/servicedoc"
             xmlns="http://www.w3.org/2007/app"
             xmlns:a10="http://www.w3.org/2005/Atom">
      <app:workspace xmlns:app="http://www.w3.org/2007/app">
        <a10:title type="text">Main</a10:title>
        <app:collection href="feed">
          <a10:title type="text">MSDNMagazine</a10:title>
          <app:accept>application/atom+xml;type=entry</app:accept>
        </app:collection>
        <app:collection href="pictures">
          <a10:title type="text">Pictures</a10:title>
          <app:accept>image/png</app:accept>
          <app:accept>image/jpeg</app:accept>
          <app:accept>image/gif</app:accept>
        </app:collection>
      </app:workspace>
    </service>

This service document communicates that there is one workspace named Main that has two collections. One collection has the relative URI of "feed" and uses the default media type as its input (the media type for Atom entries). The other collection holds binary files, specifically image files (restricted to the media types listed) and has a URI of "pictures." Given this document, a user agent can easily interact with my AtomPub endpoint following the rules set out in Figure 5.
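A user agent can read a service document with the same SP1 types. The sketch below builds a small document in memory, saves it, and loads it back with ServiceDocument.Load so that it runs without a live endpoint; against a real service you would create the XmlReader over the service document URI instead. The class name ServiceDocumentReaderSketch is hypothetical.

```csharp
using System;
using System.IO;
using System.ServiceModel.Syndication;
using System.Xml;

public static class ServiceDocumentReaderSketch
{
    // Builds a one-collection service document, serializes it, and parses it back.
    public static ServiceDocument RoundTrip()
    {
        ResourceCollectionInfo feedCollection = new ResourceCollectionInfo(
            "MSDNMagazine", new Uri("feed", UriKind.Relative));
        feedCollection.Accepts.Add("application/atom+xml;type=entry");

        Workspace main = new Workspace(
            "Main", new ResourceCollectionInfo[] { feedCollection });
        ServiceDocument doc = new ServiceDocument();
        doc.Workspaces.Add(main);

        StringWriter text = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(text))
        {
            doc.Save(writer);
        }

        // A real client would create this reader over the service document URI.
        using (XmlReader reader = XmlReader.Create(new StringReader(text.ToString())))
        {
            return ServiceDocument.Load(reader);
        }
    }

    public static void Main()
    {
        ServiceDocument doc = RoundTrip();
        foreach (Workspace workspace in doc.Workspaces)
        {
            Console.WriteLine("Workspace: {0}", workspace.Title.Text);
            foreach (ResourceCollectionInfo collection in workspace.Collections)
            {
                Console.WriteLine("  Collection: {0} ({1})", collection.Title.Text, collection.Link);
                foreach (string accept in collection.Accepts)
                {
                    Console.WriteLine("    accepts {0}", accept);
                }
            }
        }
    }
}
```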

AtomPub is a useful standard that can be applied to many different scenarios, not just blogs and associated files (like images).

Consuming Feeds

The SyndicationFeed API can also work on the client side for consuming a feed. The various current feed formats (and there probably will be more in the future) make this a nontrivial task using custom code, so this is a very nice addition to the .NET Framework base library. Here's some simple code that demonstrates consuming a feed:

    //requires the System.Xml and System.ServiceModel.Syndication namespaces
    string uri = "https://localhost:1355/Issues.svc/feed";
    using (XmlReader xr = XmlReader.Create(uri))
    {
        SyndicationFeed feed = SyndicationFeed.Load(xr);
        Console.WriteLine("Feed title:{0}", feed.Title.Text);
        foreach (var item in feed.Items)
        {
            Console.WriteLine("Item {0}", item.Title.Text);
        }
    }
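An unattended feed reader usually cares only about what changed since its last poll. As a rough sketch of that step, the code below filters loaded items by LastUpdatedTime. To stay runnable without a live endpoint, it stands in for a poll by serializing a made-up feed as Atom and loading it back; the class name FeedPollingSketch, the sample items, and their dates are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.ServiceModel.Syndication;
using System.Xml;

public static class FeedPollingSketch
{
    // Returns the items updated after the moment the reader last checked the feed.
    public static List<SyndicationItem> NewSince(SyndicationFeed feed, DateTimeOffset lastSeen)
    {
        List<SyndicationItem> fresh = new List<SyndicationItem>();
        foreach (SyndicationItem item in feed.Items)
        {
            if (item.LastUpdatedTime > lastSeen)
            {
                fresh.Add(item);
            }
        }
        return fresh;
    }

    // Stands in for one poll: serializes a sample feed as Atom and loads it back.
    public static SyndicationFeed FetchSample()
    {
        SyndicationItem march = new SyndicationItem(
            "March article", "Summary", new Uri("http://example.org/march"));
        march.LastUpdatedTime = new DateTimeOffset(2009, 3, 1, 0, 0, 0, TimeSpan.Zero);

        SyndicationItem april = new SyndicationItem(
            "April article", "Summary", new Uri("http://example.org/april"));
        april.LastUpdatedTime = new DateTimeOffset(2009, 4, 1, 0, 0, 0, TimeSpan.Zero);

        SyndicationFeed feed = new SyndicationFeed(
            "Sample feed", "Illustrative data only", new Uri("http://example.org/feed"));
        feed.Items = new List<SyndicationItem> { march, april };

        StringWriter text = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(text))
        {
            feed.SaveAsAtom10(writer);
        }
        using (XmlReader reader = XmlReader.Create(new StringReader(text.ToString())))
        {
            return SyndicationFeed.Load(reader);
        }
    }

    public static void Main()
    {
        DateTimeOffset lastSeen = new DateTimeOffset(2009, 3, 15, 0, 0, 0, TimeSpan.Zero);
        foreach (SyndicationItem item in NewSince(FetchSample(), lastSeen))
        {
            Console.WriteLine("New since last poll: {0}", item.Title.Text);
        }
    }
}
```

A real reader would persist lastSeen between polls and replace FetchSample with a load from the feed URI.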

Many companies are standardizing as much as possible around AtomPub for building services and frameworks. ADO.NET Data Services uses AtomPub to expose data entities. In turn, ADO.NET Data Services is used for all the table, blob, and queue functionality in Microsoft Azure. Windows Live Services is using AtomPub for most of its exposed services. The point here is that even if you aren't exposing Atom or AtomPub from any applications you are building, the chances are pretty high that you'll need to know how to interact with and consume AtomPub services now or sometime in the near future. Hopefully what I've shown you here will help get you started.

ADO.NET Data Services or WCF?

ADO.NET Data Services and WCF in the .NET Framework 3.5 both provide ways to easily expose RESTful services that generate standards-compliant Web feeds. So when should you use ADO.NET Data Services and when should you use the built-in support in WCF?

WCF provides functionality for exposing and calling service endpoints. With WCF 3.5, service endpoints may be standard SOAP-based or follow a simple RESTful service interface style. To expose a RESTful endpoint, you annotate the methods within your service class with the UriTemplate used to define the resource identifier. URIs that match that template result in calls to your method, passing in parameters from the template. The set of annotated methods within your class makes up your service contract.

To return results from a WCF service as an RSS or Atom-compatible feed, you create within your method a SyndicationFeed object containing the list of SyndicationItems representing the content you want to return. You then pass that SyndicationFeed in the creation of a SyndicationFeedFormatter of the appropriate type, and return that SyndicationFeedFormatter from your method.

ADO.NET Data Services is built on WCF and provides a queryable REST-based endpoint over an Entity Model, such as a model exposed by the Microsoft Entity Framework or other IQueryable data source. With ADO.NET Data Services, the model becomes the service contract. That model can be queried through standard URIs that, in addition to the service root and resource path, include query options such as filtering, ordering, paging, and graph expansion. In addition, ADO.NET Data Services provides features for controlling access permissions, exposing custom service operations, and customizing how resources are accessed (for example, for implementing user-based security) through query interception.

ADO.NET Data Services supports JSON and AtomPub specifications, including the ability to do inserts, updates, and deletes against the service (as allowed) through standard HTTP verbs. Thus, ADO.NET Data Services can be accessed from any HTTP client.

In addition, Microsoft ships an ADO.NET Data Services Client for the .NET Client Framework, Silverlight, and Ajax that provides a strongly typed, LINQ-based experience for querying and updating data against an ADO.NET Data Service-compliant endpoint (whether that endpoint be implemented using ADO.NET Data Services or a custom implementation, such as Azure Table).

So if you want to expose a REST-based queryable model over a data source such as an ADO.NET Data Provider or IQueryable source (including in-memory collections) and return results in a JSON or AtomPub compatible format that can be consumed through either a simple HTTP client or a strongly typed client, then ADO.NET Data Services provides the most functionality for the least amount of work.

On the other hand, if you want to implement a set of methods that hand-craft a feed based on custom application logic, then the built-in functionality within WCF makes it easy to expose those methods as a RESTful service capable of returning the feed in an RSS or Atom-compatible format.

—Michael Pizzo, Principal Architect, Data Programmability

Send your questions and comments to sstation@microsoft.com.

Jon Flanders is an independent consultant, speaker, and trainer for Pluralsight. He specializes in BizTalk Server, Windows Workflow Foundation, and Windows Communication Foundation.