Project Astoria

The Architecture Journal

by Pablo Castro

Summary: Project Astoria delivers a set of patterns as well as a concrete infrastructure for creating and consuming data services using Web technologies. This article explores how modern data-centric Web applications and services are changing and the role Astoria can play in their new architectures.

Data is becoming increasingly available as a first-class element on the Web. The proliferation of new data-driven application types such as mashups clearly indicates that the broad availability of standalone data independent of any user interface is changing the way systems are built and the way data can be leveraged. Furthermore, technologies such as Asynchronous JavaScript and XML (AJAX) and Microsoft Silverlight are introducing the need for mechanisms to exchange data independently from presentation information in order to support highly interactive user experiences.

Project Astoria provides architects and developers with a set of patterns for interacting with data services over HTTP using simple formats such as POX (Plain Old XML) and JavaScript Object Notation (JSON). Closely following the HTTP protocol results in excellent integration with the existing Web infrastructure, from authentication to proxies to caching. In addition to the patterns and formats, we provide a concrete implementation that can automatically surface an Entity Data Model schema or other data sources through the HTTP interface.

While the software infrastructure provided by Astoria can be a useful piece of the puzzle when building new Web applications and services, it is just one piece. Other elements in these applications need to be organized in a way that enables interaction with data across the Web.

In this article, I will discuss the architectural aspects that are impacted by modern approaches for Web-enabled application development, focusing on how data services integrate into the picture.

Contents

Separating Data from Presentation
Flexible Data Interfaces
Interacting with Data in the Presentation Layer
Introducing Business Logic
Extensibility and Alternate Data Sources
Deployment Scenarios: Applications and Services
Security
Closing Notes: Scope and Plans for Project Astoria
Resources
About the Author

Separating Data from Presentation

The new generation of Web applications uses technologies such as AJAX, Microsoft Silverlight, or other rich-presentation infrastructures. One common characteristic of all these technologies is that they change the way Web applications are built.

Figure 1 compares the flow of content in traditional and modern Web applications. Traditional data-driven Web applications typically consist of a set of server-side pages or scripts that run when a request arrives; while running, they execute a few queries against a database and then render HTML that contains both presentation information and the data embedded in it.

Figure 1: Traditional flow of content (left) and how it changes in modern Web applications (right)

An AJAX-based or Silverlight-based application does not follow that interaction model. Presentation information is shipped to the Web browser along with code to drive the user interface, but without actual data. That presentation information is typically a combination of HTML, Cascading Style Sheets (CSS), and JavaScript for AJAX applications, and Extensible Application Markup Language (XAML) and DLLs for Silverlight. Once the code is up and running in the client, the initial user interface is presented and data is retrieved as the user interacts with the interface.

Interestingly, this new round of technologies is acting as a forcing function to push application organization toward a goal that is common in many application architectures: strict separation of data and presentation.

Serving user-interface elements is relatively straightforward from the server perspective. Most of the time these are simple file resources on the server, such as HTML, CSS, or media files. Serving data is another story. Until now, interaction with data was something that happened between the Web server and the database server; there was no need to expose entry points accessible from code running across the Web in a Web browser or some other software agent. This is where Project Astoria comes in.

Flexible Data Interfaces

There are a number of ways to expose data to clients that will consume it from across the Web. One option enabled by existing technologies is an approach similar to Remote Procedure Call (RPC), where “functions” are exposed through an interface such as Web services (for example, a SOAP-based interface or a simple URI-based convention for invoking methods and passing parameters). Microsoft Visual Studio has a mature set of tools that makes it straightforward to both create and consume interfaces built this way. The ASP.NET AJAX toolkit takes this to the next level by enabling Visual Studio-created Web services to work with AJAX clients.

The main issue with this approach is flexibility. If interaction with data only happens through fixed, predefined entry points, then every new scenario or every variation of existing ones with slightly different data will typically require the creation of new entry points. While this level of control is occasionally desired, in many cases more flexibility would increase development productivity and contribute to a more dynamic application.

Project Astoria introduces an alternative to the RPC approach that is based on the simple semantics of HTTP. Astoria takes a schema definition that describes each one of the entities that your application deals with, along with the associations between entities, and exposes them over an HTTP interface. Each entity is addressable with a Uniform Resource Identifier (URI), and a URI convention allows applications to traverse associations between entities, search entities, and perform other commonly needed operations on data.

The schema definition used by Astoria is an Entity Data Model (EDM) schema, which is supported directly by the ADO.NET Entity Framework. The Entity Framework also includes a powerful mapping engine that allows developers to map the EDM schema to a relational database for actual storage.

In order to show interaction with Astoria services, I will use an example based on the well-known Northwind sample database. A data service on top of Northwind can be set up by creating an ASP.NET application, importing the Northwind schema from the database into an EDM schema using the EDM wizard, and then creating an Astoria data service pointing to that EDM schema.

Detailed steps for creating Astoria data services, as well as extensive documentation on the Astoria URI and payload formats, can be found in the “Using Microsoft Codename Astoria” document available on the Astoria Web site at http://astoria.mslivelabs.com.

With the service up and running, URIs can be used to browse the resources exposed by the data interface. The following examples use the experimental Astoria online service that hosts a few read-only data services. For instance, http://astoria.sandbox.live.com/northwind/northwind.rse/Customers would return all of the resources in the Customers resource container. In the default XML format, it would look like this:

 <DataService xml:base="http://astoria.sandbox.live.com/northwind/northwind.rse">
   <Customers>
     <Customer uri="Customers[ALFKI]">
       <CustomerID>ALFKI</CustomerID>
       <CompanyName>Alfreds Futterkiste</CompanyName>
       <ContactName>Maria Anders</ContactName>
       <ContactTitle>Sales Representative</ContactTitle>
       <Address>Obere Str. 57</Address>
       <City>Berlin</City>
       <Region />
       <PostalCode>12209</PostalCode>
       <Country>Germany</Country>
       <Phone>030-0074321</Phone>
       <Fax>030-0076545</Fax>
       <Orders href="Customers[ALFKI]/Orders" />
     </Customer>
     <Customer uri="Customers[ANATR]">
       ...properties...
     </Customer>
     ...more customer entries...
   </Customers>
 </DataService>
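Because the payload is plain XML, any XML parser can turn it into native data structures. The following is a minimal sketch in Python (chosen only for illustration; it is not part of the Astoria toolkit) that reads an abbreviated copy of the response above. The function name parse_customers and the decision to skip link elements are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the response shown in the article.
SAMPLE = """<DataService xml:base="http://astoria.sandbox.live.com/northwind/northwind.rse">
  <Customers>
    <Customer uri="Customers[ALFKI]">
      <CustomerID>ALFKI</CustomerID>
      <CompanyName>Alfreds Futterkiste</CompanyName>
      <City>Berlin</City>
      <Region />
      <Orders href="Customers[ALFKI]/Orders" />
    </Customer>
  </Customers>
</DataService>"""


def parse_customers(payload):
    """Return one dict per Customer element, keyed by property name."""
    root = ET.fromstring(payload)
    customers = []
    for node in root.iter("Customer"):
        record = {"uri": node.get("uri")}
        for child in node:
            # Elements carrying an href are links to other resources, not properties.
            if child.get("href") is None:
                record[child.tag] = child.text or ""
        customers.append(record)
    return customers


customers = parse_customers(SAMPLE)
```

Each resulting dictionary carries the resource's uri attribute alongside its simple properties, so a client can later address the same resource again.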

It is also possible to point to a particular entity by using its keys: http://astoria.sandbox.live.com/northwind/northwind.rse/Customers[ALFKI] would return a particular Customer resource; again, using the XML format in the example:

 <DataService xml:base="http://astoria.sandbox.live.com/northwind/northwind.rse">
   <Customers>
     <Customer uri="Customers[ALFKI]">
       <CustomerID>ALFKI</CustomerID>
       <CompanyName>Alfreds Futterkiste</CompanyName>
       <ContactName>Maria Anders</ContactName>
       <ContactTitle>Sales Representative</ContactTitle>
       <Address>Obere Str. 57</Address>
       <City>Berlin</City>
       <Region />
       <PostalCode>12209</PostalCode>
       <Country>Germany</Country>
       <Phone>030-0074321</Phone>
       <Fax>030-0076545</Fax>
       <Orders href="Customers[ALFKI]/Orders" />
     </Customer>
   </Customers>
 </DataService>

When a URI points to a specific resource such as the one just discussed, it is possible not only to retrieve the resource using an HTTP GET verb, but also to update it, using HTTP PUT, or delete it, using HTTP DELETE.
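To make the uniform interface concrete, here is a small in-memory sketch in Python of how the three verbs map onto a single resource URI. It does not talk to an Astoria service; the class name InMemoryDataService and the status codes it returns are assumptions for illustration, not documented Astoria behavior.

```python
class InMemoryDataService:
    """Toy stand-in for a data service: resources are stored by URI."""

    def __init__(self, resources):
        self._resources = dict(resources)

    def handle(self, method, uri, body=None):
        """Apply an HTTP-style verb to one resource; returns (status, payload)."""
        if uri not in self._resources:
            return 404, None
        if method == "GET":       # retrieve the resource
            return 200, self._resources[uri]
        if method == "PUT":       # replace the resource with the supplied body
            self._resources[uri] = body
            return 200, body
        if method == "DELETE":    # remove the resource
            del self._resources[uri]
            return 204, None
        return 405, None          # verb not supported on a single resource


service = InMemoryDataService(
    {"Customers[ALFKI]": {"CustomerID": "ALFKI", "City": "Berlin"}}
)
status, customer = service.handle("GET", "Customers[ALFKI]")
```

The point of the sketch is that the client needs no operation-specific entry points: the same URI is the target for reads, updates, and deletes.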

Since the schema description provided to Astoria includes associations between entities, those can also be leveraged in the HTTP interface. Continuing with the example, if each Customer resource is associated with a set of Sales Order resources, then the following URI represents the set of Sales Orders related to a particular Customer: http://astoria.sandbox.live.com/northwind/northwind.rse/Customers[ALFKI]/Orders.

In addition to being able to point to specific resources and list resources in a container, it is also possible to filter, sort, and page over data to facilitate the creation of user interfaces on top of the data service. For example, to list all the Customer resources in the city of London, sorted by contact name, an application can use this URI: http://astoria.sandbox.live.com/northwind/northwind.rse/Customers[City eq 'London']?$orderby=ContactName.
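A client would normally assemble such URIs from parts. The sketch below, in Python, builds the filter-and-sort URI in the form used in this article; the helper name container_uri and its encode flag are hypothetical, and the URI syntax itself reflects the CTP-era convention described here rather than a stable specification.

```python
from urllib.parse import quote

BASE = "http://astoria.sandbox.live.com/northwind/northwind.rse"


def container_uri(container, predicate=None, orderby=None, encode=False):
    """Build a container URI with an optional [predicate] and $orderby option."""
    uri = f"{BASE}/{container}"
    if predicate:
        # Percent-encode spaces for the wire; keep quotes readable.
        text = quote(predicate, safe="'") if encode else predicate
        uri += f"[{text}]"
    if orderby:
        uri += f"?$orderby={orderby}"
    return uri


readable = container_uri("Customers", "City eq 'London'", "ContactName")
on_the_wire = container_uri("Customers", "City eq 'London'", "ContactName", encode=True)
```

The readable form matches the URI shown in the text; the encoded form is what an HTTP client would typically send, with each space replaced by %20.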

The actual format of Astoria URIs is still subject to change, but the semantics and capabilities should remain relatively stable.

In all cases, data is exchanged in simple formats such as XML or JSON. The actual format can be controlled by the client agent using regular HTTP content type negotiation. Data is encoded as resources that simply map properties in EDM entities into XML elements or JSON properties, and they are hyperlinked to other resources that they have associations with. For example, the URI from the previous example, http://astoria.sandbox.live.com/northwind/northwind.rse/Customers[ALFKI], would result in the following response (in XML):

 <DataService xml:base="http://astoria.sandbox.live.com/northwind/northwind.rse">
   <Customers>
     <Customer uri="Customers[ALFKI]">
       <CustomerID>ALFKI</CustomerID>
       <CompanyName>Alfreds Futterkiste</CompanyName>
       <ContactName>Maria Anders</ContactName>
       <ContactTitle>Sales Representative</ContactTitle>
       <Address>Obere Str. 57</Address>
       <City>Berlin</City>
       <Region />
       <PostalCode>12209</PostalCode>
       <Country>Germany</Country>
       <Phone>030-0074321</Phone>
       <Fax>030-0076545</Fax>
       <Orders href="Customers[ALFKI]/Orders" />
     </Customer>
   </Customers>
 </DataService>

You can see that the response includes simple properties and hyperlinks to other resources (“Orders” in the example). Every resource also includes a URI that represents the canonical location for it (in the uri attribute of the Customer element above).
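The uri and href values in the payload are relative, so a client has to combine them with the xml:base value before following them. The Python sketch below shows one way to do that. Appending the relative value to the base with a slash is an assumption inferred from the URIs used in this article; a strict RFC 3986 join would instead replace the last path segment of the base. The names resolve and links_for are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:base under the predefined XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

SAMPLE = """<DataService xml:base="http://astoria.sandbox.live.com/northwind/northwind.rse">
  <Customers>
    <Customer uri="Customers[ALFKI]">
      <CustomerID>ALFKI</CustomerID>
      <Orders href="Customers[ALFKI]/Orders" />
    </Customer>
  </Customers>
</DataService>"""


def resolve(base, relative):
    """Append a relative resource path to the service base (assumed convention)."""
    return base.rstrip("/") + "/" + relative


def links_for(payload, entity_tag):
    """Return the canonical URI and association links of each matching entity."""
    root = ET.fromstring(payload)
    base = root.get(XML_BASE)
    result = []
    for entity in root.iter(entity_tag):
        associations = {
            child.tag: resolve(base, child.get("href"))
            for child in entity
            if child.get("href") is not None
        }
        result.append({"self": resolve(base, entity.get("uri")),
                       "associations": associations})
    return result


links = links_for(SAMPLE, "Customer")
```

With the absolute links in hand, traversing from a Customer to its Orders is just another GET on the resolved URI.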

Interacting with Data in the Presentation Layer

There are a number of options for interacting with data sources from the presentation layer. The first aspect that will scope the available options for a given scenario is the nature of the client (for example, browser versus rich client).

Since the Astoria interface is just plain HTTP, pretty much every environment with an HTTP client library can be used to consume data services. The interface has been specifically designed to be easy to use at the HTTP level; the URI patterns are simple and human-readable, and the payload formats use JSON or a subset of XML that keeps it straightforward.
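The choice between the XML and JSON payloads relies on ordinary HTTP content negotiation through the Accept header. As a rough sketch of the server-side decision, the Python function below picks a format from an Accept header value. The media type names application/xml and application/json and the function name choose_format are assumptions for illustration; the article does not state which media types Astoria uses. XML is treated as the default because the article describes it as the default format.

```python
def choose_format(accept_header, supported=("application/xml", "application/json")):
    """Pick the best supported media type for an Accept header, or None."""
    candidates = []
    for position, item in enumerate(accept_header.split(",")):
        media, _, params = item.strip().partition(";")
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name == "q":
                quality = float(value)
        if quality > 0:  # q=0 means "not acceptable"
            candidates.append((-quality, position, media.strip()))
    # Highest quality first; ties keep the order the client listed them in.
    for _, _, media in sorted(candidates):
        if media in supported:
            return media
        if media == "*/*":
            return supported[0]
    return None
```

An AJAX client would typically ask for JSON, while a client that sends no particular preference would receive the default format.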

For .NET applications, the Astoria toolkit includes a client library that runs in the .NET Framework environment and presents results coming from Astoria services as .NET objects; not only is that easier for developers to use within the codebase of the client application, but it also integrates well with components that already operate on top of regular .NET objects. The library provides rich services such as graph management, change tracking, and handling of updates. (See Example 1.)

Example 1: Accessing an Astoria service from .NET code using the Astoria client library

The Astoria client library runs both on the .NET Framework and inside Microsoft Silverlight. This enables the creation of desktop applications and Silverlight-based Web applications using the same API, just by referencing the corresponding Astoria assembly for the target environment.

In the case of AJAX-style Web applications, most AJAX frameworks include easy-to-use wrappers for HTTP access to external resources. Those wrappers even support materializing the response into JavaScript objects if you indicate that the response will be in JSON format. (See Example 2.)

Example 2: Accessing an Astoria service from an AJAX application (example uses the ASP.NET AJAX library)

Regardless of the kind of application, clients will typically run in environments with relatively high latency between the client and the data service, so the use of typical asynchronous execution techniques should be the norm. The client API has built-in support for asynchronous request execution. For the case of AJAX applications, the XMLHTTP interface, along with most wrappers, provides support for asynchronously submitting requests. (See Example 3.)

Example 3: Using the asynchronous API in the Astoria client for .NET
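The same asynchronous pattern applies in any client environment. The following Python sketch uses asyncio to issue two requests concurrently so that their latencies overlap instead of adding up. The fetch function is a hypothetical stand-in that only simulates network latency; it is not the Astoria client API and performs no HTTP calls.

```python
import asyncio


async def fetch(uri, latency=0.05):
    """Stand-in for an HTTP GET: waits to simulate a round trip, then returns a fake payload."""
    await asyncio.sleep(latency)
    return f"payload for {uri}"


async def load_customer_view():
    """Request a customer and its orders at the same time rather than one after the other."""
    uris = ["/Customers[ALFKI]", "/Customers[ALFKI]/Orders"]
    # gather() runs the coroutines concurrently and preserves the input order.
    return await asyncio.gather(*(fetch(uri) for uri in uris))


results = asyncio.run(load_customer_view())
```

While the requests are in flight the event loop remains free, which is the property that keeps a user interface responsive in a high-latency environment.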

Introducing Business Logic

Astoria enables developers to simply point to a database or a prebuilt EDM schema and it will automatically generate an HTTP view of it. While this is great for the initial iterations of an application, this wide-open interface to the data will often not be appropriate for production applications.

In most databases, there is a relatively clear split between two kinds of data (most typically, two kinds of tables). One part of the data has enough implied semantics in itself; this is true for simpler concepts such as “product category.” In those cases, business logic is usually thin or nonexistent, and the direct interface to the data is good enough. The other part of the data only makes sense with some business logic around it; for example, the data that is shown needs to be restricted based on the context, or modifications need to pass external validations, or when a given value is changed, another side effect needs to take place, and so on.

To address the requirement of being able to introduce business logic that is tightly bound to certain pieces of data, Astoria supports two customization mechanisms: service operations and interceptors.

A default Astoria service consists entirely of resource containers that are the entry points to the resource graph, such as /Customers or /Products. In addition to those, developers can define service operations that encapsulate both business logic and queries. For example, in a given application, it may not be desirable to list all customers; instead, the application could provide an entry point to list “my customers,” and even then there could be a requirement that clients retrieve their customers for one city at a time. The developer can define a “MyCustomersByCity” service operation that obtains the user’s identity from the context (from ASP.NET’s HttpContext.User property, for example) and can then formulate a query that factors in both the user identity and the city name that is passed as an argument. For example:

 [WebGet]
 public static IQueryable<Customer> MyCustomersByCity(
     NorthwindEntities db, string city)
 {
     // Reject missing or implausibly short city names up front.
     if (city == null || city.Length < 3)
         throw new Exception("bad city");

     // Build the query without executing it, so the caller can compose further options.
     var q = db.Customers.Where("it.City = @city",
                                new ObjectParameter("city", city));

     // add user-based filter condition to q

     return q;
 }

This service operation would then be callable with a URI following this pattern: /MyCustomersByCity?city=Seattle

An interesting feature of Astoria is that service operations can opt for returning a query instead of the actual results, as shown in the previous example. When a query object is returned, the rest of the URI pattern can still be used; so for instance, a client could still add an “orderby” option to the URI: /MyCustomersByCity?city=Seattle&$orderby=CompanyName
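The division of labor between the service operation and the runtime can be sketched as follows, in Python rather than C#. The operation receives the named arguments and returns an unevaluated query; the dispatcher then applies any $-prefixed options on top of it. The sample rows, the Owner field, and the names handle and OPERATIONS are all hypothetical; this is an illustration of the idea, not how the Astoria runtime is implemented.

```python
from urllib.parse import urlsplit, parse_qsl

# Made-up rows; "Owner" stands in for whatever ties a customer to a user.
CUSTOMERS = [
    {"CompanyName": "Zenith Supply", "City": "Seattle", "Owner": "ana"},
    {"CompanyName": "Acme Traders", "City": "Seattle", "Owner": "ana"},
    {"CompanyName": "Borealis Foods", "City": "Seattle", "Owner": "raj"},
    {"CompanyName": "Cascade Goods", "City": "Portland", "Owner": "ana"},
]


def my_customers_by_city(user, city=None):
    """Service operation: validates input, then returns a lazy (unevaluated) query."""
    if city is None or len(city) < 3:
        raise ValueError("bad city")
    return (c for c in CUSTOMERS if c["City"] == city and c["Owner"] == user)


OPERATIONS = {"MyCustomersByCity": my_customers_by_city}


def handle(uri, user):
    """Dispatch a URI to a service operation, then compose $-options onto its query."""
    parts = urlsplit(uri)
    params = parse_qsl(parts.query)
    args = {k: v for k, v in params if not k.startswith("$")}
    options = {k: v for k, v in params if k.startswith("$")}
    query = OPERATIONS[parts.path.lstrip("/")](user, **args)
    if "$orderby" in options:
        query = sorted(query, key=lambda row: row[options["$orderby"]])
    return list(query)


rows = handle("/MyCustomersByCity?city=Seattle&$orderby=CompanyName", "ana")
```

The operation never sees the sort option, which is what lets one entry point serve many user-interface variations.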

The service operation can also contain code to perform validations, log activity, or meet any other need. This provides a good middle ground between strict RPC, which makes building flexible UIs hard, and wide-open data interfaces that do not allow for control of the data that is flowing through the system.

For scenarios where preserving the resource-centric interface is desired, interceptors can be used. An interceptor is a method that is called whenever a certain action happens on a resource within a given resource container. For example, a developer could register an interceptor to be called whenever a change (POST/PUT/DELETE) is made to the “Products” resource container. The interceptor can perform validations, modify values, and even choose to abort the request.
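The interceptor idea can be sketched in a few lines of Python. Interceptors are registered per resource container and run before a change is applied; each one may validate, adjust values, or abort by raising an exception. The decorator on_change, the RequestAborted exception, and the UnitPrice rule are hypothetical names invented for this illustration and do not reflect the actual Astoria registration API.

```python
class RequestAborted(Exception):
    """Raised by an interceptor to reject the incoming change."""


_interceptors = {}


def on_change(container):
    """Register a function to run on every POST/PUT/DELETE against a container."""
    def register(func):
        _interceptors.setdefault(container, []).append(func)
        return func
    return register


@on_change("Products")
def check_product(method, resource):
    if method == "DELETE":
        return
    # Validation: refuse obviously invalid values.
    if resource.get("UnitPrice", 0) < 0:
        raise RequestAborted("UnitPrice must not be negative")
    # Modification: normalize a value before it is stored.
    resource["ProductName"] = resource.get("ProductName", "").strip()


def apply_change(container, method, resource):
    """Run all interceptors for the container, then hand back the (possibly modified) resource."""
    for interceptor in _interceptors.get(container, []):
        interceptor(method, resource)
    return resource


stored = apply_change("Products", "PUT", {"ProductName": "  Chai ", "UnitPrice": 18})
```

Clients keep using the plain resource URIs; the business rule runs on the server without introducing a separate entry point.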

Extensibility and Alternate Data Sources

So far I have discussed Astoria in the context of EDM and the ADO.NET Entity Framework. When using Astoria for surfacing data in a database, this is most likely the best choice; however, not all data is in a database.

In the first public CTP of Astoria, we have only targeted the Entity Framework. As we iterate on the design of the product, we are changing that to provide more options. Specifically, in order to enable scenarios where you want to expose data sources that are not databases, we will support exposing any LINQ-enabled data source through the HTTP interface.

LINQ defines a general interface called IQueryable that allows consumers to dynamically compose queries without having to know any details about the nature of the target for the query. It is up to the actual implementation of IQueryable in each source to interpret or translate queries appropriately. This allows the Astoria runtime to take a “base query” and compose it with operations such as sorting and paging. (See Figure 2.)

Figure 2: Astoria architecture diagram illustrating IQueryable-based layering
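To illustrate what composing onto a base query means, here is a small Python analogue of the pattern. The Query class is hypothetical and evaluates everything in memory when iterated; a real IQueryable provider would instead translate the composed operations into the source's native query language. What the sketch does show is that each operation returns a new query and nothing runs until the results are requested.

```python
from itertools import islice


class Query:
    """Immutable, lazily evaluated pipeline over any iterable source."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def _with(self, step):
        # Composition never mutates; it returns a longer pipeline.
        return Query(self._source, self._steps + (step,))

    def where(self, predicate):
        return self._with(lambda rows: filter(predicate, rows))

    def order_by(self, key):
        return self._with(lambda rows: iter(sorted(rows, key=key)))

    def skip(self, count):
        return self._with(lambda rows: islice(rows, count, None))

    def take(self, count):
        return self._with(lambda rows: islice(rows, count))

    def __iter__(self):
        rows = iter(self._source)
        for step in self._steps:
            rows = step(rows)
        return rows


# The "base query" a data source or service operation might hand to the runtime.
base = Query(range(10)).where(lambda n: n % 2 == 0)
# The runtime composes sorting and paging on top without knowing the source.
page = base.order_by(lambda n: -n).skip(1).take(2)
```

Iterating page yields the second and third largest even numbers, while base is left unchanged and can be reused for other requests.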

With this extensibility point in place, developers will be able to bring a broad set of data sources into the picture and expose them through the HTTP interface. This ranges from access to specialized data stores to using LINQ-based access libraries for online services (for example, there are informal implementations of LINQ to Amazon and LINQ to Flickr that provide limited query capabilities to those Internet sites).

Deployment Scenarios: Applications and Services

A typical data-driven Web application today will have its own database in addition to one or more Web servers. For enterprise applications, these servers are part of the IT infrastructure; for applications hosted at ISPs, most ISPs provide database services.

Along the same lines, you can expect many applications that use Astoria as their data services and are built around AJAX or Silverlight to still have their own database. In those scenarios, the Astoria data services will be part of the Web application itself, and will be deployed together with the rest of the application components. This is one of the scenarios we target with Astoria, but it is not the only one we envision.

Another way of looking at Astoria is as a technology for building data services for other systems to consume. Data providers can set up Astoria servers that other applications can interact with, both consuming and updating data as required and as allowed by the security policies. The provider of the data could be the same as the owner of the application that consumes the data, or it could be a service for others to consume.

Yet another way of looking at Astoria is as a general-purpose service for data storage. To explore this idea we have set up an experimental online service that hosts several sample data sets, including the Northwind and AdventureWorks sample databases, a subset of the Microsoft Encarta articles, and a snapshot of data that supports the Microsoft TagSpace social bookmarking site. Turning this into a real-world service requires technology well beyond just the HTTP interface and patterns, and that is a space outside of the scope of the Astoria project, but we still find the experimental service valuable as a learning tool, to see what applications developers would build on top of it.

Security

Exposing a Web-facing data interface requires careful thinking around securing access to make sure that only the data that is meant to be accessible is actually accessible over the HTTP interface. This involves both an authentication infrastructure and proper authorization policies.

While a lot of the design points in Astoria equally apply to Astoria as a component of an application and to Astoria as a service, authentication is one of the areas where this is not the case.

When using Astoria data services as part of a custom Web application, authentication typically applies to all resources within a given boundary, including access to data. Users would authenticate once with the Web site and the system needs to be able to apply the credentials to the data service as well as to the rest of the application. Astoria looks into the ASP.NET API to find out whether a user is authenticated and to find out further details, so that an application that uses any authentication scheme properly integrated with ASP.NET will automatically work with Astoria.

Figure 3: REST and Astoria

Integration with ASP.NET authentication means that in typical cases common authentication-over-HTTP mechanisms will work. This includes “forms authentication”, integrated authentication (useful inside corporate networks), and custom authentication schemes; it is even straightforward to roll a custom implementation of HTTP “Basic” authentication, which may be good enough when used over SSL connections, depending on the nature of the application.
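To show why HTTP “Basic” authentication is simple to implement and also why it needs SSL, the sketch below builds and parses the Authorization header value in Python. The function names are hypothetical and the code is independent of ASP.NET; it only demonstrates the standard wire format, in which the credentials are base64-encoded but not encrypted.

```python
import base64


def basic_header(user, password):
    """Build the value of an Authorization header for HTTP Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def parse_basic_header(value):
    """Return (user, password) from a Basic header value, or None if it is malformed."""
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic" or not token:
        return None
    try:
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except ValueError:  # covers invalid base64 and invalid UTF-8
        return None
    user, separator, password = decoded.partition(":")
    return (user, password) if separator else None


header = basic_header("Aladdin", "open sesame")
credentials = parse_basic_header(header)
```

Because anyone who observes the header can decode it, the scheme is only reasonable over an encrypted connection, as noted above.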

For online services this becomes a much bigger challenge. Besides the actual technological question of how authentication happens, there are higher-level differences that need to be addressed first: If an application and the data service are from different sources, is the data in the data service owned by the user of the application? If it is owned by the user, then the application should not have access to the user’s credentials to the data service. What is required in this scenario is a scheme where the user authenticates with the data service independent of the application (which may require authentication as well). We are exploring this space as part of the Astoria design effort, and although we have a reasonable understanding of the scenarios, we have not yet designed concrete technology to support them.

Once the authentication scheme is in place, a proper authorization model is required. The currently released version of Astoria (May 2007 CTP) has an over-simplistic implementation, where authentication policies can be set at the resource container level (/Customers, for example); on each one of them the configuration can indicate whether authentication is required to read the resources in the container, and to write to them. This is not flexible enough for most applications, so clearly a more sophisticated scheme needs to be provided.
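The container-level model described here amounts to a small lookup table. The Python sketch below captures that logic; the POLICY layout, the is_allowed function, and the choice to deny unknown containers are assumptions for illustration and do not reflect the CTP's actual configuration format.

```python
# Per container: is authentication required to read, and to write?
POLICY = {
    "Customers": {"read": True, "write": True},
    "Products": {"read": False, "write": True},
}

READ_METHODS = {"GET"}


def is_allowed(container, method, authenticated):
    """Decide whether a request may proceed under container-level rules."""
    rule = POLICY.get(container)
    if rule is None:
        return False  # assumption: containers without a rule are not exposed
    kind = "read" if method in READ_METHODS else "write"
    # Authenticated users pass every check; anonymous users pass only open ones.
    return authenticated or not rule[kind]


anonymous_can_browse = is_allowed("Products", "GET", authenticated=False)
anonymous_can_edit = is_allowed("Products", "PUT", authenticated=False)
```

The table has no way to express a rule such as "users may read only their own customers," which is the kind of limitation that makes this scheme too coarse for most applications.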

Closing Notes: Scope and Plans for Project Astoria

The manner in which applications are being written is changing. One of the key traits of emerging Web applications is the new ways in which they interact with their data. There is a clear opportunity here to introduce base technology to help the development community take on this new space.

With Project Astoria, we aim to enable the use of data as a first-class construct on the Web and across the application stack. We want to provide the infrastructure for creating Web data services, and also contribute to the creation of an ecosystem where service providers and service consumers use a uniform interface for data. We would like to see UI control vendors, client library writers, and other players leverage the reuse that uniform data interfaces make possible to build better tools for Web application creation.

We are starting with this vision at home. Within Microsoft we are closely collaborating with various groups to explore the different areas where Astoria can play a role.

On the services side, we are working together with the Windows Live organization, the Web3S folks in particular (they are responsible for the data interfaces for Live properties), to explore a world where every data interface is a Web3S/Astoria interface and can be consumed by the various tools and controls out there.

On the tools side, the ASP.NET, WCF, and Astoria teams, as well as the various folks involved in Silverlight, are working together to provide a solid end-to-end story for Web application development, which includes first-class tools and libraries for interacting with data from Web applications and services.

Project Astoria moves fast and focuses on solving real-world challenges around the Web and data. We plan to go through the design process in a transparent way, so everyone can see what we are up to. It is an exciting time to be working in this space; I would encourage anyone who has an interest in the topic to check out the Astoria Team blog, follow our design discussions, and jump in whenever you have an opinion.

Resources

· Astoria Team Blog: http://blogs.msdn.com/astoriateam

· Pablo’s Blog: http://blogs.msdn.com/pablo

· Project Astoria: http://astoria.mslivelabs.com

About the Author

Pablo Castro is a technical lead in the SQL Server team. He has contributed extensively to several areas of SQL Server and the .NET Framework including SQL-CLR integration, type-system extensibility, the TDS client-server protocol, and the ADO.NET API. Pablo is currently involved with the development of the ADO.NET Entity Framework and also leads the Astoria project, looking at how to bring data and Web technologies together. Before joining Microsoft, Pablo worked in various companies on a broad set of topics that range from distributed inference systems for credit scoring/risk analysis to collaboration and groupware applications.

 

This article was published in the Architecture Journal, a print and online publication produced by Microsoft. For more articles from this publication, please visit the Architecture Journal Web site.