by Danny Thorpe
Summary: A shopper can walk into virtually any store and make a purchase with nothing more than a plastic card and photo ID. The shopper and the shopkeeper need not share the same currency, nationality, or language. What they do share is a global communications system and global banking network that allows the shopper to bring their bank services with them wherever they go and provides infrastructure support to the shopkeeper. What if the Internet could provide similar protections and services for Web surfers and site keepers to share information? (9 printed pages)
IFrame URL Technique
Hiding Data in Bookmarks
Sender Identification
Sending to the Sender
Stateful Receiver
Application of Ideas
User Empowerment
Acknowledgments
Resources
Developing applications that live inside the Web browser is a lot like window shopping on Main Street: lots of stores to choose from, lots of wonderful things to look at in the windows of each store, but you can't get to any of it. Your cruel stepmother, Frau Browser, yanks your leash every time you lean too close to the glass. She says it's for your own good, but you're beginning to wonder if your short leash is more for her convenience than your safety.
Web browsers isolate pages living in different domains to prevent them from peeking at each other's notes about the end user. In the early days of the Internet, this isolation model was fine because few sites placed significant application logic in the browser client, and even those that did were only accessing data from their own server. Each Web server was its own silo, containing only HTML links to content outside itself.
That's not the Internet today. The Internet experience has evolved into aggregating data from multiple domains. This aggregation is driven by user customization of sites as well as sites that add value by bringing together combinations of diverse data sources. In this world, the Web browser's domain isolation model becomes an enormous obstacle hindering client-side Web application development. To avoid this obstacle, Web app designers have been moving more and more application logic to their Web servers, sacrificing server scalability just to get things done. Meanwhile, the end user's 2 GHz, 2 GB dumb terminal sits idle.
If personal computers were built like a Web browser, you could save your data to disk, but you couldn't use those files with any other application on your machine, or anyone else's machine. If you decided to switch to a different brand of photo editor, you wouldn't be able to edit any of your old photos. If you complained to the makers of your old photo editor, they would sniff and declare, "We don't know what that other photo editor might do with your data. Since we don't know or trust that other photo editor, neither should you! And no, we won't let you use 'your' photos with them, because we're providing the storage space for those photos, so they're really partly our photos."
You couldn't even find your files unless you knew first which application you created them with. "Which photo editor did I use for Stevie's birthday photos? I can't find them!"
And what happens when that tragically hip avant-garde photo editor goes belly up, never to be seen again? It takes all your photos with it!
Sound familiar? It happens to all of us every day using Internet Web sites and Web applications. Domain isolation prevents you from using your music playlists to shop for similar tunes at an independent online store (unrelated to your music player manufacturer) or at a kiosk within a retail store.
Domain isolation also makes it very difficult to build lightweight low-infrastructure Web applications that slice and dice data drawn from diverse data servers within a corporate network. A foo.bar.com subdomain on your internal bar.com corpnet is just as isolated from bar.com and bee.bar.com as it is from external addresses like xyz.com.
Nevertheless, you don't want to just tear down all the walls and pass around posies. The threats to data and personal security that the browser's strict domain isolation policy protects against are real, and nasty. With careful consideration and infrastructure, there can be a happy medium that provides greater benefit to the user while still maintaining the necessary security practices. Users should be in control of when, what, and how much of their information is available to a given Web site. The objective here is not free flow of information in all directions, but freedom for users to use their data where and when it serves their purposes, regardless of where their data resides.
What is needed is a way for the browser to support legitimate cross-domain data access without compromising end user safety and control of their data.
One major step in that direction is the developing standards proposal organized by Ian Hickson to extend XMLHttpRequest to support cross-domain connections using domain-based opt-in/opt-out by the server being requested. (See Resources.) If this survives peer review and if it is implemented by the major browsers, it offers hope of diminishing the cross-domain barrier for legitimate uses, while still protecting against illegitimate uses. Realistically, though, it will be years before this proposal is implemented by the major browsers and ubiquitous in the field.
What can be done now? There are patterns of behavior supported by all the browsers which allow JavaScript code living in one browser domain context to observe changes made by JavaScript living in another domain context within the same browser instance. For example, changes made to the width or height property of an iframe are observable inside as well as outside the iframe. Another example is the iframe.src property. Code outside an iframe cannot read the iframe's src URL property, but it can write to the iframe's src URL. Thus, code outside the iframe can send data into the iframe via the iframe's URL.
This URL technique has been used by Web designers since iframes were first introduced into HTML, but uses are typically primitive, purpose-built, and hastily thrown together. What's worse, passing data through the iframe src URL can create an exploit vector, allowing malicious code to corrupt your Web application state by throwing garbage at your iframe. Any code in any context in the browser can write to the iframe's .src property, and the receiving iframe has no idea where the URL data came from. In most situations, data of unknown origin should never be trusted.
This article will explore the issues and solution techniques of the secure client-side cross-domain data channel developed by the Windows Live Developer Platform group.
An iframe is an HTML element that encapsulates and displays an entire HTML document inside itself, allowing you to display one HTML document inside another. We'll call the iframe's parent the outer page or host page, and the iframe's content the inner page. The iframe's inner page is specified by assigning a URL to the iframe's src property.
When the iframe's source URL has the same domain name as the outer, host page, JavaScript in the host page can navigate through the iframe's interior DOM and see all of its contents. Conversely, the iframe can navigate up through its parent chain and see all of its DOM siblings in the host page and their properties. However, when the iframe's source URL has a domain different from the host page, the host cannot see the iframe's contents, and the iframe cannot see the host page's contents.
Even though the host cannot read the iframe element's src property, it can still write to it. The host page doesn't know what the iframe is currently displaying, but it can force the iframe to display something else.
Each time a new URL is assigned to the iframe's src property, the iframe will go through all the normal steps of loading a page, including firing the onLoad event.
We now have all the pieces required to pass data from the host to the iframe on the URL. (See Figure 1.) The host page in domain foo.com can place a URL-encoded data packet on the end of an existing document URL in the bar.com domain. The data can be carried in the URL as a query parameter using the ? character (http://bar.com/receiver.html?datadatadata) or as a bookmark using the # character (http://bar.com/receiver.html#datadatadata). There's a big difference between these two URL types which we'll explore in a moment.
Figure 1. iframe URL data passing
The host page assigns this URL to the iframe's src property. The iframe loads the page and fires the page's onLoad event handler. The iframe page's onLoad event handler can look at its own URL, find the embedded data packet, and decode it to decide what to do next.
That's the iframe URL data passing technique at its simplest. The host builds a URL string from a known document URL + data payload, assigns it to the src property of the iframe, the iframe "wakes up" in the onLoad event handler and receives the data payload. What more could you ask for?
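As a rough illustration, the round trip described above might be sketched as follows. The helper names buildMessageUrl, extractPayload, and handleMessage are hypothetical, chosen for this sketch rather than taken from any library; only the shape of the bar.com receiver URL comes from the discussion above.

```javascript
// Hypothetical helpers for illustration; not part of any published library.

// Host side (foo.com): append a URL-encoded payload to the receiver page URL.
// separator is "?" for a query parameter or "#" for a bookmark.
function buildMessageUrl(receiverUrl, payload, separator) {
  return receiverUrl + separator + encodeURIComponent(payload);
}

// Receiver side (bar.com): recover the payload from the page's own URL.
// Returns null when the URL carries no payload or the payload is malformed.
function extractPayload(url) {
  var marks = [url.indexOf("?"), url.indexOf("#")].filter(function (i) {
    return i >= 0;
  });
  if (marks.length === 0) {
    return null;
  }
  var start = Math.min.apply(null, marks);
  try {
    return decodeURIComponent(url.substring(start + 1));
  } catch (e) {
    return null; // garbage on the URL: ignore it
  }
}

// Browser wiring, assuming an iframe element and a handleMessage function:
//   iframe.src = buildMessageUrl("http://bar.com/receiver.html", "hello", "#");
//   window.onload = function () {
//     handleMessage(extractPayload(window.location.href));
//   };
```

Keeping the URL handling in plain functions, with the DOM wiring reduced to two assignments, makes the encoding and decoding easy to check outside a browser.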
A lot more, actually. There are many caveats with this simple technique:
· No acknowledgement of receipt: The host page has no idea if the iframe successfully received the data.
· Message overwrites: The host doesn't know when the iframe has finished processing the previous message, so it doesn't know when it's safe to send the next message.
· Capacity limits: A URL can be only so long, and the length limit varies by browser family. Firefox supports URLs as long as 40k or so, but IE sets the limit at less than 4k. Anything longer than that will be truncated or ignored.
· Data has unknown origin: The iframe has no idea who put the data into its URL. The data might be from our friendly foo.com host page, or it might be evil.com lobbing spitballs at bar.com hoping something will stick or blow up.
· No replies: There's no way for script in the iframe to pass data back to the host page.
· Loss of context: Because the page is reloaded with every message, the iframe inner page cannot maintain global state across messages.
Should we use ? or # to tack data onto the end of the iframe URL? Though the choice looks innocuous enough on the surface, there are actually a few significant differences in how the browsers handle URLs with query params versus URLs with bookmarks. Two URLs with the same base path but different query params are treated as different URLs. They will appear separately in the browser history list, will be separate entries in the browser page cache, and will generate separate network requests across the wire.
URL bookmarks were designed to refer to specially marked anchor tags within a page. The browser considers two URLs with the same base path but with different bookmark text after the # char to be the same URL as far as browser history and caches are concerned. The different bookmarks are just pointing to different parts of the same page (URL), but it's the same page nonetheless.
The URLs http://bar.com/page.html#one, http://bar.com/page.html#two, and http://bar.com/page.html#three are considered by the browser to be cache-equivalent to http://bar.com/page.html. If we used query params, the browser would see three different URLs and three different trips across the network wire. Using bookmarks, however, we have at most one trip across the network wire; subsequent requests will be filled from the local browser cache. (See Figure 2.)
Figure 2. Cache equivalence of bookmark URLs
For cases where we need to send a lot of messages across the iframe URL using the same base URL, bookmarks are perfect. The data payloads in the bookmark portion of the URL will not appear in the browser history or browser page cache. What's more, the data payloads will never cross the network wire after the initial page load is cached!
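One way to picture this cache equivalence is a function that maps a URL to the key a browser cache might file it under. This is a deliberately simplified sketch: cacheKey and countDistinctKeys are hypothetical names, and real browser caches consider more than the URL text.

```javascript
// Hypothetical model of cache equivalence: the bookmark never reaches the
// server, so everything from "#" onward is dropped from the cache key.
function cacheKey(url) {
  var hashAt = url.indexOf("#");
  return hashAt >= 0 ? url.substring(0, hashAt) : url;
}

// Count how many distinct pages a list of URLs would require fetching.
function countDistinctKeys(urls) {
  var seen = {};
  urls.forEach(function (url) {
    seen[cacheKey(url)] = true;
  });
  return Object.keys(seen).length;
}

var bookmarkUrls = [
  "http://bar.com/page.html#one",
  "http://bar.com/page.html#two",
  "http://bar.com/page.html#three"
];
var queryUrls = [
  "http://bar.com/page.html?one",
  "http://bar.com/page.html?two",
  "http://bar.com/page.html?three"
];

var bookmarkFetches = countDistinctKeys(bookmarkUrls); // all share one key
var queryFetches = countDistinctKeys(queryUrls);       // each has its own key
```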
The data passed between the host page and the iframe cannot be viewed by any other DOM elements on the host page because the iframe is in a different domain context from the host page. The data doesn't appear in the browser cache, and the data doesn't cross the network wire, so it's fair to say that the data packets are observable only by the receiving iframe or other pages served from the bar.com domain.
Perhaps the biggest security problem with the simple iframe URL data-passing technique is not knowing with confidence where the data came from. Embedding the name of the sender or some form of application ID is no solution, as those can be easily copied by impersonators. What is needed is a way for a message to implicitly identify the sender in such a way that could not be easily copied.
The first solution that pops to mind for most people is to use some form of encryption using keys that only the sender and receiver possess. This would certainly do the job, but it's a rather heavy-handed solution, particularly when JavaScript is involved.
There is another way, which takes advantage of the critical importance of domain name identity in the browser environment. If I can send a secret message to you using your domain name, and I later receive that secret as part of a data packet, I can reasonably deduce that the data packet came from your domain.
The only way for the secret to come from a third-party domain is if your domain has been compromised, the user's browser has been compromised, or my DNS has been compromised. All bets are off if your domain or your browser has been compromised. If DNS poisoning is a real concern, you can use https to validate that the server answering requests for a given domain name is in fact the legitimate server.
If the sender gives a secret to the receiver, and the receiver gives a secret to the sender, and both secrets are carried in every data packet sent across the iframe URL data channel, then both parties can have confidence in the origin of every message. Spitballs thrown in by evil.com can be easily recognized and discarded. This exchange of secrets is inspired by the SSL/https three-phase handshake.
These secrets do not need to be complex or encrypted, since the data packets sent through the iframe URL data channel are not visible to any third party. Random numbers are sufficient as secrets, with one caveat: The JavaScript random-number generator (Math.random()) is not cryptographically strong, so it risks producing predictable number sequences. Firefox provides a cryptographically strong random-number generator (crypto.random()), but IE does not. As a result, in our implementation we opted to generate strong random numbers on the Web server and send them down to the client as needed.
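A minimal sketch of how both secrets might travel with each packet follows. The packet layout, the field names, and the use of JSON are assumptions made for illustration; the wire format of the actual channel is not described here. The secrets themselves are assumed to have been generated on the server and exchanged beforehand.

```javascript
// Hypothetical packet format: both secrets plus the message body.
// secrets = { hostSecret: "...", frameSecret: "..." }, exchanged beforehand.
function makePacket(secrets, body) {
  return JSON.stringify({
    hostSecret: secrets.hostSecret,
    frameSecret: secrets.frameSecret,
    body: body
  });
}

// Returns the body if the packet carries both expected secrets, else null.
function readPacket(packetText, secrets) {
  var packet;
  try {
    packet = JSON.parse(packetText);
  } catch (e) {
    return null; // not even well-formed: discard
  }
  if (packet === null || typeof packet !== "object") {
    return null;
  }
  if (packet.hostSecret !== secrets.hostSecret ||
      packet.frameSecret !== secrets.frameSecret) {
    return null; // unknown origin: discard the spitball
  }
  return packet.body;
}
```

A packet from evil.com, which has seen neither secret, fails the comparison and is dropped before any application logic sees it.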
Most of the problems associated with the iframe URL data passing technique boil down to reply generation. Acknowledging packets requires the receiver to send a reply to the sender. Exchanging secrets requires replies in both directions. Message throttling and breaking large data payloads into multiple smaller messages require receipt acknowledgement.
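The dependence of throttling on acknowledgements can be sketched with a small queue that holds each message until the previous one has been acknowledged. The names createSendQueue, send, and acknowledge are hypothetical, and sendNow stands in for whatever code writes the URL to the iframe's src property.

```javascript
// Hypothetical send queue: at most one message is in flight at a time.
// sendNow(message) is supplied by the caller and performs the actual send.
function createSendQueue(sendNow) {
  var pending = [];
  var awaitingAck = false;

  // Send the next queued message if nothing is in flight.
  function pump() {
    if (awaitingAck || pending.length === 0) {
      return;
    }
    awaitingAck = true;
    sendNow(pending.shift());
  }

  return {
    // Queue a message; it goes out immediately if the channel is idle.
    send: function (message) {
      pending.push(message);
      pump();
    },
    // Call when the receiver's acknowledgement arrives.
    acknowledge: function () {
      awaitingAck = false;
      pump();
    }
  };
}
```

Without a reply path, acknowledge could never be called, which is why the problems listed above all reduce to getting a reply back to the sender.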
Figure 3. Message in a Klein Bottle
So, how can the iframe communicate back up to the host page? Not by going up, but by going down. The iframe can't assign to anything in its parent because the iframe and the parent reside in different domain contexts. But the bar.com iframe (A) can contain another iframe (B), and A can assign to B's src property a URL in the domain of the host page (foo.com). The nesting is then: the foo.com host page contains bar.com iframe (A), which contains foo.com iframe (B).
Great, but what can that inner iframe do? It can't do much with its parent, the bar.com iframe. But go one more level up and you hit pay dirt: B's parent's parent is the host page in foo.com. B's page is in foo.com, B.parent.parent is in foo.com, so B can access everything in the host page and call JavaScript functions in the host page's context.
The host page can pass data to iframe A by writing a URL to A's src property. A can process the data, and send an acknowledgement to the host by writing a URL to B's src property. B wakes up in its onLoad event and passes the message up to its parent's parent, the host page. Voilà. Round-trip acknowledgement from a series of one-way pipes connected together in a manner that would probably amuse Felix Klein, mathematician and bottle washer.
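Assuming the host page exposes a function for replies (called onChannelReply here, a hypothetical name), the script in iframe B's page might look something like this sketch. The window object is passed in as a parameter so the logic can be exercised outside a browser.

```javascript
// Runs in iframe B's page (served from foo.com). frameWindow is B's window.
// onChannelReply is a hypothetical function defined by the foo.com host page.
function deliverReply(frameWindow, ownUrl) {
  var hashAt = ownUrl.indexOf("#");
  if (hashAt < 0) {
    return false; // no payload on the URL
  }
  var payload;
  try {
    payload = decodeURIComponent(ownUrl.substring(hashAt + 1));
  } catch (e) {
    return false; // malformed payload: ignore it
  }
  // B's parent is A (bar.com); A's parent is the host page (foo.com).
  // B and the host page share a domain, so the browser permits this call.
  frameWindow.parent.parent.onChannelReply(payload);
  return true;
}

// Browser wiring inside B's page:
//   window.onload = function () {
//     deliverReply(window, window.location.href);
//   };
```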
To maintain global state in the bar.com context across multiple messages sent to the iframe, use two iframes with bar.com pages. Use one of the iframes as a stateless message receiver, reloading and losing its state with every message received. Place the stateful application logic for the bar.com side of the house in the other iframe. Reduce the messenger iframe page logic to the bare minimum required to pass the received data to the stateful bar.com iframe.
An iframe cannot enumerate the children of its parent to find other bar.com siblings, but it can look up a sibling iframe using window.parent.frames[] if it knows the name of the sibling iframe. Each time it reloads to receive new data on the URL, the messenger iframe can look up its stateful bar.com sibling iframe using window.parent.frames[] and call a function on the stateful iframe to pass the new message data into the stateful iframe. Thus, the bar.com domain context in browser memory can accumulate message chunks across multiple messages to reconstruct a data payload larger than the browser's maximum URL length.
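The chunking idea might be sketched as follows. The chunk layout (index, total, data), the function names, and the frame name "statefulFrame" are assumptions made for illustration, not the format used by any particular channel library. The payload is assumed to be URL-encoded before it is split and decoded only after it is reassembled.

```javascript
// Sender side: split an already URL-encoded payload into numbered chunks
// small enough to fit within the browser's URL length limit.
function splitIntoChunks(payload, maxChunkLength) {
  if (maxChunkLength < 1) {
    throw new Error("maxChunkLength must be at least 1");
  }
  var total = Math.max(1, Math.ceil(payload.length / maxChunkLength));
  var chunks = [];
  for (var i = 0; i < total; i++) {
    chunks.push({
      index: i,
      total: total,
      data: payload.substring(i * maxChunkLength, (i + 1) * maxChunkLength)
    });
  }
  return chunks;
}

// Stateful iframe side: accumulate chunks across messages and call
// onComplete with the reassembled payload once every chunk has arrived.
function createChunkAssembler(onComplete) {
  var parts = [];
  var received = 0;
  return function receiveChunk(chunk) {
    if (parts[chunk.index] === undefined) {
      received++;
    }
    parts[chunk.index] = chunk.data;
    if (received === chunk.total) {
      var whole = parts.join("");
      parts = [];
      received = 0;
      onComplete(whole);
    }
  };
}

// Messenger iframe wiring, assuming the stateful sibling is named
// "statefulFrame" and exposes receiveChunk as a global function:
//   window.parent.frames["statefulFrame"].receiveChunk(chunk);
```

Because the assembler lives in the stateful iframe, it survives each reload of the messenger iframe.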
The Windows Live Developer Platform team has developed these ideas into a JavaScript "channel" library. These cross-domain channels are used in the implementation of the Windows Live Contacts and Windows Live Spaces Web controls (http://dev.live.com), intended to reside on third party Web pages but execute in a secure iframe in the live.com domain context. The controls provide third party sites with user-controlled access to their Windows Live data such as the user's contacts list or Spaces photo albums. The channel object supports sending arbitrarily large data across iframe domain boundaries with receipt acknowledgement, message throttling, message chunking, and sender identification all taking place under the hood.
Our goal is to groom this channel code into a reusable library, available to internal Microsoft partners as well as third party Web developers. While the code is running well in its current contexts, we still have some work to do in the area of self-diagnostics and troubleshooting; when you get the channel endpoints configured correctly, it works great, but it can be a real nightmare to figure out what isn't quite right when you're trying to get it set up the first time. The main obstacle is the browser itself: trying to see what's (not) happening in different domain contexts is a bit of a challenge when the browser won't show you what's on the other side of the wall.
Hardly 40 years ago, a shopper on Main Street USA had to go to considerable effort to convince a shopkeeper to accept payment. If you didn't have cash (and lots of it), you were most likely out of luck. If you had foreign currency, you'd need to find a big bank in a big city to exchange for local currency. Checks from out of town were rarely accepted, and store credit was offered only to local residents.
Today, shoppers and shopkeepers share a global communications system and global banking network that allows the shopper to bring their bank services with them wherever they go, and helps the shopkeeper make sales they otherwise might miss. The banking network also provides infrastructure support to the shopkeeper, helping with currency conversion, shielding from credit risk and reducing losses due to fraud.
Now, why can't the Internet provide similar protections and empowerments for the wandering Web surfer, and infrastructure services for Web site keepers? Bring your data and experience with you as you move from site to site (the way a charge card brings your banking services with you as you shop), releasing information to the site keepers only at your discretion. The Internet is going to get there; it's just a matter of how well and how soon.
Kudos to Scott Isaacs for the original iframe URL data-passing concept. Many thanks to Yaron Goland and Bill Zissimopoulos for their considerable contributions to the early implementations and debugging of the channel code, and to Gabriel Corverra and Koji Kato for their work in the more recent iterations. "It's absolute insanity, but it just might work!"
XMLHttpRequest 2, Ian Hickson
http://www.mail-archive.com/public-webapi@w3.org/msg00341.html
http://lists.w3.org/Archives/Public/public-webapi/2006Jun/0012.html
Anne van Kesteren's Weblog
http://annevankesteren.nl/2007/02/xxx
Danny Thorpe is a developer on the Windows Live Developer Platform team. His badge says "Principal SDE," but he prefers "Windows Live Quantum Mechanic," as he spends much of his time coaxing stubborn little bits to migrate across impenetrable barriers. In past lives, he worked on "undisclosed browser technology" at Google, and, before that, he was a Chief Scientist at Borland and Chief Architect of the Delphi compiler. At Borland, he had the good fortune to work under the mentorship of Anders Hejlsberg, Chuck Jazdzewski, Eli Boling, and many other Borland legends. Prior to joining Borland, he was too young to remember much. Check out his blog at http://blogs.msdn.com/dthorpe.
This article was published in the Architecture Journal, a print and online publication produced by Microsoft. For more articles from this publication, please visit the Architecture Journal Web site.