An Introduction to the Web Services Architecture and Its Specifications
Luis Felipe Cabrera
© 2004 . All rights reserved.
Summary: This introduction to the Web services architecture describes the design principles underlying the architecture and foundational technologies for Web services. Features are described and linked to the specifications that formally define them. This paper also serves as a reference guide to all the specifications in the architecture. (51 printed pages)
The contents of this white paper reflect the features of the various Web services specifications at revision levels current as of the publication date. As the Web services specifications are refined and additional specifications are released, this paper will be updated to reflect the most recent changes.
The Web Services Architecture
XML and the Infoset
Message Exchange Patterns
Message Integrity and Confidentiality
Trust Based on Security Tokens
Agreement Coordination Protocols – Reliable Messaging and Transactions
Enumeration, Transfer, and Eventing
Appendix A: Glossary
Appendix B: XML Infoset Information Items
Appendix C: Common Security Attacks
Appendix D: References
Web Services Specifications
The Web has been a phenomenal success at enabling simple computer/human interactions at Internet scale. The original HTTP [HTTP] and HTML [HTML] protocol stack used by today's Web browsers has proven to be a cost-effective way to project user interfaces onto a wide array of devices. A key factor in the success of HTTP and HTML was their relative simplicity—both HTTP and HTML are primarily text-based and can be implemented using a variety of operating systems and programming environments.
Web services take many of the ideas and principles of the Web and apply them to computer/computer interactions. Like the World Wide Web, Web services communicate using a set of foundation protocols that share a common architecture and are meant to be realized in a variety of independently developed and deployed systems. Like the World Wide Web, Web services protocols owe much to the text-based heritage of the Internet and are designed to layer as cleanly as possible without undue dependencies within the protocol stack.
An important area in which Web services differ from the World Wide Web is scope. HTTP and HTML were designed around "read-mostly" interactive browsing of content that is often static, or at least highly cacheable. In contrast, the Web services architecture is designed for highly dynamic program-to-program interactions. Many kinds of distributed systems may be implemented in the Web services architecture. Examples include synchronous and asynchronous messaging systems, distributed computational clusters, mobile-networked systems, grid systems, and peer-to-peer environments. The broad spectrum of requirements in program-to-program interactions forces the Web services protocol stack to be much more general purpose than the first Web protocols. However, like the Web, Web services rely on a small number of specific protocols. We discuss these at greater length later.
We envision that the next generation of mainstream applications will be based on autonomous Web services. The implications of autonomy are central to the architecture, and they will be explored throughout this paper. The technical content of this paper describes the infrastructure protocols defining the Web services architecture and a key concept needed to build autonomous distributed applications—the concept of contracts.
The core principles that have driven the design and implementation of the Web service architecture protocols are as follows:
- Message orientation—using only messages to communicate between services and realizing that messages often have a life beyond a given transmission event.
- Protocol composability—avoiding monoliths through the use of infrastructure protocol building blocks that may be used in nearly any combination.
- Autonomous services—allowing endpoints to be independently built, deployed, managed, versioned, and secured.
- Managed transparency—controlling which aspects of an endpoint are (and are not) visible to external services.
- Protocol-based integration—restricting cross-application coupling to wire artifacts only.
The remainder of this section describes these principles in detail.
Web services communicate using messages. They place a significant emphasis on how individual messages are formed and processed. Unlike RPC systems in which messages are strictly subordinate to the local programming experience, the Web services architecture is built with messages as the atomic unit of communication. This is true not only of the wire format used for message exchanges (SOAP [SOAP]), but also for the descriptions of a given Web service (WSDL [WSDL]). Granted, some developers may choose to view a Web service using the remote procedure call metaphor; however, this decision is local to that developer's code and not visible on the wire.
Web services assume SOAP as the lowest layer in the protocol stack and isolate message transfer from transport details. Ideally, protocol-specific bindings do not leak into application semantics. This approach provides a sound base for achieving service interoperability among development platforms, and provides for richer communication patterns. Web services have typically relied on HTTP as the underlying message transport. By leveraging the open extensibility of HTTP POST, many Web services have been bootstrapped using off-the-shelf Web technologies. As more sophisticated applications of Web services emerge, the importance of other transports becomes clear. For example, it is cumbersome, at best, to implement full-duplex message exchanges over HTTP's strict request/reply discipline. In contrast, sending SOAP messages over TCP [TCPIP] using a lightweight framing protocol allows any two-party message exchange pattern to be implemented trivially.
Web services may distribute the processing of a given message across multiple network nodes, each of which contributes some piece of functionality, such as access checks, content-based routing, or application-specific validation. This distributed processing model implies that a given message may need to traverse two or more message transports prior to arriving at the ultimate receiver. For that reason, much of the early protocol work in Web services was focused on providing end-to-end secure and reliable message delivery over arbitrary transports. For the simplest deployments in a single trust domain where a secure and reliable transport is available (e.g., TLS [TLS] over TCP or HTTP), more robust protocols such as WS-Security [WS-Security] or WS-ReliableMessaging [WS-RM] are optional. For richer deployments, these latter protocols are essential.
Protocols are said to compose when they can be used either independently or in combination. Numerous domain-specific communication protocols are effectively "silos" in which protocol designers find themselves coining new mechanisms for dealing with security, reliability, error reporting, etc. Given the broad applicability of Web services, this approach to defining an entire protocol for each vertical domain breaks when domains overlap and becomes extremely costly in terms of both initial development and ongoing support costs.
To avoid these costs, the Web services protocol suite is designed as a family of composable protocol building blocks. By design, each of the infrastructure protocols defines a fine-grained unit of functionality. For example, the basics of signing and sealing message contents are generic enough that they are specified once (in WS-Security) and then leveraged by various infrastructure protocols and application-level protocols.
Web service protocol composition is based on the modular architecture of SOAP. SOAP's architecture anticipates the composition of infrastructure protocols through the use of a flexible header mechanism. One advantage of this approach is that the protocol surface area for a particular application is based on the actual features used by that application. A given protocol imposes absolutely no cost on applications that do not use it. Software operating on computing devices of various scales can use the exact protocols it needs, maximizing the applicability of the architecture. A second advantage is that new protocols can be introduced at any time to complement the existing ones and extend functionality. The ability to innovate is thus built into the architecture. The challenge of getting a coherent and comprehensive view of the spectrum of available protocols is real. Addressing this challenge is the goal of this document.
Specification profiles are provided to define usage constraints and the best practices for use of these specifications in various combinations. More information on specific profiles is provided in the section on Interoperability Profiles, the WS-I Basic Security Profile, and the Device Profile for Web services.
Web services are autonomous agents whose development, deployment, operation, management, and security all vary independently from those of the service's consumer. This "forced independence" has several important ramifications that permeate the architecture.
Service autonomy has deep implications for how versioning is implemented. As a service's implementation evolves, changes in the universe of compatible consuming applications are inevitable. Having reasonable tools for managing these changes is critical to the correct operation of Web service-based systems. At the most basic level, SOAP provides a protocol evolution model based on SOAP headers. SOAP headers are expected to be added to or removed from a given message format over the lifetime of a given protocol. As new headers are introduced, the upgrade policy is carried in the header itself. Headers that may be safely ignored are simply inserted into the message. Headers that cannot be safely ignored are annotated with a mustUnderstand attribute, indicating that their insertion is a breaking change and that only recipients that recognize the header may process the message. This basic model for extensibility is most visible in SOAP itself; however, it is mirrored in various other Web service protocols, including WSDL. More importantly, this principle is used by Web services to add new protocol functionality (e.g., security) to a single messaging format (SOAP). Ultimately, the primary feature of SOAP is extensibility: new versions of SOAP are not needed for each new protocol requirement.
The autonomous nature of services also requires a greater emphasis on explicit management of trust between applications. Because many services have network addresses that are visible from the public Internet, a greater degree of care is needed to ensure that malicious agents cannot compromise the integrity of a service. Part of this care takes the form of stronger input validation, which often can be automated using various schema languages. A more interesting aspect of this increased focus on system integrity is the use of explicit trust models. A key feature of WS-Security–based systems is that a given message may contain multiple security tokens. Some of these tokens may correspond to user identities or principals. Other tokens may correspond to rights that are granted to a particular user or application, and can be cryptographically validated as part of a broader authorization scheme.
Autonomous services must also maintain control over the resources they manage. In particular they need to recycle resources created by requests of services that will not interact again. Resource reclamation policies are needed for all kinds of resources. A popular scheme is the use of leases as exemplified by subscriptions for event notification later in this paper. Asynchronous messaging allows services to have complete local control over the scheduling of message processing. Services also have flexibility regarding message transmissions, connectivity management, and independent failure modes.
Finally, the autonomous nature of Web services invariably requires processes or systems that were at one time centralized to move to a federated model. This is true not only of security identities, but also of service directories and systems management. These new systems have to operate well in the presence of unbounded message latencies, independent failure modes, and services that are only intermittently connected to the network.
All implementation details are private to a service. The message-oriented façade that every service exposes provides ample insulation from the implementation choices made by a particular service developer. This opacity is critical to service autonomy, and allows flexibility of programming models, operating systems, and other implementation details. It also allows the substitution of one service implementation for another. Ideally, as long as both services respond to the same set of request messages with comparable results, the requestor is none the wiser that a different implementation of a service has been used.
Given the emphasis on autonomous services, it is somewhat ironic that Web services place a great deal of emphasis on the transparency of implementation-specific information. For example, if a service were completely opaque, one could not tell whether the set of input messages accepted by Service A were similar to or different from those accepted by Service B. For this reason, services are expected to make their publicly visible aspects transparent to the outside world. This transparency is achieved through the use of contracts, machine-readable descriptions of the messages a service sends or receives, as well as the abstract capabilities and requirements of the service.
Public service descriptions are essential for creating a rich ecosystem for tools and execution environments, and play an important part in achieving interoperability between heterogeneous systems. Development tools rely on service descriptions in order to create programmer-friendly language bindings for the messages a service accepts or sends. Deployment tools rely on service descriptions to wire up a deployed service to one or more publicly visible endpoint addresses. Management tools rely on service descriptions to track whether a given service is operating within its expected collection of input and output messages. Finally, the runtime binding of a requestor to a given service can take advantage of service descriptions to ensure that a compatible service is selected during the normal execution of an application.
Managed transparency not only applies to descriptions of services, but also to the messages themselves. Unlike monolithic protocols of the past, SOAP and WS-Security together provide a flexible security layer in which different parts of a message may have distinct security characteristics. This means that the sender may elect to leave some aspects of the message completely transparent and visible to all potential readers, while encrypting other parts of the message for only a trusted set of readers to see. In the general case, each message part may be encrypted for a distinct set of readers. Those message parts that are not encrypted can be signed to protect them against tampering.
Application integration is simplified when message-based protocols are used for all communication. By building a self-contained system for description and messaging that is devoid of programming language or operating system details, Web services have shown that it is possible for applications running in truly disparate environments to communicate securely and reliably. The only way this could be made to work was to assume no shared OS, no shared virtual machine, and no shared programming language or abstraction. Independence from underlying implementation technology is the key to Web services interoperability. Web services have influenced many aspects of software development. The primary contribution of Web services has been the emphasis on protocol-based software integration.
The influence of protocol-based integration on the industry at large can be seen in the increasing emphasis on service-oriented architectures. Both Web services and service orientation owe much to the ideas of component software, distributed objects, and message-oriented middleware. Information encapsulation and polymorphism have been adopted from object orientation, and mandatory use of interfaces and the use of runtime metadata have been adopted from component software. Distributed objects contributed notions of context that flows among entities and broker-based bindings. Of course, message-oriented middleware brought the use of queues, relays, and explicit message passing.
Web services communicate using a concrete set of protocols based on a common architecture with SOAP as its foundation. In contrast, service orientation is an abstract set of ideas and concepts that can be manifested in any number of ways (much like object-orientation before it).
Web services can be used to implement a service-oriented system, but service-orientation does not necessitate the use of Web service protocols, nor does the use of Web service protocols ensure that the overall system design is service-oriented. That stated, Microsoft is investing heavily in making the combination of Web services and service-orientation an important part of Windows.
The broad adoption of service-oriented architectures is accelerating as a result of several underlying factors. Network infrastructure is now pervasive, enabling cost-effective computer-to-computer communication. Systems built from Web services provide a software development approach that enables legacy applications to be incorporated in incremental steps. Finally, service autonomy provides for more robust applications.
This introduction has presented the main enablers, motivators, requirements, and principles that guide the Web services architecture. The rest of the paper presents the core technologies that underlie the architecture, followed by a tour through the collection of specifications that define Web services.
The rest of this paper provides a detailed introduction to the Web services architecture. We review the Web services components and mechanisms they build upon, in support of the architecture's design. Each feature of the architecture is presented in the context of the specifications where it is defined.
This section presents core specifications used to formulate messages in the Web services architecture: XML, SOAP, and WS-Addressing [WS-Addressing]. Web services rely on XML for the basic underlying data model, SOAP for the message processing and data model, and WS-Addressing for addressing services and identifying messages independent of transport.
For all messaging systems, the selection of the unit of information transfer is an important decision. Simply put, a common understanding of exactly what constitutes a message is required. In Web services, a message is an XML document information item as defined by the XML Information Set, or Infoset, [XML-Infoset]. The Infoset is an abstract data model that is compatible with the text-based XML 1.0 [XML-10] and is the foundation of all modern XML specifications (XML Schema [XML-Schema], XML Query [XML-Query], and XSLT 2.0 [XSLT-20]). By basing the Web services architecture on the XML Infoset rather than on a specific representation format, the architecture and core protocol components are compatible with alternative encodings.
The Infoset models an XML document in terms of a set of 'information items'. The set of possible information items generally maps to the various features in an XML document, such as elements, attributes, namespaces, and comments. Each information item has an associated set of properties that provide a more complete description of the item. The eleven types of information items in an XML document are described in Appendix B. Every well-formed XML document consists of exactly one document information item and at least one element information item.
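The point that a message is defined by its information items rather than by its bytes can be illustrated with a minimal Python sketch. It compares two serializations that differ only in lexical detail (quote style, attribute order, empty-element syntax). The sample documents and the helper name same_content are hypothetical, and canonical XML (via the standard library's xml.etree.ElementTree.canonicalize, Python 3.8 or later) is used here only as an approximation of an Infoset-level comparison, not as a definition of it.

```python
from xml.etree.ElementTree import canonicalize

# Two hypothetical serializations of the same order document.
# They differ in quote style, attribute order, and empty-element syntax.
doc_a = "<order id='42' currency='USD'><note/></order>"
doc_b = '<order currency="USD" id="42"><note></note></order>'


def same_content(xml_1: str, xml_2: str) -> bool:
    """Compare two documents after canonicalization (C14N 2.0).

    Canonicalization removes lexical differences that carry no
    information, which approximates comparing the abstract items.
    """
    return canonicalize(xml_1) == canonicalize(xml_2)


print(same_content(doc_a, doc_b))
```

A comparison at this level is what allows alternative encodings to carry the same message, as the preceding paragraphs describe.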
In addition to the pure text-based encoding of the Infoset, the Web services architecture also supports an Infoset encoding that allows opaque binary data to be interleaved with traditional text-based markup. The W3C XML-binary Optimized Packaging (or XOP [XOP]) format uses multi-part MIME [MIME] to allow raw binary data to be included in an XML 1.0 document without resorting to base64 encoding. A companion specification, SOAP Message Transmission Optimization Mechanism, or MTOM, [MTOM], then specifies how to bind this format to SOAP. XOP and MTOM are the preferred approach for mixing raw binary with text-based XML and replace the now-deprecated SOAP with Attachments (SwA) and WS-Attachments/DIME.
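One motivation for avoiding base64 is size: base64 represents every 3 bytes of binary data as 4 text characters. The following small Python sketch measures that ratio for a placeholder payload. The 3,000-byte attachment and the function name base64_overhead are assumptions made for illustration; the sketch does not implement XOP or MTOM packaging.

```python
import base64


def base64_overhead(payload: bytes) -> float:
    """Return encoded size divided by raw size for a binary payload."""
    encoded = base64.b64encode(payload)
    return len(encoded) / len(payload)


# Hypothetical 3,000-byte binary attachment (for example, a small image).
attachment = bytes(3000)
print(f"base64 size ratio: {base64_overhead(attachment):.2f}")
```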
SOAP provides a simple and lightweight mechanism for exchanging structured and typed information between peers in a decentralized, distributed environment using XML. SOAP was designed to reduce the engineering cost of integrating applications built on different platforms as much as possible with the assumption that the lowest-cost technology has the best chance of gaining universal acceptance. A SOAP message is an XML document information item that contains three elements: <Envelope>, <Header>, and <Body>.
The Envelope is the root element of the SOAP message and contains an optional Header element and a mandatory Body element. The Header element is a generic mechanism for adding features to a SOAP message in a decentralized manner. Each child element of Header is called a header block, and SOAP defines several well-known attributes that can be used to indicate who should deal with a header block (role) and whether processing it is optional or mandatory (mustUnderstand), both described below. When present, the Header element is always the first child element of the Envelope. The Body element is always the last child element of the Envelope, and is a container for the "payload" intended for the ultimate recipient of the message. SOAP itself defines no built-in header blocks and only one payload, which is the Fault element used for reporting errors.
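The envelope structure described above can be sketched in Python with the standard library. This is a minimal illustration, not a SOAP toolkit: the function name build_envelope, the application namespace urn:example:orders, and the Priority and SubmitOrder elements are hypothetical. The envelope namespace shown is the SOAP 1.2 namespace, which is an assumption about the SOAP version in use.

```python
import xml.etree.ElementTree as ET

# SOAP 1.2 envelope namespace (assumes SOAP 1.2 is the version in use).
SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"


def build_envelope(header_blocks, payload):
    """Return an Envelope with an optional Header and a mandatory Body.

    The Header, when present, is the first child; the Body is the last.
    """
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    if header_blocks:
        header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
        header.extend(header_blocks)  # each child is one header block
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


# Hypothetical application namespace, header block, and payload.
APP_NS = "urn:example:orders"
priority = ET.Element(f"{{{APP_NS}}}Priority")
priority.text = "high"
order = ET.Element(f"{{{APP_NS}}}SubmitOrder")

envelope = build_envelope([priority], order)
ET.register_namespace("env", SOAP_NS)
print(ET.tostring(envelope, encoding="unicode"))
```

Passing an empty list of header blocks yields an envelope whose only child is the Body, which mirrors the optional nature of the Header element.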
All Web services messages are SOAP messages that take full advantage of the XML Infoset. The fact that both the message payload and the protocol headers employ the same model can be used to ensure the integrity of infrastructure headers as well as application bodies. Applications may route messages based on the content of both the headers and the data inside the message. Tools that have been developed for the XML data model may be used for inspecting and constructing complete messages. These benefits were not available in architectures such as DCOM, CORBA, and RMI, where protocol headers were infrastructural details opaque to the application.
SOAP messages are transmitted one-way from sender to receiver. Multiple one-way messages can be combined into more sophisticated patterns. For instance, a popular pattern is a synchronous request/response pair of messages.
Any software agent that sends or receives messages is called a SOAP node. The node that performs the initial transmission of a message is called the original sender. The final node that consumes and processes the message is called the ultimate receiver. Any node that processes the message between the original sender and ultimate receiver is called an intermediary. Intermediaries are used to model the distributed processing of an individual message. Together, the intermediary nodes traversed by the message and the ultimate receiver are referred to as the message path.
To allow parts of the message path to be identified, each node participates in one or more roles. SOAP roles are a categorization scheme that associates a URI-based [RFC1630] name with abstract functionality (e.g., caching, validation, authorization). The base SOAP specification defines two built-in roles: Next and UltimateReceiver. Next is a universal role in that every SOAP node other than the sender belongs to the Next role. UltimateReceiver is the role that the terminal node in a message path plays. This is typically the application, or in some cases, infrastructure that is performing work on behalf of the application.
The body of a SOAP envelope is always targeted at the ultimate receiver. In contrast, SOAP headers may be targeted at intermediaries or the ultimate receiver. To provide a safe and versionable model for processing messages, SOAP defines three attributes that control how intermediaries and the ultimate receiver process a given header block—role, relay, and mustUnderstand. The role attribute is used to identify which node the header block is targeted at. The mustUnderstand attribute indicates whether that node may ignore the header block if it is not recognized. Header blocks marked mustUnderstand="true" are called mandatory header blocks. Header blocks marked mustUnderstand="false" or that have no mustUnderstand attribute are called optional header blocks. The relay attribute indicates whether that node should forward unrecognized optional headers or discard them.
Every SOAP node must use these three attributes to implement the SOAP processing model. The following steps define that model:
1. Identify all header blocks of the SOAP message intended for the current SOAP node using the role attribute (the absence of this attribute implies the header block is for the ultimate receiver).
2. Verify that all mandatory header blocks identified in Step 1 can be processed by the current SOAP node using the SOAP mustUnderstand attribute. If a mandatory header block cannot be processed by the current SOAP node, the message must be discarded and a distinguished fault message must be generated.
3. Process the message. Optional message elements may be ignored.
4. If the SOAP node is not the ultimate receiver of the message, all header blocks identified in Step 1 that are not relayable are removed and the message is then relayed to the next SOAP node in the message path. The SOAP node is free to insert new header blocks into the relayed message. Some of these header blocks may be copies of header blocks identified in Step 1.
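The steps above can be sketched as a small Python function operating on an in-memory list of header blocks. This is a simplified model under stated assumptions, not a conforming SOAP implementation: HeaderBlock, MustUnderstandFault, and process_headers are hypothetical names, header blocks are identified by a plain string, and "relayable" is read as "marked relay and not understood by this node", following the description of the relay attribute earlier in this section. Blocks that a node does process are treated as consumed and are not reinserted. The role URIs are the SOAP 1.2 values.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

NEXT = "http://www.w3.org/2003/05/soap-envelope/role/next"
ULTIMATE = "http://www.w3.org/2003/05/soap-envelope/role/ultimateReceiver"


@dataclass(frozen=True)
class HeaderBlock:
    name: str
    role: Optional[str] = None      # None: targeted at the ultimate receiver
    must_understand: bool = False
    relay: bool = False


class MustUnderstandFault(Exception):
    """Raised when a mandatory header block cannot be processed."""


def process_headers(headers: List[HeaderBlock], node_roles: Iterable[str],
                    understood: Set[str], is_ultimate: bool) -> List[HeaderBlock]:
    """Apply the four steps and return the header blocks to relay onward."""
    roles = set(node_roles) | {NEXT}   # every receiving node plays Next
    if is_ultimate:
        roles.add(ULTIMATE)

    def targeted(block: HeaderBlock) -> bool:
        if block.role is None:
            return is_ultimate
        return block.role in roles

    # Step 1: identify header blocks intended for this node.
    mine = [b for b in headers if targeted(b)]

    # Step 2: every mandatory block among them must be understood.
    for block in mine:
        if block.must_understand and block.name not in understood:
            raise MustUnderstandFault(block.name)

    # Step 3: processing is application-specific and omitted here.

    # Step 4: an intermediary relays untargeted blocks plus targeted
    # blocks it did not understand that are marked relay.
    if is_ultimate:
        return []
    return [b for b in headers
            if not targeted(b) or (b.relay and b.name not in understood)]


blocks = [
    HeaderBlock("Cache", role=NEXT),
    HeaderBlock("Trace", role=NEXT, relay=True),
    HeaderBlock("Security", must_understand=True),
]
relayed = process_headers(blocks, [], {"Cache"}, is_ultimate=False)
print([b.name for b in relayed])
```

In this example the intermediary consumes the Cache block, relays the unrecognized Trace block because it is marked relay, and passes the Security block through untouched because it is targeted at the ultimate receiver.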
The SOAP processing model is designed to allow extensibility and versioning. The mustUnderstand attribute controls whether the introduction of a new header block is a breaking or non-breaking change. Adding optional header blocks (e.g., headers marked mustUnderstand="false") is a non-breaking change, as any SOAP node is free to ignore them. Adding mandatory header blocks (e.g., headers marked mustUnderstand="true") is a breaking change, in that only SOAP nodes that are aware of the header block's syntax and semantics are able to process the message. The role and relay attributes compose with mustUnderstand to distribute this processing model along a message path.
The messaging flexibility provided by SOAP allows services to communicate using a variety of message exchange patterns, satisfying the requirements of distributed applications. We exploit several of them in the core building blocks of the architecture. Several patterns have proven particularly useful in distributed systems. The use of remote procedure calls, for example, popularized the synchronous request/response message exchange pattern. When message delivery latencies are uncontrolled, asynchronous messaging is needed. When the asynchronous request/response pattern is used, explicit message correlation becomes mandatory.
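Why correlation becomes mandatory in the asynchronous case can be seen in a minimal Python sketch: once the sender no longer blocks, replies may arrive in any order, so each reply must name the request it answers. The Correlator class, the dictionary-based message shape, and the field names message_id and relates_to are assumptions made for illustration and are not taken from any specification.

```python
import uuid
from typing import Callable, Dict


class Correlator:
    """Tracks outstanding requests so late replies can be matched."""

    def __init__(self) -> None:
        self._pending: Dict[str, Callable[[str], None]] = {}

    def send_request(self, body: str, on_reply: Callable[[str], None]) -> dict:
        """Create a request with a fresh identifier and remember its callback."""
        message_id = f"urn:uuid:{uuid.uuid4()}"
        self._pending[message_id] = on_reply
        return {"message_id": message_id, "body": body}

    def receive(self, message: dict) -> bool:
        """Dispatch a reply to its request; return False if nothing matches."""
        callback = self._pending.pop(message.get("relates_to"), None)
        if callback is None:
            return False
        callback(message["body"])
        return True


results = []
correlator = Correlator()
first = correlator.send_request("quote for A", lambda b: results.append(("A", b)))
second = correlator.send_request("quote for B", lambda b: results.append(("B", b)))

# Replies arrive in the opposite order from the requests.
correlator.receive({"relates_to": second["message_id"], "body": "B costs 7"})
correlator.receive({"relates_to": first["message_id"], "body": "A costs 5"})
print(results)
```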
Broadcast transports popularized one-to-many message transmissions. The original sender imposing its messages on the recipients by just sending them is referred to as the push model. While this model is effective in local-area networks, it does not scale well to wide-area networks nor offer recipients an option to regulate the message flow.
Another useful pattern is based on an application's ability to express interest in particular kinds of messages, making the publish/subscribe pattern quite popular. By explicitly subscribing to message sources (or topics), applications have a more controlled flow of relevant information.
The pull model is used when a recipient explicitly requests a message from a source. This makes message flow the recipient's responsibility. The pull pattern can also be combined with publish/subscribe. It is well suited for situations where recipients may be intermittently disconnected from the sources.
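The combination of publish/subscribe with the pull model can be sketched in a few lines of Python. This is an in-memory illustration only: the Topic and Subscription classes and the max_messages parameter are hypothetical, and no network transport or subscription expiry is modeled. Each subscriber owns a queue that the source fills, and the subscriber drains it at its own pace, which is what makes the pattern suitable for intermittently connected recipients.

```python
from collections import deque
from typing import Deque, List


class Subscription:
    """Holds messages for one subscriber until the subscriber asks for them."""

    def __init__(self) -> None:
        self._queue: Deque[str] = deque()

    def _deliver(self, message: str) -> None:
        self._queue.append(message)

    def pull(self, max_messages: int = 1) -> List[str]:
        """Return up to max_messages queued messages, oldest first."""
        batch: List[str] = []
        while self._queue and len(batch) < max_messages:
            batch.append(self._queue.popleft())
        return batch


class Topic:
    """A message source that applications subscribe to explicitly."""

    def __init__(self) -> None:
        self._subscriptions: List[Subscription] = []

    def subscribe(self) -> Subscription:
        subscription = Subscription()
        self._subscriptions.append(subscription)
        return subscription

    def publish(self, message: str) -> None:
        for subscription in self._subscriptions:
            subscription._deliver(message)


prices = Topic()
phone = prices.subscribe()        # an intermittently connected recipient
for update in ("tick 1", "tick 2", "tick 3"):
    prices.publish(update)        # published while the phone is offline

print(phone.pull(max_messages=2))  # the recipient regulates the flow
print(phone.pull(max_messages=2))
```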
SOAP is defined independently of the underlying messaging transport mechanism in use. It allows the use of many alternative transports for message exchange, and allows both synchronous and asynchronous message transfer and processing.
One example of a system that requires both multiple transports and asynchronous messaging is one that communicates between a Web service on a land-based, high-speed network backbone and an intermittently connected Web service hosted on a cellular phone. Such a system requires a single message to travel over different transports depending on which network hop the message is moving between. Such a system also shows one example of where message delivery latency cannot be accurately determined. Instead of attempting to determine or bound message delivery latency, the Web service developer should build the system assuming the full power of asynchronous message passing. Unlike when using remote procedure calls, asynchronous messaging allows the sender to continue processing after each message transmission without being forced to block and wait for a response. Of course, synchronous request-response patterns can be built on the foundation of asynchronous messaging.
Since Web service protocols are designed to be completely independent of the underlying transport, selection of the appropriate mechanism can be deferred until runtime. This allows Web service applications the flexibility to determine the appropriate transport as the message is sent. Additionally, the underlying transport may change as the message is routed between nodes, and again, the mechanism selected for each hop can vary as required.
Despite this general transport independence, most first-generation Web services communicate using HTTP, as this is one of the primary bindings included within the SOAP specification. HTTP uses TCP as its underlying transport protocol. However, TCP's design introduces processing overhead that is not always necessary. Several application protocol patterns more closely match the semantics of the User Datagram Protocol, or UDP [UDP]. These patterns are particularly useful for devices and other resource-constrained systems.
UDP does not have the delivery guarantees of TCP; it provides best-effort datagram messaging. It also requires fewer resources to implement than TCP. In addition, UDP provides multicast capabilities, allowing a sender to simultaneously transmit a message to multiple recipients. The specifications for binding SOAP messages to UDP are published in SOAP-over-UDP [SOAP-UDP].
For messages to be routed and addressed in this multi-transport world, a common mechanism is needed for critical messaging properties to be carried across multiple transports. The WS-Addressing specification defines three sets of SOAP header blocks for this purpose.
The Action header block is used to indicate the expected processing of a message. This header block contains a single URI that is typically used by the ultimate recipient to dispatch the message for processing.
The MessageID and RelatesTo header blocks are used to identify and correlate messages. The MessageID and RelatesTo headers use simple URIs to uniquely identify messages—typically these URIs are transient UUIDs.
The To/ReplyTo/FaultTo header blocks are used to identify the agents that are to process the message and its replies. These headers rely on a WS-Addressing-defined structure called an endpoint reference that bundles together the information needed to properly address a SOAP message.
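The three sets of header blocks can be sketched in Python as XML elements for a request and its reply. This is an illustrative sketch under stated assumptions: the helper names wsa, find, request_headers, and reply_headers are hypothetical, as are the example URIs, and the namespace shown is assumed to be the August 2004 revision of WS-Addressing, which may differ from the revision a given system uses.

```python
import uuid
import xml.etree.ElementTree as ET

# Assumed WS-Addressing namespace (August 2004 revision).
WSA_NS = "http://schemas.xmlsoap.org/ws/2004/08/addressing"


def wsa(tag, text=None):
    """Create one WS-Addressing element with optional text content."""
    element = ET.Element(f"{{{WSA_NS}}}{tag}")
    element.text = text
    return element


def find(headers, tag):
    """Return the first header block with the given WS-Addressing name."""
    return next(h for h in headers if h.tag == f"{{{WSA_NS}}}{tag}")


def request_headers(to, action, reply_to):
    """Header blocks for a request: To, Action, MessageID, and ReplyTo."""
    reply = wsa("ReplyTo")
    reply.append(wsa("Address", reply_to))  # ReplyTo holds an endpoint reference
    return [wsa("To", to), wsa("Action", action),
            wsa("MessageID", f"urn:uuid:{uuid.uuid4()}"), reply]


def reply_headers(request, action):
    """Header blocks for a reply, addressed and correlated from the request."""
    reply_address = find(request, "ReplyTo").find(f"{{{WSA_NS}}}Address").text
    return [wsa("To", reply_address), wsa("Action", action),
            wsa("MessageID", f"urn:uuid:{uuid.uuid4()}"),
            wsa("RelatesTo", find(request, "MessageID").text)]


# Hypothetical endpoint and action URIs.
request = request_headers("http://example.org/orders",
                          "urn:example:orders:Submit",
                          "http://example.org/client")
reply = reply_headers(request, "urn:example:orders:SubmitResponse")
for block in reply:
    print(ET.tostring(block, encoding="unicode"))
```

Because these values travel as SOAP header blocks rather than as transport metadata, the reply can be routed and correlated even when it returns over a different transport than the request used.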
Endpoint references are the most important aspect of WS-Addressing, as they provide support for finer-grained addressing than just a URI. They are used extensively throughout the Web services architecture. Endpoint references contain three critical pieces of information: a base address, and optional sets of reference properties and reference parameters. The base address is a URI that is used to identify an endpoint, and appears in the To header block of every SOAP message targeted at that endpoint. Reference properties and reference parameters are collections of arbitrary XML elements used to complement the base address by providing additional routing or processing information for the message. They are represented as literal header elements. When using an endpoint reference to construct a message for the endpoint, the sender is responsible for including all reference properties and reference parameters as header blocks.
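The way a sender turns an endpoint reference into message headers can be sketched in Python as follows. This is a minimal illustration, not a complete WS-Addressing implementation: the namespace URI assumes the 2004/08 revision of the specification, and the function name, addresses, and example reference elements are hypothetical.

```python
import uuid
import xml.etree.ElementTree as ET

SOAP = "http://www.w3.org/2003/05/soap-envelope"
# Assumption: the 2004/08 revision of WS-Addressing.
WSA = "http://schemas.xmlsoap.org/ws/2004/08/addressing"


def build_envelope(address, action, reference_elements, reply_to=None):
    """Build a SOAP envelope addressed to the endpoint described by
    `address` (the base address) and `reference_elements` (its reference
    properties and reference parameters, as XML elements)."""
    envelope = ET.Element(f"{{{SOAP}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP}}}Header")
    ET.SubElement(envelope, f"{{{SOAP}}}Body")

    # The base address appears in the To header block.
    ET.SubElement(header, f"{{{WSA}}}To").text = address
    # Action tells the ultimate recipient how to dispatch the message.
    ET.SubElement(header, f"{{{WSA}}}Action").text = action
    # MessageID is a transient, unique URI.
    ET.SubElement(header, f"{{{WSA}}}MessageID").text = f"urn:uuid:{uuid.uuid4()}"

    if reply_to is not None:
        reply = ET.SubElement(header, f"{{{WSA}}}ReplyTo")
        ET.SubElement(reply, f"{{{WSA}}}Address").text = reply_to

    # Reference properties and parameters are copied as literal header blocks.
    for element in reference_elements:
        header.append(element)
    return envelope
```

A caller would pass, for example, an element such as {urn:example:store}ServiceLevel with the text "gold" as a reference property; the receiving service sees it as an ordinary header block.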
The distinction between reference properties and reference parameters is in how they relate to a service's metadata. The policy and contract of a Web service is based on its base address and reference properties only. Typically, the base address and reference properties identify a given deployed service and the reference parameters are used to identify specific resources that are managed by that service.
Reference properties and parameters are simply opaque XML elements that are expected to be processed by only the ultimate receiver. They help ensure that information that can be used for dispatch, routing, indexing, or other sender-side processing activities is included with a given message. While intermediaries are not expected to process this information, it is possible that some intermediaries, such as firewalls or gateway services, may use certain reference properties or parameters for message routing and/or processing.
There are many uses for reference properties. Two simple examples are classes of service and private entity identifiers. In the class of service example, reference properties may be used to differentiate between a Web service for standard customers and one for "gold" customers that provides a higher quality of service and enhanced capabilities—possibly through extra operations or additional bindings—logically forming two different endpoints. Properties such as these are set only once in a session and then reused throughout the rest of the interaction. An example of a second kind of use of a reference property is a mechanism to identify a customer in a manner private to the originating system. A combination of these two types of reference properties could enable efficient message dispatch to the appropriate collection of servers and efficiently finding the application state that relates to a particular user. These examples also show how data that refers to instances of services and data that refers to instances of users can be represented in reference properties.
In particular, reference properties also help address collections of WSDL entities that share a common URL and scope. WSDL, the XML format for describing Web services as a set of endpoints operating on messages, first specifies its entities abstractly, and then concretely binds them to specific instances. In particular, messages and operations are abstractly defined, and are then bound to an endpoint with network transport and message format information. So, from the WSDL perspective, when targeting different concrete entities, like input or output messages, portType bindings, ports, or services in a Web service using a common URL, the corresponding Endpoint Reference's reference properties should be different. WSDL is presented in more detail in the Metadata section.
Two examples of reference parameter use are infrastructural and application-level. An infrastructural example of a reference parameter can be a transaction/enlistment ID sent to a Transaction Processing monitor. In a book-purchase scenario, the ISBN number of a book can be an application-level example of a reference parameter.
All Web service interaction is performed by exchanging SOAP messages as described in the previous section. To provide for a robust development and operational environment, services are described using machine-readable metadata. Metadata enables interoperability. Web service metadata serves several purposes. It is used to describe the message interchange formats the service can support, and the valid message exchange patterns of a service. Metadata is also used to describe the capabilities and requirements of a service. This last form of metadata is called the policy of a service. Message interchange formats and message exchange patterns are expressed in WSDL. Policies are expressed using WS-Policy. Contracts are expressed using all three kinds of metadata described above. Contracts are abstractions that insulate applications from the internal implementation details of the services they rely upon.
The Web Services Description Language, or WSDL, was the first widely adopted mechanism for describing the basic characteristics of a Web service. Messages described in WSDL are grouped into operations that define the basic message patterns. The operations are grouped into interfaces called portTypes that specify an abstract contract for a service. Finally, ports and bindings are used to associate portTypes with concrete transports and physical deployment information. A WSDL description is a first step in automatically identifying all characteristics of the target service and enabling software development tools.
WSDL specifies what a request message must contain and what the response message will look like in unambiguous notation. The notation that a WSDL file uses to describe message formats is based on XML Schema. This means it is both programming-language neutral and standards-based, which makes it suitable for describing service interfaces that are accessible from a wide variety of platforms and programming languages. In addition to describing message contents, WSDL may define where the service is available and what communications protocol is used to talk to the service. This means that the WSDL file can specify the base elements required to write a program to interact with a Web service. Several tools are available to read a WSDL file and generate the code required to produce syntactically correct messages for a Web service.
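To give a sense of what such tools start from, the following Python sketch reads a small WSDL 1.1 document and lists each operation with its input and output messages. The sample document, its urn:example:bookstore namespace, and the function name are invented for illustration; real tools also process types, bindings, and services.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical abstract description: one request/response operation and
# one one-way operation.
SAMPLE_WSDL = """<definitions name="BookStore"
    targetNamespace="urn:example:bookstore"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="urn:example:bookstore">
  <message name="GetPriceRequest"/>
  <message name="GetPriceResponse"/>
  <message name="OrderPlaced"/>
  <portType name="BookStorePortType">
    <operation name="GetPrice">
      <input message="tns:GetPriceRequest"/>
      <output message="tns:GetPriceResponse"/>
    </operation>
    <operation name="NotifyOrder">
      <input message="tns:OrderPlaced"/>
    </operation>
  </portType>
</definitions>"""


def list_operations(wsdl_text):
    """Map each operation name to its (input message, output message).

    A missing input or output is reported as None, which distinguishes
    one-way operations from request/response operations.
    """
    root = ET.fromstring(wsdl_text)
    operations = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        for operation in port_type.findall("wsdl:operation", WSDL_NS):
            inp = operation.find("wsdl:input", WSDL_NS)
            out = operation.find("wsdl:output", WSDL_NS)
            operations[operation.get("name")] = (
                inp.get("message") if inp is not None else None,
                out.get("message") if out is not None else None,
            )
    return operations
```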
While WSDL is a good starting point, it is not sufficient to describe all aspects of a Web service. WSDL allows only a rather small set of properties to be expressed. Examples of more detailed information that is necessary for Web services include the following:
- Operational characteristics: The service supports SOAP version 1.2.
- Deployment characteristics: The service is available only between 9 a.m. and 5 p.m.
- Security characteristics: Kerberos [KERBEROS] tickets are required for access to the service.
First-generation Web services must exchange metadata out of band using proprietary protocols. This issue is addressed with WS-Policy [WS-Policy]. WS-Policy provides a general-purpose model and syntax to describe and communicate the policies of a Web service. It specifies a base set of constructs that can be used and extended by other Web service specifications to describe a broad range of service requirements and capabilities. WS-Policy introduces a simple and extensible grammar for expressing policy assertions and a processing model to interpret them. Assertions may be combined into logical alternatives.
Policy assertions allow programmers to add appropriate metadata to service information either at development time or at runtime. Examples of development time policies include the maximum allowed message size, or the exact version of a supported specification. Examples of runtime policies include mandatory service down time or the unavailability of a Web service during a given administrative procedure such as regular hardware maintenance. Examples of policies that relate to security are presented later in this paper.
Individual policy assertions may be grouped to form policy alternatives. Policies are collections of policy alternatives. To facilitate interoperability, policies are defined in terms of their XML Infoset representation. A compact form for policies is defined to reduce the size of policy documents while preserving interoperability.
Policies are used to convey conditions for interaction between two Web service endpoints. Satisfying assertions in a policy usually results in behavior that reflects these conditions. Thus, policy assertion evaluation is central to identifying compatible behaviors. A policy assertion, the building block for policies, is supported by a requestor if and only if the requestor satisfies the requirement, or accommodates the capability, corresponding to the assertion. In general, this determination uses domain-specific knowledge. A policy alternative is supported by a requestor if and only if the requestor supports all the assertions in the alternative. This is determined mechanically using the results of the policy assertions. Also, a policy is supported by a requestor if and only if the requestor supports at least one of the alternatives in the policy. This determination is also mechanical once the policy alternatives have been evaluated. Note that although policy alternatives are meant to be mutually exclusive, it cannot be decided in general whether or not more than one alternative can be supported at the same time.
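The two mechanical steps described above can be sketched in a few lines of Python. In this sketch a policy is modeled as a list of alternatives and each alternative as a list of assertion names; the assertion names and the supports_assertion callback, which stands in for the domain-specific check, are assumptions made for illustration.

```python
from typing import Callable, List

Assertion = str
Alternative = List[Assertion]
Policy = List[Alternative]


def supports_alternative(
    alternative: Alternative, supports_assertion: Callable[[Assertion], bool]
) -> bool:
    """An alternative is supported only if every assertion in it is."""
    return all(supports_assertion(assertion) for assertion in alternative)


def supports_policy(
    policy: Policy, supports_assertion: Callable[[Assertion], bool]
) -> bool:
    """A policy is supported if at least one alternative is supported."""
    return any(
        supports_alternative(alternative, supports_assertion)
        for alternative in policy
    )


def supported_alternatives(
    policy: Policy, supports_assertion: Callable[[Assertion], bool]
) -> List[Alternative]:
    """Return every alternative the requestor could choose from."""
    return [
        alternative
        for alternative in policy
        if supports_alternative(alternative, supports_assertion)
    ]
```

For example, a requestor that supports the hypothetical assertions X509Token and Soap12 supports the policy [["KerberosToken", "Soap12"], ["X509Token", "Soap12"]] through its second alternative. A policy with no alternatives is supported by no requestor, while an alternative with no assertions is supported by every requestor.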
To convey policy in an interoperable form, a policy expression is an XML Infoset representation of a policy. The normal form policy expression is the most straightforward Infoset; equivalent, alternative Infosets allow compactly expressing a policy through a number of constructs. Policy expressions are the base building block for policies. Two operators are used to express their assertions: All and ExactlyOne. The All operator specifies that all the assertions present in a collection of policy alternatives have to hold for the policy assertion to be satisfied. The ExactlyOne operator specifies that exactly one of the assertions present in a collection of policy alternatives has to hold for the policy assertion to be satisfied.
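In normal form, a policy expression consists of a single ExactlyOne element whose children are All elements, one per policy alternative. The following Python sketch reads such an expression into a list of alternatives. The WS-Policy namespace URI assumes the 2004/09 revision, and the assertion elements in the urn:example:assertions namespace are invented for illustration; compact forms are not handled.

```python
import xml.etree.ElementTree as ET

# Assumption: the 2004/09 revision of WS-Policy.
WSP = "http://schemas.xmlsoap.org/ws/2004/09/policy"

# Hypothetical policy with two alternatives.
NORMAL_FORM = f"""<wsp:Policy xmlns:wsp="{WSP}" xmlns:ex="urn:example:assertions">
  <wsp:ExactlyOne>
    <wsp:All><ex:KerberosToken/><ex:Soap12/></wsp:All>
    <wsp:All><ex:X509Token/><ex:Soap12/></wsp:All>
  </wsp:ExactlyOne>
</wsp:Policy>"""


def read_alternatives(policy_xml):
    """Return the policy alternatives of a normal-form policy expression.

    Each alternative is a list of qualified assertion names in the
    "{namespace}local-name" notation used by ElementTree.
    """
    root = ET.fromstring(policy_xml)
    exactly_one = root.find(f"{{{WSP}}}ExactlyOne")
    if exactly_one is None:
        return []
    return [
        [assertion.tag for assertion in alternative]
        for alternative in exactly_one.findall(f"{{{WSP}}}All")
    ]
```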
Policies layer on top of, and augment, WSDL descriptions. Policies are associated with Web services metadata, such as WSDL definitions or UDDI [UDDI] entities, through the use of WS-PolicyAttachment [WS-PA]. Policies may be associated with resources either as an intrinsic part of their definition, or separately. Mechanisms are defined for each of these purposes. In particular, policies may also be used with individual SOAP messages. When multiple policy attachments are made for an entity, they jointly determine the effective policy for the entity. Care must be taken when attaching policies at different levels of the WSDL hierarchy, since the net result for each level of a hierarchy is the effective policy. As a general rule for self-description and human-understandable clarity, it is preferable to be verbose and repeat a policy assertion at each level of the hierarchy to which it applies than to be terse and rely on the mechanism that computes the effective policy. In a WSDL document a message exchange with a deployed endpoint could contain effective policies in all four subject types simultaneously.
The combination of WS-Policy and WS-PolicyAttachment provides an increased ability to programmatically discover and reason about the policies supported by other services. Flexibility to add policies is an important complement to the WSDL information that describes the message interactions.
WSDL and WS-Policy both define formats for metadata but do not specify mechanisms for acquiring or accessing metadata for a given service. In general, service metadata can be discovered using a variety of techniques. To enable services to be self-describing, Web services architecture defines SOAP-based access protocols for metadata in WS-MetadataExchange [WS-MEX]. The GetMetadata operation is used to retrieve metadata that is found at the endpoint reference of the request. The Get operation is similar but is designed to retrieve metadata that is referenced in a metadata section, and is to be retrieved at the endpoint reference where it is stored.
The metadata exchanged using WS-MEX can be described as a resource. A resource is defined as any entity addressable by an endpoint reference where the entity can provide an XML representation of itself. Resources form the basis needed to build state management in Web services.
A Profile is a set of guidelines for the use of Web services specifications beyond the core protocols. These guidelines are necessary because of the specifications' general-purpose design. In some instances, developers need additional help in determining which Web services features should be used to meet a particular requirement. Interoperability Profiles also resolve ambiguities in areas where the Web services specifications are not clear enough to ensure that all implementations process SOAP messages in the same way.
The WS-I Basic Profile
The first Web services profiles were published by the Web Services-Interoperability Organization (WS-I) [WS-I]. WS-I has finalized its first profile, simply titled the Basic Profile 1.0 [WSI-BP10]. This profile provides guidance primarily on the interoperable use of SOAP 1.1 and WSDL 1.1.
This section presents the specifications used in the Web services architecture to provide message integrity, authentication and confidentiality, security token exchange, message session security, security policy expression, and security for a federation of services within a system. The specifications providing these features are WS-Security, WS-Trust [WS_Trust], WS-SecureConversation [WS-SecureConv], WS-SecurityPolicy [WS-SecurityPolicy], and WS-Federation [WS_Federation].
Security is a fundamental aspect of computer systems, especially those systems composed of Web services. Security has to be robust and effective. Since systems may only make hard-wired assumptions about the format of messages and legal message exchanges, security must be built based on explicit, agreed-upon mechanisms and assumptions. The security infrastructure should also be flexible enough to support the wide variety of security policies required by different organizations.
When a secure transport is available between the communicating Web services, such as the Secure Sockets Layer (SSL) and Transport Layer Security (TLS), building a secure solution is simplified. With a secure transport, the services need not concern themselves with maintaining integrity and confidentiality for individual messages; they can rely on the underlying transport. However, existing transport-level security is a solution limited only to point-to-point messaging. If intermediaries are present when using a secure transport, the initial sender and the ultimate receiver need to trust those intermediaries to help provide end-to-end security, since each hop is secured separately. In addition to explicit trust of all intermediaries, other risks such as local storage of messages and the potential for an intermediary to be compromised must be considered.
To maximize the reach of Web services, end-to-end security must be provided when intermediaries are not trusted by the communicating endpoints. This requires higher-level security protocols. End-to-end message security is a richer alternative to point-to-point transport-level security, since it supports the loosely coupled, federated, multi-transport, and extensible environment that SOAP-based Web services require. This powerful and flexible infrastructure can be developed from a combination of existing technologies and Web services protocols while mitigating many of the security risks associated with point-to-point messaging.
Even though the security requirements for Web services are complex, no new security mechanisms were invented to satisfy the needs of SOAP-based messaging. Existing approaches to distributed systems security, such as Kerberos tickets, public key encryption technologies, X.509 [X509] certificates, and others proved to be sufficient. New mechanisms were necessary only to apply these existing security approaches to SOAP. These new security protocols were designed with extensibility in mind in order to allow new approaches to be incorporated in the future. A primary design objective was to provide mechanisms for self-describing security properties designed for SOAP and the rest of the Web services architecture.
Web services security is based on the requirement that incoming messages prove a set of assertions made about a sender, a service or other resource. We call these claims, or security assertions. Examples of security claims include identity, attributes, key possession, permissions, or capabilities. These assertions are encoded in binary security tokens wrapped in XML. In traditional security terminology, these security tokens represent a mix of capabilities and access controls.
Various approaches are used to create security tokens. A Web service may build a custom security token from local information. Alternately, a security token may be retrieved from specialized services such as an X.509 certificate authority or a Kerberos domain controller. To automate communication between services, a mechanism to express security requirements is required.
Services may express their security requirements using policy assertions as specified in WS-SecurityPolicy. This specification is described in a later subsection of this paper. By retrieving these policy assertions, an application may build messages that comply with the requirements of the target service. This combination of features provided by claims, security tokens and policies, and the ability to retrieve them from a Web service is powerful.
The general Web services security model supports several more specific security models, such as identity-based authorization, access control lists, and capabilities-based authorization. It allows the use of existing technologies such as X.509 public-key certificates, XML-based tokens, Kerberos shared-secret tickets, and password digests. The general model is sufficient to construct systems that use more sophisticated approaches for higher-level key exchange, authentication, policy-based access control, auditing, and complex trust relationships. Proxies and relay services may also be used. For example, a relay service can be built to enforce a security policy at a trust boundary; messages going outside the boundary are encrypted while those that stay within the boundary are unencrypted. This flexibility and degree of sophistication are not present in previous solutions.
The common security attacks described in Appendix C include a base taxonomy of system threats that should be carefully considered when choosing Web services security features.
The remainder of this section explores the application of the Web services security model. The two key topics are securing communications and securing applications. A secure message transport is not assumed, nor is it necessary for secure Web services.
Message-level security is the key building block for end-to-end security. When using message-level security, no transport-level security is required. Requirements for message-level security are message integrity, message authentication, and confidentiality. Message integrity ensures that a message cannot be changed without detection. Use of XML Signature [XMLSIG] ensures that message modifications can be detected.
Message authentication identifies the principal that sent the message. If public key encryption is used, the unique identity of the principal can be determined. The use of public key encryption with keys certified by a trusted source provides this authentication. However, if symmetric key encryption is used, this is not the case – only the group of principals that know the shared secret can be identified.
Message confidentiality ensures that a message cannot be read by an unauthorized third party during transmission. SOAP messages are kept confidential through the use of XML Encryption [XMLENC] in conjunction with security tokens.
Mechanisms for integrity, authentication, and confidentiality take the original message (or parts of the message) as input, and produce appropriate data (such as a checksum) as output. For example, the signature of an XML element could, in a simple case, be implemented as the asymmetric encryption of a hash of all the characters of the XML element. This encrypted hash could then be stored and transmitted in the message.
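The simple case described above, a signature formed by applying a private key to a hash of the element's characters, can be illustrated with a toy Python sketch. It uses textbook RSA without padding and a key built from two small, publicly known primes, so it is in no way secure and is not XML Signature; the key parameters and function names are assumptions chosen only to keep the example self-contained.

```python
import hashlib

# Toy key material: two Mersenne primes. Real keys use large secret
# primes and a padding scheme; this is for illustration only.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                               # public modulus
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """Hash the characters of the element and map the hash into [0, N)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Apply the private key to the hash of the data."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Undo the private-key operation with the public key and compare
    the result with a freshly computed hash."""
    return pow(signature, E, N) == digest_as_int(data)
```

Changing even one character of the signed element changes the hash, so verification of the original signature against the modified element fails.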
XML documents can be thought of as strings of characters. Character-by-character comparison is critical to security operations such as XML signatures: a one-character difference produces a different result. Serialization is the method used to represent objects "on the wire". For example, serialization is used to create the XML representation of a SOAP message. Inessential typographical variations produced by different serialization software are ignored by message processing software, but significantly impact the security software. The Infoset representation of an XML message ameliorates this issue. For XML signatures to work, messages must be transformed to an XML form that is consistent for all parties. Canonicalization is the term used to describe the method used to produce a consistent view of non-critical information such as line breaks, tab spaces, ordering of attributes, and the style of closing tags. Signatures include the canonicalization method used, which enables the recipient of a message to process the security information in a manner consistent with the sender. The specific canonicalization method in use by a service is a useful policy assertion to place at a WSDL portType binding or a WSDL Port.
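The effect of canonicalization on hashing can be demonstrated with the Python standard library. The sketch below uses xml.etree.ElementTree.canonicalize (available from Python 3.8), which implements C14N 2.0; this is an assumption of convenience, since a given service may require a different canonicalization method. The sample element is invented.

```python
import hashlib
import xml.etree.ElementTree as ET

# Two serializations of the same element: attribute order, quoting,
# spacing, and the style of the empty tag all differ.
FROM_SENDER = '<order id="7" currency="USD"><item/></order>'
AS_RECEIVED = "<order currency='USD'   id='7'><item></item></order>"


def raw_digest(xml_text: str) -> str:
    """Hash the characters exactly as they appear on the wire."""
    return hashlib.sha256(xml_text.encode("utf-8")).hexdigest()


def canonical_digest(xml_text: str) -> str:
    """Canonicalize first, then hash the canonical characters."""
    canonical = ET.canonicalize(xml_data=xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The raw digests of the two serializations differ, while their canonical digests are equal, which is why the sender and the receiver must agree on the canonicalization method before comparing signatures.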
WS-Security specifies mechanisms for message integrity and confidentiality, and single message authentication. For message integrity, the specification details how a cryptographic signature is represented and associated with specific parts of the SOAP message. The approach allows arbitrary well-formed fragments of the message to have separate signatures. In a similar manner, confidentiality is achieved through the encryption of well-formed fragments of the message. Authentication is achieved using digital signatures.
The WS-Security specification describes common security mechanisms in use today, but does not preclude new ones from being added in the future. Since the SOAP processing model uses the header elements to make processing decisions, great care must be exercised when deciding which elements in a SOAP message to encrypt.
Web service designers must be aware of how the message will be processed in deciding which elements are to be encrypted and which encryption algorithms to use. These decisions are even more important when specific header elements need to be processed by third parties or intermediaries. If those parties are not privy to the appropriate decryption data, or to the conventions used in encrypting the XML elements, they will not be able to operate correctly. In addition, each processing node must have a common understanding of the security information included in the message.
One natural choice for encrypting an XML element in a header is to encrypt it completely, replacing the original element with one of type encrypted data. This straightforward approach has drawbacks. Intermediaries, for example, have a hard time determining which elements must be processed (those adorned with the mustUnderstand="1" attribute). Also, as the element type is changed, determining its original type is difficult.
An alternative approach is to transform the element to one where all the key attributes needed for correct SOAP processing are preserved and the original element is encrypted and placed in a distinguished sub-element. The advantage of this approach is that correct processing can be achieved even by intermediaries that do not know how to decrypt the element. A drawback to this approach is that it requires the convention used to represent the original element to be understood by all parties. While WS-Security does not currently provide guidance on this approach, we expect future work to do so. The alternate method is preferred because it enables the correct processing of all SOAP header elements.
Several kinds of security tokens are described in WS-Security's profile specifications. Profiles have been developed for tokens representing user names, X.509 certificates, and XML-based security tokens. XML-based security tokens include the Security Assertion Markup Language (SAML) [SAML] and the eXtensible rights Markup Language/Rights Expression Language (REL) [REL]. Specifications for the use of Kerberos tickets are also under development.
The WS-I Basic Security Profile
One of the newest interoperability Profiles to be published by WS-I is the Basic Security Profile (BSP) [WSI-BSP10]. This Profile provides implementation guidance for WS-Security and various security tokens, such as Username [WS-SecUsername] and X.509 certificate tokens [WS-SecX509]. It is designed to complement and compose with the WS-I Basic Profile.
Security tokens are required to provide an end-to-end security solution. These security tokens must be shared, either directly or indirectly, between the parties involved in message processing. Each party also must determine if the asserted credentials can be trusted. These trust relationships are based on the exchange and brokering of security tokens and in the supporting trust policies that have been established. How much of a brokered token is trusted, for example, is determined by system administrators and the trust relationships they have established.
Services that provide security tokens can be quite varied. This is where each of the underlying security technologies is first used by a Web service. In order to provide a uniform solution irrespective of the security technology, new protocols were designed for security token exchange between trust domains.
WS-Trust [WS-Trust] complements WS-Security with protocols for requesting, issuing and brokering security tokens. In particular, operations to acquire, issue, renew, and validate security tokens are defined. Another feature of the specification is a mechanism to broker new trust relationships. Network and transport protection mechanisms such as IPsec or TLS/SSL can be used in conjunction with WS-Trust for different security requirements and scenarios.
Security token acquisition can be done directly by explicitly requesting one from an appropriate issuer, or indirectly by delegating the acquisition to a trusted third party. Tokens may also be acquired out-of-band. For example, the token may be sent from a security authority to a party without the token having been explicitly requested. To complete the picture, system administrators determine initial trust relationships designating, for example, a given service as a trusted root service. This approach is similar to what is used to bootstrap security on the Web today. All tokens obtained from this service are trusted to the same extent as the trusted root itself. For example, if a root is trusted for only claims A and B, and a message contains claims A, B and C, then only claims A and B in the message are trusted. Configuration flexibility is provided through trust relationship delegation.
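The rule in the example above, that a message's claims are trusted only to the extent that the issuing root is trusted for them, amounts to a set intersection. The following Python sketch makes this concrete; the issuer names, claim names, and the shape of the trust table are assumptions made for illustration.

```python
from typing import Dict, Iterable, Set


def trusted_claims(
    message_claims: Iterable[str],
    issuer: str,
    trust_roots: Dict[str, Set[str]],
) -> Set[str]:
    """Return the subset of a message's claims that can be trusted.

    `trust_roots` maps each trusted root, as configured by an
    administrator, to the set of claims it is trusted for. Claims from
    an unknown issuer are not trusted at all.
    """
    return set(message_claims) & trust_roots.get(issuer, set())
```

With a trust table of {"root-service": {"A", "B"}}, a message from root-service carrying claims A, B, and C yields only A and B as trusted claims.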
To address scenarios where a set of exchanges between the parties is required prior to returning, or issuing, a security token, mechanisms are specified for validation, negotiation and exchange. A particular form of exchange called a "challenge" provides a mechanism for a party to prove that it possesses a secret associated with a token. Other types of exchanges include legacy protocol tunneling. WS-Trust defines how to extend the specification for additional token exchange protocols beyond these two examples.
Security tokens expressing security claims are issued by a trusted root or by an issuer reached through a delegation chain. These security claims are used to verify that the message complies with the security policies in place. They also verify that the attributes of the claimant are proven by the signatures. In brokered trust models, i.e., those where a trusted intermediary dispenses security tokens, the signature may not verify the identity of the claimant, but may instead verify the identity of the intermediary. This intermediary may simply assert the identity of the claimant.
Some mechanisms for message authentication and confidentiality can be computationally expensive. In particular, many encryption techniques consume substantial processing power. These costs are generally unavoidable when messages are secured individually. However, when two Web services exchange many messages, more efficient and robust approaches for message confidentiality than those defined in WS-Security are available. These mechanisms, based on symmetric encryption, should be used when securing sessions of messages.
WS-SecureConversation [WS-SecConv] defines a security context between two communicating parties based on shared secrets, such as symmetric encryption keys. A security context is shared between the parties for the lifetime of a session. Session keys are derived from a shared secret and are used to encrypt and decrypt the individual messages sent in the conversation. The security context is represented on the wire as a new security token type (the Security Context Token, or SCT).
Three different ways of establishing a security context among the parties of a secure conversation are defined. First, a security token service may create the security context, and the initiating party must fetch it and propagate it. Second, one of the communicating parties creates the security context and propagates it in a message to the other party. Third, the security context is created through negotiation and exchanges. Web services select the approach most appropriate for their needs.
Security contexts can be amended when necessary. An example of a requirement to update a security context is the need to extend the context's expiration time.
A security context token implies or contains a shared secret. This secret is used for signing and/or encrypting messages. When using a shared secret, the parties may choose a different key derivation pattern to use. For example, four keys may be derived so that two parties can sign and encrypt using separate keys. In order to keep the keys fresh and to maintain a high level of security, subsequent derivations should be used. Securing sessions using this approach is preferred. The WS-SecureConversation specification defines a mechanism to indicate which derivation is being used within a given message. Each derivation algorithm is identified with a URI.
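Deriving several keys from one shared secret can be sketched in Python as follows. The sketch uses the P_hash construction from TLS (RFC 2246) with HMAC-SHA1 as the derivation function, which is an assumption here; the four purpose labels, the function names, and the key length are likewise hypothetical and are not taken from the specification.

```python
import hashlib
import hmac
from typing import Dict


def p_sha1(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash from RFC 2246 using HMAC-SHA1.

    A(0) = seed, A(i) = HMAC(secret, A(i-1)), and the output is the
    concatenation of HMAC(secret, A(i) + seed) for i = 1, 2, ...
    truncated to `length` bytes.
    """
    output = b""
    a = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha1).digest()
        output += hmac.new(secret, a + seed, hashlib.sha1).digest()
    return output[:length]


# Hypothetical labels: one signing key and one encryption key for each
# direction of the conversation.
LABELS = (
    "initiator-sign",
    "initiator-encrypt",
    "responder-sign",
    "responder-encrypt",
)


def derive_session_keys(
    secret: bytes, nonce: bytes, key_length: int = 16
) -> Dict[str, bytes]:
    """Derive four independent keys from the shared secret and a nonce."""
    return {
        label: p_sha1(secret, label.encode("ascii") + nonce, key_length)
        for label in LABELS
    }
```

Supplying a fresh nonce yields a fresh set of keys from the same secret, which is one way subsequent derivations can keep keys from being reused for too long.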
WS-SecurityPolicy [WS-SecurityPolicy] complements WS-Security by specifying security policy assertions in a language conformant to WS-Policy. Its six assertions relate to security tokens, message integrity, message confidentiality, message visibility to SOAP intermediaries, constraints on the security header, and the age of a message. For example, a policy assertion may require that all messages be signed using public keys from a given authority, or that authentication be based on Kerberos tickets.
Application security requires additional mechanisms beyond what we have presented so far. Identities, for example, are valid within a trust domain but most likely meaningless in other trust domains. For services in different trust domains to be able to validate identities, appropriate mechanisms are needed. WS-Federation defines mechanisms to enable identity, account, attribute, authentication, and authorization information sharing across trust domains. By using these mechanisms, multiple security domains may federate by allowing and brokering trust of identities, attributes, and authentication among participating Web services. The specification extends the WS-Trust model to allow attributes and pseudonyms to be integrated into the token issuance mechanism, resulting in a multi-domain identity mapping mechanism. These mechanisms support single sign on, sign out, and pseudonyms, and describe the role of specialized services for attributes and pseudonyms.
A large variety of requirements may be addressed through identity federation. One example is associating an employee with her employer. In this case, Jane, from CompanyA, makes a purchase from OfficeSupplyStore.com. CompanyA and OfficeSupplyStore.com have a purchasing contract. Since Jane's identity is associated with CompanyA, she can be authorized to make a purchase under the contract.
A second example is mapping a single person to multiple pseudonyms. Joe may be known at work as email@example.com. He may also have other identities, such as firstname.lastname@example.org and email@example.com. Through identity federation, systems can determine that each of these identities is the same Joe.
Two broad classes of requestors (message senders) are defined in the Web services federated security architecture: passive and smart (active). A passive requestor [WS-FedPassive] is a service that only uses HTTP and never issues security tokens. A smart requestor is a service that is capable of issuing messages containing security tokens, such as those described in WS-Security and WS-Trust. A traditional HTTP-based web browser is an example of a passive requestor. Profile specifications have been developed to define the behaviors of these two kinds of requestors.
For smart requestors, the active requestor profile [WS-FedActive] specifies how single sign on, sign out, and pseudonyms are integrated into the Web services security model using SOAP messages. In effect, the profile describes how to implement the model described in WS-Federation in the context of smart requestors. It specifies requirements on various kinds of security tokens. As an example of one of these security token requirements, when not using a secure channel, tokens for X.509 certificates must contain the authority's name and a signature over the whole token. The profile also requires that X.509 tokens contain the subject identifier uniquely identifying the subject for whom the token was granted.
This section presents the functional components of the Web services architecture that are used to locate Web services on a network and to determine the service's availability: UDDI and WS-Discovery [WS-Discovery].
Web service discovery is a key enabler for automating connections to services without human intervention. The Web service approach to discovery mirrors the two most common approaches to finding information in a computer system: looking in a well-known location, or broadcasting a request to all available listeners. The UDDI registries serve as the directory, and discovery protocols are used to broadcast requests.
The Universal Description, Discovery, and Integration protocol, or UDDI, specifies a protocol for querying and updating a common directory of Web service information. The directory includes information about service providers, the services they host, and the protocols those services implement. The directory also provides mechanisms to add metadata to any registered information.
The UDDI directory approach can be used when Web service information is stored in well-known locations. Once the directory is located, a series of query requests can be sent to obtain the desired information. UDDI directory locations are obtained out of band, usually through system configuration data.
Web service providers have various options for how they deploy UDDI registries. Deployment scenarios fall into one of three categories: public, extra-enterprise and intra-enterprise. To support public deployments, a group of vendors led by Microsoft, IBM and SAP host the UDDI Business Registry [UBR]. The UBR is a public UDDI registry that is replicated across multiple hosting organizations, serving as both a resource for Internet-based Web services and a testbed for Web services developers. While the public UDDI implementation has received the most attention to date, early adopters use the extra- and intra-enterprise approach more often. In these two deployment scenarios, a private registry is deployed by an organization, and much tighter control over the types of information registered is possible. These private registries may be dedicated to only one organization or to groups of business partners. UDDI also defines protocols for replication between registries and for trust federation across deployments. Using these protocols further increases the number of deployment scenarios available to implementers.
For all deployment scenarios, UDDI directories contain detailed information about Web services and where they are hosted. A UDDI directory entry has three primary parts—the service provider, Web services offered, and bindings to the implementations. Each of these parts provides progressively more detailed information about the Web service.
The most general information describes the service provider. This information is not targeted at Web services software, but at a developer or implementer that needs to contact someone responsible for the service directly. Service provider information includes names, addresses, contacts and other administrative details. All UDDI entries have multiple elements for multi-language descriptions.
The list of available Web services is stored within a service provider entry. These services may be organized depending on their intended use: they may be grouped by application area, geography, or any other scheme that is appropriate. Service information stored in a UDDI registry includes simply a description of the service and a pointer to the Web service implementations it contains. Links to services hosted by other providers, called 'Service Projections', may also be registered.
The final part of a UDDI service provider entry is the binding to an implementation. This binding associates the Web service entry to the exact URI(s) to identify where the service is deployed, specifies the protocol to use for access, and contains references to the exact protocols that are implemented.
These details are sufficient for a developer to write an application that invokes the Web service. The detailed protocol definition is provided through a UDDI entity called a Type Model (or tModel). In many cases, the tModel references a WSDL file describing the SOAP Web service interface, but tModels are also flexible enough to describe almost any kind of resource.
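The three-part structure of a directory entry (provider, services, bindings) and the role of the tModel can be pictured with a small data model. The sketch below is not the UDDI API or its schema; the class names, field names, and the find_access_points helper are hypothetical and chosen only to mirror the description above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TModel:
    name: str
    overview_url: str  # for example, the location of a WSDL file


@dataclass
class Binding:
    access_point: str  # URI where the service is deployed
    tmodels: List[TModel] = field(default_factory=list)


@dataclass
class Service:
    name: str
    description: str
    bindings: List[Binding] = field(default_factory=list)


@dataclass
class Provider:
    name: str
    contact: str
    services: List[Service] = field(default_factory=list)


def find_access_points(providers: List[Provider], tmodel_name: str) -> List[str]:
    """Return every access point whose binding references the named tModel."""
    return [
        binding.access_point
        for provider in providers
        for service in provider.services
        for binding in service.bindings
        if any(t.name == tmodel_name for t in binding.tmodels)
    ]


quote_tmodel = TModel("StockQuote", "http://example.com/quote.wsdl")
registry = [
    Provider("Example Corp", "admin@example.com", [
        Service("Quotes", "Delayed stock quotes", [
            Binding("http://example.com/quotes", [quote_tmodel]),
        ]),
    ]),
]
```

Each level adds detail: the provider is for people, the service is a description, and the binding carries what software needs in order to call the service.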
For each provider or service registered in UDDI, additional metadata from standard taxonomies (such as NAICS [NAICS] and the older SIC industry codes [SIC]) or other identification schemes (such as an Edgar Central Index Key) can be used to categorize the information and improve search accuracy. The set of available taxonomies and identifier schemes is readily extensible as a part of any implementation, so it can be customized to support any specific geographic, industry, or corporate requirements.
Dynamic Web service discovery is provided in a different manner. As an alternative to storing information in a known registry, dynamically discovered Web services explicitly announce their arrival and departure from the network. WS-Discovery defines protocols to announce and discover Web services through multicast messages.
When a Web service connects to a network, it announces its arrival by sending a Hello message. In the simplest case, these announcements are sent across the network using multicast protocols—we call this an ad-hoc network. This approach also minimizes the need for polling on the network. In order to limit the amount of network traffic and optimize the discovery process, a system may include a Discovery Proxy. A Discovery Proxy replaces the need to send multicast messages with a well-known service location, transforming an ad-hoc network into a managed network. Using configuration information, collections of proxy services may be linked together to scale the discovery service to groups of servers, scaling from one machine to many.
Since the Discovery Proxies are themselves Web services, they may announce their presence with their own special Hello message. Web services receiving this message may then take advantage of the proxy's services, and are no longer required to use the noisier one-to-many discovery protocols.
When a service departs from a network, WS-Discovery specifies a Bye message to be sent to either the network or the Discovery Proxy. This message informs the other services on the network that the departing Web service is no longer available.
To complement this basic approach to service announcement and departure, WS-Discovery defines two operations, Probe and Resolve, to locate Web services on a network. For ad-hoc networks, Probe messages are sent to a multicast group, and target services that match the request return a response directly to the requestor. For managed networks utilizing a Discovery Proxy, Probe messages are unicast to the Discovery Proxy instead. The Resolve message is used when a Web service is to be located by name. The Resolve message is only sent in multicast mode. Resolve is analogous to the Address Resolution Protocol, or ARP [ARP], that converts an IP address to its corresponding physical network address.
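The interplay of Hello, Bye, Probe, and Resolve can be sketched with an in-memory stand-in for an ad-hoc network. No multicast traffic or SOAP messages are involved; the AdHocNetwork and TargetService classes and their fields are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TargetService:
    name: str          # stable logical name of the service
    service_type: str  # what kind of service it is, such as "printer"
    address: str       # transport address where it can be reached


class AdHocNetwork:
    """In-memory stand-in for a multicast group (no real networking)."""

    def __init__(self) -> None:
        self._online: Dict[str, TargetService] = {}

    def hello(self, service: TargetService) -> None:
        # A service announces its arrival.
        self._online[service.name] = service

    def bye(self, service: TargetService) -> None:
        # A service announces its departure.
        self._online.pop(service.name, None)

    def probe(self, service_type: str) -> List[TargetService]:
        # Locate services by type; every matching service responds.
        return [s for s in self._online.values() if s.service_type == service_type]

    def resolve(self, name: str) -> Optional[str]:
        # Locate one service by name; return its address if it is online.
        service = self._online.get(name)
        return service.address if service else None


network = AdHocNetwork()
printer = TargetService("printer-1", "printer", "http://192.0.2.10/print")
scanner = TargetService("scanner-1", "scanner", "http://192.0.2.11/scan")
network.hello(printer)
network.hello(scanner)
```

A Discovery Proxy would expose the same probe operation at a well-known address, so that clients no longer need to reach every listener.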
The WS-Discovery specification also allows system configurations where the Probe message is sent to a Discovery Proxy that has been established by some other administrative means, such as by using a well-known DHCP record.
The ability to find services dynamically enables Web service management bootstrapping. In conjunction with WS-Eventing [WS-Eventing] and other protocols, more sophisticated management services can be built using this dynamic discovery infrastructure.
Dynamic discovery also extends the Web services architecture to devices, such as those that might implement the Universal Plug & Play (UPnP) protocols today. This is an important step in making the architecture truly universal. With WS-Discovery and WS-Eventing, devices such as printers or storage media, for example, may be incorporated into a system as Web services without the need for specialized tools or protocols.
The Device Profile for Web Services specification
The Device Profile for Web Services [WS-DP] specification provides guidance on what subset of the Web services architecture specification family should be implemented on resource-constrained devices. This Profile attempts to find the balance between the rich capabilities that are available and those that are most important when making tradeoffs due to resource constraints.
This section presents the components of the Web services architecture that provide reliable message delivery, transactional behavior, and the ability to provide explicit coordination between a collection of Web services. The specifications that define this functionality are WS-ReliableMessaging, WS-Coordination [WS-Coord], WS-AtomicTransaction [WS-AT] and WS-BusinessActivity.
When multiple Web services must complete a joint unit of work or operate under a common behavior, there must be common agreement on what protocols to use. This minimum amount of coordination among Web services is unavoidable. Coordination protocols are also required to be able to determine and to agree that a common goal has been reached. Every interaction between Web services can be viewed as a kind of coordination. Agreement coordination protocols bring the architecture an improved chance that the participant services will succeed in what they set out to do jointly. The Web services architecture is designed to function properly in the face of transports that lose messages and services that malfunction.
Any multi-party coordination can be built up from two-party coordination by successively joining in more participants as needed. Two-party coordination may be spontaneous, or it may require a designated coordinator. An example of a popular spontaneous coordination protocol is the synchronous request-response messaging pattern. This is one of the simplest forms of agreement coordination; for each work request, the recipient Web service must complete all of the expected work before returning any data to the requestor. Both parties follow this strict pattern with no need of an explicit coordination service. A second example of spontaneous coordination is presented in The Three-Leg Handshake section. WS-ReliableMessaging is an example of a very general multi-message spontaneous coordination and is described in the next section.
Many conditions may interrupt an exchange of messages between two services. This is especially an issue when unreliable transport protocols such as HTTP 1.0 and SMTP [SMTP] are used for transmission or when a message exchange spans multiple transport-layer connections. Messages may be lost, duplicated or reordered, and Web services may fail and lose volatile state. WS-ReliableMessaging is a protocol that enables the reliable delivery of messages based on specific delivery assurance characteristics. The specification defines three different assurances that may be used in combination:
- At-Least-Once Delivery: Each message is delivered at least one time.
- At-Most-Once Delivery: Duplicate messages will not be delivered.
- In-Order Delivery: Messages are delivered in the same order they were sent.
The combination of at-least-once and at-most-once assurances results in an exactly-once delivery assurance. Due to the transport-independent design of the Web services architecture, all delivery assurances are guaranteed irrespective of the communication transport or combination of transports used. Using WS-ReliableMessaging simplifies system development due to the smaller number of potential delivery failure modes that a developer must anticipate.
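On the receiving side, at-most-once and in-order delivery can be sketched with message numbers and a small buffer. The class below is an illustration under simple assumptions (message numbers start at 1, everything is held in memory); it is not the WS-ReliableMessaging wire protocol, and the name InOrderReceiver is hypothetical. At-least-once delivery additionally relies on the sender retransmitting unacknowledged messages.

```python
from typing import Dict, List


class InOrderReceiver:
    """Delivers each numbered message once, in the order it was sent."""

    def __init__(self) -> None:
        self._next_expected = 1             # numbering starts at 1 in this sketch
        self._pending: Dict[int, str] = {}  # messages that arrived early
        self.delivered: List[str] = []      # what the application actually sees

    def receive(self, number: int, body: str) -> None:
        # At-most-once: drop anything already delivered or already buffered.
        if number < self._next_expected or number in self._pending:
            return
        self._pending[number] = body
        # In-order: release the longest run of consecutive messages.
        while self._next_expected in self._pending:
            self.delivered.append(self._pending.pop(self._next_expected))
            self._next_expected += 1

    def acknowledged_up_to(self) -> int:
        """Highest message number delivered without gaps (0 if none)."""
        return self._next_expected - 1


receiver = InOrderReceiver()
# The transport reorders and duplicates messages.
for number, body in [(2, "b"), (1, "a"), (2, "b"), (4, "d"), (3, "c"), (1, "a")]:
    receiver.receive(number, body)
```

The application reads only from delivered, so it never observes the duplicates or the reordering introduced by the transport.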
Reliable message delivery does not require an explicit coordinator. When using WS-ReliableMessaging, the participants must recognize the protocol based on the information sent in SOAP message headers. The set of messages transmitted as a group is referred to as a message sequence. A message sequence can be established by either the initiator/sender or the Web service, and often by both when establishing a duplex association. Sequences are established explicitly using the CreateSequence and CreateSequenceResponse messages. When the desired end result is to have two one-way sequences acting as a duplex sequence, the initiator provides the sequence that the Web service is to use. The ID of this sequence is included by the initiator in the CreateSequence message.
Several policy assertions are defined in WS-ReliableMessaging. These policy assertions are expressed using the mechanisms defined in WS-Policy.
Reliable messaging protocols simplify the code that developers have to write to transfer messages under varying transport assurances. The underlying infrastructure, rather than the application, verifies that messages have been properly transferred between the endpoints, retransmitting messages and detecting duplication when necessary. Applications do not need any additional logic to handle the message retransmissions, duplicate message elimination, message reordering, or message acknowledgement that may be required to provide the delivery assurances. The implementation of WS-ReliableMessaging is distributed across the initiator and the service. Those characteristics that are not visible 'on the wire', such as message delivery order, are provided by the implementation of the WS-ReliableMessaging specification. While characteristics such as message retransmissions due to transport losses are handled by the messaging layer unbeknownst to the application, other end-to-end characteristics such as in-order delivery require that both the messaging infrastructure and the receiver application collaborate. It is interesting to note that providing a message ordering "as received" on the receiver when the sender expects "as sent" is an incorrect implementation of in-order. Providing an order "as sent" on the receiver when the sender expects "as received" is a correct implementation of in-order.
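The retransmission work that the infrastructure takes over from the application can be sketched as a loop that resends every unacknowledged message. The function send_reliably, the transmit callback, and the simulated loss pattern are all assumptions made for this example; real implementations use timers and acknowledgement ranges rather than fixed rounds.

```python
from typing import Callable, Dict, List, Set, Tuple


def send_reliably(messages: Dict[int, str],
                  transmit: Callable[[int, str], bool],
                  max_rounds: int = 10) -> Set[int]:
    """Retransmit until every message is acknowledged or rounds run out.

    `transmit` returns True when an acknowledgement came back.
    Returns the message numbers that are still unacknowledged.
    """
    unacked = set(messages)
    for _ in range(max_rounds):
        if not unacked:
            break
        for number in sorted(unacked):  # sorted() copies, so discarding is safe
            if transmit(number, messages[number]):
                unacked.discard(number)
    return unacked


attempts: Dict[int, int] = {}
received: List[Tuple[int, str]] = []


def lossy_transmit(number: int, body: str) -> bool:
    """Simulated transport: the first attempt of each even message is lost."""
    attempts[number] = attempts.get(number, 0) + 1
    if number % 2 == 0 and attempts[number] == 1:
        return False
    received.append((number, body))
    return True


outstanding = send_reliably({1: "a", 2: "b", 3: "c"}, lossy_transmit)
```

Note that message 2 arrives after message 3 here, which is why a receiver that promises in-order delivery still has to buffer and reorder.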
Some families of N-way coordination protocols require a designated coordinator to shepherd a unit of work through a number of cooperating services. One example is when activities must be coordinated between services that are not all expected to be connected at the same time. As long as each participant and the coordinator communicate at some time, coordination may happen and agreement on the outcome may be reached. The Web services architecture defines some simple operations for designated coordinators.
The WS-Coordination specification defines an extensible coordination framework to support scenarios where explicit coordinators are required. This protocol introduces a SOAP header block, called a Coordination Context, to uniquely identify the piece of joint work that is to be undertaken. To initiate a joint piece of work, a Web service sends a Coordination Context to one or more target services. Receipt of a Coordination Context alerts a recipient service that joint collaboration is requested. The Coordination Context contains enough information for the request recipient to determine whether to participate in the work. The exact information contained within the Coordination Context varies depending on the kind of work that is requested.
The set of coordination types is open-ended. New types may be defined by an implementation, as long as each service participating in the joint work has a common understanding of the required behavior. For example, atomic transactions are one of a few initial cornerstone coordination types that have been defined in the Web services architecture.
If the requested coordination type is understood and accepted, a Web service uses the WS-Coordination registration protocol to notify the coordinator and participate in the joint work. A Coordination Context includes an endpoint reference for the coordinator and identifiers of the possible behaviors that may be selected. The registration operation specifies the behavior supported by the participating Web service. Once the registration message is sent to the coordinator, the Web service participates in the work according to the protocols for which it has registered. Registration is the key operation in the coordination framework. It allows the "wiring together" of different Web services that desire to coordinate to perform a joint unit of work.
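The relationship between a Coordination Context and the registration operation can be sketched as follows. The field names, protocol labels, and the Coordinator class are hypothetical; the actual context is a SOAP header block, and the identifiers are URIs defined by the relevant specifications.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class CoordinationContext:
    identifier: str             # uniquely names the joint piece of work
    coordination_type: str      # the kind of work being requested
    registration_address: str   # where participants send their registration
    protocols: Tuple[str, ...]  # behaviors a participant may select


@dataclass
class Coordinator:
    context: CoordinationContext
    registrations: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, participant: str, protocol: str) -> None:
        """Record that a participant joins the work under one protocol."""
        if protocol not in self.context.protocols:
            raise ValueError(f"unsupported protocol: {protocol}")
        self.registrations.setdefault(protocol, []).append(participant)


context = CoordinationContext(
    identifier="activity-42",
    coordination_type="atomic-transaction",
    registration_address="http://example.com/coordinator",
    protocols=("completion", "volatile-2pc", "durable-2pc"),
)
coordinator = Coordinator(context)
coordinator.register("cache-service", "volatile-2pc")
coordinator.register("order-database", "durable-2pc")
```

A recipient that does not understand the coordination type simply never registers, which is how participation remains voluntary.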
WS-AtomicTransaction specifies traditional ACID transactions [Gray & Reuter] for Web services. Within the context of the atomic transaction coordination type, three protocols are defined: a Completion protocol, and two variants of a Two-Phase Commit protocol. The Completion protocol is used to initiate commit processing. A Web service registered for Completion has the ability to tell the designated coordinator when commit processing is to begin. This protocol also defines messages to communicate the final result of the transaction to the initiator. However, the protocol does not require that the coordinator ensure that the initiator process the result. In contrast, other behaviors in WS-AtomicTransaction do require the coordinator to ensure that participants process the coordination messages.
The Two-Phase Commit (2PC) protocol brings all registered participants to a common commit or abort decision, and ensures that all participants are informed of the final result. As its name indicates, it uses two rounds of notifications to complete the transaction. Two variants of this protocol are defined: Volatile 2PC, and Durable 2PC. Both protocols use the same messages on the wire (corresponding to the operations of prepare, commit, and abort), but Volatile 2PC has no durability requirements. The volatile 2PC protocol is to be used by participants that manage volatile resources, such as cache managers or window managers. These participants are contacted in a first round of notifications by the coordinator and do not require a second round of notifications.
The durable 2PC protocol is to be used by participants managing durable resources such as databases and files. When commit processing has been initiated, these participants are contacted for the first time after all the volatile 2PC participants have been contacted. This enables, for example, caches to be flushed. Durable 2PC participants require the full two rounds of notifications to achieve the all-or-nothing behavior imposed by the coordinator and to complete the transaction. These behaviors are most appropriate for scenarios where resources can be held for the duration of the transaction, and the transactions are typically very short-lived. This protocol guarantees that under normal processing the coordinator will contact all of the participants with the outcome of the first phase. For transactions that are expected to require more time to complete, or when resources such as locks cannot be held, alternate behaviors are defined by other coordination protocols.
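The ordering of the two rounds can be sketched with a simplified coordinator that follows the description above: volatile participants are prepared first, durable participants next, and only durable participants receive the second round. The Participant class and two_phase_commit function are hypothetical, and failure recovery and logging to stable storage are deliberately left out.

```python
from typing import List


class Participant:
    def __init__(self, name: str, can_prepare: bool = True) -> None:
        self.name = name
        self.can_prepare = can_prepare
        self.outcome = None  # becomes "commit" or "abort" for durable participants

    def prepare(self) -> bool:
        return self.can_prepare


def two_phase_commit(volatile: List[Participant],
                     durable: List[Participant],
                     log: List[str]) -> str:
    decision = "commit"
    # First round: volatile participants are contacted before durable ones.
    for participant in volatile + durable:
        log.append(f"prepare {participant.name}")
        if not participant.prepare():
            decision = "abort"
            break
    # Second round: durable participants learn the final outcome.
    for participant in durable:
        participant.outcome = decision
        log.append(f"{decision} {participant.name}")
    return decision


cache = Participant("cache")
database = Participant("database")
commit_log: List[str] = []
result = two_phase_commit([cache], [database], commit_log)
```

Because the cache is prepared before the database, it has the chance to flush its contents while the database can still accept them.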
Several policy assertions are defined in WS-AtomicTransaction. These policy assertions are expressed using the mechanisms defined in WS-Policy.
A pattern that has proven to be very useful when building distributed systems is the use of transactional durable queues to provide store-and-forward asynchronous message delivery. In this pattern, atomic transactions are exploited at each of the transmission endpoints. At the sender side, the sending application delivers a message to a durable queue in an atomic transactional manner where the application and the queue manager both use WS-AtomicTransaction to coordinate. Only if there is no error in processing the message is it considered successfully delivered to the queue.
Then, the queue subsystem takes over the delivery of the message between the originating queue and the recipient queue. This transmission step can take place later than when the message was placed in the originating queue. In addition, the location of the originating queue need not coincide with the location of the application from which the message originated.
Analogously, the application that retrieves the message from the recipient queue does so using an atomic transaction. In that manner, a message can be removed from the queue only when there are no processing errors.
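The receiving half of this pattern can be sketched in a single process: the message leaves the queue only if processing finishes without error. This is a stand-in for the idea, not an implementation of WS-AtomicTransaction or of a durable queue; the function process_next and the handlers are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List


def process_next(queue: Deque[str], handler: Callable[[str], None]) -> bool:
    """Remove the head message only if the handler finishes without error."""
    if not queue:
        return False
    message = queue[0]  # peek; the message stays in the queue for now
    try:
        handler(message)
    except Exception:
        return False    # "abort": the message remains available for a retry
    queue.popleft()     # "commit": processing succeeded, so remove it
    return True


def failing_handler(message: str) -> None:
    raise RuntimeError("downstream service unavailable")


inbox: Deque[str] = deque(["order-1", "order-2"])
processed: List[str] = []

first_try = process_next(inbox, failing_handler)     # message is kept
second_try = process_next(inbox, processed.append)   # message is consumed
```

A failed attempt leaves the queue exactly as it was, so a later retry sees the same message.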
Long Duration Activities
WS-BusinessActivity [WS-BA] specifies two protocols for long-running transactions. Instead of holding locks on resources until the transaction is committed, the WS-BusinessActivity specification is based on compensating actions. The underlying transaction model is the so-called open nested transaction [Gray & Reuter]. These protocols codify how pairs of loosely coupled services reach agreement that they have ended a joint task. In one protocol, the coordinator explicitly tells the participants that no more work is being requested on behalf of the joint task. In the second protocol, the participant is the one that notifies the coordinator that the work on behalf of the joint task has been completed. The use of compensatory actions provides a mechanism to finish tentative operations without leaving locks on them. A compensation operation is to be issued if, for whatever reason, the system desires to undo the effects of the finished tentative operation.
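The idea of compensating finished work instead of holding locks can be sketched as a list of steps, each paired with its compensation. The run_activity function and the travel-booking steps are hypothetical; the sketch shows only the compensation idea and none of the WS-BusinessActivity messages.

```python
from typing import Callable, List, Set, Tuple

# Each step is (name, action, compensation).
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_activity(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; if one fails, compensate finished ones in reverse."""
    finished: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"failed {name}")
            for done_name, _, compensate in reversed(finished):
                compensate()
                log.append(f"compensated {done_name}")
            return False
        log.append(f"completed {name}")
        finished.append(step)
    return True


bookings: Set[str] = set()


def fail_hotel() -> None:
    raise RuntimeError("no rooms available")


activity_log: List[str] = []
succeeded = run_activity(
    [
        ("flight", lambda: bookings.add("flight"), lambda: bookings.discard("flight")),
        ("car", lambda: bookings.add("car"), lambda: bookings.discard("car")),
        ("hotel", fail_hotel, lambda: None),
    ],
    activity_log,
)
```

Each completed step is visible to others as soon as it finishes, which is the price paid for not holding locks across the whole activity.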
Both WS-AtomicTransaction and WS-BusinessActivity leverage WS-Coordination to manage collaboration between Web services.
The three-leg handshake connection establishment and tear-down protocol is an example of a coordination protocol that does not require a designated coordinator service. To establish a connection, the sender sends a request to the receiver. This request establishes a session. If accepted, the receiver responds positively to this request with an acknowledgement message. A third message is transmitted by the sender as an acknowledgement to the acknowledgement, verifying that both parties know that the other party has established a session.
The teardown protocol is analogous. One of the parties sends the other party a session teardown request. The recipient responds with an acknowledgement of the teardown message. Upon reception of this acknowledgement, the party that originated the teardown message completes the message exchange by sending an acknowledgement to the acknowledgement.
This section presents specifications that provide enumeration of service resources, their state management, and event notification in the Web services architecture. They are based on WS-Enumeration [WS-Enum], WS-Transfer [WS-Transfer], and WS-Eventing.
Many scenarios require data exchange using more than just a single request/response message pair. Types of applications that require these longer data exchanges include database queries, data streaming, the traversal of information such as namespaces, and enumerating lists. Enumeration, in particular, is achieved through establishing a session between the data source and the requestor. Successive messages within the session transport the collection of elements being retrieved. No assumptions are made on the approach used by the service to organize the items that will be produced. What is expected is that under normal processing circumstances the enumeration will produce all the underlying data before the end of the session.
WS-Enumeration specifies protocols to establish an enumeration session and to retrieve sequences of data. The enumeration protocols allow the data source to provide a session abstraction, called an enumeration context, to the consuming service. This enumeration context represents a logical cursor through a sequence of data items. The requestor then uses this enumeration context over a span of one or more SOAP messages to request the data. The enumerated data is represented as XML Infosets. The specification also allows a data source to provide a custom mechanism for starting a new enumeration. Since an enumeration session may require several message exchanges, the session state must be retained.
State information regarding the progress of the iteration may be maintained between requests by either the data source or the consuming service. WS-Enumeration allows the data source to decide, on a request-by-request basis, which party will be responsible for maintaining state for the next request. This flexibility enables several kinds of optimizations. One optimization example is allowing a server to avoid saving any cursor state between invocations. As message latencies can be large for a service supporting several simultaneous enumerations, not preserving state may yield substantial savings in the total amount of information that must be maintained. Service implementations on resource-constrained devices, such as cell phones, may not be able to maintain any state information at all.
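The optimization in which the server keeps no cursor state can be sketched by letting the enumeration context carry the position. In WS-Enumeration the context is opaque XML chosen by the data source; here it is a plain integer, and the names enumerate_start and pull are hypothetical.

```python
from typing import List, Optional, Tuple

DATA = ["alpha", "beta", "gamma", "delta", "epsilon"]  # the data source's items


def enumerate_start() -> int:
    """Begin an enumeration; the context is simply the next index."""
    return 0


def pull(context: int, max_items: int) -> Tuple[List[str], Optional[int]]:
    """Return up to max_items items and the context for the next request.

    The server keeps no cursor state: everything it needs is in `context`.
    A returned context of None signals the end of the sequence.
    """
    items = DATA[context:context + max_items]
    next_context = context + len(items)
    return items, (next_context if next_context < len(DATA) else None)


collected: List[str] = []
context: Optional[int] = enumerate_start()
while context is not None:
    batch, context = pull(context, max_items=2)
    collected.extend(batch)
```

Because each response returns a fresh context, the data source could equally decide to hand back a server-side handle on some requests and a self-describing position on others.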
The basic operations required to manage the data entities accessed through Web services are defined in WS-Transfer [WS-Transfer]. An understanding of WS-Transfer requires two new terms to be introduced: factory and resource. A factory is a Web service that can create a resource from its XML representation. WS-Transfer introduces operations that create, update, retrieve and delete resources. It should be noted that the state maintenance for a resource is at most subject to the "best efforts" of the hosting server. When a client receives the server's acceptance of a request to create or update a resource, it can reasonably expect that the resource now exists at the confirmed location and with the confirmed representation, but this is not a guarantee, even in the absence of any third parties. The server may change the representation of a resource, may remove a resource entirely, or may bring back a resource that was deleted. This lack of guarantees is consistent with the loosely coupled model brought by the Web. If desired, services may offer additional guarantees that are not required by the Web services architecture.
WS-Transfer's create, update, and delete operations extend the capabilities of the read-only operations found in WS-MetadataExchange. The retrieve operation is exactly the same as the Get operation in WS-MetadataExchange. The Create request is sent to a factory. The factory then creates the requested resource and determines its initial representation. The factory is assumed to be distinct from the resource being created. The new resource is assigned a service-determined endpoint reference that is returned in the response message.
The Put operation updates a resource by providing a replacement representation. A one-time snapshot of the representation of a resource, identical to the Get operation in WS-MetadataExchange, can be retrieved by using the Get operation in WS-Transfer. After a successful Delete operation the resource is no longer available through the endpoint reference. These four metadata management operations form the basis needed to build state management in Web services.
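The four operations can be sketched against an in-memory store. For brevity, one hypothetical ResourceFactory class plays both the factory and the host of the resources it creates, and endpoint references are reduced to plain strings; neither simplification comes from the specification.

```python
import itertools
from typing import Dict


class ResourceFactory:
    """In-memory stand-in for a factory plus the resources it creates."""

    def __init__(self, base_address: str) -> None:
        self._base = base_address
        self._ids = itertools.count(1)
        self._resources: Dict[str, str] = {}  # reference -> XML representation

    def create(self, representation: str) -> str:
        # The service, not the client, chooses the new endpoint reference.
        reference = f"{self._base}/{next(self._ids)}"
        self._resources[reference] = representation
        return reference

    def get(self, reference: str) -> str:
        # One-time snapshot of the current representation.
        return self._resources[reference]

    def put(self, reference: str, representation: str) -> None:
        # Update by supplying a complete replacement representation.
        if reference not in self._resources:
            raise KeyError(reference)
        self._resources[reference] = representation

    def delete(self, reference: str) -> None:
        # Afterwards the resource is no longer reachable via the reference.
        del self._resources[reference]


factory = ResourceFactory("http://example.com/printers")
ref = factory.create("<Printer><Tray>A4</Tray></Printer>")
factory.put(ref, "<Printer><Tray>Letter</Tray></Printer>")
snapshot = factory.get(ref)
```

This store is stricter than the architecture requires: a real service may alter or remove a resource on its own, so clients should treat every snapshot as potentially stale.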
In systems made up of services that communicate among each other, possibly using asynchronous messaging, there are many scenarios where information produced by one service is of interest to another service. Due to poor scaling characteristics, polling is often not an appropriate mechanism for obtaining such information; too many unnecessary messages are sent through the network. Instead, the architecture requires a mechanism for explicit notifications when events occur. Even more important is the requirement that the binding of a source service and a consumer service be done dynamically at runtime. The Web services architecture supports this through a lightweight eventing protocol.
WS-Eventing specifies mechanisms to enable the following four entities to interact: subscribers, subscription managers, event sources, and event sinks. This allows a Web service, when acting as a subscriber, to register interest in specific events that are provided by another Web service (the event source). This registration is called a subscription. WS-Eventing defines operations a service can provide that allow subscriptions to be created and managed. When an event source determines that an event has occurred, it will provide that information to the subscription manager. The subscription manager can then communicate the event to all of the matching subscriptions. This is similar to publishing topics in a traditional publish/subscribe event notification system. The Web services architecture provides complete flexibility in the way that topics are defined, organized, and discovered; it provides a common infrastructure for managing subscriptions that may be leveraged in many different application arenas.
Subscriptions lease resources that must be eventually recovered. The primary mechanism used to reclaim resources is an expiration time for each subscription. There is also a mechanism to query the status of subscriptions. Additional operations to help subscribers manage their collection of subscriptions, including renewal, notifications, and unsubscribe requests, are also defined. Of course, any service is free to end a subscription at any time, consistent with the principle of autonomy for all Web services. The subscription-end message may be used by the event source to notify subscribers of the premature termination of a subscription.
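Expiration-based reclamation can be sketched with a subscription manager that drops lapsed subscriptions before delivering an event. The class and method names are hypothetical, event sinks are reduced to Python callables, and the current time is passed in explicitly so that the behavior is deterministic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Subscription:
    sink: Callable[[str], None]  # where notifications are delivered
    expires_at: float            # time after which the subscription lapses


class SubscriptionManager:
    def __init__(self) -> None:
        self._subscriptions: Dict[int, Subscription] = {}
        self._next_id = 1

    def subscribe(self, sink: Callable[[str], None], now: float, duration: float) -> int:
        sub_id = self._next_id
        self._next_id += 1
        self._subscriptions[sub_id] = Subscription(sink, now + duration)
        return sub_id

    def renew(self, sub_id: int, now: float, duration: float) -> None:
        self._subscriptions[sub_id].expires_at = now + duration

    def unsubscribe(self, sub_id: int) -> None:
        self._subscriptions.pop(sub_id, None)

    def notify(self, event: str, now: float) -> int:
        """Reclaim expired subscriptions, deliver to the rest, return their count."""
        expired = [i for i, s in self._subscriptions.items() if s.expires_at <= now]
        for sub_id in expired:
            del self._subscriptions[sub_id]
        for subscription in self._subscriptions.values():
            subscription.sink(event)
        return len(self._subscriptions)


manager = SubscriptionManager()
inbox_a: List[str] = []
inbox_b: List[str] = []
sub_a = manager.subscribe(inbox_a.append, now=0, duration=10)
sub_b = manager.subscribe(inbox_b.append, now=0, duration=5)
delivered_first = manager.notify("disk-full", now=1)
delivered_second = manager.notify("disk-ok", now=6)  # sub_b has expired by now
```

A subscriber that wants to keep receiving events calls renew before the expiration time, which is the only ongoing cost of holding a subscription.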
While the general pattern of asynchronous, event-based messages is common, different applications often require alternate event delivery mechanisms. For instance, a simple asynchronous message may be optimal in some cases, while other situations may work better if the event sink can control the flow and timing of message arrival through polling. Polling is also necessary when the sink cannot be reached from the source, such as in the case where the sink is behind a firewall. The notion of delivery modes was introduced in WS-Eventing to support these requirements. Delivery mode is used as an extension point to provide a means for subscribers, event sinks, and event sources to establish tailored delivery mechanisms. The management specification described below makes use of this mechanism.
An event broker may be used to aggregate or redistribute notifications from different sources. A broker may also be used as a stand-alone subscription manager. These two approaches are supported by WS-Eventing. Brokers can play several important roles in a system. Topics may be organized for use by certain classes of applications. Brokers may act as notification aggregators that combine event information from multiple sources. They may also act as filters, receiving more messages than the ones they use for their own notifications. This flexibility is required to deploy robust and scalable notification systems.
Management features are the final aspect of the Web services architecture to discuss. These features are defined in the WS-Management [WS-Management] specification.
WS-Management builds on several components of the architecture, providing a common set of operations that are required by all systems management solutions. This includes the ability to discover the presence of management resources and navigation among them. Individual management resources such as settings and dynamic values can be retrieved, set, created, and deleted. The contents of containers and collections, such as large tables and logs, can be enumerated. Finally, event subscriptions and specific management operations are defined. In each of these areas, WS-Management defines only minimal implementation requirements.
Care has been taken so that conformant WS-Management implementations can be deployed to small devices. At the same time, it has been designed to scale up to large datacenter and distributed installations. In addition, mechanisms are defined independent of any implied data models or system health models. This independence supports its application to all kinds of Web services.
WS-Management requires that managed resources be referenced using endpoint references with specific additional information. This information includes the URL of the agent that provides access to the resource, the unique identifier URI of the resource type that the resource belongs to, and zero or more keys that identify the resource. These keys are assumed to be name/value pairs. The mapping of this information to a WS-Addressing endpoint reference is as follows: the URL of the agent is mapped to the address property, the resource type identifier is mapped to a specific reference property named ResourceURI (in the appropriate XML namespace), and each key is mapped to a reference parameter named Key with an attribute called Name.
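The mapping described above can be sketched as a small function that assembles an endpoint reference. This is an illustration under stated assumptions, not the normative serialization: the WS-Addressing namespace is assumed to be the one used by the August 2004 revision, the management namespace is a placeholder, and the function name, agent URL, resource URI, and key are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the August 2004 WS-Addressing revision.
WSA_NS = "http://schemas.xmlsoap.org/ws/2004/08/addressing"
# Placeholder only; NOT the real WS-Management namespace.
WSMAN_NS = "urn:example:wsman"


def build_endpoint_reference(agent_url, resource_uri, keys):
    """Map agent URL, resource type URI, and keys onto an endpoint reference."""
    epr = ET.Element(f"{{{WSA_NS}}}EndpointReference")

    # The agent URL becomes the address property.
    ET.SubElement(epr, f"{{{WSA_NS}}}Address").text = agent_url

    # The resource type identifier becomes the ResourceURI reference property.
    props = ET.SubElement(epr, f"{{{WSA_NS}}}ReferenceProperties")
    ET.SubElement(props, f"{{{WSMAN_NS}}}ResourceURI").text = resource_uri

    # Each key becomes a Key reference parameter carrying a Name attribute.
    if keys:
        params = ET.SubElement(epr, f"{{{WSA_NS}}}ReferenceParameters")
        for name, value in keys.items():
            key = ET.SubElement(params, f"{{{WSMAN_NS}}}Key", {"Name": name})
            key.text = value
    return epr


if __name__ == "__main__":
    example = build_endpoint_reference(
        "http://agent.example.org/wsman",   # hypothetical agent URL
        "urn:example:resource/disk",        # hypothetical resource type URI
        {"DeviceID": "C:"},                 # hypothetical key
    )
    print(ET.tostring(example, encoding="unicode"))
```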
To accommodate the messaging needs of management services, three qualifiers are defined for the operations. The SOAP representation of these qualifiers is in header elements. An operation timeout specifies a deadline after which the operation need not be serviced. The locale element is used when translations of the underlying information are needed or expected. Finally, a freshness qualifier is provided to request up-to-date values and prohibit returning stale data.
For data access using the WS-Transfer operations, WS-Management specifies three more qualifiers. The Get operation may be qualified with the SummaryPermitted header and the NoCache header. The SummaryPermitted qualifier enables transmission of abbreviated representations, when available. The NoCache qualifier requires transmission of fresh data, disallowing caching of the information. For the Put and Create operations, the ReturnResource qualifier requires the service to return the new representation of a resource. ReturnResource allows resource-constrained Web services to retain no state when updating a resource.
WS-Management defines three custom delivery modes for event notification: batched, pull, and trap. Each of these modes is identified by a URI. These URIs are used when establishing subscriptions. The batched delivery mode enables a subscriber to receive multiple event messages bundled in a single SOAP message. The subscriber may also request a maximum number of events to be included in the bundle, a maximum amount of time that the service should spend accumulating events, and a maximum amount of data that should be returned. The pull delivery mode allows the data-producing service to maintain a logical queue of events so that the subscriber can poll for notifications on demand. This polling is done using WS-Enumeration with an enumeration context returned with the subscription response message. Finally, when UDP multicast is an appropriate messaging mechanism, the trap delivery mode allows an event source to use it. In trap mode, the event source can send its notifications to a predetermined UDP multicast address.
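The three limits a subscriber may request in batched mode (event count, accumulation time, and data size) interact in a way that a short sketch may make clearer. The Batcher class below is hypothetical and only models the accumulation logic on the event source side; it does not build SOAP messages, and the flushing policy shown is one plausible reading rather than behavior mandated by WS-Management.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Batcher:
    max_events: int   # maximum number of events per bundle
    max_wait: float   # maximum seconds spent accumulating a bundle
    max_bytes: int    # maximum payload size per bundle
    _events: List[bytes] = field(default_factory=list)
    _size: int = 0
    _started: Optional[float] = None

    def add(self, event: bytes, now: float) -> List[List[bytes]]:
        """Queue one event and return any bundles that became ready."""
        ready = []
        # Flush first if this event would push the bundle over the size limit.
        if self._events and self._size + len(event) > self.max_bytes:
            ready.append(self._flush())
        if not self._events:
            self._started = now  # the accumulation clock starts here
        self._events.append(event)
        self._size += len(event)
        if len(self._events) >= self.max_events:
            ready.append(self._flush())
        return ready

    def tick(self, now: float) -> Optional[List[bytes]]:
        """Flush a partial bundle once it has waited max_wait seconds."""
        if self._events and now - self._started >= self.max_wait:
            return self._flush()
        return None

    def _flush(self) -> List[bytes]:
        batch = self._events
        self._events, self._size, self._started = [], 0, None
        return batch
```

Each returned bundle would then be carried in one SOAP message to the subscriber. A single event larger than max_bytes is still sent on its own in this sketch.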
This paper introduces the functional building blocks of the Web services architecture and its underlying principles. Each building block is defined in terms of a protocol specification. We expect the functional scope and guiding principles described in this paper to remain unaltered. However, we do expect the architecture to expand to support additional scenarios. Being able to accommodate innovation is a fundamental strength of the architecture.
Great care has been taken to ensure that the various Web service protocols can be cleanly composed with each other; while they have been designed together, they can be used in a wide range of combinations. As functional building blocks they behave like a traditional development framework. When needed, such as for SOAP attachments, we have developed new solutions that fit cleanly within the architecture. The focus on composition is not a deterrent from rich functionality.
The architecture's SOAP messaging foundation assures wide reach. SOAP messaging supports both asynchronous and synchronous patterns in a transport-independent manner. There is no infrastructure more flexible. To accelerate broad adoption of the Web services architecture, the specifications have been authored with an extensive collection of technical partners. Partnering with these key technology providers accelerates the deployment of devices and of programming environments that support the on-the-wire protocols. Achieving wide reach, widespread adoption, and scale-independent constructs are three of our core goals.
We strive to ensure that the architecture can be implemented on any platform and in any programming language. This is facilitated by the message-based and protocol-based nature of the architecture. When necessary, such as only using WS-Security for message integrity, confidentiality and authentication, and expressing metadata only using WS-Policy, we have restricted the universe of technical approaches to increase the level of interoperability. Ideally, as long as implementations faithfully follow the protocol specifications of the architecture they will be able to communicate with any other Web service. The truth is on the wire.
Special thanks to Omri Gazitt for his continued support and multiple suggestions, to Alan Geller, Jim Johnson and Rodney Limprecht for thorough reviews and excellent insights, to Dan Simon for his description of common security attacks, and to Chris Kaler for his guidance and multiple revisions of the security section. John Shewchuk's contributions to the architecture section also were a great help. Thanks to Phil Bernstein, Shy Cohen, Paul Cotton, David Langworthy, Andrew Layman, Brad Lovering, Jeffrey Schlimmer, Scott Seely, Satish Thatte, Marvin Theimer, Jorgen Thelin, and Hervey Wilson for their reviews and encouragements.
Last but not least, the following individuals have participated in the definition of the Web services architecture: (alphabetical) Tony Andrews, Bob Atkinson, Keith Ballinger, Don Box, John Brezak, Allen Brown, Luis Felipe Cabrera, Erik Christensen, George Copeland, Michael Coulson, Giovanni Della-Libera, Brendan Dixon, Mike Dusche, Colleen Evans, Max Feingold, Henrik Frystyk Nielsen, Praerit Garg, Omri Gazitt, Alan Geller, Josh Gray, Martin Gudgin, Destry Hood, Efim Hudis, Tomasz Janczuk, Jim Johnson, Ryan Johnson, John Justice, Gopal Kakivaya, Chris Kaler, Johannes Klein, Scott Konersmann, Brian LaMacchia, Dave Langworthy, Andrew Layman, Paul Leach, Al Lee, Rodney Limprecht, Joe Long, Steve Lucco, John Manferdelli, Ashok Malhotra, Jonathan Marsh, Steve Millet, Angela Mills, Stefan Pharies, Scott Robinson, Yordan Rouskov, Sujay Sahni, Jeff Schlimmer, Oliver Sharp, John Shewchuk, Yasser Shohoud, Dan Simon, Jeff Spelman, Keith Stobie, Satish Thatte, Robert Wahbe, Elliot Waingold, Richard Ward, Hervey Wilson, Kenny Wolf and Eric Zinda.
Digest—A digest is a cryptographic checksum of an octet stream.
Federation—A federation is a collection of trust domains that have established mutual pair-wise trust. The level of trust may vary, but typically includes authentication and may include authorization.
Identity Mapping—Identity mapping is a method of creating relationships between identity properties. Some Identity Providers may make use of identity mapping.
Identity Provider (IP)—Identity Provider is an entity that acts as an authentication service to end requestors. An Identity Provider also acts as a data origin authentication service to service providers (this is typically an extension of a security token service).
Message—A message is a complete unit of data available to be sent or received by services. It is a self-contained unit of information exchange. A message always contains a SOAP envelope, and may include additional MIME parts as specified in MTOM, and/or transport protocol headers.
Security Token Service—A security token service (STS) is a Web service that issues security tokens (see [WS-Security]). That is, it makes assertions based on evidence that it trusts, to whoever trusts it (or to specific recipients). To communicate trust, a service requires proof, such as a signature to prove knowledge of a security token or set of security tokens. A service itself can generate tokens or it can rely on a separate STS to issue a security token with its own trust statement (note that for some security token formats this can just be a re-issuance or co-signature). This forms the basis of trust brokering.
Signature—A signature is a value computed with a cryptographic algorithm and bound to data in such a way that intended recipients of the data can use the signature to verify that the data has not been altered and has originated from the signer of the message, providing message integrity and authentication. The signature can be computed and verified with either symmetric or asymmetric key algorithms.
Single Sign On (SSO)—Single Sign On is an optimization of the authentication sequence to remove the burden of repeating actions placed on the requestor. To facilitate SSO, an element called an Identity Provider can act as a proxy on a requestor's behalf to provide evidence of authentication events to 3rd parties requesting information about the requestor. These Identity Providers (IP) are trusted 3rd parties and need to be trusted both by the requestor (to maintain the requestor's identity information, as the loss of this information can result in the compromise of the requestor's identity) and by the Web services, which may grant access to valuable resources and information based upon the integrity of the identity information provided by the IP.
System—A collection of services implementing a particular functionality. Synonymous with Distributed Application.
Trust Domain—A Trust Domain is an administered security space in which the source and target of a request can determine and agree whether particular sets of credentials from a source satisfy the relevant security policies of the target. The target may defer the trust decision to a third party (if this has been established as part of the agreement) thus including the trusted third party in the Trust Domain.
An XML document may contain eleven types of information items. Below we list and define those allowed by SOAP and mention the others. The six information items allowed by SOAP are:
- Document: One document information item is present in each information set. It is used to reference all other information items.
- Element: One element information item is included in the information set for each XML element in the document. Access to all elements is provided by recursively following Child properties.
- Attribute: One attribute information item is included in the information set for each attribute in the document. Additional attribute information items are present for namespaces.
- Namespace: One namespace information item is contained within the information set for each namespace that is in scope for its parent element.
- Character: One character information item is included in the information set for each data character in the document.
- Comment: One comment information item is included in the information set for each comment in the document, except for those appearing in the DTD.
The five information items not allowed by SOAP but present in the original definition of XML Infoset are: Processing Instruction, Document Type Declaration, Unexpanded Entity Reference, Unparsed Entity, and Notation.
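As a rough illustration of the item types listed above, the sketch below walks a parsed document with Python's standard xml.dom.minidom module and tallies what it finds. It is an approximation made under several assumptions: a DOM tree is not the Infoset, namespace information items are not counted separately, and of the five excluded types only processing instructions and document type declarations are detected. The function names are hypothetical.

```python
from collections import Counter
from xml.dom import Node, minidom

# Excluded item types that this sketch is able to detect.
FORBIDDEN = {"processing instruction", "document type declaration"}


def count_information_items(xml_text: str) -> Counter:
    """Tally approximate Infoset item counts for one XML document."""
    doc = minidom.parseString(xml_text)
    counts = Counter({"document": 1})  # exactly one document item
    stack = list(doc.childNodes)
    while stack:
        node = stack.pop()
        kind = node.nodeType
        if kind == Node.ELEMENT_NODE:
            counts["element"] += 1
            # Namespace declarations also show up as attributes here.
            counts["attribute"] += len(node.attributes)
            stack.extend(node.childNodes)
        elif kind in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
            counts["character"] += len(node.data)  # one item per character
        elif kind == Node.COMMENT_NODE:
            counts["comment"] += 1
        elif kind == Node.PROCESSING_INSTRUCTION_NODE:
            counts["processing instruction"] += 1
        elif kind == Node.DOCUMENT_TYPE_NODE:
            counts["document type declaration"] += 1
    return counts


def soap_violations(counts: Counter) -> list:
    """Return the detected item types that SOAP does not allow."""
    return sorted(FORBIDDEN & set(counts))
```

Calling soap_violations on the counts for a message body gives a quick, non-authoritative check that nothing outside the six permitted item types slipped in.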
Attacks against distributed systems can be divided along several axes. They can be directed against one or more of the hosts in the system, or against the communication between them. Attacks can be intended to disrupt operations, obtain confidential information, or perform unauthorized actions within the system. They can attack the cryptographic and other security-focused techniques used in the system, or attempt to bypass them by attacking the systems and network layers below or the application layers above.
The following is a brief, non-exhaustive list of security attack classes, organized according to these axes, together with standard countermeasures for each:
- Attacks on hosts:
- Denial-of-service (DoS) attacks disrupt host operations by overwhelming their ability to respond.
When directed at the cryptographic layer, a DoS usually attempts to force the host to repeatedly perform the computationally expensive public-key operations needed for certain authentication or key exchange protocols. The typical defense against such attacks is to delay public-key operations until the legitimacy of the interlocutor can be verified by less expensive means, such as symmetric cryptography or "puzzles".
DoS attacks on the underlying network layer or the overarching application layer are very difficult to prevent, particularly if the attacker has massive resources at his disposal, and the traffic is indistinguishable from a "flash crowd" of legitimate traffic. Network infrastructure typically has to be deployed in a way that funnels traffic down to manageable levels.
- Host confidentiality or authorization attacks attempt to compromise privacy or identity.
These attacks may exploit vulnerabilities in the host software to gain control of the host. Proper security administration, such as installation of patches, firewall configuration, and reducing the privilege of exposed applications, is the usual countermeasure.
Another type of attack exploits weaknesses in the system or applications, such as incorrectly set policies or application logic errors, that allow for confidentiality or authorization compromises short of general host compromise. Proper security policy administration and careful application programming are the only defenses against such attacks.
"Spoofing" attacks are where an attacker attempts to obtain authorization for various actions by assuming the identity of a different, authorized party, and acting accordingly. Secure authentication protocols, properly used, can prevent spoofing, as long as both the host and the authorized party carefully guard the cryptographic secrets used for authentication.
- Attacks on communication:
- DoS attacks on the network attempt to disrupt communication with a service. Like those on the host network layer, these can only really be addressed using network infrastructure means.
- Attacks on the confidentiality of network communication attempt to compromise privacy on the wire.
Direct monitoring of cleartext communication can be prevented through use of encryption.
Cryptanalysis attacks can be made infeasible by sufficiently strong cryptographic algorithms, with sufficient key sizes.
- Attacks on the authorization of network communication attempt to compromise identity.
Message forgery attacks, in which the attacker attempts to inject messages into a conversation, and message alteration attacks, in which the attacker modifies the messages sent in a conversation, can be prevented with message security protocols that include message authentication.
Message replay attacks, in which the attacker injects previously sent (and hence correctly authenticated) messages into a conversation can be detected and addressed through sequence numbers, or the combination of timestamps and message caches.
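The last two countermeasures, message authentication and replay detection through timestamps plus a message cache, can be combined in a short sketch. It uses an HMAC over a shared symmetric key from Python's standard library rather than the XML Signature machinery that WS-Security actually relies on, and the message layout, identifiers, and Receiver class are hypothetical. Clock skew between sender and receiver is ignored.

```python
import hashlib
import hmac


def sign(key: bytes, msg_id: str, timestamp: int, body: bytes) -> bytes:
    """Authentication tag covering the id, the timestamp, and the body."""
    # Simplistic framing; a real protocol needs an unambiguous encoding.
    data = f"{msg_id}|{timestamp}|".encode() + body
    return hmac.new(key, data, hashlib.sha256).digest()


class Receiver:
    def __init__(self, key: bytes, max_age: int) -> None:
        self._key = key
        self._max_age = max_age  # seconds a message stays acceptable
        self._seen = {}          # message cache: msg_id -> timestamp

    def accept(self, msg_id, timestamp, body, tag, now) -> str:
        # 1. Forged or altered messages fail the authentication check.
        expected = sign(self._key, msg_id, timestamp, body)
        if not hmac.compare_digest(expected, tag):
            return "rejected: bad authentication tag"
        # 2. Old messages are refused, which bounds the cache size.
        if now - timestamp > self._max_age:
            return "rejected: too old"
        # 3. Forget entries the age check would reject anyway.
        self._seen = {
            m: t for m, t in self._seen.items() if now - t <= self._max_age
        }
        # 4. A correctly authenticated message seen before is a replay.
        if msg_id in self._seen:
            return "rejected: replay"
        self._seen[msg_id] = timestamp
        return "accepted"
```

Because the timestamp is covered by the tag, an attacker cannot refresh an old message without invalidating it, which is what lets the cache stay small.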
An Ethernet Address Resolution Protocol [RFC826]. David C. Plummer. November 1982. Internet Engineering Task Force.
HTML 4.01 Specification. Ed. Dave Raggett, et al. 24 December 1999. W3C.org.
Hypertext Transfer Protocol – HTTP/1.1 (RFC 2616). Ed. R. Fielding, et al. June 1999. The Internet Society.
The Kerberos Network Authentication Service (V5). J. Kohl and C. Neuman. September 1993. Internet Engineering Task Force.
Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. Ed. N. Freed, et al. November 1996. Internet Engineering Task Force.
Information technology – Multimedia Framework (MPEG-21) – Part 5: Rights Expression Language. International Organization for Standardization (ISO/IEC 21000-5:2004).
Universal Resource Identifiers in WWW (RFC 1630). Ed. T. Berners-Lee. June 1994. Internet Engineering Task Force.
Assertions and Protocol for the OASIS Security Assertion Markup Language (SAML) V1.1. Ed. Eve Maler, et al. 2 September 2003. OASIS-Open.org.
Simple Mail Transfer Protocol. Ed. J. Klensin. April 2001. Internet Engineering Task Force.
Transmission Control Protocol. Ed. Jon Postel. September 1981. Defense Advanced Research Projects Agency.
Internet Protocol. Ed. Jon Postel. September 1981. Defense Advanced Research Projects Agency.
The TLS Protocol. T. Dierks, et al. January 1999. The Internet Society.
User Datagram Protocol. J. Postel. August 1980. Internet Engineering Task Force.
Data Networks and Open System Communications Directory (ITU-T Recommendation X.509). June 1997. International Telecommunication Union.
XSL Transformations (XSLT) Version 2.0. Ed. Michael Kay. 12 November 2004. W3C.org.
XML Information Set (Second Edition). Ed. John Cowan, et al. 4 February 2004. W3C.org.
Extensible Markup Language (XML) 1.0 (Third Edition). Ed. Tim Bray, et al. 4 February 2004. W3C.org.
XML Encryption Syntax and Processing. Ed. Takeshi Imamura, et al. 10 December 2002. W3C.org.
XQuery 1.0: An XML Query Language. Ed. Scott Boag, et al. 23 July 2004. W3C.org.
XML-Schema Part 0: Primer. Ed. David Fallside. 2 May 2001. W3C.org.
XML-Schema Part 1: Structures. Ed. Henry Thompson, et al. 2 May 2001. W3C.org.
XML-Schema Part 2: Datatypes. Ed. Paul Biron, et al. 2 May 2001. W3C.org.
XML Signature Syntax and Processing. Ed. Donald Eastlake, et al. 12 February 2004. W3C.org.
Simple Object Access Protocol (SOAP) 1.1. Ed. Don Box, et al. 8 May 2000. W3C.org.
SOAP-over-UDP. Harold Combs, et al. September 2004. BEA, Lexmark, Microsoft, and Ricoh.
SOAP Message Transfer Optimization Mechanism. Ed. Noah Mendelsohn, et al. 8 July 2004. W3C.org.
UDDI Version 2.04 API Specification. Ed. Tom Bellwood. 19 July 2004. OASIS-Open.org.
Web Services Description Language (WSDL) 1.1. Ed. Erik Christensen, et al. 15 March 2001. W3C.org.
Web Services Addressing (WS-Addressing). Don Box, et al. August 2004. BEA, IBM, and Microsoft.
Web Services Atomic Transaction (WS-AtomicTransaction). Luis Felipe Cabrera, et al. September 2003. BEA, IBM, and Microsoft.
Web Services Business Activity Framework (WS-BusinessActivity). Luis Felipe Cabrera, et al. January 2004. BEA, IBM, and Microsoft.
Web Services Coordination (WS-Coordination). Luis Felipe Cabrera, et al. September 2003. BEA, IBM, and Microsoft.
Web Services Dynamic Discovery (WS-Discovery). John Beatty, et al. February 2004. Microsoft Corporation.
Web Services Enumeration (WS-Enumeration). Don Box, et al. September 2004. Microsoft Corporation.
Web Services Eventing (WS-Eventing). Luis Felipe Cabrera, et al. September 2004. BEA, Microsoft, and TIBCO.
Web Services Federation Language (WS-Federation). Siddharth Bajaj, et al. 8 July 2003. IBM, Microsoft, BEA, RSA Security, and VeriSign.
WS-Federation: Active Requestor Profile. Siddharth Bajaj, et al. 8 July 2003. IBM, Microsoft, BEA, RSA Security, and VeriSign.
WS-Federation: Passive Requestor Profile. Siddharth Bajaj, et al. 8 July 2003. IBM, Microsoft, BEA, RSA Security, and VeriSign.
Web Services Metadata Exchange (WS-MetadataExchange). Keith Ballinger, et al. March 2004. BEA, IBM, Microsoft, and SAP.
Web Services Policy Framework (WS-Policy). Don Box, et al. 3 September 2004. BEA, IBM, Microsoft, and SAP.
Web Services Policy Attachment (WS-PolicyAttachment). Don Box, et al. 3 September 2004. BEA, IBM, Microsoft, and SAP.
Web Services Reliable Messaging (WS-ReliableMessaging). Ruslan Bilorusets, et al. March 2004. BEA, IBM, Microsoft, and TIBCO.
Web Services Secure Conversation Language (WS-SecureConversation). Steve Anderson, et al. May 2004. BEA Systems, Inc., Computer Associates International, Inc., International Business Machines Corporation, Layer 7 Technologies, Microsoft Corporation, Netegrity, Inc., Oblix Inc., OpenNetwork Technologies Inc., Ping Identity Corporation, Reactivity Inc., RSA Security Inc., VeriSign Inc., and Westbridge Technology, Inc.
Web Services Security: SOAP Message Security (WS-Security). Ed. Anthony Nadalin, et al. March 2004. OASIS-Open.org.
Web Services Security Policy Language (WS-SecurityPolicy). Giovanni Della-Libera, et al. 18 December 2002. IBM, Microsoft, and VeriSign.
Web Services Security: Username Token Profile V1.0. Ed. Anthony Nadalin, et al. March 2004. OASIS-Open.org.
Web Services Security: X.509 Token Profile V1.0. Ed. Phillip Hallam-Baker, et al. March 2004. OASIS-Open.org.
Web Services Transfer (WS-Transfer). Ed. Don Box, et al. September 2004. Microsoft Corporation.
Web Services Trust Language (WS-Trust). Steve Anderson, et al. May 2004. BEA Systems, Inc., Computer Associates International, Inc., International Business Machines Corporation, Layer 7 Technologies, Microsoft Corporation, Netegrity, Inc., Oblix Inc., OpenNetwork Technologies Inc., Ping Identity Corporation, Reactivity Inc., RSA Security Inc., VeriSign Inc., and Westbridge Technology, Inc.
XML-binary Optimized Packaging (XOP). Ed. Noah Mendelsohn, et al. 8 June 2004. W3C.org.
Basic Profile Version 1.0. Ed. Keith Ballinger, et al. 16 April 2004. The Web Services-Interoperability Organization.
Basic Security Profile Version 1.0 (Working Group Draft). Ed. Abbie Barbir, et al. 12 May 2004. The Web Services-Interoperability Organization.
Devices Profile for Web Services. Shannon Chan, et al. May 2004. Microsoft Corporation.
North American Industry Classification System (NAICS). NAICS Association.
Transaction Processing: Concepts and Techniques. Jim Gray and Andreas Reuter. Morgan Kaufmann, 1993.