by Fernando Gebara Filho
Summary: This article shows how digital identities have evolved
over time, from the needs of a
single user running within a single computing platform to today’s world of
multiplatform identity federation and privacy concerns. It can help
infrastructure architects design effective identity infrastructures that are
not over-engineered, but instead based on the current and projected needs of the
organization.
Contents
A (Very) Brief (and Simplified) History of Identities
A Nontechnical View of the Digital-Identity Anatomy
The Managed and the Unmanaged User Spaces
Competencies, Commoditization, and Real Value
The Challenges of Identity Strategies
Conclusion
There was a time when very few people needed a digital identity.
They were specialized workers in a brand new profession, nowadays known as
Information Technology (IT). They used a very limited set of devices, accessing
a very limited set of applications. There was no online access; the only way to
make the machines understand what to do was to feed them tremendous quantities
of punched cards, loaded with machine-like language constructs. There were very
few bad guys in those days; the worst were the ones who bumped into you, so that all
of your carefully ordered punched cards ended up spread out over the floor.
This is not the reality today. We live in a globally connected
world; the vast majority of people who use the technologies that once were
reserved for scientists and hardcore engineers are now lay users—with little to
no understanding of all of the processes that are running inside their
computers. There are unknown numbers of very bad guys who want to spy on you in
order to learn the password to your bank account, so that they are free
to steal from you without the need for proximity. These malicious users can
live far away from our homes, but it takes just a few seconds for them to track
all of our activities. Some of them don’t even want our passwords; they want to
steal the processing power of our home computers to run complex hacking algorithms or
distributed attacks on carefully chosen targets on the Internet. And so
computer programmers must constantly work on digital-identity technologies in
order to keep up with this changing landscape.
A (Very) Brief (and Simplified) History of Identities
In the beginning...
… there was almost no interest in creating and managing
identities and their security contexts. Why? We lived in a world of mainframes
and mini-computers, submitting huge computational jobs through punched cards
and printing stacks and stacks of paper on mechanical printers (but only if we
were IT professionals or attending University classes at that time). Our
identity was nothing more than an identifier, determining who submitted the job
and who owned that big amount of paper (usually, printed on the first page of
the paper stack).
There was no security context at all in our identities. The user
name/password pair was even printed in the punched card set, so that there was
absolutely no secrecy involved. However, there was no need for it, especially
in the commercial/academic world; except for a few individuals, there was no
interest in stealing other people’s jobs (JCL jobs, that is). The only
necessary secrets were in the realm of military installations. Identities were
used only in the context of a single machine. If you wanted to use another
computer, another user name/password pair had to be created, and there was no
connection among the identities in the machines that you were allowed to use.
Basically, identities were not used to really identify you. Their
only purpose was to generate an identity under which a process was run and the
results could be sent to you. There was a very weak connection between you and
your digital identity.
With the advent of distributed computing, network logon became a
necessity, and technologies and protocols were created specifically to handle
that need. These evolved from the so-called workgroup
computers to the full domain-based central directory. Workgroup computers were
really a set of workstations with a “master” element that took care of
presenting the individual members as a cohesive entity. However, this was only
a view of the reality: You could enumerate workgroup members and resources but,
when trying to access one of them, you had to be registered in the local
identity database of the member that held the resource; and this workgroup
member was responsible for checking if the credentials you used were correct.
The workgroup concept evolved alongside the network. When file
and print servers became popular, they were also responsible for holding the
user identity database and running the algorithms that checked the presented
credentials’ validity. Initially, one server was enough for daily jobs, but as
quickly as we could spell “network,” the need for networked servers showed up
in our lives. This brought the challenge of presenting the user identity as a
unique entity among all of those computational resources: If I wanted to use a
printer, no matter which server held the printer queue, I had to identify
myself using my single set of network credentials and get the job done. In the
first years of the networked servers, a simple and effective (at that time)
artifact was used: identity database replication. All servers that were part of
a known and trusted set of servers replicated their user databases, effectively
implementing the concept of a single sign-on to network resources. Obviously,
this mechanism had its limitations. When dealing with a large number of
servers, replication delays and even inconsistencies were commonplace.
This may have been one of the first times when there was a clear
relationship between identities and the individual who held them, because the
same set of credentials (user name/password) was used to access a set of
network resources.
Then came the concept of the network domain. In it, a set of
workstations and servers are managed under a central credential database,
effectively allowing the creation of a common security context among all domain
network resources and processes. In the network domain model, not only users
but printers, workstations, and services are assigned a set of credentials that
allow the execution of processes and communications among them. A user’s
credentials will validate only on a workstation that is part of the same
domain.
This model also allowed the creation of trust relationships
between disparate domains. With it, users who are controlled by domain A can
access resources from domain B, if and only if domain B is set to trust the
credentials from domain A. This allowed for more flexible identity management,
because there was no need to replicate or clone identities from domain A to
domain B if the trust relationship had been previously established.
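The trust rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Domain` class and the `can_access` function are hypothetical names, the credential database is an in-memory dictionary, and passwords are kept in plain text purely to keep the sketch short. It does not represent how any particular directory product implements trusts.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """A network domain with its own central credential database (hypothetical)."""
    name: str
    # user name -> password; plain text only to keep the sketch short
    credentials: dict[str, str] = field(default_factory=dict)
    # names of the domains whose credentials this domain trusts
    trusted: set[str] = field(default_factory=set)

    def authenticate(self, user: str, password: str) -> bool:
        """Check the credentials against this domain's own database."""
        return self.credentials.get(user) == password


def can_access(resource_domain: Domain, home_domain: Domain,
               user: str, password: str) -> bool:
    """Decide whether a user of home_domain may use a resource in resource_domain."""
    # The user is always validated by the domain that controls the account.
    if not home_domain.authenticate(user, password):
        return False
    if home_domain.name == resource_domain.name:
        return True
    # Trust is one-way: the resource domain must explicitly trust the home domain.
    return home_domain.name in resource_domain.trusted


domain_a = Domain("A", credentials={"alice": "s3cret"})
domain_b = Domain("B", credentials={"bob": "hunter2"}, trusted={"A"})

print(can_access(domain_b, domain_a, "alice", "s3cret"))  # B trusts A
print(can_access(domain_a, domain_b, "bob", "hunter2"))   # A does not trust B
```

Note that no identity is copied from domain A to domain B: domain B only records that it trusts domain A, and domain A remains the sole place where the credentials are checked.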
Unfortunately, this model requires that all domains that are part
of the same trust mesh use the same set of technologies, making it very
difficult to share resources among loosely coupled directories or directories
from different technology platforms.
New sets of technologies were created and standardized to handle
the transmission of user identities among loosely coupled network domains. They
are collectively called identity-federation systems: A predefined,
cross-platform, standardized set of protocols designed exclusively to transmit
user security contexts to allow one network domain to share resources with
another network domain. These sets of standards-based protocols are friendly to
the Internet infrastructure, allowing the sharing of resources even in the
absence of dedicated network links.
As can be inferred from the preceding paragraphs, digital
identities had to evolve from a single pair of user name/password to a very
complex set of protocols that transport lots of user-related claims and
attributes.
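To make the idea of transporting claims between loosely coupled domains more concrete, here is a minimal sketch. It assumes that the two domains share a secret key and it uses an HMAC signature over a JSON payload; real federation protocols generally rely on standardized message formats and public-key signatures, so the token format and the function names `issue_token` and `verify_token` are hypothetical and for illustration only.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def issue_token(signing_key: bytes, claims: dict) -> str:
    """Identity-provider side: serialize the claims and sign them."""
    payload = base64.urlsafe_b64encode(
        json.dumps(claims, sort_keys=True).encode("utf-8")
    ).decode("ascii")
    signature = hmac.new(signing_key, payload.encode("ascii"),
                         hashlib.sha256).hexdigest()
    return payload + "." + signature


def verify_token(signing_key: bytes, token: str) -> Optional[dict]:
    """Resource side: return the claims only if the signature is valid."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(signing_key, payload.encode("utf-8"),
                        hashlib.sha256).hexdigest()
    # Compare as bytes in constant time; reject anything that does not match.
    if not hmac.compare_digest(expected.encode("ascii"),
                               signature.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


# Assumption: the key was exchanged when the federation was set up.
shared_key = b"key-agreed-between-the-two-domains"
token = issue_token(shared_key, {"sub": "user-42", "department": "sales"})
print(verify_token(shared_key, token))
```

The receiving domain never sees the user's password. It only receives claims about the user and decides whether to accept them based on who signed them.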
A Nontechnical View of the Digital-Identity Anatomy
Because of this evolution of the use and sharing of identities, a
new understanding of digital identities was needed. In a simplified,
nontechnical view, a digital identity can be seen as a set of at least four
layers (see Figure 1):
Figure 1. Layered view of a digital identity
1. Identifier
2. Credentials
3. Main profile
4. Context-based profile
In this model, the innermost layer is the unique identifier of
this digital identity. There should be no identical unique identifiers in the
same security domain. Users must have unique identifiers to be easily
recognizable by any system to which they have access.
To have access to this identifier, the user must present to the
security system a set of credentials that will be checked against the computing
secrets stored in the identity database. This is the nearest that we can get in
terms of proof of possession. A user is entitled to access their own unique
identifier if and only if that user presents the correct set of credentials to
unlock this magic number.
After the user unlocks the unique identifier, a set of common
attributes (called the main profile) is accessible to a system that may require
a little more knowledge than the unique identifier itself. This can be, for
example, the user name, department, Social Security number, company name, and
so on. These attributes do not vary from system to system; they are the same
wherever the user logs on. They are a fixed set of values that are tied to the
user during the lifetime of the logon session.
But not all systems need to share information. There are sets of
information that are only meaningful in the context of a system or related
systems. For example, frequent-flier miles are meaningful only under the
context of an airline carrier, losing most of their semantic meaning if they
are moved from one carrier to another. However, they share the same semantic
context when they are used by the mileage program associates (restaurants,
hotels, credit cards, and so on). These sets of attributes are stored in the
context-based profile.
One digital identity can have only one unique identifier, a
limited set of credentials (user name/password pair, digital certificate/PIN
pair, biometric data), a unique set of main profile attributes, and an
unlimited set of context-based profile attributes.
The separation of the unique identifier from the set of
credentials that are used to access it allows us to evolve the security
mechanisms without the need to create a different ID for the same user over
time. It makes a unique identifier last for the lifetime of the individual who
is the subject of the identity. This is also the basis of the current
identity-federation systems: No matter what protocols and credentials are used
to unlock the unique identifier, the same value will be presented any time that
it is required.
The separation of the unique identifier from the multiple sets of
profile attributes also helps us evolve the semantic and syntactic meaning of
these attributes without affecting the user ID. One can use binary or
XML-encoded data to store and/or present user-profile attributes without
interfering with the now-stable unique identifier.
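The four layers and the separation between them can be sketched as a simple data structure. This is an illustration only: the `DigitalIdentity` class, its field names, and the use of a random UUID as the identifier are assumptions made for the example, and the secrets are stored in plain text solely to keep it short.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DigitalIdentity:
    """Four layers of a digital identity, innermost first (hypothetical model)."""
    # Layer 1: exactly one stable identifier (a random UUID is an assumption).
    identifier: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Layer 2: a limited set of credentials, keyed by kind.
    credentials: dict[str, str] = field(default_factory=dict)
    # Layer 3: attributes that are the same on every system.
    main_profile: dict[str, str] = field(default_factory=dict)
    # Layer 4: attributes that are meaningful only within one context.
    context_profiles: dict[str, dict[str, str]] = field(default_factory=dict)

    def unlock(self, kind: str, secret: str) -> Optional[str]:
        """Return the identifier only if the presented credential matches."""
        stored = self.credentials.get(kind)
        if stored is not None and stored == secret:
            return self.identifier
        return None


user = DigitalIdentity(
    credentials={"password": "old-secret"},
    main_profile={"name": "Alice", "department": "Sales"},
    context_profiles={"airline": {"frequent_flier_miles": "12000"}},
)
original_id = user.identifier

# Evolve the credentials layer: replace the password, add a certificate PIN.
user.credentials["password"] = "new-secret"
user.credentials["certificate_pin"] = "4321"

# Evolve a context-based profile without touching the identifier.
user.context_profiles["airline"]["tier"] = "gold"

print(user.unlock("password", "new-secret") == original_id)
```

Whichever credential is presented, and however the profiles change, the value returned by `unlock` stays the same, which is the property the two preceding paragraphs rely on.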
The Managed and the Unmanaged User Spaces
Today, digital identities fall into two different, and sometimes
incompatible, worlds: the managed and the unmanaged user spaces. What are the
differences between them, and how can we build an effective digital-identity
strategy?
Managed users are the ones to whom you can apply corporate
security policies during the lifetime of their interactions with your network
resources. They are full-time and part-time employees, vendors, trainees, and
so on. They normally have signed a labor (or equivalent) contract with your
company, and so they are subject to all corporate policies regarding working
hours, safe Internet browsing, workstation and laptop security procedures, and
so forth. The internal IT staff controls every device that they use and is able
to block access to any network resource when it detects misbehavior. Also,
the managed users are subject to full auditing on their activities and are not
the owners of the data that they store on local and network storage.
Unmanaged users live in a different world and have completely
different security and privacy requirements. The unmanaged user is your
customer or your business partner. It can even be you: a managed user in your
company’s internal network, but a customer (unmanaged user) at an online
bookseller site. Unmanaged users are the owners of the information that they
produce, including their financial data, and your IT staff can take only limited
action, except in the case of misuse of the resources or fraudulent actions.
Essentially, they use your systems when they are connected to the Internet and
do not have direct access to the resources in your corporate network. They can
imagine what your network looks like, but you have to create and manage all
necessary resources (or views) that hide the internal network’s topology and
functional specifications from their eyes.
Unmanaged users pose a challenge that managed users do not:
privacy. Privacy issues are difficult to handle, and
exposure of user data to theft can put your company’s reputation at risk. These
challenges resulted in the proposal of Kim Cameron’s seven laws of identity
(see http://www.identityblog.com/stories/2005/05/13/TheLawsOfIdentity.pdf
for more information):
Law #1 User Control and Consent
Technical identity systems must only reveal information
identifying a user with the user’s consent.
Law #2 Minimal Disclosure for a Constrained Use
The solution that discloses the least amount of identifying
information and best limits its use is the most stable long-term solution.
Law #3 Justifiable Parties
Digital-identity systems must be designed so that the disclosure
of identifying information is limited to parties that have a necessary and
justifiable place in a given identity relationship.
Law #4 Directed Identity
A universal-identity system must support both “omnidirectional”
identifiers for use by public entities and “unidirectional” identifiers for use
by private entities—thus, facilitating discovery while preventing unnecessary
release of correlation handles.
Law #5 Pluralism of Operators and Technologies
A universal-identity system must channel and enable the
interworking of multiple identity technologies run by multiple identity
providers.
Law #6 Human Integration
The universal-identity metasystem must define the human user to
be a component of the distributed system integrated through unambiguous
human-machine communication mechanisms offering protection against identity
attacks.
Law #7 Consistent Experience Across Contexts
The unifying identity metasystem must guarantee its users a
simple and consistent experience, while enabling separation of contexts through
multiple operators and technologies.
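As an illustration of Law #4, one possible way to produce “unidirectional” identifiers is to derive a different identifier for each relying party from a key that only the identity provider holds. The laws do not prescribe this technique; the HMAC-based derivation, the function name `unidirectional_id`, and the example site names below are all assumptions made for the sketch.

```python
import hashlib
import hmac


def unidirectional_id(provider_key: bytes, user_id: str, relying_party: str) -> str:
    """Derive an identifier that is stable for one (user, relying party) pair.

    Two relying parties receive different values for the same user, so the
    identifiers cannot be used as a correlation handle between them.
    """
    message = f"{user_id}\x00{relying_party}".encode("utf-8")
    return hmac.new(provider_key, message, hashlib.sha256).hexdigest()


# Assumption: this key is known only to the identity provider.
provider_key = b"identity-provider-secret"

at_bookseller = unidirectional_id(provider_key, "user-42", "bookseller.example")
at_airline = unidirectional_id(provider_key, "user-42", "airline.example")

print(at_bookseller != at_airline)  # different sites see different identifiers
print(at_bookseller == unidirectional_id(provider_key, "user-42", "bookseller.example"))
```

An “omnidirectional” identifier, by contrast, would simply be one public value that every party sees, which is appropriate for public entities that want to be discovered.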
Competencies, Commoditization, and Real Value
Faced with so many choices and challenges in identity management,
some users are asking: Should this be at the core of my IT services or should I
delegate all of these activities to a third party? This question leads to a
deeper business question: What is the value of all of these identities for my
company? How much time/money do I spend with identity management, and what is
the corresponding return on investment? In the world of Software + Services and
Software as a Service, how do I create value from identity management? Is there
value in outsourcing it? We must choose solutions to these problems carefully,
because (as expected) they are highly dependent on each company’s business
model and current technology infrastructure.
When drilling down into the business justifications and current
technology trends, one can reach the following conclusions:
1. Identity management is not a core business activity for the company.
2. There is no value that can be obtained from identity management.
3. Identity management is still a nontrivial technology, so that specific
competencies must be nurtured inside the IT department.
4. Identities are becoming a commoditized IT function.
You have to understand how digital identities are being used by
your systems and what kind of resources you are providing, allowing, or denying
access to. When protecting internal network resources, it makes sense to retain
the identity-management competency in-house, mainly because it is being used to
build and maintain strict access-control lists (ACLs) and is generating tons of
logged data that will be used later when required by law or by internal
auditing procedures. On the other hand, customer access to internal resources
is rare, and you can get much more value from a consistent and updated
user-profile and historical interaction data than from the identity-management
activities. Customers usually have access only to application-generated views
of data and business logic, and do not interact directly with internal IT data
or equipment.
There is another aspect when dealing with customer-related data.
Some countries restrict the storage of their citizens’ data to in-country
hosted servers, demanding that you build and manage a local server
infrastructure to handle that data and its associated tasks.
This brings us back to the digital-identity anatomy presented
earlier. Companies can extract more value from the information contained in the
main and context-based profiles than from the identifier and credentials
layers. Only companies that are dedicated to the business of generating and
managing digital identities (like digital-certificate issuers) generate value
from the two innermost layers. The handling of the processes of authentication
is being commoditized, with little value coming from them. The value lies in
the information that can be generated from users, and not the processes that
authenticate the user.
The Challenges of Identity Strategies
If you are building a new digital-identity strategy for your
company, first think about the audiences with which you will be dealing and the
infrastructure that your company will be using.
There are two audiences you will be exposed to: internal users
and external customers. You cannot get rid of your internal users: They will
be the ones who will help the company be successful, the ones who will create
value for the business. They are part-time and
full-time employees, vendors, and trainees. Depending on your business, you
might not have a significant number of external online customers. Customers
will always exist, but they might not always be represented by digital
identities and might not be worth tracking online. However, if you are part of
the IT team of an online bookseller, for example, you will have to think about
how you will handle the IDs for external customers.
For internal users, you might build an identity-management
infrastructure to handle all authentication, authorization, and internal network
resource accesses, providing the correct auditing functions for future behavior
and fraud analysis. However, if your company is a start-up and wants to
minimize the costs of building and maintaining its own infrastructure (and if
the regulations allow), you will probably use cloud services to host your
computing needs. If so, use either state-of-the-art identity-management systems
that are able to federate with external systems or identity cloud services
(like Windows Live ID) to handle all of your internal users’ needs.
For external users, I tend to recommend the use of identity cloud
services for managing IDs. As explained earlier, the value lies in the main and
context-based profiles, and not in the identifier and credential layers. If you
managed these IDs yourself, you would have to build an infrastructure big enough
to handle current and future customer audiences, which can lead to a large
infrastructure that is dedicated to customer identity management. Also, think
about the benefits of attracting more users, because you are using a system that
already has millions of preconfigured users who will not have to register a new
set of credentials and will not have to learn new procedures for authentication.
Conclusion
Digital identities and their use are still evolving, based on the
evolution of online services that are provided by the ubiquity of the Internet.
Identity management is no longer merely a set of procedures for authentication,
authorization, and provisioning of user accounts. If you understand how a
digital identity works and how the protocols are evolving over time, you will
be able to build a much stronger and longer-lasting identity architecture for
your company. Today, building a new identity architecture means working with
systems and protocols that allow identity federation—further enhancing the use
of your internal and/or external identities.
About the author
Fernando Gebara Filho is an infrastructure architect in the
Development and Platform team at Microsoft Brazil. He joined Microsoft in 1999
as an infrastructure consultant at the local Microsoft Consulting Services
team, where he specialized in Active Directory and Microsoft Exchange projects.
In 2004, Fernando joined the Development and Platform group in the
infrastructure-architecture role, where he had a chance to get more involved in
the identity-management space and get a more agnostic view of the subject.
This article was published in the Architecture Journal, a print
and online publication produced by Microsoft. For more articles from this
publication, please visit the Architecture
Journal Web site.