Microsoft Tahoe
First Look
Document Management and Much More
By Tom Rizzo
You may have already heard something about Microsoft's new product, now
code-named Tahoe. Tahoe is designed to make generating and managing information
easier, while providing a rich development environment for collaborative
applications. In this article, we'll look at what Tahoe is and is not, what its
features are, and how a developer can build rich collaborative applications on
the Tahoe platform.
When we set out to build Tahoe, we had a number of clear goals in mind, driven
by the needs of our customers and partners. Customers tell us that it's difficult to
locate information. With the explosion of intranets, file shares, and the
Internet, information is plentiful, but finding the desired information can be
difficult. Novice end users and experienced knowledge workers alike find it
hard to organize and manage information. Many customers tell us that their
current knowledge systems are either tricky to set up or difficult to manage. We
also had a particular request for a departmental collaboration server that
didn't call for the stringent infrastructure requirements of Microsoft Exchange
2000.
Microsoft Tahoe meets these goals by providing a simple and powerful way to access
corporate and Internet information. Tahoe also provides a mainstream
out-of-the-box document management system that integrates with the tools
knowledge workers use to create information, including Microsoft Office, Web
browsers, and the new Web Folders in Microsoft Office 2000. Finally, Tahoe
ships a departmental version of the Web Storage System from Microsoft Exchange
2000 that allows developers to build solutions that can be rolled out
departmentally without requiring modifications to the Active Directory.
Moreover, Tahoe doesn't require the Active Directory to be deployed. Tahoe can also work
with Windows NT 4.0 domains. Tahoe provides all the collaborative power of the
Web Storage System without incurring the infrastructural overhead of rolling
out Active Directory across an entire organization. There are some limitations
(explained at the end of this article) if Active Directory isn't employed. If
those limitations are a concern, you can always build a solution on Exchange,
and then integrate Tahoe features to get around them. We will.
Document Management Overview
The foremost feature of Microsoft Tahoe is that it provides out-of-the-box
document management. Yet Tahoe targets a different market than most document
management systems. Tahoe has 80 to 90 percent of the functionality of other
document management systems. Tahoe's edge is that most knowledge workers don't
need high-end document management to manage information effectively. Rather
than requiring complex infrastructure and burdensome client-side requirements,
Tahoe is easy to use and deploy. You won't
see high-end features such as replicated document management with replicated
document lock management. If you need this type of high-end functionality, work
with one of the Microsoft partners that provides these capabilities.
Let's see how Tahoe simplifies document management by examining the features of its
document management system, including check-in/check-out, linear versioning,
document profiling, role-based security, and publishing/approval routes.
Check-in/check-out. Tahoe allows users to check out documents, which prevents
other users from editing those documents. When users are done with a document,
they can check the document back in.
Linear versioning. Tahoe also provides linear versioning support. When a user
checks a document out,
Tahoe will "version" that document so the user can roll back to a previous
version. This provides a backup in case the changes that were incorporated into
a later version are no longer wanted. This version history is maintained
through check-in and check-out. When a user checks a document back in, a "dot
version", such as version 1.1, is created. When a document is marked as
published, a major version is made. For example, if you publish version 1.1 of
your document, the published version is 2.0. The user can go back to a previous
document version at any time. FIGURE 1 shows the document version interface.
FIGURE 1: Tahoe document version interface.
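The numbering rules described above can be modeled in a few lines. The following Python sketch is illustrative only: the `Document` class and its methods are hypothetical names, not part of the Tahoe object model, and it assumes a new document starts at version 1.0, which this article does not state.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Document:
    """Hypothetical model of a linearly versioned document (not a Tahoe API)."""

    name: str
    major: int = 1  # assumption: a new document starts at version 1.0
    minor: int = 0
    checked_out_by: Optional[str] = None
    history: List[str] = field(default_factory=list)

    @property
    def version(self) -> str:
        return f"{self.major}.{self.minor}"

    def check_out(self, user: str) -> None:
        # Only one user may hold the document at a time.
        if self.checked_out_by is not None:
            raise PermissionError(
                f"{self.name} is checked out by {self.checked_out_by}"
            )
        self.checked_out_by = user

    def check_in(self, user: str) -> str:
        if self.checked_out_by != user:
            raise PermissionError(f"{user} does not hold {self.name}")
        self.history.append(self.version)  # remember the old version for rollback
        self.minor += 1                    # a check-in creates a "dot version"
        self.checked_out_by = None
        return self.version

    def publish(self) -> str:
        self.history.append(self.version)
        self.major += 1                    # publishing creates a major version
        self.minor = 0
        return self.version
```

Under these assumptions, checking a new document out and back in yields version 1.1, and publishing it yields 2.0, which matches the example in the text.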
Document profiling. Tahoe lets users fill out document profiles, which are the
metadata for the document.
Examples of document profile properties can include author, category, or a
customized property. There can be multiple document profiles. Knowledge
coordinators can set up these profiles and make them mandatory for document
check-in. FIGURE 2 shows the interface for setting the document profiles
available in a folder. FIGURE 3 shows filling in a document profile when
checking in a Tahoe document through Office.
FIGURE 2: The interface for setting document profiles in a folder.
FIGURE 3: Filling in a document profile when checking in a Tahoe document.
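To make the idea of a mandatory profile concrete, here is a minimal Python sketch of the kind of check a check-in step might perform. The function names and the property names are hypothetical; the article doesn't describe how Tahoe implements this validation.

```python
from typing import Dict, Iterable, List


def missing_profile_fields(required: Iterable[str], profile: Dict[str, str]) -> List[str]:
    """Return the required properties that are absent or blank in the profile."""
    return [name for name in required if not profile.get(name, "").strip()]


def check_in_allowed(required: Iterable[str], profile: Dict[str, str]) -> bool:
    """A check-in passes only when every mandatory property is filled in."""
    return not missing_profile_fields(required, profile)
```

A coordinator who marks author and category as mandatory would, in this model, block the check-in of any document whose profile leaves either one empty.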
Role-based security. The role-based security in Tahoe simplifies
administration. Instead of displaying
the familiar ACL editor user interface to end users, Tahoe exposes a role-based
security system in which users can fall into three primary roles: Reader,
Author, and Coordinator. A Reader only has permissions to read the documents in
the folder. An Author can create and edit documents. A Coordinator can read,
write, edit, set permissions, and set document profiles for the folder.
Beyond folder-level access, Tahoe provides an easy way for users to deny read access
to users at the document level as well. These roles are mapped back to Windows
security settings by the Tahoe system. FIGURE 4 shows the user interface for
setting role-based security.
FIGURE 4: User interface for setting role-based security.
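The three roles and the document-level deny can be pictured as a small lookup table. This Python sketch is only a model of the behavior described above: the action names are invented for illustration, it assumes an Author can also read, and it says nothing about how Tahoe actually maps roles to Windows security settings.

```python
from enum import Enum
from typing import Dict, FrozenSet


class Role(Enum):
    READER = "Reader"
    AUTHOR = "Author"
    COORDINATOR = "Coordinator"


# Action names are illustrative. Assumption: an Author can also read.
ROLE_ACTIONS: Dict[Role, FrozenSet[str]] = {
    Role.READER: frozenset({"read"}),
    Role.AUTHOR: frozenset({"read", "create", "edit"}),
    Role.COORDINATOR: frozenset(
        {"read", "create", "edit", "set_permissions", "set_profiles"}
    ),
}


def is_allowed(
    user: str,
    action: str,
    folder_roles: Dict[str, Role],
    denied_readers: FrozenSet[str] = frozenset(),
) -> bool:
    """Check a folder-level role, then apply the document-level read deny."""
    if action == "read" and user in denied_readers:
        return False
    role = folder_roles.get(user)
    return role is not None and action in ROLE_ACTIONS[role]
```

Keeping the deny list separate from the role table mirrors the text: roles are assigned per folder, while read access is denied per document.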
Publishing and approval routing capabilities. Publishing turns the document
into a major version. If approval is set up, publishing will trigger an
approval route before officially publishing the document. The approval route
can be a serial route, which sends the document to one approver after another,
or a parallel route, which sends the document to all approvers at once. A
parallel route can be configured to grant final approval only if all approvers
approve, or as soon as at least one approves. FIGURE 5 shows the interface for
setting up document routing and approval.
FIGURE 5: The interface for setting up document routing and approval.
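The difference between the routing options is easiest to see side by side. The Python sketch below is a simplified model, not Tahoe code: the `Route` names and the `decide` callback are hypothetical, and it assumes a serial route stops at the first rejection, which the article does not spell out.

```python
from enum import Enum
from typing import Callable, Sequence


class Route(Enum):
    SERIAL = "serial"              # one approver after another
    PARALLEL_ALL = "parallel_all"  # everyone at once, all must approve
    PARALLEL_ANY = "parallel_any"  # everyone at once, one approval is enough


def run_approval(
    route: Route,
    approvers: Sequence[str],
    decide: Callable[[str], bool],
) -> bool:
    """Return True if the document may be published under the given route."""
    if route is Route.SERIAL:
        for approver in approvers:
            # Assumption: a rejection ends the route; later approvers are not asked.
            if not decide(approver):
                return False
        return True

    # Parallel routes collect every decision before evaluating the result.
    decisions = [decide(approver) for approver in approvers]
    if route is Route.PARALLEL_ALL:
        return all(decisions)
    return any(decisions)
```

With one rejecting approver in a group of three, only the "at least one approves" parallel route would let the document through.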
Ubiquitous Client Access
Tahoe simplifies document management by integrating with products people know and
use. Office 2000 is a key client of Microsoft Tahoe. Tahoe extends Office 2000
menus with document management options such as check-in, check-out, and
publishing. With these extensions, you can take advantage of the document
management features of Tahoe directly from Office.
Tahoe also integrates with the Web Folders feature of Microsoft Office and
Microsoft Windows. Through the Web Folders interface, which ultimately uses
WebDAV (Web-based Distributed Authoring and Versioning; a set of HTTP
extensions) to communicate with the server, you can browse and perform
operations on your Tahoe information.
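For readers who haven't seen WebDAV on the wire, browsing a folder comes down to an HTTP request with the PROPFIND method. The Python sketch below only builds such a request using the standard library; the server name and folder path are made up, and which properties a Tahoe server actually returns is not covered in this article.

```python
import urllib.request

# Standard WebDAV request body asking for two common properties of each item.
PROPFIND_BODY = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<D:propfind xmlns:D="DAV:">'
    b"<D:prop><D:displayname/><D:getlastmodified/></D:prop>"
    b"</D:propfind>"
)


def build_propfind(folder_url: str) -> urllib.request.Request:
    """Build (but do not send) a PROPFIND request that lists a folder."""
    return urllib.request.Request(
        folder_url,
        data=PROPFIND_BODY,
        method="PROPFIND",
        headers={
            "Depth": "1",  # the folder itself plus its immediate children
            "Content-Type": 'text/xml; charset="utf-8"',
        },
    )


# Hypothetical server and folder names, used for illustration only.
request = build_propfind("http://tahoe.example.com/workspace/Documents/")
```

Passing the request to `urllib.request.urlopen` would send it; a WebDAV server normally answers with a 207 Multi-Status response whose XML body describes each item in the folder.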
Tahoe extends the Windows Explorer to provide richer views and semantics for the
document management capabilities. For example, you can see a Tahoe view in the
Windows Explorer, shown in FIGURE 6. This provides information about the
document on the left, while the list view on the right has the custom document
profile information. This extension and integration with Windows Explorer makes
it easier for users to find the information they need.
FIGURE 6: Viewing Tahoe in Windows Explorer.
The final way you can access Tahoe is through standard Web browsers. Tahoe provides
an out-of-the-box portal site that can incorporate not only Tahoe data, but
also data from business applications. The Tahoe portal is built using Web Parts
and the new Digital Dashboard 2.0 framework. (Web Parts are reusable components that wrap Web-based content such as
XML, HTML, and scripts with a standard property schema that controls how the
Web Parts are rendered in a digital dashboard. The Web Part SDK is available
at: http://www.microsoft.com/DirectAccess/Products/SBS/CRK/files/digital_dashboard/CD/webparts.htm.)
This means that any of the Web Parts you build can be integrated into the Tahoe
portal. Tahoe exposes its functionality, such as search and subscriptions, as
Web Parts. You can take these Web Parts and integrate them into the dashboards
that you build. All of the document management features of Tahoe are displayed
through standard browsers, such as Internet Explorer and Netscape Navigator. By
integrating Digital Dashboard, the Tahoe portal provides an extensible platform
(via Web Parts) for you as a developer. FIGURE 7 shows the Tahoe portal.
FIGURE 7: The Tahoe portal.
Extensive Search Features
Beyond document management and portal capabilities, Tahoe provides rich content
indexing and search capabilities as well. Tahoe can "crawl" Office documents,
intranet and Internet Web sites, file shares, Exchange 5.5 and 2000 servers,
other Tahoe servers, and Lotus Notes servers. Unlike the content indexing
included directly in Exchange 2000, which can only support Exchange 2000 data
sources in a single index, Tahoe can crawl these multiple data sources and
store the results in a single index. This means you can search all of these
data sources with a single query. And Tahoe provides integrated security, so
users see only the results they are permitted to see.
Tahoe also provides a saved-query search engine. This engine matches documents
against saved queries as documents are indexed. This makes for a faster search
since Tahoe is smart about matching the documents to queries that users have
saved. This allows users to subscribe to a query, for example, "whenever Tom
Rizzo is the author, notify me." Then, when a new document meets those
criteria, the engine notifies the user through the portal or by email.
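The key idea is that matching happens when a document is indexed, rather than by re-running every saved query on a schedule. Here is a minimal Python sketch of that flow; the `SavedQuery` class, the exact-match comparison, and the property names are assumptions made for illustration and do not reflect Tahoe's internal design.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class SavedQuery:
    owner: str                # the user to notify
    criteria: Dict[str, str]  # property name -> required value


def matches(query: SavedQuery, properties: Dict[str, str]) -> bool:
    """True when the document satisfies every criterion (case-insensitive)."""
    return all(
        properties.get(name, "").lower() == value.lower()
        for name, value in query.criteria.items()
    )


def on_document_indexed(
    properties: Dict[str, str], saved_queries: Iterable[SavedQuery]
) -> List[str]:
    """Return the owners whose saved queries match the newly indexed document."""
    return [query.owner for query in saved_queries if matches(query, properties)]
```

In this model, a subscription such as "whenever Tom Rizzo is the author, notify me" is a `SavedQuery` with the single criterion `author = Tom Rizzo`, and the returned owners are the users who would receive a portal or email notification.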
Developers can take advantage of content indexing through familiar, standard
APIs. For example, if content indexing is enabled and you perform an ADO query
against a source that is content-indexed, Tahoe will leverage that index to
provide the best results. Queries can also change the relevance ranking of the
returned result sets by weighting attributes specified in the query. For
example, if you searched for "Web Storage System" against all the documents in
a Tahoe index, you may want to rank documents that also include "Exchange 2000"
as more relevant. With Tahoe search, you can provide these capabilities in your
applications.
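The ranking idea in that example can be shown without any Tahoe-specific query syntax. The Python sketch below is a toy stand-in for a real index: the `rank` function, the base score of 1.0, and the boost weights are all invented for illustration and are not how Tahoe computes relevance.

```python
from typing import Dict, List, Tuple


def rank(
    documents: Dict[str, str],
    required: str,
    boosts: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs for documents containing the required phrase.

    Every match starts at 1.0; each boost phrase found adds its weight.
    """
    results = []
    for name, text in documents.items():
        lowered = text.lower()
        if required.lower() not in lowered:
            continue  # not a match at all
        score = 1.0 + sum(
            weight for phrase, weight in boosts.items() if phrase.lower() in lowered
        )
        results.append((name, score))
    # Highest score first; ties are broken by name so the order is stable.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results
```

Calling `rank(docs, "Web Storage System", {"Exchange 2000": 0.5})` would list documents that mention both phrases ahead of those that mention only the first.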
Tahoe search is enterprise-ready. You can deploy multiple indexing or search servers
across the organization to offload the indexing and user queries to dedicated
boxes. Tahoe can support full and incremental crawls of data. For some data
sources such as Tahoe data sources, you can set up a document-change
notification for the indexing engine, so a document is indexed immediately if
it changes.
Departmental Web Storage System
One of the great things about Tahoe for developers is that the product is built on the
Web Storage System. For those of you who have not heard of the Web Storage
System, this is the enhanced collaborative technology in Exchange 2000, which
added ADO/OLEDB support, WebDAV, Installable File System, server events, and
workflow to Exchange. (For a detailed introduction to the Web Storage System,
see "Exchange 2000 & Web Storage System: Getting to Your Semi-structured
Data" by Alex Gomez, http://www.officevba.com/features/2000/10/vba200010ag_f/vba200010ag_f.asp.)
Since Tahoe builds on this technology, all of this support is included in the
product. And since Tahoe doesn't require the infrastructure needed for
Exchange, Web Storage System solutions can be deployed departmentally, or in
enterprises that don't have an Exchange environment. This is a great win for
developers who want to build rich collaborative applications, but don't want to
install Exchange 2000.
Tahoe ships with, and extends, the familiar Microsoft Exchange object models such as
Collaboration Data Objects (CDO) for Exchange 2000. Tahoe adds document
management capabilities to CDO such as check-in/check-out, version control, and
easier schema manipulation.
Some Limitations
Tahoe does have some limitations, the main ones having to do with document
management. As mentioned earlier, Tahoe isn't targeted at companies with
high-end document management needs; its intended users are departmental or
small organization customers. This doesn't mean you can't use multiple Tahoe
servers in a large organization to meet its document management needs. Search
and content indexing - as well as the portal - are enterprise-ready.
Conclusion
Whew! That was just a quick overview of the capabilities of Microsoft Tahoe.
Tahoe does have other capabilities that are beyond the scope of this article.
As Tahoe does not yet have an official name, it does not yet have a dedicated
Web site. Keep an eye on the Microsoft Web site at http://www.microsoft.com
for updates, and download the beta of Tahoe now to give it a try. I guarantee
you will like what you see.
Tom Rizzo is a product manager at Microsoft. He focuses on helping developers
understand and leverage the new Web Storage System. Tom is also the author of
two Microsoft Press books: Programming Microsoft Outlook and Exchange, 1st and
2nd editions (March 1999, June 2000). You can reach Tom at thomriz@microsoft.com.