Office

Relive the Moment by Searching Your IM Logs with Custom Research Services

John R. Durant

This article discusses:

  • Using Research Services to search IMs
  • Storing and indexing IMs
  • The Research Services Development Extras extensions
  • Fast, easy Research Service development with Visual Studio

This article uses the following technologies:
Office 2003, SQL, Visual Basic

Code download available at: ResearchServices.exe (158 KB)

Contents

Building the Solution
The Windows Service
Querying via the Research Service
Modifications to the Solution
Conclusion

If you are like me, you communicate regularly via instant messaging (IM). Over time, my IM conversations have gone from tentative trials to vital missives that are an indispensable part of my day-to-day work. Often, my IM conversations contain important information that I want to keep and possibly reuse. Fortunately, MSN® Messenger 6.2 has a feature to keep a conversation history permanently in XML format.

This article shows you how to leverage that conversation history further by consolidating IM exchanges so they are indexed, searchable, and ultimately reusable using the Microsoft® Office 2003 Research and Reference task pane. For example, when writing in Microsoft Word or using Word as your e-mail editor, you can quickly search your conversation history for messages that contain a term you have typed in your document. Using smart tag technology in the task pane allows you to insert the text of IM conversations directly into your document.

One of the strengths of this solution is that it takes advantage of the existing capabilities of both MSN Messenger and Office 2003. For the Office side of things, the solution uses the Research task pane, which is a new feature in Office 2003. The Research task pane allows users to quickly locate and use the information that they need without leaving the Office application that they are working in. It is available in the 2003 versions of Word, Excel, Outlook®, PowerPoint®, Publisher, and Visio®. You can also see the pane when working in Microsoft Internet Explorer. Because the Research feature interface is tightly integrated with these Office 2003 applications, it can both extract information from the Office application and insert content into it. In the solution we'll explore here, you will see that users can search the store of IM conversations from the Research task pane and insert conversation text into the document they are modifying.

Interacting with your solution using smart tag actions and the Research task pane is actually only a part of the overall solution. What sets the sample for this article apart is that it uses a scalable framework that allows for full-text indexing and greater extensibility. While you could wire up the Research task pane directly to the folder where MSN Messenger stores your conversation history, this has some distinct disadvantages. For example, MSN Messenger stores the conversation history in simple XML files, one file per contact. In order to search in these files, you would have to use an XML parser to sift through the files looking for the search term. With only a few short files, this is not a problem, but as the number of files grows and as their content lengthens, such an architecture yields ever worsening performance. Also, it makes your IM content available only as long as the files exist. What if the machine becomes disabled for some reason? The information does not instantly become useless; its value endures. These and other reasons led me to design a solution with a simple framework for central IM conversation management.

Building the Solution

Think of this framework in terms of a few main processes: data collection via conversation harvesting, conversation indexing and searching, and conversation repurposing in the task pane (see Figure 1). The first process, that of data collection, describes the action of gathering up conversations and sending them to a repository. The harvesting should occur when new conversations are created or when existing ones are updated. The second process involves indexing the harvested conversations so that different applications can search their contents and retrieve results. The final process involves presenting the retrieved results and allowing the user to repurpose the content in some way.

Figure 1 Overall Solution Architecture

This solution uses a Windows® service to monitor the directory where MSN Messenger stores the conversation history files. As MSN Messenger creates new files or appends to existing XML files, the service updates a SQL Server™ 2000 database with the new content. SQL Server is a great indexer, and its query processor is also quite good. Once in the database, the conversation content is ready for queries and for discovery via a research service that presents the search results in the Research task pane. When the data is present in the pane, the user can select a search result and insert it into the document by executing a smart tag action. The smart tag action code uses the XML features in Word 2003 to insert the conversation text into the document.

Let's explore the solution, starting where it all begins, with MSN Messenger. After signing in to MSN Messenger, open it, choose the Tools menu, and click Options. Click on the Messages tab of the Options dialog. At the bottom, in the Message History section, check the box next to the label, "Automatically keep a history of my conversations." By default, Messenger will keep this history in a subdirectory of your My Documents folder.

You can change this to another location if desired. Remember where you put it, though, because you need to configure your Windows service to monitor the files in this directory. The XML content of a conversation history file is structured such that each conversation is assigned a unique identifier in the file. MSN Messenger also tracks when the conversation occurred as well as each message sent between the two parties in the conversation. The contents of a typical file are shown in Figure 2.

Figure 2 An MSN Messenger File

<?xml version="1.0" ?>
<?xml-stylesheet type='text/xsl' href='MessageLog.xsl'?>
<Log FirstSessionID="1" LastSessionID="2">
  <Message Date="12/5/2004" Time="10:49:42 AM"
      DateTime="2004-12-05T17:49:42.880Z" SessionID="1">
    <From>
      <User FriendlyName="email_of_sender" />
    </From>
    <To>
      <User FriendlyName="John R. Durant" />
    </To>
    <Text Style="font-family:MS Shell Dlg; color:#000000;">I love reading MSDN Magazine.</Text>
  </Message>
  <Message Date="12/5/2004" Time="10:50:45 AM"
      DateTime="2004-12-05T17:50:45.441Z" SessionID="2">
    <From>
      <User FriendlyName="email_of_sender" />
    </From>
    <To>
      <User FriendlyName="John R. Durant" />
    </To>
    <Text Style="font-family:MS Shell Dlg; color:#000000;">Yeah, me too!</Text>
  </Message>
</Log>

The Windows service monitors the target directory, and when files are added or when new conversations are added to existing files, it extracts the new information and loads it into the database.
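
To make the file structure concrete, here is a minimal sketch that reads a log shaped like Figure 2 and pulls out each message by element and attribute name. The module name, the SummarizeLog function, the trimmed inline sample, and the output format are assumptions made for illustration; they are not part of the article's download.

```vb
Imports System
Imports System.Text
Imports System.Xml

Module MessengerLogSketch
    ' Sample data modeled on Figure 2; a real log would be loaded from disk.
    Private Const SampleLog As String = _
        "<Log FirstSessionID=""1"" LastSessionID=""1"">" & _
        "<Message Date=""12/5/2004"" Time=""10:49:42 AM"" SessionID=""1"">" & _
        "<From><User FriendlyName=""email_of_sender"" /></From>" & _
        "<To><User FriendlyName=""John R. Durant"" /></To>" & _
        "<Text>I love reading MSDN Magazine.</Text>" & _
        "</Message>" & _
        "</Log>"

    ' Hypothetical helper: returns one line of text per message in the log.
    Function SummarizeLog(ByVal logXml As String) As String
        Dim doc As New XmlDocument
        doc.LoadXml(logXml)

        Dim summary As New StringBuilder
        ' Look nodes up by name rather than by position in the tree.
        For Each msg As XmlNode In doc.SelectNodes("/Log/Message")
            Dim fromName As String = _
                msg.SelectSingleNode("From/User/@FriendlyName").Value
            Dim toName As String = _
                msg.SelectSingleNode("To/User/@FriendlyName").Value
            Dim sessionId As String = msg.Attributes("SessionID").Value
            Dim body As String = msg.SelectSingleNode("Text").InnerText

            summary.Append("[" & sessionId & "] " & fromName & " -> " & _
                toName & ": " & body)
            summary.Append(Environment.NewLine)
        Next
        Return summary.ToString()
    End Function

    Sub Main()
        Console.Write(SummarizeLog(SampleLog))
    End Sub
End Module
```

Looking values up by name costs a little more typing than indexing into ChildNodes and Attributes, but it keeps working if Messenger adds or reorders attributes.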

The Windows Service

The Windows service conveniently uses the classes in the .NET Framework to monitor the directory configured in the MSN Messenger UI. To be effective, the service needs to update the database when a file is added to the directory or when an existing file is updated. To monitor the directory, it uses the FileSystemWatcher class in the System.IO namespace. Anyone still wondering if they ought to learn how to use the .NET Framework can now consider the matter resolved. Classes like these, and there are many, are what make the libraries in the .NET Framework so compelling. By declaring an instance of this class and registering with its events, you can easily detect when files are created, deleted, changed, or renamed. Our code responds to only two of those events, Changed and Created, and both lead to the same custom procedure. The event code for the Windows service is shown in Figure 3.

Figure 3 Handling FileSystemWatcher Events

Imports System.IO
Imports System.Web
Imports System.Xml
Imports System.Net
Imports System.Data
Imports System.Threading
Imports System.Configuration
Imports System.ServiceProcess

Public Class Service1
    Inherits System.ServiceProcess.ServiceBase

    Public fsw As New FileSystemWatcher

    Protected Overrides Sub OnStart(ByVal args() As String)
        ' Watch the folder that MSN Messenger writes its history files to.
        fsw.Path = CType(ConfigurationSettings.AppSettings( _
            "MessengerSource"), String)
        AddHandler fsw.Changed, AddressOf ProcessUpdatedFile
        AddHandler fsw.Created, AddressOf ProcessUpdatedFile
        fsw.EnableRaisingEvents = True
    End Sub

    ' Handles both the Changed and Created events.
    Public Sub ProcessUpdatedFile(ByVal s As Object, _
        ByVal e As FileSystemEventArgs)
        UploadData(e.FullPath)
    End Sub
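
The same watcher pattern can be tried outside a service. The following console sketch is an illustration under stated assumptions rather than the article's code: the module name, the CreateLogWatcher helper, the "*.xml" filter, and the temp-folder default are all hypothetical choices. It narrows the watcher to XML files and to the notifications that matter for new or appended logs.

```vb
Imports System
Imports System.IO

Module WatcherSketch
    ' Hypothetical helper: builds a watcher limited to XML files in one folder.
    Function CreateLogWatcher(ByVal folder As String) As FileSystemWatcher
        Dim fsw As New FileSystemWatcher(folder)
        fsw.Filter = "*.xml"
        ' React to new files and to writes against existing files only.
        fsw.NotifyFilter = NotifyFilters.FileName Or NotifyFilters.LastWrite
        AddHandler fsw.Created, AddressOf OnLogTouched
        AddHandler fsw.Changed, AddressOf OnLogTouched
        Return fsw
    End Function

    Private Sub OnLogTouched(ByVal sender As Object, _
                             ByVal e As FileSystemEventArgs)
        ' A real service would hand e.FullPath to its upload routine here.
        Console.WriteLine(e.ChangeType.ToString() & ": " & e.FullPath)
    End Sub

    Sub Main(ByVal args() As String)
        ' Assumed default; pass the real history folder as the first argument.
        Dim folder As String = Path.GetTempPath()
        If args.Length > 0 Then folder = args(0)

        Dim fsw As FileSystemWatcher = CreateLogWatcher(folder)
        fsw.EnableRaisingEvents = True
        Console.WriteLine("Watching " & folder & ". Press Enter to stop.")
        Console.ReadLine()
        fsw.EnableRaisingEvents = False
    End Sub
End Module
```

Be aware that a single save can raise Changed more than once, so a handler that uploads data may need to guard against processing the same content twice.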

The Windows service's custom procedure, UploadData, opens the target XML file (either added or changed) and sends its contents to the central SQL Server database. Figure 4 shows the code that performs this data handoff.

Figure 4 Uploading Conversations to SQL Server

Private Sub UploadData(ByVal fullFilename As String)
    Dim CONNECT_STRING As String = CType( _
        ConfigurationSettings.AppSettings("ConnStr"), String)
    Dim chatDoc As XmlDocument = New XmlDocument
    Dim chatNode As XmlNode
    Dim SessionID As String
    Dim chatNodes As XmlNodeList
    Dim cn As SqlClient.SqlConnection = _
        New SqlClient.SqlConnection(CONNECT_STRING)
    Dim cmd As SqlClient.SqlCommand = _
        New SqlClient.SqlCommand("sp_AddData", cn)
    cmd.CommandType = CommandType.StoredProcedure

    ' Declare one input parameter per column that the procedure fills.
    Dim p_FromName As New SqlClient.SqlParameter
    p_FromName.ParameterName = "@FromName"
    p_FromName.Direction = ParameterDirection.Input
    p_FromName.SqlDbType = SqlDbType.VarChar
    p_FromName.Size = 50

    Dim p_MessageDate As New SqlClient.SqlParameter
    p_MessageDate.ParameterName = "@MessageDate"
    p_MessageDate.Direction = ParameterDirection.Input
    p_MessageDate.SqlDbType = SqlDbType.DateTime

    Dim p_ToName As New SqlClient.SqlParameter
    p_ToName.ParameterName = "@ToName"
    p_ToName.Direction = ParameterDirection.Input
    p_ToName.SqlDbType = SqlDbType.VarChar
    p_ToName.Size = 50

    Dim p_SessionID As New SqlClient.SqlParameter
    p_SessionID.ParameterName = "@SessionID"
    p_SessionID.Direction = ParameterDirection.Input
    p_SessionID.SqlDbType = SqlDbType.SmallInt

    Dim p_MessageText As New SqlClient.SqlParameter
    p_MessageText.ParameterName = "@MessageText"
    p_MessageText.Direction = ParameterDirection.Input
    p_MessageText.SqlDbType = SqlDbType.VarChar
    p_MessageText.Size = 255

    cmd.Parameters.Add(p_FromName)
    cmd.Parameters.Add(p_MessageDate)
    cmd.Parameters.Add(p_ToName)
    cmd.Parameters.Add(p_SessionID)
    cmd.Parameters.Add(p_MessageText)

    chatDoc.Load(fullFilename)
    ' The third top-level node is <Log>; its second attribute is LastSessionID.
    SessionID = chatDoc.ChildNodes(2).Attributes(1).Value
    ' Select only the messages that belong to the most recent session.
    chatNodes = chatDoc.SelectNodes( _
        "//Message[@SessionID='" & SessionID & "']")

    Try
        cn.Open()
        For Each chatNode In chatNodes
            ' <From><User FriendlyName="..." />
            p_FromName.Value = _
                chatNode.ChildNodes(0).ChildNodes(0).Attributes(0).Value
            ' Date and Time attributes, combined into one value
            p_MessageDate.Value = chatNode.Attributes(0).Value _
                & " " & chatNode.Attributes(1).Value
            ' <To><User FriendlyName="..." />
            p_ToName.Value = _
                chatNode.ChildNodes(1).ChildNodes(0).Attributes(0).Value
            p_SessionID.Value = SessionID
            ' <Text> element content
            p_MessageText.Value = chatNode.ChildNodes(2).InnerXml
            cmd.ExecuteNonQuery()
        Next
    Finally
        cn.Close()
    End Try
End Sub

A good portion of the code here is focused on setting up the database connection and getting the stored procedure ready to execute. The code walks the structure of XML nodes and attributes to extract the desired information so that it can assign the proper values to the parameters passed to the stored procedure.

Querying via the Research Service

Once the database stores the conversation content, another process can query and extract the contents for further use. The primary consumer of the SQL Server data is a custom Web service that is a research service provider in Office 2003. This custom Web service handles queries from Office 2003 and then queries the SQL Server database for the target search term. It formats responses as item lists in the Research task pane with each additional response marked up with smart tag type information. The real work, then, of the research service is simply to query SQL Server and assemble the response so that it conforms to the research service QueryResponse XML schema.

This is the most important aspect of this part of the overall solution. Research services are Web services with two important methods: Registration and Query. The Registration method sends back important information when a user registers the new research provider. The registration request, sent by the Office 2003 application, conforms to the Registration.Request schema. Responses, on the other hand, need to conform to the Registration.Response schema. Once a research provider is registered, it can field requests that conform to the Query schema and send responses conforming to the Query.Response schema. Office has built-in awareness of these schemas, so nothing needs to be done on the client. All of the work is done in the Web service.

The structure of responses requires the assembly of XML nodes, elements, and attributes, and this can become very tedious. It is much easier using the Research Services Development Extras (RSDE) recently published on MSDN® online. You can find them at the MSDN Office Developer Center at Office 2003 Toolkit: Research Service Development Extras. For more information, read my article introducing research service technology at The Definitive "Hello World" Managed Smart Document Tutorial.

The RSDE contains, among other things, some managed classes that let developers write registration and query responses using an object model rather than using a schema directly. The classes in the object model handle streaming out responses that correspond to the research services schemas for you.

For example, the code in Figure 5 shows the output for returning paragraph text to the task pane along with an action that will allow a user to insert text into the document quickly.

Figure 5 Response Text

<ResponsePacket revision="1" xmlns="urn:Microsoft.Search.Response">
  <Response domain="{96293571-B605-4fd0-9C6A-7202AA689967}">
    <QueryId />
    <Range>
      <StartAt />
      <Results>
        <Content xmlns="urn:Microsoft.Search.Response.Content">
          <P>
            <Char xmlns="urn:Microsoft.Search.Response.Content">
              <Char bold="true">Text to insert: </Char>
              This is text we can insert from the task pane.</Char>
            <Actions defaultAllowed="false">
              <Text>Actions</Text>
              <Insert>
                <Text>Insert</Text>
              </Insert>
              <Copy>
                <Text>Copy</Text>
              </Copy>
            </Actions>
          </P>
        </Content>
      </Results>
    </Range>
    <Status>SUCCESS</Status>
  </Response>
</ResponsePacket>

As long as Office 2003 receives this XML as the response, it doesn't care where it comes from. In the simplest implementation, you could use a StringBuilder object instance to create the string and send it back to the calling program. Or, you could use a couple of statements, shown in Figure 6, to do the same thing using RSDE.

Figure 6 Using RSDE

Public Overloads Overrides Function Query( _
    ByVal request As Query.QueryRequest) As Query.QueryResponse

    Dim response As New Query.QueryResponse
    Dim content As New Query.ContentResponseWriter
    Dim crw As New Query.ContentResponseWriter

    Select Case request.QueryText.ToLower()
        Case "test"
            ' True adds the Insert and Copy action menu to the paragraph.
            content.WriteParagraph("<Char bold=""true""> Text to insert:" _
                & "</Char>This is text we can insert from the task pane.", True)
    End Select

    response.WriteResponse(content)
    Return response
End Function

Notice that most of the work is done in one statement, namely, the one that executes the WriteParagraph method. Thankfully, the managed classes in RSDE dramatically reduce the amount of code you need to write, and they greatly clarify the meaning of the code you do write. If you are new to creating research providers, these classes will get you up and running without having to search through the schema reference for research services.
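
For comparison, the StringBuilder approach mentioned earlier might look something like the following. This is a sketch only: the module name and the BuildResponse helper are hypothetical, the markup simply mirrors Figure 5 (with QueryId and StartAt left empty, as they are there), and the use of SecurityElement.Escape to protect the dynamic text is my own choice rather than anything the research service schemas require.

```vb
Imports System
Imports System.Security
Imports System.Text

Module ResponseSketch
    ' Hypothetical helper: wraps a label and body text in markup shaped like Figure 5.
    Function BuildResponse(ByVal domainGuid As String, _
                           ByVal label As String, _
                           ByVal body As String) As String
        Dim sb As New StringBuilder
        sb.Append("<ResponsePacket revision=""1"" " & _
            "xmlns=""urn:Microsoft.Search.Response"">")
        sb.Append("<Response domain=""" & _
            SecurityElement.Escape(domainGuid) & """>")
        sb.Append("<QueryId /><Range><StartAt /><Results>")
        sb.Append("<Content xmlns=""urn:Microsoft.Search.Response.Content""><P>")
        ' Escape dynamic text so characters such as < and & cannot break the markup.
        sb.Append("<Char><Char bold=""true"">" & _
            SecurityElement.Escape(label) & "</Char>")
        sb.Append(SecurityElement.Escape(body) & "</Char>")
        sb.Append("<Actions defaultAllowed=""false""><Text>Actions</Text>")
        sb.Append("<Insert><Text>Insert</Text></Insert>")
        sb.Append("<Copy><Text>Copy</Text></Copy></Actions>")
        sb.Append("</P></Content></Results></Range>")
        sb.Append("<Status>SUCCESS</Status></Response></ResponsePacket>")
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine(BuildResponse( _
            "{96293571-B605-4fd0-9C6A-7202AA689967}", _
            "Text to insert: ", _
            "This is text we can insert from the task pane."))
    End Sub
End Module
```

Even in this small form, every element has to be opened, closed, and escaped by hand, which is exactly the busywork the RSDE classes take off your plate.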

Another benefit of the RSDE is that it includes a template for creating the Web service project in Visual Studio® .NET 2003. While you can create your own research provider from scratch by creating a generic Web service in Visual Studio, you will need to put a lot of the underlying pieces together on your own. As before, the RSDE makes this much simpler.

For example, once you start creating a project from the template, the RSDE launches a wizard that collects the information to display when a user registers your service. It also makes it simple to generate the GUID that uniquely identifies the service. You can easily include a EULA and copyright information. Finally, it generates the custom classes with the registration method already coded. For most research services, creating them using the wizard eliminates the need to learn anything about the registration schema or process because it's all taken care of for you.

That said, in the long run, there is no substitute for actually knowing the registration and query schemas. The wrapper classes take care of a great deal of the busywork when creating a Web service that delivers Research task pane functionality, but your service will become more polished and refined as you gain a deeper knowledge of the schemas and combine it with the power of the RSDE classes. You can learn more about the research services schemas by checking out the Research Services SDK on the MSDN Developer Center.

Obviously, the research provider for this solution does more than just display some hardcoded text. In this solution, the Web service needs to take the target search term and find out if any of the harvested messages contain it. A word needs to be said about the SQL Server that contains the database. To achieve the greatest search power, you should install full-text indexing on the database server. This allows you to perform more powerful queries against the data. Without it, you are limited to a keyword such as LIKE with wildcard characters, which is not really the best way to do your searching. Full-text indexing allows you to use more powerful keywords such as CONTAINS and FREETEXT.

The design of the table that contains the data is shown in Figure 7. As you can see, it has these columns.

Figure 7 Table Design

Column Name    Data Type    Length
MessageID      int          4
FromName       varchar      50
MessageDate    datetime     8
ToName         varchar      50
SessionID      smallint     2
MessageText    varchar      255

This solution requires a full-text index only on the MessageText column. Building a full-text index on this column lets you use the CONTAINS keyword to find any entries that contain the search word, inflectional forms of the word and so much more. The stored procedure for querying the table looks like this:

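CREATE PROC sp_SearchData
    @SearchTerm varchar(50)
AS
    -- A variable named inside a string literal is not expanded, so build
    -- the search condition first and pass it to CONTAINS as a variable.
    DECLARE @Condition varchar(100)
    SET @Condition = 'FORMSOF(INFLECTIONAL, "' + @SearchTerm + '")'

    -- MessageDate is returned because the research service displays it.
    SELECT FromName, MessageDate, MessageText
    FROM MessageLog
    WHERE CONTAINS(MessageText, @Condition)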

This search is powerful because it will pick up inflectional forms of the word anywhere in the MessageText column of the table. For example, if the search term is the word "run", the query will return results that include other forms of the word such as "ran", "running", or "runs". Obviously, the chances are that you will be searching for something more specific than this verb, but you can trust SQL Server to ably handle your query.
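
If you prefer to assemble the full-text search condition in the Web service instead, a small helper can do it. The sketch below assumes a hypothetical variant of the stored procedure that passes its parameter straight to CONTAINS; the module and function names are likewise invented for illustration. It strips double quotes from the user's term because those delimit the term inside FORMSOF.

```vb
Imports System

Module SearchConditionSketch
    ' Hypothetical helper: builds a CONTAINS search condition that matches
    ' the inflectional forms of a single word or phrase.
    Function BuildInflectionalCondition(ByVal searchTerm As String) As String
        ' Double quotes delimit the term inside FORMSOF, so drop any the user typed.
        Dim cleaned As String = searchTerm.Replace("""", "").Trim()
        Return "FORMSOF(INFLECTIONAL, """ & cleaned & """)"
    End Function

    Sub Main()
        ' Prints: FORMSOF(INFLECTIONAL, "run")
        Console.WriteLine(BuildInflectionalCondition("run"))
    End Sub
End Module
```

The resulting string would still be sent as an ordinary SqlParameter value, so it never becomes part of the SQL statement text itself.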

The research service receives a query request from the Research task pane, which triggers the query to the database. The request is fielded by the code in the Query WebMethod. The sample, shown in Figure 8, queries the database and displays message results in a list with a smart tag action that allows the user to insert the text into the document.

Figure 8 Searching a Database from a Research Service

Public Overloads Overrides Function Query( _
    ByVal request As Query.QueryRequest) As Query.QueryResponse

    Dim response As New Query.QueryResponse
    Dim content As New Query.ContentResponseWriter
    Dim crw As New Query.ContentResponseWriter
    Dim CONNECT_STRING As String = CType( _
        ConfigurationSettings.AppSettings("ConnStr"), String)
    Dim cn As SqlClient.SqlConnection = _
        New SqlClient.SqlConnection(CONNECT_STRING)

    Try
        cn.Open()
        Dim cmd As SqlClient.SqlCommand = _
            New SqlClient.SqlCommand("sp_SearchData", cn)
        cmd.CommandType = CommandType.StoredProcedure

        ' Pass the text typed in the Research task pane as the search term.
        Dim p_SearchTerm As New SqlClient.SqlParameter( _
            "@SearchTerm", request.QueryText)
        p_SearchTerm.Direction = ParameterDirection.Input
        p_SearchTerm.SqlDbType = SqlDbType.VarChar
        p_SearchTerm.Size = 50
        cmd.Parameters.Add(p_SearchTerm)

        ' Assumes sp_SearchData returns MessageDate, FromName, and MessageText.
        Dim rdr As SqlClient.SqlDataReader = cmd.ExecuteReader()
        While rdr.Read
            content.WriteHeading("Message on " _
                & rdr.Item("MessageDate").ToString(), False, False, crw)
            content.WriteParagraph("<Char bold=""true"">" _
                & rdr.Item("FromName").ToString() & "- </Char>" _
                & rdr.Item("MessageText").ToString(), True)
            content.WriteHorizontalLineSeparator()
        End While
        rdr.Close()
    Catch exc As SqlClient.SqlException
        'Handle the error by posting to the Windows event log, and so on.
    Finally
        cn.Close()
    End Try

    response.WriteResponse(content)
    Return response
End Function

Using the RSDE wrapper makes it so easy to present the data in the task pane with the right formatting as well as with the appropriate accompanying actions. For example, the WriteParagraph method has two parameters, paragraphText and displayInsertCopyButton. The first parameter, quite sensibly, is the text that you want to display. The second parameter, a Boolean value, allows you to include an action menu that lets the user take action to insert or copy the displayed text. The final display, the task pane shown in Figure 9, lists each of the messages in conversations that included the search term along with an action menu to insert or copy the text.

Figure 9 Research Task Pane

The display here is only a simple example of the many varieties of things you can do in the Research task pane. Again, the best way to get going is to use the RSDE, but you should also take the time to study the research service schemas in the Research Services SDK.

Modifications to the Solution

Just when you thought it couldn't get easier to make a research service, things get even simpler. When using the RSDE template in Visual Studio, the wizard presents you with an important choice. Because many research services work with data from multiple sources, the wizard lets you choose the kind of provider you want to connect to (see Figure 10). Simply specify your connection string and an SQL statement or the name of a stored procedure you want to execute when queries come through the service.

Figure 10 Building a Data Layer in Your Research Service

The next dialog in the wizard lets you specify how you want results presented in the Research task pane and how many results you want displayed at a time. After the wizard completes, you will find that it has created an entire data layer for you in the Web service project. The task pane shown in Figure 11 indicates how a research service created using this technique presents the results. Using the same stored procedure and data as in the custom sample, this new research service worked without writing a single line of code! It worked with no modification or intervention whatsoever. Of course, you will want to modify the code and tailor it to your needs, but the power of the RSDE is astonishing.

Figure 11 Research Results

One of the modifications you will want to explore as you develop more research services is the addition of more powerful smart tag actions in the pane. When the research service packs up the results with the schema-based markup, you can assign smart tag type information to results so that registered smart tag action handlers on the client can present the user with actions assigned to that smart tag type. For more information about smart tags and how to create custom smart tag action handlers, see Ben Waldron's article on developing smart tags for Office 2003 in this issue of MSDN Magazine. You can also visit Smart Tags and Smart Documents for more ideas.

You can create a custom DLL with powerful action handlers associated with your own custom smart tag type names. The Research task pane has been designed to sense this type information and let Office know that the marked up content should be presented with the appropriate smart tag actions menu. The code you put in your smart tag actions DLL will fire when a user clicks a given menu item. The insert and copy menu items you see in Figure 9 are built into Office. Yours will be different because they are your own invention. What you do when a user clicks a smart tag action item is up to you. Your action handler can connect to the Web or a database, perform file operations, create presentations, insert hidden text, or do pretty much anything else you can do in a DLL.

Conclusion

It is becoming increasingly common for users to expand their authoring repertoire in Microsoft Office. Research services are a great way to simplify user tasks, reduce manual operations like copying or inserting text, and bring more information to users while they are working with documents. This article uses instant message logs as the data source, but you can choose virtually any other source. Leveraging the research service schemas, you can determine how search results are presented to the user in the task pane. To make things easier, you can use the Research Service Development Extras to save you some of the grunt work that is required to get a research service up and running. From there, you can modify the service to meet the needs of your organization.

John R. Durant manages the MSDN Office Developer Center and is a frequent author and speaker on a wide variety of development topics. He co-authored the latest edition of The XML Programming Bible (Wiley & Sons, 2003).