Using Event and Trace Logs in SharePoint
The following are some examples of information to trace:
- Errors and exceptions. If an exception occurs in your application, you want to know about it. Even if your code handles the exception, it is often useful to record the event with a trace statement. For example, a download can initially fail but succeed on the second attempt. The user may not need to know this information but the developer might find it useful.
- Remote system calls. In a service-oriented world, it is common for an application to access functionality on a different system. Logging the request and response messages can be very helpful if you must debug the application. Many frameworks, such as Windows Communication Foundation (WCF), allow you to trace this information.
- Queries. If your application creates and executes queries, knowing the text and the parameters of these queries can be valuable debugging information. Queries are not limited to SQL queries against a database. Collaborative Application Markup Language (CAML), Lightweight Directory Access Protocol (LDAP), and Entity SQL are three of many other possibilities. It can be helpful to trace any querying syntax that your application uses.
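As an illustration of the first item, the following is a minimal sketch of tracing a handled exception during a retry. It uses the .NET `System.Diagnostics.TraceSource` class rather than the ULS, and the source name `Contoso.Download`, the `WithRetry` helper, and the simulated download are all hypothetical names introduced only for this example.

```csharp
using System;
using System.Diagnostics;
using System.IO;

class RetryTracingSketch
{
    // Hypothetical trace source name; it is not part of SharePoint or the Guidance Library.
    static readonly TraceSource Log = new TraceSource("Contoso.Download", SourceLevels.All);

    // Runs an operation up to maxAttempts times and traces every failure that is handled.
    public static T WithRetry<T>(Func<T> operation, int maxAttempts)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                // The last failure is not handled here; it propagates to the caller.
                if (attempt >= maxAttempts) throw;

                // The user never sees this failure, but the developer can find it in the trace.
                Log.TraceEvent(TraceEventType.Warning, 0,
                    "Attempt {0} failed: {1}", attempt, ex.Message);
            }
        }
    }

    static void Main()
    {
        // Capture trace output in memory so that the sketch has no external dependencies.
        StringWriter buffer = new StringWriter();
        Log.Listeners.Add(new TextWriterTraceListener(buffer));

        int calls = 0;
        string result = WithRetry(() =>
        {
            calls++;
            if (calls == 1) throw new IOException("connection reset"); // simulated first failure
            return "file contents";
        }, 3);

        Log.Flush();
        Console.WriteLine(result);          // the second attempt succeeded
        Console.Write(buffer.ToString());   // one warning that describes the first attempt
    }
}
```

In a SharePoint application, the same pattern would write through whichever logger the application uses; only the destination of the trace statement changes.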
Typically, tracing information is written to a local file. Although trace information is valuable, the amount of trace information can overwhelm both the developer and the resources of the system that records the information. For this reason, tracing levels, which correspond to degrees of severity, are typically adjustable at run time.
In addition to diagnosing run-time problems, trace logs are often used to detect hidden problems. For example, SharePoint writes to a trace file every time it believes that an object has not been properly disposed. This can occur when it finds objects that hold a reference to an SPRequest object (such as SPWeb and SPSite) and that were not disposed as expected. The SharePoint tracing infrastructure is named the Unified Logging Service, or ULS (it is sometimes also called the Universal Logging Service).
Developers can program their applications to write custom operations messages to the ULS. Writing to the same trace log that Windows SharePoint Services uses enables you to view your custom application traces in the larger context of Windows SharePoint Services operations without having to correlate multiple trace logs.
The Partner Portal application demonstrates how to log to the ULS and provides a simple logger to help you. Although logging to the ULS requires you to also analyze the surrounding SharePoint traces, these additional traces often help to diagnose the problem.
Note: You can use the SharePoint Guidance Library's default logger, or you can customize this component to write to a logging framework that you provide. Some developers prefer to keep their application tracing separate from the traces generated by SharePoint itself.
SharePoint trace logs can become very complicated, particularly in server farms. The SharePoint Administration Toolkit includes a tool named the SharePoint Diagnostic tool (SPDiag) that helps analyze these trace logs. This tool brings together logs from various sources, including the ULS, Internet Information Services (IIS), performance counters, the Windows event log, and Windows Management Instrumentation (WMI), and it presents a unified view of the information. For more information, see Trace Logs on MSDN and SharePoint Diagnostics Tool on TechNet. There are also tools on CodePlex for viewing ULS trace data, such as the SharePoint ULS Log Viewer.
Considerations for Tracing
Performance and noise (irrelevant information) are both potential issues when you use tracing. Detailed tracing on a production system that is under heavy load affects disk I/O and CPU utilization, which can degrade the application's performance. If performance becomes a problem, consider putting the logs on a separate drive to offload I/O from the system disk and to avoid inadvertently filling your system partition. You can also implement rolling logs so that the log files never grow too large. SharePoint uses this approach for the ULS: you can configure the maximum number of log files that are kept on each server in the farm and the duration (in minutes) that each log file covers.
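The rolling-log approach can be sketched as follows for a custom trace file. This is not how the ULS is implemented; `RollingLog`, its parameters, and the `trace-*.log` file naming are assumptions made for illustration. The timestamp is passed in by the caller so that the rollover behavior is easy to see.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical helper, not a SharePoint API: starts a new log file every
// minutesPerFile minutes and keeps at most maxFiles files in the directory.
class RollingLog
{
    readonly string directory;
    readonly int maxFiles;
    readonly TimeSpan interval;
    DateTime currentStart;
    string currentPath;

    public RollingLog(string directory, int maxFiles, int minutesPerFile)
    {
        this.directory = directory;
        this.maxFiles = maxFiles;
        this.interval = TimeSpan.FromMinutes(minutesPerFile);
        Directory.CreateDirectory(directory);
    }

    public void Write(DateTime now, string message)
    {
        // Roll over on the first write and whenever the current file has covered its interval.
        if (currentPath == null || now - currentStart >= interval)
        {
            currentStart = now;
            string stamp = now.ToString("yyyyMMdd-HHmmss", CultureInfo.InvariantCulture);
            currentPath = Path.Combine(directory, "trace-" + stamp + ".log");
            DeleteOldFiles();
        }
        File.AppendAllText(currentPath, now.ToString("o") + " " + message + Environment.NewLine);
    }

    void DeleteOldFiles()
    {
        // The timestamp format makes the file names sort in chronological order.
        string[] files = Directory.GetFiles(directory, "trace-*.log")
            .OrderBy(f => f, StringComparer.Ordinal)
            .ToArray();

        // Delete the oldest files, leaving room for the file that is about to be created.
        for (int i = 0; i < files.Length - (maxFiles - 1); i++)
        {
            File.Delete(files[i]);
        }
    }

    static void Main()
    {
        string dir = Path.Combine(Path.GetTempPath(), "rolling-log-" + Guid.NewGuid().ToString("N"));
        RollingLog log = new RollingLog(dir, 3, 30);

        // Ten messages, 15 minutes apart, create five 30-minute files; only three are kept.
        DateTime start = new DateTime(2009, 1, 1, 8, 0, 0);
        for (int i = 0; i < 10; i++)
        {
            log.Write(start.AddMinutes(i * 15), "message " + i);
        }

        Console.WriteLine(Directory.GetFiles(dir).Length); // prints 3
    }
}
```

Pointing the directory at a drive other than the system disk combines this technique with the separate-drive recommendation.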
Detailed tracing can add irrelevant information to the trace log files. This affects performance and makes the logs difficult to analyze. More advanced logging systems have trace levels and categories to help control the amount of information that gets logged. Trace levels are applied to categories and control the level of detail that is logged (this is often referred to as verbosity). Categories define an area of functionality. For example, you can apply the verbose trace level to the Search Indexing category. You should raise trace levels to perform diagnostics only on the categories that require investigation, and the levels should be lowered after the diagnostics are complete.
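The relationship between categories and trace levels can be sketched as follows. The `CategoryTraceFilter` class is a hypothetical helper, not the ULS or Guidance Library implementation. It reuses the .NET `SourceLevels` and `TraceEventType` enumerations, where each `SourceLevels` value is a bit mask of the event types it admits.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper: records the most verbose level that is traced for each category.
class CategoryTraceFilter
{
    readonly Dictionary<string, SourceLevels> levels = new Dictionary<string, SourceLevels>();
    readonly SourceLevels defaultLevel;

    public CategoryTraceFilter(SourceLevels defaultLevel)
    {
        this.defaultLevel = defaultLevel;
    }

    // Can be called at run time to raise or lower the level for one category.
    public void SetLevel(string category, SourceLevels level)
    {
        levels[category] = level;
    }

    public bool ShouldTrace(string category, TraceEventType eventType)
    {
        SourceLevels level;
        if (!levels.TryGetValue(category, out level))
        {
            level = defaultLevel;
        }
        // A level admits an event type when their bit masks overlap.
        return ((int)level & (int)eventType) != 0;
    }

    static void Main()
    {
        CategoryTraceFilter filter = new CategoryTraceFilter(SourceLevels.Warning);
        Console.WriteLine(filter.ShouldTrace("Search Indexing", TraceEventType.Verbose)); // False

        // Raise the level only for the category under investigation.
        filter.SetLevel("Search Indexing", SourceLevels.Verbose);
        Console.WriteLine(filter.ShouldTrace("Search Indexing", TraceEventType.Verbose)); // True
        Console.WriteLine(filter.ShouldTrace("Workflow", TraceEventType.Verbose));        // False

        // Lower it again when the diagnostics are complete.
        filter.SetLevel("Search Indexing", SourceLevels.Warning);
        Console.WriteLine(filter.ShouldTrace("Search Indexing", TraceEventType.Verbose)); // False
    }
}
```

Checking the filter before building an expensive trace message also avoids most of the cost of trace statements that are currently switched off.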
Programmatically Changing Diagnostic Settings
The SharePoint object model has a class named SPDiagnosticsService. This class enables you to do programmatically what you can do manually through the user interface in the Trace Log and Event Throttling sections of the Diagnostic Logging page in SharePoint Central Administration. You can also change these trace settings with PowerShell. For more information, see Redirecting IIS7 and SharePoint 2007 logging in PowerShell in Jason Cahill's blog.
For more information about the diagnostic logging settings, see Configure diagnostic logging settings (Windows SharePoint Services) on TechNet.
The SharePoint Guidance Library supports tracing to the ULS. It also allows you to configure which tracing levels are written to the ULS trace logs, based on category and trace severity.
Logging helps IT professionals detect, diagnose, and troubleshoot problems with deployed applications. This information is often aggregated into a central monitoring system such as the System Center Operations Manager (SCOM).
Log messages should provide enough information so that someone can understand the problem and know what to do about it. A message such as "Object reference not set in MySampleMethod" can be helpful to a developer for diagnostics and is appropriate for tracing, but it is not useful to an IT professional. In fact, these sorts of messages can be detrimental because they clutter the logs with irrelevant information and make it difficult to find the actual problem. A better message is "Could not connect to database MySample: the connection timed out." This message gives enough information for an IT professional to understand where the problem is and how to address it.
The following are some examples of information to log:
- Predictable application failures. There are situations where you can expect applications to fail. For example, if an application calls a Web service and there is no response, the call fails. Logging these events can alert the IT professional to potential problems in the network.
- Installation or configuration issues. Typically, IT professionals are responsible for installing and configuring applications. If there are problems around these tasks, they need to know what they are.
- Unknown application failures. If an application has an unhandled exception, log the event. This information alerts the IT professional to potential problems. An example of such a log message is "Authentication failed for unknown reason." Although the reason is unknown, the message pinpoints the area of the application that failed and potentially indicates the need to bring in a developer to assist with diagnostics.
- Situations that can potentially cause issues in the future. Some issues can be anticipated, such as when a resource is about to become exhausted. Examples of possible messages are "The hard drive is becoming full" and "Deadlock detected in query MyQuery."
Event Sources for Logging to the Event Log
When you log data to the event log, you must set the EventSource parameter. This is the name of the application that can log information to the event log. The default event source for the Partner Portal application is the Office SharePoint Server event source, which is available with Microsoft Office SharePoint Server.
If you want to use an event source other than the SharePoint event source, you must create the event source on all Web front-end servers and application servers before you install and activate the features that are going to log messages. To create an event source, you can either edit the registry or use the Eventcreate utility. This must be done on each Web front-end server. For information about programmatically creating an event source, see EventLogInstaller and CreateEventSource on MSDN.
You must also have the correct privileges to create an event source. You can either be an administrator on the system or grant the ASP.NET application pool identity extra privileges to create event sources. By default, the ASP.NET user account cannot create event sources. Alternatively, you can create a timer job that, in turn, creates the event sources. Timer jobs run on each Web front-end server. The use of timer jobs is beyond the scope of this guidance. For more information, see Creating Custom Timer Jobs in Windows SharePoint Services 3.0 on MSDN and Creating Custom SharePoint Timer Jobs by Andrew Connell.
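The following is a minimal sketch of creating a custom event source programmatically with the .NET `System.Diagnostics.EventLog` class and then writing an entry with it. The source name, the event ID, and the choice of the Application log are hypothetical values chosen for illustration. The code runs only on Windows, and it assumes that the account that runs it has sufficient privileges; without them, the `SourceExists` and `CreateEventSource` calls can fail with a security exception.

```csharp
using System;
using System.Diagnostics;

class EventSourceSetup
{
    // Hypothetical values used only for illustration.
    public const string SourceName = "Contoso Partner Portal";
    public const string LogName = "Application";
    public const int DatabaseConnectionEventId = 1001;

    static void Main()
    {
        // Creating a source writes to the registry, so run this step on every
        // server in the farm before any feature logs with the source.
        if (!EventLog.SourceExists(SourceName))
        {
            EventLog.CreateEventSource(SourceName, LogName);
        }

        // After the source exists, writing an entry requires no special privileges.
        EventLog.WriteEntry(
            SourceName,
            "Could not connect to database MySample: the connection timed out.",
            EventLogEntryType.Error,
            DatabaseConnectionEventId);
    }
}
```

Keeping the creation step in a separate setup program, rather than in the Web application itself, means that the application pool identity does not need the extra privileges.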
It is recommended that you set the EventID parameter in addition to the EventSource parameter. This is the ID of the type of event that has occurred. Together, the EventSource and EventID parameters identify a particular problem with a particular application. Creating a new EventID does not change the registry, so it does not require a special privilege level.
The following code shows how to use the logging functionality that is provided by the SharePoint Guidance Library to write an error message that includes the EventID parameter.
logger.LogToOperations("Could not connect to database: " + databaseName, DATABASE_CONNECTION_EVENTID);
The logging functionality that is provided by the SharePoint Guidance Library also allows you to route particular levels of events to the event log based on category and event severity.
Storing Logging and Tracing Information
Logging information is intended for IT administrators. Typically, it is stored in the Windows event log because the event log is easily accessible with tools such as the System Center Operations Manager (SCOM). Therefore, as a general practice, logging information should be written to this location. Tracing information is typically intended for developers and product support professionals, but it is also used by experienced IT professionals. The ULS writes trace information to a trace log. For the Partner Portal application, tracing is written to the ULS trace logs at the default location of ...\12\LOGS.
Neither SharePoint diagnostics nor the implementation of logging that is provided in the SharePoint Guidance Library provides the ability to log to a database. However, in some cases, organizations prefer this approach. One reason is that a centralized database might offer better reporting features. Another reason is that a centralized database allows you to synchronize entries across all the Web servers in the farm. This can provide more complete diagnostics when a problem occurs. A third advantage is that you do not need to perform complex post-processing steps to aggregate the different logs.
A central database incurs higher costs and is more complex than file-based logging. It is possible for the logging activity itself to become a bottleneck in your system. Make sure that the tracing database never becomes full or else implement a stored procedure to control the size of the tracing table. You should carefully consider whether you want to use a database to augment or replace file-based logging. However, when a centralized database solution is used appropriately, it can be helpful.
The following table summarizes the tradeoffs between logging and tracing approaches. The headings have the following meaning in this table:
- Approach. This is the logging or tracing approach.
- In context with other SharePoint events/traces. This means that the information is present along with trace or event information from SharePoint. Frequently, this can help you identify the root cause of an issue because the application-specific traces are surrounded by SharePoint trace information that may relate to the issue. However, this additional information also adds complexity.
- Primary audience. This is the audience that is typically targeted by the approach.
In context with other SharePoint events/traces
Developer and support specialists, although also used by experienced IT professionals.
Custom trace log file
ASP.NET output trace
Note: Some system administrators may need to use information that SharePoint stores in the ULS, and there are tools created by the SharePoint community that make ULS viewing and filtering easier. However, in general, your SharePoint applications should always record events that require a human response in the Windows event log. When diagnosing a problem, system administrators and developers commonly start with the event logs and then move to the SharePoint logs and IIS logs if more information is needed. In these cases, managing custom SharePoint applications can require collaboration between developers and system administrators.