Tamper-Resistant Apps

Cryptographic Hash Algorithms Let You Detect Malicious Code in ASP.NET

Jason Coombs

This article assumes you're familiar with ASP.NET and C#


SUMMARY

Cryptographic hash algorithms produce fixed-length sequences based on input of arbitrary length. A given input always produces the same output, called a hash code. Using these algorithms, you can compute and validate hash codes to ensure that code running on your machine has not been tampered with or otherwise changed. ASP.NET provides a software mechanism for validating hash code fingerprints for every page requested by a client. In this article, the author shows how to use hash codes with ASP.NET applications to detect tampering and prevent malicious code from running when tampering is detected.

Contents

Taking Your App's Fingerprints
Prevent Script Tampering Automatically
Deploying an ASP.NET IHttpModule

A big concern for any Web developer is that a malicious third party will change code that has been deployed to a production server as part of a Web application. Ideally, Microsoft® Internet Information Services (IIS) is installed in an impenetrable data fortress and Web applications will never be compromised through tampering by a third-party intruder, worm, or Trojan. But it's always best to plan for the worst and put in place simple, manageable safeguards. If ASP.NET applications could automatically detect unauthorized changes and lock down access to modified files, one of the worst-case scenarios for ASP.NET security could be mitigated effectively. Luckily, ASP.NET makes this quite simple. (For more on security in IIS, see "Innovations in Internet Information Services Let You Tightly Guard Secure Data and Server Processes" in this issue.)

Whether you're a Web hosting provider who manages data security for publishing points where customers deploy code, or a deployer of code hosted on your own equipment, you know that publishing points are vulnerable to attacks and must be protected from unauthorized access. Malicious control of a publishing point is harmful not just to the one site that is compromised, but also to any other site or server to which the attacker can deploy viral code. It must be your highest data security priority to prevent malicious code from being executed when it is published to your IIS box, and I'll show you how to do just that with ASP.NET.

Taking Your App's Fingerprints

Fingerprints aren't a reliable biometric mechanism for exclusive positive identification. They can be forged, and they are merely a superficial surface marking. However, it's different with data security. We can run data in its entirety through a processing algorithm that digests it into a smaller fixed-length data output; a digital fingerprint based not just on a superficial characteristic but on the entire data stream. A routine that creates such a digital fingerprint is known as a hash algorithm, and the digital fingerprint itself is called a hash code.

To deploy the hash verification described in this article, you first finish developing your ASP.NET application files using a trustworthy development workstation. Then you compute the hash code for each application file and deploy the hash codes to the server at the same time as the production application files. To prevent production hash codes from being tampered with, store them in a location other than the publishing point where the application files reside. Figure 1 shows the architecture of the data security hash code verification module and its relationship to ASP.NET. In Figure 1, a single ASP.NET application file named default.aspx is deployed to an ASP.NET publishing point along with the file's hash code. You'll notice the hash code shown in Figure 1 matches the hash code in the source of the hash verification module (see Figure 2). ASP.NET hosts the hash verification module in the ASP.NET worker process, outside the IIS process, and it relies on machine.config to locate the hash verification module assembly registered with the global assembly cache (GAC).

Figure 2 C# Hash Verification Module for ASP.NET

using System;
using System.IO;
using System.Security.Cryptography;
using System.Web;

namespace HashVerification
{
    public class HashVerificationModule : System.Web.IHttpModule
    {
        public void Init(HttpApplication context)
        {
            // Run the hash check during the authorization stage of every request.
            context.AuthorizeRequest += new EventHandler(this.HashAuthorization);
        }

        public void Dispose() {}

        public void HashAuthorization(object sender, EventArgs e)
        {
            HttpApplication app = (HttpApplication)sender;
            try
            {
                HashAlgorithm md5Hasher = MD5.Create();
                byte[] hash;
                // The using block closes the stream even if hashing fails.
                using (FileStream f = File.Open(app.Request.PhysicalPath,
                    FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
                {
                    hash = md5Hasher.ComputeHash(f);
                }
                // Demonstration only: a single hardcoded hash for one file.
                if (!BitConverter.ToString(hash).Equals(
                    "CE-11-4E-45-01-D2-F4-E2-DC-EA-3E-17-B5-46-F3-39"))
                {
                    throw new Exception();
                }
            }
            catch (Exception)
            {
                // Any failure (unreadable file or hash mismatch) blocks the request.
                app.Response.Write(
                    "<html><body><h1>Error Processing Request</h1></body></html>");
                app.CompleteRequest();
            }
        }
    }
}

Figure 1 HashVerificationModule Layer Hashes Each Request

The Microsoft .NET Framework class library includes the System.Security.Cryptography namespace, which contains classes that implement hash algorithms. These algorithms make it improbable that two different inputs will produce the same output hash code. Any change to the input creates unpredictable changes to the hash code. This makes hash algorithms ideal for use in detecting changes to application code as well as data sets.
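
The properties described above can be checked with a few lines of C#. This is a minimal sketch, not part of the article's figures: the HashDemo class and HashText method are hypothetical names. It uses MD5 to match the figures in this article; because deliberate MD5 collisions have since become practical, SHA256.Create() is a drop-in alternative worth considering.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, for illustration only.
public static class HashDemo
{
    // Hashes the UTF-8 bytes of the text and returns the dash-separated
    // hex form produced by BitConverter, as used in Figures 2 and 3.
    public static string HashText(string text)
    {
        using (HashAlgorithm hasher = MD5.Create())
        {
            byte[] hash = hasher.ComputeHash(Encoding.UTF8.GetBytes(text));
            return BitConverter.ToString(hash);
        }
    }
}
```

Calling HashDemo.HashText twice with the same string returns the same 47-character value (16 bytes written as hex pairs joined by dashes), whatever the length of the input. Appending even a single space to the input yields a different value.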

Figure 3 shows how the MD5 HashAlgorithm class is used within a console application written in C# to compute and display hash codes for each file in the current directory. The static MD5.Create method is called to create an instance of the MD5 HashAlgorithm class, and System.IO.Directory.GetCurrentDirectory supplies the path of the current directory, which is wrapped in a DirectoryInfo object. GetFiles returns a FileInfo array that can be iterated using foreach in C#. The MD5 HashAlgorithm object has a method called ComputeHash that works on an I/O stream, and the code in Figure 3 simply loops through the array of files, opening, hashing, and closing each one. BitConverter is used to display a human-readable hex encoding of the bytes in the hash.

Figure 3 MD5 Hash Code Generator Console in C#

using System;
using System.IO;
using System.Security.Cryptography;

namespace MD5Hasher
{
    class Class1
    {
        [STAThread]
        static void Main(string[] args)
        {
            HashAlgorithm md5Hasher = MD5.Create();
            byte[] hash;
            FileInfo[] fi = new DirectoryInfo(
                Directory.GetCurrentDirectory()).GetFiles();
            foreach (FileInfo f in fi)
            {
                try
                {
                    // Open read-only so that read-only files can be hashed too;
                    // the using block closes the stream after hashing.
                    using (FileStream fs = f.Open(FileMode.Open, FileAccess.Read))
                    {
                        hash = md5Hasher.ComputeHash(fs);
                    }
                    Console.WriteLine(f.Name + ": " +
                        BitConverter.ToString(hash));
                }
                catch (Exception)
                {
                    // Skip files that cannot be opened or read.
                }
            }
        }
    }
}

You can adapt the code in Figure 3 as part of your production code deployment procedure to automate updates to safe hash storage. Or you can use the console application as shown to update safe storage manually through administrator-level privileges in IIS. The most important lines in Figure 3 call the MD5 HashAlgorithm class's ComputeHash method on the FileStream input stream (see Security: Protect Private Data with the Cryptography Namespaces of the .NET Framework for more information). The BitConverter class is used only for human readability; what really matters is that the encoding method matches the method used to validate hash codes automatically.
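
One way to adapt Figure 3 for automated deployment is to write the hashes to a manifest file instead of the console. The sketch below is only an illustration under stated assumptions: the HashManifestWriter class, its method names, and the "fileName|hash" line format are all hypothetical, and the manifest path is assumed to lie outside the directory being hashed.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical deployment helper; names and file format are assumptions.
public static class HashManifestWriter
{
    // Returns the BitConverter-encoded MD5 hash of one file,
    // the same encoding Figure 2 compares against.
    public static string HashFile(string path)
    {
        using (FileStream fs = File.Open(path, FileMode.Open,
            FileAccess.Read, FileShare.Read))
        using (HashAlgorithm hasher = MD5.Create())
        {
            return BitConverter.ToString(hasher.ComputeHash(fs));
        }
    }

    // Writes one "fileName|hash" line per file in sourceDir and returns
    // the number of files hashed. manifestPath must be outside sourceDir.
    public static int WriteManifest(string sourceDir, string manifestPath)
    {
        int count = 0;
        using (StreamWriter writer = new StreamWriter(manifestPath, false))
        {
            foreach (FileInfo f in new DirectoryInfo(sourceDir).GetFiles())
            {
                writer.WriteLine(f.Name + "|" + HashFile(f.FullName));
                count++;
            }
        }
        return count;
    }
}
```

Because the manifest is produced with the same BitConverter encoding that the verification code uses, the stored values can be compared as plain strings on the server.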

If you don't compute hash codes for each program or script that you execute and validate those hash codes periodically to ensure that those programs or scripts have not been changed, you really don't know whose code your computer is executing. This is a Catch-22 for automated systems that need a way to verify hash codes dynamically: the hash codes that are trusted as authentic must be available to the automated system in real time in order for it to validate hashes. Those stored hash codes are themselves subject to tampering, so how do you authenticate the hash codes? Digital signatures can be applied to each stored hash code so that the authenticity of the hash against which each file is compared can be established, but that only pushes the problem down another layer.

In order to verify a digital signature, a signature verification key must be accessible in real time to the automated system, and that key is also subject to tampering. You can hash the key, its certificate, and the entire chain of trust to make sure it hasn't been changed, but then the question becomes how to authenticate the authentic key, certificate, and chain of trust. The fact is that it too is subject to tampering unless it is embedded in non-programmable hardware along with the logic that makes use of it. The extent to which your automated system needs a guarantee of data security defines the extent to which you will need to unravel this tangle and how sensible it is to use extra layers of validation.

One method to increase the data security provided by hash code and signature validation is to attach a ROM storage device to the computer to store sensitive data such as authentic hash codes or a key used for verifying signatures. Then the problem becomes protection of the hash or signature verification application logic (machine code) since tampering at that level could alter the hash validation mechanism so that it doesn't retrieve data from the ROM device but from malicious data storage instead. A better, and simpler, solution is to pick a relatively safe storage location for the hash codes and accept the reality that programmable computers provide less than perfect data security.

The process by which hash codes get updated in the safe storage location must be different from the process by which application logic gets deployed. This extra layer of authorization creates two stages for publishing code to IIS: publishing code to the server and publishing hash codes to safe storage that give IIS permission to run the published code. It is relatively safe to assume that an intruder who gains access to one won't gain access to the other, and it is also safe to assume that an intruder who can gain access to both can also circumvent any data security mechanism built into software simply by reprogramming the machine completely.

Prevent Script Tampering Automatically

For optimal security, a computer must only run programs it is authorized to run. In most platforms today, the microprocessor will execute any compatible code. The future may see digital signature and hash code verification in hardware as a level of authentication the microprocessor implements before executing machine code instructions. Until then, it is up to developers and administrators to add protective layers in their software.

To incorporate automated tampering detection, there must be a layer of code to validate hash codes by recomputing a program's hash each time it is accessed but before it runs. Each one of your ASP.NET application files constitutes an individual program, and each client request for one of these files represents its execution. When you're done programming and debugging parts of your ASP.NET application, compute and store hash codes for each source file that you plan to deploy to the production server. The code shown previously in Figure 3 can be used for this purpose.

Store production file hash codes outside the URL space so that they're safe from tampering even by an intruder who gains access to the application files of a hosted site through its normal publishing point. You don't have to go overboard trying to secure access to these hash codes; any intruder who gains complete control of your server will be able to remove your automatic tamper detection code anyway, so it's adequate to ensure that the hash codes are not accessible through any exposed publishing point. An intruder who gains enough access to publish code to your server but who fails to obtain complete control over it can be successfully locked out using the technique shown in this article. Stopping an intruder who gains complete control over your server through Administrator remote access requires a different countermeasure.

ASP.NET provides a mechanism for validating hash code fingerprints for every page requested by a client. By editing machine.config, the file that controls ASP.NET configuration for a particular box running IIS, you can register a custom code module for handling HTTP requests from client browsers. The ASP.NET HTTP handler module interface is defined in System.Web.IHttpModule. Figure 2 shows C# code that creates a new HTTP handler module class for automatic hash verification. You can register the handler in machine.config in the <httpModules> section of <system.web>, where all modules that implement the IHttpModule interface are configured in ASP.NET. The IHttpModule interface is the ASP.NET replacement for ISAPI filters, and by creating a class that implements this interface you can layer in custom code to assist with HTTP request processing just as you would have done in the past with an ISAPI filter. HTTP modules are superior to ISAPI filters; for example, they're managed code, so they're not susceptible to buffer overflow attacks and they can be loaded and unloaded without restarting IIS.

System.Web.IHttpModule is the interface that all HTTP modules must implement for use in ASP.NET. Default HTTP modules, including WindowsAuthenticationModule and UrlAuthorizationModule, are configured automatically when ASP.NET is installed. Each default module registers itself with one or more of the event handler delegates of the ASP.NET application object. The idea behind the IHttpModule interface is that its two methods, Init and Dispose, get called by ASP.NET so that the module can register and unregister itself as an event handler for whatever events it needs to intercept. Init is called before any requests are processed by ASP.NET, and Dispose is called when ASP.NET needs to remove the module from the processing pipeline.

The IHttpModule class shown in Figure 2 registers itself with the application's AuthorizeRequest event delegate. The HashVerificationModule adds an authorization step to every request; this step computes the hash code for the ASP.NET page being requested by the client and matches it against the hash code as it was computed previously. It does this by opening a FileStream for the Request.PhysicalPath and using the MD5 HashAlgorithm class to compute a hash from the file input stream. For the code shown in Figure 2 to become a real-world hash verification module, you simply need the appropriate authentic hash code to be retrieved from a secure storage location.
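
Retrieving the authentic hashes from storage could look something like the following. This is a sketch under assumptions, not the article's implementation: the HashStore class is a hypothetical name, and the storage is assumed to be a text file of "fileName|hash" lines kept outside the publishing point.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical loader; the "fileName|hash" format is an assumption.
public static class HashStore
{
    // Parses "fileName|hash" lines into a lookup keyed by file name.
    // Keys ignore case because Windows file names are case-insensitive.
    public static Dictionary<string, string> Load(TextReader reader)
    {
        Dictionary<string, string> map =
            new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            int sep = line.IndexOf('|');
            if (sep <= 0) continue; // skip blank or malformed lines
            map[line.Substring(0, sep).Trim()] = line.Substring(sep + 1).Trim();
        }
        return map;
    }
}
```

A module could call HashStore.Load once from Init, passing a StreamReader opened on the storage file, and then look up the expected hash by the file name of Request.PhysicalPath on each request. A file with no entry in the lookup should be treated as a failed verification.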

You can hardcode the hashes of your application's production scripts into the HashVerificationModule so that a fresh build of HashVerificationModule must be deployed to the server whenever new production code is deployed to a publishing point. In Figure 2 a single hash code is hardcoded to illustrate this technique; note however that the hash code

CE-11-4E-45-01-D2-F4-E2-DC-EA-3E-17-B5-46-F3-39

appears in Figure 2 only as a demonstration; its use in your HashVerificationModule is not appropriate. In your real app, replace the hash with a list of production hashes and code that determines which hash to use for validation based on the value in Request.PhysicalPath or similar resource identifier. The entire process is shown in Figure 1.
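
A minimal sketch of such a list follows. The ProductionHashes class, the IsAuthorized method, and the physical path are hypothetical; only the hash value is taken from Figure 2.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical replacement for the single hardcoded comparison in Figure 2.
public static class ProductionHashes
{
    // Physical path -> expected hash. The path below is an assumed example.
    // Keys ignore case because Windows paths are case-insensitive.
    private static readonly Dictionary<string, string> Expected =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { @"C:\Inetpub\wwwroot\default.aspx",
              "CE-11-4E-45-01-D2-F4-E2-DC-EA-3E-17-B5-46-F3-39" }
        };

    // Returns true only when the file is listed and its hash matches.
    // Unlisted files are denied, so newly published code cannot run.
    public static bool IsAuthorized(string physicalPath, byte[] computedHash)
    {
        string expected;
        if (!Expected.TryGetValue(physicalPath, out expected))
        {
            return false;
        }
        return string.Equals(expected, BitConverter.ToString(computedHash),
            StringComparison.Ordinal);
    }
}
```

In the module, the hardcoded Equals test would then become a call such as ProductionHashes.IsAuthorized(app.Request.PhysicalPath, hash), with a false result handled the same way as the thrown exception in Figure 2.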

Hardcoding production hashes in the HashVerificationModule works, but it may be a little clumsy: whenever you deploy updates to your ASP.NET source files and use a utility like the console application shown in Figure 3 to recalculate the hashes of each source file, you must also update the hashes that you hardcoded into HashVerificationModule and deploy a fresh build of it. This dependency is actually good for data security, because changed code cannot run until a matching build of the module has been published through a separate process. You may also want to enhance the HashVerificationModule shown in Figure 2 by forcing it to verify the hash of the ASP.NET application's web.config file during processing of each client request. This will prevent an attacker who gains access to the application's root publishing point from replacing its web.config file with one of malicious design.

Deploying an ASP.NET IHttpModule

Classes that implement the IHttpModule interface are deployed in ASP.NET by adding them to the <httpModules> section of the appropriate XML configuration file. Although IHttpModules can be deployed for a specific ASP.NET application through the app's web.config file, this leaves your system more vulnerable than it would be when configuring the module at the machine.config level. The access permissions required to modify machine.config exceed the permissions required to modify web.config, so it makes more sense to deploy modules related to data security through machine.config. To configure the HashVerificationModule in machine.config you need to assign its assembly a strong name, add the assembly to the GAC, and then modify <httpModules> by adding a new <add ... /> line. Replace the word "assembly" in the following sample with the full name of the assembly that contains your build of the HashVerificationModule. Because the assembly is loaded from the GAC, this is the strong name (simple name, version, culture, and public key token) that results from signing the assembly with a key pair created by the Strong Name tool (Sn.exe) in the .NET Framework SDK:

<add name="HashVerification" type="HashVerification.HashVerificationModule, assembly"/>
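
With a full strong name filled in, the entry might look like the fragment below. The assembly name HashVerification, the version number, and the public key token are placeholders rather than real values; running sn -T against your signed assembly prints its actual token, and gacutil /i installs the assembly in the GAC.

```xml
<!-- machine.config fragment. Version and PublicKeyToken are placeholders. -->
<system.web>
  <httpModules>
    <!-- Leave the default modules that are already listed here in place. -->
    <add name="HashVerification"
         type="HashVerification.HashVerificationModule, HashVerification, Version=1.0.0.0, Culture=neutral, PublicKeyToken=0123456789abcdef" />
  </httpModules>
</system.web>
```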

Hash codes could also be cryptographically signed using an AsymmetricAlgorithm class such as RSA or DSA. The derived classes, RSACryptoServiceProvider and DSACryptoServiceProvider, contain a SignHash method that uses a private key to sign a hash code. The corresponding VerifyHash method uses the public key from the key pair to verify the signature. For stronger assurance that your ASP.NET application files have not been tampered with, you can apply a signature to the codes you store in your hash database and call VerifyHash prior to performing the hash validation. When you do this, a malicious third party would have to steal your private key and compromise both the Web authoring security over your publishing point and the security of the database or other safe storage location just to bypass the hash verification HTTP module loaded into each ASP.NET application by machine.config. A digitally signed hash code provides an extra level of data security in situations where authenticity determinations are made automatically, which is much better than applying a hash algorithm alone to produce and verify hash codes.
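
The signing and verification steps might be sketched as follows. The HashSigner class and its method names are hypothetical, and the sketch assumes the hash was computed with SHA-256 rather than the MD5 used in the figures; the OID passed to SignHash and VerifyHash must name whichever algorithm actually produced the hash.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper; assumes hashes were computed with SHA-256.
public static class HashSigner
{
    // Tells SignHash and VerifyHash which algorithm produced the hash.
    private static readonly string Sha256Oid =
        CryptoConfig.MapNameToOID("SHA256");

    // Run on the trusted workstation, where the private key is kept.
    public static byte[] Sign(byte[] hash, RSAParameters privateKey)
    {
        using (RSACryptoServiceProvider rsa = new RSACryptoServiceProvider())
        {
            rsa.ImportParameters(privateKey);
            return rsa.SignHash(hash, Sha256Oid);
        }
    }

    // Run on the server, which needs only the public key.
    public static bool Verify(byte[] hash, byte[] signature,
        RSAParameters publicKey)
    {
        using (RSACryptoServiceProvider rsa = new RSACryptoServiceProvider())
        {
            rsa.ImportParameters(publicKey);
            return rsa.VerifyHash(hash, Sha256Oid, signature);
        }
    }
}
```

The key pair can be produced once with ExportParameters(true) for the private half and ExportParameters(false) for the public half. Only the public half needs to be present on the Web server, so an intruder there cannot sign replacement hashes.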

An intruder who can tamper with your secure hash code storage can replace authentic hash codes with malicious ones if they know the hash code algorithm that you used. A keyed hash code algorithm, like those derived from KeyedHashAlgorithm, would also add extra data security since an attacker would have to capture not only the private key used with the AsymmetricAlgorithm to apply a valid signature to their replacement hash codes, but also the secret key used in the KeyedHashAlgorithm to compute the hash codes.
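
As an illustration of a keyed hash, the sketch below uses HMACSHA256, one of the classes derived from KeyedHashAlgorithm. The KeyedFileHasher class and its method name are hypothetical, and how the secret key is stored and protected is left open here.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper showing a keyed hash over file content.
public static class KeyedFileHasher
{
    // Computes an HMAC over the stream. Without secretKey, an attacker
    // cannot compute a matching value for replacement content.
    public static string ComputeKeyedHash(Stream content, byte[] secretKey)
    {
        using (KeyedHashAlgorithm hmac = new HMACSHA256(secretKey))
        {
            return BitConverter.ToString(hmac.ComputeHash(content));
        }
    }
}
```

The same content hashed with two different keys produces two different values, so knowing the algorithm alone is not enough to forge an entry in hash storage.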

For related articles see:
Security: Protect Private Data with the Cryptography Namespaces of the .NET Framework
Ensuring Data Integrity with Hash Codes

For background information see:
HOW TO: Create an ASP.NET HTTP Module Using Visual C# .NET
HOW TO: Create an ASP.NET HTTP Module Using Visual Basic .NET

Jason Coombs is cofounder of SCIENCE.ORG, a non-profit research institute of forensic computer science, and President of DigitalMarketplace.com, an Internet and data security programming firm. This article has been adapted with permission from his upcoming book, Microsoft Internet Information Services Security Technical Reference (Microsoft Press). E-mail him at jasonc@science.org.