Creating Asynchronous, Transactional, Cache-Friendly Web Services

This chapter is excerpted from Building a Web 2.0 Portal with ASP.NET 3.5: Learn How to Build a State-of-the-Art Ajax Start Page Using ASP.NET, .NET 3.5, LINQ, Windows WF, and More by Omar AL Zabir, published by O'Reilly Media


Web applications that expose a majority of their features via web services or depend on external web services for their functionality suffer from scalability problems at an early stage. When hundreds of users concurrently hit your site, long-running external web service calls start blocking ASP.NET worker threads, which makes your site slow and sometimes unresponsive. Sometimes requests fail from timeout errors and leave user data in an inconsistent state. Moreover, lack of response caching support in the ASP.NET AJAX Framework makes it even harder because servers serve the same requests again and again to the same browser. In this chapter, you will learn how to rewrite ASP.NET AJAX web service handlers to handle web method calls on your own and make your web methods asynchronous, transactional, and cache-friendly.

Scalability Challenges with Web Services

Ajax web sites tend to be very chatty because they make frequent calls to web services: auto complete on text boxes, client-side paging on data grids, and client-side validations all require frequent web service calls. Thus, Ajax web sites produce more ASP.NET requests than similar non-Ajax web sites. It gets worse when a web service call makes another call to an external service. Then you have not only an incoming request but also an outgoing request, which means double the load on ASP.NET. ASP.NET has a limited number of worker threads that serve requests. When there are no free threads left, ASP.NET cannot execute requests. Requests are placed in a waiting queue, and only when a worker thread becomes free does a request from the queue get the chance to execute. When web service calls perform long I/O operations, such as calls to an external web service, long-running database queries, or long file operations, the thread that is executing the request is occupied until the I/O operation completes. So, if such long requests arrive more frequently than they complete, the ASP.NET worker thread pool will be exhausted, which means further requests will be queued in the application queue and your web site will become nonresponsive for some time.

Fetching the Flickr photo widget's XML takes a couple of seconds. So, when hundreds of users load the Flickr photo widget concurrently, too many web service calls will get stuck while fetching the XML. If Flickr somehow becomes slow and takes 10 seconds to complete each request, all such proxy web service calls will get stuck for 10 seconds as well. If there's high traffic during these 10 seconds, ASP.NET will run out of worker threads, it will not execute new requests, and the site will appear very slow to users. Moreover, if requests in the application queue are stuck for more than 30 seconds, they will time out and the site will become nonresponsive to users.

In Figure 7.1, you can see a production server's state when it has exceeded its limit. External web service calls are taking too long to execute, which makes the request execution time too high. Some requests are taking more than 200 seconds to complete. As a result, 72 requests are stuck in calling external services, and additional incoming requests are getting queued in the application queue. The number of requests completing successfully per second is very low, as shown in the Requests/Sec counter. Also, requests are waiting in the queue for more than 82 seconds to get a free worker thread.

Figure 7.1. When there are too many requests for ASP.NET to handle, Requests In Application Queue starts to increase and Requests/Sec decreases. High Request Execution Time shows how long external web services requests are stuck.


Real-Life: Fixing a Traffic Jam in the Request Pipeline

Problem: A popular widget took too long to execute, the web servers got stuck, and the web site was unusable.

Solution: Changed the proxy web service to an asynchronous HTTP handler.

One time at Pageflakes, the external stock quote service was taking too long to execute. Our web servers were all getting stuck. After they were restarted, the web servers would get stuck again within 10 minutes. The stock quote widget is very popular, and thousands of users have that widget on their page. So, as soon as they visited their page, the stock quote widget made a call via our proxy web service to fetch data from the external stock quote service. Requests to the proxy got stuck because the external web service was neither executing quickly nor timing out. Moreover, we had to use a high timeout because the external stock quote service is generally very slow. As a result, when we had a large traffic spike during the morning in the U.S., all our web servers repeatedly got stuck, and the web site became unusable. We had no choice but to disable the stock quote widget for an immediate solution. For a long-term solution, we had to change the stock quote proxy web service to an asynchronous HTTP handler because ASP.NET AJAX does not support asynchronous web methods.

Finding out what caused the web servers to get stuck was rather difficult. We went through the HTTP.sys error logs that are found under C:\windows\system32\Logfiles\HTTPERR. The logfiles were full of timeout entries on many different URLs including the stock quote service URL. So, we had to turn off each URL one at a time to figure out which one was the real culprit.

Asynchronous Web Methods

By default, all web methods declared on a web service are synchronous on the server side. The call from the browser via XML HTTP is asynchronous, but the actual execution of the web method at the server is synchronous. This means that from the moment a request comes in to the moment the response is generated from that web method call, it occupies a thread from the ASP.NET worker pool. If it takes a relatively long period of time for a request to complete, then the thread that is processing the request will be in use until the method call is done. Unfortunately, most lengthy calls are due to something like a long database query or perhaps a call to another web service. For instance, if you make a database call, the current thread waits for the database call to complete. The thread simply has to wait around doing nothing until it hears back from its query. Similar issues arise when a thread waits for a call to a TCP socket or a backend web service to complete.

When you write a typical ASP.NET web service using web methods, the compiler compiles your code to create the assembly that will be called when requests for its web methods are received. When your application is first launched, the ASMX handler reflects over the assembly to determine which web methods are exposed.

For normal synchronous requests, it is simply a matter of finding which methods have a [WebMethod] attribute associated with them.

To make asynchronous web methods, you need to ensure the following rules are met:

  • There is a BeginXXX and EndXXX web method where XXX is any string that represents the name of the method you want to expose.

  • The BeginXXX function returns an IAsyncResult interface and takes an AsyncCallback and an object as its last two input parameters, respectively.

  • The EndXXX function takes an IAsyncResult interface as its only parameter.

  • Both the BeginXXX and EndXXX methods must be flagged with the WebMethod attribute.

If the ASMX handler finds two methods that meet all these requirements, then it will expose the XXX method in its WSDL as if it were a normal web method.
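The signature rules above can be checked with plain reflection. The following is only a sketch of the idea, not the ASMX handler's actual code: AsyncPairFinder and SampleService are hypothetical names, and the [WebMethod] attribute check is left out so the code compiles without a reference to System.Web.Services.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class AsyncPairFinder
{
    // Returns the exposed name (the "XXX" part) of every BeginXXX/EndXXX pair
    // whose signatures follow the rules listed above.
    // Assumption: the [WebMethod] attribute check is omitted in this sketch.
    public static List<string> FindAsyncPairs(Type serviceType)
    {
        List<string> names = new List<string>();
        foreach (MethodInfo begin in serviceType.GetMethods())
        {
            if (!begin.Name.StartsWith("Begin", StringComparison.Ordinal)
                || begin.Name.Length == "Begin".Length) continue;

            // BeginXXX must return IAsyncResult
            if (begin.ReturnType != typeof(IAsyncResult)) continue;

            // The last two parameters must be AsyncCallback and object
            ParameterInfo[] p = begin.GetParameters();
            if (p.Length < 2
                || p[p.Length - 2].ParameterType != typeof(AsyncCallback)
                || p[p.Length - 1].ParameterType != typeof(object)) continue;

            // EndXXX must take an IAsyncResult as its only parameter
            string name = begin.Name.Substring("Begin".Length);
            MethodInfo end = serviceType.GetMethod("End" + name,
                new Type[] { typeof(IAsyncResult) });
            if (end != null) names.Add(name);
        }
        return names;
    }
}

// Hypothetical sample type used only to exercise the finder
public class SampleService
{
    public IAsyncResult BeginSleep(int milliseconds, AsyncCallback cb, object s) { return null; }
    public string EndSleep(IAsyncResult call) { return null; }
    public string Ping() { return "pong"; }
}
```

Calling AsyncPairFinder.FindAsyncPairs(typeof(SampleService)) yields a single entry, Sleep; Ping is ignored because it is not part of a Begin and End pair.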

Example 7-1, "Example of a synchronous web method" shows a typical synchronous web method, and Example 7-2, "Asynchronous web methods" shows how it is made asynchronous by introducing a Begin and End pair.

Example 7-1. Example of a synchronous web method

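[WebMethod]
public string Sleep(int milliseconds)
{
     // Blocks the ASP.NET worker thread for the whole duration
     Thread.Sleep(milliseconds);
     return "Done"; // the method is declared to return a string
}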

Example 7-2. Asynchronous web methods

[WebMethod]
public IAsyncResult BeginSleep(
                              int milliseconds,
                              AsyncCallback cb,
                              object s) {...}
[WebMethod]
public string EndSleep(IAsyncResult call) {...}

The ASMX handler will expose a web method named Sleep from the pair of web methods. The method accepts as input the parameters defined before the AsyncCallback parameter in the signature of BeginXXX, and it returns the value returned by the EndXXX function.

After the ASMX handler reflects on the compiled assembly and detects an asynchronous web method, it must handle requests for that method differently than it handles synchronous requests. Instead of calling the Sleep method synchronously and producing responses from the return value, it calls the BeginSleep method. It deserializes the incoming request into the parameters to be passed to the function (as it does for synchronous requests), but it also passes a reference to an internal callback function as the extra AsyncCallback parameter to the BeginSleep method.

After the ASMX handler calls the BeginSleep function, it will return the thread to the process thread pool so it can handle another request. The HttpContext for the request will not be released yet. The ASMX handler will wait until the callback function that it passed to the BeginSleep function is called to finish processing the request.

Once the callback function is called, a thread from the thread pool is taken out to execute the remaining work. The ASMX handler will call the EndSleep function so that it can complete any processing it needs to perform and return the data to be rendered as a response. Once the response is sent, the HttpContext is released (see Figure 7.2, "How the asynchronous web method works").

Figure 7.2. How the asynchronous web method works


The asynchronous web method concept is hard to grasp. It does not match with anything that we do in regular development. There are some fundamental differences and limitations to consider:

  • You cannot use asynchronous web methods when you use a business layer to read or write data that's not asynchronous itself. For example, a web method calling some function on DashboardFacade will not benefit from an asynchronous approach.

  • You cannot use the asynchronous method when you are calling an external web service synchronously. The external call must be asynchronous.

  • You cannot use the asynchronous method when you perform database operations using regular synchronous methods. All database operations must be asynchronous.

  • There's no benefit in making an asynchronous web method when there's no wait on some I/O operation such as HTTP requests, web service calls, remoting, asynchronous database operations, or asynchronous file operations. You won't benefit from simple Delegate.BeginInvoke calls, which run a function asynchronously, because asynchronous delegates take threads from the same thread pool as ASP.NET.

So, neither the simple sleep function in Example 7-1, "Example of a synchronous web method" nor any of the methods that we have used in our proxy web service (see Chapter 5, Building Client-Side Widgets) can be real asynchronous functions as they stand. We need to rewrite them to support asynchronous calls. Before we do so, remember one principle: you can only benefit from the asynchronous method when the BeginXXX web method ends up calling a BeginYYY method on some other component, and your EndXXX method calls that component's EndYYY method. Otherwise, there's no benefit in making web methods asynchronous.
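To make the principle concrete, here is a minimal sketch of a Begin and End pair that does not hold a thread while it waits. It is not the book's code: the class name SleepService is hypothetical, the [WebMethod] attributes are left out so that it compiles without System.Web, and it relies on Task.Delay and TaskCompletionSource, which assume .NET 4.5 or later rather than the .NET 3.5 that this chapter targets.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical class: a non-blocking counterpart of the Sleep web method
public class SleepService
{
    public IAsyncResult BeginSleep(int milliseconds, AsyncCallback cb, object state)
    {
        // The Task exposed by the completion source implements IAsyncResult,
        // and its AsyncState property returns the state passed in here.
        TaskCompletionSource<string> tcs = new TaskCompletionSource<string>(state);

        // Task.Delay is timer based, so no thread is blocked during the wait
        Task.Delay(milliseconds).ContinueWith(_ =>
        {
            tcs.SetResult("Slept for " + milliseconds + " ms");
            if (cb != null) cb(tcs.Task); // tell the caller that EndSleep can run
        });

        return tcs.Task; // returns immediately, long before the delay elapses
    }

    public string EndSleep(IAsyncResult call)
    {
        // The operation has already completed when the callback fires,
        // so reading Result here does not block.
        return ((Task<string>)call).Result;
    }
}
```

The caller passes a callback to BeginSleep and calls EndSleep from inside that callback, which mirrors what the ASMX handler does with BeginXXX and EndXXX.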

Example 7-3, "Example of a stock quote proxy web service" shows the code for a simple stock quote proxy web service. The proxy web service's BeginGetStock method ends up calling the BeginGetStock method on a component that fetches the stock data from an external source. When data arrives, the component calls back via the AsyncCallback cb. The ASMX handler passes down this callback to the web method. So, when it is called, ASP.NET's ASMX handler receives the callback, restores the HttpContext, calls EndGetStock, and renders the response.

Example 7-3. Example of a stock quote proxy web service

[WebService]
public class StockQuoteProxy : System.Web.Services.WebService
{
    [WebMethod]
    public IAsyncResult BeginGetStock(AsyncCallback cb, Object state)
    {
        net.stockquote.StockQuoteService proxy
            = new net.stockquote.StockQuoteService();
        return proxy.BeginGetStock("MSFT",
                                      cb,
                                      proxy);
    }
    [WebMethod]
    public string EndGetStock(IAsyncResult res)
    {
        net.stockquote.StockQuoteService proxy
            = (net.stockquote.StockQuoteService)res.AsyncState;
        string quotes = proxy.EndGetStock(res);
        return quotes;
    }
}

The problem is ASP.NET's ASMX handler has the capability to call asynchronous web methods and return threads to the ASP.NET thread pool, but ASP.NET AJAX Framework's ASMX handler does not have that capability. It supports only synchronous calls. So, we need to rewrite the ASMX handler of ASP.NET AJAX to support asynchronous web method execution and then bypass ASP.NET AJAX's ASMX handler when web methods are called via XML HTTP. In the next section, you will see how the ASP.NET AJAX Framework's ASMX handler works and how you can rewrite such a handler yourself and introduce new features to it.

Modifying the ASP.NET AJAX Framework to Handle Web Service Calls

When you make a web service call from the browser via the ASP.NET AJAX Framework, it uses XML HTTP to make a call to a server-side web service. Usually all calls to ASMX files are handled by ASP.NET's ASMX handler. But when you add ASP.NET AJAX to your web application, you need to make some changes in the web.config where you explicitly remove the default ASMX handler and add the ASP.NET AJAX Framework's own ScriptHandler as the ASMX handler (see Example 7-4, "ASP.NET AJAX handles all calls to ASMX").

Example 7-4. ASP.NET AJAX handles all calls to ASMX

<httpHandlers>
    <remove verb="*" path="*.asmx" />
    <add verb="*" path="*.asmx" validate="false"
        type="System.Web.Script.Services.ScriptHandlerFactory, System.Web.Extensions, Version=1.0.61025.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" />
</httpHandlers>

You also add a ScriptModule in the HTTP modules pipeline. It intercepts each and every HTTP request and checks whether the call is to an ASPX page and is calling a page method. It intercepts only page method calls, not web service calls. So, you don't need to bypass it.

ScriptHandler is a regular HTTP handler that finds out which web service and web method is called by parsing the URL. It then executes the web method by reflecting on the web service type. The steps involved in calling a web method are as follows:

  1. Confirm it's an Ajax web method call by checking Content-Type to see whether it has application/json. If not, raise an exception.

  2. Find out which .asmx is called by parsing the requested URL and getting the assembly that has the compiled code for the .asmx.

  3. Reflect on the assembly and find the web service class and method that represents the web method being called.

  4. Deserialize input parameters into the proper data type. In case of HTTP POST, deserialize the JSON graph.

  5. See the parameters in the method and map each parameter to objects that have been deserialized from JSON.

  6. Initialize the cache policy.

  7. Invoke the method via reflection and pass the parameter values that match from JSON.

  8. Get the return value. Serialize the return value into JSON/XML.

  9. Emit the JSON/XML as the response.

To add asynchronous web method call support, you need to first change the way it reflects on the assembly and locates the web method. It needs to call the Begin and End pair, instead of the real web method. You also need to make the handler implement the IHttpAsyncHandler interface and execute the Begin and End pair in BeginProcessRequest and EndProcessRequest.

But there's no step that facilitates .NET 2.0 transactions. The only way to implement them is to use System.EnterpriseServices transactions or to use the .NET 2.0 TransactionScope class yourself inside your web method code. .NET 2.0 introduced the new System.Transactions namespace, which has a much better way to handle transactions. It would be great if you could add a [Transaction] attribute in your web methods so they could work within a transaction managed by the ScriptHandler. But ScriptHandler does not deal with .NET 2.0 transactions.
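For reference, this is roughly what using TransactionScope by hand looks like. It is a minimal sketch rather than code from the accompanying source: TransactionSketch and RunInTransaction are hypothetical names, and the isolation level and the 10-second timeout are arbitrary values chosen for illustration.

```csharp
using System;
using System.Transactions;

public static class TransactionSketch
{
    // Runs the given work inside a new ambient transaction
    public static void RunInTransaction(Action work)
    {
        TransactionOptions options = new TransactionOptions();
        options.IsolationLevel = IsolationLevel.Serializable; // illustrative value
        options.Timeout = TimeSpan.FromSeconds(10);           // illustrative value

        using (TransactionScope scope =
            new TransactionScope(TransactionScopeOption.RequiresNew, options))
        {
            // Transaction.Current is set while the scope is alive, so
            // transaction-aware resources opened inside work() can enlist in it.
            work();

            // Reached only when work() did not throw. Without this call,
            // disposing the scope rolls the transaction back.
            scope.Complete();
        }
    }
}
```

Every web method that needs a transaction has to repeat this wrapping, which is the duplication a declarative attribute handled by the web service handler would remove.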

Initializing the Cache Policy

In the ASP.NET AJAX Framework, initialization of cache policy comes before invoking the web method. Example 7-5, "ScriptHandler's InitializeCachePolicy function initializes the cache settings before the web method is called" shows the ASP.NET AJAX 1.0 code for the InitializeCachePolicy function that sets the cache policy before invoking the web method.

Example 7-5. ScriptHandler's InitializeCachePolicy function initializes the cache settings before the web method is called

private static void InitializeCachePolicy(WebServiceMethodData methodData,
     HttpContext context) {
     int cacheDuration = methodData.CacheDuration;
     if (cacheDuration > 0) {
          context.Response.Cache.SetCacheability(HttpCacheability.Server);
          context.Response.Cache.SetExpires(DateTime.Now.AddSeconds(cacheDuration));
          context.Response.Cache.SetSlidingExpiration(false);
          context.Response.Cache.SetValidUntilExpires(true);

          if (methodData.ParameterDatas.Count > 0) {
               context.Response.Cache.VaryByParams["*"] = true;
          }
          else {
               context.Response.Cache.VaryByParams.IgnoreParams = true;
          }
      }
     else {
        context.Response.Cache.SetNoServerCaching(); 
        context.Response.Cache.SetMaxAge(TimeSpan.Zero);
     }
}

If you do not have cache duration set in the [WebMethod] attribute, it will set the MaxAge to zero. Once MaxAge is set to zero, it can no longer be increased; therefore, you cannot increase MaxAge from your web method code dynamically and thus make the browser cache the response.

Developing Your Own Web Service Handler

In this section, you will learn how to develop your own web service handler and overcome the limitations of the ASP.NET AJAX Framework's built-in ASMX handler. The first step is to add asynchronous method invocation support to web methods. The second is to add .NET 2.0 transactions to the synchronous method calls. Unfortunately, I haven't found a way to make asynchronous functions transactional. The third step is to set the cache policies after invoking the web method (be careful not to overwrite the cache policies that the web method has already set for itself). Finally, two minor modifications are needed: generating responses with a proper Content-Length header, which helps browsers optimize a response's download time by using persistent connections, and using less strict exception handling to prevent event logs from being flooded with errors.

Basics of Asynchronous Web Service Handlers

First you need to create an HTTP handler that will intercept all calls to web services. You need to map that handler to the *.asmx extension in web.config's <httpHandlers> section. By default, ASP.NET AJAX maps its own ScriptHandler to the *.asmx extension, so you will have to replace that with your own HTTP handler.

In the accompanying source code, the AJAXASMXHandler project is the new web service handler. ASMXHttpHandler.cs is the main HTTP handler class. The ASMXHttpHandler class implements IHttpAsyncHandler. When this handler is invoked during calls to web services, the ASP.NET Framework first calls BeginProcessRequest. In this function, the handler parses the requested URL and finds out which web service and web method to invoke (see Example 7-6, "The ASMXHttpHandler class's BeginProcessRequest function starts the execution of a request asynchronously").

Example 7-6. The ASMXHttpHandler class's BeginProcessRequest function starts the execution of a request asynchronously

IAsyncResult IHttpAsyncHandler.BeginProcessRequest(HttpContext context,
     AsyncCallback cb, object extraData)
{
     // Proper content-type header must be present to make an Ajax call
     if (!IsRestMethodCall(context.Request))
          return GenerateErrorResponse(context, "Not a valid AJAX call", extraData);

     string methodName = context.Request.PathInfo.Substring(1);

     WebServiceDef wsDef = WebServiceHelper.GetWebServiceType(context,
          context.Request.FilePath);
     WebMethodDef methodDef = wsDef.Methods[methodName];

     if (null == methodDef)
          return GenerateErrorResponse(context,
               "Web method not supported: " + methodName, extraData);

     // GET request will only be allowed if the method says so
     if (context.Request.HttpMethod == "GET" && !methodDef.IsGetAllowed)
          return GenerateErrorResponse(context, "Http Get method not supported",
               extraData);

     // If the method does not have a BeginXXX and EndXXX pair, execute it synchronously
     if (!methodDef.HasAsyncMethods)
     {

WebServiceDef is a class that wraps the Type class and contains information about a web service's type. It maintains a collection of WebMethodDef items, where each item contains the definition of a web method. WebMethodDef has the name of the method, the attributes associated with the method, whether it supports HTTP GET or not, and a reference to the Begin and End function pair, if there is one. If there's no Begin and End pair, the function is executed synchronously, as in Example 7-7, "BeginProcessRequest: synchronous execution of web methods when there's no Begin and End pair". Both of these classes are used to cache information about web services and web methods, so there's no need to repeatedly use reflection to discover the metadata.
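The caching idea can be sketched in a few lines. This is not the WebServiceDef or WebMethodDef implementation, which holds far more information; MethodCache, Find, and ReflectionPasses are hypothetical names, and the sketch keeps only one MethodInfo per method name.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical cache: reflects over a service type once and reuses the result
public static class MethodCache
{
    private static readonly Dictionary<Type, Dictionary<string, MethodInfo>> cache =
        new Dictionary<Type, Dictionary<string, MethodInfo>>();
    private static readonly object sync = new object();

    // Counts how many times reflection actually ran (for illustration only)
    public static int ReflectionPasses;

    public static MethodInfo Find(Type serviceType, string methodName)
    {
        Dictionary<string, MethodInfo> methods;
        lock (sync)
        {
            if (!cache.TryGetValue(serviceType, out methods))
            {
                // First request for this type: reflect once and remember the result
                ReflectionPasses++;
                methods = new Dictionary<string, MethodInfo>();
                foreach (MethodInfo m in serviceType.GetMethods(
                    BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly))
                {
                    methods[m.Name] = m; // overloads overwrite each other in this sketch
                }
                cache[serviceType] = methods;
            }
        }

        MethodInfo found;
        return methods.TryGetValue(methodName, out found) ? found : null;
    }
}
```

A real handler would also store the parsed attributes and the Begin and End pair alongside each entry, so that none of that work is repeated per request.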

Example 7-7. BeginProcessRequest: synchronous execution of web methods when there's no Begin and End pair

// If the method does not have a BeginXXX and EndXXX pair, execute it synchronously
if (!methodDef.HasAsyncMethods)
{
     // Do synchronous call
     ExecuteMethod(context, methodDef, wsDef);
     // Return a result that says method was executed synchronously
     return new AsmxHandlerSyncResult(extraData); 
}

BeginProcessRequest returns immediately when the method is executed synchronously. It returns an AsmxHandlerSyncResult instance that indicates the request has executed synchronously and there's no need to fire EndProcessRequest. AsmxHandlerSyncResult implements the IAsyncResult interface and returns true from the CompletedSynchronously property (see Example 7-8, "AsmxHandlerSyncResult implements IAsyncResult and returns true from the CompletedSynchronously property. It also returns a ManualResetEvent with state set to true indicating that the call has completed.").

Example 7-8. AsmxHandlerSyncResult implements IAsyncResult and returns true from the CompletedSynchronously property. It also returns a ManualResetEvent with state set to true indicating that the call has completed.

public class AsmxHandlerSyncResult : IAsyncResult
{
     private object state;
     private WaitHandle handle = new ManualResetEvent(true);
     public AsmxHandlerSyncResult(object state)
     {
          this.state = state;
     }
     object IAsyncResult.AsyncState { get { return this.state; } }
     WaitHandle IAsyncResult.AsyncWaitHandle { get { return this.handle; } }
     bool IAsyncResult.CompletedSynchronously { get { return true; } }
     bool IAsyncResult.IsCompleted { get { return true; } }
}

Going back to BeginProcessRequest, when there is a Begin and End pair, it calls the BeginXXX method of the web method and returns from the function. Execution goes back to the ASP.NET Framework, and it returns the thread to the thread pool.

Dynamically Instantiating a Web Service

Web services inherit from System.Web.Services.WebService, which implements the IDisposable interface. Activator.CreateInstance is a .NET Framework method that can dynamically instantiate any class from its type and return a reference to the object. In Example 7-9, "BeginProcessRequest: Preparing to invoke the BeginXXX web method on the web service", a web service class instance is created and held through an IDisposable reference, because we need to dispose of it when we are done.
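In isolation, the create, invoke, and dispose cycle looks like the following sketch. DynamicInvoker and EchoService are hypothetical names used only for illustration; the real handler resolves the method through its cached definitions rather than through Type.GetMethod.

```csharp
using System;
using System.Reflection;

public static class DynamicInvoker
{
    // Creates an instance of the given type, invokes one public method on it,
    // and disposes the instance if it implements IDisposable.
    public static object InvokeAndDispose(Type serviceType, string methodName,
        params object[] args)
    {
        object instance = Activator.CreateInstance(serviceType);
        try
        {
            // GetMethod(name) throws AmbiguousMatchException for overloaded
            // methods; this sketch assumes the name is unique.
            MethodInfo method = serviceType.GetMethod(methodName);
            if (method == null)
                throw new MissingMethodException(serviceType.FullName, methodName);

            return method.Invoke(instance, args);
        }
        finally
        {
            IDisposable disposable = instance as IDisposable;
            if (disposable != null) disposable.Dispose();
        }
    }
}

// Hypothetical stand-in for a web service class
public class EchoService : IDisposable
{
    public static int DisposeCount;
    public string Echo(string text) { return "echo: " + text; }
    public void Dispose() { DisposeCount++; }
}
```

The asynchronous path cannot use a simple try and finally like this, because the instance must stay alive until the EndXXX method has run. That is why the handler carries the target along in its state object and disposes of it later.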

Example 7-9, "BeginProcessRequest: Preparing to invoke the BeginXXX web method on the web service" shows the preparation step for calling the BeginXXX function on the web service. First, all the parameters are mapped from the request parameters except for the last two, which are the AsyncCallback and the object state.

Example 7-9. BeginProcessRequest: Preparing to invoke the BeginXXX web method on the web service

else
{
     // Create an instance of the web service
     IDisposable target = Activator.CreateInstance(wsDef.WSType) as IDisposable;

     // Get the BeginXXX method and extract its input parameters
     WebMethodDef beginMethod = methodDef.BeginMethod;
     int allParameterCount = beginMethod.InputParametersWithAsyc.Count;

     // Map HttpRequest parameters to BeginXXX method parameters
     IDictionary<string, object> inputValues = GetRawParams(context,
          beginMethod.InputParameters);
     object[] parameterValues = StrongTypeParameters(inputValues,
          beginMethod.InputParameters);

     // Prepare the list of parameter values, which also includes the AsyncCallback
     // and the state
     object[] parameterValuesWithAsync = new object[allParameterCount];
     Array.Copy(parameterValues, parameterValuesWithAsync, parameterValues.Length);

     // Populate the last two parameters with asynchronous callback and state
     AsyncWebMethodState webMethodState = new AsyncWebMethodState(methodName, target,
          wsDef, methodDef, context, extraData);

     parameterValuesWithAsync[allParameterCount - 2] = cb;
     parameterValuesWithAsync[allParameterCount - 1] = webMethodState;

Once the preparation is complete, the BeginXXX method is invoked. The BeginXXX method may complete synchronously and return immediately. In that case, you need to generate the response right out of BeginXXX and complete execution of the request. But if BeginXXX needs more time to execute asynchronously, then you need to return the execution to the ASP.NET Framework so that it can put the thread back into the thread pool. When the asynchronous operation completes, the EndProcessRequest function will be called back and you resume processing the request (see Example 7-10, "BeginProcessRequest: Invoke the BeginXXX function on the web service and return the IAsyncResult").

Example 7-10. BeginProcessRequest: Invoke the BeginXXX function on the web service and return the IAsyncResult

try
{
     // Invoke the BeginXXX method and ensure the return result has AsyncWebMethodState.
     // This state contains context and other information that we need to call
     // the EndXXX
     IAsyncResult result = beginMethod.MethodType.Invoke(target,
          parameterValuesWithAsync) as IAsyncResult;

     // If execution has completed synchronously within the BeginXXX function, then
     // generate response immediately. There's no need to call EndXXX
     if (result.CompletedSynchronously)
     {
          object returnValue = result.AsyncState;
          GenerateResponse(returnValue, context, methodDef);

          target.Dispose();
          return new AsmxHandlerSyncResult(extraData);
     }
     else
     {
          if (result.AsyncState is AsyncWebMethodState) return result;
          else throw new InvalidAsynchronousStateException("The state passed in the "
               + beginMethod.MethodName + " must inherit from "
               + typeof(AsyncWebMethodState).FullName);
     }
}
catch( Exception x )
{
     target.Dispose();
     WebServiceHelper.WriteExceptionJsonString(context, x, _serializer);
     return new AsmxHandlerSyncResult(extraData);
}

The EndProcessRequest function is fired when the asynchronous operation completes and the callback is fired. For example, if you call an external web service asynchronously inside the BeginXXX web method, you need to pass an AsyncCallback reference. This is the same callback that you receive on BeginProcessRequest. The ASP.NET Framework creates a callback reference for you that fires the EndProcessRequest on the HTTP handler. During the EndProcessRequest, you just need to call the EndXXX method of the web service, get the response, and generate output (see Example 7-11, "EndProcessRequest function of ASMXHttpHandler").

Example 7-11. EndProcessRequest function of ASMXHttpHandler

void IHttpAsyncHandler.EndProcessRequest(IAsyncResult result)
{
     if (result.CompletedSynchronously) return;

     AsyncWebMethodState state = result.AsyncState as AsyncWebMethodState;

     if (result.IsCompleted)
     {
          MethodInfo endMethod = state.MethodDef.EndMethod.MethodType;

          try
          {
               object returnValue = endMethod.Invoke(state.Target,
                    new object[] { result });
               GenerateResponse(returnValue, state.Context, state.MethodDef);
          }
          catch (Exception x)
          {
               WebServiceHelper.WriteExceptionJsonString(state.Context, x, _serializer);
          }
          finally
          {
               state.Target.Dispose();
          }

          state.Dispose();
     }
}

When the EndXXX web method completes, you will get a return value if the function is not a void type function. In that case, you need to convert the return value to a JSON string and return it to the browser. However, the method can also return an XML string instead of JSON. In that case, just write the string to the HttpResponse (see Example 7-12, "The GenerateResponse function of ASMXHttpHandler prepares the response JSON or the XML string according to the web method definition").

Example 7-12. The GenerateResponse function of ASMXHttpHandler prepares the response JSON or the XML string according to the web method definition

private void GenerateResponse(object returnValue, HttpContext context,
     WebMethodDef methodDef)
{
     string responseString = null;
     string contentType = "application/json";

     if (methodDef.ResponseFormat == System.Web.Script.Services.ResponseFormat.Json)
     {
          responseString = _serializer.Serialize(returnValue);
          contentType = "application/json";
     }
     else if (methodDef.ResponseFormat == System.Web.Script.Services.ResponseFormat.Xml)
     {
          responseString = returnValue as string;
          contentType = "text/xml";
     }

     context.Response.ContentType = contentType;

     // If we have a response and no redirection is happening and the client is
     // still connected, send response
     if (responseString != null
          && !context.Response.IsRequestBeingRedirected
          && context.Response.IsClientConnected)
     {
          // Convert the return value to response encoding, e.g., UTF-8
          byte[] unicodeBytes = Encoding.Unicode.GetBytes(responseString);
          byte[] utf8Bytes = Encoding.Convert(Encoding.Unicode,
                    context.Response.ContentEncoding, unicodeBytes);

          // Instead of Response.Write, which will convert the output to UTF-8,
          // use the internal stream
          // to directly write the UTF-8 bytes
          context.Response.OutputStream.Write(utf8Bytes, 0, utf8Bytes.Length);
     }
     else
     {
          // Send no body as response and abort it
          context.Response.AppendHeader("Content-Length", "0");
          context.Response.ClearContent();
          context.Response.StatusCode = 204; // No Content
     }
}

Basically, this is how a web method is executed synchronously and asynchronously and how the response is prepared. Although there are more complicated steps in preparing the web service and web method definitions, serializing and deserializing JSON, and mapping deserialized objects to the input parameters of the web method, I will skip these areas. You can review the code of the HTTP handler and learn in detail how all these work. A lot of code has been reused from ASP.NET AJAX; I also used the JSON serializer that comes with the Framework.

Adding Transaction Capability to Web Methods

Up to this point, the web method execution doesn't support transactions. The [TransactionalMethod] attribute defines the scope of the transaction to use, as well as the isolation level and timeout period (see Example 7-13, "An example of implementing a transactional web method").

Example 7-13. An example of implementing a transactional web method

[WebMethod]
[TransactionalMethod(
     TransactionOption=TransactionScopeOption.RequiresNew,
     Timeout=10,
     IsolationLevel=IsolationLevel.Serializable)]
public void TestTransactionCommit()
{
     Debug.WriteLine(string.Format(
          "TestTransactionCommit: Status: {0}, Isolation Level: {1}",
          Transaction.Current.TransactionInformation.Status,
          Transaction.Current.IsolationLevel));
     using (SqlConnection con = new SqlConnection(
          ConfigurationManager.ConnectionStrings["default"].ConnectionString))
     {
          con.Open();
          // A verbatim string (@"...") lets the SQL statement span several lines
          using (SqlCommand cmdInsert = new SqlCommand(@"INSERT INTO Widget
               (Name, Url, Description, CreatedDate, LastUpdate,
               VersionNo, IsDefault, DefaultState, Icon)
               VALUES ('', '', '', GETDATE(), GETDATE(), 0, 0, '', '');
               SELECT @@IDENTITY", con))
          {

               object id = cmdInsert.ExecuteScalar();

               using (SqlCommand cmdDelete = new SqlCommand(
                    "DELETE FROM Widget WHERE ID=" + id.ToString(), con))
               {
                    cmdDelete.ExecuteNonQuery();
               }
          }
     }
}

A web method that has the TransactionalMethod attribute will automatically execute inside a transaction. We will use .NET 2.0 transactions here. The transaction management is done entirely in the HTTP handler and thus the web method doesn't have to do anything. The transaction is automatically rolled back when the web method raises an exception; otherwise, the transaction is committed automatically.

The ExecuteMethod function of the ASMXHttpHandler invokes web methods synchronously and provides transaction support. Currently, transaction support for asynchronous methods has not been implemented because execution switches from one thread to another, so the TransactionScope is lost from the thread local storage (see Example 7-14, "The ExecuteMethod of ASMXHttpHandler invokes a web method synchronously within a transaction scope").

Example 7-14. The ExecuteMethod of ASMXHttpHandler invokes a web method synchronously within a transaction scope

private void ExecuteMethod(
     HttpContext context,
     WebMethodDef methodDef,
     WebServiceDef serviceDef)
{
IDictionary<string, object> inputValues =
     GetRawParams(context, methodDef.InputParameters);
object[] parameters =
     StrongTypeParameters(inputValues, methodDef.InputParameters);

object returnValue = null;
using (IDisposable target =
     Activator.CreateInstance(serviceDef.WSType) as IDisposable)
{
     TransactionScope ts = null;
     try
     {
          // If the method has a transaction attribute,
          // then call the method within a transaction scope
          if (methodDef.TransactionAtt != null)
          {
               TransactionOptions options = new TransactionOptions();
               options.IsolationLevel = methodDef.TransactionAtt.IsolationLevel;
               options.Timeout =
                    TimeSpan.FromSeconds( methodDef.TransactionAtt.Timeout );

               ts = new TransactionScope(
                    methodDef.TransactionAtt.TransactionOption, options);
          }

          returnValue = methodDef.MethodType.Invoke(target, parameters);

          // If transaction was used, then complete the transaction
          // because no exception was generated
          if( null != ts ) ts.Complete( );

          GenerateResponse(returnValue, context, methodDef);
     }

Example 7-14, "The ExecuteMethod of ASMXHttpHandler invokes a web method synchronously within a transaction scope" shows a web method executing properly and generating a response. The web method executes within a transaction scope defined in the TransactionalMethod attribute. But when the web method raises an exception, execution goes to the catch block, where an exception message is produced. Finally, the TransactionScope is disposed, and it checks whether it has already been committed. If not, the TransactionScope rolls back the transaction (see Example 7-15, "ExecuteMethod: When a web method raises an exception, the transaction is rolled back").

Example 7-15. ExecuteMethod: When a web method raises an exception, the transaction is rolled back

    catch (Exception x)
    {
          WebServiceHelper.WriteExceptionJsonString(context, x, _serializer);
    }
    finally
    {
         // If the transaction was started for the method, dispose the transaction.
         // This will roll back if not committed
         if( null != ts) ts.Dispose( );

         // Dispose the web service
         target.Dispose();
    }

The entire transaction management is inside the HTTP handler, so there's no need to worry about transactions in web services. Just add one attribute, and web methods become transaction enabled.

Adding Cache Headers

The previous section "Modifying the ASP.NET AJAX Framework to Handle Web Service Calls" described how ASP.NET AJAX initializes the cache policy before invoking the web method. Due to a limitation in HttpCachePolicy, once the MaxAge is set to a value, it cannot be increased. Because ASP.NET AJAX sets the MaxAge to zero, there's no way to increase that value from within the web method code. Moreover, if you use Fiddler or any other HTTP inspection tool to see responses returned from web service calls, you will see the responses are missing the Content-Length header. Without this header, browsers cannot use HTTP pipelining, which greatly improves the HTTP response download time.

Example 7-16, "The GenerateResponse function handles cache headers properly by respecting the cache policy set by the web method" shows some additions made to the GenerateResponse function to deal with the cache policy. The idea is to check whether the web method has already set a cache policy on the HttpResponse object; if it has, GenerateResponse does not change any cache setting. Otherwise, it looks at the WebMethod attribute for cache settings and then sets the cache headers.

Example 7-16. The GenerateResponse function handles cache headers properly by respecting the cache policy set by the web method

// If we have a response and no redirection is happening and the client is still
//connected, send response
if (responseString != null
     && !context.Response.IsRequestBeingRedirected
     && context.Response.IsClientConnected)
{
     // Produce proper cache headers. If no cache information is specified on the
     // method and no cache-related changes were made within the web method code,
     // then the default cache will be private, no cache.
     if (IsCacheSet(context.Response))
     {
       // Cache has been modified within the code; do not change any cache policy
     }
     else
     {
          // Cache is still private. Check to see if CacheDuration was set in WebMethod
          int cacheDuration = methodDef.WebMethodAtt.CacheDuration;
          if (cacheDuration > 0)
          {
               // If CacheDuration attribute is set, use server-side caching
               context.Response.Cache.SetCacheability(HttpCacheability.Server);
               context.Response.Cache.SetExpires(DateTime.Now.AddSeconds(cacheDuration));
               context.Response.Cache.SetSlidingExpiration(false);
               context.Response.Cache.SetValidUntilExpires(true);
               if (methodDef.InputParameters.Count > 0)
               {
                    context.Response.Cache.VaryByParams["*"] = true;
               }
               else
               {
                    context.Response.Cache.VaryByParams.IgnoreParams = true;
               }
          }
          else
          {
               context.Response.Cache.SetNoServerCaching();
               context.Response.Cache.SetMaxAge(TimeSpan.Zero);
          }
     }
     // Convert the response to response encoding, e.g., UTF-8
     byte[] unicodeBytes = Encoding.Unicode.GetBytes(responseString);
     byte[] utf8Bytes = Encoding.Convert(Encoding.Unicode,
          context.Response.ContentEncoding, unicodeBytes);

     // Emit content length in UTF-8 encoding string
     context.Response.AppendHeader("Content-Length", utf8Bytes.Length.ToString());

     // Instead of Response.Write, which will convert the output to UTF-8, use the
     // internal stream
     // to directly write the UTF-8 bytes       
     context.Response.OutputStream.Write(utf8Bytes, 0, utf8Bytes.Length);
}

The IsCacheSet function checks to see whether there's been any change in some of the common cache settings. If there has been a change, then the web method wants to deal with the cache itself, and GenerateResponse does not make any change to the cache policy (see Example 7-17, "The IsCacheSet function checks whether the cache policy has already been set by the web method").

Example 7-17. The IsCacheSet function checks whether the cache policy has already been set by the web method

private bool IsCacheSet(HttpResponse response)
{
     // Default is private. So, if it has been changed, then the web method
     // wants to handle cache itself
     if (response.CacheControl == "public") return true;

     // If maxAge has been set to a nondefault value, then the web method
     // wants to set maxAge itself.
     FieldInfo maxAgeField = response.Cache.GetType().GetField("_maxAge",
          BindingFlags.GetField | BindingFlags.Instance | BindingFlags.NonPublic);
     TimeSpan maxAgeValue = (TimeSpan)maxAgeField.GetValue(response.Cache);
     if (maxAgeValue != TimeSpan.Zero) return true;

     return false;
}

Real-Life: Exception Handling

Problem: The ASMX handler kept firing exceptions.

Solution: Used the reflection-based maxAge hack in the "Caching Web Service Responses on the Browser" section in Chapter 6, Optimizing ASP.NET AJAX.

On an earlier portal project I worked on, our web servers' event logs were being flooded with this error:

 Request format is unrecognized for URL unexpectedly ending in /SomeWebServiceMethod

In ASP.NET AJAX 1.0, Microsoft added a check that requires all web service calls to have Content-Type: application/json in the request headers. Unless this request header was present, the ASMX handler fired an exception. This exception was raised directly from the ScriptHandler, which handled all web service calls made via ASP.NET AJAX. This resulted in an UnhandledException, which was written to the event log.

This is done for security reasons; it prevents someone from feeding off your web services. For example, you might have a web service that returns some useful information that others want. So, anyone could just add a <script> tag pointing to that web service URL and get the JSON. If that web service is a very expensive web service in terms of I/O and/or CPU, then other web sites feeding off your web service could easily bog down your server.

Now, this backfires when you have HTTP GET supported web service calls that produce response headers to cache the response in the browser and proxy. For example, you might have a web method that returns stock quotes. You have used response caching so the browser caches the response of that web method, and repeated visits do not produce repeated calls to that costly I/O web service. Because it has a cache header, proxy gateways or proxy servers will see that their client users are requesting this frequently and it can be cached. So, they will make periodic calls to that web service and try to precache the headers on behalf of their client users. However, during the precache process, the proxy gateways or proxy servers will not send the Content-Type: application/json header. As a result, an exception is thrown and your event log is flooded.

This went undetected because there's no way to make an HTTP GET response cacheable on the browser from web service calls unless you do the reflection-based maxAge hack in the "Caching Web Service Responses on the Browser" section in Chapter 6, Optimizing ASP.NET AJAX.

So, the ASMXHttpHandler just returns HTTP 405 saying the call is not allowed if it does not have the application/json content type. This solves the event log flood problem and prevents browsers from getting a valid response when someone uses a <script> tag on your web method.
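The check itself is small. The following is a minimal sketch, not the handler's actual code: JsonRequestGuard and both of its methods are hypothetical names, and the sketch assumes the handler can read the request's Content-Type value (for example, from HttpRequest.ContentType).

```csharp
using System;

// Hypothetical helper, for illustration only
public static class JsonRequestGuard
{
    // True when the Content-Type value names application/json,
    // ignoring case and parameters such as "; charset=utf-8"
    public static bool IsJsonContentType(string contentType)
    {
        if (string.IsNullOrEmpty(contentType)) return false;

        // Keep only the media type, dropping any parameters
        int separator = contentType.IndexOf(';');
        string mediaType = separator >= 0
            ? contentType.Substring(0, separator)
            : contentType;

        return string.Equals(mediaType.Trim(), "application/json",
            StringComparison.OrdinalIgnoreCase);
    }

    // The status code the handler would answer with:
    // 405 (not allowed) unless the request carries the JSON content type
    public static int StatusCodeFor(string contentType)
    {
        return IsJsonContentType(contentType) ? 200 : 405;
    }
}
```

Inside a handler, the result of StatusCodeFor would be assigned to the response's StatusCode before any web method is invoked, so a precaching proxy or a <script> tag gets a 405 instead of triggering an exception.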

Using the Attributes

You have seen that the BeginXXX and EndXXX functions don't have the [WebMethod] attribute, but instead only have the [ScriptMethod] attribute. If you add the WebMethod attribute to them, the Ajax JavaScript proxy generator will unnecessarily generate function wrappers for those methods in the JavaScript proxy for the web service. So, for the JavaScript proxy generator, you need to put the WebMethod attribute only on the XXX web method. Moreover, you cannot have a WebMethod attribute on the BeginXXX, EndXXX, and XXX functions at the same time because the WSDL generator will fail to generate. So, the idea is to add the WebMethod attribute only to the XXX function, so that the JavaScript proxy generator generates a JavaScript function for the web method, and to add only the ScriptMethod attribute to the BeginXXX and EndXXX functions.

Handling the State Object

The last parameter passed to the BeginXXX function, the object state parameter, needs to be preserved. It contains a reference to the HttpContext, which is needed by the ASMX handler to call the EndXXX function on the proper context. So, if you create a custom state object and pass that to a BeginYYY function of some component, e.g., FileStream.BeginRead, then you need to inherit that custom state object from the AsyncWebMethodState class. You must pass the state parameter in the constructor. This way, your custom state object will carry the original state object that is passed down to your BeginXXX function.

Making an Asynchronous and Cache-Friendly Proxy

You can make proxy methods asynchronous by using the new Ajax ASMX handler. This will solve a majority of the proxy web service's scalability problems. Moreover, the proxy service will become cache-friendly for browsers, and they will be able to download responses faster from the proxy by using the Content-Length header.

The GetString and GetXml methods can become asynchronous very easily by using the HttpWebRequest class and its asynchronous methods. HttpWebRequest has the BeginGetResponse function, which works asynchronously. So, you just need to call BeginGetResponse in the BeginGetString method of the proxy (see Example 7-18, "A proxy's BeginGetString function asynchronously downloads responses from an external source").

Example 7-18. A proxy's BeginGetString function asynchronously downloads responses from an external source

private class GetStringState : AsyncWebMethodState
{
     public HttpWebRequest Request;
     public string Url;
     public int CacheDuration;
     public GetStringState(object state) : base(state) {}
}

[ScriptMethod]
public IAsyncResult BeginGetString(string url, int cacheDuration, AsyncCallback cb, object
state)
{
     // See if the response from the URL is already cached on server
     string cachedContent = Context.Cache[url] as string;
     if (!string.IsNullOrEmpty(cachedContent))
     {
          this.CacheResponse(Context, cacheDuration);
          return new AsmxHandlerSyncResult(cachedContent);
     }

     HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;

     GetStringState myState = new GetStringState(state);
     myState.Request = request;
     myState.Url = url;
     myState.CacheDuration = cacheDuration;

     return request.BeginGetResponse(cb, myState);
}

The BeginGetString method has two modes. It executes synchronously when the content is cached in the ASP.NET cache. In that case, there's no need to return the thread to the thread pool because the method can complete right away. If there isn't any content in the cache, it makes a BeginGetResponse call and returns execution to the ASMX handler. The custom state object, GetStringState, inherits from the AsyncWebMethodState defined in the AJAXASMXHandler project. In its constructor, it takes the original state object passed down to the BeginGetString function. The ASMXHttpHandler needs the original state so that it can fire the EndGetString function on the proper context.

When HttpWebRequest gets the response, it fires the ASMX handler's callback. The ASMX handler, in turn, calls EndGetString to complete the response. EndGetString downloads the response, caches it, and returns it as a return value (see Example 7-19, "The EndGetString method of a proxy web service").

Example 7-19. The EndGetString method of a proxy web service

[ScriptMethod]
public string EndGetString(IAsyncResult result)
{
     GetStringState state = result.AsyncState as GetStringState;

     HttpWebRequest request = state.Request;
     using( HttpWebResponse response =
           request.EndGetResponse(result) as HttpWebResponse )
     {
           using( StreamReader reader = new
                StreamReader(response.GetResponseStream( )) )
           { 
                string content = reader.ReadToEnd();
                state.Context.Cache.Insert(state.Url, content, null,
                     Cache.NoAbsoluteExpiration,
                     TimeSpan.FromMinutes(state.CacheDuration),
                     CacheItemPriority.Normal, null);

                // produce cache headers for response caching
                this.CacheResponse(state.Context, state.CacheDuration);

                return content;
            }
     }
}

Keep in mind that the Context object is unavailable in the EndGetString function because this function is fired on a different thread that is no longer tied to the original thread that initiated the HTTP request. So, you need to get a reference to the original Context from the state object.

Similarly, you can make GetRss asynchronous by introducing a BeginGetRss and EndGetRss pair.

Scaling and Securing the Content Proxy

As widgets start using the proxy service, described in Chapter 5, Building Client-Side Widgets, more and more, this single component will become the greatest scalability bottleneck of your entire web portal project. It's not unusual to spend a significant amount of development resources to improve scalability, reliability, availability, and performance on the content proxy. This section describes some of the challenges you will face when going live to millions of users with such a proxy component.

Maintaining Speed

Widgets call a proxy service to fetch content from an external source. The proxy service makes the call, downloads the response on the server, and then transmits the response back to the browser. There are two latencies involved here: between the browser and your server, and between your server and the destination. If the response's payload is large, say 500 KB, then there's almost 1 MB of transfer that takes place during the call. So, you need to put a limit on how much data transfer you allow from the proxy (see Example 7-20, "Putting a limit on how much data you will download from external sources via a HttpWebRequest"). The HttpWebResponse class has a ContentLength property that tells you how much data is being served by the destination. You can check whether it exceeds the maximum limit that you can take in. If widgets are requesting a large amount of data, it slows not only that specific request but also other requests on the same server, since the server's bandwidth is occupied during the megabyte transfer. Servers generally have 4 Mbps, 10 Mbps, or, if you can afford it, 100 Mbps connectivity to the Internet. At 10 Mbps, you can transfer about 1 MB per second. So, if one proxy call is occupied transferring megabytes, there's no bandwidth left for other calls to happen and bandwidth cost goes sky high. Moreover, during the large transfer, one precious HTTP worker thread is occupied streaming megabytes of data over a slow Internet connection to the browser. If a user is using a 56 Kbps dial-up line, a 1 MB transfer will occupy a worker thread for about 150 seconds.

Example 7-20. Putting a limit on how much data you will download from external sources via a HttpWebRequest

HttpWebResponse response = request.GetResponse() as HttpWebResponse;

if (response.StatusCode == HttpStatusCode.OK)
{
     int maxBytesAllowed = 512 * 1024; // 512 KB
     if (response.ContentLength > maxBytesAllowed)
     {
          // Refuse oversized responses before reading any of the body
          response.Close();
          throw new ApplicationException(
               "Response too big. Max bytes allowed to download is: "
               + maxBytesAllowed);
     }

     // Within the limit: go on to read the response stream
}

Sometimes external sources do not generate the content length header, so there's no way to know how much data you are receiving unless you download the entire byte stream from the server until the server closes the connection. This is a worst-case scenario for a proxy service because you have to download up to your maximum limit and then abort the connection. Example 7-21, "An algorithm for downloading external content safely" shows a general algorithm for dealing with this problem.

Example 7-21. An algorithm for downloading external content safely

Get content length from the response header.

If the content length is present, then
     Check if content length is within the maximum limit
     If content length exceeds maximum limit, abort

If the content length is not present
     While there are more bytes available to download
          Read a chunk of bytes, e.g., 512 bytes
          Count the total number of bytes read so far
          If the count exceeds the maximum limit, abort
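The algorithm above can be sketched in C# as follows. This is an illustration under assumptions, not the proxy's actual code: BoundedDownloader and ReadWithLimit are hypothetical names, and the caller is assumed to pass the value of HttpWebResponse.ContentLength, which is -1 when the source sends no Content-Length header. The sketch counts bytes even when a length is declared, since a source may report a smaller length than it actually sends.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only
public static class BoundedDownloader
{
    // Reads the stream in 512-byte chunks and aborts once maxBytes is exceeded.
    // declaredLength is the Content-Length reported by the source,
    // or -1 when the header is absent.
    public static byte[] ReadWithLimit(Stream source, long declaredLength, int maxBytes)
    {
        // Reject up front when the source declares an oversized body
        if (declaredLength > maxBytes)
            throw new InvalidOperationException(
                "Response too big. Max bytes allowed to download is: " + maxBytes);

        using (MemoryStream buffer = new MemoryStream())
        {
            byte[] chunk = new byte[512];
            int read;
            while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
            {
                // Count what has arrived so far and abort past the limit
                if (buffer.Length + read > maxBytes)
                    throw new InvalidOperationException(
                        "Response too big. Max bytes allowed to download is: " + maxBytes);

                buffer.Write(chunk, 0, read);
            }
            return buffer.ToArray();
        }
    }
}
```

A caller would pass response.GetResponseStream() and response.ContentLength, and close the response when the exception is thrown so that the connection is released.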

Connection management

Every call to the proxy makes it open an HTTP connection to the destination, download data, and then close it. Setting up an HTTP connection is expensive because there's network latency involved in establishing a connection between the browser and the server. If you are making frequent calls to the same domain, like Flickr.com, it will help to maintain an HTTP connection pool, just like an ADO.NET connection pool. You should keep connections open and reuse open connections when you have frequent requests going to the same external server. However, an HTTP connection pool is very complicated to build because, unlike SQL Servers in fast private networks, external servers are on the Internet, loaded with thousands of connections from all over the world, and are grumpy about holding a connection open for a long period. They are always eager to close down an inactive connection as soon as possible. So, it becomes quite challenging to keep HTTP connections open to frequently requested servers that are quite busy with other clients.

DNS resolution

DNS resolution is another performance obstacle. If your server is in the U.S., and a web site that you want to connect to has its DNS server in Australia, it's going to take about 1 second just to resolve the web site's IP. DNS resolution happens in each and every HttpWebRequest. There's no built-in cache in .NET that remembers the host's IP for some time. You can benefit from DNS caching if there's a DNS server in your data center. But that also flushes out the IP in an hour. So, you can benefit from maintaining your own DNS cache. Just a static thread-safe dictionary with the domain name as the key and the IP as the value will do. When you open HttpWebRequest, instead of using the URI that is passed to you, replace the domain name with the cached IP on the URI and then make the call. But remember to send the original domain as the host header's value.
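A minimal sketch of such a DNS cache follows. DnsCache is a hypothetical class, the time-to-live is an assumed tuning value, and the resolver is passed in as a delegate so that Dns.GetHostAddresses can be supplied in production and a stub in tests. It uses a plain Dictionary guarded by a lock, which also works on .NET 3.5.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical DNS cache, for illustration only
public class DnsCache
{
    private class Entry
    {
        public IPAddress Address;
        public DateTime ExpiresUtc;
    }

    private readonly Dictionary<string, Entry> _entries =
        new Dictionary<string, Entry>(StringComparer.OrdinalIgnoreCase);
    private readonly object _lock = new object();
    private readonly TimeSpan _timeToLive;
    private readonly Func<string, IPAddress[]> _resolver;

    public DnsCache(TimeSpan timeToLive, Func<string, IPAddress[]> resolver)
    {
        _timeToLive = timeToLive;
        _resolver = resolver;
    }

    // Returns the cached IP for the host, resolving it again
    // only when the entry is missing or has expired
    public IPAddress Resolve(string host)
    {
        lock (_lock)
        {
            Entry entry;
            if (_entries.TryGetValue(host, out entry)
                && entry.ExpiresUtc > DateTime.UtcNow)
            {
                return entry.Address;
            }
        }

        // Resolve outside the lock so a slow lookup
        // does not block requests for other hosts
        IPAddress[] addresses = _resolver(host);
        if (addresses == null || addresses.Length == 0)
            throw new InvalidOperationException("No address found for " + host);

        Entry fresh = new Entry();
        fresh.Address = addresses[0];
        fresh.ExpiresUtc = DateTime.UtcNow.Add(_timeToLive);

        lock (_lock)
        {
            _entries[host] = fresh;
        }
        return fresh.Address;
    }
}
```

A shared instance could be created as new DnsCache(TimeSpan.FromMinutes(30), Dns.GetHostAddresses). The resolved address then replaces the host part of the URI, for example through UriBuilder, while the original domain name is sent as the host header.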

The HttpWebRequest class has some parameters that can be tweaked for performance and scalability for a proxy service. For example, the proxy does not need any keep-alive connections. It can close connections as soon as a call is finished. In fact, it must do that or a server will run out of TCP sockets under a heavy load. A server can handle a maximum of 65,535 TCP connections at one time. However, your application's limit is smaller than that because there are other applications running on the server that need free TCP sockets. Besides closing a connection as soon as you are finished, you need to set a much lower Timeout value for HttpWebRequest. The default is 100 seconds, which is too high for a proxy that needs content to be served to a client in seconds. So, if an external service does not respond within 3 to 5 seconds, you can give up on it. With every second added to the timeout value, the risk of worker threads being jammed increases as well. ReadWriteTimeout is another property that is used when reading data from the response stream. The default is 300 seconds, which is too high; it should be as low as 1 second. If a Read call on the response stream gets stuck, not only is an open HTTP connection stuck but so is a worker thread on the ASP.NET pool. Moreover, if a response to a Read request takes more than a second, that source is just too slow and you should probably stop sending future requests to that source (see Example 7-22, "Optimizing the HttpWebRequest connection for a proxy").

Example 7-22. Optimizing the HttpWebRequest connection for a proxy

HttpWebRequest request = WebRequest.Create("http://... ") as HttpWebRequest;
request.Headers.Add("Accept-Encoding", "gzip");
request.AutomaticDecompression = DecompressionMethods.GZip;
request.AllowAutoRedirect = true;
request.MaximumAutomaticRedirections = 1;
request.Timeout = 15000;
request.Expect = string.Empty;
request.KeepAlive = false;
request.ReadWriteTimeout = 1000;

Most of the web servers now support gzip compression on response. Gzip compression significantly reduces the response size, and you should always use it. To receive a compressed stream, you need to send the Accept-Encoding:gzip header and enable AutomaticDecompression. The header will tell the source to send the compressed response, and the property will direct HttpWebRequest to decompress the compressed content. Although this will add some overhead to the CPU, it will significantly reduce bandwidth usage and the content's fetch time from external sources. For text content, like JSON or XML where there are repeated texts, you will get a 10 to 50 times speed gain while downloading such responses.

Avoiding Proxy Abuse

When someone uses your proxy to anonymously download data from external sources, it's called proxy abuse. Just like widgets, any malicious agent can download content from external sources via your proxy. Someone can also use your proxy to produce malicious hits on external servers. For example, a web site can download external content using your proxy instead of downloading it itself, because it knows it will benefit from all the optimization and server-side caching techniques you have done. So, anyone can use your site as their own external content cache server to save on DNS lookup time, benefit from connection pooling to your proxy servers, and bring down your server with additional load.

This is a really hard problem to solve. One easy way is to limit the number of connections per minute or day from a specific IP to your proxy service. Another idea is to check cookies for some secure token that you generate and send to the client side. The client will send back that secure token to the proxy server to identify itself as a legitimate user. But that can easily be misused if someone knows how to get the secure token. Putting a limit on the maximum content length is another way to prevent a large amount of data transfer. A combination of all these approaches can save your proxy from being overused by external web sites or malicious clients. However, you still remain vulnerable to some misuse all the time. You just have to pay for the additional hardware and bandwidth cost that goes into misuse and make sure you always have extra processing power and bandwidth to serve your own needs.
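The per-IP limit can be sketched as a fixed-window counter. IpRateLimiter is a hypothetical class, and the limit and window length are assumed values you would tune for your own traffic. The current time is passed in by the caller, which keeps the class easy to test.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical fixed-window rate limiter, for illustration only
public class IpRateLimiter
{
    private class Window
    {
        public DateTime StartUtc;
        public int Count;
    }

    private readonly Dictionary<string, Window> _windows =
        new Dictionary<string, Window>();
    private readonly object _lock = new object();
    private readonly int _maxRequests;
    private readonly TimeSpan _windowLength;

    public IpRateLimiter(int maxRequests, TimeSpan windowLength)
    {
        _maxRequests = maxRequests;
        _windowLength = windowLength;
    }

    // Returns true if the request may proceed,
    // false if the IP has used up its quota for the current window
    public bool TryAcquire(string ip, DateTime nowUtc)
    {
        lock (_lock)
        {
            Window window;
            if (!_windows.TryGetValue(ip, out window)
                || nowUtc - window.StartUtc >= _windowLength)
            {
                // First request from this IP, or the old window has elapsed
                window = new Window();
                window.StartUtc = nowUtc;
                _windows[ip] = window;
            }

            if (window.Count >= _maxRequests) return false;

            window.Count++;
            return true;
        }
    }
}
```

The proxy would call TryAcquire with the caller's IP address and DateTime.UtcNow before making any outgoing request and reject the call when it returns false. This sketch never removes old entries, so a real implementation would also need to prune IPs that have been idle for longer than the window.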

Defending Against Denial-of-Service Attacks

The proxy service is the single most vulnerable service on the whole project. It's so easy to bring down a site by maliciously hitting a proxy that most hackers will just ignore you, because you aren't worth the challenge.

Here's one way to bring down any site that has a proxy service:

  • Create a web page that accepts an HTTP GET call.

  • Make that page sleep for as long as possible.

  • Create a small client that will hit the proxy to make requests to that web page. Every call to that web page will make the proxy wait for a long time.

  • Find the timeout of the proxy and make the page sleep long enough that the proxy always times out on each call (this may take some trial and error).

  • Spawn 100 threads from your small client and make a call to the proxy from each thread to fetch content from that slow page. You will have 100 worker threads stuck on the proxy server. If the server has two processors, it will run out of worker threads and the site will become nonresponsive.

You can take this one step further by sleeping until timeout minus 1 second. After that sleep, start sending a small number of bytes to the response as slowly as possible. Find the ReadWriteTimeout value of the proxy on the network stream. This will prevent the proxy from timing out on the connection. When it's just about to give up, it will start getting some bytes and not abort the connection. Because it is receiving bytes within the ReadWriteTimeout, it will not time out on the Read calls. This way, you can make each call to the proxy go on for hundreds of seconds until the ASP.NET request times out. Spawn 100 threads and you will have 100 requests stuck on the server until they time out. This is the worst-case scenario for any web server.

To prevent such attacks, you need to restrict the number of requests allowed from a specific IP per minute, hour, and day. Moreover, you need to decrease the ASP.NET request timeout value in machine.config, e.g., you can set it to 15 seconds so that no request is stuck for more than 15 seconds, including calls to the proxy (see Example 7-23, "The machine.config setting for ASP.NET request timeout; set it as low as you can").

Example 7-23. The machine.config setting for ASP.NET request timeout; set it as low as you can

<system.web>
...
...
<httpRuntime executionTimeout="15" />
...
...
</system.web>

Another way to bog down your server is to produce unique URLs and make your proxy cache those URLs. For example, anyone can make your proxy hit https://msdn.microsoft.com/rss.xml?1 and keep adding some numbers in the query string to make the URL unique. No matter what you add on the query string, it will return the same feed. But because you are using the URL as the cache key, it will cache the large response returned from MSDN against each key. So, if you hit the proxy with 1 to 1,000 query strings, there will be 1,000 identical copies of the MSDN feed in the ASP.NET cache. This will put pressure on the server's memory, and other items will be purged from the cache. As a result, the proxy will start making repeated requests for those lost items and become significantly slower.

One way to prevent this is to set CacheItemPriority to Low for such items in the cache. It will prevent more important items in the cache from being purged. Moreover, you can maintain another dictionary where you store the content's MD5 hash as the key and the URL as the value. Before storing an item in the cache, calculate the content's MD5 hash and check if it's already in the dictionary. If it is, then this item is already cached, regardless of the URL. So, you can get the original cached URL from the hash dictionary and then use that URL as the key to get the cached content from the ASP.NET cache.
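The hash dictionary can be sketched as follows. ContentHashIndex and its members are hypothetical names, and the sketch covers only the lookup structure; storing the content in the ASP.NET cache under the returned URL is left to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical content-hash index, for illustration only
public class ContentHashIndex
{
    // Key: MD5 hash of the content; value: URL it was first cached under
    private readonly Dictionary<string, string> _urlByHash =
        new Dictionary<string, string>();
    private readonly object _lock = new object();

    public static string ComputeHash(string content)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(content));
            return Convert.ToBase64String(hash);
        }
    }

    // Returns the URL under which identical content is already cached.
    // If the content is new, registers the given URL and returns it.
    public string GetCanonicalUrl(string url, string content)
    {
        string hash = ComputeHash(content);
        lock (_lock)
        {
            string existing;
            if (_urlByHash.TryGetValue(hash, out existing)) return existing;

            _urlByHash[hash] = url;
            return url;
        }
    }
}
```

When GetCanonicalUrl returns a URL different from the one requested, the content is a duplicate and does not need a second cache entry. The index itself grows with every distinct response, so entries should be removed when the matching cache item expires, for instance from a cache-removal callback.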

Summary

In this chapter, you learned how to rewrite the Ajax ASMX handler to add asynchronous, transactional, cache-friendly, and faster web service response download capabilities than those provided by the ASP.NET AJAX 1.0 Framework. You also learned the scalability challenges of a proxy service and how to overcome them. The principles introduced here apply to many types of web services, and knowing these in advance will help you eliminate common bottlenecks.