Dino Esposito
Wintellect
July 10, 2003
Applies to:
Microsoft® ASP.NET
Summary: Offers a detailed description of the constituent components of the HTTP runtime and of the logic that drives the processing of individual requests directed to ASP.NET applications. Also examines the behavior of the worker process from the standpoints of the Web garden model and the newest IIS 6 process model, and shows all the steps by which an HTTP request becomes plain HTML text. (10 printed pages)
Contents
Introduction
Components of the ASP.NET Infrastructure
The Web Garden Model
The HTTP Pipeline
Temporary Files and Page Assemblies
Summary
Introduction
Reliability and performance are key requirements for all Web applications, no matter what the underlying platform. These two requirements, though, in a sense contrast with each other. For instance, to build a more reliable and robust application, you might want to separate the Web server from the physical application, which will therefore work out-of-process. But just working in a memory context distinct from the Web server process makes the application intrinsically slower. For this reason, any reasonable measures should be taken to ensure that the out-of-process code runs as fast as possible.
The Microsoft® ASP.NET runtime environment was architected according to design principles that place a premium on reliability and performance. The resultant ASP.NET process model comprises two system elements—an in-process connector living inside the Web server process and an external worker process. In addition, the ASP.NET runtime infrastructure is scalable enough to automatically exploit any selected processor on multi-processor hardware. This model, known as a Web garden, enables multiple worker processes to run at the same time, each on a distinct processor.
At the highest level of abstraction, the ASP.NET runtime has three qualifying attributes:
- Total decoupling between applications and the ASP.NET worker process. The lifetime of the application is in no way affected by the lifetime of the servicing worker process. In other words, worker processes can come and go while the application is up and running all the time.
- Although ASP.NET applications never run in-process within the Web server, in the most common scenarios, the overall performance is close to that of in-process applications.
- Built-in and configurable support for Web garden architectures. By simply looking at settings in the configuration file, the worker process can clone itself to exploit all CPUs that have affinity with the process. As a result, for most scenarios you get really close to linear scalability on multi-processor machines. (More on this later.)
In this article, we'll examine the constituent elements of the ASP.NET runtime environment and go over, step by step, the long and winding road by which a URL request becomes plain HTML text.
Unless explicitly stated, the following description refers to the default process model of ASP.NET—the only one available with Microsoft® Internet Information Services (IIS) 5.x.
Components of the ASP.NET Infrastructure
ASP.NET applications execute under the aegis of the hosting Web server. On Microsoft® Windows® server platforms, the Web server is represented by the IIS executable named inetinfo.exe. It is a native part of the Windows 2000 operating system and newer versions. Notice, though, that in Microsoft® Windows Server™ 2003, neither IIS nor ASP.NET is installed by default. You must add them by using the Add or Remove Programs applet in the Control Panel.
IIS is an unmanaged executable that provides an extensibility model based on ISAPI extension and filter modules. By writing such modules, developers can directly take over requests for specific resource types and hook up the current request at various predefined steps. Extensions and filters are DLLs that export a few functions with well-known names and signatures. Such plug-in components are registered and configured in the IIS metabase.
Only a few resource types that clients request are handled directly by IIS. For example, any incoming request for HTML pages, text files, JPEG and GIF images is processed by IIS itself. Requests for Active Server Pages (*.asp) files are resolved by invoking an ASP-specific extension module named asp.dll. Likewise, requests for ASP.NET resources (for example, *.aspx, *.asmx, *.ashx) are passed on to the ASP.NET ISAPI extension. This system component is a Win32 DLL named aspnet_isapi.dll. The ASP.NET extension handles various resource types, including Web services and HTTP handler calls.
The ASP.NET ISAPI extension is a Win32 DLL and does not host managed code. It is the central console that receives and dispatches requests for a variety of ASP.NET resources. By design, the module lives in the IIS process and runs under the SYSTEM account with administrative privileges. This account cannot be modified by developers or system administrators. The ASP.NET ISAPI extension is responsible for invoking the ASP.NET worker process (aspnet_wp.exe), which, in turn, controls the execution of the request. In addition to request routing, the ASP.NET ISAPI monitors the health of the worker process and is responsible for killing it when the performance degrades beyond a certain threshold.
The worker process is a small Win32 shell of code that hosts the common language runtime (CLR) and runs managed code. It takes care of servicing requests for ASPX, ASMX, and ASHX resources. There is generally only one instance of this process on a given machine. All currently active ASP.NET applications run inside of it, each in a separate AppDomain. As mentioned earlier, though, the worker process supports the Web garden mode, meaning that identical copies of the process run on all of the CPUs with affinity to the process. (More on this later in "The Web Garden Model" section.)
The communication between the ISAPI and the worker process is conducted using a set of named pipes. A named pipe is a Win32 mechanism for transferring data over process boundaries. As the name suggests, a named pipe works like a pipe: you enter data in one end, and the same data comes out the other end. Pipes can connect both local processes and processes running on remote machines. For local interprocess communications, pipes are the most efficient and flexible tool available in Windows.
To guarantee optimal performance, aspnet_isapi uses asynchronous named pipes to forward requests to the worker process and to get responses. On the other hand, the worker process exploits synchronous pipes when it needs to query for information about the IIS environment (that is, server variables). The aspnet_isapi module creates a fixed number of named pipes and uses overlapped operations to service simultaneous connections through a small pool of threads. When a pipe-driven data exchange operation has finished, the completion routine disconnects the client and reuses the pipe instance to serve a new one. The pool of threads and overlapped operations guarantee a good level of performance to the ASP.NET ISAPI. In no case, though, does the aspnet_isapi extension process the HTTP request.
The logic behind the processing of each ASP.NET request can be summarized in the following steps.
- When the request arrives, IIS examines the resource type and calls into the ASP.NET ISAPI extension. If the default process model is enabled, aspnet_isapi queues the request and assigns it to the worker process. Any request data is sent through asynchronous I/O. If the IIS 6 process model is enabled, the request is automatically queued to the worker process (w3wp.exe) handling the IIS application pool to which the application belongs. The IIS 6 worker process doesn't know anything about ASP.NET and managed code. It is limited to processing the *.aspx extension and loading the aspnet_isapi module. When the ASP.NET ISAPI works under the IIS 6 process model, it behaves differently and just loads the CLR in the context of the w3wp.exe worker process.
- After receiving the request, the ASP.NET worker process notifies the ASP.NET ISAPI that it is going to serve it. The notification takes place through synchronous I/O. The synchronous model is used because for consistency the worker process can't start processing a request that is not yet marked as "executing" in the ISAPI's internal requests table. A request that is being serviced by a particular worker process cannot be reassigned to a different process unless the original one dies.
- The request executes in the context of the worker process. There might be circumstances in which the worker process needs to call the ISAPI back in order to complete the request (that is, to enumerate server variables). In this case, the worker process uses synchronous pipes because this preserves the sequence of the request-processing logic.
- When finished, the response is sent to aspnet_isapi by opening an asynchronous pipe. The state of the request now changes to "Done"; later on the request will be removed from the table. If the worker process crashes, all the requests it was handling remain in the "executing" state for a while. When aspnet_isapi detects that the worker process is dead, it automatically aborts those requests and frees any associated IIS resources.
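The bookkeeping behind these steps can be modeled as a small state table. What follows is only a toy sketch in Python, not the actual aspnet_isapi implementation: the `RequestTable` class, its method names, and the `QUEUED` and `ABORTED` state labels are assumptions made for illustration. Only the "executing" and "Done" states are named in the steps above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class State(Enum):
    QUEUED = "queued"        # assumed label: request assigned, not yet acknowledged
    EXECUTING = "executing"  # worker has notified the ISAPI that it is serving it
    DONE = "done"            # response sent back to the ISAPI
    ABORTED = "aborted"      # assumed label: owning worker process died


@dataclass
class RequestTable:
    """Hypothetical stand-in for the ISAPI's internal requests table."""
    states: Dict[int, State] = field(default_factory=dict)
    owner: Dict[int, int] = field(default_factory=dict)  # request id -> worker pid

    def enqueue(self, request_id: int, worker_pid: int) -> None:
        self.states[request_id] = State.QUEUED
        self.owner[request_id] = worker_pid

    def mark_executing(self, request_id: int) -> None:
        # The worker may not start until the table records the request as executing.
        if self.states[request_id] is not State.QUEUED:
            raise ValueError("only a queued request can start executing")
        self.states[request_id] = State.EXECUTING

    def complete(self, request_id: int) -> None:
        if self.states[request_id] is not State.EXECUTING:
            raise ValueError("only an executing request can complete")
        self.states[request_id] = State.DONE

    def worker_died(self, worker_pid: int) -> List[int]:
        """Abort every request the dead worker was still executing."""
        aborted = [rid for rid, pid in self.owner.items()
                   if pid == worker_pid and self.states[rid] is State.EXECUTING]
        for rid in aborted:
            self.states[rid] = State.ABORTED
        return aborted
```

In this model, a request that was completed before the crash stays in the done state, while requests still executing on the dead worker are the ones that get aborted.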
The description above refers to the default ASP.NET process model, which was built to work on IIS 5.x. The default working mode of IIS 6 (available with Windows Server 2003) affects the ASP.NET process model too. When hosted on IIS 6.0, ASP.NET 1.1 automatically adapts to the host environment. The aspnet_wp worker process is no longer used, and some of the configuration parameters defined in the machine.config file are ignored. From the ASP.NET perspective, the big change with IIS 6 is that everything about a request takes place under the control of aspnet_isapi and in the context of the w3wp.exe worker process. The account of the worker process is the account set for the application pool that the Web application belongs to. By default, this account is NETWORK SERVICE, a built-in, weak account functionally equivalent to ASPNET.
The worker process is subject to a feature named process recycling. Process recycling is the ability of aspnet_isapi to automatically start a new process when the existing one consumes too much memory, responds too slowly, or just hangs. When this happens, new requests are serviced by the new instance, which becomes the new active process. However, all the requests assigned to the old process remain pending. When the old process has finished with pending requests and enters idle state, it is terminated. If the worker process crashes, or in any way stops processing the requests, all pending requests are reassigned to a new process.
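The handover between the old and the new process can be illustrated with a toy model. This is a sketch under stated assumptions, not ASP.NET code: the `Recycler` and `Worker` names are hypothetical, and a single memory threshold stands in for the several health criteria mentioned above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Worker:
    pid: int
    pending: int = 0  # requests assigned to this process and not yet finished


class Recycler:
    """Hypothetical model of process recycling driven by a memory limit."""

    def __init__(self, memory_limit_mb: int) -> None:
        self.memory_limit_mb = memory_limit_mb
        self._next_pid = 1
        self.active = self._spawn()
        self.retiring: List[Worker] = []

    def _spawn(self) -> Worker:
        worker = Worker(pid=self._next_pid)
        self._next_pid += 1
        return worker

    def assign(self) -> Worker:
        """New requests always go to the currently active process."""
        self.active.pending += 1
        return self.active

    def check_health(self, memory_mb: int) -> None:
        """Start a replacement when the active process uses too much memory."""
        if memory_mb > self.memory_limit_mb:
            self.retiring.append(self.active)
            self.active = self._spawn()
            self._reap()

    def finish(self, worker: Worker) -> None:
        worker.pending -= 1
        self._reap()

    def _reap(self) -> None:
        # A retiring process is terminated once it has no pending requests.
        self.retiring = [w for w in self.retiring if w.pending > 0]
```

The old process keeps its pending requests and disappears from the model only after the last of them has finished.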
Although the ASP.NET ISAPI and the worker process are the key components of the ASP.NET runtime infrastructure, other executables contribute to its working. The following table lists all of these components.
Table 1. Executables that form the ASP.NET runtime environment
Name | Type | Account |
---|---|---|
aspnet_isapi.dll | Win32 DLL (ISAPI extension) | LOCAL SYSTEM |
aspnet_wp.exe | Win32 EXE | ASPNET |
aspnet_filter.dll | Win32 DLL (ISAPI filter) | LOCAL SYSTEM |
aspnet_state.exe | Win32 NT Service | ASPNET |
The aspnet_filter.dll component is a small Win32 ISAPI filter that supports cookieless session state for ASP.NET applications. In Windows Server 2003, when the IIS 6 process model is enabled, aspnet_filter.dll also filters out requests for non-executable resources located in the Bin directory.
The role of aspnet_state.exe is more vital to Web applications, as it has to do with session state management. It is an optional service that can be used to store session state data outside of the Web application memory space. The executable is an NT service and can be run either locally or remotely. When the service is active, an ASP.NET application can be configured to store any session information in the memory of this process. Such a scheme provides more reliable storage, because the data is not subject to process recycling and ASP.NET application failures. The service runs under the ASPNET local account, but this can be configured using the Service Control Manager interface.
Although not strictly part of the infrastructure, another executable that should be mentioned is aspnet_regiis.exe. The utility configures the environment for side-by-side execution of different ASP.NET versions on a single computer. It is also helpful for repairing broken IIS and ASP.NET configurations. The utility works by updating the script maps stored in the IIS metabase root and below. A script map is a set of associations between resource types and ASP.NET modules. Finally, the tool can also be used to display the status of all installed versions of ASP.NET, and to perform other configuration operations, such as granting NTFS permissions to specific folders and creating client-script directories.
The Web Garden Model
The Web garden model is configurable through the <processModel> section of the machine.config file. Notice that the <processModel> section is the only configuration section that cannot be placed in an application-specific web.config file. This means that the Web garden mode applies to all applications running on the machine. However, by using the <location> node in the machine.config source, you can adapt machine-wide settings on a per-application basis.
Two attributes in the <processModel> section affect the Web garden model. They are webGarden and cpuMask. The webGarden attribute takes a Boolean value that indicates whether or not multiple worker processes (one per affinitized CPU) have to be used. The attribute is set to false by default. The cpuMask attribute stores a DWORD value whose binary representation provides a bit mask for the CPUs that are eligible to run the ASP.NET worker process. The default value is -1 (0xFFFFFFFF), which means that all available CPUs can be used. The cpuMask attribute is ignored when the webGarden attribute is false. The cpuMask attribute also sets an upper bound on the number of copies of aspnet_wp.exe that are running.
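To make the bit-mask reading concrete, here is a minimal Python sketch that lists the CPUs a given mask selects. It assumes the usual affinity-mask convention in which bit n stands for CPU n; the `eligible_cpus` function is a hypothetical helper, not part of ASP.NET.

```python
from typing import List


def eligible_cpus(cpu_mask: int, cpu_count: int) -> List[int]:
    """Return the zero-based CPU indexes selected by a 32-bit mask."""
    # Reinterpret the value as an unsigned DWORD, so -1 becomes 0xFFFFFFFF.
    mask = cpu_mask & 0xFFFFFFFF
    return [cpu for cpu in range(cpu_count) if mask & (1 << cpu)]
```

On a four-CPU machine, a mask of -1 selects CPUs 0 through 3, while a mask of 0b1101 selects CPUs 0, 2, and 3. The number of set bits within the machine's CPU count is therefore the upper bound on the number of worker processes.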
The old motto "not everything that shines is gold" is an apt quotation here. Web gardening enables multiple worker processes to run at the same time. However, you should note that all processes will have their own copy of application state, in-process session state, ASP.NET cache, static data, and all that is needed to run applications. When the Web garden mode is enabled, the ASP.NET ISAPI launches as many worker processes as there are CPUs, each a full clone of the next (and each affinitized with the corresponding CPU). To balance the workload, incoming requests are partitioned among running processes in a round-robin manner. Worker processes get recycled as in the single processor case. Note that ASP.NET inherits any CPU usage restriction from the operating system and doesn’t include any custom semantics for doing this.
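The consequence of per-process state is easy to see in a toy model. The sketch below is an illustration in Python, not ASP.NET code: `WebGarden` and `WorkerProcess` are hypothetical names, and a dictionary stands in for in-process state such as the ASP.NET cache.

```python
from itertools import cycle
from typing import Dict, List


class WorkerProcess:
    """Hypothetical worker affinitized to one CPU, with its own private state."""

    def __init__(self, cpu: int) -> None:
        self.cpu = cpu
        self.cache: Dict[str, int] = {}  # per-process copy; never shared

    def handle(self, key: str) -> int:
        self.cache[key] = self.cache.get(key, 0) + 1
        return self.cache[key]


class WebGarden:
    """Dispatches requests to one worker per CPU in round-robin order."""

    def __init__(self, cpus: List[int]) -> None:
        self.workers = [WorkerProcess(cpu) for cpu in cpus]
        self._next = cycle(self.workers)

    def dispatch(self, key: str) -> WorkerProcess:
        worker = next(self._next)
        worker.handle(key)
        return worker
```

With two workers, four requests that each increment a "hits" counter leave every process with a count of 2. No process ever sees the total of 4, which is exactly the issue for applications that rely on in-process state.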
All in all, the Web garden model is not necessarily a big win for all applications. The more stateful applications are, the more they risk paying in terms of real performance. Working data is stored in blocks of shared memory so that any changes entered by a process are immediately visible to others. However, for the time it takes to service a request, working data is copied in the context of the process. Each worker process, therefore, will handle its own copy of working data, and the more stateful the application, the higher the cost in performance. In this context, careful and savvy application benchmarking is an absolute must.
Changes made to the <processModel> section of the configuration file are effective only after IIS is restarted. In IIS 6, Web gardening parameters are stored in the IIS metabase; the webGarden and cpuMask attributes are ignored.
The HTTP Pipeline
When the ASP.NET ISAPI extension starts the worker process up, it passes a few command-line parameters. The worker process uses these parameters to perform tasks that need to happen before the CLR is loaded. Among the values passed are the required authentication level for COM and DCOM security, the number of named pipes available for use, and the IIS process ID. The names of the named pipes are randomly generated using the IIS process ID and the number of pipes allowed. The worker process does not receive the names of the available pipes but does receive the information sufficient to figure them out.
What do COM and DCOM security have to do with the Microsoft® .NET Framework? Actually, the CLR is exposed as a COM object. More exactly, the CLR itself is not made of COM code, but the interface to the CLR is a COM object. The worker process, therefore, loads the CLR up as if it were a COM object.
When an ASPX request hits IIS, the Web server assigns a token based on the authentication model of choice: anonymous, Windows, Basic, or Digest. This token is passed along to the worker process when it receives the request to process. The request is picked up by a thread within the worker process. This thread inherits the identity token from the IIS thread that originally picked the incoming request up. In the context of aspnet_wp.exe, the actual account in charge of processing the request depends on how impersonation is configured in the particular ASP.NET application. If impersonation is disabled (the default setting), the thread runs under the account of the worker process. In the default case, this account is ASPNET in the ASP.NET process model and NETWORK SERVICE in the IIS 6 process model. Both are "weak" accounts that provide a limited set of capabilities and effectively fend off revert-to-self attacks. (A revert-to-self attack consists of reverting the security token of the impersonated client to the token of the parent process. Giving the worker process a weak account makes such attacks fail.)
At the highest level of abstraction, the ASP.NET worker process accomplishes one main task: handing the request over to a chain of managed objects dubbed the HTTP pipeline. The HTTP pipeline is activated by creating a new instance of the HttpRuntime class and then calling its ProcessRequest method. As mentioned, in ASP.NET you have a single worker process running all the time (unless the Web garden model is enabled) that manages all Web applications in distinct AppDomains. Each AppDomain has its own instance of the HttpRuntime class, which is the entry point in the pipeline. The HttpRuntime object initializes a number of internal objects that will help carry the request out. Helper objects include the cache manager (the Cache object) and the internal file system monitor used to detect changes in the source files that form the application. The HttpRuntime creates the context for the request and fills it up with any HTTP information specific to the request. The context is represented by an instance of the HttpContext class.
Another helper object that gets created at such an early stage of the HTTP runtime setup is the text writer—to contain the response text for the browser. The text writer is an instance of the HttpWriter class and is the object that actually buffers any text programmatically sent out by the code in the page. Once the HTTP runtime is initialized, it finds an application object to fulfill the request. An application object is an instance of the HttpApplication class—the class behind the global.asax file. The global.asax is optional at the programming level but strictly needed at the infrastructure level. For this reason, a default object must be used if no class has been constructed in the application, and the ASP.NET runtime comprises a couple of intermediate factory classes that are expected to find and return a valid handler object to service the request. The first factory class that gets in the game is HttpApplicationFactory. Its main task consists of using the URL information to find a match between the virtual directory of the URL and a pooled HttpApplication object.
The behavior of the application factory class can be outlined as follows:
- The factory class maintains a pool of HttpApplication objects and uses them to service requests for the application. The pool lives as long as the application does.
- When the first request for the application arrives, the factory class extracts information about the type of the application (the global.asax class), sets up file monitoring for changes, creates the application state, and fires the Application_OnStart event.
- The factory picks up an HttpApplication instance from the pool and charges it with the request to process. If no objects are available, a new HttpApplication object is created. The creation of an HttpApplication object entails the compilation of the global.asax application file.
- The HttpApplication begins processing the request and won't be available for new requests until the request completes. Should new requests for the same resource come in, they will be handled by other objects in the pool.
- The application object gives all registered HTTP modules a chance to preprocess the request and figures out what type of handler can best handle the request. It does this by looking at the extension of the URL requested and the information in the configuration file.
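The pooling behavior outlined above can be sketched as follows. This is a toy Python model for illustration only: `ApplicationFactory` and `PooledApplication` are hypothetical names that mimic the described behavior, not the real HttpApplicationFactory API, and compilation of global.asax is left out.

```python
from typing import List


class PooledApplication:
    """Hypothetical stand-in for an HttpApplication instance."""

    def __init__(self, serial: int) -> None:
        self.serial = serial


class ApplicationFactory:
    """Toy model of the pooling behavior described for the application factory."""

    def __init__(self) -> None:
        self.free: List[PooledApplication] = []
        self.created = 0
        self.events: List[str] = []

    def acquire(self) -> PooledApplication:
        if self.created == 0:
            # First request for the application: one-time startup work.
            self.events.append("Application_OnStart")
        if self.free:
            return self.free.pop()
        # No idle object available, so a new one is created.
        self.created += 1
        return PooledApplication(self.created)

    def release(self, app: PooledApplication) -> None:
        # The object becomes available again once its request completes.
        self.free.append(app)
```

Two overlapping requests get two distinct objects, a later request reuses whichever object was released, and the startup event is recorded only once.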
HTTP handlers are classes that implement the IHttpHandler interface. The .NET Framework provides a few predefined handlers for common types of resources, including ASPX pages and Web services. The <httpHandlers> section of the machine.config file defines the name of the class that the HttpApplication object must instantiate to serve a request for the particular type of resource. If the helper class is a handler factory, the GetHandler method will actually determine the type of handler to use. At this point, a handler of the proper type is picked up from a pool of similar objects and configured to process the request.
The IHttpHandler interface features two members: the IsReusable property and the ProcessRequest method. The former returns a Boolean value that indicates whether the handler can be pooled. (Most of the predefined handlers are pooled, but you could define your own that require a new instance each time.) The ProcessRequest method contains all of the logic needed to process a resource of a particular type. For example, the handler for ASPX pages is based on the following pseudo code:
private void ProcessRequest()
{
    // Determine whether the request comes as a postback
    IsPostBack = DeterminePostBackMode();

    // Fire the Page_Init event to the ASPX source code
    PageInit();

    // Load the viewstate and process posted values
    if (IsPostBack)
    {
        LoadPageViewState();
        ProcessPostData();
    }

    // Fire the Page_Load event to the ASPX source code
    PageLoad();

    // 1) Process posted values for the second time (in case of
    //    dynamically created controls)
    // 2) Raise property-changed server-side events to input-driven
    //    controls (i.e., the checkbox state changed)
    // 3) Execute any code associated with the postback event
    if (IsPostBack)
    {
        ProcessPostDataSecondTry();
        RaiseChangedEvents();
        RaisePostBackEvent();
    }

    // Fire the Page_PreRender event to the ASPX source code
    PreRender();

    // Save the current state of the controls to the viewstate
    SavePageViewState();

    // Render the contents of the page to HTML
    RenderControl(CreateHtmlTextWriter(Response.Output));
}
The model based on HTTP handlers is the same no matter the type of resource that is invoked. The only element that varies with the resource type is the handler. The HttpApplication object is responsible for finding out which handler should be used to process the request. The HttpApplication object is also responsible for detecting changes to the dynamically created assemblies that represent the resource, be it an .aspx page or an .asmx Web service. If any changes are detected, the application object ensures that the up-to-date source for the requested resource is compiled and loaded.
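The selection of a handler by URL extension, together with the pooling decision driven by IsReusable, can be modeled roughly as follows. The sketch is in Python and every name in it (`HandlerRegistry`, `PageHandler`, `OneShotHandler`, the `.once` extension) is hypothetical. It mirrors the idea of the extension-to-class mapping held in configuration, not the real ASP.NET classes.

```python
import posixpath
from typing import Dict, List, Type


class Handler:
    """Minimal analogue of a handler: a reusability flag plus a process method."""
    is_reusable = True

    def process_request(self, path: str) -> str:
        raise NotImplementedError


class PageHandler(Handler):
    def process_request(self, path: str) -> str:
        return f"<html>rendered {path}</html>"


class OneShotHandler(Handler):
    is_reusable = False  # a fresh instance is required for every request

    def process_request(self, path: str) -> str:
        return f"handled {path} once"


class HandlerRegistry:
    """Maps URL extensions to handler classes and pools reusable instances."""

    def __init__(self, mapping: Dict[str, Type[Handler]]) -> None:
        self.mapping = mapping
        self.pool: Dict[str, List[Handler]] = {}

    @staticmethod
    def _extension(path: str) -> str:
        return posixpath.splitext(path)[1].lower()

    def get_handler(self, path: str) -> Handler:
        ext = self._extension(path)
        if ext not in self.mapping:
            raise LookupError(f"no handler registered for {ext!r}")
        free = self.pool.setdefault(ext, [])
        return free.pop() if free else self.mapping[ext]()

    def release(self, path: str, handler: Handler) -> None:
        # Only handlers that declare themselves reusable go back to the pool.
        if handler.is_reusable:
            self.pool.setdefault(self._extension(path), []).append(handler)
```

A released page handler is handed out again for the next .aspx request, while a handler that reports itself as not reusable is discarded and a new instance is created next time.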
Temporary Files and Page Assemblies
To finish off the tour of the ASP.NET HTTP runtime, let's analyze what happens at the file system level when an ASP.NET page is requested. As you'll see in a moment, a bunch of temporary and dynamically created files are managed and monitored by the objects of the HTTP pipeline.
You write and deploy a Web page as an .aspx text file, although you can insulate the core code of the page in a code-behind C# or Microsoft® Visual Basic® .NET class. For the page to be visible as a URL, an .aspx file must always be available in the Web space of the application. The actual content of the .aspx file determines the assembly (or assemblies) that the application object will load up.
By design, the HttpApplication object looks for a class named after the requested ASPX file. If the page is named sample.aspx, then the corresponding class to load is named ASP.sample_aspx. The application object looks for such a class in all of the assembly folders of the Web application—the Global Assembly Cache (GAC), the Bin subfolder, and the Temporary ASP.NET Files folder. If no such class is found, the HTTP infrastructure parses the source code of the .aspx file, creates a C# or Visual Basic .NET class (depending on the language set on the .aspx page), and compiles it on the fly. The newly created assembly has a randomly generated name and is located in an application-specific subfolder of the following path: C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\Temporary ASP.NET Files.
The v1.1.4322 subfolder is specific to ASP.NET 1.1; expect to see a different version number if you're using ASP.NET 1.0. In this case, the subfolder is named v1.0.3705. The next time the page is accessed, the assembly is there and won't be recreated. But how does the HttpApplication object determine whether a page-specific assembly exists? Does it really need to scan a handful of folders each time? Well, not exactly.
The application object only looks at the contents of a particular folder located under the Temporary ASP.NET Files folder. The exact path, which is application-specific, is returned by the HttpRuntime.CodegenDir property. If the .aspx file is accessed for the first time (that is, no page assembly has been created yet), this folder will not contain an XML file whose name begins with the name of the ASPX page. For example, the sample.aspx page with a dynamic assembly associated should have an entry with a name like this:
sample.aspx.XXXXX.xml
The XXXXX placeholder is a hash code. By reading the content of this XML file, the application object learns the name of the assembly to load and the class to pick up in it. The following snippet shows the typical content of such a helper file. The name of the assembly that contains the ASP.sample_aspx class is mvxvx8xr.
<preserve assem="mvxvx8xr" type="ASP.sample_aspx">
    <filedep name="c:\inetpub\wwwroot\vdir\sample.aspx" />
</preserve>
It goes without saying that this file is created only when the source code of the filedep file is parsed to generate a dynamic assembly. Any change in the filedep file will invalidate the assembly and cause a new compile process on the next request. You should note that this is an implementation detail that might significantly change in future versions of the ASP.NET framework. Be careful if, for any reason, you decide to make use of it in your current applications.
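For readers who want to inspect these helper files, here is a minimal Python sketch that locates and reads one. It relies only on the file-name pattern and the preserve and filedep elements shown above; the function and type names are hypothetical, and because the format is an implementation detail, the sketch may not match other versions of the framework.

```python
import glob
import os
import xml.etree.ElementTree as ET
from typing import List, NamedTuple, Optional


class PreservedPage(NamedTuple):
    assembly: str            # value of the assem attribute
    type_name: str           # value of the type attribute
    dependencies: List[str]  # name attribute of each filedep child


def parse_preserve(xml_text: str) -> PreservedPage:
    """Extract the assembly name, class name, and file dependencies."""
    root = ET.fromstring(xml_text)
    return PreservedPage(
        assembly=root.attrib["assem"],
        type_name=root.attrib["type"],
        dependencies=[dep.attrib["name"] for dep in root.findall("filedep")],
    )


def find_preserve_file(codegen_dir: str, page_name: str) -> Optional[str]:
    """Return the first file named <page>.<hash>.xml in the folder, if any."""
    pattern = os.path.join(glob.escape(codegen_dir),
                           glob.escape(page_name) + ".*.xml")
    matches = sorted(glob.glob(pattern))
    return matches[0] if matches else None
```

Applied to the sample file above, `parse_preserve` yields the assembly name mvxvx8xr, the type ASP.sample_aspx, and a single dependency on the sample.aspx source file.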
When a new assembly is created for a page as the effect of an update, ASP.NET verifies whether the old assembly can be deleted. If the assembly only contains the class for the modified page, ASP.NET attempts to delete and replace the assembly; otherwise a new one is created without touching the old assembly.
During the deletion process, ASP.NET might find that the assembly file is loaded and locked. In this case, the old assembly is simply renamed by adding a ".DELETE" extension. (Note that any Windows file can always be renamed while in use.) These temporary .DELETE files are removed as soon as the application is restarted, for example, because of changes to one of the application files such as global.asax and web.config. In no case does the ASP.NET runtime remove these files while serving the next request.
Note that by default, each ASP.NET application is allowed a maximum of 15 page recompiles before the whole application is restarted, with a subsequent loss of session and application data. When the latest compilation exceeds the threshold set in the numRecompilesBeforeAppRestart attribute of the <compilation> section, the AppDomain is unloaded and the application is restarted. Also note that in the .NET Framework, you can't unload a single assembly. The AppDomain is the smallest block of code that can be unloaded from the CLR.
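The restart rule amounts to a simple counter. The Python sketch below is a toy model, not the runtime's code: `RecompileTracker` is a hypothetical name, and the exact boundary (restart on the recompile that exceeds the limit, then reset the count) is an assumption read from the wording above.

```python
class RecompileTracker:
    """Counts page recompiles and signals when the application should restart."""

    def __init__(self, num_recompiles_before_app_restart: int = 15) -> None:
        self.limit = num_recompiles_before_app_restart
        self.recompiles = 0
        self.restarts = 0

    def page_recompiled(self) -> bool:
        """Record one recompile; return True if it triggers a restart."""
        self.recompiles += 1
        if self.recompiles > self.limit:
            # Assumed behavior: the AppDomain is unloaded and counting starts over.
            self.recompiles = 0
            self.restarts += 1
            return True
        return False
```

Under these assumptions and the default limit, the first fifteen recompiles pass without consequence and the sixteenth triggers the restart.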
Summary
There are two relevant aspects in ASP.NET applications: the process model and the page object model. ASP.NET anticipates some of the features of IIS 6.0—the new and revolutionary version of the Microsoft Web information services that shipped with Windows Server 2003. In particular, ASP.NET applications run in a separate worker process, just as all applications do in IIS 6. Furthermore, the ASP.NET runtime automatically recycles the worker process to guarantee excellent performance in spite of run-time anomalies, memory leaks, and programming errors. The same feature becomes a system feature in IIS 6.0.
In this article, I covered the underpinnings of the default ASP.NET process model, and the interaction between the IIS level code (the ASP.NET ISAPI extension) and the worker process. Along the way, I also tried to point out differences new to the IIS 6 process model. Though I didn't adequately cover the page object model, that's exactly what I'm going to do in a future article. Stay tuned!
To learn more about the internals of ASP.NET, the HTTP runtime and the page object model, check out my new book Programming Microsoft ASP.NET from Microsoft Press, 2003.
About the Author
Dino Esposito is a trainer and consultant based in Rome, Italy. Member of the Wintellect team, Dino specializes in ASP.NET and ADO.NET and spends most of his time teaching and consulting across Europe and the United States. In particular, Dino manages the ADO.NET courseware for Wintellect and writes the "Cutting Edge" column for MSDN Magazine. Get in touch at dinoe@wintellect.com.