Friday, July 9, 2010

Web Farming with the Network Load Balancing Service in Windows Server 2003

When a single Web Server machine isn’t enough to handle the traffic on your Web site, it’s time to look into building a Web Farm that uses multiple machines on the network acting as a single server. In this article Rick looks at the Windows Load Balancing Service and the new interface it sports in Windows Server 2003, which makes creating a Web Farm quick and easy and (gasp) even affordable.



With the release of Windows Server 2003, Network Load Balancing has become a much more visible part of the operating system, with a usable and relatively easy to configure interface for building a Web Farm. The Network Load Balancing Service has been around in one incarnation or another since Windows NT SP4, but Windows Server 2003 is the first operating system that brings this service to the forefront as a main component of the OS. A new Network Load Balancing Manager application is now directly available from the Administrative Tools menu and it’s powerful enough to let you configure the entire cluster from a single console. The service is now available for all products in the Windows Server family including the lower end Web Edition, which means that you now have a much more affordable way to create Web Farms at your disposal. Just add servers please.



In this article I’ll review the basics of a Load Balancing service and then show you how to set up and configure a basic installation using two machines.
Web Farms for city folk – do you need it?

A Web Farm is a not so fancy term for a collection of servers that act as a single Web server. The process behind the scenes maps a ‘virtual’ IP address to multiple machines. Software such as the Network Load Balancing Service or hardware like a specialized router or Load Balancer then deals with dishing up requests to the appropriate machine in the server pool.


Web Farms are an obvious choice if you’ve hit the limits of your single machine hardware. But before jumping on the Web Farm bandwagon (or is that a tractor?) you should look closely at your hardware and application and be sure that you can’t make it all run on a single machine first. Although the process of creating a Web Farm isn’t difficult, administering two or more servers and keeping them properly in sync is a lot more work than administering a single server.



Upgrading your hardware is certainly one option available to you. Today’s hardware is incredibly capable and should be sufficient to handle all but the most demanding Web applications on a single box. Multiprocessor machines with up to 16 processors on Windows make a pretty powerful platform to run Web applications with, even if those high end machines are rather pricey. While the Yahoo’s and Amazon’s won’t run on a single box (or on Windows for that matter), a vast majority of applications are likely to be able to comfortably serve millions of transactional hits a day from a single machine even with a single processor.



But Load Balancing can also provide benefits in the overload scenario. For one, it’s generally cheaper to throw mid-level machines at a load problem rather than buying one top of the line high end machine. Even with server licenses involved multiple low end machines might provide a more cost efficient solution.



Load Balancing also provides something else that has nothing to do with scalability: the ability to have failover support if something goes wrong on one of the servers in the pool. Because a Web Farm is made up of essentially identically configured servers, a failure on a single server will not bring down the entire Web site. Other servers in the pool can continue to process requests and pick up the slack. For many companies this feature of load balancing is important for peace of mind: a single point of failure on the Web Server is avoided, and there is an in-place mechanism to grow the application should the need arise at a later point.
How does it work?

The concept behind Network Load Balancing is pretty simple: Each server in a Load Balancing Cluster is configured with a 'virtual' IP address. This IP address is configured on all the servers that are participating in the load balancing 'cluster' (a loose term that's unrelated to the Microsoft Cluster Service). Whenever a request is made on this virtual IP, a network driver on each of these machines intercepts the request for the IP address and re-routes the request to one of the machines in the Load Balancing Cluster based on rules that you can configure for each of the servers in the cluster. Microsoft calls this process Network Load Balancing (NLB). Figure 1 shows how the process works graphically.
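To make the idea of several machines sharing one virtual IP more concrete, here is a minimal Python sketch of one common way to do it without a central dispatcher: every host sees the request, applies the same hash to the client address, and only the host that owns the resulting bucket answers. The host names, the hash and the function names are illustrative assumptions, not NLB's actual algorithm.

```python
import hashlib

# Hypothetical node names; a real cluster would use its own host list.
HOSTS = ["node1", "node2"]


def owner_for(client_ip, hosts):
    """Map a client address to one host using a hash every host can compute."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    bucket = int.from_bytes(digest[:4], "big") % len(hosts)
    return hosts[bucket]


def should_accept(me, client_ip, hosts):
    """Each host runs this independently; exactly one host returns True."""
    return owner_for(client_ip, hosts) == me


if __name__ == "__main__":
    for ip in ["10.0.0.5", "10.0.0.6", "10.0.0.7"]:
        print(ip, "->", owner_for(ip, HOSTS))
```

Because every host computes the same answer, no request needs to be forwarded between machines. If a host drops out of the list, the remaining hosts simply recompute ownership over the shorter list.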



Figure 1 – A network load balancing cluster routes requests to a single virtual IP to available servers in the load balancing cluster. Note that each machine is self-sufficient and runs independent of the others duplicating all of the resources on each server. The database sits on a separate box(es) accessible by all servers.



Although a Web Farm is a common scenario for this service, keep in mind that any IP based service can be run off it. For example, a mail server that is under heavy load and uses a central datastore could be spread across multiple machines in a cluster.



Network Load Balancing facilitates the process of creating a Web Server Farm. A Web Server farm is a redundant cluster of several Web servers serving a single IP address. The most common scenario is that each of the servers is identically configured, running the Web server and whatever local Web applications it hosts, as shown in Figure 1. Each machine has its own copy of everything it needs to run the Web application, which includes the HTML files, any script pages (ASP, ASP.Net), any binary files (such as compiled .Net assemblies, COM objects or DLLs loaded from the Web app) and any support files such as configuration and local data files (if any). In short the application should be fully self-contained on a single machine, except for the data which is shared in a central location. Data typically resides in a SQL backend of some sort somewhere on the network, but could also be files in a shared directory used by a file based database engine such as Visual FoxPro or Access.



Each server in the cluster is fully self-contained, which means it should be able to function without any other in the cluster with the exception of the database (which is not part of the NLB cluster). This means each server must be configured separately and run the Web server as well as any Web server applications that are running. If you're running a static site, all HTML files and images must be replicated across servers. If you’re using ASP or ASP.Net, those ASP pages and all associated binaries and support files must also be replicated. Source control programs like Visual SourceSafe can make this process relatively painless by allowing you to deploy updated files of a project (in Visual Studio.Net or FrontPage for example) to multiple locations simultaneously.



Short of the data everything else is running on all of the machines in the NLB cluster. The key is redundancy in addition to load balancing – if any machine in the cluster goes down, NLB will re-balance the incoming requests to the still running servers in the cluster. The servers in the cluster need to be able to communicate with each other to exchange information about their current processor and network load and even more basic checks to see if a server went down.



If you have COM components as part of your Web application things get more complicated, since the COM objects must be installed and configured on each of the servers. This isn't as simple as copying the file, but also requires re-registering the components, plus potentially moving any additional support files (DLLs, configuration files if needed, non-sql data files etc.). In addition, if you're using In-Process components you'll have to shut down the Web server to unload the components. You'll likely want to set up some scripts or batch files to perform these tasks in an automated fashion pulling update files from a central deployment server. You can use the Windows Scripting Host (.vbs or .js files) along with the IIS Admin objects to automate much of this process. This is often tricky and can be a major job especially if you have a large number of cluster nodes and updates are frequent – strict operational rules are often required to make this process reliable. Luckily if you’re building applications with pure ASP.Net you won’t have these issues since ASP.Net can update .Net binary files without any shutdowns by detecting changes to the source files and shadow copying binary files to a different directory for execution.


Make sure you cover your database!

Since multiple redundant machines are involved in a cluster you'll want to have your data in a central location that can be accessed from all the cluster machines. It's likely that you will use a full client/server database like SQL Server in a Web farm environment, but you can also use file based data access like Visual FoxPro or Jet (Access) tables if those tables are kept in a central location accessed over a LAN connection.



In heavy load balancing scenarios running a SQL backend, it’s important to understand that the database, not your application code, can easily become your bottleneck! Without going into details here, you need to think about what happens when you overload the database, which is essentially running on a single box. Max out that box and you have problems that are much harder to address than the Web load balancing I am describing here. At that point you need to think about splitting your databases so that some data can potentially be written to other machines. For redundancy you can use the Microsoft Cluster Service to monitor and sync a backup system that can take over in case of failure of the primary server.



It’s possible that the database can become your weakest link, so if you’re after redundancy, make sure you also have a backup plan for your database. If you’re using SQL Server you might want to use Replication to create live shadows on a backup box, for example. At the very least make sure that frequent automated backups are performed, especially if you’re not using a SQL backend and are running file based data engines like FoxPro or Jet.


Efficiency

Network Load Balancing is very efficient and can give you reasonably close to a 1:1 performance improvement for each machine added to the cluster. There is some overhead involved, but I didn't notice much in my performance tests with the VS.Net Application Center Test tool: each machine added 90-95% of its standalone performance to the cluster, even in the non-optimized network setup I was using to conduct the tests.



You may notice that with this level of redundancy increasing your load balancing capability becomes simply a matter of adding additional machines to the cluster, which gives you practically unlimited application scalability (database allowing) if you need it.
Setting up NLB

In order to utilize the Windows Server Network Load Balancing features you will need two machines running Windows Server 2003. Each machine needs to have at least one network card and at least one fixed IP address. Although running with one adapter works well, for best performance it’s recommended that you have two adapters in each machine – one mapped to the real IP Address (Microsoft calls this the Dedicated IP) and one mapped to the ‘virtual’ IP Address (Microsoft calls this the Cluster IP). Be aware that NLB uses some advanced networking features of network adapters, so it’s possible that some low end adapters (especially those for non-server machines) may not support the required NDIS protocols.



In addition you will also need one more machine for testing (3 machines total). The test machine should be external because you can’t use a machine from the pool to test: when you hit the virtual IP address from a cluster member, the requests never travel over the network, so they are all handled by the local machine.



I'm going to use two ‘servers’ here to demonstrate how to set up and run NLB. Assume the IP addresses for these machines are 111.111.111.1 and 111.111.111.2. To create a virtual IP address (Cluster IP) you need to pick an available IP Address on the same Class C network segment. In my example here I’ll use 111.111.111.10.
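Before entering the addresses into the manager, the choice of Cluster IP can be sanity-checked with a few lines of Python. This is a minimal sketch using the example addresses from this article; the 255.255.255.0 mask and the function name are assumptions for illustration.

```python
import ipaddress


def check_cluster_ip(cluster_ip, dedicated_ips, netmask="255.255.255.0"):
    """Return a list of problems with the proposed Cluster IP (empty if none)."""
    problems = []
    # Derive the network from the first dedicated IP and the shared subnet mask.
    network = ipaddress.ip_network(f"{dedicated_ips[0]}/{netmask}", strict=False)
    for ip in [cluster_ip, *dedicated_ips]:
        if ipaddress.ip_address(ip) not in network:
            problems.append(f"{ip} is not on network {network}")
    if cluster_ip in dedicated_ips:
        problems.append(f"{cluster_ip} is already used as a dedicated IP")
    return problems


if __name__ == "__main__":
    print(check_cluster_ip("111.111.111.10", ["111.111.111.1", "111.111.111.2"]))
```

An empty list means the Cluster IP sits on the same segment as both dedicated IPs and does not collide with either of them. The check does not detect whether some other machine on the network already uses the address.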



Unlike previous versions of NLB, the new version has a central manager application that you can use to create a cluster from a single machine. Gone are the hassles of having to configure each machine manually. You can do it all from a single machine over the network, which is a welcome change.



To start setting up this cluster bring up the Network Load Balancing Manager from the Administrative Tools menu. Figure 1 shows what the cluster manager looks like.



Figure 1 – To set up a new NLB cluster bring up the Network Load Balancing Manager and right click to create a new cluster.



Right-click on the root node to add a new cluster. Next configure the basic cluster configuration, which will consist of assigning the Cluster or virtual IP address. Figure 2 shows what this dialog looks like filled out for our test network.



Figure 2 – Configuring the Cluster IP. This is the ‘virtual’ IP address
that will service all servers in the cluster. Note that you should set the
operation mode to Multicast if you are using a single adapter.



The IP Address is the virtual IP address for the cluster that will be used to address this cluster. NLB will actually create a new IP address on each machine in the cluster and bind it to the specified network adapter (in the next step). Choose a subnet mask – make sure you use the same one for all servers in the cluster. The Full Internet name is only for reference and is used here primarily for displaying the name of the server. But if you have a domain configured for the server you should use that domain name.



Cluster operation mode is very important. Unicast mode means that NLB takes over the network card it is bound to and doesn’t allow any additional network traffic through it. This is the reason why two adapters are a good idea – one that NLB can take over and one that can still handle all other network traffic directed at the dedicated IP address of the server. If you’re using a single adapter you should probably select Multicast which allows both the NLB traffic and the native IP traffic to move through the same network adapter. Multicast is slower than Unicast as both kinds of traffic need to be handled by the network adapter but it’s the only way to remotely configure all machines centrally. You can run a single adapter in Unicast mode, but the cluster manager will not be able to communicate with the server after it’s configured. As a general rule use Unicast for two adapters, Multicast for a single adapter. With my network cards I had to use IGMP mode in order to get the cards to converge properly – you may have to experiment with both modes to see what works best for you.



Leave the Allow Remote Control option unchecked. When enabled, this option allows you to reconfigure the nodes and port rules remotely, although I found little need to do so. Any changes made to the cluster are automatically propagated down to the nodes anyway, so there’s little need for it with the exception of changing the processing priority. If you do want this functionality I suggest you enable it after you have the cluster up and running.



The next dialog called Cluster IP Addresses allows you to add additional virtual IP addresses. This might be useful if you have a Web server that is hosting multiple Web sites each of which is tied to a specific IP address. For our example here, we don’t need any and can just click next as shown in Figure 3.





Figure 3 – If you need to add additional IP addresses to be load balanced
you can add them here. This is needed only if you host multiple sites
on separate IP addresses and you need separate IPs for these.



Next we need to configure port rules. Port rules determine which TCP/IP port is handled and how. Figure 4 shows the Port Rules dialog with two port rules defined for port 80 (HTTP) and port 443 (SSL). The default port configuration set up by NLB handles all ports, but in this case that rule is too broad. Port rules can’t overlap, so if you create specific rules you either have to create them for each port specifically or create ranges that fit your specific ports.





Figure 4 – The Port Rules dialog shows all of the port rules defined for
the cluster. By default a rule for all ports (0 – 65535) is defined. Here I’ve
created two specific port rules for ports 80 and 443.
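The rule that port ranges may not overlap is easy to check before typing rules into the dialog. The following Python sketch is only an illustration of that constraint; the `PortRule` type and `find_overlaps` function are made up for this example and are not part of any NLB tooling.

```python
from typing import NamedTuple


class PortRule(NamedTuple):
    start: int
    end: int  # inclusive


def find_overlaps(rules):
    """Return (earlier_rule, rule) pairs where a rule collides with an earlier range."""
    ordered = sorted(rules, key=lambda r: r.start)
    overlaps = []
    widest = None  # the earlier rule that reaches the highest port so far
    for rule in ordered:
        if widest is not None and rule.start <= widest.end:
            overlaps.append((widest, rule))
        if widest is None or rule.end > widest.end:
            widest = rule
    return overlaps


if __name__ == "__main__":
    default_rule = PortRule(0, 65535)
    http, ssl = PortRule(80, 80), PortRule(443, 443)
    print(find_overlaps([default_rule, http]))  # the default rule collides with port 80
    print(find_overlaps([http, ssl]))           # no collisions
```

This is why the default all-ports rule has to be removed (or split into ranges around 80 and 443) before the two specific rules can be added.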



To add a new port rule click on the Add button which brings up the dialog shown in Figure 5. Here you can configure how the specific port is handled. The key property is the Filtering Mode which determines the affinity of requests. Affinity refers to how requests are routed to a specific server. None means any server can service the incoming request. Single means that a specific server has to handle every request from a given IP address. Generally None is the preferred mode as it scales better in stateless applications. There’s less overhead in NLB as it doesn’t have to route requests in many cases. Single mode is useful for server connections that do require state, such as SSL connections for HTTPS. Secure Server Certificates perform much better with a persistent connection rather than having to create new connections on each of the servers in the pool for requests. Figure 5 shows the configuration for the standard Web Server port, port 80.




Figure 5 – Setting port rules lets you configure how the cluster
responds to client requests. Affinity in particular determines
whether the same server must handle all requests from
a specific IP address (single) or Class C IP address range (Class C).
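One way to picture the three affinity settings is as a choice of which part of the client address decides the target server. The Python sketch below is a simplified model under that assumption; the CRC-based node selection and the function names are illustrative and do not reproduce NLB's internal hashing.

```python
import zlib


def affinity_key(client_ip, client_port, affinity):
    """Return the part of the client address that decides which node answers."""
    if affinity == "none":
        return f"{client_ip}:{client_port}"        # every connection may land anywhere
    if affinity == "single":
        return client_ip                           # one client IP sticks to one node
    if affinity == "class_c":
        return ".".join(client_ip.split(".")[:3])  # a whole /24 range sticks to one node
    raise ValueError(f"unknown affinity: {affinity}")


def pick_node(client_ip, client_port, affinity, nodes):
    key = affinity_key(client_ip, client_port, affinity)
    return nodes[zlib.crc32(key.encode("ascii")) % len(nodes)]


if __name__ == "__main__":
    nodes = ["node1", "node2"]
    for port in (50001, 50002, 50003):
        print(port, pick_node("10.1.2.3", port, "none", nodes),
              pick_node("10.1.2.3", port, "single", nodes))
```

With Single affinity the source port drops out of the key, so repeated connections from the same client always reach the same node, which is what stateful traffic such as SSL needs.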





To set up the second rule for the SSL Port I added another rule and then changed the port to 443 and changed the affinity to single.



Although you can’t do it from here, another important setting is the priority for each machine for each port rule. You can set up Machine 1 to take 80% of the traffic and the second 20% for example. Each rule can be individually configured. We’ll see a little later why this is important for our SSL scenario.



The rules set in this dialog are propagated to all the cluster servers, which is significant, because the cluster port rules must be configured identically on each of the cluster node servers. The configuration tool manages this by remotely pushing the settings to each of the cluster nodes’ Network Connections IP configuration settings. This is a big improvement over previous versions where you manually had to make sure each machine’s port rules matched and stayed matching.



Up to this point we have configured the cluster and the common parameters for each node. Now we need to add individual nodes to the cluster. Figure 6 shows the dialog that handles this step for the first node as part of the configuration process.



Figure 6 – Adding a node by selecting the IP address and picking a specific
network adapter.



When you click Next you get to another dialog that lets you configure the cluster node. The main feature to configure on this dialog is the Priority which is a unique ID that identifies each node in the cluster. Each node must have a unique ID and the lower the number the higher the priority. Node 1 is the master which means that it typically receives requests and acts as the routing manager although when load is high other machines will take over.





Figure 7 – Setting the node parameters involves setting a priority for
the machine, which is a unique ID you select. The lower the number
the higher the priority – this machine acts as the master host.



Click finish and now we have one node in our cluster.



Actually, not quite so fast. Be patient, this process isn’t instant. When you click finish the NLB manager actually goes out and configures your network adapter for you. It creates a new IP address in your network connections, enables the Network Load Balancing service on the network adapter(s) you chose during setup and configures the settings we assigned on the NLB property sheet.



You’ll see your network connection flash on and off a few times during this configuration process on the machine you are configuring to be a host. This is normal, but be patient until you see your network connection back up and running.



If all goes well you should see your network connection back up and running and see a new node in the NLB Manager sitting below the cluster (see Figure 8 which shows both nodes). If everything is OK the Status should say Converged. If it does, node 1 is ready.



But we’re not quite done yet – we still need to add the second node. To do so right-click on the cluster, after which you go through the steps shown in Figures 6 and 7 one more time. Again be patient, this process is not super fast – it takes about 20 seconds or so to get a response back from a remote machine. Once you click finish the process of Converging can take a minute or more.





Figure 8 – The final cluster with both nodes converged and ready to process requests.


Troubleshooting Tips

I’ve had a few problems getting convergence to happen for the first time. It helps to follow the steps here closely from start to finish and if for whatever reason you end up removing nodes make sure you double check your network settings first before re-adding nodes.



You can check what NLB did in the Network Connections for your machine (Figure 9). Click on the Load Balancing section to see the settings made there. Remember that the settings should match between machines with the exception of IP Addresses assigned for each machine. You should also see the new IP address added in the Internet Protocol settings’ Advanced page.



Figure 9 – All of the settings that NLB makes are made
to the network adapter that the virtual IP is bound to.
You can click on the Network Load Balancing item to
configure the node settings as described earlier. The Virtual
IP also has been added in the Internet Protocol | Advanced
dialog.



If things look Ok, make sure that the machines can ping each other with their dedicated IPs. Figure 10 shows what you should see for one of the machines and you should run this test on both of them:





Figure 10 – Checking whether the machines can see each other.

Use IPCONFIG to see adapter information and you should see both your physical adapter and the virtual IP configured. Make sure that you don’t get any errors that say that there’s a network IP address conflict. If you do, it means that the virtual IP is not virtual, i.e. it’s entered but it’s not bound to the NLB service. In that case remove the IP and then configure NLB first, then re-add the IP address. Alternately remove everything, then try adding it one more time through the NLB manager.



I’ve also found that it helps to configure remote machines first, then configure the machine running the NLB Manager (if you are using it in the cluster) last. This avoids network issues on the manager machine – plain network access gets a little weird once you have NLB configured on a machine. Again this is a great reason to use two adapters rather than one.
Putting it all together

Ok, so now we’re ready to try it out. For kicks I ran two simple tests using the Application Center Test tool that comes with VS.Net Enterprise Architect on my two machines: my office server (P4 2.2 GHz) and my Dell laptop (also P4 2.2 GHz).



For the first test I used only a single ASP.Net page that reads some data from a local SQL Server using a business object. Both machines have SQL Server installed locally and for this first test both are using their own local data from it. I did this to see them run individually under load, and then together with Load Balancing to compare the results. This is a contrived example for sure, but it shows nicely what load balancing is capable of doing for you in a best case scenario. Figure 11 shows the output for a short query running both machines with Load Balancing.



Figure 11 – Using Application Center Test to stress test a simple page. The result here is from
the combined machines, which are running at around 275 rps. Machine 1 and 2 individually were running 136 and 158 rps respectively.



The script hits only the ASPX page – no images or other static content was hit. I tested each of the machines individually changing the IP Addresses to their dedicated IPs in the ACT script first and then together by changing the script to use the virtual IP. The results for this short 5 minute test are as follows:



Web Store Single Read Page Test

Test Mode                                    Requests per second
Office Server 111.111.111.2                  162
Laptop 111.111.111.1                         141
Both of them Load Balanced 111.111.111.10    276



This is a ratio of 91% for the load balanced vs. the machines individually which is excellent given that we are running with a single adapter here.



The second test is a bit more realistic in that it runs through the entire Web Store application site and uses a shared SQL Server on a third machine.



Web Store Full Order Test

Test Mode                                    Requests per second
Office Server 111.111.111.2                  91
Laptop 111.111.111.1                         85
Both of them Load Balanced 111.111.111.10    135



Here the ratio is a bit worse at 77%, but the reason for this drop off has little to do with Load Balancing and more to do with the fact that some limits are being hit on the SQL Server. Looking at the lock count with Performance Monitor reveals that the site is hitting the SQL box pretty heavily and the locking thresholds are causing requests to start slowing down significantly.
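Both ratios quoted in this section come from the same arithmetic: the combined requests per second divided by the sum of what the machines deliver individually. A short Python check using the numbers from the two tables (the function name is just for illustration):

```python
def scaling_efficiency(combined_rps, individual_rps):
    """Combined throughput as a fraction of the sum of the standalone throughputs."""
    return combined_rps / sum(individual_rps)


if __name__ == "__main__":
    single_read = scaling_efficiency(276, [162, 141])  # Single Read Page Test
    full_order = scaling_efficiency(135, [91, 85])     # Full Order Test
    print(f"Single read page: {single_read:.0%}")  # prints "Single read page: 91%"
    print(f"Full order:       {full_order:.0%}")   # prints "Full order:       77%"
```

A value near 100% means the cluster adds almost the full capacity of each machine; the lower the value, the more a shared resource (here the SQL Server) is holding the cluster back.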



This application is not heavily SQL optimized and performance could be improved to make these numbers higher both for individual and combined tests. However, this test shows that load balancing can help performance of an app, but that there may still be other limits that can slow down the application as a whole. In short, beware of load issues beyond the Web front ends that can bite you in terms of performance. Still even in this test where an external limit was being approached we still got a significant gain from using Load Balancing.
Port Rules revisited: SSL

Remember I configured my server for HTTPS operation by configuring port 443 earlier? Actually only one of the servers has the certificate installed, so I need to manage the port rules to drive all HTTPS traffic to the SSL enabled server. This must be administered manually through the Network Connections dialog by clicking on the Load Balancing Service and then configuring the Port Rules. Notice that this dialog shown in Figure 12 has a Load Weight option, which is set to 100 in the SSL enabled server and 0 in the other.






Figure 12 – When editing the Port Rules in Network Connections
you can configure the load weight for each server in percentages.



This effectively drives all SSL traffic to the machine that has the certificate installed.
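The effect of the Load Weight setting can be sketched as simple proportions. The Python snippet below assumes that each node's share of a port rule's traffic is its weight divided by the sum of all weights; the host names are placeholders for the two machines used in this article.

```python
def traffic_shares(weights):
    """Turn per-node load weights into the fraction of traffic each node receives."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one node needs a non-zero load weight")
    return {node: weight / total for node, weight in weights.items()}


if __name__ == "__main__":
    # Port 443: only the office server has the certificate installed.
    print(traffic_shares({"office-server": 100, "laptop": 0}))
    # An 80/20 split like the one mentioned earlier for Machine 1 and Machine 2.
    print(traffic_shares({"machine1": 80, "machine2": 20}))
```

With weights of 100 and 0 the second node receives no traffic for that rule, which is exactly the behavior wanted for the SSL port here.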
Load Balancing and your Web applications

Running an application on more than one machine introduces potential challenges into the design and layout of the application. If your Web app is not 100% stateless you will run into potential problems with resources required on specific machines. You'll want to think about this as you design your Web applications rather than retrofitting at the last minute.



If you're using Active Server Pages, you'll have to know that ASP's useful Session and Application objects will not work across multiple machines. This means you either have to run the cluster with Single Affinity to keep clients coming back to the same machine, or you have to come up with a different session management scheme that stores session data in a more central data store such as a database.



Thankfully ASP.Net has several ways around this problem by providing different options for storing Session state using either a separate State Service that can be accessed across machines or by using Session state stored in a SQL Server database. You should always use session state in one of these mechanisms because these mechanisms can survive Web application restarts which can happen more frequently in ASP.Net due to changes in web.config or simply from the Web Server (IIS 6) recycling an Application Pool.
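The session problem is easy to reproduce in miniature. The following Python sketch models two Web servers that either keep session data in their own memory or share one store; the shared dictionary is only a stand-in for something like a State Service or a SQL Server session database, and the class and function names are invented for the example.

```python
class WebNode:
    """A stand-in for one Web server in the farm."""

    def __init__(self, name, session_store=None):
        self.name = name
        # Without a shared store each node keeps sessions in its own memory.
        self.sessions = {} if session_store is None else session_store

    def add_to_cart(self, session_id, item):
        self.sessions.setdefault(session_id, []).append(item)

    def cart(self, session_id):
        return list(self.sessions.get(session_id, []))


def demo(shared):
    store = {} if shared else None
    node1, node2 = WebNode("node1", store), WebNode("node2", store)
    node1.add_to_cart("session-42", "book")  # first request lands on node1
    return node2.cart("session-42")          # next request lands on node2


if __name__ == "__main__":
    print(demo(shared=False))  # [] : node2 has never seen this session
    print(demo(shared=True))   # ['book'] : both nodes read the same store
```

With in-memory sessions the second node sees an empty cart, which is why such applications need Single affinity. With the shared store any node can serve any request.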



Finally, load balancing can allow you to scale applications with multiple machines relatively easily. To add more load handling capabilities just add more machines. But remember that when you build applications this way that your weakest link can bring down the entire load balancing scheme. If your SQL backend which all of your cluster nodes are accessing is maxed out, no amount of additional machines in the load balancing cluster will improve performance. The SQL backend is your weakest link and the only way to wring better performance out of it is to upgrade hardware or start splitting databases into separate servers.
Pulling the plug

As mentioned earlier, redundancy is one of the goals of a load balanced installation, and to test this I simulated a failure by pulling the network cable out of one of my servers. With both cluster nodes running, one of the nodes went dead and after 10 seconds all requests ended up going to the still active node, providing the anticipated redundancy. A few requests on the client ended up failing, basically those that had already made it into the dead server's request queue. All others were silently moved over to the other server in the pool.



In another test I decided to turn off the Web service, which resulted expectedly in the network connection still being fed requests that now started to fail. This is to be expected because NLB deals at the network protocol level but doesn’t check for failure of the requests at the network application level (Web Server). For this scenario you will need a smart monitoring application that can tell that your Web services are not responding on port 80 or even better not returning the results that you should be getting back.
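A monitor of the kind described here does not have to be elaborate. As a rough sketch in Python, the probe below requests a page from one node's dedicated IP and treats the node as healthy only if the response has status 200 and contains an expected piece of text. The URL, the expected text and the function names are assumptions for illustration; what to do when a probe fails (alerting, taking the node out of the cluster) is left out.

```python
import http.client
import urllib.request


def looks_healthy(status, body, expected_text):
    """A node is healthy only if it answers 200 and returns the content we expect."""
    return status == 200 and expected_text in body


def probe(url, expected_text, timeout=5.0):
    """Fetch one page from a node; any network or HTTP error counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return looks_healthy(response.status, body, expected_text)
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts and HTTP error status codes.
        return False
```

A scheduled task could call something like `probe("http://111.111.111.1/", "Web Store")` for each dedicated IP (the expected text here is hypothetical). Probing the dedicated IPs rather than the virtual IP matters, because the virtual IP only tells you that some node answered.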



The bottom line here is: The service works well for catching fatal failures such as hardware crashes or network failures that cause the network connection to a single machine to die. But application level failures continue to be your responsibility to monitor and respond to.


Just add water… eh, machines

The Windows Server Network Load Balancing service finally makes load balancing affordable and relatively easy to implement. It’s taken a while to get here from two Windows versions back, but now that the tools are integrated into Windows it’s relatively painless to scale out to other machines. It’s good to know that the capabilities are built-in and that you can tackle applications that may require more than a single machine.



Just remember to plan ahead. As with any process of taking an application and making it do something new, spreading apps over multiple machines takes time and some planning to get right. Don’t wait until you really, really can’t live without this feature – start planning for it before you do. Finally, make sure you know the bottlenecks in your Web applications. A load balancing cluster is only as good as its weakest link. Pay special attention to data access as that is likely to be the most critical non-cluster component that can potentially snag scalability.



But isn’t that a position we all wish we were in? So much traffic we can’t handle it? Well, hopefully you’ll get to try out this scenario for real – real soon, so you (or your boss) can retire rich…



As always if you have any questions or comments about this article please post a message on our message board at: http://www.west-wind.com/wwThreads/Default.asp?Forum=Code+Magazine.

Friday, June 11, 2010

ICallbackEventHandler in .net for Partial Postback like Ajax



Code follows :

using System;
using System.Web.UI;
using System.Web.UI.WebControls;

public partial class zRMX_CallBackData : System.Web.UI.Page, ICallbackEventHandler
{
    // Result computed in RaiseCallbackEvent and handed back by GetCallbackResult.
    private string _callbackResult = null;

    protected void Page_Load(object sender, EventArgs e)
    {
        // Build the client-side call that starts the callback. The browser passes the
        // server's result to the JavaScript function GetRandomNumberFormServer.
        string cbreference = Page.ClientScript.GetCallbackEventReference(this, "arg", "GetRandomNumberFormServer", "context");

        // Wrap that call in a UseCallback(arg, context) function the page script can invoke.
        string cbscript = "function UseCallback(arg, context) {" + cbreference + "}";

        Page.ClientScript.RegisterClientScriptBlock(this.GetType(), "UseCallback", cbscript, true);
    }

    // Runs on the server when the client calls UseCallback.
    public void RaiseCallbackEvent(string eventArg)
    {
        Random rnd = new Random();
        // ss is assumed to be a server control (for example a TextBox) declared in the page markup.
        _callbackResult = ss.Text + "/" + rnd.Next().ToString();
    }

    // Returns the string that is passed to the client-side handler.
    public string GetCallbackResult()
    {
        return _callbackResult;
    }
}


Code For javascript :

<%@ Page Language="C#" AutoEventWireup="true" CodeFile="CallBackData.aspx.cs" EnableViewState="false" Inherits="zRMX_CallBackData" %>






ADO.NET Connection Pooling at a Glance

Connection pooling can increase the performance of any application by using active connections of the pool for consecutive requests, rather than creating a new connection each time. By default, ADO.NET enables and uses Connection Pooling due to its positive impact.

Table of Contents

  • ADO.NET Connection Pooling at a Glance
  • Connection Pool Creation
  • Connection Pool Deletion / Clearing Connection Pool
  • Controlling Connection Pool through Connection String
  • Simple ways to View Connections in the pool created by ADO.NET
  • Common Issues/Exceptions/Errors with Connection Pooling
  • Other Useful Reads/References on Connection Pooling
  • Wrapping up

ADO.NET Connection Pooling at a Glance

Establishing a connection with a database server is a hefty, resource-intensive process. Whenever an application needs to run a query against a database server, it must first establish a connection with that server and then execute the query.

You may have noticed that a stored procedure or query returns its results faster when you run it directly against the database than when you execute the same query from a client application. I believe one of the reasons for this is the overhead involved in getting the results from the database server to the client application, and one such overhead is establishing the connection between ADO.NET and the database server.

Web applications frequently open database connections and close them as soon as they are done. Also notice how most of us write database-driven client applications. Usually we have a configuration file specific to the application and keep static information such as the connection string in it. That in turn means that most of the time we want to connect to the same database server and the same database, with the same user name and password, for every request, small or large.

ADO.NET uses a technique called connection pooling, which is very helpful in applications designed this way. On the first request to the database, ADO.NET opens a connection and serves the call. When the client application then asks to close the connection, ADO.NET does not destroy it. Instead it creates a connection pool, puts the released connection object in the pool, and holds a reference to it. The next time a request to execute a query or stored procedure comes in, ADO.NET bypasses the hefty process of establishing a connection and simply picks one up from the pool. This way it can return results comparatively faster.

Let us see Connection Pooling Creation Mechanism in more detail.

Connection Pool Creation

Connection pools and connection strings go hand in hand. Every connection pool is associated with one distinct connection string, and it is also specific to the application. In other words, a separate connection pool is maintained for every distinct combination of process, app domain and connection string.

When a database request is made through ADO.NET, ADO.NET searches for the pool associated with an exact match of the connection string, in the same app domain and process. If no such pool is found, ADO.NET creates a new one. If one is found, ADO.NET tries to fetch a usable connection from it. If the pool has no free usable connection, a new connection is created and added to the pool. New connections keep being added this way until Max Pool Size is reached. After that, when ADO.NET gets a request for a further connection, it waits for the Connection Timeout period and then errors out.
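The lookup described above can be pictured with a toy model. This is a sketch only, not how ADO.NET is actually implemented: FakeConnection and ToyPoolManager are made-up types, and where the real pooler waits for the timeout before failing, the toy version fails immediately.

```csharp
using System;
using System.Collections.Generic;

// Made-up stand-in for a physical database connection.
public class FakeConnection
{
    public readonly int Id;
    public FakeConnection(int id) { Id = id; }
}

// Toy model of "one pool per exact connection string, capped at Max Pool Size".
public class ToyPoolManager
{
    private class Pool
    {
        public readonly Stack<FakeConnection> Free = new Stack<FakeConnection>();
        public int Created;
    }

    // Ordinal comparison: only an exact match of the string finds an existing pool.
    private readonly Dictionary<string, Pool> _pools =
        new Dictionary<string, Pool>(StringComparer.Ordinal);
    private readonly int _maxPoolSize;
    private int _nextId;

    public ToyPoolManager(int maxPoolSize) { _maxPoolSize = maxPoolSize; }

    public int PoolCount { get { return _pools.Count; } }

    public FakeConnection Open(string connectionString)
    {
        Pool pool;
        if (!_pools.TryGetValue(connectionString, out pool))
        {
            pool = new Pool();                      // first request for this string
            _pools.Add(connectionString, pool);
        }
        if (pool.Free.Count > 0) return pool.Free.Pop();   // reuse a released connection
        if (pool.Created >= _maxPoolSize)
            throw new InvalidOperationException("Max pool size was reached.");
        pool.Created++;
        return new FakeConnection(++_nextId);       // grow the pool
    }

    // Closing hands the connection back to the pool instead of destroying it.
    public void Close(string connectionString, FakeConnection connection)
    {
        _pools[connectionString].Free.Push(connection);
    }
}

public static class ToyPoolDemo
{
    public static void Main()
    {
        ToyPoolManager manager = new ToyPoolManager(2);
        string cs = "data source=localhost;initial catalog=Northwind";

        FakeConnection first = manager.Open(cs);
        manager.Close(cs, first);
        FakeConnection second = manager.Open(cs);
        Console.WriteLine(ReferenceEquals(first, second)); // True: same connection reused

        manager.Open(cs + " ");                 // differs by one character
        Console.WriteLine(manager.PoolCount);   // 2: a separate pool was created
    }
}
```

The second part of the demo shows why "exact match" matters: a connection string that differs by even a single character gets its own pool.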

The next question is: how is a connection released to the pool so that it is available on such occasions? Once a connection has served its purpose and is closed or disposed, it goes back to the connection pool and becomes usable again. Connections that are not closed or disposed explicitly do not go back to the pool immediately. We can explicitly close a connection by calling the Close() or Dispose() method of the connection object, or by using the "using" statement in C# to instantiate the connection object. It is highly recommended that we close or dispose the connection as soon as it has served its purpose (don't wait for the GC or the connection pooler to do it for you).
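The reason the "using" statement is the safest of these options is that it calls Dispose() even when the code inside throws. The sketch below demonstrates that with a made-up TrackedConnection class, so it runs without a database; SqlConnection implements IDisposable as well, so the same pattern applies to it.

```csharp
using System;

// Made-up stand-in for a pooled connection.
public class TrackedConnection : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Execute(bool fail)
    {
        if (fail) throw new InvalidOperationException("query failed");
    }

    // For a real pooled connection this is the point where it returns to the pool.
    public void Dispose() { IsDisposed = true; }
}

public static class UsingDemo
{
    // Runs a (possibly failing) query and returns the connection so the
    // caller can inspect whether it was released.
    public static TrackedConnection RunQuery(bool fail)
    {
        TrackedConnection connection = new TrackedConnection();
        try
        {
            using (connection)
            {
                connection.Execute(fail);
            }   // Dispose() runs here, on success and on exception alike
        }
        catch (InvalidOperationException)
        {
            // Swallowed only to keep the demo short.
        }
        return connection;
    }

    public static void Main()
    {
        Console.WriteLine(RunQuery(false).IsDisposed); // True
        Console.WriteLine(RunQuery(true).IsDisposed);  // True, despite the exception
    }
}
```

With a real connection the shape is the same: create the SqlConnection in the using header, call Open() inside the block, and let the closing brace release it.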

Connection Pool Deletion / Clearing Connection Pool

A connection pool is removed once the app domain from which the connection requests came is unloaded. Once the app domain is unloaded, all the connections in the pool become invalid and are removed. For example, in an ASP.NET application the connection pool is created the very first time you hit the database, and it is destroyed as soon as we do an iisreset. We'll see this later with an example. Note that the pool lives in the process that hosts the application (the IIS worker process for an ASP.NET application), not in the development environment, so do not expect the connection pool to be cleared automatically when you close Visual Studio .NET.

ADO.NET 2.0 introduces two new methods to clear the pool: ClearAllPools and ClearPool. ClearAllPools clears the connection pools for a given provider, and ClearPool clears the connection pool that is associated with a specific connection. If there are connections in use at the time of the call, they are marked appropriately. When they are closed, they are discarded instead of being returned to the pool.

Refer to the section "Simple ways to View Connections in the pool created by ADO.NET" for details of how to determine the status of the pool.

Controlling Connection Pool through Connection String

Connection string plays a vital role in connection pooling. The handshake between ADO.NET and database server happens on the basis of this connection string only. Below is the table with important Connection Pooling specific keywords of the connection strings with their description.

Name | Default | Description
--- | --- | ---
Connection Lifetime | 0 | When a connection is returned to the pool, its creation time is compared with the current time, and the connection is destroyed if that time span (in seconds) exceeds the value specified by Connection Lifetime. A value of zero (0) causes pooled connections to have the maximum connection timeout.
Enlist | 'true' | When true, the pooler automatically enlists the connection in the creation thread's current transaction context. Recognized values are true, false, yes, and no. Set Enlist = "false" to ensure that the connection is not context specific.
Max Pool Size | 100 | The maximum number of connections allowed in the pool.
Min Pool Size | 0 | The minimum number of connections allowed in the pool.
Pooling | 'true' | When true, the SqlConnection object is drawn from the appropriate pool, or if it is required, is created and added to the appropriate pool. Recognized values are true, false, yes, and no.

* Table extracted from Microsoft MSDN Library for reference

Apart from the keywords above, there is one more important thing to note. If you are using Integrated Security, a connection pool is created for each user accessing the client system, whereas when you use a user id and password in the connection string, a single connection pool is maintained for the whole application. In the latter case, each user can use connections that were created and then released to the pool by other users. Thus using a user id and password is recommended for a better end-user performance experience.

Simple ways to View Connections in the pool created by ADO.NET

We can keep watch on the connections in the pool by looking at the active connections in the database after closing the client application. This is database specific, so to see the active connections on the database server we have to use database-specific queries. This assumes that the connection pool is perfectly valid and none of the connections in the pool is corrupted.

For MS SQL Server: Open Query Analyzer and execute the query: EXEC SP_WHO

For Oracle: Open SQL Plus or any other editor such as PL/SQL Developer or TOAD and execute the following query: SELECT * FROM V$SESSION WHERE PROGRAM IS NOT NULL

All right, let us do it with SQL Server 2000

1. Create a Sample ASP.NET Web Application

2. Open an instance of Query Analyzer and run the EXEC SP_WHO query. Note the loginname column, and look for MACHINENAME\ASPNET. If you have not run any other ASP.NET application, you will get no rows with loginname as "MACHINENAME\ASPNET".

3. On Page load of default startup page, add a method that makes a database call. Say your connection string is "initial catalog=Northwind; Min Pool Size=20;Max Pool Size=500; data source=localhost; Connect Timeout=30; Integrated security=sspi"

4. Run your ASP.NET application

5. Now repeat Step 2 and observe that there are exactly 20 (Min Pool Size) connections in the results. Note that you made the database call only once.

6. Close the web page of your web application and repeat Step 2. Observe that the connections persist even after you have closed the web page.

7. Now reset IIS. You can do that by executing the command "iisreset" from the Run dialog.

8. Now repeat Step 2 and observe that all 20 connections are gone. This is because your app domain was unloaded by the IIS reset.

Common Issues/Exceptions/Errors with Connection Pooling

1. You receive the exception with the message: "Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached" in your .NET client application.

This occurs when you try to use more than Max Pool Size connections. By default, the max pool size is 100. If we try to obtain more connections than that, ADO.NET waits for the Connection Timeout period for a connection to become available in the pool. If no connection is available even after that, we get the above exception.

Solution(s):

1. The very first thing to do is ensure that every connection that is opened is also closed explicitly. At times we open a connection and perform the desired database operation but never close the connection explicitly. Internally, such a connection cannot be handed out as an available connection from the pool. The application has to wait for the GC to reclaim it, and until then it is not marked as available. In such a case you may get this error even though you are not using Max Pool Size connections simultaneously. This is the most probable cause of this issue.

2. Increase Max Pool Size value to a sufficient Max value. You can do so by including "Max Pool Size = N;" in the connection string, where N is the new Max Pool size.

3. Turn pooling off. This is not really a good idea, because connection pooling has a positive effect on performance, but it is definitely better than getting such exceptions.

2. You receive the exception with the message: "A transport-level error has occurred when sending the request to the server. (provider: Shared Memory Provider, error: 0 - Shared Memory Provider: )" in your ASP.NET application with MS SQL Server


This occurs when MS SQL Server 2000 encounters some issue and has to refresh all its connections while ADO.NET still expects to use the connections in the pool. Basically, it occurs when the connection pool gets corrupted. ADO.NET thinks that a valid connection to the database server exists, but because the database server has been restarted, all the connections have actually been lost.

Solution(s) :

1. If you are working with .NET and Oracle using ODP.NET v 9.2.0.4 or above, you can try adding "Validate Connection=true" to the connection string. In a couple of places I noticed people saying that "validcon=true" works for them with earlier versions of ODP.NET. See which works for you. With ODP.NET v 9.2.0.4, "validcon=true" errors out and "Validate Connection=true" works just fine.

2. If you are working with .NET 2.0 and MS SQL Server, you can clear a specific connection pool by using the static (Shared in Visual Basic .NET) method SqlConnection.ClearPool, or clear all of the connection pools in an app domain by using the SqlConnection.ClearAllPools method. Both SqlClient and OracleClient implement this functionality.

3. If you are working with .NET 1.1 and MS SQL Server,

a. Append a blank space to the connection string at run time and try establishing the connection again. Because the connection string is now different, a new connection pool is created and used by your application. In the meantime the old pool is removed once it is no longer in use.

b. Handle the exception, and as soon as you get this error, retry the connection in a loop. With time, ADO.NET and the database server automatically get back in sync.

I am not totally convinced by either approach, but frankly speaking, I have not found a better workable solution so far.
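For approach (b), a bounded retry loop is probably safer than an endless one, so that a server that stays down does not hang the request forever. The following is a minimal sketch under that assumption. RetryHelper is a made-up helper, and the database call is simulated with a delegate; in real code you would catch only the specific exception you want to retry (for example SqlException) rather than every Exception.

```csharp
using System;
using System.Threading;

public static class RetryHelper
{
    // Runs 'operation'. On failure it waits 'delayMs' and tries again,
    // up to 'maxAttempts' attempts in total, then rethrows the last error.
    public static T Execute<T>(Func<T> operation, int maxAttempts, int delayMs)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception)
            {
                if (attempt >= maxAttempts) throw;   // give up and surface the error
                Thread.Sleep(delayMs);               // give the server time to recover
            }
        }
    }

    public static void Main()
    {
        int calls = 0;
        // Simulated database call: fails twice, then succeeds.
        int result = Execute<int>(() =>
        {
            calls++;
            if (calls < 3) throw new InvalidOperationException("transport-level error");
            return 42;
        }, 5, 10);
        Console.WriteLine(result + " after " + calls + " calls"); // 42 after 3 calls
    }
}
```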

3. Leaking Connections

When we do not close or dispose connections, the GC collects them in its own time. From the pooling point of view, such connections are considered leaked. It is possible to reach the Max Pool Size value without actually using all of the connections at that moment, because some of them have leaked and are waiting for the GC to work on them. This leads to the exception mentioned above, even though we are not using Max Pool Size connections.

Solution(s):

1. Ensure that we Close/Dispose every connection once we are done with it.

Other Useful Reads/References on Connection Pooling

1. http://samples.gotdotnet.com/QuickStart/howto/doc/adoplus/connectionpooling.aspx

2. ADO.NET Connection Pooling Explained: http://www.ondotnet.com/pub/a/dotnet/2004/02/09/connpool.html

3. The .NET Connection Pool Lifeguard: http://msdn2.microsoft.com/en-us/library/aa175863(SQL.80).aspx

Wrapping up

In a nutshell, connection pooling can increase the performance of any application by using active connections of the pool for consecutive requests, rather than creating a new connection each time. By default, ADO.NET enables and uses connection pooling because of its positive impact. At the same time, the developer, who is the best judge of his/her application, can configure the connection pooling features, or even switch pooling off, based on the application's needs, simply by using connection string keywords.

Friday, April 23, 2010

ViewState and Dynamic Control

I thought I understood ViewState, until I came across this exception:

Failed to load viewstate. The control tree into which viewstate is being loaded must match the control tree that was used to save viewstate during the previous request. For example, when adding controls dynamically, the controls added during a post-back must match the type and position of the controls added during the initial request.



This is a question asked by someone on a .NET mailing list. My first guess at the cause was this: on a page postback, when LoadViewState() is invoked to restore the saved ViewState values to the page and its controls (both the control tree and the ViewState tree have been created at this stage), the ViewState tree somehow doesn't match the control tree. So when ASP.NET tries to restore a ViewState value to a control, it finds no control or the wrong control, and the exception occurs.

Note: the ViewState tree (of type Triplet or Pair) is NOT the ViewState property (of type StateBag) of the page or any of its controls. You can think of it as an object representation of the ViewState value on the html page (the __VIEWSTATE hidden field), which contains all the values that need to be written back to the controls during a page postback. If you don't change the default behavior, during the page initialize/load phase the ViewState tree is created by LoadPageStateFromPersistenceMedium(), which deserializes the value of the __VIEWSTATE field, and the values in the ViewState tree are put into the controls' ViewState bags in LoadViewState(). During the page save/render phase, the ViewState tree is created again by SaveViewState(), then serialized and written onto the html page by SavePageStateToPersistenceMedium().

So, I thought I could reproduce the same exception with something simple like this:

Default.aspx







Default.aspx.cs

public partial class _Default : Page
{
    protected override void OnInit(EventArgs e)
    {
        base.OnInit(e);
        if (!IsPostBack)
        {
            // Added on the first request only; deliberately not recreated on postback.
            Button btnClickMe = new Button();
            form1.Controls.Add(btnClickMe);
        }
    }
}

It is indeed a very simple page, with a button named btnPostback created statically in the .aspx file and another button named btnClickMe created dynamically in Page.OnInit(). I do not recreate btnClickMe on postbacks. So on a page postback, by the time OnInit() and LoadPageStateFromPersistenceMedium() have executed, the control tree and the ViewState tree should have different structures: the ViewState tree has a value for btnClickMe, but the control tree does not contain btnClickMe. I thought that would be enough to cause the exception, but I was soon proved wrong: no exception was thrown.

To find out why, let's have a look at the actual ViewState value generated on the html page:

“/wEPDwUKMTQ2OTkzNDMyMWRkOWxNFeQcY9jzeKVCluHBdzA6WBo=”

With a little help from ViewState Decoder I got this:







1469934321







There is no view state data for either of the buttons! I did expect at least an empty placeholder for a control with no state, though.

So, I think here is the first thing I learned:

For a control in the control tree, there may not be a corresponding item in the ViewState tree (if there is no state that needs to be saved for this control). If nothing is found in the ViewState tree for a control, the control’s LoadViewState() will not be invoked.

So, let's do something to make the button "dirty" and its ViewState saved.

public partial class _Default : Page
{
    protected override void OnInit(EventArgs e)
    {
        base.OnInit(e);
        if (!IsPostBack)
        {
            Button btnClickMe = new Button();
            form1.Controls.Add(btnClickMe);
            btnClickMe.Text = "Click me";
        }
    }
}



The ViewState now became:



/wEPDwUKMTQ2OTkzNDMyMQ9kFgICAw9kFgICAw8PFgIeBFRleHQFCENsaWNrIG1lZGRkaZj77nQ7KGQERj05RRc1lk+fvNA=









1469934321
3
3
Text
Click me























The format of the ViewState looks quite interesting, but let's worry about that later. For now, it does look like the Text property of btnClickMe has been saved. Great!



But when I ran it, still no exception was thrown.



So, I guess that is just the way it works:



For an item in the ViewState tree, if no corresponding control can be found in the control tree, the ViewState item is ignored.



So, how about creating a different control instead? Something like this:



public partial class _Default : Page
{
    protected override void OnInit(EventArgs e)
    {
        base.OnInit(e);
        if (!IsPostBack)
        {
            Button btnClickMe = new Button();
            form1.Controls.Add(btnClickMe);
            btnClickMe.Text = "Click me";
        }
        else
        {
            Label label = new Label();
            form1.Controls.Add(label);
        }
    }
}



Still no exception! And, very interestingly, after the page postback btnClickMe was gone and a label was shown with the text "Click me"! I didn’t assign any value to its Text property, so why was "Click me" there? ASP.NET had restored the ViewState value onto the label, even though the value doesn't belong to it!



So, here is another interesting thing:



ASP.NET doesn’t really know which control a ViewState item belongs to. It only matches an item in the ViewState tree to a control in the control tree by index.



If we look at the format of the saved ViewState, it contains nothing but the indices of the controls and the key/value pairs, so there is no way for ASP.NET to figure out exactly which control an item belongs to. I think this makes perfect sense: we do want the __VIEWSTATE field to be as small as possible, don't we?
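To make the index matching concrete, here is a small simulation in plain C#. It is only a sketch of the idea, not ASP.NET's actual code: ToyControl and ToyViewState are made-up types, and the saved state is reduced to "child index, then key/value pairs".

```csharp
using System;
using System.Collections.Generic;

// Made-up, much simplified control: a type name plus a state bag.
public class ToyControl
{
    public readonly string TypeName;
    public readonly Dictionary<string, object> StateBag = new Dictionary<string, object>();
    public ToyControl(string typeName) { TypeName = typeName; }
}

public static class ToyViewState
{
    // Save: record only each child's index and its values. No ID, no type.
    public static Dictionary<int, Dictionary<string, object>> Save(IList<ToyControl> children)
    {
        var saved = new Dictionary<int, Dictionary<string, object>>();
        for (int i = 0; i < children.Count; i++)
        {
            if (children[i].StateBag.Count > 0)   // controls without state are skipped
                saved[i] = new Dictionary<string, object>(children[i].StateBag);
        }
        return saved;
    }

    // Load: whatever control sits at the saved index receives the values.
    public static void Load(IList<ToyControl> children,
                            Dictionary<int, Dictionary<string, object>> saved)
    {
        foreach (KeyValuePair<int, Dictionary<string, object>> item in saved)
        {
            if (item.Key >= children.Count) continue;   // no control there: item ignored
            foreach (KeyValuePair<string, object> entry in item.Value)
                children[item.Key].StateBag[entry.Key] = entry.Value;
        }
    }

    public static void Main()
    {
        // First request: a Button at index 0 with some state.
        var firstRequest = new List<ToyControl> { new ToyControl("Button") };
        firstRequest[0].StateBag["Text"] = "Click me";
        var saved = Save(firstRequest);

        // Postback: a Label now sits at index 0.
        var postback = new List<ToyControl> { new ToyControl("Label") };
        Load(postback, saved);
        Console.WriteLine(postback[0].TypeName + ".Text = " + postback[0].StateBag["Text"]);
        // prints "Label.Text = Click me"
    }
}
```

Because nothing but the index is stored, the loader has no way to notice that the control at index 0 changed type between requests.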



The sample above demonstrated a ViewState value for a Button's Text property being restored to a Label's Text property on page postback. This raises some interesting questions. What will happen if:



1. The second control doesn't have a property with the same name?

2. The second control has the property, but with a different data type?

3. The two controls are very different, say, a Button and a GridView?



Let's find out!



public partial class _Default : Page
{
    protected override void OnInit(EventArgs e)
    {
        base.OnInit(e);
        if (!IsPostBack)
        {
            Button btnClickMe = new Button();
            form1.Controls.Add(btnClickMe);
            btnClickMe.Text = "Click me";
            btnClickMe.CommandName = "XXX";
        }
        else
        {
            Label label = new Label();
            form1.Controls.Add(label);
        }
    }
}



This time I have assigned a value to the button's CommandName property, but Label doesn't have this property. If you run this code, still no exception occurs. Is the "XXX" for CommandName simply ignored? Let's have a look at the saved ViewState after the postback:



/wEPDwUKMTQ2OTkzNDMyMQ9kFgICAw9kFgICAw8PFgQeBFRleHQFCENsaWNrIG1lHgtDb21tYW5kTmFtZQUDWFhYZGRk7q5i15YA6gDUPW8m/IVLqGXnb+4=

1469934321
3
3
Text
Click me
CommandName
XXX























Note that this is the saved ViewState after a postback, so the values are for the label. Even though a Label doesn't have a CommandName property, the value was still written to the Label's ViewState bag and then saved. So if you dynamically change a control at runtime, the new control may silently "inherit" some rubbish ViewState from the control that previously sat at that position and carry it around all the time, passing it from server to client and from client back to server.



Testing the second question took a bit more code: I couldn't find a property that exists on two different controls with different data types, so I defined my own.



// Color below is System.Drawing.Color.
public class MyButton : Button
{
    // Stored in the ViewState bag as a string.
    public string MyProperty
    {
        get { return ViewState["MyProperty"] == null ? String.Empty : ViewState["MyProperty"] as String; }
        set { ViewState["MyProperty"] = value; }
    }
}

public class MyLabel : Label
{
    // Same key as MyButton.MyProperty, but stored as a Color.
    public Color MyProperty
    {
        get { return ViewState["MyProperty"] == null ? Color.Black : (Color)ViewState["MyProperty"]; }
        set { ViewState["MyProperty"] = value; }
    }
}

public partial class _Default : System.Web.UI.Page
{
    protected override void OnInit(EventArgs e)
    {
        base.OnInit(e);
        if (!IsPostBack)
        {
            MyButton btnClickMe = new MyButton();
            form1.Controls.Add(btnClickMe);
            btnClickMe.MyProperty = "XXX";
        }
        else
        {
            MyLabel label = new MyLabel();
            label.ID = "label";
            form1.Controls.Add(label);
        }
    }

    protected override void OnLoad(EventArgs e)
    {
        base.OnLoad(e);
        if (IsPostBack)
        {
            MyLabel label = form1.FindControl("label") as MyLabel;
            // Reads the value that was saved for MyButton.
            label.Text = label.MyProperty.ToString();
        }
    }
}



As you can see, both MyButton and MyLabel have a property called MyProperty, but MyButton.MyProperty is of type string while MyLabel.MyProperty is of type Color.



From the previous test we have learned that LoadViewState() writes the ViewState values directly into a control's ViewState bag without bothering to go through the control's properties. So I would expect "XXX" to be written into MyLabel’s ViewState bag successfully even though MyLabel.MyProperty really expects a Color value, but we are going to have a problem when we try to access the value through MyLabel.MyProperty.



My guess was right this time: if you run the code, an InvalidCastException is thrown by (Color)ViewState["MyProperty"] when the property is accessed in OnLoad().
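The same failure can be reproduced outside ASP.NET with a plain dictionary standing in for the ViewState bag. This is a sketch with assumptions: StateBagCastDemo is a made-up helper, and int is used in place of Color so the example needs nothing beyond the base class library. It also shows one possible defensive getter that checks the stored type before casting.

```csharp
using System;
using System.Collections.Generic;

public static class StateBagCastDemo
{
    // Mirrors the direct-cast getter style: throws InvalidCastException
    // if the stored value has a different type.
    public static int ReadUnsafe(IDictionary<string, object> bag, string key)
    {
        return bag.ContainsKey(key) ? (int)bag[key] : 0;
    }

    // Defensive variant: falls back when the value is missing or of the wrong type.
    public static int ReadSafe(IDictionary<string, object> bag, string key, int fallback)
    {
        object value;
        if (bag.TryGetValue(key, out value) && value is int)
            return (int)value;
        return fallback;
    }

    public static void Main()
    {
        Dictionary<string, object> bag = new Dictionary<string, object>();
        bag["MyProperty"] = "XXX";   // left behind by a control that stored a string

        Console.WriteLine(ReadSafe(bag, "MyProperty", -1)); // -1

        try
        {
            ReadUnsafe(bag, "MyProperty");
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }
    }
}
```

A defensive getter hides the symptom but not the cause: the stale value still sits in the bag and still travels back and forth with the page.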



OK, now the last one: what will happen if the two controls are very different? Maybe we don't need something as complicated as a GridView, so let's just try a DropDownList:



public partial class _Default : Page
{
    protected override void OnInit(EventArgs e)
    {
        base.OnInit(e);
        if (!IsPostBack)
        {
            Button btnClickMe = new Button();
            form1.Controls.Add(btnClickMe);
            btnClickMe.Text = "Click me";
        }
        else
        {
            DropDownList ddl = new DropDownList();
            form1.Controls.Add(ddl);
        }
    }
}



When I ran the page, after clicking Postback button, I got a page returned with this error message:

Failed to load viewstate. The control tree into which viewstate is being loaded must match the control tree that was used to save viewstate during the previous request. For example, when adding controls dynamically, the controls added during a post-back must match the type and position of the controls added during the initial request.

Aha, finally got it!



To find out why, I made some little changes, first I defined MyDropDownList:



public class MyDropDownList : DropDownList
{
    protected override void LoadViewState(object savedState)
    {
        // Breakpoint goes here to inspect savedState.
        base.LoadViewState(savedState);
    }
}



MyDropDownList adds nothing of its own; it just overrides LoadViewState() so that I can place a breakpoint there.



And then I changed page to use MyDropDownList:



public partial class _Default : Page
{
    protected override void OnInit(EventArgs e)
    {
        base.OnInit(e);
        if (!IsPostBack)
        {
            Button btnClickMe = new Button();
            form1.Controls.Add(btnClickMe);
            btnClickMe.Text = "Click me";
        }
        else
        {
            MyDropDownList ddl = new MyDropDownList();
            form1.Controls.Add(ddl);
        }
    }
}



Let’s see what happens. Put a breakpoint at base.LoadViewState(), run it, and click the Postback button: MyDropDownList is created, then its LoadViewState() is invoked, and the program stops at base.LoadViewState(). Good, it works as expected! But hold on, this looks like a problem: LoadViewState(object savedState) seems to be expecting a Triplet object as its parameter, but what is actually passed in here is a Pair!



It does make sense, doesn't it? Don't forget that the Pair object is the saved ViewState left behind by btnClickMe, and on the postback ASP.NET doesn’t know which control it belongs to. All the ViewState tree can tell is "it belongs to the 3rd control on form1". On the postback, the 3rd control on form1 is now a DropDownList, but ASP.NET is silly enough to try to restore it with the Pair object. A DropDownList can only be restored from a Triplet object, so of course, when LoadViewState() tries to do something like "Triplet triplet1 = (Triplet)savedState;", an exception occurs.



After inspecting some ASP.NET framework code using Lutz Roeder's .NET Reflector, especially the SaveViewState() and LoadViewState() methods, I finally got a better picture of what happened. A control actually has full responsibility for saving and loading its own ViewState. In ASP.NET, most controls inherit the behavior defined in Control, WebControl or ListControl, but a control has complete control over it. In theory, a control can use any data structure to hold its saved ViewState, as long as it is serializable and both SaveViewState() and LoadViewState() understand it. Normally the ViewState of a WebControl is saved as a Pair object. If a Pair is not enough, a Triplet object may be used, as ListControl does (ListControl needs another object to hold the state of its child items). When ASP.NET tries to restore a control's ViewState from a value that was saved for another control, and the two controls use different object types for their saved ViewState, the exception above is thrown.



Conclusion:

When ASP.NET restores ViewState values to a page and its controls, a ViewState tree is created by deserializing the __VIEWSTATE value on the html page. The ViewState tree contains only control indices and key-value pairs. ASP.NET finds the control for a ViewState item by index*, and writes the value directly into the control's ViewState bag. If you dynamically create or remove controls at runtime, it is very easy to fool ASP.NET into restoring ViewState values to the wrong control and causing a problem. Depending on which control is dynamically created or removed, the following problems may occur:



1. Runtime exception when restoring ViewState

2. Runtime exception when accessing a property of a control

3. A control's property may have an unexpected value

4. A control may carry rubbish ViewState values and increase the size of the __VIEWSTATE field on the html page



The problems above may be difficult to notice and debug, especially 3 and 4.

*One can override this behavior using ViewStateModeByIdAttribute.