Tuesday, July 21, 2015

Load Balancing with Apache: The Mighty mod_proxy_balancer

As your super-hot website draws more visitors each day, you may want to run multiple web servers to serve it more efficiently and with lower latency. Distributing the HTTP traffic among those servers then becomes a crucial task. Fortunately, the Apache HTTP server, the de facto standard, supports load balancing through its own modules, from plain and simple setups to extremely complex ones. Even if you don't have a super-hot website, you may want to set up a load balancer on your local server for statistical purposes, e.g. measuring the number of pending (in-flight) requests during a load test.

Apache's load balancing goes hand in hand with virtual hosts and proxies. A virtual host (vhost) allows Apache to simulate a virtual domain inside the web server's context. For example, you can set up multiple vhosts to host several websites on your system; this is commonly used by hosting providers to host websites from multiple users on a limited number of systems, as it provides an easy way of practically isolating the sites from one another. In our context, a proxy mediates by routing traffic to a destination different from its initial target, which in our case would be one worker from a pool of load balancing servers (balancer members).

Reminder: If you plan to follow this guide incrementally, don't forget to restart Apache after each config change or a2enmod command!

You need to enable a few modules to get proxies working under Apache: mod_proxy (for proxying) and mod_proxy_http (for proxying HTTP traffic). The a2enmod command used for this (available on Debian-based systems) must be run with root privileges, so if you are not root but are in the sudoers group, you'll have to prefix it with sudo.

Syntax:
a2enmod <space-separated list of modules to enable, without the mod_ prefix>

e.g. for our case:
a2enmod proxy proxy_http

To configure a vhost, you have to add a <VirtualHost> entry to Apache's config. This usually goes into a separate proxy.conf file under the mods-enabled subdirectory of the Apache config directory (e.g. /etc/apache2 or /etc/httpd). This file usually appears after you enable mod_proxy as described earlier.

<IfModule mod_proxy.c>
	Listen 8080
	<VirtualHost 127.0.0.1:8080>
		ErrorLog /var/log/apache2/proxy.log
		DocumentRoot /var/www/html/my-subdomain
	</VirtualHost>
</IfModule>

The above example simply instructs Apache to listen for connections on port 8080 on localhost (127.0.0.1) and to serve those requests from the content found under the directory /var/www/html/my-subdomain, logging any errors to /var/log/apache2/proxy.log.

For adding a load balancer, we remove the DocumentRoot directive and define a proxy under the vhost:

		<Proxy balancer://myproxy>
			BalancerMember http://127.0.0.1:8081
			BalancerMember http://127.0.0.1:8082
		</Proxy>

and route traffic for the desired domain to that proxy:

		ProxyPass / balancer://myproxy/ lbmethod=bybusyness

This balances requests arriving at 127.0.0.1:8080 (directed at the root, /) across two proxy endpoints (BalancerMembers) at 127.0.0.1:8081 and 127.0.0.1:8082, based on how busy each one currently is (its number of pending requests).

Don't forget to enable the mod_proxy_balancer and mod_lbmethod_bybusyness modules. The first provides the actual load balancing feature while the other enables the by-busyness traffic routing policy. Other policies like byrequests and bytraffic are also supported, and you'll have to edit the ProxyPass directive and enable the relevant modules accordingly in order to use them.
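To make the by-busyness idea concrete, here is a minimal Python sketch of the selection rule: send each new request to the member with the fewest requests in flight. It is only a simplified model and not how mod_lbmethod_bybusyness is actually implemented; the Member class and the dispatch/finish helpers are hypothetical names invented for this illustration, and details the real module handles (such as members that are disabled or in an error state) are ignored.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """One balancer member and its count of in-flight requests (hypothetical model)."""
    url: str
    busy: int = 0  # requests currently being processed


def pick_by_busyness(members):
    """Return the member with the fewest in-flight requests (first one wins ties)."""
    return min(members, key=lambda m: m.busy)


def dispatch(members):
    """Choose a member for a new request and mark that request as in flight."""
    chosen = pick_by_busyness(members)
    chosen.busy += 1
    return chosen


def finish(member):
    """Mark one of the member's requests as completed."""
    member.busy -= 1


members = [Member("http://127.0.0.1:8081"), Member("http://127.0.0.1:8082")]
first = dispatch(members)    # both idle, so the first member is chosen
second = dispatch(members)   # the first is now busier, so the second is chosen
finish(second)               # the second member finishes its request
third = dispatch(members)    # the second is idle again and is chosen once more
print(first.url, second.url, third.url)
```

A byrequests-style policy would instead rotate through the members regardless of how long each request takes, which is why bybusyness tends to suit workloads with uneven request durations.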

For viewing load balancing statistics, we may also define a balancer-manager endpoint (which is provided by mod_proxy_balancer) under the vhost:

		<Location /balancer-manager>
			SetHandler balancer-manager
			Order deny,allow
			Allow from all
		</Location>

and add another ProxyPass directive, also under the vhost, that excludes this path from proxying (the trailing ! means "do not proxy") so that it stays accessible:

		ProxyPass /balancer-manager !

Visiting http://127.0.0.1:8080/balancer-manager would then display overall and per-member statistics, such as the number of requests served and pending, and errors.

The full config now looks like this:

<IfModule mod_proxy.c>
	Listen 8080
	<VirtualHost 127.0.0.1:8080>
		ErrorLog /var/log/apache2/proxy.log

		<Proxy balancer://myproxy>
			BalancerMember http://127.0.0.1:8081
			BalancerMember http://127.0.0.1:8082
		</Proxy>

		<Location /balancer-manager>
			SetHandler balancer-manager
			Order deny,allow
			Allow from all
		</Location>

		ProxyPass /balancer-manager !
		ProxyPass / balancer://myproxy/ lbmethod=bybusyness
	</VirtualHost>
</IfModule>

Unfortunately, the current set-up does not handle sessions properly. If your site tracks user data with sessions (e.g. $_SESSION in PHP), the session data lives only on the member that served the user's first request, so it will intermittently become unavailable as that user's requests alternate among the proxy endpoints (balancer members).

One solution is to add (under <VirtualHost>) a ROUTEID cookie to the response headers:

		Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED

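and make subsequent requests 'stick to' the initially chosen endpoint via a ProxySet directive (under <Proxy>); note that each BalancerMember also needs a route parameter (e.g. route=1) for the cookie to carry a usable value: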

			ProxySet stickysession=ROUTEID

making the complete config look like this:

<IfModule mod_proxy.c>
	Listen 8080
	<VirtualHost 127.0.0.1:8080>
		ErrorLog /var/log/apache2/proxy.log

		Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
		<Proxy balancer://myproxy>
			BalancerMember http://127.0.0.1:8081 route=1
			BalancerMember http://127.0.0.1:8082 route=2
			ProxySet stickysession=ROUTEID
		</Proxy>

		<Location /balancer-manager>
			SetHandler balancer-manager
			Order deny,allow
			Allow from all
		</Location>

		ProxyPass /balancer-manager !
		ProxyPass / balancer://myproxy/ lbmethod=bybusyness
	</VirtualHost>
</IfModule>

Now you will also need to enable mod_headers for the Header add operation to work.

In this case, the first worker (endpoint, say worker1) to handle the first request coming from some user X (strictly speaking, a client; in most cases a browser) sets the value of the ROUTEID cookie to its own route. Subsequent requests from X are then routed exclusively to worker1, based on the value already stored in the ROUTEID cookie.
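The routing decision described above can be sketched in a few lines of Python. This is a simplified model of the idea rather than Apache's actual code: the function names, the routes table (route "1" for port 8081, route "2" for port 8082) and the fallback value are all assumptions made for the illustration.

```python
def route_from_cookie(cookie_header, name="ROUTEID"):
    """Extract the route from a Cookie header value such as 'ROUTEID=.1; other=x'."""
    for part in cookie_header.split(";"):
        key, _, value = part.strip().partition("=")
        if key == name:
            # The cookie is written as ".<route>"; in this sketch, everything
            # after the first dot is taken as the route.
            _, dot, route = value.partition(".")
            return route if dot and route else None
    return None


def choose_member(cookie_header, routes, fallback):
    """Return the member pinned by the cookie, or the fallback if the route is missing or unknown."""
    route = route_from_cookie(cookie_header)
    return routes.get(route, fallback)


# Hypothetical route table for the two members used in this post.
routes = {"1": "http://127.0.0.1:8081", "2": "http://127.0.0.1:8082"}

print(choose_member("ROUTEID=.2; theme=dark", routes, "pick via lbmethod"))
print(choose_member("theme=dark", routes, "pick via lbmethod"))
```

A client that presents a known route keeps hitting the same member, while a client with no cookie (or a stale one) falls back to the normal balancing policy and receives a fresh cookie.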

The above was a very brief introduction to load balancing with Apache. Combining this with other advanced features of mod_proxy and mod_proxy_balancer, as well as other modules of Apache, you will be able to set up a sophisticated site config on Apache quite easily.

Thursday, July 16, 2015

Fine Tune Apache Web Server

Tuning HTTPD - Apache2 Web Server

Hello there!
I will show you how to configure the Apache2 (httpd) web server to withstand heavy load. This is a useful preparation step if you are about to run a performance analysis of a web application running on Apache. All you need to do is add or modify the following lines in your config file.

The config file can usually be found at either [/etc/apache2/apache2.conf] or [/etc/httpd/conf/httpd.conf]

<IfModule prefork.c>
StartServers       4
MinSpareServers    3
MaxSpareServers   10
ServerLimit      256
MaxClients       256
MaxRequestsPerChild  10000
</IfModule>
  
These settings apply to the prefork MPM, where every child process handles exactly one connection at a time. When Apache starts, it launches 4 child processes, as set by StartServers. As load changes, Apache keeps between 3 (MinSpareServers) and 10 (MaxSpareServers) idle processes in reserve. MaxClients caps the number of simultaneous connections at 256, and ServerLimit is the hard ceiling on how high MaxClients can be set, so the two are normally raised together. MaxRequestsPerChild makes each child process exit after serving 10000 requests, which limits the effect of memory leaks.

Threaded MPMs such as worker behave differently, because each child process runs several threads. Suppose StartServers is 2, ThreadsPerChild is 25, ServerLimit is 16 and MaxClients is 200. Apache then starts 2 processes that can serve 2x25=50 concurrent clients, and starts another process (serving another 25 clients) as more users arrive. ServerLimit allows at most 16 processes, i.e. 16x25=400 concurrent clients, but since MaxClients is only 200, no new process is started after the 8th. Conversely, if you set MaxClients to 1000 while leaving ServerLimit at 16, Apache still stops at 16 processes and 400 clients. In that case you also need to raise ServerLimit to MaxClients/ThreadsPerChild = 1000/25 = 40.
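The thread arithmetic above (16x25=400, the cap of 200, and 1000/25=40) can be double-checked with a small Python sketch. The two helper functions are hypothetical names written for this post, and they model only the relationship between ServerLimit, ThreadsPerChild and MaxClients for a threaded MPM, nothing else about Apache.

```python
import math


def worker_capacity(server_limit, threads_per_child, max_clients):
    """Concurrent clients a threaded MPM can serve: the smaller of the two caps."""
    return min(max_clients, server_limit * threads_per_child)


def required_server_limit(max_clients, threads_per_child):
    """Smallest ServerLimit that lets MaxClients actually be reached."""
    return math.ceil(max_clients / threads_per_child)


print(worker_capacity(16, 25, 200))      # 200: MaxClients is the tighter cap (8 processes)
print(worker_capacity(16, 25, 1000))     # 400: ServerLimit x ThreadsPerChild is the tighter cap
print(required_server_limit(1000, 25))   # 40
```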

Once you have added the above lines, restart Apache so that the new configuration is loaded.

sudo service apache2 restart


Watch The Server

Keep an eye on the number of Apache processes, and the total RAM used. Here's a command that wraps this into a single output that updates every second:

watch -n 1 "echo -n 'Apache Processes: ' && ps -C apache2 --no-headers | wc -l && free -m"

It produces output like this:

Every 1.0s: echo -n 'Apache Processes: ' && ps -C apache2 --no-headers | wc -l && free -m

Apache Processes: 27
            total       used        free        shared      buffers     cached
Mem:        8204        7445        758         0           385         4657
-/+ buffers/cache:      2402        5801
Swap:       16383       189         16194