Monday, November 26, 2012

Tomcat Clustering Series Part 1 : Simple Load Balancer

Source: http://www.ramkitech.com/2012/10/tomcat-clustering-series-simple-load.html
I am starting a new series of posts about Tomcat clustering. In this first post we will see what the problem is with a normal deployment on a single machine, what clustering is and why it is necessary, and how to set up a simple load balancer with the Apache httpd web server and a Tomcat server cluster. [Check the video for better understanding]

Why Do We Need Clustering? (Tomcat Clustering)
In a normal deployment, the production server runs on a single machine. If that machine fails because of a crash, a hardware defect, or an OutOfMemoryError, nobody can access our site.

So how do we solve this problem? We add more Tomcat machines that collectively (as a group/cluster) run as the production server, as opposed to a single machine. Each Tomcat has the same web application deployed, so any Tomcat can process a client request. If one Tomcat fails, another Tomcat in the cluster handles the request.

Here one big problem arises. Either each Tomcat instance runs on a dedicated physical machine, or many Tomcat instances run on a single machine (check my post about running multiple Tomcat instances on a single machine). So each Tomcat runs on a different port and maybe on a different IP.

From the client's perspective the problem is: to which Tomcat should we send the request? Many Tomcat instances are running as part of the cluster, and each one is reached through its own IP and port combination, like
http://192.168.56.190:8080/ or http://192.168.56.191:8181/
So how do we solve this problem?
We add one server in front of all the Tomcat instances in the cluster. It accepts all requests and distributes them across the cluster, so this server acts as a load balancer. Many servers with load balancing capability are available; here we are going to use the Apache httpd web server with the mod_jk module as the load balancer.
Now all clients access the load balancer (Apache httpd web server) and don't need to care about the individual Tomcat instances. So now our URL is http://ramkitech.com/ (Apache runs on port 80).

Apache httpd Web Server
Here we are going to use the Apache httpd web server as the load balancer. To give Apache httpd load balancing capability we need to include either the mod_proxy module or the mod_jk module. Here we are using mod_jk.

Before continuing with this post, check my old post (Virtual Host Apache httpd server) about how to install the Apache httpd server and the mod_jk module, and how to configure mod_jk.

How to Set Up the Simple Load Balancer
For simplicity I am going to run 3 Tomcat instances on a single machine (they could run on dedicated machines too) together with the Apache httpd web server. A single web application is deployed in all the Tomcat instances.
Here we use the mod_jk module as the load balancer. By default it uses the round-robin algorithm to distribute the requests. Now we need to configure the workers.properties file, as in the virtual host setup for Apache httpd server.

worker.list=tomcat1,tomcat2,tomcat3

worker.tomcat1.type=ajp13
worker.tomcat1.port=8009
worker.tomcat1.host=localhost

worker.tomcat2.type=ajp13
worker.tomcat2.port=8010
worker.tomcat2.host=localhost

worker.tomcat3.type=ajp13
worker.tomcat3.port=8011
worker.tomcat3.host=localhost

Here I configure the 3 Tomcat instances in the workers.properties file. The type is ajp13, the port is the AJP port (not the HTTP connector port), and the host is the host name or IP address of the machine running that Tomcat instance.
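Because the port here must be the AJP port and not the HTTP port, it can help to confirm that something is actually listening on each configured port before wiring up the balancer. Below is a minimal sketch in Python. The helper names (is_port_open, check_workers) are my own, and the host/port values simply mirror the entries above. It only tests that a TCP connection can be opened; it does not speak the AJP protocol.

```python
import socket

# Host/port pairs copied from the workers.properties entries above.
AJP_WORKERS = {
    "tomcat1": ("localhost", 8009),
    "tomcat2": ("localhost", 8010),
    "tomcat3": ("localhost", 8011),
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


def check_workers(workers):
    """Map each worker name to whether its configured port accepts connections."""
    return {name: is_port_open(host, port) for name, (host, port) in workers.items()}


if __name__ == "__main__":
    for name, ok in check_workers(AJP_WORKERS).items():
        print(f"{name}: {'listening' if ok else 'not reachable'}")
```

If a worker shows as not reachable, the usual suspects are a Tomcat that is not running or an AJP connector configured on a different port than the one listed in workers.properties.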

There are a couple of special workers we need to add to the workers.properties file.

The first one is the load balancer worker. Here its name is balancer (you can use any name).


worker.balancer.type=lb
worker.balancer.balance_workers=tomcat1,tomcat2,tomcat3

Here the worker type is lb, i.e. load balancer. It is a special worker type provided by mod_jk. The other property, balance_workers, lists all the Tomcat instances the balancer manages, such as tomcat1,tomcat2,tomcat3 (comma separated).

The second one is the status worker. It is optional, but from this worker we can get statistics about the load balancer.

worker.stat.type=status

Here we use the special type status.

Now we modify the worker.list property.

worker.list=balancer,stat

So from outside only 2 workers are visible (balancer and stat). All requests come to the balancer worker, and the balancer worker manages all the Tomcat instances.

Complete workers.properties file

worker.list=balancer,stat

worker.tomcat1.type=ajp13
worker.tomcat1.port=8009
worker.tomcat1.host=localhost

worker.tomcat2.type=ajp13
worker.tomcat2.port=8010
worker.tomcat2.host=localhost

worker.tomcat3.type=ajp13
worker.tomcat3.port=8011
worker.tomcat3.host=localhost

worker.balancer.type=lb
worker.balancer.balance_workers=tomcat1,tomcat2,tomcat3

worker.stat.type=status
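A typo in a worker name is easy to make in this file, so here is a small sketch that parses the text and checks that every name in worker.list and in balance_workers is actually defined. This is my own illustration in Python, not something mod_jk ships. The function names (parse_workers, find_problems) are hypothetical, and the parser only understands the simple key=value lines used in this post.

```python
from collections import defaultdict

WORKERS_PROPERTIES = """\
worker.list=balancer,stat

worker.tomcat1.type=ajp13
worker.tomcat1.port=8009
worker.tomcat1.host=localhost

worker.tomcat2.type=ajp13
worker.tomcat2.port=8010
worker.tomcat2.host=localhost

worker.tomcat3.type=ajp13
worker.tomcat3.port=8011
worker.tomcat3.host=localhost

worker.balancer.type=lb
worker.balancer.balance_workers=tomcat1,tomcat2,tomcat3

worker.stat.type=status
"""


def parse_workers(text):
    """Return (names in worker.list, {worker name: {property: value}})."""
    exposed = []
    workers = defaultdict(dict)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key == "worker.list":
            exposed = [n.strip() for n in value.split(",") if n.strip()]
            continue
        parts = key.split(".", 2)  # worker.<name>.<property>
        if len(parts) == 3 and parts[0] == "worker":
            workers[parts[1]][parts[2]] = value
    return exposed, dict(workers)


def find_problems(exposed, workers):
    """List references to workers that are never defined."""
    problems = []
    for name in exposed:
        if name not in workers:
            problems.append(f"worker.list names undefined worker '{name}'")
    for name, props in workers.items():
        if props.get("type") != "lb":
            continue
        for member in props.get("balance_workers", "").split(","):
            member = member.strip()
            if member and member not in workers:
                problems.append(f"'{name}' balances undefined worker '{member}'")
    return problems


if __name__ == "__main__":
    exposed, workers = parse_workers(WORKERS_PROPERTIES)
    print("visible workers:", exposed)
    print("problems:", find_problems(exposed, workers) or "none")
```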


Now the workers.properties configuration is finished. Next we need to send all requests to the balancer worker,
so modify the httpd.conf file of the Apache httpd server:

LoadModule    jk_module  modules/mod_jk.so

JkWorkersFile conf/workers.properties

JkLogFile     logs/mod_jk.log
JkLogLevel    emerg
JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
JkOptions     +ForwardKeySize +ForwardURICompat -ForwardDirectories
JkRequestLogFormat     "%w %V %T"

JkMount  /status  stat
JkMount  /  balancer

The above is mostly boilerplate. The LoadModule line loads the mod_jk module, and the JkWorkersFile line specifies the workers file (workers.properties). Most of the remaining directives configure logging; JkOptions sets request-forwarding options.

The last two lines are the important ones.
JkMount  /status  stat means that any request matching /status is forwarded to the stat worker. It is a status type worker, so it shows the status of the load balancer.

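JkMount  /  balancer is meant to match all requests, so every request is forwarded to the balancer worker. (If only the root URL gets forwarded in your setup, use the wildcard form JkMount /* balancer, which matches every path.) The balancer worker uses the round-robin algorithm to distribute the requests among the Tomcat instances.
That's it.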
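To make the round-robin idea concrete, here is a tiny Python sketch of how requests rotate over the three workers. It is only an illustration of plain round robin; the function name round_robin is my own, and the real mod_jk balancer also takes per-worker weights and load counters into account.

```python
from collections import Counter
from itertools import cycle


def round_robin(workers, request_count):
    """Assign request numbers 1..request_count to workers in strict rotation."""
    rotation = cycle(workers)
    return [(request_id, next(rotation)) for request_id in range(1, request_count + 1)]


if __name__ == "__main__":
    assignments = round_robin(["tomcat1", "tomcat2", "tomcat3"], 6)
    for request_id, worker in assignments:
        print(f"request {request_id} -> {worker}")
    # Each worker ends up with the same share of the requests.
    print(Counter(worker for _, worker in assignments))
```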
Now access the load balancer from the browser. The requests are distributed across the 3 Tomcat instances. If one of the Tomcat instances fails, the load balancer detects it and stops forwarding requests to the failed instance, while the other Tomcat instances continue to work. When the failed Tomcat recovers and returns to its normal state, the load balancer adds it back to the cluster and forwards requests to it again. (check the video)

The big question here is: how does the load balancer know when a Tomcat instance has failed, or when it has just recovered from a failed state?
Ans: When a Tomcat instance fails, the load balancer does not know about it at first, so it keeps trying to forward requests to all Tomcat instances. When it tries to forward a request to the failed instance, that instance does not respond. The load balancer then marks its state as failed and forwards the same request to another Tomcat instance. So from the client's perspective we do not notice that one Tomcat instance has failed.
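The failover behaviour described above can be sketched in a few lines of Python. This is a simplified model, not mod_jk's actual implementation. The class SimpleBalancer and the send callable are hypothetical names, and a raised ConnectionError stands in for "the Tomcat instance did not respond".

```python
class SimpleBalancer:
    """Rotates over workers and fails over when a worker does not respond."""

    def __init__(self, workers):
        self.workers = list(workers)
        self.failed = set()  # workers currently marked as failed
        self._next = 0       # index of the next worker in the rotation

    def handle(self, request, send):
        """Forward the request; on failure mark the worker and retry on the next one."""
        for _ in range(len(self.workers)):
            worker = self.workers[self._next]
            self._next = (self._next + 1) % len(self.workers)
            if worker in self.failed:
                continue  # already known to be down, skip it
            try:
                return worker, send(worker, request)
            except ConnectionError:
                # No response: remember the failure, then try the same request elsewhere.
                self.failed.add(worker)
        raise RuntimeError("no healthy worker available")


if __name__ == "__main__":
    down = {"tomcat2"}  # pretend this instance has crashed

    def send(worker, request):
        if worker in down:
            raise ConnectionError(f"{worker} did not respond")
        return f"{request} served by {worker}"

    balancer = SimpleBalancer(["tomcat1", "tomcat2", "tomcat3"])
    for i in range(1, 5):
        print(balancer.handle(f"request {i}", send))
    print("marked failed:", balancer.failed)
```

The caller always gets an answer as long as at least one worker is healthy, which is why the client never notices the failed instance.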

When a Tomcat instance recovers from the failed state, the load balancer again does not know immediately that it is ready to process requests; its state is still marked as failed. At a periodic interval (60 seconds by default) the load balancer checks the health status of the Tomcat instances. Only after this check does the load balancer learn that the Tomcat instance is ready, and it updates the status to OK.
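The recovery timing can be modelled like this. Again this is only a Python sketch under my own assumptions: the HealthTracker class and the is_alive callable are hypothetical, the clock is passed in explicitly so the example does not have to wait, and the 60-second value just mirrors the default interval mentioned above.

```python
RECOVER_TIME = 60.0  # seconds; mirrors the default interval mentioned above


class HealthTracker:
    """Keeps failed workers out of rotation until the recovery interval has passed."""

    def __init__(self, recover_time=RECOVER_TIME):
        self.recover_time = recover_time
        self.failed_at = {}  # worker name -> time the failure was noticed

    def mark_failed(self, worker, now):
        self.failed_at[worker] = now

    def state(self, worker, now):
        """Return 'OK', 'FAILED' (still waiting) or 'RETRY' (interval has elapsed)."""
        if worker not in self.failed_at:
            return "OK"
        if now - self.failed_at[worker] >= self.recover_time:
            return "RETRY"
        return "FAILED"

    def probe(self, worker, now, is_alive):
        """Once the interval has elapsed, check the worker and update its state."""
        current = self.state(worker, now)
        if current != "RETRY":
            return current
        if is_alive(worker):
            del self.failed_at[worker]  # back in the cluster
            return "OK"
        self.failed_at[worker] = now    # still down, wait another interval
        return "FAILED"


if __name__ == "__main__":
    tracker = HealthTracker()
    tracker.mark_failed("tomcat2", now=0)
    # Tomcat comes back at t=10, but the balancer only notices at the next check.
    for t in (10, 30, 60):
        print(t, tracker.probe("tomcat2", now=t, is_alive=lambda w: True))
```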