HAProxy, which stands for High Availability Proxy, is open-source load balancing and proxying software for TCP and HTTP applications. Noted for its speed and lightweight design, it is one of the industry’s most widely chosen load balancers. On high-traffic days such as holidays and festive seasons, hundreds of thousands of concurrent users may hit e-commerce sites built on platforms like Magento. At such times, it becomes critical to balance the servers’ workload.
HAProxy as a Load Balancer
HAProxy is used as a load balancing solution by high-profile websites such as GitHub, Stack Overflow, and Twitter. It reduces the workload on any single machine by distributing requests across multiple servers. The other primary reasons to use a load balancer are that it helps prevent web server crashes and efficiently handles a large number of concurrent connections. The following are widely used load balancing options:
- AWS AutoScaling
- Nginx Load Balancer
- HAProxy Load Balancer
Here, we use HAProxy as the load balancer for our website, since Nginx is already our web server. Although Nginx can act as both web server and load balancer, we stick with HAProxy, because using a single piece of software for both purposes may slow the system down.
Understanding the HAProxy Algorithms
A load balancing algorithm decides which server receives each incoming request. HAProxy comes with several algorithms for effective load balancing, a few of which are explained below. To know more, visit http://cbonte.github.io/haproxy-dconv/2.2/configuration.html
Round Robin
It is one of the simplest and smoothest algorithms of all. Each server is used in turn, according to its weight, so with equal weights every server receives an equal share of the incoming requests. It gives the fairest results when the servers’ processing time stays evenly distributed.
Once every server has had its turn, the rotation starts afresh from the first server in the list, and this cycle continues.
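The rotation can be pictured with a small Python sketch. This is only an illustration of the idea, not HAProxy’s implementation (which interleaves weighted servers more smoothly); the function name and the pool of equal-weight servers are assumptions made for the example.

```python
from itertools import cycle

def weighted_round_robin(servers):
    """Return an endless iterator over server names.

    `servers` is a list of (name, weight) pairs; a server with weight w
    appears w times in each full rotation.
    """
    # Naive expansion: repeat each name `weight` times, then loop forever.
    expanded = [name for name, weight in servers for _ in range(weight)]
    return cycle(expanded)

# Hypothetical pool: the three back-end servers from this article, equal weights.
pool = [("example1.com", 1), ("example2.com", 1), ("example3.com", 1)]
picker = weighted_round_robin(pool)

# Six consecutive requests walk through the pool twice.
first_six = [next(picker) for _ in range(6)]
print(first_six)
```

With equal weights, every third request lands on the same server.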
Least Connections
It works by dynamically choosing the server with the least number of active connections whenever a client tries to connect. This keeps the load even across the servers, as the rule goes:
Least Connections = Less Load = Better Performance.
It is recommended where sessions are long-lived.
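As a rough sketch of the selection rule, assuming the balancer keeps a counter of active connections per server (the counts below are invented for illustration):

```python
def pick_least_connections(active):
    """Return the server with the fewest active connections.

    `active` maps server name -> current connection count.
    On a tie, the server listed first wins.
    """
    return min(active, key=active.get)

# Hypothetical connection counts at the moment a new client arrives.
active = {"example1.com": 12, "example2.com": 3, "example3.com": 7}

chosen = pick_least_connections(active)
active[chosen] += 1  # the new connection now counts against that server
print(chosen, active[chosen])
```

Because the counters change with every connect and disconnect, the choice is re-evaluated for each new client.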
Source
The source IP address is hashed and divided by the total weight of the active servers to pick a server, which ensures that a specific IP is always connected to the same server. Only when a server goes down or comes up is the user (identified by IP address) connected to a different server.
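The hashing step described above can be sketched as follows. CRC32 is used here purely as a stand-in hash, and the function name, pool, and client address are assumptions for the example; HAProxy’s internal hash function is different.

```python
import zlib

def pick_by_source(client_ip, servers):
    """Map a client IP to a server by hashing it modulo the total weight.

    `servers` is a list of (name, weight) pairs for the servers that are
    currently up.
    """
    # One slot per unit of weight, so len(slots) is the total weight.
    slots = [name for name, weight in servers for _ in range(weight)]
    digest = zlib.crc32(client_ip.encode("ascii"))  # stand-in hash (assumption)
    return slots[digest % len(slots)]

# Hypothetical pool and client address.
pool = [("example1.com", 1), ("example2.com", 1), ("example3.com", 1)]
first = pick_by_source("203.0.113.7", pool)
second = pick_by_source("203.0.113.7", pool)
print(first == second)  # the same IP maps to the same server
```

If a server is removed from the pool, the total weight changes, so some clients are remapped to a different server.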
Sticky Sessions
An affinity is created between each client and a back-end server, so that all requests belonging to that client’s session are sent to the same server. Keeping a client on one server for the duration of its session, that is, the time it spends on the website, is called session persistence or a sticky session.
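One way to picture the affinity is a lookup table from session identifier to server. The class below is a hypothetical sketch of that idea, not how HAProxy stores persistence; the session identifiers stand for whatever token (a cookie value, for instance) identifies the client.

```python
from itertools import cycle

class StickyBalancer:
    """Pin each session to the server chosen for its first request."""

    def __init__(self, servers):
        self._next_server = cycle(servers)  # used only for new sessions
        self._affinity = {}                 # session id -> server name

    def route(self, session_id):
        # First request of a session: pick a server and remember it.
        if session_id not in self._affinity:
            self._affinity[session_id] = next(self._next_server)
        # Every later request of that session goes to the same server.
        return self._affinity[session_id]

balancer = StickyBalancer(["example1.com", "example2.com", "example3.com"])
routes = [balancer.route(s) for s in ["alice", "bob", "alice", "alice"]]
print(routes)
```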
Health Check
The health check identifies which back-end servers are available to process requests. A server that fails its health check is no longer forwarded any requests (and receives no traffic) until it becomes healthy again. The purpose of this mechanism is to eliminate the task of manually removing a failed server from the back-end.
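A minimal sketch of the idea, assuming a plain HTTP HEAD probe against each server: the helper names are made up for the example, and the demonstration at the bottom uses a stubbed probe result instead of real network calls.

```python
import http.client

def http_head_ok(host, port=80, timeout=2.0):
    """Return True if `HEAD /` on host:port answers with a 2xx or 3xx status."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        return 200 <= conn.getresponse().status < 400
    except (OSError, http.client.HTTPException):
        return False  # unreachable or broken servers count as unhealthy
    finally:
        conn.close()

def eligible(servers, probe=http_head_ok):
    """Keep only the servers whose probe currently succeeds."""
    return [server for server in servers if probe(server)]

# Stubbed probe results (assumption): example2.com is currently failing.
status = {"example1.com": True, "example2.com": False, "example3.com": True}
in_rotation = eligible(list(status), probe=status.get)
print(in_rotation)
```

Re-running the probe periodically lets a recovered server rejoin the rotation without manual intervention.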
Setting Up HAProxy as a Load Balancer
HAProxy server:
server example.com
This is the address our visitors connect to. As traffic arrives here, new incoming connections are forwarded to the servers below based on the HAProxy load balancing algorithm.
Load balancing servers:
server example1.com
server example2.com
server example3.com
Here, you need to make sure that the above servers have the same content and listen on port 80.
Since we have already installed HAProxy on our system, the haproxy.cfg file is in the following directory:
cd /etc/haproxy/
To install HAProxy on your system, visit our blog post, Installing HAProxy on Ubuntu 18.04
To open the HAProxy configuration file directly, use the command below:
sudo nano /etc/haproxy/haproxy.cfg
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256::RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http
Now, we need to add the web server’s IP address in the front-end section.
frontend haproxynode
    bind ssl crt /etc/ssl/private/example.pem
    mode http
    reqadd X-Forwarded-Proto:\ https
    acl secure dst_port eq 443
    redirect scheme https if !{ ssl_fc }
    rspadd Strict-Transport-Security:\ max-age=31536000;\ IncludeSubDomains;\ preload
    rsprep ^Set-Cookie:\ (.*) Set-Cookie:\ \1;\ Secure if secure
    default_backend backendnodes
As far as the back-end is concerned, you need to include each server’s IP address with its port. The label given at the end of the front-end section (default_backend backendnodes) must match the name of the back-end section (backend backendnodes).
backend backendnodes
    balance roundrobin
    option forwardfor
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request add-header X-Forwarded-Proto https if { ssl_fc }
    server vpsieprod
    option httpchk HEAD / HTTP/1.1\r\nHost:localhost
    server example1.com check
    server example2.com check
    server example3.com check
It is crucial to understand some of the key aspects of configuring HAProxy as a load balancer on your system.
- The bind address should be the web server’s IP address, and the balance directive should name the algorithm that suits your requirements. Here, the default algorithm, round robin, is used.
- HAProxy expects the certificate and its private key together in a single PEM file, so the usual separate certificate files need to be combined into that format. After that, you need to configure the path to the PEM file in the haproxy.cfg file as shown above.
Don’t Know How to Convert into a PEM Format?
After installing the SSL certificate on your domain name, follow the steps below.
Step 1: The SSL certificate is installed in the following directory:
cd /etc/letsencrypt/live/example.com
Step 2: Here, you need to concatenate the following files into a single file:
privkey.pem and fullchain.pem
Use cat as shown below:
cat privkey.pem fullchain.pem > example.pem
Now, copy this PEM file to the /etc/ssl/private/ directory.
Step 3: Finally, include the path to the PEM file in the HAProxy configuration file. Next, you need to include your servers as done in the above configuration.
Note: The server names are for identification purposes only. What matters is the IP address of each server and that each server is configured to listen on port 80.
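For readers who script their certificate renewals, the concatenation from Step 2 can also be sketched in Python. The function name is hypothetical, and tightening the file permissions is an extra precaution assumed here rather than a step from the procedure above.

```python
from pathlib import Path

def build_haproxy_pem(privkey, fullchain, output):
    """Concatenate the private key and the full chain into one PEM file.

    Mirrors `cat privkey.pem fullchain.pem > example.pem`: key first,
    then the certificate chain.
    """
    combined = Path(privkey).read_bytes() + Path(fullchain).read_bytes()
    out = Path(output)
    out.write_bytes(combined)
    out.chmod(0o600)  # assumption: keep the private key readable by its owner only
    return out

# Example call, using the paths from this article (run with sufficient privileges):
# build_haproxy_pem(
#     "/etc/letsencrypt/live/example.com/privkey.pem",
#     "/etc/letsencrypt/live/example.com/fullchain.pem",
#     "/etc/ssl/private/example.pem",
# )
```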
Enabling HAProxy Stats
To enable the HAProxy statistics, add the following to the HAProxy configuration file. If you need to protect your stats page from third-party viewing, you can set a username and password.
listen stats
    bind ip_address:1930
    stats enable
    stats hide-version
    stats refresh 30s
    stats show-node
    stats auth username:password
    stats uri /stats
Note: /stats is the URI of the statistics page. With the configuration above, the page opens in a web browser at http://ip_address:1930/stats.
Final View of the Configured HAProxy File
This is what the final configuration file looks like:
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256::RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend haproxynode
    bind ssl crt /etc/ssl/private/example.pem
    mode http
    reqadd X-Forwarded-Proto:\ https
    acl secure dst_port eq 443
    redirect scheme https if !{ ssl_fc }
    rspadd Strict-Transport-Security:\ max-age=31536000;\ IncludeSubDomains;\ preload
    rsprep ^Set-Cookie:\ (.*) Set-Cookie:\ \1;\ Secure if secure
    default_backend backendnodes

backend backendnodes
    balance roundrobin
    option forwardfor
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request add-header X-Forwarded-Proto https if { ssl_fc }
    server vpsieprod
    option httpchk HEAD / HTTP/1.1\r\nHost:localhost
    server example1.com check
    server example2.com check
    server example3.com check

listen stats
    bind ip_address:1930
    stats enable
    stats hide-version
    stats refresh 30s
    stats show-node
    stats auth username:password
    stats uri /stats
Restarting HAProxy
To check the configuration file for syntax errors, use the command:
sudo haproxy -c -f /etc/haproxy/haproxy.cfg
To restart HAProxy, use the command:
sudo service haproxy restart
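The two commands above are naturally chained: restart only if the syntax check passes. The sketch below shows one way to do that from a deployment script. The function name and the injectable `run` parameter are assumptions made for the example, and the script is assumed to run with root privileges, so `sudo` is omitted.

```python
import subprocess

def check_then_restart(cfg="/etc/haproxy/haproxy.cfg", run=subprocess.run):
    """Validate the configuration and restart HAProxy only if it is valid.

    `run` defaults to subprocess.run and can be swapped out for testing.
    Returns True if the restart was issued, False if validation failed.
    """
    # `haproxy -c -f FILE` exits with a non-zero status on a configuration error.
    check = run(["haproxy", "-c", "-f", cfg])
    if check.returncode != 0:
        return False  # keep the running instance untouched
    run(["service", "haproxy", "restart"], check=True)
    return True

# Example call on the load balancer host:
# check_then_restart()
```

Skipping the restart on a failed check keeps the currently running instance serving traffic while the configuration is fixed.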
Thus, HAProxy has been set up and configured as a load balancer on an Ubuntu system. Setting up HAProxy as a load balancer brings several benefits: it helps protect the system from server crashes and from slowdowns under heavy load. Reputed for efficiently distributing the workload across multiple servers, it is a strong go-to solution for high-traffic websites, such as those built on Magento.