2.0 Introduction
Today's internet user experience demands performance and uptime. To achieve this, multiple copies of the same system are run, and the load is distributed over them. As the load increases, another copy of the system can be brought online. This architecture technique is called horizontal scaling. Software-based infrastructure is increasing in popularity because of its flexibility, opening up a vast world of possibilities. Whether the use case is as small as a set of two for high availability, or as large as thousands around the globe, there's a need for a load-balancing solution that is as dynamic as the infrastructure. NGINX fills this need in a number of ways, such as HTTP, TCP, and User Datagram Protocol (UDP) load balancing, which we cover in this chapter.
When balancing load, it's important that the impact to the client's experience is entirely positive. Many modern web architectures employ stateless application tiers, storing state in shared memory or databases. However, this is not the reality for all. Session state is immensely valuable and vast in interactive applications. This state might be stored locally to the application server for a number of reasons; for example, in applications where the data being worked on is so large that the network overhead of fetching it remotely is too expensive in performance. When state is stored locally to an application server, it is extremely important to the user experience that subsequent requests continue to be delivered to the same server. Another facet of the situation is that servers should not be released until the session has finished. Working with stateful applications at scale requires an intelligent load balancer. NGINX Plus offers multiple ways to solve this problem by tracking cookies or routing. This chapter covers session persistence as it pertains to load balancing with NGINX and NGINX Plus.
It's important to ensure that the application that NGINX is serving is healthy. For a number of reasons, upstream requests may begin to fail. It could be because of network connectivity, server failure, or application failure, to name a few. Proxies and load balancers must be smart enough to detect failure of upstream servers (servers behind the load balancer or proxy), and stop passing traffic to them; otherwise, the client will be waiting, only to be delivered a timeout. A way to mitigate service degradation when a server fails is to have the proxy check the health of the upstream servers. NGINX offers two different types of health checks: passive, available in the open source version; and active, available only in NGINX Plus. Active health checks at regular intervals will make a connection or request to the upstream server, and can verify that the response is correct. Passive health checks monitor the connection or responses of the upstream server as clients make the request or connection. You might want to use passive health checks to reduce the load of your upstream servers, and you might want to use active health checks to determine failure of an upstream server before a client is served a failure. The tail end of this chapter examines monitoring the health of the upstream application servers for which you're load balancing.
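To make the distinction concrete, here is a sketch of both health-check styles in configuration form. The hostnames are placeholders; the parameter and directive names come from the NGINX upstream module and the NGINX Plus `health_check` directive:

```nginx
upstream app {
    # Passive (open source and Plus): after 3 failed attempts within
    # 30 seconds, mark the server unavailable for the next 30 seconds.
    server app1.example.com:80 max_fails=3 fail_timeout=30s;
    server app2.example.com:80 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    location / {
        proxy_pass http://app;
        # Active (NGINX Plus only): probe each upstream every 5 seconds
        # and take it out of rotation when probes fail.
        health_check interval=5;
    }
}
```

Passive checks cost nothing extra because they observe real client traffic; active checks catch a dead server even when no clients are currently hitting it.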
2.1 HTTP Load Balancing
Problem
You need to distribute load between two or more HTTP servers.
Solution
Use NGINX's HTTP module to load balance over HTTP servers using the upstream block:
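A minimal configuration along these lines follows; the addresses and hostnames are placeholders for your own servers:

```nginx
upstream backend {
    server 10.10.12.45:80      weight=1;
    server app.example.com:80  weight=2;
    # Only receives traffic when both primaries are unavailable.
    server spare.example.com:80 backup;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}
```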
This configuration balances load across two HTTP servers on port 80, and defines a third as a backup, which is used only when the two primary servers are unavailable. The weight parameter instructs NGINX to pass twice as many requests to the second server; weight defaults to 1.
Discussion
The HTTP upstream module controls the load balancing for HTTP. This module defines a pool of destinations: any combination of Unix sockets, IP addresses, and DNS records. The upstream module also defines how any individual request is assigned to any of the upstream servers.
Each upstream destination is defined in the upstream pool by the server directive. The server directive is provided a Unix socket, IP address, or a fully qualified domain name (FQDN), along with a number of optional parameters. The optional parameters give more control over the routing of requests. These parameters include the weight of the server in the balancing algorithm; whether the server is in standby mode, available, or unavailable; and how to determine if the server is unavailable. NGINX Plus provides a number of other convenient parameters like connection limits to the server, advanced DNS resolution control, and the ability to slowly ramp up connections to a server after it starts.
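The following sketch illustrates several of these optional parameters on the server directive. The hostnames are placeholders; the parameter names are from the NGINX upstream module documentation, and the commented-out line shows parameters associated with NGINX Plus:

```nginx
upstream backend {
    # weight: relative share of requests (defaults to 1).
    server 10.10.12.45:80 weight=2;
    # max_fails/fail_timeout: govern when the server is considered
    # unavailable by passive health checks.
    server app.example.com:80 max_fails=2 fail_timeout=10s;
    # down: administratively mark the server unavailable.
    server old.example.com:80 down;
    # backup: receive traffic only when the primaries are unavailable.
    server spare.example.com:80 backup;
    # NGINX Plus: max_conns caps concurrent connections, and
    # slow_start ramps traffic up gradually after the server recovers.
    # server plus.example.com:80 max_conns=100 slow_start=30s;
}
```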