Lecture
Nginx is one of the most popular web servers in the world. It can handle high loads and a large number of simultaneous connections. Nginx can also be used as a load balancer, mail proxy, or reverse proxy.
This guide explains how Nginx decides which configuration handles a client request. Understanding this mechanism helps you predict and optimize request processing.
Nginx logically divides the configurations intended for serving different content into blocks that are arranged in a hierarchy. For each client request, Nginx begins by determining which configuration blocks should handle it. This decision process is the focus of this guide.
The basic blocks that we discuss are called server and location.
The server block is a subset of the Nginx configuration that defines a virtual server used to process requests of a particular type. Administrators often set up multiple server blocks and decide which one handles a connection based on the requested domain name, port, and IP address.
The location block lives inside a server block and defines how Nginx processes requests for different resources and URIs of the parent server. With this block, the administrator can divide the URI space as required. This is an extremely flexible model.
Nginx allows you to define several server blocks that function as separate instances of virtual web servers. Therefore, Nginx needs a procedure for determining which of these blocks will be used to process the request.
For this, Nginx applies a specific system of checks that are used to find the best match. The main server block directives that help Nginx determine the required block are listen and server_name.
Nginx first looks at the IP address and port of the request. It compares these values with the listen directive of each server block and builds a list of blocks that could serve the request.
The listen directive usually specifies the IP address and port that the server block responds to. By default, any server block without a listen directive is given the parameters 0.0.0.0:80 (or 0.0.0.0:8000 if Nginx is started by a regular non-root user). This allows such blocks to respond to requests on any interface on port 80. However, this default value carries little weight in the block selection process.
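The listen directive can specify:
- An IP address and port combination.
- A lone IP address, which then listens on the default port 80.
- A lone port, which then listens on every interface.
- The path to a Unix socket.
The last option is generally relevant only when passing requests between different servers.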
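The forms a listen directive can take (an address with a port, an address alone, a port alone, or a Unix socket path) might look like this in practice. This is only a sketch: the addresses, port numbers, and socket path are placeholders chosen for illustration, not values from a real setup.

```nginx
# Hypothetical values throughout; adjust them to your own environment.

# IP address and port
server {
    listen 192.168.1.10:8080;
}

# IP address only: the port defaults to 80
server {
    listen 192.168.1.10;
}

# Port only: Nginx listens on every interface
server {
    listen 8888;
}

# Unix socket (the path is an assumption made for this example)
server {
    listen unix:/var/run/nginx.sock;
}
```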
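First, Nginx tries to select a block based on the listen directive, using the following rules:
1. Nginx completes every incomplete listen directive by substituting defaults for the missing values, so that each block can be evaluated by IP address and port. A block with no listen directive uses 0.0.0.0:80. A block set to 192.168.1.10 with no port becomes 192.168.1.10:80. A block set to port 8888 with no IP address becomes 0.0.0.0:8888.
2. Nginx then collects the server blocks that match the request most specifically by IP address and port. A block that uses 0.0.0.0 (matching any interface) is not selected if other matching blocks list a specific IP address. In every case, the port must match exactly.
3. If exactly one block is the most specific match, it serves the request. If several blocks share the same level of specificity, Nginx goes on to evaluate the server_name directive of each.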
It is important to understand that Nginx evaluates the server_name directive only when it has to choose between blocks that are equally specific in their listen directives. For example, if example.com is hosted on port 80 of 192.168.1.10, a request for example.com will always be served by the first block in the example below, despite the server_name directive in the second block.
server {
listen 192.168.1.10;
. . .
}
server {
listen 80;
server_name example.com;
. . .
}
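If Nginx selects several blocks with the same level of specificity, it then checks the server_name directive.
To choose among blocks whose listen directives are equally specific, Nginx checks the Host request header. This value contains the domain or IP address that the client is trying to reach.
Nginx looks for the best match for this value in the server_name directive of each block that passed the previous selection stage. It evaluates the directive in this order:
1. A server_name that matches the Host header exactly.
2. A server_name that begins with a wildcard (*), preferring the longest match.
3. A server_name that ends with a wildcard (*), preferring the longest match.
4. A server_name defined as a regular expression (prefixed with ~), taking the first one that matches.
5. The default server block for that IP address and port.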
For each combination of IP address and port, there is a default server block, which is used if Nginx cannot find a better match. This is either the first block in the configuration or the block whose listen directive carries the default_server parameter (which overrides the first-found rule). Each combination of IP address and port can have only one default_server declaration.
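A minimal sketch of an explicit default server might look like the following. The host name example.com, the placeholder name _, and the choice to answer with status 444 are assumptions made for illustration; the point is only that default_server, not position in the file, decides which block catches unmatched requests.

```nginx
# First block in the file. It would be the default for port 80
# only if no listen directive for this port carried default_server.
server {
    listen 80;
    server_name example.com;  # hypothetical host name
}

# Explicit default for port 80: receives requests whose Host header
# matches no server_name. The name "_" is a conventional placeholder
# that never matches a real host name.
server {
    listen 80 default_server;
    server_name _;
    return 444;  # Nginx-specific code: close the connection without a response
}
```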
If the configuration contains a block whose server_name value exactly matches the Host request header, the request is passed to that block for processing.
For example, if the Host request header is host1.example.com, the web server will select the second server block to service it:
server {
listen 80;
server_name *.example.com;
. . .
}
server {
listen 80;
server_name host1.example.com;
. . .
}
If Nginx finds no exact match, it looks for a block whose server_name begins with a wildcard (*). If it finds several matches, the longest one is used to serve the request. For example, if the request carries the Host header www.example.org, Nginx selects the second block:
server {
listen 80;
server_name www.example.*;
. . .
}
server {
listen 80;
server_name *.example.org;
. . .
}
server {
listen 80;
server_name *.org;
. . .
}
If no match is found using a leading wildcard, Nginx looks for a block whose server_name ends with a wildcard (*). If it finds several matches, it uses the longest one. For example, to process a request with the Host header www.example.com, the web server uses the third server block:
server {
listen 80;
server_name host1.example.com;
. . .
}
server {
listen 80;
server_name example.com;
. . .
}
server {
listen 80;
server_name www.example.*;
. . .
}
If wildcard matching finds nothing, Nginx evaluates the server_name directives defined as regular expressions (marked with a leading ~). The first block whose regular expression matches the Host header is used to process the request.
For example, to serve a request with the Host header www.example.com, the web server selects the second server block:
server {
listen 80;
server_name example.com;
. . .
}
server {
listen 80;
server_name ~^(www|host1).*\.example\.com$;
. . .
}
server {
listen 80;
server_name ~^(subdomain|set|www|host1).*\.example\.com$;
. . .
}
If none of the methods above produces a match, the web server uses the default server block for that IP address and port.
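The whole server_name order can be seen in one place with a sketch like the one below. All host names here are hypothetical and were picked so that each sample request matches exactly one kind of name.

```nginx
server {                      # 1. exact name
    listen 80;
    server_name www.example.com;
}
server {                      # 2. leading wildcard
    listen 80;
    server_name *.example.com;
}
server {                      # 3. trailing wildcard
    listen 80;
    server_name www.example.*;
}
server {                      # 4. regular expression
    listen 80;
    server_name ~^www\d+\.example\.net$;
}
```

Under these assumptions, a Host of www.example.com goes to block 1, api.example.com to block 2, www.example.org to block 3, and www2.example.net to block 4. Any other Host falls through to the default server, which here is block 1, because no listen directive carries default_server and block 1 comes first.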
Nginx uses a similar algorithm to find the location block.
First, familiarize yourself with the syntax of the location block. Location blocks live inside server blocks (or other location blocks) and determine how the request URI is handled (the part of the request that follows the domain name or IP address and port).
As a rule, the location block has the following form:
location optional_modifier location_match {
. . .
}
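The location_match above defines what Nginx compares the request URI against. The presence or absence of a modifier determines how Nginx attempts the match.
The following location modifiers are available:
- (none): with no modifier, the location is treated as a prefix match. The given value is compared with the beginning of the request URI.
- =: the block matches only if the request URI is exactly equal to the given value.
- ~: the location is interpreted as a case-sensitive regular expression.
- ~*: the location is interpreted as a case-insensitive regular expression.
- ^~: if this block is the best non-regular-expression match, regular expression matching is skipped.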
As an example of prefix matching, the following location block can be selected to respond to request URIs such as /site, /site/page1/index.html, or /site/index.html:
location /site {
. . .
}
Below is an example of exact URI matching. This block will always be used to respond to a request for the URI /page1. It will not respond to a request for /page1/index.html. Keep in mind that if this block is selected and the request is fulfilled using an index page, an internal redirect to another location will take place, and that location becomes the actual request handler.
location = /page1 {
. . .
}
In the following example, the location is interpreted as a case-sensitive regular expression. This block can process requests for /tortoise.jpg, but not for /FLOWER.PNG:
location ~ \.(jpe?g|png|gif|ico)$ {
. . .
}
In the following example, the location is interpreted as a case-insensitive regular expression. This block can process requests for both /tortoise.jpg and /FLOWER.PNG:
location ~* \.(jpe?g|png|gif|ico)$ {
. . .
}
The next block prevents regular expression matching from taking place if it is determined to be the best non-regular-expression match. It can handle requests for /costumes/ninja.html:
location ^~ /costumes {
. . .
}
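To see what the ^~ modifier changes, it helps to place it next to a regular expression location. The following is a sketch only: the root paths and the .png pattern are assumptions introduced for this example.

```nginx
server {
    listen 80;

    # Longest matching prefix for /costumes/ninja.png. Because of ^~,
    # Nginx stops here and never tests the regular expression below.
    location ^~ /costumes {
        root /var/www/static;  # hypothetical path
    }

    # Still selected for other URIs ending in .png, such as /images/cat.png.
    location ~ \.png$ {
        root /var/www/images;  # hypothetical path
    }
}
```

Without ^~ on the first block, a request for /costumes/ninja.png would be handled by the regular expression location instead, because regular expression matches are preferred to ordinary prefix matches.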
As you can see, modifiers indicate how the location block should be interpreted. However, they do not tell us which algorithm Nginx uses to decide where to send a request.
Nginx selects the location block in the same way as it selects the server block. It starts a process that determines the best location block for a particular request. Understanding this process is essential for reliable and accurate Nginx configuration.
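With the kinds of location declarations described above in mind, Nginx evaluates the possible location contexts by comparing the request URI with each location. It does this using the following algorithm:
1. Nginx first checks all prefix locations (every location that does not use a regular expression), comparing each against the complete request URI.
2. It looks for an exact match first. If a location with the = modifier matches the request URI exactly, that block is selected immediately.
3. If there is no exact match, Nginx finds the longest matching prefix location. If that location carries the ^~ modifier, the search ends and the location is selected. Otherwise the match is stored for the moment.
4. Nginx then evaluates the regular expression locations (both case sensitive and case insensitive) in the order they appear in the configuration. The first regular expression that matches the request URI is selected immediately.
5. If no regular expression matches, the previously stored prefix location is selected.
It is important to understand that, by default, Nginx prefers regular expression matches to prefix matches. However, it evaluates prefix locations first, which lets the administrator override this preference with the = and ^~ modifiers.
It is also important to note that, although prefix locations are generally chosen by the longest, most specific match, regular expression evaluation stops at the first location that matches. This means that the order of regular expression locations within the configuration matters a great deal.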
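The selection order for location blocks can be traced with a small test configuration such as the sketch below. The port, the URIs, and the one-letter response bodies are all assumptions made for illustration; each block simply returns a letter so that the chosen location is visible in the response.

```nginx
server {
    listen 8080;  # hypothetical port

    location = /docs {            # A: exact match
        return 200 "A\n";
    }
    location ^~ /docs/static/ {   # B: prefix, regular expressions skipped
        return 200 "B\n";
    }
    location /docs/ {             # C: plain prefix
        return 200 "C\n";
    }
    location ~* \.(png|gif)$ {    # D: case-insensitive regular expression
        return 200 "D\n";
    }
    location ~ \.png$ {           # E: case-sensitive regex, listed after D
        return 200 "E\n";
    }
    location / {                  # F: shortest prefix, the catch-all
        return 200 "F\n";
    }
}
```

With this configuration, /docs is answered by A (exact match) and /docs/guide.html by C (longest prefix, no regular expression matches). /docs/logo.png is answered by D, since the regular expression beats the stored prefix C. /docs/static/logo.png is answered by B, because ^~ ends the search before any regular expression is tested. /about is answered by F. Block E is never selected here: every URI it could match is caught first by D, which appears earlier in the file.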
Next, we need to look at the cases in which location evaluation jumps to another location.
In general, once a location block is selected to serve a request, the request is handled entirely within that context. Only the selected location block and the directives it inherits determine how the request is processed, and sibling location blocks cannot interfere.
This general rule lets you design location blocks in a predictable way. However, it is important to understand that certain directives inside the selected location can trigger a new location search. These exceptions can affect how predictable your configuration is.
Some of the directives that can cause this kind of internal redirect are:
- index
- try_files
- rewrite
- error_page
The index directive always causes an internal redirect if it is used to handle the request. Exact location matches are often used to speed up the selection process, because they end the algorithm immediately. However, if the exact location match is a directory, there is a good chance that the request will be redirected to a different location for actual processing.
In this example, the first location matches the request URI /exact, but the index directive inherited by the block triggers an internal redirect to the second block, which then handles the request:
index index.html;
location = /exact {
. . .
}
location / {
. . .
}
If you want the request in the case above to be handled by the first block, you will have to find another way of satisfying the request for the directory. For example, you can set an invalid index for this block and enable autoindex:
location = /exact {
index nothing_will_match;
autoindex on;
}
location / {
. . .
}
This is one way to keep the request from leaving the first context, but it is probably not useful for most configurations. An exact match on a directory is mostly helpful for things like rewriting the request (which also results in a new location search).
Another case in which a new location search can begin is the use of the try_files directive. This directive tells Nginx to check for the existence of a listed set of files or directories. The last parameter can be a URI to which Nginx will make an internal redirect.
Consider this configuration:
root /var/www/main;
location / {
try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
root /var/www/another;
}
If a request is made for /blahblah in the example above, the first location initially receives it. Nginx tries to find a file named blahblah in the /var/www/main directory. If it cannot find one, it looks for a file named blahblah.html. It then checks whether a directory named blahblah/ exists within /var/www/main. If all these attempts fail, the request is redirected internally to /fallback/index.html. This triggers a new location search, and the request is caught by the second block, which serves the file /var/www/another/fallback/index.html.
The rewrite directive can also lead to a new location search. When rewrite is used with the last flag, or with no flag at all, Nginx searches for a new matching location based on the result of the rewrite.
For example, if we modify the last example to include a rewrite, we can see that the request is sometimes passed directly to the second location block without relying on the try_files directive:
root /var/www/main;
location / {
rewrite ^/rewriteme/(.*)$ /$1 last;
try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
root /var/www/another;
}
In the example above, a request for /rewriteme/hello is handled initially by the first location block. It is rewritten to /hello, and a new location search begins. In this case, it matches the first location again and is processed by try_files as usual (possibly falling back to /fallback/index.html through an internal redirect if nothing is found).
However, if a request is made for /rewriteme/fallback/hello, the first block again matches initially. The rewrite is applied again, this time producing /fallback/hello. The request is then served by the second location block.
A related situation arises with the return directive when it sends the 301 or 302 status codes. The difference is that the result is an entirely new request, in the form of an externally visible redirect. The same happens with the rewrite directive when the redirect or permanent flag is used.
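As a rough illustration of these external redirects, consider the sketch below. The host name and all paths are hypothetical. In both locations the client receives a redirect response and then issues a fresh request, which goes through server and location selection from the beginning.

```nginx
server {
    listen 80;
    server_name example.com;  # hypothetical host name

    # return with 301 or 302 sends a Location header to the client,
    # which then makes a brand-new request.
    location /old-page {
        return 301 /new-page;
    }

    # rewrite with the permanent flag (301) or the redirect flag (302)
    # has the same externally visible effect.
    location /legacy/ {
        rewrite ^/legacy/(.*)$ /current/$1 permanent;
    }
}
```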
The error_page directive can lead to internal redirection in the same way that try_files does. This directive is used to define actions that are performed when detecting certain status codes. These actions will probably never be executed if the try_files directive is set, since this directive handles the entire request life cycle.
Consider this example:
root /var/www/main;
location / {
error_page 404 /another/whoops.html;
}
location /another {
root /var/www;
}
Every request (except those beginning with /another) is handled by the first block, which serves files from /var/www/main. However, if a file is not found (a 404 status), an internal redirect to /another/whoops.html occurs, which triggers a new location search that eventually lands on the second block. That block serves the file /var/www/another/whoops.html.
As you can see, understanding the circumstances in which Nginx triggers a new location search can help you predict how the web server will behave when handling requests.