Lecture
Apache is the most common HTTP server. It is distributed free of charge, including the source code. Scripts in CGI (including FastCGI), PHP, Perl and Java are supported. Authentication: basic, message-digest, TLS (SSL). It has been the most popular HTTP server on the Internet since April 1996; in August 2007 it ran on 51% of all web servers.
.htaccess is an additional configuration file for the Apache web server and for similar servers. It lets you set a large number of additional parameters and permissions for individual users (and for individual folders of those users), such as controlled access to directories, reassignment of file types and so on, without giving access to the main configuration file, i.e. without affecting the operation of the entire service.
.htaccess is similar to httpd.conf, with the difference that it affects only the directory in which it is located and its child directories. A .htaccess file can be used in any directory of the user.
The .htaccess file can be placed in any directory of the site. Its directives affect all files in the current directory and in all of its subdirectories (unless they are overridden by .htaccess files in those subdirectories).
This advantage is also a disadvantage. Apache looks for this file in the directories along the path to the requested script, so to speed up the server the option is sometimes disabled and the entire configuration is kept in httpd.conf or in the per-host settings files /etc/apache2/sites-enabled/*.conf
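As a minimal sketch of that trade-off, the .htaccess lookup can be switched off in the main configuration with AllowOverride None. The directory path is a placeholder chosen for illustration and is not taken from the original.

```apache
# Main server configuration (httpd.conf or a virtual host file), not .htaccess.
# /var/www/example is a hypothetical path used only for illustration.
<Directory "/var/www/example">
    # Apache no longer looks for .htaccess files under this directory
    AllowOverride None
    # Settings that would otherwise live in .htaccess are placed here instead
    Options -Indexes
</Directory>
```

With this in place, changes require access to the main configuration and a server reload, which is the price paid for the faster lookup.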
The .htaccess directives provide the user with a wide range of options for customizing their site, including:
simple redirection directives (Redirect); complex redirection directives (mod_rewrite); index pages; error handling; encoding; access control; password protection of directories; PHP options.
A list of all available directives can be found in the Apache documentation.
Redirection directives are among the most commonly used and most complex .htaccess directives. Suppose we want to redirect the user to another URL when our site is requested. To do this, we add a .htaccess file with the following contents to the root directory of the site:
    # http://www.example.com - the URL to which we redirect requests
    Redirect / http://www.example.com
A more complex example - we want to redirect certain pages of our site to other sites:
    # The more specific path comes first: Redirect matches by prefix,
    # and the first matching directive wins
    Redirect /linux/download.html http://www.linux.org/dist/download_info.html
    Redirect /linux http://www.linux.org
    Redirect 301 /kernel http://www.linux.org
Now, when http://mysite.ru/linux is requested, http://www.linux.org opens, and when http://mysite.ru/linux/download.html is requested, http://www.linux.org/dist/download_info.html opens. In the last example the web server sends the code 301, which means "the document has moved permanently".
The syntax of the Redirect command is as follows:
    Redirect [status] URL_LOCAL URL_REDIRECT

    status: optional, specifies the return code. Valid values are:
      * permanent (301 - the document has moved permanently)
      * temp (302 - the document has moved temporarily)
      * seeother (303 - see other)
      * gone (410 - removed)
    URL_LOCAL: the local part of the URL of the requested document.
    URL_REDIRECT: the URL to which the redirect should be performed.
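The keyword form of the status argument might be used as in the following sketch. The paths and target URLs are made up for illustration.

```apache
# Keyword instead of a numeric status code
Redirect permanent /docs http://www.example.com/docs
Redirect seeother /form-done.html http://www.example.com/thanks.html
# "gone" answers 410 and therefore takes no target URL
Redirect gone /old-promo.html
```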
The RedirectMatch directive is similar to Redirect, except that it accepts regular expressions, which can be convenient in some situations, for example to pass parameters to a script in the body of the URL:
    RedirectMatch /(.*)/(.*)/index\.html$ http://mysite.ru/script.php?par1=$1&par2=$2
Although this example causes a page reload, it can be improved further. First, a short digression on the syntax of regular expressions is needed.
You can use any printable characters and spaces in a regular expression, but some characters have a special meaning, for example the '.', '*', '$' and parentheses used in the pattern above.
These basic primitives are enough to build any regular expression.
The mod_rewrite module, shipped as part of Apache, is a powerful and sophisticated tool for rewriting URLs. Almost any kind of transformation is possible with it, and each can be applied or skipped depending on various conditions.
This module is a rule-based engine (built on a regular expression parser) that rewrites URLs on the fly. It supports an unlimited number of rules and an unlimited number of conditions attached to each rule, which gives a truly flexible and powerful URL manipulation mechanism. URL transformations can draw on different data sources, for example server variables, environment variables, HTTP headers, time, and even lookups in external databases in various formats, to produce the URL you need.
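As a sketch of what this allows, the earlier RedirectMatch example can be written as an internal rewrite, so that the browser is not sent to a new address. It assumes the rule sits in the .htaccess of the site root and that mod_rewrite is enabled.

```apache
RewriteEngine on
# Internally map /a/b/index.html to /script.php?par1=a&par2=b.
# No R flag is given, so no redirect is sent and the visible URL stays unchanged.
RewriteRule ^([^/]+)/([^/]+)/index\.html$ /script.php?par1=$1&par2=$2 [L]
```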
The RewriteCond directive defines a condition for a rule. One or more RewriteCond directives may precede a RewriteRule directive. The rule that follows is applied only if the URI matches its pattern and all of these conditions are met.
Backreferences make it possible to reuse parts of the matched URLs, e.g. to pass parameters or to build a new URL.
$N | (0 <= N <= 9) gives access to the grouped parts (in parentheses) of the pattern from the corresponding RewriteRule directive (the one immediately following the current set of RewriteCond directives). |
%N | (1 <= N <= 9) gives access to the grouped parts (in parentheses) of the pattern from the last matched RewriteCond directive in the current set of conditions. |
%{NAME_OF_VARIABLE} | where NAME_OF_VARIABLE can be one of the variables listed below |
The following is a list of all available %{NAME_OF_VARIABLE} variables with a brief description of each.
HTTP_USER_AGENT | Contains information about the type and version of the browser and operating system of the visitor. |
HTTP_REFERER | The address of the page from which the visitor came to this page. |
HTTP_COOKIE | List of COOKIE transmitted by the browser. |
HTTP_FORWARDED | Contains the IP address of the proxy or load balancing server. |
HTTP_HOST | Server address, for example, beget.ru. |
HTTP_ACCEPT | Describes the client's preferences regarding the type of document. |
REMOTE_ADDR | The IP address of the visitor. |
REMOTE_HOST | The host name of the visitor, for example, rt99.net.ru. |
REMOTE_IDENT | The name of the remote user. It has the format name.host, for example, kondr.www.rtt99.net.ru |
REMOTE_USER | Same as REMOTE_IDENT, but contains only the name. Example: kondr |
REQUEST_METHOD | The type of request (GET or POST). It should be checked, because it determines how the information is processed further. |
SCRIPT_FILENAME | Full path to the web page on the server. |
PATH_INFO | Contains the extra path information passed to the script. |
QUERY_STRING | Contains the string passed as the query when the CGI script is called. |
AUTH_TYPE | The type of authentication used to identify the user. |
DOCUMENT_ROOT | Contains the path to the root directory of the server. |
SERVER_ADMIN | Email address of the server owner, specified during installation. |
SERVER_NAME | Server address, for example, kondr.beget.ru |
SERVER_ADDR | The IP address of your site. |
SERVER_PORT | The port on which Apache is running. |
SERVER_PROTOCOL | HTTP protocol version. |
SERVER_SOFTWARE | Server name, for example, Apache/1.3.2 (Unix) |
TIME_YEAR TIME_MON TIME_DAY TIME_HOUR TIME_MIN TIME_SEC TIME_WDAY TIME | Variables designed to work with time in different formats. |
API_VERSION | The version of the Apache module API (the internal interface between the server and a module) in the current server build, as defined in include/ap_mmn.h. |
THE_REQUEST | The complete HTTP request line sent by the browser to the server (e.g., "GET /index.html HTTP/1.1"). It does not include any additional headers sent by the browser. |
REQUEST_URI | The resource requested in the HTTP request line. |
REQUEST_FILENAME | The full path in the server file system to the file or script corresponding to this request. |
IS_SUBREQ | Contains the text "true" if the request is currently being processed as a subrequest, "false" otherwise. Subrequests can be generated by modules that need to deal with additional files or URIs in order to perform their own tasks. |
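As a sketch of how the backreferences and server variables described above fit together: example.com and the /sites/ directory are assumptions made for illustration, not part of the original.

```apache
RewriteEngine on
# Do not touch paths that were already rewritten (prevents a loop in .htaccess)
RewriteCond %{REQUEST_URI} !^/sites/
# The parentheses here capture the subdomain label, later available as %1
RewriteCond %{HTTP_HOST} ^([a-z0-9-]+)\.example\.com$ [NC]
# The parentheses in the rule pattern capture the requested path, available as $1
RewriteRule ^(.*)$ /sites/%1/$1 [L]
```

The capturing condition is deliberately placed last, since %N refers to the last matched RewriteCond of the current set.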
The condition is a condition pattern (CondPattern), i.e. a regular expression applied to the current instance of the test string (TestString): the test string is checked for a match against the condition.
Remember that the condition is a Perl-compatible regular expression with some additions: special checks, such as file tests, that can be used in place of a regular expression.
All these checks can also be prefixed with an exclamation mark ('!') to invert their meaning.
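Two of these additional checks, -f and -d, can be sketched as follows, together with the inverting '!'. The file index.php is a hypothetical handler and is assumed to exist in the same directory.

```apache
RewriteEngine on
# -f is true for an existing regular file, -d for an existing directory;
# the leading '!' inverts each test
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Anything that is not a real file or directory is handed to index.php
RewriteRule ^ index.php [L]
```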
RewriteEngine enables or disables the rewriting engine. If it is set to off, the module does no processing at all. Note that rewriting settings are not inherited by default. This means that you need a RewriteEngine on directive for each virtual host in which you want to use this module.
The syntax for RewriteEngine is as follows:
    RewriteEngine on|off
    # Default: RewriteEngine off
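Use the 'ornext|OR' flag to combine the conditions of a rule with a logical OR instead of the implicit AND. A typical example is redirecting requests for subdomains to separate directories.
    RewriteEngine on
    # Leave already rewritten paths alone so that the rules do not loop in .htaccess
    RewriteRule ^/?(mysubdomain|mysubdomain4)_public_html/ - [L]
    # The requested host name is available in HTTP_HOST
    RewriteCond %{HTTP_HOST} ^mysubdomain1\. [OR]
    RewriteCond %{HTTP_HOST} ^mysubdomain2\. [OR]
    RewriteCond %{HTTP_HOST} ^mysubdomain3\.
    RewriteRule ^/?(.*)$ /mysubdomain_public_html/$1 [L]
    RewriteCond %{HTTP_HOST} ^mysubdomain4\.
    RewriteRule ^/?(.*)$ /mysubdomain4_public_html/$1 [L]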
To serve a different home page depending on the "User-Agent:" request header, you can use the following directives:
    RewriteEngine on
    # ^/?$ matches the site root both in .htaccess and in the server configuration
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*
    RewriteRule ^/?$ /homepage.max.html [L]
    RewriteCond %{HTTP_USER_AGENT} ^Lynx.*
    RewriteRule ^/?$ /homepage.min.html [L]
    RewriteRule ^/?$ /homepage.std.html [L]
To serve different sites to different browsers depending on the "User-Agent:" request header, you can use the following directives:
    RewriteEngine on
    # Requests already mapped into a browser directory are left alone,
    # otherwise the rules would loop when used in .htaccess
    RewriteRule ^/?(Mozilla|Lynx|Default)/ - [L]
    RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*
    RewriteRule ^/?(.*)$ /Mozilla/$1 [L]
    RewriteCond %{HTTP_USER_AGENT} ^Lynx.*
    RewriteRule ^/?(.*)$ /Lynx/$1 [L]
    RewriteRule ^/?(.*)$ /Default/$1 [L]
The general syntax of the RewriteRule directive is as follows:
    RewriteRule Pattern Substitution [flags]
    # flags - optional field with additional options
Special flags can be appended to the substitution as the third argument of the RewriteRule directive. They are written in square brackets as a comma-separated list:
'redirect|R [=code]'
(forces a redirect)
The substitution is prefixed with http://thishost[:thisport]/ (which turns the URI into a full URL) to force an external redirect. If no code is given, the response is sent with HTTP status 302 (Moved Temporarily). To stop the rewriting process as well, also add the 'L' flag.
'forbidden|F'
(makes the URL forbidden)
This makes the current URL forbidden, i.e. the client is immediately sent a response with HTTP status 403 (FORBIDDEN). Use this flag together with suitable RewriteCond directives to block URLs by some criteria.
'gone|G'
(makes the URL "dead")
This flag makes the current URL "dead", i.e. an HTTP response with status 410 (GONE) is sent immediately. Use this flag to mark pages that no longer exist.
'proxy|P'
(forces a proxy request)
This flag marks the substitution part as an internal proxy request and immediately (i.e. the rewriting process stops here) passes it through the proxy module. Use this flag to get a more powerful implementation of the ProxyPass directive, which maps content on remote servers into the namespace of the local server.
'last|L'
(last rule)
Stops the rewriting process at this point and applies no further rewriting rules. Use this flag to keep the current URL from being rewritten by the rules that follow.
'next|N'
(next round)
Restarts the rewriting process, starting again with the first rule. The URL that is matched this time is not the original URL but the one produced by the last rewriting rule. Use this flag to restart the rewriting process, i.e. to jump unconditionally to the beginning of the loop.
'chain|C'
(chained with the following rule)
This flag chains the current rule to the next one (which in turn may be chained to the one after it, and so on). The effect is as follows: if the rule matches, processing continues as usual, i.e. the flag has no effect. If the rule does not match, all the following chained rules are skipped.
'type|T=MIME-type'
(forces the MIME type)
Forces the MIME type of the target file to be MIME-type. For example, this can be used to simulate the mod_alias ScriptAlias directive, which forces all files inside the mapped directory to have the MIME type "application/x-httpd-cgi".
'nosubreq|NS'
(used only if the request is not an internal subrequest)
This flag tells the rewriting engine to skip the rule if the current request is an internal subrequest. For example, internal subrequests occur in Apache when mod_include tries to find out about possible default files for a directory (index.xxx). Applying the rules to subrequests is not always useful and can even cause problems for the whole set of rewriting rules. Use this flag to exclude certain rules.
'nocase|NC'
(ignore case)
This makes the Pattern case-insensitive, i.e. there is no difference between 'A-Z' and 'a-z' when the Pattern is matched against the current URL.
'qsappend|QSA'
(append the query string)
This flag tells the rewriting engine to append the query string of the URL to the one in the substitution string rather than replace it. Use this when you want to add more data to the query string through a rewriting rule.
'noescape|NE'
(do not escape the URI in the output)
This flag keeps mod_rewrite from applying the usual URI escaping rules to the result of a rewrite. Ordinarily, special characters (such as '%', '$', ';' and so on) are escaped into their hex equivalents ('%25', '%24' and '%3B' respectively); this flag prevents that.
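Several of the flags above can be sketched together as follows. The file names and the item.php script are hypothetical and serve only to show the flag syntax in a .htaccess file.

```apache
RewriteEngine on
# R=301 sends a permanent external redirect, L stops further rule processing
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]
# F answers 403 Forbidden; NC makes the pattern case-insensitive
RewriteRule \.bak$ - [F,NC]
# G answers 410 Gone for a page that no longer exists
RewriteRule ^discontinued\.html$ - [G]
# QSA appends the original query string to the one in the substitution
RewriteRule ^item/([0-9]+)$ /item.php?id=$1 [QSA,L]
```

A substitution of "-" means that the URL itself is left unchanged and only the flags take effect.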
If the .htaccess of a subdirectory contains no mod_rewrite directives, all rewriting rules are inherited from the parent directory.
If the .htaccess file contains any mod_rewrite directives, nothing is inherited, and the default state is the same as in the main configuration file of the web server ("off" by default). Therefore, if rewriting rules are needed for a specific directory, the "RewriteEngine on" directive must be inserted again into the .htaccess of that directory.
To inherit the rules of parent directories and add new ones specific to this directory, put the following at the beginning: "RewriteEngine on" and "RewriteOptions inherit"; the latter directive tells the server to keep the rules of the parent directory.
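A minimal sketch of such a subdirectory .htaccess is shown below. The rule and the file name release-notes.html are invented for illustration.

```apache
# .htaccess in a subdirectory (hypothetical layout)
RewriteEngine on
# Also apply the rules defined for the parent directory
RewriteOptions inherit
# A rule specific to this directory
RewriteRule ^latest$ release-notes.html [L]
```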
Further examples of using mod_rewrite can be found in the Apache mod_rewrite documentation.
When a user visits a host, for example http://gentoo.org, the index file index.* is expected to open; if it is absent, either the contents of the directory are displayed or 403 FORBIDDEN is returned (if directory browsing is disabled).
The Indexes option is responsible for listing files (showing the visitor a list of files if the requested directory has no index.html file or its equivalent).
Sometimes the listing, that is, the list of files in the directory, should not be shown when the default file is missing. In that case, add the following line to .htaccess:
    # Prohibit the listing of a directory that has no index file
    Options -Indexes
And to enable the listing:
    Options +Indexes
To allow the list of files to be viewed but keep files of certain types out of it, write:
    IndexIgnore *.php* *.pl
The directory listing then shows all of its contents except PHP and Perl script files.
If your website is built on scripts, files with other extensions are often used as index files. You can specify these files with the DirectoryIndex directive.
    DirectoryIndex index.html index.shtml index.pl index.cgi index.php
If, on the other hand, a request for the directory should open not index.html but, for example, the file htaccess.php or /cgi-bin/index.pl:
    DirectoryIndex htaccess.php /cgi-bin/index.pl
Errors sometimes occur during server operation, but it is more correct to call them not server failures but standard return codes defined in the HTTP standard, RFC 2616. In the RFC they are called "Status Codes", but we will call them errors, since that is the familiar term.
The return code is a three-digit number that shows how successfully the request was processed. Return codes beginning with 1, 2 or 3 are considered successful; the rest are classified as errors.
Here is a list of 4xx and 5xx errors:
    400 - Bad Request
    401 - Unauthorized
    402 - Payment Required
    403 - Forbidden
    404 - Not Found
    405 - Method Not Allowed
    406 - Not Acceptable
    407 - Proxy Authentication Required
    408 - Request Time-out
    409 - Conflict
    410 - Gone
    411 - Length Required
    412 - Precondition Failed
    413 - Request Entity Too Large
    414 - Request-URI Too Large
    415 - Unsupported Media Type
    500 - Internal Server Error
    501 - Not Implemented
    502 - Bad Gateway
    503 - Service Unavailable
    504 - Gateway Time-out
    505 - HTTP Version not supported
When a 4xx or 5xx error occurs, the visitor to your site sees a message from the server in the browser that can hardly be called clear to an ordinary user. Instead of this terse technical text, Apache lets you show your own page, where you can explain in plain language what happened and what to do.
An example of overriding the error pages is given below:
    # contents of the .htaccess file:
    ErrorDocument 404 http://bg10.ru/error/404.htm
    ErrorDocument 403 http://bg10.ru/error/403.htm
    ErrorDocument 400 http://bg10.ru/error/400.htm
    ErrorDocument 500 http://bg10.ru/error/500.htm
    # for the "FORBIDDEN" error a text message is shown, which
    # must begin with a quotation mark; the quotation mark itself is not output:
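As a hedged variation on the example above: when ErrorDocument is given a full URL, Apache redirects the client to it, so the browser does not receive the original error status. A local URL-path or a quoted text message avoids this. The path /error/404.html and the message text below are assumptions made for illustration.

```apache
# Local URL-path: served internally, so the original status code is kept.
# /error/404.html is a hypothetical page on the same site.
ErrorDocument 404 /error/404.html
# Plain text message enclosed in quotes (Apache 2.x syntax)
ErrorDocument 403 "Access to this page is forbidden."
```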
Error handling is described in more detail in the Apache documentation, on the page "Custom error responses".
Sometimes the user's browser cannot correctly determine the encoding of the document it receives. To solve this problem, the encoding is specified both in the settings of the Apache web server and in the header of the document being sent. For the encoding to be recognized correctly, the two must match. On our servers the default encoding is cp1251.
In HTML, the encoding is specified with the tag:
    <meta http-equiv="content-type" content="text/html; charset=Windows-1251">
Several encodings for Russian text are commonly passed in the document header; Windows-1251, used in the example above, is one of them.
Now consider specifying the default encoding through .htaccess. AddDefaultCharset sets the default character set (encoding) for all pages served by the Apache web server. To specify the encoding in which the browser receives all documents by default:
    AddDefaultCharset WINDOWS-1251
Files can be transcoded when they are uploaded to the server. To indicate that all received files are in windows-1251 encoding, write the following (CharsetSourceEnc and CharsetDisable appear to come from the charset module of the Russian Apache build rather than from stock Apache):
    CharsetSourceEnc WINDOWS-1251
To turn off transcoding of files by the server:
    CharsetDisable on
Very often it is necessary to deny access to certain files or folders for certain groups of users. The Apache web server has built-in means for solving this problem.
To deny or allow access to all files and folders in the current directory and in all nested directories, the Order directive is used; its syntax is very simple:
    Order Deny,Allow | Allow,Deny
    # Default: Deny,Allow
The logic of the server changes depending on the order given. With Deny,Allow the Deny directives are evaluated before the Allow directives, and a request that matches neither is allowed; this order is typically used to deny access from all IPs except the listed ones. With Allow,Deny the Allow directives are evaluated first, and a request that matches neither is denied; this order is typically used to allow access from all IPs except the listed ones. The Allow and Deny lines describing who is permitted and who is blocked come next. The keyword all means all IPs. The two keywords of Order are separated by a comma only, with no space.
For example, we want to deny (block) access from the IPs 81.222.144.12 and 81.222.144.20 and allow everyone else. We need to add the following code to .htaccess:
    Order Allow,Deny
    Allow from all
    Deny from 81.222.144.12 81.222.144.20
For the reverse situation, when we want to deny access from all IPs except 81.222.144.12 and 81.222.144.20, we need to add the following code to .htaccess:
    Order Deny,Allow
    Deny from all
    Allow from 81.222.144.12 81.222.144.20
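In Apache 2.4 the Order, Allow and Deny directives are kept only for compatibility, and access is normally expressed with Require. The two examples above could be sketched in that syntax as follows, assuming a 2.4 server with the standard authorization modules loaded.

```apache
# Variant 1: allow everyone except two addresses
<RequireAll>
    Require all granted
    Require not ip 81.222.144.12
    Require not ip 81.222.144.20
</RequireAll>

# Variant 2 (use instead of variant 1): deny everyone except two addresses
# Require ip 81.222.144.12 81.222.144.20
```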
Access can be denied or allowed not only for all files but also for a single file or a group of files. For example, we want to deny all users except the IP 81.222.144.12 access to the file passwd.html, which is located in the current directory:
    <Files "passwd.html">
        Order Deny,Allow
        Deny from all
        Allow from 81.222.144.12
    </Files>
In the same way, access to a particular group of files can be denied or allowed, for example to files with the extension ".key":
    # The "~" makes <Files> treat its argument as a regular expression
    <Files ~ "\.(key)$">
        Order Deny,Allow
        Deny from all
        Allow from 81.222.144.12
    </Files>
.htaccess can also be used to set a password for access to certain folders, files and groups of files. Here is a working example, followed by an explanation of its contents:
    AuthName "Protected area, need authorization"
    AuthType Basic
    AuthUserFile /home/t/test/.authfile
    Require valid-user
This file must be placed in the directory that we want to protect with a password.
The AuthName directive sets the message shown when the password is requested; the whole message must be written on one line. The syntax of the directive is trivial:
    AuthName "SEE TEXT"
The AuthType directive selects the type of authentication. The following types are possible: Basic or Digest. The second may not be supported by some browsers, so it is not recommended.
    AuthType Basic | Digest
AuthUserFile specifies the name of the file with the passwords used to authenticate users (the passwords in this file are stored in encrypted, i.e. hashed, form). A path that is not absolute is taken relative to the web server root. Keep the password file in a folder that users cannot access (preferably outside the directory hierarchy of your website).
If you are running an operating system of the Windows family, you can connect to the server over SSH and use the htpasswd utility.
Running htpasswd without parameters, we see:
    beget@ginger ~ # htpasswd
All the parameters of this command will not be considered here, but you can read the details yourself by running htpasswd in the unix shell , or by reading the corresponding page of the Apache documentation.
So, initially we do not have a password file yet and we need to create it:
    beget@ginger ~ # htpasswd -c .authfile test1
After this operation, htpasswd creates a password file containing the user test1 and that user's encrypted password:
    beget@ginger ~ $ cat .authfile
    test1:zgco1KREjBY8M
    beget@ginger ~ $
And now we want to add another user. Since we already have the password file, we simply will not use the '-c' option:
    beget@ginger ~ # htpasswd .authfile test2
Let us return to the directives for password-protecting directories. The Require directive defines the users who are allowed access:
    Require user USER_NAME | valid-user
By specifying valid-user, you allow access to all users listed in the password file.
Here is an example that gives access only to certain users from the password file:
    AuthName "Protected area, need authorization"
    AuthType Basic
    AuthUserFile /home/t/test/.authfile
    Require user Alexey Kondr Fenix
As with restricting access by IP, the <Files> section can be used here. Below are two examples: setting a password for one specific file and for a group of files.
    <Files "passwd.html">
        AuthName "Protected area, need authorization"
        AuthType Basic
        AuthUserFile /home/t/test/.authfile
        Require valid-user
    </Files>
    # The "~" makes <Files> treat its argument as a regular expression
    <Files ~ "\.(key)$">
        AuthName "Protected area, need authorization"
        AuthType Basic
        AuthUserFile /home/t/test/.authfile
        Require valid-user
    </Files>
Keep in mind that with this kind of access restriction, passwords travel over the network in clear text and can, under certain circumstances, be intercepted by intruders. For security, it is therefore recommended to serve the restricted areas of the website over a secure SSL connection.
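One common way to follow this recommendation is to send plain-HTTP visitors to the HTTPS address before any password is requested. The sketch below assumes that mod_rewrite is enabled and that the site is already reachable over HTTPS under the same host name.

```apache
RewriteEngine on
# %{HTTPS} is "on" for SSL/TLS requests and "off" otherwise
RewriteCond %{HTTPS} off
# Send plain-HTTP visitors to the same address over HTTPS
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```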
The directives for configuring PHP can be placed not only in the php.ini file but also in the Apache configuration files for your site, i.e. in .htaccess. This allows you to fine-tune PHP for different directories.
Four directives are available for working with PHP in the Apache configuration files: php_value, php_flag, php_admin_value and php_admin_flag, which differ in priority, in the type of value they set and in where they may be used.
The php_admin_value and php_admin_flag directives can be set only in the httpd.conf file, so they are of no interest to us here. In essence, they take precedence over the values set by the other directives.
The php_flag directive sets boolean values of php.ini directives, while the php_value directive sets string and numeric values of php.ini directives, i.e. values of any type except boolean.
The directive syntax is very simple:
    php_flag directive_name on|off
    php_value directive_name VALUE
Here is a list of the most commonly used directives.
mysql.default_host | Sets the host name of the database. |
mysql.default_user | Sets the database username |
mysql.default_password | Sets the database user password |
display_errors | Enables output of errors and warnings to the browser. |
display_startup_errors | Enables the display of errors that occur during PHP startup. |
error_reporting | Defines the types (severity levels) of errors to be reported. |
auto_prepend_file | The file that is automatically parsed before each PHP script. The path is specified from the root of the server file system. |
auto_append_file | The file that is automatically parsed after each PHP script. |
sendmail_from | Sets the sender's e-mail address, which is used when sending mail using PHP. |
user_agent | Sets the User-agent string that is used by PHP when accessing remote servers. |
For example, to display all error messages generated by PHP, add the following lines to .htaccess:
    php_flag display_errors on
    php_flag display_startup_errors on
    # PHP constants such as E_ALL are not understood in .htaccess,
    # so a number is used; -1 turns on every error level
    php_value error_reporting -1
To prohibit PHP execution in the current directory and in all nested ones, write the following line in .htaccess:
    php_flag engine off