-- Apache logging facilities (Part-1)
-- Log manipulation (Part-1)
-- Remote logging (Part-2)
-- Logging strategies (Part-2)
-- Log forensics (Part-2)
-- Monitoring (Part-2)
Apache Logging Facilities
Apache can produce many types of logs. The two essential types are the access log, where all requests are noted, and the error log, which is designed to log various informational and debug messages, plus every exceptional event that occurs. Additional information can be found in module-specific logs, as is the case with mod_ssl, mod_rewrite and mod_security. The access log is created and written to by the module mod_log_config, which is not a part of the core, but this module is so important that everyone treats it as if it is.
Request Logging
You only need to be familiar with three configuration directives to manage request logging:
-
LogFormat
-
transferLog
-
CustomLog
In fact, you will need to use only two. The CustomLog directive is so flexible and easy to use that you will rarely need to use transferLog in your configuration. (It will become clear why later.)
Other directives are available, but they are deprecated and should not be used because CustomLog can achieve all the necessary functionality. Some have been removed from Apache 2:
- CookieLog
Deprecated, but still available
- AgentLog
Deprecated and removed from Apache 2
- RefererLog
Deprecated and removed from Apache 2
- RefererIgnore
Deprecated and removed from Apache 2
LogFormat
Before covering the process of logging to files, consider the format of our log files. One of the benefits of Apache is its flexibility when it comes to log formatting. All this is owed to the LogFormat directive, whose default is the following, referred to as the Common Log Format (CLF):
LogFormat "%h %l %u %t \"%r\" %>s %b" common
The first parameter is a format string indicating the information to be included in a log file and the format in which it should be written; the second parameter gives the format string a name. You can decipher the log format using the symbol table. The table is available from the Apache reference documentation (http://httpd.apache.org/docs-2.0/mod/mod_log_config.html).
Standard logging format strings
|
Format string
|
Description
|
|
%%
|
The percent sign
|
|
%...a
|
Remote IP address
|
|
%...A
|
Local IP address
|
|
%...B
|
Bytes sent (excluding headers)
|
|
%...b
|
Bytes sent (excluding headers); a dash (-) is used instead of a zero
|
|
%...{Name}C
|
The contents of the cookie Name
|
|
%...D
|
Time taken to serve the request, in microseconds (Apache 2 only)
|
|
%...{Name}e
|
The contents of the environment variable Name
|
|
%...f
|
Filename
|
|
%...h
|
Remote host
|
|
%...H
|
Request protocol
|
|
%...{Name}i
|
The contents of the request header Name
|
|
%...l
|
Remote log name (from identd)
|
|
%...m
|
Request method
|
|
%...{Name}n
|
Contents of the note Name
|
|
%...{Name}o
|
Contents of the response header Name
|
|
%...p
|
Canonical port of the server
|
|
%...P
|
Process ID
|
|
%...{Format}P
|
Depending on Format, Process ID (pid) or thread ID (tid)
|
|
%...q
|
Query string
|
|
%...r
|
Request line
|
|
%...s
|
Response status
|
|
%...t
|
Time, in common log format
|
|
%...{Format}t
|
Time, in custom format
|
|
%...T
|
Time taken to serve the request, in seconds
|
|
%...u
|
Remote user
|
|
%...U
|
The URL, excluding the query string
|
|
%...v
|
Canonical server name
|
|
%...V
|
Server name according to UseCanonicalName directive
|
|
%...X
|
Connection status at the end of the request ("X" for aborted, "+" for persistent, and "-" for closed)
|
You have a lot of fields to play with. Format strings support optional parameters, as represented by the " . . . " in each format string representation in the table. Optional parameters can be used for the following actions:
-
Conditionally include the format item in the log line. If the parameter consists of a list of (comma-separated) HTTP status codes, the item will be included only if the response status code was one of the specified ones. Otherwise, a "-" will be placed in the output. For example, to log bytes sent only for requests with responses 200 or 404, use %200,404B. An exclamation mark preceding the status codes is used for negation. That is, the item will be included only if the response status code is not one of the ones specified after the exclamation mark. For example, to omit logging the request line when the request was rejected due to the request line being too long, use %!414r. (This comes in handy to prevent the logs from growing too quickly.)
-
Access values of fields from internally redirected requests, when the parameter is < for the original request or > for the last request in the chain. By default, the %s format string refers to the status of the original request, and you can use %>s to record the status of the last request in the chain.
Apache modules can collaborate on logging if they create a named note (a text string) and attach it to the request. If the %{note}n format string is used, the contents of the note will be written to the log. A change in the Apache architecture in the second generation allows for modules to collaborate and provide custom format strings. These format strings are available if the module that provides them is included in the configuration.
Format string directives available only in Apache 2
|
Format string
|
Module
|
Description
|
|
%I
|
mod_logio
|
Total bytes received, on a network level
|
|
%O
|
mod_logio
|
Total bytes sent, on a network level
|
|
%{Variable}x
|
mod_ssl
|
The contents of the variable Variable
|
|
%{Variable}c
|
mod_ssl
|
Deprecated cryptography format function, included for backward compatibility with mod_ssl 1.3.x
|
With the inclusion of mod_logio, you can measure the number of bytes transferred for every request. This feature allows hosting providers to put accurate billing mechanisms in place. (With Apache 1, you can only record the size of the response body, leaving request headers, request body, and response headers unmeasured.)
Now that you are familiar with format strings, look at commonly used log formats (You will need to define these formats in httpd.conf if they are not already there.)
Commonly used log formats
|
Name
|
LogFormat string
|
|
common (the default)
|
%h %l %u %t "%r" %>s %b
|
|
combined
|
%h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
|
|
vcommon
|
%v %h %l %u %t "%r" %>s %b
|
|
vcombined
|
%v %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
|
Though you can create your own log format, you will most likely use one of the formats above since that is what web server log analyzers support. Nevertheless, the ability to create logs with a custom format is convenient for advanced uses, as we shall see later in this article.
TransferLog
transferLog is the basic request logging directive, which creates an access log with the given filename:
TransferLog /var/www/logs/access_log
The filename can be given with an absolute path, as above; if a relative filename is supplied, Apache will create the full path by pre-pending the server home directory (e.g. /usr/local/apache).
By default, the transferLog directive uses the Common Log Format (CLF), which logs every request on a single line with information formatted, Here is an example of what such a line looks like:
81.137.203.240 - - [29/Jun/2005:14:36:04 +0100] "POST /upload.php
HTTP/1.1" 200 3229
However, if a LogFormat directive has been used earlier in the configuration file, the transferLog directive will use the format it defined and not the CLF. This is unexpected and can lead to errors since changing the order in which formats are defined can lead to a different format being used for the log files. I prefer not to use transferLog, and instead use the CustomLog directive (which forces me to explicitly define the log format).
CustomLog
The real power comes from using the CustomLog directive. The equivalent to the transferLog usage described above looks like this:
CustomLog /var/www/logs/access_log custom
The explicit naming of the log format helps us avoid mistakes. I like this directive because of its conditional logging features. Have a look at the following configuration fragment:
# determine which requests are static - you may need to adjust the regular expression
# to exclude other files, such as PDF documents, or archives
SetEnvIfNoCase REQUEST_URI "\.(gif|png|jpg)$" static_request
# only log dynamic requests
CustomLog /var/www/logs/application_log combined env=!static_request
The conditional logging opens the door to many interesting logging opportunities, which really helps in real life. Most commonly, you will use mod_setenvif or mod_rewrite (which can also set environment variables) to determine what gets logged.
I mentioned that, by default, Apache uses the CLF, which does not record many request parameters. At the very least you should change the configuration to use the combined format, which includes the UserAgent and the Referer fields.
Looking at the log format string table shown in the LogFormat section, you can see over twenty different format strings, so even the use of a combined format results in loss of information. Create your own log format based on your information requirements. A nice example can be found at:
- "Profiling LAMP Applications with Apache's Blackbox Logs" by Chris Josephes (http://www.onlamp.com/pub/a/apache/2004/04/22/blackbox_logs.html)
In the article, Chris makes a case for a log format that allows for web serving troubleshooting and performance management. At the end, he names the resulting log format Blackbox.
Error Logging
The Apache error log contains error messages and information about events unrelated to request serving. In short, the error log contains everything the access log doesn't:
-
Startup and shutdown messages
-
Various informational messages
-
Errors that occurred during request serving (i.e., status codes 400-503)
-
Critical events
-
Standard error output (stderr)
The format of the error log is fixed. Each line essentially contains only three fields: the time, the error level, and the message. In some rare cases, you can get raw data in the error log (no time or error level). Apache 2 adds the Referer information to 404 responses noted in the error log.
Error logs are created using the ErrorLog configuration directive. Standard file naming conventions apply here; a relative filename will be assumed to be located in the server main folder.
ErrorLog /var/www/logs/error_log
The directive can be configured globally or separately for each virtual host. The LogLevel directive configures log granularity and ensures more information is not in the log than necessary. Its single parameter is one of the levels. Events that are on the specified level or higher will be written to the log file.
Error log levels
|
Level
|
Description
|
|
emerg
|
Emergencies (system unstable)
|
|
alert
|
Alerts to act on immediately
|
|
crit
|
Critical conditions
|
|
error
|
Error messages
|
|
warn
|
Warning messages
|
|
notice
|
Normal but significant conditions
|
|
info
|
Informational messages
|
|
debug
|
Debugging information
|
The default setting is warn. However, Apache always logs the messages of level notice when logging to text files. Some interesting messages are emitted on the informational level (e.g., that a client timed out on a connection, a potential sign of a DoS attack). Consider running the error log on the information level:
LogLevel info
Take some time to observe the error log to get a feeling as to what constitutes normal Apache behavior. Some messages seem dangerous but may not be.
On server startup, you will get a message similar to this one:
[Mon Jul 05 12:26:27 2004] [notice] Apache/2.0.50 (Unix) DAV/2
PHP/4.3.4 configured -- resuming normal operations
You will see a message to log the shutdown of the server:
[Mon Jul 05 12:27:22 2004] [notice] caught SIGTERM, shutting down
Most other relevant events will find their way to the error log as well.
The Apache error log is good at telling you that something bad has happened, but it may not contain enough information to describe it. For example, since it does not contain information about the host where the error occurred, it is difficult to share one error log between virtual hosts.
There is a way to get more informational error messages using the mechanism of custom logging. Here is an example:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{error-notes}n\"" commone
CustomLog logs/super_error_log commone
Most of the time, the error message that caused a request to fail is contained in the error-notes note. By adding the contents of that variable to the log line output to the access log, we can get any request detail we want and the error message at the same time. This trick does not remove a need for the error log but makes forensic log analysis much easier.
Special Logging Modules
Apache processes should never crash, but when they do, a message such as the following will appear in the error log:
[Mon Jul 5 08:33:08 2004] [notice] child pid 1618 exit signal
Segmentation fault (11)
A segmentation fault appears because of an error in Apache code or because a hacker is taking advantage of the web server through a buffer overflow attack. Either way, this is bad and you have to find out why it is happening. Having frequent unexplained segmentation faults is a reason for concern.
Your first impulse after discovering a segmentation fault will probably be to find the request that caused it. Due to the inadequate format of the error log, this may be difficult. Segmentation fault messages appear only in the main error log and not in the virtual hosts. Finding the corresponding request log entry may prove difficult when hosting a server with more than a couple of virtual hosts since the information about which virtual host was being processed at the time is unavailable.
To make the matter worse, the request usually is not logged to the access log. The logging phase is one of the last phases of request processing to take place, so nothing is logged when the server crashes during one of the earlier phases.
The purpose of mod_forensics (available since Versions 1.3.31 and 2.0.50) is to reveal the requests that make the server crash. It does that by having a special log file where requests are logged twice: once at the beginning and once at the end. A special utility script is used to process the log file. If a request appears only once in the log file, we know the server crashed before it could log the request for the second time.
To enable mod_forensics you also need to enable mod_unique_id. After you add the module to your configuration, decide where to put the new log file:
ForensicLog /var/www/logs/forensic_log
After restarting the server, the beginning of each request will be marked with a log of the request data (with headers but excluding the request body). Here is an example:
+QOmjHtmgtpkAADFIBBw|GET /cgi-bin/modsec-test.pl
HTTP/1.1|Accept:text/xml,application/xml,application/xhtml+xml,text/html
%3bq=0.9,text/plain%3bq=0.8,image/png,image/jpeg,image/gif%3
bq=0.2,%2a/%2a%3bq=0.1|Accept-Charset:ISO-8859-1,utf-8%3bq=0.7,%2a%3bq=0.7|
Accept-Encoding:gzip,deflate|Accept-Language:en-us,en%3bq=0.5|
Cache-Control:max-age=0|Connection:keep-alive|Host:www.linuxexposed.com:8080|
Keep-Alive:300|User-Agent:Mozilla/5.0 %28Windows%3b U%3b Windows NT 5.1%3b
en-US%3b rv:1.7%29 Gecko/20040616
For each request that was properly handled, the unique identifier will be written to the log, too:
-QOmjHtmgtpkAADFIBBw
As you can see, a lot of data is being logged, so implement frequent log rotation for the forensic log. I don't think it is a good idea to leave mod_forensics enabled on a production server because excessive logging decreases performance.
The chances of catching the offending request with mod_forensics are good though in some rare instances this module will fail:
-
If the segmentation fault occurs before mod_forensics gets to log the request into the log
-
If the segmentation fault occurs after mod_forensics writes the second log entry to the log, in which case the complete pair will be in the log in spite of a segmentation fault
Once you figure out the request, you should determine which of the active modules causes it. Your goal here is to determine whether to contact the module author (for a third-party module) or the Apache developers at dev@apache.org (for standard modules).
If you have to continue on your own, consider the following tips:
Audit Log
One major disadvantage of Apache's (and most other web servers') logging facilities is that there is no way to observe and log request and response bodies. While most web application attacks take place through GET requests, that is only because they are performed (or programmed) by less capable attackers. The dangerous types will take the extra two minutes to craft a POST request, knowing the chances of the attack being logged are very small.
However, audit logging becomes a possibility with the help of mod_security (http://www.modsecurity.org). This module audit logging configuration directives that can be placed almost anywhere in the configuration. It works with the main server, virtual servers, or in a directory context. To specify the audit log file and start audit logging, add the following to your configuration:
SecAuditEngine On
SecAuditLog /var/www/logs/audit_log
After the installation and configuration, you will be able to log the contents of those POST payloads for the first time. Below is an example of an individual audit log entry, where mod_security denied the request because a pattern "333" was detected in the request body. ("333" is not a real attack but something I often use for testing to make sure my configuration works.)
= = = = = = = = = = = = = = = = = = = = = = =
UNIQUE_ID: QOFMpdmgtpkAAFM1ALQ
Request: 127.0.0.1 - - [29/Jun/2004:12:04:05 +0100] "POST /cgi-bin/
modsec-test.pl
HTTP/1.0" 500 539
Handler: cgi-script
----------------------------------------
POST /cgi-bin/modsec-test.pl HTTP/1.0
Connection: Close
Content-Length: 5
Content-Type: application/x-www-form-urlencoded
Host: 127.0.0.1:8080
User-Agent: mod_security regression test utility
mod_security-message: Access denied with code 500. Pattern match
"333" at POST_PAYLOAD.
mod_security-action: 500
5
p=333
HTTP/1.0 500 Internal Server Error
Connection: close
Content-Type: text/html; charset=iso-8859-1
The entry begins with a few request identifiers followed by the request headers and the request body, followed by the response headers. The module will automatically detect and make use of the unique ID generated by mod_unique_id. This variable can help track a request over several log files. Currently, the module does not support response body logging, though the filter architecture of Apache 2 allows for it.
The audit engine of mod_security supports several logging levels (configured with the SecAuditEngine directive):
Off
----
No logging takes place.
On
-----
Perform full audit logging. Not recommended since it results in large amounts of data (of small value) in the log file. Besides, static resources do not support POST requests and they cannot be hacked, so it is not useful to log static resource requests.
RelevantOnly