Send Apache Logs to Graylog

I was looking for an easy way to forward Apache access logs to Graylog, and came across GELF.

GELF

While syslog is fine for logging a server's system messages, the Graylog Extended Log Format (GELF) is a great choice for logging from within applications.

A GELF (version 1.1) message is a JSON string with three mandatory fields:

  1. version – GELF spec version “1.1”, must be set by client library.
  2. host – the name of the host or application that sent this message, must be set by client library.
  3. short_message – a short descriptive message, must be set by client library.

A timestamp field (seconds since the UNIX epoch) should be set but is not mandatory: if it is absent, the server sets it to the current time. Any field we send with an underscore (_) prefix is treated as an additional field.
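To make the field rules above concrete, here is a minimal Python sketch that assembles a GELF 1.1 payload. The function name build_gelf_message and the sample host and field names are made up for illustration and are not part of any GELF client library.

```python
import json
import time


def build_gelf_message(host, short_message, **additional_fields):
    """Return a GELF 1.1 payload as a JSON string (illustrative helper)."""
    message = {
        "version": "1.1",                # mandatory
        "host": host,                    # mandatory
        "short_message": short_message,  # mandatory
        "timestamp": time.time(),        # optional; the server fills it in if absent
    }
    for name, value in additional_fields.items():
        # Additional fields must carry an underscore prefix.
        key = name if name.startswith("_") else "_" + name
        message[key] = value
    return json.dumps(message)


# Hypothetical sample values.
example = build_gelf_message(
    "web01.example.com", "GET /index.html HTTP/1.1", http_status=200
)
```

The resulting string has the same shape as the messages the Apache log format below is meant to produce.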

Apache Configuration

This is what the default combined log format looks like on Apache 2.4 (CentOS 7):

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

We are going to define a new log format called graylog_access that writes each access log entry as a GELF (JSON) message:

LogFormat "{ \"version\": \"1.1\", \"host\": \"%V\", \"short_message\": \"%r\", \"timestamp\": %{%s}t, \"level\": 6, \"_user_agent\": \"%{User-Agent}i\", \"_source_ip\": \"%{X-Forwarded-For}i\", \"_duration_usec\": %D, \"_duration_sec\": %T, \"_request_size_byte\": %O, \"_http_status_orig\": %s, \"_http_status\": %>s, \"_http_request_path\": \"%U\", \"_http_request\": \"%U%q\", \"_http_method\": \"%m\", \"_http_referer\": \"%{Referer}i\", \"_from_apache\": \"true\" }" graylog_access

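Our Apache is behind a proxy, so we take the source IP from the X-Forwarded-For header. If yours is not behind a proxy, use %a (the client IP address) instead. Note that you need to enable mod_logio to use %O. Despite the field name _request_size_byte, %O logs the number of bytes sent in the response, including headers; use %I for bytes received. See https://httpd.apache.org/docs/current/mod/mod_log_config.html for the other format directives.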

Here is the same format in a human-readable layout:

{ 
 "version": "1.1",
 "host": "%V",
 "short_message": "%r",
 "timestamp": %{%s}t,
 "level": 6,
 "_user_agent": "%{User-Agent}i",
 "_source_ip": "%{X-Forwarded-For}i",
 "_duration_usec": %D,
 "_duration_sec": %T,
 "_request_size_byte": %O,
 "_http_status_orig": %s,
 "_http_status": %>s,
 "_http_request_path": "%U",
 "_http_request": "%U%q",
 "_http_method": "%m",
 "_http_referer": "%{Referer}i",
 "_from_apache": "true"
}

The field _http_status_orig contains the status of the original request that reached our proxy, before any internal redirect; _http_status (%>s) holds the status of the final request.

By default, the % directives %s, %U, %T, %D, and %r look at the original request while all others look at the final request.
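Before pointing Apache at Graylog, it can help to check that the format yields valid JSON. The Python sketch below is only an approximation: it substitutes made-up sample values for the % directives by plain string replacement (it does not reproduce Apache's own escaping rules) and then parses the result.

```python
import json

# The graylog_access format from this article, without the Apache-level quoting.
TEMPLATE = (
    '{ "version": "1.1", "host": "%V", "short_message": "%r", '
    '"timestamp": %{%s}t, "level": 6, '
    '"_user_agent": "%{User-Agent}i", "_source_ip": "%{X-Forwarded-For}i", '
    '"_duration_usec": %D, "_duration_sec": %T, "_request_size_byte": %O, '
    '"_http_status_orig": %s, "_http_status": %>s, '
    '"_http_request_path": "%U", "_http_request": "%U%q", '
    '"_http_method": "%m", "_http_referer": "%{Referer}i", '
    '"_from_apache": "true" }'
)

# Hypothetical sample values standing in for what Apache would substitute.
SAMPLE = {
    "%{X-Forwarded-For}i": "203.0.113.7",
    "%{User-Agent}i": "curl/7.29.0",
    "%{Referer}i": "-",
    "%{%s}t": "1500000000",
    "%>s": "200",
    "%V": "www.example.com",
    "%r": "GET /index.html?x=1 HTTP/1.1",
    "%D": "1234",
    "%T": "0",
    "%O": "5120",
    "%s": "301",
    "%U": "/index.html",
    "%q": "?x=1",
    "%m": "GET",
}


def render(template, values):
    """Replace each directive with its sample value, longest directive first,
    so that %{%s}t and %>s are handled before the shorter %s."""
    line = template
    for directive in sorted(values, key=len, reverse=True):
        line = line.replace(directive, values[directive])
    return line


# Raises json.JSONDecodeError if the rendered line is not valid JSON.
record = json.loads(render(TEMPLATE, SAMPLE))
```

If parsing succeeds, numeric directives such as %D and %>s come back as numbers and the quoted ones as strings, which is how Graylog will see them.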

Apache is capable of writing error and access log files through a pipe to another process, rather than directly to a file. Therefore we can send Apache logs to Graylog by piping the log data through nc (or ncat). Put the following into Apache’s host configuration:

CustomLog "|/usr/bin/nc -u graylog.example.com 12201" graylog_access

The above assumes that the server graylog.example.com has a GELF UDP input listening on port 12201.

Apache will start the piped-log process when it starts, and will restart it if it crashes while Apache is running. The quotes enclose the entire command that is called for the pipe.
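The pipe target does not have to be nc; any program that reads log lines from standard input will do. As a rough sketch of the idea, the Python script below sends every line it reads as its own UDP datagram. The script name gelf_pipe.py and the function forward_lines are hypothetical, and the sketch assumes each log line fits into a single datagram (it does no GELF chunking or compression).

```python
import socket
import sys


def forward_lines(lines, host, port):
    """Send each non-empty line as its own UDP datagram; return the count sent."""
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for line in lines:
            payload = line.strip().encode("utf-8")
            if not payload:
                continue  # skip blank lines
            sock.sendto(payload, (host, port))
            sent += 1
    return sent


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical invocation: gelf_pipe.py graylog.example.com 12201
    forward_lines(sys.stdin, sys.argv[1], int(sys.argv[2]))
```

Assuming the script were saved as /usr/local/bin/gelf_pipe.py (a made-up path), the CustomLog line would call it in place of /usr/bin/nc with the same host and port arguments. Sending one datagram per line keeps each GELF message separate, which nc may not guarantee depending on how it reads its input.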

Apache Dashboard

Below is a Graylog dashboard example that visualises Apache logs.

References

http://docs.graylog.org/en/2.1/pages/gelf.html
http://httpd.apache.org/docs/current/mod/mod_log_config.html
https://serverfault.com/questions/310695/sending-logs-to-graylog2-server/

13 thoughts on “Send Apache Logs to Graylog”

  1. Hello,

    Thanks for the article, very interesting. I am still surprised how ‘complicated’ it is to ship Apache logs in GELF JSON; in the end I have fewer issues with IIS ;). I have tried the following settings based on your article. My source IP field is empty. I am not behind a proxy (NAT firewall). Any idea why the source IP is empty?

    Gmiga

    • Tomcat logs are complicated, Apache logs are a breeze! :) I use NXlog for IIS, cannot say it’s any easier TBH.

      What argument do you use to send the source IP? Try %a, that should work.

  2. Hi,

    nice tutorial…. any reason why the GELF input should listen on UDP instead of TCP? Is TCP problematic? If so, why?

    • It depends on your use case really. I’m currently on UDP in production as I require performance over reliability in this particular case. UDP is simply “send and forget” and I don’t have to wait for a server to send confirmation. This is extremely important when I want to send huge amounts of non-critical Apache logs (I can live with incomplete Apache monitoring data). Hope this helps.

  3. Don’t we need to apply any JARs to Tomcat Apache? When I configured this for JBoss, I had to apply the GELF JARs. Please clarify. Thanks

  4. Thanks for the tutorial. However, when I follow your instructions for my setup, changing the proper fields where needed, my Apache instance will not restart. My configuration is not behind a proxy; in that case, would I also need to remove some of the directives which you have in the apache.conf file? Also, I’m running my webservers on Ubuntu 18.04. Any assistance would be greatly appreciated.

  5. Hopefully most will be smarter than me while implementing this fantastic how-to that still works as of 2020. Having said that, if everything seems to be working correctly but Graylog isn’t showing any messages/metrics in the search though you see data coming into the input in input view, make sure you have configured the input as GELF UDP or GELF TCP depending upon which protocol you choose, and not GELF HTTP as I did. Banged my head for an hour over that one before it dawned on me.

    This is an efficient, straightforward method of implementing Graylog for Apache. The “official” method is dependency heavy and seemingly much more cumbersome in comparison. Thanks for this tutorial Tomas.

  6. Hello,

    I’m a new Graylog user. Can you explain how to send Graylog data to Prometheus? I want to show the metrics in Grafana.

    Thanks
