24 January 2014

Centralizing and analyzing Postfix logs in real time with Fluentd (td-agent), ElasticSearch and Kibana - part 2

<< Back to part 1 <<

2. Configure Fluentd (td-agent) to receive the log stream from Postfix

Fluentd (td-agent) is a very good log transport and parser. It has a clearly modular design, supports many log formats (including custom ones), and has many plugins for writing to different kinds of databases.

Fluentd (td-agent) also supports high availability (HA) and load balancing of log streams, which helps it scale out to very heavy traffic.

We will install td-agent on server2 to receive the log stream from server1. Download the rpm package from rpmse: http://www.rpmse.org/#/dashboard?s=td-agent

It is the best rpm search engine I've used; millions of packages have been indexed.

td-agent also provides a yum repository; just create /etc/yum.repos.d/td-agent.repo containing:

[treasuredata]
name=TreasureData
baseurl=http://packages.treasure-data.com/redhat/$basearch
gpgcheck=0

then run: # yum install td-agent

The main config file: /etc/td-agent/td-agent.conf
Log files are stored in: /var/log/td-agent/

To parse events and write them to ElasticSearch, we need to install two td-agent plugins, fluent-plugin-parser and fluent-plugin-elasticsearch:

# cd /usr/lib64/fluent/ruby/bin/
# ./fluent-gem install fluent-plugin-parser
# ./fluent-gem install fluent-plugin-elasticsearch

Now we need to configure the td-agent service: # vim /etc/td-agent/td-agent.conf

### Listen on UDP 5140 ###
<source>
  type syslog
  port 5140
  bind 0.0.0.0
  tag syslog
</source>

### Parse events into fields : queueid, rcpt-to, relay, status ###
<match syslog.mail.info>
  type parser
  remove_prefix syslog
  format /^(?<queueid>[^ ]*): to=<(?<rcpt-to>[^ ]*)>, relay=(?<relay>[^ ]*), [^*]* status=(?<status>[^ ]*)/
  key_name message
  reserve_data yes
</match>

### Write event to ElasticSearch ###
<match mail.info>
  buffer_type file
  buffer_path /mnt/ramdisk/postfix-mail.buff
  buffer_chunk_limit 4m
  buffer_queue_limit 50
  flush_interval 0s
  type elasticsearch
  logstash_format true
  logstash_prefix postfix_mail
</match>

Restart the td-agent service to apply the changes: # /etc/init.d/td-agent restart

Check that td-agent has opened the UDP port:

# netstat -pnatu|grep 5140
udp        0      0 0.0.0.0:5140           0.0.0.0:*               6662/ruby

And it works! td-agent is ready to receive the incoming log stream.
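
To exercise the listener without waiting for real mail traffic, one option is to hand-craft a syslog datagram and send it to port 5140. The following is only a minimal sketch under stated assumptions, not part of td-agent: the function names build_syslog_packet and send_test_line, the host name server1, and the ident postfix/smtp with pid 1234 are all illustrative. It uses facility mail (2) and severity info (6), which gives priority 2 * 8 + 6 = 22; the syslog source combines the tag prefix with facility and severity, so such an event should be tagged syslog.mail.info.

```python
import socket
from datetime import datetime

FACILITY_MAIL = 2  # syslog facility code for "mail"
SEVERITY_INFO = 6  # syslog severity code for "info"


def build_syslog_packet(message, host="server1", ident="postfix/smtp",
                        pid=1234, when=None):
    """Build a BSD-style (RFC 3164) syslog datagram as bytes.

    host, ident and pid are illustrative defaults, not values from a real server.
    """
    when = when or datetime.now()
    pri = FACILITY_MAIL * 8 + SEVERITY_INFO  # 22 for mail.info
    # RFC 3164 timestamp, e.g. "Jan 24 10:15:00" (day is space-padded)
    timestamp = "{} {} {}".format(
        when.strftime("%b"), str(when.day).rjust(2), when.strftime("%H:%M:%S")
    )
    line = "<{}>{} {} {}[{}]: {}".format(pri, timestamp, host, ident, pid, message)
    return line.encode("utf-8")


def send_test_line(packet, host="127.0.0.1", port=5140):
    """Send one datagram to the td-agent syslog source over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(packet, (host, port))
    finally:
        sock.close()
```

Calling send_test_line(build_syslog_packet("test message")) on server2 sends a single event. UDP gives no delivery confirmation, so check /var/log/td-agent/ or the configured output to see whether the event arrived.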

Explaining the config:

The config file itself is quite clear, but there are a few things to note:
  • The syslog source module is built in; we don't need to install it.
  • format /^(?<queueid>[^ ]*): to=<(?<rcpt-to>[^ ]*)>, relay=(?<relay>[^ ]*), [^*]* status=(?<status>[^ ]*)/ : this uses a regexp to parse the mail log into fields before writing to ElasticSearch. You can try leaving this module out to see the difference.
  • buffer_path /mnt/ramdisk/postfix-mail.buff : I have mounted a tmpfs (RAM-backed) filesystem at /mnt/ramdisk to speed up the buffer and to be able to watch the buffer files change in real time (by default, this module keeps its buffer in memory). With buffer_chunk_limit 4m and buffer_queue_limit 50, the buffer can grow to roughly 200 MB (50 chunks of 4 MB), so size the tmpfs accordingly. This is a useful performance hint; more detail here: http://docs.fluentd.org/articles/buf_file
  • For debugging purposes, instead of writing to ElasticSearch, you can also try writing events to normal files using the out_file module: http://docs.fluentd.org/articles/out_file
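
To see which fields the format regexp extracts, it can help to try it on one log line outside td-agent. The sketch below is a Python port of that regexp, offered only as an illustration. Python's re module writes named groups as (?P<name>...) and does not accept a hyphen in a group name, so rcpt-to is renamed rcpt_to here. The sample message and the helper name parse_mail_message are assumptions made for the example, not output from a real server.

```python
import re

# Python port of the `format` regexp from the td-agent config.
# Assumption: group syntax adapted to Python, and rcpt-to renamed to rcpt_to.
MAIL_LINE = re.compile(
    r"^(?P<queueid>[^ ]*): to=<(?P<rcpt_to>[^ ]*)>, "
    r"relay=(?P<relay>[^ ]*), [^*]* status=(?P<status>[^ ]*)"
)

# Hypothetical Postfix delivery message (the part after "postfix/smtp[pid]: ")
SAMPLE = (
    "3F2A41C0B5: to=<user@example.com>, relay=mx.example.com[192.0.2.10]:25, "
    "delay=1.2, delays=0.1/0/0.5/0.6, dsn=2.0.0, status=sent (250 2.0.0 OK)"
)


def parse_mail_message(message):
    """Return the extracted fields as a dict, or None if the line does not match."""
    match = MAIL_LINE.match(message)
    return match.groupdict() if match else None


fields = parse_mail_message(SAMPLE)
print(fields)
```

For the sample line this yields queueid 3F2A41C0B5, rcpt_to user@example.com, relay mx.example.com[192.0.2.10]:25 and status sent. Lines without a to=<...> part, such as the from=<...> queue entries, do not match the pattern at all.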

>> Continue to part 3 - Config ElasticSearch >>