---
layout: post
title: Monitoring Correct Memory Usage in Fluent Bit
date: 2023-08-09 16:19:00 Europe/Amsterdam
categories: fluentd fluentbit memory
---

Previously, I used Prometheus' node_exporter to monitor the memory usage of my servers. However, I am currently in the process of moving away from Prometheus to a new monitoring stack. While I understand its advantages, I feel that Prometheus' pull-based architecture does not scale nicely: every time I spin up a new machine, I have to centrally change Prometheus' configuration before it starts querying the new server.

To collect metrics from my servers, I am now using Fluent Bit. I love Fluent Bit's way of configuration, which I can easily express as code and automate, its focus on efficiency, and the fact that it is vendor agnostic. However, I have stumbled upon one, in my opinion, big issue with Fluent Bit: its `mem` plugin for monitoring memory usage is completely useless. In this post I will go over the problem and my temporary solution.

## The Problem with Fluent Bit's `mem` Plugin

As can be seen in the documentation, Fluent Bit's `mem` input plugin exposes a few metrics regarding memory usage that should be self-explanatory: `Mem.total`, `Mem.used`, `Mem.free`, `Swap.total`, `Swap.used` and `Swap.free`. The problem is that `Mem.used` and `Mem.free` do not accurately reflect the machine's actual memory usage, because they include caches and buffers, which can be reclaimed by other processes if needed. Most tools that report memory usage therefore include an additional metric that specifies the memory *available* on the system. For example, the command `free -m` reports the following data on my laptop:

```
               total        used        free      shared  buff/cache   available
Mem:           15864        3728        7334         518        5647       12136
Swap:           2383         663        1720
```

Notice that the available memory is more than free memory.
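
That extra `available` column is the kernel's own estimate of how much memory can be handed out without swapping. As a minimal sketch (assuming a Linux kernel recent enough to expose `MemAvailable`, i.e. 3.14 or later), you can look at the underlying values directly in `/proc/meminfo`:

```sh
# free(1) derives its "available" column from the kernel's MemAvailable
# estimate; the other lines show where "free" and "buff/cache" come from.
grep -E '^(MemTotal|MemFree|MemAvailable|Buffers|Cached):' /proc/meminfo
```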

While the issue is known (see this and this link), it is unfortunately not yet fixed.

## A Temporary Solution

The issues I linked previously provide stand-alone plugins that fix the problem, which will hopefully be merged into the official project at some point. However, I didn't want to install another plugin, so I used Fluent Bit's `exec` input plugin and the `free` Linux command to query memory usage like so:

```
[INPUT]
    Name exec
    Tag memory
    Command free -m | tail -2 | tr '\n' ' '
    Interval_Sec 1
```
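
Running the same pipeline in a shell shows the single-line string this produces (column padding trimmed here for readability, values taken from the `free -m` output above); the whole line ends up in the record's `exec` field, which the filter and parser below pick apart:

```console
$ free -m | tail -2 | tr '\n' ' '
Mem: 15864 3728 7334 518 5647 12136 Swap: 2383 663 1720
```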

To interpret the command's output, I created the following filter:

```
[FILTER]
    Name parser
    Match memory
    Key_Name exec
    Parser free
```

Lastly, I created the following parser (warning: regex shitcode incoming):

```
[PARSER]
    Name free
    Format regex
    Regex ^Mem:\s+(?<mem_total>\d+)\s+(?<mem_used>\d+)\s+(?<mem_free>\d+)\s+(?<mem_shared>\d+)\s+(?<mem_buff_cache>\d+)\s+(?<mem_available>\d+) Swap:\s+(?<swap_total>\d+)\s+(?<swap_used>\d+)\s+(?<swap_free>\d+)
    Types mem_total:integer mem_used:integer mem_free:integer mem_shared:integer mem_buff_cache:integer mem_available:integer swap_total:integer swap_used:integer swap_free:integer
```
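
Note that `[PARSER]` sections do not go in the main configuration file; they live in a separate parsers file that is referenced from the `[SERVICE]` section. A minimal sketch, assuming the parser above is saved as `parsers.conf` next to the main configuration (the file name is only an example):

```
[SERVICE]
    Parsers_File parsers.conf
```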

With this configuration, you can use the `mem_available` metric to get accurate memory usage in Fluent Bit.
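
To check that the whole pipeline works, you can temporarily route the parsed records to Fluent Bit's `stdout` output plugin (this block is only for debugging, not part of my actual pipeline):

```
[OUTPUT]
    Name stdout
    Match memory
```

Each flushed record should then contain the fields named in the parser, including `mem_available`, as integers.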

## Conclusion

Let's hope Fluent Bit's `mem` input plugin is improved soon so that this hacky solution is no longer needed. I also intend to document my new monitoring pipeline, which at the moment consists of:

- Fluent Bit
- Fluentd
- Elasticsearch
- Grafana