add post about fluent bit memory monitoring

This commit is contained in:
Pim Kunis 2023-08-09 17:10:14 +02:00
parent c8b8d925be
commit f6a97c113e


@@ -0,0 +1,74 @@
---
layout: post
title: Monitoring Correct Memory Usage in Fluent Bit
date: 2023-08-09 16:19:00 Europe/Amsterdam
categories: fluentd fluentbit memory
---
Previously, I have used [Prometheus' node_exporter](https://github.com/prometheus/node_exporter) to monitor the memory usage of my servers.
However, I am currently in the process of moving away from Prometheus to a new monitoring stack.
While I understand its advantages, I feel that Prometheus' pull architecture does not scale nicely.
Every time I spin up a new machine, I would have to centrally change Prometheus' configuration so that it scrapes the new server.
In order to collect metrics from my servers, I am now using [Fluent Bit](https://fluentbit.io/).
I love Fluent Bit's way of configuration, which I can easily express as code and automate, its focus on efficiency, and the fact that it is vendor agnostic.
However, I have stumbled upon one issue with Fluent Bit that is, in my opinion, quite big: its `mem` plugin for monitoring memory usage is _completely_ useless.
In this post I will go over the problem and my temporary solution.

# The Problem with Fluent Bit's `mem` Plugin
As can be seen in [the documentation](https://docs.fluentbit.io/manual/pipeline/inputs/memory-metrics), Fluent Bit's `mem` input plugin exposes a few metrics regarding memory usage which should be self-explanatory: `Mem.total`, `Mem.used`, `Mem.free`, `Swap.total`, `Swap.used` and `Swap.free`.
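For reference, enabling the plugin only takes an input section like this (a minimal sketch along the lines of the linked documentation; the tag is arbitrary):
```conf
[INPUT]
    Name mem
    Tag memory
```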
The problem is that `Mem.used` and `Mem.free` do not accurately reflect the machine's actual memory usage.
This is because these metrics include caches and buffers, which can be reclaimed by other processes if needed.
Most tools reporting memory usage therefore include an additional metric that specifies the memory _available_ on the system.
For example, the command `free -m` reports the following data on my laptop:
```
              total        used        free      shared  buff/cache   available
Mem:          15864        3728        7334         518        5647       12136
Swap:          2383         663        1720
```
Notice that the `available` memory is more than the `free` memory: it also accounts for buffers and page cache that the kernel can reclaim when programs need more memory.
While the issue is known (see [this](https://github.com/fluent/fluent-bit/pull/3092) and [this](https://github.com/fluent/fluent-bit/pull/5237) link), it is unfortunately not yet fixed.

# A Temporary Solution
The issues I linked above provide stand-alone plugins that fix the problem, which will hopefully be merged into the official project at some point.
However, I didn't want to install another plugin, so instead I used Fluent Bit's `exec` input plugin together with the `free` Linux command to query memory usage like so:
```conf
[INPUT]
    Name exec
    Tag memory
    Command free -m | tail -2 | tr '\n' ' '
    Interval_Sec 1
```
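Because of `tail` and `tr`, the command emits the memory and swap lines of `free` as one single line.
The `exec` input then puts that whole line into a single field named `exec`, so the raw record looks roughly like this (values taken from the `free -m` example above):
```
{"exec"=>"Mem: 15864 3728 7334 518 5647 12136 Swap: 2383 663 1720 "}
```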
To interpret the command's output, I created the following filter:
```conf
[FILTER]
    Name parser
    Match memory
    Key_Name exec
    Parser free
```
Lastly, I created the following parser (warning: regex shitcode incoming):
```conf
[PARSER]
    Name free
    Format regex
    Regex ^Mem:\s+(?<mem_total>\d+)\s+(?<mem_used>\d+)\s+(?<mem_free>\d+)\s+(?<mem_shared>\d+)\s+(?<mem_buff_cache>\d+)\s+(?<mem_available>\d+) Swap:\s+(?<swap_total>\d+)\s+(?<swap_used>\d+)\s+(?<swap_free>\d+)
    Types mem_total:integer mem_used:integer mem_free:integer mem_shared:integer mem_buff_cache:integer mem_available:integer swap_total:integer swap_used:integer swap_free:integer
```
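One thing to keep in mind: `[PARSER]` sections do not go in the main configuration file, but in a separate parsers file that is referenced from the `[SERVICE]` section.
Assuming the parser above is saved as `parsers.conf`, that reference looks like this:
```conf
[SERVICE]
    Parsers_File parsers.conf
```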
With this configuration, you can use the `mem_available` metric to get accurate memory usage in Fluent Bit.
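To verify the result, a simple `stdout` output matched on the same tag does the trick:
```conf
[OUTPUT]
    Name stdout
    Match memory
```
It should print records containing the individual fields, including `mem_available`, as integers.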

# Conclusion
Let's hope Fluent Bit's `mem` input plugin is improved soon so that this hacky solution is no longer needed.
I also intend to document my new monitoring pipeline, which at the moment consists of:

- Fluent Bit
- Fluentd
- Elasticsearch
- Grafana