diff --git a/jekyll/_posts/fluent-bit-memory/2023-08-09-fluent-bit-memory.md b/jekyll/_posts/fluent-bit-memory/2023-08-09-fluent-bit-memory.md
new file mode 100644
index 0000000..d5160c4
--- /dev/null
+++ b/jekyll/_posts/fluent-bit-memory/2023-08-09-fluent-bit-memory.md
@@ -0,0 +1,74 @@
+---
+layout: post
+title: Monitoring Correct Memory Usage in Fluent Bit
+date: 2023-08-09 16:19:00 Europe/Amsterdam
+categories: fluentd fluentbit memory
+---
+
+Previously, I used [Prometheus' node_exporter](https://github.com/prometheus/node_exporter) to monitor the memory usage of my servers.
+However, I am currently in the process of moving away from Prometheus to a new monitoring stack.
+While I understand the advantages of a pull-based approach, I feel that Prometheus' pull architecture does not scale nicely for my setup:
+every time I spin up a new machine, I have to change Prometheus' central configuration so that it starts scraping the new server.
+
+To collect metrics from my servers, I am now using [Fluent Bit](https://fluentbit.io/).
+I love Fluent Bit's configuration format, which I can easily express as code and automate, its focus on efficiency, and the fact that it is vendor agnostic.
+However, I have stumbled upon one big issue with Fluent Bit: in my opinion, its `mem` plugin for monitoring memory usage is _completely_ useless.
+In this post I will go over the problem and my temporary solution.
+
+# The Problem with Fluent Bit's `mem` Plugin
+
+As described in [the documentation](https://docs.fluentbit.io/manual/pipeline/inputs/memory-metrics), Fluent Bit's `mem` input plugin exposes a few self-explanatory metrics about memory usage: `Mem.total`, `Mem.used`, `Mem.free`, `Swap.total`, `Swap.used` and `Swap.free`.
+The problem is that `Mem.used` and `Mem.free` do not accurately reflect the machine's actual memory usage:
+kernel caches and buffers are counted as used memory, even though the kernel reclaims them as soon as another process needs the space.
+Most tools that report memory usage therefore include an additional metric that specifies the memory _available_ on the system.
+For example, the command `free -m` reports the following data on my laptop:
+```
+               total        used        free      shared  buff/cache   available
+Mem:           15864        3728        7334         518        5647       12136
+Swap:           2383         663        1720
+```
+
+Notice that the `available` memory is considerably higher than the `free` memory.
+
+While the issue is known (see [this](https://github.com/fluent/fluent-bit/pull/3092) and [this](https://github.com/fluent/fluent-bit/pull/5237) pull request), it has unfortunately not been fixed yet.
+
+# A Temporary Solution
+
+The pull requests linked above provide stand-alone plugins that fix the problem, and they will hopefully be merged into the official project at some point.
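+
+As far as I can tell, the `available` column is not something `free` computes on its own: on reasonably modern kernels (3.14 and later) it comes straight from the `MemAvailable` field in `/proc/meminfo`, the kernel's own estimate of how much memory new workloads can use without pushing the system into swap. A quick way to check, entirely outside of Fluent Bit:
+```
+# MemAvailable already accounts for reclaimable caches and buffers;
+# it is the number behind the "available" column of free.
+grep -E 'MemTotal|MemFree|MemAvailable' /proc/meminfo
+```
+So the value I actually want is readily exposed by the kernel; the only question is how to get it into Fluent Bit without patching it.
+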
+Since I didn't want to install yet another plugin, I used Fluent Bit's `exec` input plugin together with the `free` Linux command to query memory usage like so:
+```conf
+[INPUT]
+    Name         exec
+    Tag          memory
+    Command      free -m | tail -2 | tr '\n' ' '
+    Interval_Sec 1
+```
+Here, `tail -2` keeps only the `Mem:` and `Swap:` rows of the `free` output and `tr` joins them into a single line, which makes parsing easier.
+
+To interpret the command's output, I created the following filter:
+```conf
+[FILTER]
+    Name     parser
+    Match    memory
+    Key_Name exec
+    Parser   free
+```
+
+Lastly, I created the following parser (warning: regex shitcode incoming):
+```conf
+[PARSER]
+    Name   free
+    Format regex
+    Regex  ^Mem:\s+(?<mem_total>\d+)\s+(?<mem_used>\d+)\s+(?<mem_free>\d+)\s+(?<mem_shared>\d+)\s+(?<mem_buff_cache>\d+)\s+(?<mem_available>\d+) Swap:\s+(?<swap_total>\d+)\s+(?<swap_used>\d+)\s+(?<swap_free>\d+)
+    Types  mem_total:integer mem_used:integer mem_free:integer mem_shared:integer mem_buff_cache:integer mem_available:integer swap_total:integer swap_used:integer
+```
+
+With this configuration, every record tagged `memory` carries a `mem_available` field that accurately reflects how much memory the machine really has left.
+
+# Conclusion
+
+Let's hope Fluent Bit's `mem` input plugin is improved soon so that this hacky workaround is no longer needed.
+I also intend to document my new monitoring pipeline in a future post. At the moment it consists of the components listed below; a rough sketch of the Fluent Bit output side follows after the list.
+- Fluent Bit
+- Fluentd
+- Elasticsearch
+- Grafana
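+
+For completeness, here is a minimal sketch of what the output side of this setup can look like: a `stdout` output to eyeball the parsed records while debugging, and a `forward` output that ships them to Fluentd. The host name and port below are placeholders for my environment, not values discussed above.
+```conf
+# Debug output: print every parsed "memory" record to the local console.
+[OUTPUT]
+    Name  stdout
+    Match memory
+
+# Ship the same records to Fluentd over the forward protocol.
+[OUTPUT]
+    Name  forward
+    Match memory
+    Host  fluentd.example.internal
+    Port  24224
+```
+Running with only the `stdout` output first is a cheap way to confirm that the parser really produces `mem_available` before anything leaves the machine.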