Exploration of Docker Swarm alternative #65

Closed
opened 2024-02-25 15:29:14 +00:00 by pim · 1 comment
Owner

While Docker Swarm has worked fine for us for more than a year, it is time to evaluate whether it is still actually needed. Our main use of Docker Swarm is as follows:

  • Automatic scheduling of container workloads
  • Overlay network so we don't have to hard-code hosts and for traefik

It seems there are two alternatives with similar features we can use:

  • Kubernetes
  • Hashicorp Nomad

Recent experimentation with k3s shows Kubernetes has way more moving parts than I am comfortable maintaining and administering. Therefore let's explore Nomad.

It seems we are looking for a "service mesh", which can be achieved using Consul: https://developer.hashicorp.com/nomad/docs/networking/service-mesh
A benefit is that we can use Podman as the container driver, and perhaps even rootless containers.
Submitting Nomad jobs can be done from Ansible, similar to how we do it currently for Docker Swarm: https://docs.ansible.com/ansible/latest/collections/community/general/nomad_job_module.html
It seems service discovery works fully for Nomad, even without Consul: https://traefik.io/blog/traefik-proxy-fully-integrates-with-hashicorp-nomad/ One question remains: does that work across multiple hosts as well, or do we need Consul then? It seems yes: we would need Consul for the mesh networking, but not for service discovery, since Traefik handles that for us.
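For illustration, submitting a job through the `community.general.nomad_job` module could look roughly like the sketch below. The job file path, host, and TLS setting are assumptions for this example, not tested configuration:

```yaml
# Hedged sketch: deploy a Nomad job from an HCL file via Ansible.
# 'jobs/whoami.nomad.hcl' and the host are placeholders.
- name: Submit whoami job to Nomad
  community.general.nomad_job:
    host: localhost
    state: present
    content: "{{ lookup('ansible.builtin.file', 'jobs/whoami.nomad.hcl') }}"
    content_format: hcl
    use_ssl: false
```

This would slot into a playbook the same way our current Docker Swarm deployment tasks do.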

Author
Owner

An update: I have decided to run k3s in the end :^) The problems I had were mostly related to the fact that I was running it in a VM with VirtioFS as the filesystem. I am now running it on bare metal and it works well.

There are some questions left:

  • How do we use NFS as volumes in Kubernetes? Easy: just create a PV and a PVC, which can then be used by any container. Example pushed.
  • How do we expose ports other than HTTP and HTTPS? We solved this using MetalLB. Each service gets its own IP address.
  • How do we import more OpenAPI definitions, e.g. for MetalLB? This is actually quite tricky. For now, I have created dummy types which enable creation of these resources but offer no validation on the Nix side.
  • How do we enable inter-pod communication?
  • How to do secret management? Kubenix [suggests](https://kubenix.org/examples/secrets/) using [Vals](https://github.com/helmfile/vals). Vals supports multiple backends; only `file` and `sops` are relevant to us. [Sops](https://github.com/getsops/sops) is especially interesting because we can encrypt the secrets. We can use the file-based method as an intermediate step.
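The NFS PV/PVC pair mentioned above could be sketched roughly as follows. The server address, export path, and sizes are placeholders, not the pushed example:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-data
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.1.10   # placeholder NFS server address
    path: /export/data     # placeholder export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-data
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""     # bind to the static PV above, not a dynamic provisioner
  volumeName: nfs-data
  resources:
    requests:
      storage: 10Gi
```

Any pod can then mount the claim `nfs-data` as a volume, regardless of which node it is scheduled on.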
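For the MetalLB setup, a minimal configuration sketch (the address range is an assumption for our LAN, not the deployed config):

```yaml
# Pool of addresses MetalLB may hand out to LoadBalancer services,
# advertised via layer-2 (ARP). Range is a placeholder.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
```

With this in place, any `Service` of type `LoadBalancer` gets its own IP from the pool, which is how we expose non-HTTP ports.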

Nice to have: DNS for every service we define. It seems k3s's DNS solution (CoreDNS) supports this, but it needs configuration: namely, the k8s_external plugin must be enabled. Workaround for now: statically assign an IP address to each service, then configure the router for these IPs.
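Enabling k8s_external might look roughly like the sketch below, assuming k3s's `coredns-custom` ConfigMap mechanism for extending the Corefile is available in our version; the zone name is a placeholder:

```yaml
# Hedged sketch: k3s merges *.override keys from this ConfigMap into its Corefile.
# With k8s_external enabled, <service>.<namespace>.lan.example.com resolves to the
# service's external (e.g. MetalLB-assigned) IP.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system
data:
  k8s_external.override: |
    k8s_external lan.example.com
```

The router would then only need to forward queries for that zone to the cluster's CoreDNS, instead of us maintaining static entries per service.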

pim closed this issue 2024-04-06 08:21:10 +00:00