# Longhorn notes

## Troubleshooting

```
Multi-Attach error for volume "prowlarr" Volume is already exclusively attached to one node and can't be attached to another
```

I solved the above problem like this:
```
❯ kubectl get volumeattachments | grep prowlarr
csi-f13ee1f46a4acc0d7e4abe8a3c993c7e043e9a55cd7573bda3499085654b493a   driver.longhorn.io   prowlarr   lewis   true   3m38s
❯ kubectl delete volumeattachments csi-f13ee1f46a4acc0d7e4abe8a3c993c7e043e9a55cd7573bda3499085654b493a
❯ kubectl rollout restart -n media deployment prowlarr
```

```
driver name driver.longhorn.io not found in the list of registered CSI drivers
```

I solved this by restarting k3s:
```
systemctl restart k3s
```
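
To check that the driver is registered again afterwards:

```
❯ kubectl get csidrivers driver.longhorn.io
```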

## Migration from NFS to Longhorn

1. Delete the workload, and delete the PV and PVC using NFS.
2. Create Longhorn volumes as described below.
3. Copy NFS data from lewis.dmz to local disk.
4. Spin up a temporary pod and mount the Longhorn volume(s) in it:
```nix
{
  pods.testje.spec = {
    containers.testje = {
      image = "nginx";

      volumeMounts = [
        {
          name = "uploads";
          mountPath = "/hedgedoc/public/uploads";
        }
      ];
    };

    volumes = {
      uploads.persistentVolumeClaim.claimName = "hedgedoc-uploads";
    };
  };
}
```
5. Use `kubectl cp` to copy the data from the local disk to the pod (see the sketch after this list).
6. Delete the temporary pod.
7. Be sure to set the group ownership of the mount to the correct GID (see the sketch after this list).
8. Create the workload with updated volume mounts.
9. Delete the data from local disk.
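
A minimal sketch of steps 3, 5 and 7, assuming the `testje` pod above runs in the `media` namespace; the NFS export path and the GID are placeholders:

```
# Step 3: copy the data off the NFS share to local disk (export path is a placeholder)
rsync -a lewis.dmz:/export/hedgedoc-uploads/ /tmp/uploads/

# Step 5: copy the data into the temporary pod's Longhorn mount
kubectl cp /tmp/uploads/. media/testje:/hedgedoc/public/uploads

# Step 7: fix the group ownership of the mount (1000 is a placeholder GID)
kubectl exec -n media testje -- chown -R :1000 /hedgedoc/public/uploads
```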

## Creation of new Longhorn volumes

While it seems handy to use a K8s StorageClass for Longhorn, we do *not* want to use that.
If you use a StorageClass, a PV and Longhorn volume will be automatically provisioned.
These will have the name `pvc-<UID of PVC>`, where the UID of the PVC is random.
This makes it hard to restore a backup to a Longhorn volume with the correct name.

Instead, we want to manually create the Longhorn volumes via the web UI.
Then, we can create the PV and PVC as usual using our K8s provisioning tool (e.g. Kubectl/Kubenix).

Follow these actions to create a Volume:
1. Using the Longhorn web UI, create a new Longhorn volume, keeping the following in mind:
   - The size can be somewhat more than what we expect to reasonably use. We use storage-overprovisioning, so the total size of volumes can exceed the real disk size.
   - The number of replicas should be 2.
2. Enable the "backup-nfs" recurring job for the Longhorn volume.
3. Disable the "default" recurring job group for the Longhorn volume.
4. Create the PV, PVC and workload as usual (see the sketch below).
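
For step 4, a sketch of a statically provisioned PV/PVC pair with plain kubectl (the `hedgedoc-uploads` name, `media` namespace and 1Gi size are placeholders; `volumeHandle` must match the Longhorn volume name):

```
❯ kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: hedgedoc-uploads
spec:
  capacity:
    storage: 1Gi
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: driver.longhorn.io
    fsType: ext4
    volumeHandle: hedgedoc-uploads  # must equal the Longhorn volume name
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: hedgedoc-uploads
  namespace: media
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ""  # empty string avoids the default StorageClass
  volumeName: hedgedoc-uploads
  resources:
    requests:
      storage: 1Gi
EOF
```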

## Disaster recovery using Longhorn backups

Backing up Longhorn volumes is very easy, but restoring them is trickier.
We consider here the case where all our machines are wiped and all we have left is Longhorn backups.
To restore a backup, perform the following actions:
1. Restore the latest snapshot in the relevant Longhorn backup, keeping the following in mind:
   - The name should remain the same (i.e. the one chosen at Longhorn volume creation).
   - The number of replicas should be 2.
   - Disable recurring jobs.
2. Enable the "backup-nfs" recurring job for the Longhorn volume.
3. Disable the "default" recurring job group for the Longhorn volume.
4. Create the PV, PVC and workload as usual (the PV/PVC sketch above applies here too).

## Recovering Longhorn volumes without a Kubernetes cluster

1. Navigate to the Longhorn backupstore location (`/mnt/longhorn/persistent/longhorn-backup/backupstore/volumes` for us).
2. Find the directory for the desired volume: `ls **/**`.
3. Determine the last backup for the volume: `cat volume.cfg | jq '.LastBackupName'`.
4. Find the blocks and the order in which they form the volume: `cat backups/<name>.cfg | jq '.Blocks'`.
5. Extract each block using lz4: `lz4 -d blocks/XX/YY/XXYY.blk block`.
6. Append the blocks in order to form the file system: `cat block1 block2 block3 > volume.img` (see the sketch after this list).
7. Lastly, we need to fix the size of the image: simply append zeros to the end until the file is long enough that `fsck.ext4` no longer complains.
8. Mount the image: `mount -o loop volume.img /mnt/volume`.
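
A rough sketch of steps 4 through 7, assuming the backup cfg lists the blocks in volume order and names them by checksum (the `backup-XYZ` name, the `.Blocks[].BlockChecksum` field, the `sha256-` prefix and the `.Size` field are assumptions; inspect the cfg with `jq '.Blocks'` first):

```
cfg=backups/backup-XYZ.cfg  # placeholder name found in step 3

# Steps 4-6: decompress each block and append them in the order listed
jq -r '.Blocks[].BlockChecksum' "$cfg" | while read -r sum; do
  hex=${sum#sha256-}  # strip the assumed checksum prefix
  lz4 -dc "blocks/${hex:0:2}/${hex:2:2}/${hex}.blk" >> volume.img
done

# Step 7: pad the image with zeros to the full volume size (.Size is an
# assumed field name) and check the result read-only
truncate -s "$(jq -r '.Size' "$cfg")" volume.img
fsck.ext4 -n volume.img
```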