How do you handle Proxmox clusters when you only have 1 or 2 servers?
I technically have 3 servers, but I keep one offline because I don’t need it 24/7; there’s no point wasting power on a server I don’t need.
I believe I read somewhere that you can force Proxmox to accept a lower quorum number, but it isn’t recommended. Has anyone done this, and if so, have you run into any issues with it?
My main issue is that I want my VMs to start no matter what. For example, I had a power outage. When the servers came back online, instead of starting, they waited for the quorum to reach 3 votes (it would never reach 3 because the third server wasn’t turned on), so they just waited forever until I got home and ran
pvecm expected 2
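If you only ever run two voting nodes, corosync’s votequorum layer also has a built-in two-node mode that keeps the cluster quorate with a single vote, so you don’t have to run `pvecm expected` by hand after every outage. A minimal sketch of the relevant section of `/etc/pve/corosync.conf` (this is the upstream corosync option, not a Proxmox-specific recommendation, and the docs warn it can allow split brain if both nodes run while disconnected):

```
quorum {
  provider: corosync_votequorum
  two_node: 1          # cluster stays quorate with one of two votes
  wait_for_all: 0      # two_node implies wait_for_all: 1 by default,
                       # which would still block startup after a full power-off
}
```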
- PrivateButts ( @PrivateButts@geddit.social ) 1•1 year ago
You’ll need a QDevice to keep consensus. The cluster manager wiki article covers how to set one up and some drawbacks of QDevices. You should be able to run it on a low-power device like a Pi to keep the cluster going.
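For reference, the setup the wiki describes boils down to a few commands (the IP here is a placeholder for the Pi’s address):

```shell
# On the Pi (the external vote daemon):
apt install corosync-qnetd

# On every Proxmox node in the cluster:
apt install corosync-qdevice

# From any one cluster node, pointing at the Pi:
pvecm qdevice setup 192.168.1.50

# Verify the extra vote shows up:
pvecm status
```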
- Irisos ( @Irisos@lemmy.umainfo.live ) 1•1 year ago
If you are not using any HA features and only put the servers into the same cluster for ease of management, you could use the same command but with a value of 1.
The reason quorum exists is to prevent a server from arbitrarily failing over VMs when it believes the other node(s) are down, which would create a split-brain situation.
But if that risk doesn’t exist to begin with, neither does the need for quorum.
- towerful ( @towerful@beehaw.org ) 1•1 year ago
I have 2 nodes and a raspberry pi as a qdevice.
I can still power off 1 node (so I have 1 node and an rpi) if I want to.
To avoid split brain, if a node can see the qdevice then it is part of the cluster. If it can’t, then the node is in a degraded state.
QDevices are only recommended in some scenarios, which I can’t remember off the top of my head.

With 2 nodes, you can’t set up a Ceph cluster (well, I don’t think you can).
But you can set up High Availability and use ZFS snapshot replication on a 5-minute interval (so if your VM’s host goes down, the other host can start it from a potentially outdated snapshot).

This worked for my project, as I could have a few stateless services that could bounce between nodes, and I had a Postgres VM with streaming replication (Postgres, not ZFS) and failover, which led to a decently fault-tolerant setup.
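As a sketch, the replication-plus-HA setup described above can also be done from the CLI (VM ID 100 and node name “node2” are placeholders; the same can be configured in the GUI under Replication and HA):

```shell
# Replicate VM 100's disks to node2 every 5 minutes (requires ZFS storage on both nodes)
pvesr create-local-job 100-0 node2 --schedule "*/5"

# Let the HA manager start/relocate the VM if its node goes down
ha-manager add vm:100 --state started
```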
I will have to look into the QDevice. I do have an old Pi 3 set up as a software-defined radio; I might be able to also set it up as a QDevice.
https://pve.proxmox.com/wiki/Cluster_Manager#_corosync_external_vote_support
Looking at the documentation, it isn’t recommended to use a QDevice in a cluster with an odd number of nodes, which I guess I technically have.
If the QNet daemon itself fails, no other node may fail or the cluster immediately loses quorum. For example, in a cluster with 15 nodes, 7 could fail before the cluster becomes inquorate. But, if a QDevice is configured here and it itself fails, no single node of the 15 may fail. The QDevice acts almost as a single point of failure in this case.
But it seems to be more of an issue in clusters with many nodes. In my situation I don’t think it’s a big deal, because if the QDevice fails while my third server is offline, I’m in the same situation I’m in now.
Just out of curiosity, do you back up your Pi at all? I’m not sure what the recovery process is if the QDevice fails; how easy is it to replace or set up again?
- MangoPenguin ( @MangoPenguin@lemmy.blahaj.zone ) 1•1 year ago
AFAIK forcing it to a lower number is fine if you’re not doing HA. I remember reading something along those lines on a forum, but I could be remembering wrong.
If you’re not using Ceph or HA, then I don’t think there would be any negative effects from not having all the servers in the cluster ready.
Oh good, I am not using any of those at least not at the moment.