I finally found the time to look at my Talos clusters and noticed that one of the nodes did not seem to have correctly applied a patch that removed an old service. I did not bother pinpointing why and just reset the node and reapplied its backed-up machine config. For some reason, though, the node (a control plane node) would not rejoin the etcd cluster because the cluster was reported as unhealthy. Using the command,
talosctl --talosconfig <config> --nodes <node ip> etcd members
I found that the node I had reset was somehow still in the list of etcd members. So I removed it using,
talosctl --talosconfig <config> --nodes <node ip> etcd remove-member <member id>
with the <member id> taken from the first command. Afterwards, I rebooted the node and it successfully rejoined the cluster.
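For reference, the steps above could be wrapped into a small shell helper, roughly like the sketch below. This is not a tested recovery procedure, just the same three talosctl calls in one place. The function names (run, remove_stale_member), the argument order and the DRY_RUN switch are my own inventions for illustration. It also assumes the etcd commands are pointed at a control plane node that is still a healthy etcd member, while the reboot is sent to the node that was reset.

```shell
#!/bin/sh
# Hypothetical helper, names and argument order are illustrative only.

# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: remove_stale_member <talosconfig> <healthy node ip> <reset node ip> <member id>
# The member id is the one shown by "etcd members" for the node that was reset.
remove_stale_member() {
    talosconfig=$1
    healthy_node=$2   # assumption: a control plane node still healthy in etcd
    reset_node=$3     # the node that was reset and fails to join
    member_id=$4

    # Show the current member list for a last sanity check.
    run talosctl --talosconfig "$talosconfig" --nodes "$healthy_node" etcd members || return 1

    # Drop the stale entry left behind by the reset node.
    run talosctl --talosconfig "$talosconfig" --nodes "$healthy_node" etcd remove-member "$member_id" || return 1

    # Reboot the reset node so it tries to join again.
    run talosctl --talosconfig "$talosconfig" --nodes "$reset_node" reboot
}
```

Running it with DRY_RUN=1 first, for example `DRY_RUN=1 remove_stale_member ./talosconfig 10.0.0.2 10.0.0.3 <member id>` (the IPs are placeholders), only prints the commands, which is a cheap way to double check the member id before anything is actually removed.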