
Cluster Management

This guide covers the full lifecycle of a KubeAuto cluster: starting and stopping it, checking its health, and recovering from common problems.


Lifecycle Commands

| Operation | Command | When to Use |
|---|---|---|
| Start / Resume | vagrant up | Begin a session |
| Suspend | vagrant suspend | Pause for the day |
| Halt | vagrant halt | Clean shutdown before reboot |
| Destroy | vagrant destroy -f | Full reset, delete everything |
| Re-provision | vagrant provision <node> | Re-run setup scripts on a node |

Daily Development

# Morning — resume
vagrant up

# Work with your cluster...
vagrant ssh controlplane
kubectl get nodes

# Evening — save state
vagrant suspend

Before Host Reboot

vagrant halt
# Reboot your machine
vagrant up

Complete Reset

vagrant destroy -f
rm -f join-command.sh
vagrant up

Re-provisioning Nodes

If a node's provisioning failed or you want to re-run the setup scripts:

vagrant provision controlplane
vagrant provision node01

Not Idempotent (v0.1.0)

The current provisioning scripts are not fully idempotent. Re-provisioning a fully configured node may fail because kubeadm init detects an existing cluster (on a worker, kubeadm join is likely to fail its pre-flight checks in the same way). For a clean re-setup, destroy and recreate the node instead:

vagrant destroy node01 -f
vagrant up node01
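
The failure described above comes from running kubeadm a second time on a machine that already holds cluster state. As a minimal sketch of how a provisioning script could guard against that, the helpers below check for the files kubeadm writes on success. The function names and the `KUBE_DIR` override are hypothetical and are not part of KubeAuto v0.1.0; the real scripts may be structured differently.

```shell
# Hypothetical idempotency guards for a provisioning script (not part of KubeAuto).
# KUBE_DIR can be overridden so the checks can be exercised without a real cluster.

# Succeeds if kubeadm init has already completed on this machine:
# kubeadm init writes admin.conf into /etc/kubernetes.
control_plane_initialized() {
  local kube_dir="${KUBE_DIR:-/etc/kubernetes}"
  [ -f "${kube_dir}/admin.conf" ]
}

# Succeeds if this machine has already been set up as a cluster node:
# both kubeadm init and kubeadm join write kubelet.conf into /etc/kubernetes.
node_joined() {
  local kube_dir="${KUBE_DIR:-/etc/kubernetes}"
  [ -f "${kube_dir}/kubelet.conf" ]
}

# Example use inside a provisioning script (commented out so that sourcing
# this sketch has no side effects):
#
# if control_plane_initialized; then
#   echo "Control plane already initialized; skipping kubeadm init"
# else
#   : # run kubeadm init here with the flags the real script uses
# fi
```

A guard like this only skips the kubeadm step; it does not repair a node whose earlier provisioning stopped halfway, so destroying and recreating the node remains the safer route.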


Monitoring Cluster Health

Quick Health Check

vagrant ssh controlplane

# All nodes Ready?
kubectl get nodes

# All system pods Running?
kubectl get pods -n kube-system

# Calico healthy?
kubectl get tigerastatus
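
The first two checks above can be reduced to "print only what is wrong". The sketch below filters the default kubectl table output; the function names are hypothetical, and it assumes the standard column layout (NAME STATUS ... for nodes, NAME READY STATUS ... for pods).

```shell
# Hypothetical filters for the health checks above; run inside the controlplane VM.

# Reads `kubectl get nodes --no-headers` on stdin and prints the names of nodes
# whose STATUS column is not exactly "Ready". A cordoned node
# ("Ready,SchedulingDisabled") is therefore reported as well.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Reads `kubectl get pods --no-headers` on stdin and prints the names of pods
# whose STATUS column is neither "Running" nor "Completed".
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage (commented out so that sourcing this sketch runs nothing):
# kubectl get nodes --no-headers | not_ready_nodes
# kubectl get pods -n kube-system --no-headers | unhealthy_pods
```

Empty output from both pipelines means every node is Ready and every system pod is Running or Completed.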

Detailed Diagnostics

# Node conditions and events
kubectl describe node controlplane

# Recent cluster events
kubectl get events --sort-by=.metadata.creationTimestamp

# kubelet logs on a node
sudo journalctl -u kubelet --no-pager -n 50

Recovering from Common Issues

Nodes Show NotReady After Resume

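Nodes can report NotReady for a short time after a resume. Wait 30–60 seconds; if a node still has not recovered, restart the container runtime and kubelet on it:

# From the host: SSH into the affected node
vagrant ssh node01
# Inside the node: restart containerd and kubelet
sudo systemctl restart containerd kubelet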
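
The wait-then-restart step above can be sketched as a single host-side helper. `recover_node` is a hypothetical name, and the sketch assumes that the Vagrant machine name matches the Kubernetes node name and that kubectl works for the default user in a non-interactive `vagrant ssh controlplane -c` session.

```shell
# Hypothetical helper; run from the host, in the directory containing the Vagrantfile.
# Usage: recover_node <node> [timeout]    e.g. recover_node node01 60s
recover_node() {
  local node="$1" timeout="${2:-60s}"

  # kubectl wait blocks until the node reports Ready or the timeout expires.
  if vagrant ssh controlplane -c \
      "kubectl wait --for=condition=Ready node/${node} --timeout=${timeout}"; then
    echo "${node} is Ready"
    return 0
  fi

  # Still NotReady: restart the container runtime and kubelet on that node.
  echo "${node} not Ready after ${timeout}; restarting containerd and kubelet" >&2
  vagrant ssh "${node}" -c "sudo systemctl restart containerd kubelet"
}
```

The helper restarts services at most once and does not re-check the node afterwards; run `kubectl get nodes` again to confirm the result.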

Recreating a Single Worker

# From host
vagrant destroy node02 -f
vagrant up node02

Full Cluster Reset

vagrant destroy -f
rm -f join-command.sh
vagrant up

This is the nuclear option: it deletes every VM and all cluster state, so use it only when the cluster is unrecoverable.


Useful Maintenance Commands

# Check Vagrant VM status
vagrant status

# List all VirtualBox VMs
VBoxManage list vms

# Check disk usage of VMs
VBoxManage list hdds

# Generate a new join command if the token has expired
# (run the kubeadm command inside the controlplane VM)
vagrant ssh controlplane
sudo kubeadm token create --print-join-command > /vagrant/join-command.sh
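
As a sketch, the token refresh above can be run from the host in one step and followed by a sanity check on the generated file. The function names are hypothetical. The sketch assumes that /vagrant inside the VM is the default Vagrant synced folder for the project directory, so that /vagrant/join-command.sh appears on the host as join-command.sh.

```shell
# Hypothetical helpers; run from the host, in the directory containing the Vagrantfile.

# Succeeds if the file is non-empty and contains a kubeadm join command with a token.
join_command_looks_valid() {
  local file="${1:-join-command.sh}"
  [ -s "$file" ] && grep -q 'kubeadm join .*--token ' "$file"
}

# Regenerates the join command on the control plane, then checks the result
# through the synced folder.
refresh_join_command() {
  vagrant ssh controlplane -c \
    "sudo kubeadm token create --print-join-command > /vagrant/join-command.sh" \
    && join_command_looks_valid join-command.sh
}

# Usage (commented out so that sourcing this sketch runs nothing):
# refresh_join_command && vagrant up node02
```

The check only confirms that the file has the expected shape; it does not verify that the token is still accepted by the control plane.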