Problem

The CyberPower UPS connected to the Proxmox server offered only basic shutdown functionality: during a power event, VMs were left stuck or were force-killed, losing in-progress work. Proxmox's default UPS integration doesn't distinguish between workloads that should be suspended and those that should be fully shut down, and it doesn't handle a VM or container that refuses to stop gracefully.

Approach

Built a custom Bash script that reads container and VM tags to decide whether to suspend or shut down each one when triggered by the UPS. The first pass walks everything that is running, suspending where a tag asks for it and shutting down the rest; a second pass then force-stops anything still running. Because the behavior is driven by tags, it is configured directly on each VM or container, not in a separate config file.
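
A minimal sketch of the two-pass logic described above, not the original script. The tag name ups-suspend, the 120-second grace period, the DRY_RUN switch and the --run argument are assumptions made for illustration. It also assumes that VMs are suspended to disk (qm suspend with --todisk), so that a suspended VM no longer counts as running when the second pass looks for stragglers; the write-up does not say which suspend mode is used.

```bash
#!/usr/bin/env bash
# Sketch of a tag-driven UPS shutdown for Proxmox guests.
# Assumptions (not taken from the original script): the tag name
# "ups-suspend", the 120-second grace period, and the DRY_RUN switch.

SUSPEND_TAG="${SUSPEND_TAG:-ups-suspend}"
GRACE_SECONDS="${GRACE_SECONDS:-120}"
DRY_RUN="${DRY_RUN:-0}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [[ "$DRY_RUN" == "1" ]]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Read `qm config` / `pct config` output on stdin and print the tag list.
get_tags() {
    sed -n 's/^tags:[[:space:]]*//p'
}

# Print "suspend" if the semicolon-separated tag list in $1 contains
# SUSPEND_TAG, otherwise print "shutdown".
decide_action() {
    local tag tags=()
    IFS=';' read -ra tags <<< "$1"
    for tag in "${tags[@]}"; do
        if [[ "$tag" == "$SUSPEND_TAG" ]]; then
            echo suspend
            return
        fi
    done
    echo shutdown
}

# Print the IDs of running guests for one tool (qm for VMs, pct for containers).
running_ids() {
    case "$1" in
        qm)  qm list  | awk 'NR > 1 && $3 == "running" { print $1 }' ;;
        pct) pct list | awk 'NR > 1 && $2 == "running" { print $1 }' ;;
    esac
}

# First pass: suspend tagged guests, request a clean shutdown for the rest.
first_pass() {
    local tool id action
    for tool in qm pct; do
        for id in $(running_ids "$tool"); do
            action=$(decide_action "$("$tool" config "$id" | get_tags)")
            if [[ "$action" == "suspend" && "$tool" == "qm" ]]; then
                # Assumed mode: save VM state to disk so the VM stops running.
                run qm suspend "$id" --todisk 1
            elif [[ "$action" == "suspend" ]]; then
                run pct suspend "$id"
            else
                run "$tool" shutdown "$id" --timeout "$GRACE_SECONDS"
            fi
        done
    done
}

# Second pass: force-stop anything that is still running.
second_pass() {
    local tool id
    for tool in qm pct; do
        for id in $(running_ids "$tool"); do
            run "$tool" stop "$id"
        done
    done
}

# Only act when explicitly asked, so the file can be sourced safely.
if [[ "${1:-}" == "--run" ]]; then
    first_pass
    second_pass
fi
```

Running it with DRY_RUN=1 and --run prints the commands it would issue without touching any VM or container, which is a low-risk way to check the tags before wiring it into the UPS trigger.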

Outcome

Tested successfully during two real power events (storms and local maintenance). All containers and VMs shut down cleanly in sequence, and everything booted back up without issues when power was restored. No lost work, no stuck processes, no manual cleanup required after either event.

Tech Stack

Bash, Proxmox, UPS