For vSAN, balancing means spreading resource usage (in this case, disk utilization) evenly across the disks that make up the group. A rebalance can be triggered automatically or manually. In either case, some performance degradation can occur while data is being moved. Rebalancing can take up to 24 hours, so be patient once it's triggered.
Post a comment if you’d like me to go further in depth within any of the scenarios listed.
Automatic:
- When any capacity disk reaches the 80% utilization threshold, vSAN initiates a rebalance of the disks. The rebalance continues until all of the disks within that specific Disk Group are under the 80% threshold.
- What can potentially cause this:
  - Hardware failures/removals
  - A host being placed in Maintenance Mode
  - An incorrect policy change
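As a rough illustration of the 80% trigger described above, the check can be sketched in a few lines of Python. This is not vSAN's actual algorithm, only a minimal model of the rule: the function names, the disk names, and the sample utilization figures are all hypothetical, and treating "reaches 80%" as "at or above 80%" is an assumption.

```python
# Minimal model of the automatic rebalance trigger (illustration only,
# not vSAN's real implementation).
REBALANCE_THRESHOLD = 0.80  # 80% utilization, per the description above


def disks_over_threshold(utilization, threshold=REBALANCE_THRESHOLD):
    """Return the names of capacity disks at or above the threshold."""
    return sorted(name for name, used in utilization.items() if used >= threshold)


def needs_rebalance(utilization, threshold=REBALANCE_THRESHOLD):
    """A rebalance is expected as soon as any single disk crosses the threshold."""
    return bool(disks_over_threshold(utilization, threshold))


# Hypothetical per-disk utilization, expressed as fractions of capacity.
sample = {"disk-a": 0.62, "disk-b": 0.83, "disk-c": 0.41}

print(disks_over_threshold(sample))  # ['disk-b']
print(needs_rebalance(sample))       # True
```

In this model the rebalance would be considered finished once `disks_over_threshold` returns an empty list.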
Manual:
- When a host is placed in maintenance mode with the ‘Evacuate all data to other hosts’ option selected, a rebalance procedure occurs.
How to Initiate:
- Using vSphere Web Client:
  - Within the Host Cluster health > vSAN Disk Balance > Rebalance Disks
- Using RVC (Ruby vSphere Console):
  - vsan.health.cluster_rebalance <host cluster name>
How to Monitor:
- Using vSphere Web Client:
  - Select your ‘Host Cluster’ > ‘Monitor’ tab > ‘Virtual SAN’ sub tab > ‘Health’ on the left side panel > ‘Rebalance Disks’ button
- Using RVC:
  - vsan.check_limits
    - Verifies the disk utilization in the cluster.
  - vsan.whatif_host_failures
    - Checks the capacity of each host and verifies whether a node failure will cause data loss.
  - vsan.resync_dashboard
    - Shows current rebuild status.
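If you run the RVC monitoring commands listed above regularly, a small helper can print them with your cluster path filled in, ready to paste into an RVC session. This is only a convenience sketch: the helper name and the example inventory path are hypothetical, and it assumes each command is given the cluster path as its argument.

```python
# Build the RVC monitoring command lines for one cluster (illustration only).
MONITOR_COMMANDS = [
    "vsan.check_limits",          # disk utilization in the cluster
    "vsan.whatif_host_failures",  # capacity impact of a host failure
    "vsan.resync_dashboard",      # current rebuild/resync status
]


def rvc_monitor_commands(cluster_path):
    """Return one RVC command line per monitoring command."""
    return [f"{command} {cluster_path}" for command in MONITOR_COMMANDS]


# Hypothetical inventory path; substitute the path to your own cluster.
for line in rvc_monitor_commands("/localhost/Datacenter/computers/Cluster01"):
    print(line)
```

The script only formats text; it does not connect to vCenter or run anything against the cluster.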