
Shut down all virtual machines gracefully on ESXi v6.x

This article covers several ways to gracefully shut down all of your virtual machines on VMware ESXi v6.7 (v6.x).

Overview

When shutting down the ESXi host, you may find that virtual machines are not gracefully shutting down, and the shutdown procedures of some services are not being respected.

We will go over a few different options for you to try, starting with the easiest and ending with a more involved, but sure-fire, way to gracefully shut down your virtual machines.

Getting started…

Prepare your environment

Before you continue, make sure you have the following ready:

  • Linux machine with network access to the ESXi host (instructions will be for Debian/Ubuntu)
  • Access to ESXi GUI

Enable ESXi shell / SSH

  1. Go to your ESXi interface
  2. Go to Navigator section > Manage 
  3. Click the Services tab
  4. In Services section, select TSM from the list
  5. Click Actions, then select Start to enable the ESXi shell
  6. In Services section, select TSM-SSH from the list
  7. Click Actions, then select Start to enable SSH for your ESXi host

Install VMware Tools

To install VMware Tools on your virtual machines, see the official VMware instructions and then come back here.

Ensure VMware Tools is installed on your virtual machines so ESXi can collect the virtual machine data necessary for commands later on.

To test if VMware Tools is successfully installed on Linux, run:

$ vmware-toolbox-cmd -v
11.0.5.17716 (build-15389592)

Graceful shutdown methods

There are a number of ways we can go about shutting down our virtual machines.

I have broken this article into a few sections. Thumb through and find the method you feel is suitable for your purposes.

PowerShell Method

If you are using ESXi under the free license, skip to the next method. The PowerShell commands are easy to follow, but the free license makes them "read-only", so the shutdown command is not allowed.

Run this PowerShell script on a machine outside your ESXi host. If you run it on one of the host's own virtual machines, that machine may shut itself down before the script finishes executing.

If you are a bit of a programmer, you could use PowerShell to exclude the machine running the commands from the list, but I will not be diving into that.

Install PowerShell on Ubuntu 18.04

For the latest instructions for other Linux distributions, see the official Microsoft instructions and then come back here.

# Download the Microsoft repository GPG keys
$ wget -q https://packages.microsoft.com/config/ubuntu/18.04/packages-microsoft-prod.deb

# Register the Microsoft repository GPG keys
$ sudo dpkg -i packages-microsoft-prod.deb

# Update the list of products
$ sudo apt-get update

# Enable the "universe" repositories
$ sudo add-apt-repository universe

# Install PowerShell
$ sudo apt-get install -y powershell

# Enter PowerShell
$ pwsh

# Install VMware PowerCLI module
PS> Install-Module -Name VMware.PowerCLI

Shut down machines with PowerShell

# Connect to ESXi host (use your own host IP)
PS> Connect-VIServer 10.0.0.2

# Store list of virtual machines that are powered on
PS> $vms = Get-VM | Where-Object { $_.PowerState -eq "PoweredOn" }

# Shut down all virtual machines (asks for confirmation per VM)
PS> $vms | Shutdown-VMGuest

# Alternatively, shut down without being prompted per VM
PS> $vms | Shutdown-VMGuest -Confirm:$false

ESXCLI Method

While the PowerShell method requires a licensed instance of ESXi, in the free version you can still use the CLI for ESX.

# SSH into your ESXi host (substitute your own user and host IP)
$ ssh <user>@<esxi-host-ip>

# List all virtual machines
$ esxcli vm process list

# Get the World IDs of all virtual machines (ESXi uses an older version of grep, so some RegEx is not supported)
$ esxcli vm process list | grep 'World ID:' | grep -o '[0-9]*'

# Shut down a single virtual machine, gracefully (use your own World ID)
$ esxcli vm process kill --type=soft --world-id=12345678

# Shut down all virtual machines
$ for i in $(esxcli vm process list | grep 'World ID:' | grep -o '[0-9]*'); do esxcli vm process kill --type=soft --world-id=$i; done;
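
Before running the loop above against a live host, it can help to preview what it would do. The sketch below is a hypothetical dry-run helper (the function names extract_world_ids and print_soft_kill_commands are my own, not part of ESXi). It reads the output of esxcli vm process list on standard input and prints the kill commands instead of executing them. It assumes the output contains one "World ID: <number>" line per running virtual machine.

```shell
#!/bin/sh
# Hypothetical dry-run helpers; the function names are illustrative only.

# Read `esxcli vm process list` output on stdin and print one World ID per line.
# Anchoring on "World ID:" skips the "VMX Cartel ID:" and "Process ID:" lines.
extract_world_ids() {
  sed -n 's/^[[:space:]]*World ID:[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/\1/p'
}

# Print (do not run) the soft-kill command for every World ID found on stdin.
print_soft_kill_commands() {
  extract_world_ids | while read -r world_id; do
    echo "esxcli vm process kill --type=soft --world-id=$world_id"
  done
}
```

On the ESXi host, esxcli vm process list | print_soft_kill_commands would list the commands for review; nothing is shut down until you run them yourself.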

Ansible Method (sure-fire way)

You have tried everything above, but to no avail. Despite your efforts, your virtual machines do not seem to be shutting down gracefully.

On your VM, when you use shutdown -h now, does it shut down gracefully?

If the answer is no, then you need to diagnose why your virtual machine is not shutting down as expected, because at the very least, that command should work.

Assuming the answer is yes, then your solution may be Ansible.

What is Ansible?

Ansible is an agentless automation tool. It can handle basic actions, like executing commands on multiple hosts (as we will do here), as well as incredibly intricate orchestration that we will not be touching with a 10-foot pole in this article.

Why Ansible?

Ansible will let us send the shutdown command to every virtual machine via SSH.

Before we dive into just running commands, it is important you understand how this will work.

How it works

A host machine will have Ansible installed. This machine will send out the shutdown commands to each virtual machine you configure within an Ansible hosts file.

The command will be configured in what is known as a “playbook”, which is just a list of operations to perform for each target host, or virtual machine.

Thankfully I have already created the playbook for you, so all we have to do now is get set up.

Install Ansible on your host machine (Ubuntu 18.04)

For the latest instructions for other Linux distributions, see the official Ansible instructions and then come back here.

We will want to designate a host machine that will distribute the shutdown commands. You can do this on a virtual machine within the ESXi host, or using an external device.

This host must have SSH (port 22) access to all of your virtual machines you want to shut down.

# SSH into the intended Ansible host (substitute your own user and host IP)
$ ssh <user>@<ansible-host-ip>

# Update apt
$ sudo apt update

# Install dependencies
$ sudo apt install software-properties-common

# Add Ansible repository, and update apt afterwards
$ sudo apt-add-repository --yes --update ppa:ansible/ansible

# Install Ansible
$ sudo apt install ansible

Create your Ansible hosts file

At the bottom of your Ansible hosts file (/etc/ansible/hosts), append your target VMs. I recommend using the format below, but you can also just list the IPs.

[vms]
avirtualmachine		ansible_host=192.168.1.2
anothermachine		ansible_host=192.168.1.3
thirdmachine		ansible_host=192.168.1.4

Note: A host name cannot have spaces, and does not need to match the actual hostname of the machine.
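
If you have many virtual machines, typing the [vms] group by hand gets tedious. As a minimal sketch, assuming you keep a plain list of "name ip" pairs (a format I am making up for illustration), the hypothetical make_vms_inventory function below prints a group in the format shown above.

```shell
#!/bin/sh
# Hypothetical helper: turn "name ip" pairs on stdin into a [vms] inventory group.
make_vms_inventory() {
  echo "[vms]"
  while read -r name ip; do
    # Skip blank lines and comment lines in the input list.
    case "$name" in
      ''|'#'*) continue ;;
    esac
    printf '%s\tansible_host=%s\n' "$name" "$ip"
  done
}
```

For example, make_vms_inventory < vm-list.txt | sudo tee -a /etc/ansible/hosts would append the group to your hosts file, where vm-list.txt is a hypothetical file name.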

FYI: You can actually dynamically update this list if you use your own DNS server. This is advanced, but if you wanted to take a stab at it, read more here.

Create your playbook

# Create playbook folder
$ sudo mkdir /etc/ansible/playbook

# Create playbook file
$ sudo touch /etc/ansible/playbook/shutdown.yml

Populate your /etc/ansible/playbook/shutdown.yml playbook with the instructions below:

---
- hosts: vms
  remote_user: shutdown
#  become: yes #(optional)
#  become_user: root #(optional)
#  become_method: sudo #(optional)
  gather_facts: no
  tasks:
    - name: shutdown
      command: sudo /sbin/shutdown -h now
      async: 1
      poll: 0
      ignore_errors: true
    - name: disconnected
      wait_for:
        host: '{{ inventory_hostname }}'
        port: 22
        state: stopped
        timeout: 20
        delay: 5
      delegate_to: localhost

The --- line at the top marks the start of a YAML document. It has additional meaning, but that is something you can research on your own time.

hosts: vms refers to the section header you used in your Ansible hosts file. Make sure that it matches.

gather_facts controls an automatic first step in which Ansible runs some commands on the target virtual machines and collects system data (e.g. the OS and its version) that you can then use as variables in your playbook. We set it to no because we do not need any of these variables.

become describes what user account to utilize after SSHing with the remote_user account. In my commented example, it uses sudo and the root account. You will see why it is commented out.

The shutdown task runs the shutdown command through sudo. You will see why. It is also async (as in asynchronous), because the SSH connection drops as the machine goes down, and Ansible would otherwise treat the dropped connection as a failed playbook. With async set and poll: 0, Ansible starts the command in the background on the target and continues with the rest without waiting for the result. ignore_errors is a further safeguard: if the task does report an error, the play carries on anyway.

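The disconnected task first sleeps for 5 seconds (delay), then uses wait_for to wait until port 22 on host {{ inventory_hostname }} stops accepting connections, timing out after 20 seconds. inventory_hostname is a default variable exposed by Ansible that holds the current target host as it is named in your hosts file. If that name is only an alias that does not resolve from the Ansible host, use {{ ansible_host }} (the address set in the hosts file) instead. The task uses delegate_to: localhost because the check has to run on the Ansible host; otherwise Ansible would attempt to monitor the port from the remote host that is shutting down, which will not work.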
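
To make the disconnected task less abstract, here is a rough shell approximation of what wait_for with state: stopped does: wait for the delay, then probe the port until it stops answering or the timeout runs out. This is a sketch of the idea, not how Ansible implements it. The function names are hypothetical, and port_open assumes a netcat that supports -z and -w (such as the OpenBSD netcat found on Ubuntu).

```shell
#!/bin/sh
# Rough shell approximation of the "disconnected" task; names are hypothetical.

# Return 0 if a TCP connection to host:port succeeds within one second.
# Assumes a netcat that supports -z (probe only) and -w (timeout in seconds).
port_open() {
  nc -z -w 1 "$1" "$2" >/dev/null 2>&1
}

# wait_for_port_closed HOST PORT TIMEOUT DELAY
# Sleep DELAY seconds, then probe about once per second until the port stops
# answering (return 0) or roughly TIMEOUT seconds have passed (return 1).
wait_for_port_closed() {
  host=$1 port=$2 timeout=$3 delay=$4
  sleep "$delay"
  elapsed=$delay
  while [ "$elapsed" -lt "$timeout" ]; do
    if ! port_open "$host" "$port"; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}
```

Calling wait_for_port_closed 192.168.1.2 22 20 5 mirrors the values used in the playbook for the first machine in the example hosts file.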

Run your Ansible playbook

# Shutdown all virtual machines
$ sudo ansible-playbook /etc/ansible/playbook/shutdown.yml
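
Because this playbook powers machines off, it may be worth confirming what it will touch before running it for real. The sketch below wraps two ansible-playbook options, --syntax-check and --list-hosts, neither of which executes any task. The wrapper function name preflight is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: validate a playbook and show its targets without running it.
preflight() {
  playbook=$1
  if ! command -v ansible-playbook >/dev/null 2>&1; then
    echo "ansible-playbook not found in PATH" >&2
    return 1
  fi
  # Parse the playbook only; no task is executed.
  ansible-playbook --syntax-check "$playbook" || return 1
  # Print the hosts each play would target; no task is executed.
  ansible-playbook --list-hosts "$playbook"
}
```

Running preflight /etc/ansible/playbook/shutdown.yml should list the machines from your [vms] group; if a machine is missing, fix the hosts file before running the playbook.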

Preparing your virtual machines

There is one hiccup in this playbook, which is that SSHing into every box will require a password prompt for the account you are SSHing with, thereby making this quite a tedious process.

You have a few options:

  1. Enter the password every time (this works, but it is tedious)
  2. Recommended: Create an SSH key for a non-root account so that the Ansible host can connect without password prompt

You might now ask, “If I am using a non-root account, then how will I use the sudo shutdown -h now command defined in the shutdown task?”

The Solution

On every VM you wish to control, create an account that has sudoer privileges on the shutdown command.

When you edit the sudoers file, make sure you do not make syntax errors, because you can potentially lock yourself out of sudo control from any account. There are ways to correct this, but those will not be covered in this article.

So as to avoid the issue above, you will see that we will be switching to a root account. That way, if you corrupt your sudoers file, you can fix it without using sudo which requires parsing a valid sudoers file.

On every virtual machine you want to control, do the following:

# SSH into your virtual machine (substitute your own user and VM IP)
$ ssh <user>@<vm-ip>

# Switch to root user
$ sudo su

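# Add a user called "shutdown" with a home directory (needed to store its SSH key)
$ useradd -m shutdown

# Set a password for the new user (ssh-copy-id will ask for it once)
$ passwd shutdown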

# Edit the sudoers file
$ visudo

At the end of the /etc/sudoers file, append:

shutdown ALL=(ALL:ALL) NOPASSWD:/sbin/shutdown

This grants the “shutdown” user you created access to the shutdown command, without having to type a password.

Let us confirm the /etc/sudoers file is okay:

$ sudo visudo -c

If any error is thrown, fix it before continuing. This is critical. A syntax error in your sudoers file can lock you out of sudo access.
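
If you would rather check the line before it goes anywhere near /etc/sudoers, one option is to syntax-check it in a scratch file first. This is a minimal sketch, assuming visudo is installed and using its -c (check only) and -f (alternate file) options. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the sudoers rule from this article for a given account.
shutdown_sudoers_rule() {
  printf '%s ALL=(ALL:ALL) NOPASSWD:/sbin/shutdown\n' "$1"
}

# Write the rule to a scratch file and syntax-check that file with visudo.
# Prints the scratch file path on success; /etc/sudoers is never touched.
stage_sudoers_rule() {
  if ! command -v visudo >/dev/null 2>&1; then
    echo "visudo not found in PATH" >&2
    return 1
  fi
  staged=$(mktemp) || return 1
  shutdown_sudoers_rule "$1" > "$staged"
  if ! visudo -c -f "$staged" >/dev/null; then
    rm -f "$staged"
    return 1
  fi
  echo "$staged"
}
```

Running stage_sudoers_rule shutdown prints the path of a file whose single line passed the syntax check; that line can then be pasted at the end of the file that visudo opens.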

# Exit root account
$ exit

# End shell connection
$ logout

On your Ansible host, create a keypair so that you can connect to your virtual machines without password prompt (allowing the playbook to run).

# Connect to Ansible host (substitute your own user and host IP)
$ ssh <user>@<ansible-host-ip>

# Generate keypair (name: id_rsa)
$ ssh-keygen

# Copy the generated key to each virtual machine (substitute the account and IP of each VM)
$ sudo ssh-copy-id -i ~/.ssh/id_rsa <user>@<first-vm-ip>
$ sudo ssh-copy-id -i ~/.ssh/id_rsa <user>@<second-vm-ip>
$ sudo ssh-copy-id -i ~/.ssh/id_rsa <user>@<third-vm-ip>
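
With more than a handful of virtual machines, repeating ssh-copy-id by hand is error-prone. The sketch below assumes the hosts file format recommended earlier (an ansible_host=<ip> entry on each line) and an account named shutdown on every VM. inventory_ips and print_copy_commands are hypothetical names. The commands are printed rather than run, so you can review them first.

```shell
#!/bin/sh
# Hypothetical helpers for copying one public key to every VM in the inventory.

# Read an Ansible INI inventory on stdin and print each ansible_host value.
inventory_ips() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^ansible_host=/) { sub(/^ansible_host=/, "", $i); print $i }
  }'
}

# Print (do not run) one ssh-copy-id command per address, for the given account.
print_copy_commands() {
  user=$1
  inventory_ips | while read -r ip; do
    echo "ssh-copy-id -i ~/.ssh/id_rsa $user@$ip"
  done
}
```

For example, print_copy_commands shutdown < /etc/ansible/hosts lists one command per virtual machine in your hosts file.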

Re-run your Ansible playbook

# Shutdown all virtual machines
$ sudo ansible-playbook /etc/ansible/playbook/shutdown.yml

All done!

Now, you will see that every virtual machine has shut down, except of course the one you have run the playbook from. If you wish, you may now shut down the Ansible host box as well with our faithful shutdown -h now. Goodnight!

Conclusion

There are a few different ways to shut down all of your virtual machines gracefully. Once you have found your preferred method, you can build automations such as:

  • Gracefully shutting down all virtual machines when a UPS battery backup kicks in (you may want to shut down the ESXi host afterwards to protect it as well)
  • Shutting down all virtual machines for system maintenance
  • Setting up a cron job to shut down all virtual machines on a specific day of the month or time of night (you can use ESXCLI on the ESXi host to boot them back up)

If you went with the Ansible method, hopefully you feel inspired to use it for even more, having just scratched the surface of its capabilities.

Grayson Adams
Systems and software engineer in Atlanta, GA, experienced in developing web applications and managing both physical and cloud-based infrastructure.
