This article describes how op5 Monitor or Nagios, together with the Check ESX Plugin, can be used to monitor your VMware ESX and vSphere servers. You can monitor either a single ESX(i)/vSphere server, or a VMware VirtualCenter/vCenter Server and individual virtual machines. If you have a VMware cluster, you should monitor the data center (VMware VirtualCenter/vCenter Server) rather than the individual ESX/vSphere servers.
More information can be found in Monitor Virtual Infrastructure with op5 Monitor.
Prerequisites
Before you start, make sure you have an account on the server with the correct access rights.
In the default installation of VMware ESX/vSphere there is a ‘read only’ profile that you should use when creating a new user. That profile has sufficient rights for monitoring. The user you create must be:
- a member of the group user
- based on the profile ‘read only’
You must install the VMware vSphere SDK for Perl on your op5 Monitor server. Please read the how-to about Installing vSphere SDK for Perl for instructions.
This will be done
We will go through:
- Monitoring a VMware ESX Datacenter/vCenter
- Monitoring a VMware ESX/vSphere Host
- Monitoring a VMware Virtual host
- Monitoring a VMware Virtual host through a Datacenter/vCenter
- Monitoring a VMware ESX/vSphere Host through a Datacenter/vCenter
Check commands
Add the required check-commands, if they don’t already exist in your configuration (‘Configure’ -> ‘Commands’ -> ‘Check Command Import’):
You should also define a username and password in /opt/monitor/etc/resource.cfg to hide this information from the CGIs:
$USER11$=username
$USER12$=password
Note: We use the $HOSTALIAS$ macro in the command_line because the VM names must be given exactly as they are defined in your VMware server. Set this name as the Alias in the host definition. This change does not affect the history of your host.
Commands for ESX(i) Datacenter/vCenter
| command_name | command_line |
|---|---|
| check_esx3_dc_vm | $USER1$/check_esx3 -D $ARG1$ -u $USER11$ -p $USER12$ -l $ARG2$ -s $ARG3$ -N $HOSTALIAS$ -w $ARG4$ -c $ARG5$ |
| check_esx3_dc_host | $USER1$/check_esx3 -D $ARG1$ -u $USER11$ -p $USER12$ -l $ARG2$ -s $ARG3$ -H $HOSTADDRESS$ -w $ARG4$ -c $ARG5$ |
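To make the macro handling in these definitions more concrete, the following minimal Python sketch mimics how a check_command string is split on "!" and how the resulting $ARGn$ values, together with the user and host macros, are substituted into the command_line. It is only an illustration and not how op5 Monitor or Nagios is implemented. The function name `expand_command`, the plugin path, the credentials, the vCenter address and the host alias are all hypothetical values chosen for the example, and leaving unknown macros untouched is a choice made for this sketch.

```python
import re


def expand_command(command_line, check_command, macros):
    """Return (command_name, expanded_line) for a check_command string."""
    # The command name and its arguments are separated by "!".
    name, *args = check_command.split("!")
    values = dict(macros)
    # The first argument becomes $ARG1$, the second $ARG2$, and so on.
    for index, arg in enumerate(args, start=1):
        values[f"ARG{index}"] = arg
    # Substitute every $NAME$ token; unknown macros are left untouched here.
    expanded = re.sub(
        r"\$(\w+)\$",
        lambda match: values.get(match.group(1), match.group(0)),
        command_line,
    )
    return name, expanded


# command_line of check_esx3_dc_vm, copied from the table above.
DC_VM_LINE = (
    "$USER1$/check_esx3 -D $ARG1$ -u $USER11$ -p $USER12$ "
    "-l $ARG2$ -s $ARG3$ -N $HOSTALIAS$ -w $ARG4$ -c $ARG5$"
)

# Hypothetical macro values, for illustration only.
EXAMPLE_MACROS = {
    "USER1": "/opt/plugins",
    "USER11": "monitor",
    "USER12": "secret",
    "HOSTALIAS": "web01",
}

name, line = expand_command(
    DC_VM_LINE, "check_esx3_dc_vm!10.0.0.5!cpu!usage!80!90", EXAMPLE_MACROS
)
print(name)
print(line)
```

In this example the five arguments after the command name fill $ARG1$ to $ARG5$, so the vCenter address ends up after -D and the VM name from the host alias ends up after -N.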
Commands for ESX(i)/vSphere Hosts
| command_name | command_line |
|---|---|
| check_esx3_host_cpu_usage | $USER1$/check_esx3 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l cpu -s usage -w $ARG1$ -c $ARG2$ |
| check_esx3_host_mem_usage | $USER1$/check_esx3 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l mem -s usage -w $ARG1$ -c $ARG2$ |
| check_esx3_host_swap_usage | $USER1$/check_esx3 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l mem -s swap -w $ARG1$ -c $ARG2$ |
| check_esx3_host_net_usage | $USER1$/check_esx3 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l net -s usage -w $ARG1$ -c $ARG2$ |
| check_esx3_host_vmfs | $USER1$/check_esx3 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l vmfs -s $ARG1$ -w "$ARG2$:" -c "$ARG3$:" |
| check_esx3_host_runtime_status | $USER1$/check_esx3 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l runtime -s status |
| check_esx3_host_runtime_issues | $USER1$/check_esx3 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l runtime -s issues |
| check_esx3_host_io_read | $USER1$/check_esx3 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l io -s read -w $ARG1$ -c $ARG2$ |
| check_esx3_host_io_write | $USER1$/check_esx3 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l io -s write -w $ARG1$ -c $ARG2$ |
Commands for virtual machines on ESX(i)/vSphere servers
| command_name | command_line |
|---|---|
| check_esx3_vm_cpu_usage | $USER1$/check_esx3 -H $ARG1$ -u $USER11$ -p $USER12$ -N $HOSTALIAS$ -l cpu -s usage -w $ARG2$ -c $ARG3$ |
| check_esx3_vm_mem_usage | $USER1$/check_esx3 -H $ARG1$ -u $USER11$ -p $USER12$ -N $HOSTALIAS$ -l mem -s usage -w $ARG2$ -c $ARG3$ |
| check_esx3_vm_swap_usage | $USER1$/check_esx3 -H $ARG1$ -u $USER11$ -p $USER12$ -N $HOSTALIAS$ -l mem -s swap -w $ARG2$ -c $ARG3$ |
| check_esx3_vm_net_usage | $USER1$/check_esx3 -H $ARG1$ -u $USER11$ -p $USER12$ -N $HOSTALIAS$ -l net -s usage -w $ARG2$ -c $ARG3$ |
| check_esx3_vm_runtime_cpu | $USER1$/check_esx3 -H $ARG1$ -u $USER11$ -p $USER12$ -N $HOSTALIAS$ -l runtime -s cpu -w $ARG2$ -c $ARG3$ |
| check_esx3_vm_runtime_mem | $USER1$/check_esx3 -H $ARG1$ -u $USER11$ -p $USER12$ -N $HOSTALIAS$ -l runtime -s mem -w $ARG2$ -c $ARG3$ |
| check_esx3_vm_runtime_status | $USER1$/check_esx3 -H $ARG1$ -u $USER11$ -p $USER12$ -N $HOSTALIAS$ -l runtime -s status |
| check_esx3_vm_runtime_state | $USER1$/check_esx3 -H $ARG1$ -u $USER11$ -p $USER12$ -N $HOSTALIAS$ -l runtime -s state |
| check_esx3_vm_runtime_issues | $USER1$/check_esx3 -H $ARG1$ -u $USER11$ -p $USER12$ -N $HOSTALIAS$ -l runtime -s issues |
Commands for ESX/vSphere Hosts through your Datacenter/vCenter
| command_name | command_line |
|---|---|
| check_esx3_dc_host_cpu_usage | $USER1$/check_esx3 -D $ARG1$ -u $USER11$ -p $USER12$ -H $HOSTALIAS$ -l cpu -s usage -w $ARG2$ -c $ARG3$ |
| check_esx3_dc_host_mem_usage | $USER1$/check_esx3 -D $ARG1$ -u $USER11$ -p $USER12$ -H $HOSTALIAS$ -l mem -s usage -w $ARG2$ -c $ARG3$ |
| check_esx3_dc_host_net_usage | $USER1$/check_esx3 -D $ARG1$ -u $USER11$ -p $USER12$ -H $HOSTALIAS$ -l net -s usage -w $ARG2$ -c $ARG3$ |
| check_esx3_dc_host_runtime_issues | $USER1$/check_esx3 -D $ARG1$ -u $USER11$ -p $USER12$ -H $HOSTALIAS$ -l runtime -s issues |
| check_esx3_dc_host_runtime_state | $USER1$/check_esx3 -D $ARG1$ -u $USER11$ -p $USER12$ -H $HOSTALIAS$ -l runtime -s state |
| check_esx3_dc_host_runtime_status | $USER1$/check_esx3 -D $ARG1$ -u $USER11$ -p $USER12$ -H $HOSTALIAS$ -l runtime -s status |
| check_esx3_dc_host_swap_usage | $USER1$/check_esx3 -D $ARG1$ -u $USER11$ -p $USER12$ -H $HOSTALIAS$ -l mem -s swap -w $ARG2$ -c $ARG3$ |
| check_esx3_dc_host_io_read | $USER1$/check_esx3 -D $ARG1$ -u $USER11$ -p $USER12$ -H $HOSTALIAS$ -l io -s read -w $ARG2$ -c $ARG3$ |
| check_esx3_dc_host_io_write | $USER1$/check_esx3 -D $ARG1$ -u $USER11$ -p $USER12$ -H $HOSTALIAS$ -l io -s write -w $ARG2$ -c $ARG3$ |
Generic commands for ESX(i)/vSphere
There are also generic commands for check_esx3 that can be used if you want to monitor anything not mentioned in the tables above. If you do not have them in your system you may add them with the import functionality in op5 Monitor (‘Configure’ -> ‘Commands’ -> ‘Check Command Import’).
| command_name | description |
|---|---|
| check_esx3_dc | Use this command if you want to monitor (or monitor through) a Datacenter/vCenter. |
| check_esx3_host | Use this command if you want to monitor an ESX(i)/vSphere host. |
| check_esx3_vm | Use this command to monitor a single VM. |
| check_esx_dc_vm | Use this command to monitor a single VM through a Datacenter/vCenter. |
Adding the services
Add the required services, (‘Configure’ -> ‘Host: ‘ -> ‘Go’ -> ‘Services for host ‘ -> ‘Add new service’ -> ‘Go’):
Add the following services (the arguments are just examples; you need to adjust them to suit your environment).
Services for ESX(i) Datacenter
| service_description | check_command | check_command_args | Note |
|---|---|---|---|
| VMware DC VM | check_esx3_dc_vm | VCserver-ip!command!subcommand!warning!critical | |
| VMware DC Host | check_esx3_dc_host | VCserver-ip!command!subcommand!warning!critical | |
Services for ESX(i) hosts
| service_description | check_command | check_command_args | Note |
|---|---|---|---|
| VMware CPU Usage | check_esx3_host_cpu_usage | 80!90 | * |
| VMware Mem Usage | check_esx3_host_mem_usage | 80!90 | * |
| VMware Swap Usage | check_esx3_host_swap_usage | 80!90 | * |
| VMware Net Usage | check_esx3_host_net_usage | 102400!204800 | ** |
| VMware VMFS main-storage | check_esx3_host_vmfs | main-storage!15%!10% | |
| VMware Runtime Status | check_esx3_host_runtime_status | | *** |
| VMware Runtime Issues | check_esx3_host_runtime_issues | | **** |
| VMware IO Read | check_esx3_host_io_read | 40!90 | ***** |
| VMware IO Write | check_esx3_host_io_write | 40!90 | ***** |
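If you maintain plain Nagios configuration files instead of using the op5 Monitor web interface, each row in the table above corresponds to one service object. The Python sketch below renders such an object from a table row, joining the command name and its arguments with "!". It is a minimal illustration under stated assumptions: the function name `service_definition`, the host name `esx01` and the template name `default-service` are hypothetical, and a working service normally inherits further required directives (check intervals, contacts and so on) from a template that exists in your own configuration.

```python
def service_definition(host_name, description, command, args=(),
                       template="default-service"):
    """Render a Nagios service object for one row of the services table."""
    # check_command is the command name followed by "!"-separated arguments.
    check_command = "!".join([command, *[str(arg) for arg in args]])
    lines = [
        "define service {",
        f"    use                    {template}",  # hypothetical template name
        f"    host_name              {host_name}",
        f"    service_description    {description}",
        f"    check_command          {check_command}",
        "}",
    ]
    return "\n".join(lines)


# Two rows from the table above, applied to a hypothetical host "esx01".
print(service_definition("esx01", "VMware CPU Usage",
                         "check_esx3_host_cpu_usage", [80, 90]))
print(service_definition("esx01", "VMware Runtime Status",
                         "check_esx3_host_runtime_status"))
```

Rows without arguments, such as the runtime checks, simply produce a check_command that consists of the command name only.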
Services for virtual machines on ESX(i)/vSphere server
| service_description | check_command | check_command_args | Note |
|---|---|---|---|
| VMware VM CPU Usage | check_esx3_vm_cpu_usage | esx-host-ip!80!90 | * |
| VMware VM Mem Usage | check_esx3_vm_mem_usage | esx-host-ip!80!90 | * |
| VMware VM Swap Usage | check_esx3_vm_swap_usage | esx-host-ip!80!90 | * |
| VMware VM Net Usage | check_esx3_vm_net_usage | esx-host-ip!102400!204800 | ** |
| VMware VM Runtime CPU | check_esx3_vm_runtime_cpu | esx-host-ip!80!90 | * |
| VMware VM Runtime Mem | check_esx3_vm_runtime_mem | esx-host-ip!80!90 | * |
| VMware VM Runtime Status | check_esx3_vm_runtime_status | esx-host-ip | *** |
| VMware VM Runtime Issues | check_esx3_vm_runtime_issues | esx-host-ip | **** |
Services for virtual machines through your Datacenter/vCenter
| service_description | check_command | check_command_args | Note |
|---|---|---|---|
| VMware Host CPU Usage | check_esx3_dc_host_cpu_usage | VCserver-ip!80!90 | * |
| VMware Host Mem Usage | check_esx3_dc_host_mem_usage | VCserver-ip!80!90 | * |
| VMware Host Swap Usage | check_esx3_dc_host_swap_usage | VCserver-ip!80!90 | * |
| VMware Host Net Usage | check_esx3_dc_host_net_usage | VCserver-ip!102400!204800 | ** |
| VMware Host Runtime Status | check_esx3_dc_host_runtime_status | VCserver-ip | *** |
| VMware Host IO Read | check_esx3_dc_host_io_read | VCserver-ip!40!90 | ***** |
| VMware Host IO Write | check_esx3_dc_host_io_write | VCserver-ip!40!90 | ***** |
Services for ESX/vSphere Hosts through your Datacenter/vCenter
| service_description | check_command | check_command_args | Note |
|---|---|---|---|
| VMware Host CPU Usage | check_esx3_dc_host_cpu_usage | VCserver-ip!80!90 | * |
| VMware Host Mem Usage | check_esx3_dc_host_mem_usage | VCserver-ip!80!90 | * |
| VMware Host Swap Usage | check_esx3_dc_host_swap_usage | VCserver-ip!80!90 | * |
| VMware Host Net Usage | check_esx3_dc_host_net_usage | VCserver-ip!102400!204800 | ** |
| VMware Host Runtime Status | check_esx3_dc_host_runtime_status | VCserver-ip | *** |
| VMware Host IO Read | check_esx3_dc_host_io_read | VCserver-ip!40!90 | ***** |
| VMware Host IO Write | check_esx3_dc_host_io_write | VCserver-ip!40!90 | ***** |
Notes:
* Warn and critical in percent.
** Warn and critical in kb/s
*** Any response other than “green” results in a Critical state
**** Any issues found results in a Critical state
***** Warn and critical in ms
“” as the last character on a row means that the command has been split for readability; it should be entered on one line.
Ranges for Warning and Critical thresholds:
| Range | Alert is raised when the value is |
|---|---|
| 10 | < 0 or > 10 (outside the range {0 .. 10}) |
| 10: | < 10 (outside the range {10 .. ∞}) |
| ~:10 | > 10 (outside the range {-∞ .. 10}) |
| 10:20 | < 10 or > 20 (outside the range {10 .. 20}) |
| @10:20 | ≥ 10 and ≤ 20 (inside the range {10 .. 20}) |
More info can be found here:
http://nagiosplug.sourceforge.net/developer-guidelines.html#THRESHOLDFORMAT
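The range notation listed above can be easier to follow as code. The following Python sketch evaluates a value against a range string using the rules listed above. It is not taken from the plugin itself; the function names `parse_range` and `alerts` are hypothetical, and the sketch only handles plain numbers, so a suffix such as the percent sign used in the VMFS example is out of scope.

```python
import math


def parse_range(spec):
    """Return (low, high, inside) for a threshold range string."""
    # A leading "@" inverts the logic: alert when the value is inside.
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        low_text, high_text = spec.split(":", 1)
    else:
        # A bare number N means the range 0 .. N.
        low_text, high_text = "0", spec
    # "~" stands for negative infinity; an empty end means positive infinity.
    low = -math.inf if low_text == "~" else float(low_text or 0)
    high = math.inf if high_text == "" else float(high_text)
    return low, high, inside


def alerts(value, spec):
    """Return True if the value should raise an alert for this range."""
    low, high, inside = parse_range(spec)
    in_range = low <= value <= high
    return in_range if inside else not in_range


for spec, value in [("10", 11), ("10:", 9), ("~:10", 11), ("10:20", 15), ("@10:20", 15)]:
    print(spec, value, alerts(value, spec))
```

This also explains the VMFS command, which appends ":" to its warning and critical arguments: an argument of 15 becomes the range 15:, so the alert is raised when the value drops below 15.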
Use the “Test this service” button to see whether the services work. Once they are correct and working as they should, you may add the services to all of your hosts with the clone function.
Monitoring VMware ESX 3.x, ESXi, vSphere 4 and vCenter Server