Monitoring Driver

The Monitoring Drivers (or IM drivers) collect host and virtual machine monitoring data by executing a monitoring agent on the hosts. The agent periodically executes probes to collect data and sends the results to the frontend.

This guide describes the internals of the monitoring system. It is also a starting point for creating a new IM driver from scratch.

Message structure

The structure of a monitoring message is:

MESSAGE_TYPE ID RESULT TIMESTAMP PAYLOAD

Name          Description
MESSAGE_TYPE  SYSTEM_HOST, MONITOR_HOST, BEACON_HOST, MONITOR_VM or STATE_VM
ID            ID of the host that generates the message
RESULT        Result of the action; possible values are SUCCESS or FAILURE
TIMESTAMP     Timestamp of the message as Unix epoch time
PAYLOAD       Message data; depends on MESSAGE_TYPE

Description of message types:

  • SYSTEM_HOST - General information about the host that does not change often (e.g. total memory, disk capacity, datastores, PCI devices, NUMA nodes, …)

  • MONITOR_HOST - Host monitoring information: used memory, used CPU, network traffic, …

  • BEACON_HOST - Notification message indicating that the agent is still alive

  • MONITOR_VM - VM monitoring information: used memory, used CPU, disk I/O, …

  • STATE_VM - VM state: running, poweroff, …
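The message structure described above can be illustrated with a small parser. This is only a sketch: it assumes the five fields are separated by single spaces and that the payload is whatever follows the timestamp. The names `MonitorMessage` and `parse_message` are hypothetical, and the payload encoding is left untouched because this guide does not define it at this level.

```python
from typing import NamedTuple

# Message types listed in this guide.
MESSAGE_TYPES = {"SYSTEM_HOST", "MONITOR_HOST", "BEACON_HOST", "MONITOR_VM", "STATE_VM"}


class MonitorMessage(NamedTuple):
    message_type: str
    host_id: str
    result: str
    timestamp: int
    payload: str


def parse_message(line: str) -> MonitorMessage:
    """Split one message line into MESSAGE_TYPE ID RESULT TIMESTAMP PAYLOAD.

    Assumption: fields are separated by single spaces; the payload may be
    empty (for example for BEACON_HOST) and is returned as-is.
    """
    parts = line.rstrip("\n").split(" ", 4)
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 fields, got {len(parts)}")
    message_type, host_id, result, timestamp = parts[:4]
    payload = parts[4] if len(parts) == 5 else ""
    if message_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {message_type}")
    if result not in ("SUCCESS", "FAILURE"):
        raise ValueError(f"unknown result: {result}")
    return MonitorMessage(message_type, host_id, result, int(timestamp), payload)
```

Limiting the split to four separators keeps any spaces inside the payload intact.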

The drivers for the provided hypervisors compose each message from the data produced by the probes in a specific directory:

  • SYSTEM_HOST - im/<hypervisor>-probes.d/host/system

  • MONITOR_HOST - im/<hypervisor>-probes.d/host/monitor

  • BEACON_HOST - im/<hypervisor>-probes.d/host/beacon

  • MONITOR_VM - im/<hypervisor>-probes.d/vm/monitor

  • STATE_VM - im/<hypervisor>-probes.d/vm/status

Each IM probe is composed of one or several scripts that write information to stdout in this form:

KEY1="value"
KEY2="another value with spaces"
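As an illustration of the output form above, a probe could build its lines with a helper like the following. This is a minimal sketch: `format_probe_output` is a hypothetical name, and the handling of quote characters inside values is not covered here because the guide does not specify it.

```python
def format_probe_output(values: dict) -> str:
    """Render a mapping as KEY="value" lines, one per entry."""
    lines = [f'{key}="{value}"' for key, value in values.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    # A probe simply prints its lines to stdout.
    print(format_probe_output({"KEY1": "value", "KEY2": "another value with spaces"}))
```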

Basic Monitoring Scripts

Mandatory values for each category are described below:

SYSTEM_HOST Message

Key          Description
HYPERVISOR   Name of the hypervisor of the host; useful for selecting hosts with a specific technology.
TOTALCPU     Number of CPUs multiplied by 100. For example, a 16-core machine will have a value of 1600.
CPUSPEED     Speed of the CPUs in MHz.
TOTALMEMORY  Maximum memory that could be used for VMs. It is advised to subtract the memory used by the hypervisor.
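The arithmetic behind the mandatory SYSTEM_HOST keys can be sketched as follows. The function name `system_host_values` and the `memory_reserved` parameter are hypothetical; the memory unit is whatever the caller passes in, since this section does not state it for TOTALMEMORY.

```python
def system_host_values(hypervisor: str, cores: int, cpu_mhz: int,
                       memory_total: int, memory_reserved: int = 0) -> dict:
    """Build the mandatory SYSTEM_HOST keys.

    memory_reserved (an assumption of this sketch) is the amount set aside
    for the hypervisor and subtracted from the total, as advised above.
    """
    return {
        "HYPERVISOR": hypervisor,
        "TOTALCPU": cores * 100,  # 16 cores -> 1600
        "CPUSPEED": cpu_mhz,
        "TOTALMEMORY": max(memory_total - memory_reserved, 0),
    }
```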

MONITOR_HOST Message

Key         Description
USEDMEMORY  Memory used, in kilobytes.
FREEMEMORY  Memory available for VMs at that moment, in kilobytes.
FREECPU     Percentage of idle CPU multiplied by the number of cores. For example, if 50% of the CPU is idle on a 4-core machine, the value is 200.
USEDCPU     Percentage of used CPU multiplied by the number of cores.
NETRX       Bytes received from the network.
NETTX       Bytes sent to the network.
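The FREECPU and USEDCPU scaling can be worked through with a short sketch. It assumes the probe already knows the overall idle percentage (0 to 100) and the core count; `cpu_values` is a hypothetical helper name.

```python
def cpu_values(idle_percent: float, cores: int) -> dict:
    """Scale idle/used CPU percentages by the number of cores."""
    if not 0 <= idle_percent <= 100:
        raise ValueError("idle_percent must be between 0 and 100")
    free = round(idle_percent * cores)
    used = cores * 100 - free
    return {"FREECPU": free, "USEDCPU": used}
```

With 50% idle on a 4-core machine, this yields FREECPU=200 and USEDCPU=200, matching the example in the key description.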

BEACON_HOST Message

This message carries no data.

MONITOR_VM Message

The format of the MONITOR_VM message is:

VM = [ ID="0",
       UUID="6c1e1565-50f4-43b6-ba71-0fe46477d2ec",
       MONITOR="Q1BVPSIxLjAxIgpNRU1PUlk9IjE0MDgxNiIKTkVUUlg9IjAiCk5FVFRYPSIwIgpESVNLUkRCWVRFUz0iNDQxNjU0NDQiCkRJU0tXUkJZVEVTPSIxMjY2Njg4IgpESVNLUkRJT1BTPSIxMjg5IgpESVNLV1JJT1BTPSI4ODEiCg=="]
VM = [ ID="1",
       ... ]

Key      Description
ID       ID of the VM in OpenNebula.
UUID     Unique ID; must be unique across all hosts.
MONITOR  Base64-encoded monitoring information.

The decoded monitoring information includes the following data:

Key          Description
TIMESTAMP    Timestamp of the measurement.
CPU          Percentage of one CPU consumed (two fully consumed CPUs give 200).
MEMORY       Memory consumption in kilobytes.
DISKRDBYTES  Number of bytes read from disk.
DISKRDIOPS   Number of I/O read operations.
DISKWRBYTES  Number of bytes written to disk.
DISKWRIOPS   Number of I/O write operations.
NETRX        Bytes received from the network.
NETTX        Bytes sent to the network.
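To see what a MONITOR value contains, it can be decoded and split into its keys. The sketch below uses the sample value from the MONITOR_VM example above; `decode_monitor` is a hypothetical helper, and the parsing assumes the decoded text is a series of KEY="value" lines.

```python
import base64
import re

# Sample MONITOR value copied from the MONITOR_VM example in this guide.
SAMPLE_MONITOR = "Q1BVPSIxLjAxIgpNRU1PUlk9IjE0MDgxNiIKTkVUUlg9IjAiCk5FVFRYPSIwIgpESVNLUkRCWVRFUz0iNDQxNjU0NDQiCkRJU0tXUkJZVEVTPSIxMjY2Njg4IgpESVNLUkRJT1BTPSIxMjg5IgpESVNLV1JJT1BTPSI4ODEiCg=="


def decode_monitor(encoded: str) -> dict:
    """Base64-decode a MONITOR value and parse its KEY="value" lines."""
    text = base64.b64decode(encoded).decode("utf-8")
    return dict(re.findall(r'^(\w+)="(.*)"$', text, flags=re.MULTILINE))


if __name__ == "__main__":
    for key, value in decode_monitor(SAMPLE_MONITOR).items():
        print(key, value)
```

For the sample value, the decoded keys include CPU="1.01" and MEMORY="140816". All values are returned as strings; converting them to numbers is left to the caller.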

STATE_VM Message

The format of the STATE_VM message is:

VM=[
  ID=115,
  DEPLOY_ID=one-115,
  UUID="6c1e1565-50f4-43b6-ba71-0fe46477d2ec",
  STATE="RUNNING" ]
VM=[
  ID=116,
  DEPLOY_ID=one-116,
  UUID="1a3f2513-50f4-43b6-ba71-0fe46477d2ec",
  STATE="POWEROFF" ]

Key        Description
ID         ID of the VM in OpenNebula.
DEPLOY_ID  ID of the VM in the hypervisor; usually unique within the host.
UUID       Unique ID; must be unique across all hosts.
STATE      State of the VM (running, poweroff, …).
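A probe that already has the list of VMs could render the STATE_VM payload along these lines. This is a sketch only: `format_state_vm` and the dictionary keys it expects are hypothetical, and the quoting simply mirrors the example above (ID and DEPLOY_ID unquoted, UUID and STATE quoted).

```python
def format_state_vm(vms: list) -> str:
    """Render one VM=[ ... ] block per VM, mirroring the format shown above.

    Each item is expected to be a dict with the keys
    'id', 'deploy_id', 'uuid' and 'state' (names chosen for this sketch).
    """
    blocks = []
    for vm in vms:
        blocks.append(
            "VM=[\n"
            f"  ID={vm['id']},\n"
            f"  DEPLOY_ID={vm['deploy_id']},\n"
            f"  UUID=\"{vm['uuid']}\",\n"
            f"  STATE=\"{vm['state']}\" ]"
        )
    return "\n".join(blocks)
```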

System Datastore Information

Monitoring probes are also responsible for collecting the datastore sizes and their available space. The datastore information is included in the SYSTEM_HOST message.

DS_LOCATION_USED_MB=1
DS_LOCATION_TOTAL_MB=12639
DS_LOCATION_FREE_MB=10459
DS = [
  ID = 0,
  USED_MB = 1,
  TOTAL_MB = 12639,
  FREE_MB = 10459
]
DS = [
  ID = 1,
  USED_MB = 1,
  TOTAL_MB = 12639,
  FREE_MB = 10459
]
DS = [
  ID = 2,
  USED_MB = 1,
  TOTAL_MB = 12639,
  FREE_MB = 10459
]

These are the meanings of the values:

Variable              Description
DS_LOCATION_USED_MB   Used space in megabytes in the DATASTORE LOCATION
DS_LOCATION_TOTAL_MB  Total space in megabytes in the DATASTORE LOCATION
DS_LOCATION_FREE_MB   Free space in megabytes in the DATASTORE LOCATION
ID                    ID of the datastore; this is the same as the name of the directory
USED_MB               Used space in megabytes for that datastore
TOTAL_MB              Total space in megabytes for that datastore
FREE_MB               Free space in megabytes for that datastore

The DATASTORE LOCATION is the path where the datastores are mounted. By default it is /var/lib/one/datastores, but the actual value is passed as the second parameter of the script call.
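The datastore report described above could be produced roughly as follows. This sketch makes several assumptions: sizes are taken from the filesystem that holds each directory (via `shutil.disk_usage`), a megabyte is counted as 1024 × 1024 bytes, and every numerically named subdirectory of the DATASTORE LOCATION is treated as a datastore. `datastore_report` is a hypothetical name.

```python
import os
import shutil

MB = 1024 * 1024  # assumption: binary megabytes


def _usage_mb(path: str) -> tuple:
    """Return (used, total, free) in MB for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.used // MB, usage.total // MB, usage.free // MB


def datastore_report(ds_location: str = "/var/lib/one/datastores") -> str:
    """Build the DS_LOCATION_* values and one DS = [ ... ] block per datastore."""
    used, total, free = _usage_mb(ds_location)
    lines = [
        f"DS_LOCATION_USED_MB={used}",
        f"DS_LOCATION_TOTAL_MB={total}",
        f"DS_LOCATION_FREE_MB={free}",
    ]
    # Datastore directories are named after the datastore ID.
    entries = [e for e in os.scandir(ds_location) if e.is_dir() and e.name.isdigit()]
    for entry in sorted(entries, key=lambda e: int(e.name)):
        used, total, free = _usage_mb(entry.path)
        lines += [
            "DS = [",
            f"  ID = {int(entry.name)},",
            f"  USED_MB = {used},",
            f"  TOTAL_MB = {total},",
            f"  FREE_MB = {free}",
            "]",
        ]
    return "\n".join(lines)
```

A real probe would take the DATASTORE LOCATION from its call parameters rather than relying on the default path.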

Creating a New IM Driver

Choosing the Execution Engine

OpenNebula provides two IM probe execution engines: one_im_sh and one_im_ssh. one_im_sh executes probes in the frontend; for example, vCenter uses this engine because it collects data through API calls made from the frontend. one_im_ssh, on the other hand, is used when probes need to run remotely on the hosts, which is the case for KVM.

Populating the Probes

Both one_im_sh and one_im_ssh require an argument that indicates the directory containing the probes. The suffix ".d" is appended to this argument. You also need to create:

  • The /var/lib/one/remotes/im/<im_name>.d directory, containing only two files, the same ones provided by default inside kvm.d: collectd-client_control.sh and collectd-client.rb.

  • The /var/lib/one/remotes/im/<im_name>-probes.d directory, which is where the probes themselves should be placed.
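The directory layout described above could be scaffolded with a short script. This is a sketch under assumptions: `scaffold_im_driver` is a hypothetical helper, the two agent files are copied from an existing kvm.d directory, and the probe subdirectories mirror the host/ and vm/ layout listed earlier for the provided hypervisors.

```python
import shutil
from pathlib import Path

AGENT_FILES = ("collectd-client_control.sh", "collectd-client.rb")
# Assumption: a new driver mirrors the probe layout of the provided hypervisors.
PROBE_DIRS = ("host/system", "host/monitor", "host/beacon", "vm/monitor", "vm/status")


def scaffold_im_driver(im_name: str, remotes_im="/var/lib/one/remotes/im"):
    """Create <im_name>.d and <im_name>-probes.d under remotes_im."""
    remotes_im = Path(remotes_im)
    driver_dir = remotes_im / f"{im_name}.d"
    probes_dir = remotes_im / f"{im_name}-probes.d"

    driver_dir.mkdir(parents=True, exist_ok=True)
    for name in AGENT_FILES:
        # Reuse the agent files shipped in kvm.d.
        shutil.copy2(remotes_im / "kvm.d" / name, driver_dir / name)

    for sub in PROBE_DIRS:
        (probes_dir / sub).mkdir(parents=True, exist_ok=True)
    return driver_dir, probes_dir
```

Passing a scratch directory as `remotes_im` makes it possible to try the layout without touching a live installation.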

Enabling the Driver

A new IM_MAD section should be added to monitord.conf.

Example:

IM_MAD = [
      name       = "ganglia",
      executable = "one_im_sh",
      arguments  = "ganglia" ]