The Hosts will need a CPU with Intel VT or AMD’s AMD-V features in order to support virtualization. KVM’s Preparing to use KVM guide will clarify any doubts you may have regarding whether your hardware supports KVM.
KVM will be installed and configured after following the KVM Host Installation section.
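As a quick first check before reading the KVM guide, the CPU requirement above can be tested on a Linux Host by looking for the `vmx` (Intel VT) or `svm` (AMD-V) flags in `/proc/cpuinfo`. The following is a minimal sketch, not part of OpenNebula; the function name `virtualization_flags` is made up for illustration, and a present flag does not guarantee that virtualization is enabled in the firmware.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware virtualization flags found in cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        # On x86 Linux the CPU feature list is on lines starting with "flags"
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            # vmx = Intel VT, svm = AMD-V
            found |= set(value.split()) & {"vmx", "svm"}
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        text = ""  # not a Linux host, or /proc is unavailable
    flags = virtualization_flags(text)
    print("hardware virtualization flags:", sorted(flags) or "none found")
```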
Considerations & Limitations
Use virtio whenever possible, for both networks and disks. Emulated hardware hurts performance and does not expose all the available functionality. For instance, if you don’t use virtio for the disk drivers, only a small number of devices can be connected to the controller, which limits how many disks you can attach, and attaching disks while the VM is running (live disk-attach) will not work.
The OpenNebula packages will configure KVM automatically, so you don’t need to take any extra steps.
The KVM driver is enabled by default in the OpenNebula configuration file /etc/one/oned.conf on your Front-end host, with reasonable defaults. Read the oned Configuration guide to understand these configuration parameters, and the Virtual Machine Drivers Reference to learn how to customize and extend the drivers.
Some attributes are required for KVM to boot a VM. You can set suitable defaults for them so that all VMs get the required values. These attributes are set in /etc/one/vmm_exec/vmm_exec_kvm.conf. The following can be set for KVM:
EMULATOR: path to the kvm executable.
RAW: to add libvirt attributes to the domain XML file.
HYPERV_OPTIONS: to enable hyperv extensions.
SPICE_OPTIONS: to add default devices for SPICE.
These values are only used during VM creation; for other actions like NIC or disk attach/detach, the default values must be set in /var/lib/one/remotes/etc/vmm/kvm/kvmrc. For more information, see the Files and Parameters section.
For example (check the actual state in the configuration file on your Front-end):
OS = [ ARCH = "x86_64" ]
FEATURES = [ PAE = "no", ACPI = "yes", APIC = "no", HYPERV = "no", GUEST_AGENT = "no" ]
DISK = [ DRIVER = "raw" , CACHE = "none"]
HYPERV_OPTIONS="<relaxed state='on'/><vapic state='on'/><spinlocks state='on' retries='4096'/>"
SPICE_OPTIONS="
    <video>
        <model type='vga' heads='1'/>
    </video>
    <sound model='ich6' />
    <channel type='spicevmc'>
        <target type='virtio' name='com.redhat.spice.0'/>
    </channel>
    <redirdev bus='usb' type='spicevmc'/>
    <redirdev bus='usb' type='spicevmc'/>
    <redirdev bus='usb' type='spicevmc'/>"
These values can be overridden in the Cluster, Host and VM Template.
Since OpenNebula 6.0 you should no longer need to modify the EMULATOR variable to point to the kvm executable; instead, EMULATOR now points to the symlink /usr/bin/qemu-kvm-one, which should link to the correct KVM binary for the given OS on a Host.
Live-Migration for Other Cache settings
If you are using disks with a cache setting other than none, you may have problems with live migration, depending on the libvirt version. You can enable the migration by adding the --unsafe parameter to the virsh command. The file to change is /var/lib/one/remotes/etc/vmm/kvm/kvmrc. Uncomment the following line, and execute onehost sync --force afterwards:
Configure the Timeouts (Optional)
Optionally, you can set a timeout for the VM Shutdown operation. This feature is useful when a VM gets stuck in Shutdown (or simply does not notice the shutdown command). By default, after the timeout the VM will return to the Running state, but it can also be configured so that the VM is destroyed after the grace time. This is configured in
# Seconds to wait after shutdown until timeout
export SHUTDOWN_TIMEOUT=180
# Uncomment this line to force VM cancellation after shutdown timeout
export FORCE_DESTROY=yes
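The settings above are plain shell `export` lines, so a commented-out line simply leaves the variable unset. The sketch below illustrates that reading in Python; it is not how the OpenNebula driver loads the file (the driver evaluates it as shell), and the helper name `parse_exports` and the fallback value of `0` are assumptions made for the example. In the sample, `FORCE_DESTROY` is deliberately left commented out to show the difference.

```python
import shlex


def parse_exports(text):
    """Collect KEY=VALUE pairs from uncommented `export` lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("export "):
            continue  # skips blanks, comments and commented-out exports
        # shlex handles quoted values and strips trailing comments
        for token in shlex.split(line[len("export "):], comments=True):
            key, sep, value = token.partition("=")
            if sep:
                settings[key] = value
    return settings


SAMPLE = """\
# Seconds to wait after shutdown until timeout
export SHUTDOWN_TIMEOUT=180
# Uncomment this line to force VM cancellation after shutdown timeout
#export FORCE_DESTROY=yes
"""

settings = parse_exports(SAMPLE)
# Fallback of "0" is an assumption for this sketch only
timeout = int(settings.get("SHUTDOWN_TIMEOUT", "0"))
force_destroy = settings.get("FORCE_DESTROY") == "yes"
print(timeout, force_destroy)
```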
Working with cgroups (Optional)
Optionally, you can set up cgroups to control resources on your Hosts. The libvirt cgroups documentation describes all the cases and the way the cgroups are managed by libvirt/KVM.
Memory Cleanup (Optional)
Memory allocated by caches or memory fragmentation may cause the VM to fail to deploy, even if there appears to be enough memory on the Host. To avoid such failures and provide the best memory placement for the VMs, it’s possible to trigger memory cleanup and compaction before the VM starts and/or after the VM stops (by default enabled only on stop). The feature is configured in /var/lib/one/remotes/etc/vmm/kvm/kvmrc on the Front-end:
# Compact memory before running the VM
#CLEANUP_MEMORY_ON_START=yes
# Compact memory after VM stops
CLEANUP_MEMORY_ON_STOP=yes
KVM Specific Attributes
The following are template attributes specific to KVM. Please refer to the template reference documentation for a complete list of the attributes supported to define a VM.
TYPE: This attribute defines the type of media to be exposed to the VM; possible values are: cdrom. This attribute corresponds to the media option of the -drive argument of the
DRIVER: specifies the format of the disk image; possible values are qcow2… This attribute corresponds to the format option of the -drive argument of the
CACHE: specifies the optional cache mechanism; possible values are
IO: sets IO policy; possible values are
IOTHREAD: thread id used by this disk. It can only be used for virtio disk controllers and if
DISCARD: controls what to do with trim commands; the options are unmap. It can only be used with virtio-scsi.
- IO Throttling support - You can limit TOTAL/READ/WRITE throughput or IOPS. Burst control for these IO operations can also be set for each disk. See the reference guide for the attribute names and purpose.
TARGET: name for the tun device created for the VM. It corresponds to the ifname option of the ‘-net’ argument of the
SCRIPT: name of a shell script to be executed after creating the tun device for the VM. It corresponds to the script option of the ‘-net’ argument of the
QoS to control the network traffic. We can define different kinds of controls over network traffic:
MODEL: ethernet hardware to emulate. You can get the list of available models with this command:
kvm -net nic,model=? -nographic /dev/null
FILTER to define a network filtering rule for the interface. Libvirt includes some predefined rules (e.g. clean-traffic) that can be used. Check the Libvirt documentation for more information; you can also list the rules in your system with:
virsh -c qemu:///system nwfilter-list
VIRTIO_QUEUES to define how many queues will be used for the communication between CPUs and Network drivers. This attribute is only available with
If properly configured, libvirt and KVM can work with SPICE (check here for more information). To select it, just add the following to the
TYPE = SPICE
Enabling spice will also make the driver inject a specific configuration for these machines. The configuration can be changed in the driver configuration file, variable
Virtio is the framework for IO virtualization in KVM. You will need a Linux kernel with the virtio drivers for the guest. Check the KVM documentation for more info.
If you want to use the virtio drivers add the following attributes to your devices:
DISK, add the attribute
NIC, add the attribute
For disks you can also use the SCSI bus (sd), which uses the virtio-scsi controller. This controller is also fast because it does not emulate real hardware, and it adds support for trim commands to free disk space when the disk has the attribute DISCARD="unmap". If needed, you can change the number of vCPU queues this way:
FEATURES = [ VIRTIO_SCSI_QUEUES = 4 ]
The RAW attribute allows end users to pass custom libvirt/KVM attributes not yet supported by OpenNebula. Everything placed here is written literally into the KVM deployment file (use libvirt XML format and semantics). By default, the data is checked against the libvirt schema; you can selectively disable validation of the RAW data by adding VALIDATE="no" to the RAW section.
RAW = [
    TYPE = "kvm",
    VALIDATE = "yes",
    DATA = "<devices><serial type=\"pty\"><source path=\"/dev/pts/5\"/><target port=\"0\"/></serial><console type=\"pty\" tty=\"/dev/pts/5\"><source path=\"/dev/pts/5\"/><target port=\"0\"/></console></devices>" ]
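Because the DATA value is written literally into the deployment file, it can help to catch malformed XML and unescaped quotes before the template is submitted. The following sketch only checks that the fragment is well-formed XML; it does not reproduce the libvirt schema validation that OpenNebula performs. The helper names `is_well_formed` and `raw_section` are hypothetical and not part of any OpenNebula tool.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True when the fragment parses as XML under a synthetic root."""
    try:
        # A dummy root lets a fragment with several sibling elements parse
        ET.fromstring("<root>" + fragment + "</root>")
        return True
    except ET.ParseError:
        return False


def raw_section(data, validate=True):
    """Render a RAW template section, escaping double quotes inside DATA."""
    if not is_well_formed(data):
        raise ValueError("DATA is not well-formed XML")
    escaped = data.replace('"', '\\"')
    flag = "yes" if validate else "no"
    return 'RAW = [ TYPE = "kvm", VALIDATE = "%s", DATA = "%s" ]' % (flag, escaped)


if __name__ == "__main__":
    fragment = '<devices><serial type="pty"><target port="0"/></serial></devices>'
    print(raw_section(fragment))
```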
The following OpenNebula information is added to the metadata section of the Libvirt domain. The specific attributes are listed below:
They correspond to their OpenNebula equivalents for the XML representation of the VM.
deployment_time are the OpenNebula version used during the deployment and deployment time at epoch format, respectively.
The VM name is also included in the title field of the libvirt XML, so if the --title option is used when listing the libvirt domains, the VM name is shown alongside the domain name.
Live Resize VCPU and Memory
If you need to resize the capacity of a VM in the RUNNING state, you have to set some extra attributes in the VM template. These attributes must be set before the VM is started.
||Maximum number of VCPUs which can be hotplugged.||NO|
||Maximum memory which can be hotplugged.||NO|
||Optional slots for hotplugging memory. Limits the number of hotplug operations. Defaults to 8.||NO|
KVM supports hotplugging to the virtio and the SCSI buses. For disks, the bus the disk will be attached to is inferred from the DEV_PREFIX attribute of the disk template.
Hotplugging is not supported for CD-ROM and floppy.
If TARGET is passed instead of DEV_PREFIX, the same rules apply (behind the scenes, OpenNebula generates a TARGET based on the DEV_PREFIX if no TARGET is provided).
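The inference rules above can be pictured with a small lookup. This is an illustrative sketch only, not OpenNebula's code: the `sd` to SCSI pairing comes from this guide, while the `vd` and `hd` entries, the helper names, and the letter-based TARGET generation are assumptions made for the example.

```python
import string

# "sd" -> SCSI is stated in this guide; "vd" and "hd" are assumed here
BUS_BY_PREFIX = {"vd": "virtio", "sd": "scsi", "hd": "ide"}
HOTPLUG_BUSES = {"virtio", "scsi"}


def bus_for(disk):
    """Infer the bus from TARGET when present, otherwise from DEV_PREFIX."""
    name = disk.get("TARGET") or disk.get("DEV_PREFIX", "")
    return BUS_BY_PREFIX.get(name[:2])


def can_hotplug(disk):
    """CD-ROM and floppy are excluded; other disks need virtio or SCSI."""
    if disk.get("TYPE", "").lower() in ("cdrom", "floppy"):
        return False
    return bus_for(disk) in HOTPLUG_BUSES


def default_target(prefix, index):
    """Hypothetical TARGET generation: prefix plus a letter (vda, vdb, ...)."""
    return prefix + string.ascii_lowercase[index]


if __name__ == "__main__":
    for disk in ({"DEV_PREFIX": "vd"}, {"TARGET": "sdb"}, {"DEV_PREFIX": "hd"}):
        print(disk, bus_for(disk), can_hotplug(disk))
```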
The defaults for newly attached disks and NICs are in /var/lib/one/remotes/etc/vmm/kvm/kvmrc. The relevant parameters are prefixed with DEFAULT_ATTACH_ and explained in the Files and Parameters section below.
For Disks and NICs, if the guest OS is a Linux flavor, the guest needs to be explicitly told to rescan the PCI bus. This can be done by issuing the following command as root:
echo 1 > /sys/bus/pci/rescan
Enabling QEMU Guest Agent
The QEMU Guest Agent allows some actions to be communicated to the guest OS. The agent uses a virtio serial connection to send and receive commands. One useful action is freezing the filesystem before taking a snapshot, so that the snapshot won’t contain half-written data. Filesystem freeze will only be used with qcow2 storage drivers.
The agent package needed in the Guest OS is available in most distributions. It’s called qemu-guest-agent in most of them. If you need more information you can follow these links:
The communication channel with the guest agent is enabled in the domain XML when the GUEST_AGENT feature is selected in the VM Template.
Tuning & Extending
Multiple Actions per Host
This feature is experimental. Some modifications to the code must be made before this is a recommended setup.
By default the drivers use a unix socket to communicate with the libvirt daemon. This method can only be safely used by one process at a time. To make sure this happens, the drivers are configured to send only one action per Host at a time. For example, there will be only one deployment done per Host at a given time.
This limitation can be solved by configuring libvirt to accept TCP connections and OpenNebula to use this communication method.
Here we describe how to configure libvirtd to accept unencrypted and unauthenticated TCP connections on a CentOS 7 machine. For other setups, check your distribution and libvirt documentation.
Change the file /etc/libvirt/libvirtd.conf in each of the hypervisors and make sure that these parameters are set and have the following values:
listen_tls = 0
listen_tcp = 1
tcp_port = "16509"
auth_tcp = "none"
You will also need to modify /etc/sysconfig/libvirtd and uncomment this line:
After modifying these files the libvirt daemon must be restarted:
sudo systemctl restart libvirtd
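Since the same four libvirtd.conf values have to be set on every hypervisor, a small check can catch a Host that was missed. The sketch below compares the text of a libvirtd.conf file against the values listed above; the helper names `parse_conf` and `mismatches` are made up for this example, and the parser handles only simple `key = value` lines, which is an assumption about the file layout.

```python
# Values this guide asks for in /etc/libvirt/libvirtd.conf
EXPECTED = {
    "listen_tls": "0",
    "listen_tcp": "1",
    "tcp_port": "16509",
    "auth_tcp": "none",
}


def parse_conf(text):
    """Parse simple `key = value` lines, ignoring comments and quotes."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def mismatches(text, expected=EXPECTED):
    """Return {key: actual value} for keys that differ (None when unset)."""
    actual = parse_conf(text)
    return {k: actual.get(k) for k, v in expected.items() if actual.get(k) != v}


if __name__ == "__main__":
    sample = '#listen_tcp = 1\nlisten_tls = 0\ntcp_port = "16509"\nauth_tcp = "sasl"\n'
    print(mismatches(sample))
```

An empty result means all four parameters match; a commented-out parameter is reported with the value `None`.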
The VMM driver must be configured to allow more than one action to be executed per Host. This is done by adding the parameter -p to the driver executable in the VM_MAD configuration section of /etc/one/oned.conf:
VM_MAD = [
    NAME       = "kvm",
    EXECUTABLE = "one_vmm_exec",
    ARGUMENTS  = "-t 15 -r 0 kvm -p",
    DEFAULT    = "vmm_exec/vmm_exec_kvm.conf",
    TYPE       = "kvm" ]
Change the file /var/lib/one/remotes/etc/vmm/kvm/kvmrc to set a TCP endpoint for libvirt communication:
The scheduler configuration should also be changed to let it deploy more than one VM per Host. The file is located at /etc/one/sched.conf and the value to change is MAX_HOST. For example, to let the scheduler submit 10 VMs per Host use this line:
MAX_HOST = 10
After this, update the remote files on the nodes and restart OpenNebula:
onehost sync --force
sudo systemctl restart opennebula
Files and Parameters
The driver consists of the following files:
/usr/lib/one/mads/one_vmm_exec: generic VMM driver.
/var/lib/one/remotes/vmm/kvm: commands executed to perform actions.
And the following driver configuration files:
/etc/one/vmm_exec/vmm_exec_kvm.conf: This file contains default values for KVM domain definitions (in other words, OpenNebula templates). It is generally a good idea to set defaults here for the KVM-specific attributes, that is, attributes that are mandatory in the KVM driver but not for other hypervisors. Defaults are also recommended for attributes that are specific to KVM but not mandatory. Changes to this file require OpenNebula to be restarted.
/var/lib/one/remotes/etc/vmm/kvm/kvmrc: This file holds instructions to be executed before the actual driver load to perform specific tasks or to pass environmental variables to the driver. The syntax used for the former is plain shell script that will be evaluated before the driver execution. For the latter, the syntax is the familiar:
The parameters that can be changed here are as follows:
||Connection string to libvirtd|
||Protocol used for live migrations|
||Seconds to wait after shutdown until timeout|
||Number of “virsh” command retries when required. Currently used in detach-interface and restore.|
||Trigger VM time synchronization from RTC on resume and after migration. QEMU guest agent must be running. Valid values:
||Force VM cancellation after shutdown timeout|
||Force VMs without ACPI enabled to be destroyed on shutdown|
||Set options for the virsh migrate command|
||Compact memory before running the VM. Values
||Compact memory after VM stops. Values
||This parameter will set the default cache type for new attached disks. It will be used in case the attached disk does not have a specific cache method set (can be set using templates when attaching a disk).|
||Default discard option for newly attached disks, if the attribute is missing in the template.|
||Default I/O policy for newly attached disks, if the attribute is missing in the template.|
||Default total bytes/s I/O throttling for newly attached disks, if the attribute is missing in the template.|
||Default Maximum total bytes/s I/O throttling for newly attached disks, if the attribute is missing in the template.|
||Default Maximum length total bytes/s I/O throttling for newly attached disks, if the attribute is missing in the template.|
||Default read bytes/s I/O throttling for newly attached disks, if the attribute is missing in the template.|
||Default Maximum read bytes/s I/O throttling for newly attached disks, if the attribute is missing in the template.|
||Default Maximum length read bytes/s I/O throttling for newly attached disks, if the attribute is missing in the template.|
||Default write bytes/s I/O throttling for newly attached disks, if the attribute is missing in the template.|
||Default Maximum write bytes/s I/O throttling for newly attached disks, if the attribute is missing in the template.|
||Default Maximum length write bytes/s I/O throttling for newly attached disks, if the attribute is missing in the template.|
||Default total IOPS throttling for newly attached disks, if the attribute is missing in the template.|
||Default Maximum total IOPS throttling for newly attached disks, if the attribute is missing in the template.|
||Default Maximum length total IOPS throttling for newly attached disks, if the attribute is missing in the template.|
||Default read IOPS throttling for newly attached disks, if the attribute is missing in the template.|
||Default Maximum read IOPS throttling for newly attached disks, if the attribute is missing in the template.|
||Default Maximum length read IOPS throttling for newly attached disks, if the attribute is missing in the template.|
||Default write IOPS throttling for newly attached disks, if the attribute is missing in the template.|
||Default Maximum write IOPS throttling for newly attached disks, if the attribute is missing in the template.|
||Default Maximum length write IOPS throttling for newly attached disks, if the attribute is missing in the template.|
||Default size of IOPS throttling for newly attached disks, if the attribute is missing in the template.|
||Default NIC model for newly attached NICs, if the attribute is missing in the template.|
||Default NIC libvirt filter for newly attached NICs, if the attribute is missing in the template.|
See the Virtual Machine drivers reference for more information.
Image Magic Is Incorrect
When trying to restore the VM from a suspended state this error is returned:
libvirtd1021: operation failed: image magic is incorrect
It can be fixed by applying:
options kvm_intel nested=0
options kvm_intel emulate_invalid_guest_state=0
options kvm ignore_msrs=1