SAN Datastore

This storage configuration assumes that Hosts have access to storage devices (LUNs) exported by a Storage Area Network (SAN) server using a suitable protocol such as iSCSI or Fibre Channel. The Hosts access these devices through the LVM abstraction layer, and Virtual Machines run from an LV (Logical Volume) device instead of plain files. This reduces the overhead of having a filesystem in place and thus may increase I/O performance.

Disk images are stored as regular files in the Image Datastore and then dumped into an LV when a Virtual Machine is created. The image files are transferred to the Host over the SSH protocol. Additionally, LVM Thin can be enabled to support thin snapshots of the VM disks.

SAN Appliance Configuration

First of all, you need to configure your SAN appliance to export the LUN(s) where the VMs will be deployed. Depending on the manufacturer, the process may differ slightly, so please refer to the specific guides if your hardware is on the supported list, or to your hardware vendor's guides otherwise.

Also included in the above guides is a specific multipath configuration for both the Front-end and virtualization hosts, which is recommended over the more general multipath configuration presented below.

Hypervisor Configuration

First, we need to configure the hypervisors for LVM operations over the shared SAN storage.

Hosts LVM Configuration

  • LVM2 must be available on Hosts.
  • lvmetad must be disabled. Set this parameter in /etc/lvm/lvm.conf: use_lvmetad = 0, and disable the lvm2-lvmetad.service if running.
  • oneadmin needs to belong to the disk group.
  • All the nodes need to have access to the same LUNs.

Virtual Machine disks are symbolic links to the block devices. However, additional VM files like checkpoints or deployment files are stored under /var/lib/one/datastores/<id>. Be sure that enough local space is present.
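
A minimal sketch of the host-side setup for the checklist above; the package names and the lvmetad step are assumptions that depend on your distribution and LVM version (recent LVM releases no longer ship lvmetad):

# === Install LVM2 ===
# RedHat derivatives:
sudo dnf install -y lvm2
# Ubuntu/Debian:
sudo apt update && sudo apt install -y lvm2

# === Disable lvmetad (only on LVM versions that still provide it) ===
# Set "use_lvmetad = 0" in /etc/lvm/lvm.conf, then:
sudo systemctl disable --now lvm2-lvmetad.service lvm2-lvmetad.socket

# === Allow oneadmin to access the block devices ===
sudo usermod -aG disk oneadmin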

Hosts SAN Configuration

In the end, the abstraction required to access the LUNs is just block devices. This means there are several ways to set them up, although it will usually involve a network block protocol such as iSCSI or Fibre Channel, plus some way to make the paths redundant, such as DM Multipath.

Here is a sample session for setting up access via iSCSI and multipath:

# === ISCSI ===

TARGET_IP="192.168.1.100"                             # IP of SAN appliance
TARGET_IQN="iqn.2023-01.com.example:storage.target1"  # iSCSI Qualified Name

# === Install tools ===
# RedHat derivatives:
sudo dnf install -y iscsi-initiator-utils
# Ubuntu/Debian:
sudo apt update && sudo apt install -y open-iscsi

# === Enable iSCSI services ===
# RedHat derivatives:
sudo systemctl enable --now iscsid
# Ubuntu/Debian:
sudo systemctl enable --now open-iscsi

# === Discover targets ===
sudo iscsiadm -m discovery -t sendtargets -p "$TARGET_IP"

# === Log in to the target ===
sudo iscsiadm -m node -T "$TARGET_IQN" -p "$TARGET_IP" --login

# === Make login persistent across reboots ===
sudo iscsiadm -m node -T "$TARGET_IQN" -p "$TARGET_IP" \
     --op update -n node.startup -v automatic

# === MULTIPATH ===

# === Install tools ===
# RedHat derivatives:
sudo dnf install -y device-mapper-multipath
# Ubuntu/Debian:
sudo apt update && sudo apt install -y multipath-tools

# === Enable multipath daemon ===
sudo systemctl enable --now multipathd

# === Create multipath config file ===
sudo tee /etc/multipath.conf > /dev/null <<EOF
defaults {
    user_friendly_names yes
    find_multipaths yes
}
# Optional: blacklist local boot disks if needed
# blacklist {
#     devnode "^sd[a-z]"
# }
EOF

# === Reload multipath ===
sudo multipath -F    # Flush unused multipath device maps (safe if not in use)
sudo multipath       # Re-scan for multipath devices
sudo systemctl restart multipathd

# === Show current multipath devices ===
sudo multipath -ll
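
Optionally, verify that the sessions and block devices are in place; the output and device names below will vary with your setup:

# === Verify sessions and block devices ===
sudo iscsiadm -m session    # active iSCSI sessions
lsblk                       # multipath devices show up under /dev/mapper/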

Front-end Configuration

The Front-end needs access to the shared SAN server in order to perform LVM operations. It can either access it directly or use some host(s) as a proxy/bridge.

For direct access, the Front-end needs to be configured in the same way as the hosts (following the previous section); no further configuration is needed. Example for illustration purposes:

-------------
| Front-end | ---- /dev/mapper/mpath* ------+
-------------     (iSCSI + multipath)       |
                                            v
  ---------                              --------------
  | host2 | ---- /dev/mapper/mpath* ---> | SAN server |
  ---------     (iSCSI + multipath)      --------------
                                            ^
                                            |
  ---------                                 |
  | hostN | ---- /dev/mapper/mpath* --------+
  ---------     (iSCSI + multipath)

Otherwise, one or several hosts can be used to perform the required operations by defining the BRIDGE_LIST attribute on the Image Datastore later:

BRIDGE_LIST=host2

-------------
| Front-end |
-------------
      |
      | use as proxy for operations
      |
      v
  ---------                              --------------
  | host2 | ---- /dev/mapper/mpath* ---> | SAN server |
  ---------     (iSCSI + multipath)      --------------
                                            ^
                                            |
  ---------                                 |
  | hostN | ---- /dev/mapper/mpath* --------+
  ---------     (iSCSI + multipath)

OpenNebula Configuration

First, we need to create the two required OpenNebula datastores: Image and System. Both of them will use the fs_lvm_ssh transfer driver (TM_MAD).

Create System Datastore

To create a new SAN/LVM System Datastore, you need to set the following (template) parameters:

Attribute   Description
---------   -----------
NAME        Name of the Datastore
TYPE        SYSTEM_DS
TM_MAD      fs_lvm_ssh
DISK_TYPE   BLOCK (used for volatile disks)

For example:

> cat ds_system.conf
NAME   = lvm_system
TM_MAD = fs_lvm_ssh
TYPE   = SYSTEM_DS
DISK_TYPE = BLOCK

> onedatastore create ds_system.conf
ID: 100

Afterwards, an LVM VG needs to be created on the shared LUN(s) for the System Datastore, named vg-one-<system_ds_id>. This step only needs to be done once, either on one Host or on the Front-end if it has access. This VG is where the actual VM images will be located at runtime; OpenNebula will take care of creating the LVs (one for each VM disk). For example, assuming /dev/mapper/mpatha is the LUN (iSCSI/multipath) block device:

# pvcreate /dev/mapper/mpatha
# vgcreate vg-one-100 /dev/mapper/mpatha

Create Image Datastore

To create a new LVM Image Datastore, you need to set the following (template) parameters:

Attribute        Description
---------        -----------
NAME             Name of the Datastore
TYPE             IMAGE_DS
DS_MAD           fs
TM_MAD           fs_lvm_ssh
DISK_TYPE        BLOCK
BRIDGE_LIST      List of Hosts with access to the file system where image files are stored before being dumped to logical volumes
LVM_THIN_ENABLE  (default: NO) Set to YES to enable LVM Thin functionality (RECOMMENDED)

The following example illustrates the creation of an LVM Image Datastore. In this case we will use the nodes node1 and node2 as our OpenNebula LVM-enabled Hosts.

> cat ds_image.conf
NAME = lvm_image
DS_MAD = fs
TM_MAD = fs_lvm_ssh
DISK_TYPE = "BLOCK"
TYPE = IMAGE_DS
BRIDGE_LIST = "node1 node2"
LVM_THIN_ENABLE = yes
SAFE_DIRS="/var/tmp /tmp"

> onedatastore create ds_image.conf
ID: 101

Front-end setup (Image Datastore)

The OpenNebula Front-end will keep the images used in the newly created Image Datastore in its /var/lib/one/datastores/<datastore_id>/ directory. The simplest case will just use the local storage in the Front-end, but you can mount any storage medium in that directory to support more advanced scenarios, such as sharing it via NFS in a Front-end HA setup or even using another LUN in the same SAN to keep everything in the same place. Here are some (non-exhaustive) examples of typical setups for the image datastore:

Option 1: image datastore local to the Front-end. Assuming the image datastore has ID 101:

# mkdir -p /var/lib/one/datastores/101/
# chown oneadmin:oneadmin /var/lib/one/datastores/101/

Option 2: image datastore on NFS. Assuming the image datastore has ID 101, and nfs-server exposes a share /srv/path_to_share:

# echo "nfs-server:/srv/path_to_share /var/lib/one/datastores/101/ nfs4 defaults 0 2" >> /etc/fstab
# mount /var/lib/one/datastores/101/
# chown oneadmin:oneadmin /var/lib/one/datastores/101/

Option 3: image datastore on LVM. Assuming the image datastore has ID 101, and /dev/sdb is a block device (either local to the Front-end, or on the SAN):

# pvcreate /dev/sdb
# vgcreate image-vg /dev/sdb
# lvcreate -l 100%FREE -n image-lv image-vg
# mkfs.ext4 /dev/image-vg/image-lv
# mkdir -p /var/lib/one/datastores/101/
# echo "/dev/image-vg/image-lv /var/lib/one/datastores/101/ ext4 defaults 0 2" >> /etc/fstab
# mount /var/lib/one/datastores/101/
# chown oneadmin:oneadmin /var/lib/one/datastores/101/

LVM Thin

You can toggle the LVM Thin functionality with the LVM_THIN_ENABLE attribute in the Image Datastore. It is recommended that you enable this mode, as it allows operations that are not possible in the standard, non-thin mode:

  • Creation of thin snapshots
  • Consistent live backups

You can take a look at the Datastore Internals section below for more information about the differences between thin and non-thin operation.
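
If the Image Datastore already exists, the attribute can also be added afterwards. A minimal sketch, assuming the datastore ID 101 from the example above and a CLI version that supports the --append flag:

> cat thin.tpl
LVM_THIN_ENABLE = "YES"

> onedatastore update 101 thin.tpl --append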

Driver Configuration

By default the LVM driver will zero any LVM volume so that VM data cannot leak to other instances. However, this process takes some time and may delay the deployment of a VM. The behavior of the driver can be configured in the file /var/lib/one/remotes/etc/fs_lvm/fs_lvm.conf, in particular:

Attribute           Description
---------           -----------
ZERO_LVM_ON_CREATE  Zero LVM volumes when they are created or resized
ZERO_LVM_ON_DELETE  Zero LVM volumes when VM disks are deleted
DD_BLOCK_SIZE       Block size for dd operations (default: 64kB)

Example:

#  Zero LVM volumes on creation or resizing
ZERO_LVM_ON_CREATE=no

#  Zero LVM volumes on delete, when the VM disks are disposed
ZERO_LVM_ON_DELETE=yes

#  Block size for the dd commands
DD_BLOCK_SIZE=32M
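
Since this file lives under /var/lib/one/remotes/, changes typically need to be pushed to the Hosts afterwards, for example:

> onehost sync --force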

The following attributes can be set for every datastore type; an illustrative snippet follows the list:

  • SUPPORTED_FS: Comma-separated list with every filesystem supported for creating formatted datablocks. Can be set in /var/lib/one/remotes/etc/datastore/datastore.conf.
  • FS_OPTS_<FS>: Options for creating the filesystem for formatted datablocks. Can be set in /var/lib/one/remotes/etc/datastore/datastore.conf for each filesystem type.
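
For illustration only (the exact FS_OPTS_<FS> names and values below are assumptions; check the comments shipped in datastore.conf itself), the file follows the same KEY=value style as fs_lvm.conf above:

# Hypothetical /var/lib/one/remotes/etc/datastore/datastore.conf snippet
SUPPORTED_FS="ext4,xfs"
FS_OPTS_EXT4=""     # extra mkfs options for ext4 formatted datablocks (assumed name)
FS_OPTS_XFS=""      # extra mkfs options for xfs formatted datablocks (assumed name)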

Datastore Internals

Images are stored as regular files (under the usual path: /var/lib/one/datastores/<id>) in the Image Datastore, but they will be dumped into a Logical Volume (LV) upon Virtual Machine creation. The Virtual Machines will run from Logical Volumes in the Host.

This is the recommended driver when a high-end SAN is available: the same LUN can be exported to all the Hosts, and Virtual Machines run directly from the SAN.

For example, consider a system with two Virtual Machines (9 and 10), each using one disk, running in an LVM System Datastore with ID 0. The Hosts have a shared LUN configured and a volume group named vg-one-0 created on it. The layout of the Datastore would be:

# lvs
  LV          VG       Attr       LSize Pool Origin Data%  Meta%  Move
  lv-one-10-0 vg-one-0 -wi------- 2.20g
  lv-one-9-0  vg-one-0 -wi------- 2.20g

LVM Thin Internals

In this mode, every launched VM allocates a dedicated Thin Pool containing one Thin LV per disk. So a VM (with ID 11) with two disks would be instantiated as follows:

# lvs
  LV              VG       Attr       LSize   Pool            Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv-one-11-0     vg-one-0 Vwi-aotz-- 256.00m lv-one-11-pool         48.44
  lv-one-11-1     vg-one-0 Vwi-aotz-- 256.00m lv-one-11-pool         48.46
  lv-one-11-pool  vg-one-0 twi---tz-- 512.00m                        48.45  12.60

The pool is the equivalent of a regular LV, and its total size is subtracted from the VG. The per-disk Thin LVs, on the other hand, are thinly provisioned: their blocks are allocated on demand from the associated pool.

Thin LVM snapshots are just a special case of Thin LV: they can be created from a base Thin LV instantly and consume no extra space, as all of their blocks are shared with the parent. From that moment on, changed data on the active parent is written to new blocks in the pool, which gradually requires extra space, while the “old” blocks referenced by previous snapshots are kept unchanged.

Let’s create a couple of snapshots of the first disk of the previous VM. As you can see, at the LVM level snapshots are no different from Thin LVs:

# lvs
  LV              VG       Attr       LSize   Pool            Origin       Data%  Meta%  Move Log Cpy%Sync Convert
  lv-one-11-0     vg-one-0 Vwi-aotz-- 256.00m lv-one-11-pool               48.44
  lv-one-11-0_s0  vg-one-0 Vwi---tz-k 256.00m lv-one-11-pool  lv-one-11-0
  lv-one-11-0_s1  vg-one-0 Vwi---tz-k 256.00m lv-one-11-pool  lv-one-11-0
  lv-one-11-1     vg-one-0 Vwi-aotz-- 256.00m lv-one-11-pool               48.46
  lv-one-11-pool  vg-one-0 twi---tz--   1.00g                              24.22  12.70
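
These snapshots are created by the driver itself. For illustration only, a roughly equivalent manual thin snapshot could be taken at the LVM level (names taken from the listing above):

# Illustrative: thin snapshot of the first disk of VM 11
lvcreate -s -n lv-one-11-0_s0 vg-one-0/lv-one-11-0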

For more details about the inner workings of LVM thin provisioning, please refer to the lvmthin(7) man page.

Troubleshooting

LVM Devices File

Problem: LVM does not show my iSCSI/multipath devices (e.g., with pvs), although I can see them with multipath -ll or lsblk.

Possible solution:

The LVM version shipped with some operating systems or Linux distributions does not, by default, scan the whole /dev directory for possible disks. Instead, devices must be explicitly whitelisted in /etc/lvm/devices/system.devices. You can check whether that is your case by running:

lvmconfig --type full devices/use_devicesfile

If it returns devices/use_devicesfile=1, then the devices file is being used and enforced. In that case, add the missing device to the devices file and check again:

# lvmdevices --adddev /dev/mapper/mpatha
# pvs