Xen
Several virtualisation products are based on the Xen Project hypervisor.
Metrics and debugging tools
- xentop – top for Xen to see all domains' CPU and I/O stats
- xl – Xen management tool, based on libxenlight
- xl info – prints a general overview of the hypervisor
Sizing
RAM allocated to the management domain
According to Oracle [1], the dom0 RAM allocation should be calculated with the formula dom0_mem = 502 + int(physical_mem * 0.0205), with both values in MB. For 64 GB of physical memory this gives 1845 MB of dom0 RAM, for 128 GB it gives 3188 MB, and for 512 GB it gives 11249 MB.
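The formula can be sketched as a small shell function. The function name dom0_mem_mb is made up for this example, and it assumes physical memory is given in MB. Shell arithmetic is integer-only, so the factor 0.0205 is written as 205/10000, which truncates the same way as int() for positive values.

```shell
# Hypothetical helper: dom0 RAM in MB for a given amount of physical RAM in MB.
# Implements dom0_mem = 502 + int(physical_mem * 0.0205) with integer maths.
dom0_mem_mb() {
    echo $(( 502 + $1 * 205 / 10000 ))
}

dom0_mem_mb 65536    # 64 GB  -> 1845
dom0_mem_mb 131072   # 128 GB -> 3188
dom0_mem_mb 524288   # 512 GB -> 11249
```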
Troubleshooting
Debugging resources:
Can't add a Storage Repository on a disk where an SR was before, due to SR_DEVICE_IN_USE()
The old disk may still be in the device list without being shown in Xen Orchestra or whichever other GUI you are running. Check with xe pbd-list and xe sr-list. The output could look something like this:
uuid ( RO) : a862334e-f51c-feb9-6086-7d42c30293bb
name-label ( RW): Local HDD (RAID 10)
name-description ( RW): Local HDD Storage
host ( RO): <not in database>
type ( RO): ext
content-type ( RO): user
In that case (host <not in database>), just forget the SR with xe sr-forget uuid=a862334e-f51c-feb9-6086-7d42c30293bb and try to add the SR again.
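To spot such leftover SRs without reading the whole listing, the xe sr-list output can be filtered. The following is only a sketch: orphaned_srs is a hypothetical helper name, and it assumes the record layout shown above, with the uuid line printed before the host line of each record.

```shell
# Hypothetical helper: reads `xe sr-list` output on stdin and prints the UUID
# of every SR whose host field is "<not in database>".
# Assumes each record starts with its "uuid ( RO) : ..." line.
orphaned_srs() {
    awk '
        /^uuid \(/ {
            sub(/^[^:]*:[ \t]*/, "")      # drop "uuid ( RO) : "
            sub(/[ \t]+$/, "")            # drop trailing whitespace
            uuid = $0
        }
        /^[ \t]*host \(/ {
            sub(/^[^:]*:[ \t]*/, "")      # drop "host ( RO): "
            sub(/[ \t]+$/, "")
            if ($0 == "<not in database>") print uuid
        }
    '
}
```

On the host it would be used as xe sr-list | orphaned_srs. Each printed UUID is a candidate for xe sr-forget, after checking by hand that the SR really is the stale one.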
VHD coalesce stuck / broken
Find out if coalesce is currently running:
ps axf | grep [c]oalesce
6491 ? R 0:05 \_ /usr/bin/vhd-util coalesce --debug -n /var/run/sr-mount/ed842b8c-49a7-a186-96cd-c18430404bf6/10014af7-27cf-4bb9-b3cd-a081ca470694.vhd
If nothing shows up, rescan the SR (xe sr-scan uuid=<SR_UUID>) and repeat the ps command.
Force coalesce
If the lock is active, remove it:
ls /var/lock/sm/<SR_UUID>/gc_active
rm /var/lock/sm/<SR_UUID>/gc_active
Rescan and wait up to 10 minutes while tailing SMlog:
tail --follow -n 500 /var/log/SMlog
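The lock removal step above can be wrapped in a small function that only deletes the lock file when it exists. This is a sketch, not a vetted tool: remove_gc_lock is a hypothetical name, and the optional second argument for the lock directory exists only so the function can be tried outside a Xen host.

```shell
# Hypothetical helper: remove the gc_active lock of one SR if it is present.
# $1 = SR UUID, $2 = lock base directory (defaults to /var/lock/sm).
remove_gc_lock() {
    lock="${2:-/var/lock/sm}/$1/gc_active"
    if [ -e "$lock" ]; then
        rm -- "$lock" && echo "removed $lock"
    else
        echo "no lock at $lock"
    fi
}
```

Before calling remove_gc_lock "<SR_UUID>", make sure the ps check shows no running coalesce process. Afterwards rescan the SR and follow SMlog as described.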
If coalesce errors
Scan all VDIs on the SR:
vhd-util scan -f -c -p -m /var/run/sr-mount/$SR_UUID/*.vhd
If the scan reports no error, check SMlog for coalesce errors such as the following (coalesce EXCEPTIONs):
Jan 28 15:56:11 xen01 SMGC: [23350] ***********************
Jan 28 15:56:11 xen01 SMGC: [23350] * E X C E P T I O N *
Jan 28 15:56:11 xen01 SMGC: [23350] ***********************
Jan 28 15:56:11 xen01 SMGC: [23350] coalesce: EXCEPTION <class 'util.SMException'>, VHD *811e7004(80.000G/2.155G) corrupted
Jan 28 15:56:11 xen01 SMGC: [23350] File "/opt/xensource/sm/cleanup.py", line 1542, in coalesce
Jan 28 15:56:11 xen01 SMGC: [23350] self._coalesce(vdi)
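In a long SMlog, the shortened VDI identifier from such exception lines can be pulled out with sed. This is a sketch: corrupted_vhds is a hypothetical helper name, and the pattern assumes the exact wording of the log line shown above ("coalesce: EXCEPTION ... VHD *<id>(...) corrupted").

```shell
# Hypothetical helper: reads SMlog lines on stdin and prints the (shortened)
# VDI identifiers that a coalesce exception reported as corrupted.
corrupted_vhds() {
    sed -n 's/.*coalesce: EXCEPTION.*VHD \*\{0,1\}\([0-9a-f-]*\)(.*corrupted.*/\1/p' | sort -u
}
```

It would be used as corrupted_vhds < /var/log/SMlog. The identifier is printed exactly as it appears in the log, so a shortened one still has to be matched against the full VDI UUIDs on the SR.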
Find the UUID of the offending VDI (here 811e7004…) and check it:
vhd-util check --debug -n /var/run/sr-mount/<SR_UUID>/<VDI_UUID>.vhd
Example output:
primary footer invalid: invalid cookie
Repair it:
vhd-util repair -n /var/run/sr-mount/<SR_UUID>/<VDI_UUID>.vhd
Recheck it; it should be OK now.
Remove the lock and wait a bit; coalesce should then start.
If that doesn't help, investigate further and possibly mv (not rm!) the corresponding VDI to see if a rescan can coalesce.
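The check, repair and recheck sequence from this section can be sketched as one function. Two things here are assumptions rather than facts from these notes: that vhd-util exits non-zero when check finds a problem, and the names check_and_repair_vhd and VHD_UTIL, which are made up for this example. Only the vhd-util invocations shown above are used.

```shell
# Command to run; overridable so the logic can be tried without vhd-util.
VHD_UTIL="${VHD_UTIL:-vhd-util}"

# Hypothetical helper: check a VHD, repair it if the check fails, then recheck.
# Assumes `vhd-util check` returns a non-zero exit status on a broken VHD.
check_and_repair_vhd() {
    vhd="$1"
    if "$VHD_UTIL" check --debug -n "$vhd"; then
        echo "OK: $vhd"
        return 0
    fi
    echo "check failed, attempting repair: $vhd" >&2
    "$VHD_UTIL" repair -n "$vhd" || return 1
    # The exit status of the recheck is the result of the function.
    "$VHD_UTIL" check --debug -n "$vhd"
}
```

A call would look like check_and_repair_vhd /var/run/sr-mount/<SR_UUID>/<VDI_UUID>.vhd. Copying the VHD somewhere safe before repairing is a sensible precaution, in the same spirit as the mv (not rm) advice.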
Commands
Find out UUID from inside the VM
dmidecode -s system-serial-number
# or:
xenstore read vm
Further reading
- XenServer Administration Handbook: Practical Recipes for Successful Deployments (Mackey and Benedict, 2016)