====== Xen ======
Several virtualisation products are based on the [[https://xenproject.org/|Xen project]] hypervisor:
* Citrix XenServer
* Vates [[https://xcp-ng.org/|XCP-ng]]
* [[https://www.oracle.com/virtualization/#rc30p4|Oracle VM]] Server for x86
===== Metrics and debugging tools =====
* xentop – top for Xen to see all domains' CPU and I/O stats
* [[https://xenbits.xen.org/docs/unstable/man/xl.1.html|xl]] – Xen management tool, based on libxenlight
  * ''xl info'' – show a general overview of the hypervisor
===== Sizing =====
==== RAM allocated to the management domain ====
Dom0 RAM allocation should be calculated with the formula ''dom0_mem = 502+int(physical_mem*0.0205)'' (both values in MB), according to Oracle[([[https://docs.oracle.com/cd/E35328_01/E35330/html/vmiug-server-dom0-memory.html|Installing Oracle VM Server on x86 – 2.3 Changing the Dom0 Memory Size]] (Oracle, 2014%%)%%)]. For 64GB of physical memory this gives 1845MB of dom0 RAM, for 128GB it gives 3188MB, and for 512GB it gives 11249MB.
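The formula above can be checked with a few lines of Python. A minimal sketch; the function name ''dom0_mem_mb'' is illustrative, not part of any Xen tool:

```python
def dom0_mem_mb(physical_mem_mb: int) -> int:
    """Oracle's rule of thumb for dom0 RAM: 502 MB base plus
    ~2.05% of physical memory (both values in MB)."""
    return 502 + int(physical_mem_mb * 0.0205)

# Reproduce the examples from the text above:
for gb in (64, 128, 512):
    print(f"{gb} GB physical -> {dom0_mem_mb(gb * 1024)} MB dom0")
# 64 GB physical -> 1845 MB dom0
# 128 GB physical -> 3188 MB dom0
# 512 GB physical -> 11249 MB dom0
```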
===== Troubleshooting =====
Debugging resources:
* [[https://docs.xenserver.com/en-us/citrix-hypervisor/developer/management-api/api-ref-autogen-errors.html|Citrix Hypervisor 8.2 API Reference - Error Handling]]
==== Can't add a Storage Repository on a disk that previously held an SR (SR_DEVICE_IN_USE) ====
The old SR may still be in the device list without being displayed in Xen Orchestra or whatever GUI you are running.
Check with ''xe pbd-list'' and ''xe sr-list''. The output could look something like this:
<code>
uuid ( RO) : a862334e-f51c-feb9-6086-7d42c30293bb
name-label ( RW): Local HDD (RAID 10)
name-description ( RW): Local HDD Storage
host ( RO):
type ( RO): ext
content-type ( RO): user
</code>
In that case (empty ''host'' field), just forget the SR with ''xe sr-forget uuid=a862334e-f51c-feb9-6086-7d42c30293bb'' and try to add the SR again.
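Finding such stale entries by hand gets tedious on hosts with many SRs. The ''key ( RO): value'' record format of ''xe'' output can be parsed with a small script; this is a sketch assuming records are separated by blank lines, and the function name is illustrative, not part of any Xen tooling:

```python
def parse_xe_records(text: str) -> list[dict[str, str]]:
    """Parse 'xe'-style 'key ( RO/RW): value' output into a list of
    dicts; records are separated by blank lines."""
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            if current:
                records.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        # drop the '( RO)' / '( RW)' permission marker from the key
        key = key.split("(")[0].strip()
        current[key] = value.strip()
    if current:
        records.append(current)
    return records

sample = """\
uuid ( RO) : a862334e-f51c-feb9-6086-7d42c30293bb
name-label ( RW): Local HDD (RAID 10)
name-description ( RW): Local HDD Storage
host ( RO):
type ( RO): ext
content-type ( RO): user
"""

# SRs with an empty host field are candidates for 'xe sr-forget'
stale = [r["uuid"] for r in parse_xe_records(sample) if not r.get("host")]
print(stale)  # ['a862334e-f51c-feb9-6086-7d42c30293bb']
```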
==== VHD coalesce stuck / broken ====
Find out if coalesce is currently running:
<code>
ps axf | grep [c]oalesce
6491 ? R 0:05 \_ /usr/bin/vhd-util coalesce --debug -n /var/run/sr-mount/ed842b8c-49a7-a186-96cd-c18430404bf6/10014af7-27cf-4bb9-b3cd-a081ca470694.vhd
</code>
If nothing shows up, rescan the SR and repeat the ''ps''.
=== Force coalesce ===
If the lock is active, remove it:
<code>
ls /var/lock/sm//gc_active
rm /var/lock/sm//gc_active
</code>
Rescan and wait up to 10 minutes while tailing the SMlog:
<code>
tail --follow -n 500 /var/log/SMlog
</code>
=== If coalesce errors ===
Scan all VDIs on the SR:
<code>
vhd-util scan -f -c -p -m /var/run/sr-mount/$SR_UUID/*.vhd
</code>
If the scan reports no error, look for coalesce exceptions in ''/var/log/SMlog'', like these:
<code>
Jan 28 15:56:11 xen01 SMGC: [23350] ***********************
Jan 28 15:56:11 xen01 SMGC: [23350] * E X C E P T I O N *
Jan 28 15:56:11 xen01 SMGC: [23350] ***********************
Jan 28 15:56:11 xen01 SMGC: [23350] coalesce: EXCEPTION , VHD *811e7004(80.000G/2.155G) corrupted
Jan 28 15:56:11 xen01 SMGC: [23350] File "/opt/xensource/sm/cleanup.py", line 1542, in coalesce
Jan 28 15:56:11 xen01 SMGC: [23350] self._coalesce(vdi)
</code>
Find the incriminated UUID (here ''811e7004…'') and check the corresponding VHD:
<code>
vhd-util check --debug -n /var/run/sr-mount//.vhd
</code>
Example output:
<code>
primary footer invalid: invalid cookie
</code>
Repair it:
<code>
vhd-util repair -n /var/run/sr-mount//.vhd
</code>
Recheck it; it should now be OK.
Remove the lock and wait a bit; the coalesce should start.
If that doesn't help, investigate further and possibly ''mv'' (not ''rm''!) the corresponding VDI to see if a rescan can coalesce.
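On a long SMlog it helps to pull out every VHD the garbage collector flagged as corrupted in one pass. A minimal Python sketch, assuming the log lines match the sample format shown above; the regex and function name are illustrative, not part of any Xen tooling:

```python
import re

# Matches SMGC coalesce exception lines like the sample above and
# extracts the (possibly truncated) VHD identifier marked corrupted.
CORRUPT_RE = re.compile(r"VHD \*?([0-9a-f-]+)\(.*\) corrupted")

def corrupted_vhds(smlog_lines):
    """Return the VHD identifiers flagged as corrupted in SMlog lines."""
    return [m.group(1) for line in smlog_lines
            if (m := CORRUPT_RE.search(line))]

sample = [
    "Jan 28 15:56:11 xen01 SMGC: [23350] coalesce: EXCEPTION , "
    "VHD *811e7004(80.000G/2.155G) corrupted",
]
print(corrupted_vhds(sample))  # ['811e7004']
```

Each identifier returned is a candidate for the ''vhd-util check'' / ''vhd-util repair'' steps above.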
===== Commands =====
==== Find out UUID from inside the VM ====
<code>
dmidecode -s system-serial-number
# or:
xenstore read vm
</code>
===== Further reading =====
* [[https://www.amazon.de/Xenserver-Administration-Handbook-Successful-Deployments/dp/149193543X|XenServer Administration Handbook: Practical Recipes for Successful Deployments]] (Mackey and Benedict, 2016)