====== Xen ======

Several virtualisation products are based on the [[https://xenproject.org/|Xen project]] hypervisor:

  * Citrix XenServer
  * Vates [[https://xcp-ng.org/|XCP-ng]]
  * [[https://www.oracle.com/virtualization/#rc30p4|Oracle VM]] Server for x86

===== Metrics and debugging tools =====

  * xentop – top for Xen, shows CPU and I/O stats for all domains
  * [[https://xenbits.xen.org/docs/unstable/man/xl.1.html|xl]] – Xen management tool, based on libxenlight
    * ''xl info'' – shows a general overview of the hypervisor

===== Sizing =====

==== RAM allocated to the management domain ====

Dom0 RAM allocation should be calculated by the formula ''dom0_mem = 502+int(physical_mem*0.0205)'' (all values in MB) according to Oracle[([[https://docs.oracle.com/cd/E35328_01/E35330/html/vmiug-server-dom0-memory.html|Installing Oracle VM Server on x86 – 2.3 Changing the Dom0 Memory Size]] (Oracle, 2014%%)%%)]. For 64GB of physical memory this gives 1845MB of dom0 RAM, for 128GB it gives 3188MB, and for 512GB it gives 11249MB.

===== Troubleshooting =====

Debugging resources:

  * [[https://docs.xenserver.com/en-us/citrix-hypervisor/developer/management-api/api-ref-autogen-errors.html|Citrix Hypervisor 8.2 API Reference - Error Handling]]

==== Can't add Storage Repository on disk where an SR was before due to SR_DEVICE_IN_USE() ====

The old disk may still be in the device list without being displayed in Xen Orchestra or whatever GUI you are running. Check with ''xe pbd-list'' and ''xe sr-list''. The output could look something like this:

  uuid ( RO) : a862334e-f51c-feb9-6086-7d42c30293bb
  name-label ( RW): Local HDD (RAID 10)
  name-description ( RW): Local HDD Storage
  host ( RO):
  type ( RO): ext
  content-type ( RO): user

In that case (host ''''), just forget the SR with ''xe sr-forget uuid=a862334e-f51c-feb9-6086-7d42c30293bb'' and try to add the SR again.

==== VHD coalesce stuck / broken ====

Find out whether a coalesce is currently running:

  ps axf | grep [c]oalesce
  6491 ? R 0:05 \_ /usr/bin/vhd-util coalesce --debug -n /var/run/sr-mount/ed842b8c-49a7-a186-96cd-c18430404bf6/10014af7-27cf-4bb9-b3cd-a081ca470694.vhd

If nothing shows up, rescan the SR and repeat the ''ps''.

=== Force coalesce ===

If the lock is active, remove it:

  ls /var/lock/sm//gc_active
  rm /var/lock/sm//gc_active

Rescan and wait up to 10 minutes while tailing SMlog:

  tail --follow -n 500 /var/log/SMlog

If the coalesce fails, scan all VDIs on the SR:

  vhd-util scan -f -c -p -m /var/run/sr-mount/$SR_UUID/*.vhd

If the scan reports no error, check SMlog for coalesce errors such as this one (coalesce EXCEPTION):

  Jan 28 15:56:11 xen01 SMGC: [23350] ***********************
  Jan 28 15:56:11 xen01 SMGC: [23350] * E X C E P T I O N *
  Jan 28 15:56:11 xen01 SMGC: [23350] ***********************
  Jan 28 15:56:11 xen01 SMGC: [23350] coalesce: EXCEPTION , VHD *811e7004(80.000G/2.155G) corrupted
  Jan 28 15:56:11 xen01 SMGC: [23350] File "/opt/xensource/sm/cleanup.py", line 1542, in coalesce
  Jan 28 15:56:11 xen01 SMGC: [23350] self._coalesce(vdi)

Find the offending UUID (here 811e7004…) and check it:

  vhd-util check --debug -n /var/run/sr-mount//.vhd

Example output:

  primary footer invalid: invalid cookie

Repair it:

  vhd-util repair -n /var/run/sr-mount//.vhd

Check it again; it should be OK now. Remove the lock and wait a bit; the coalesce should start.

If that doesn't help, investigate further and possibly ''mv'' (not ''rm''!) the corresponding VDI to see whether a rescan can then coalesce.

===== Commands =====

==== Find out UUID from inside the VM ====

  dmidecode -s system-serial-number
  # or:
  xenstore read vm

===== Further reading =====

  * [[https://www.amazon.de/Xenserver-Administration-Handbook-Successful-Deployments/dp/149193543X|XenServer Administration Handbook: Practical Recipes for Successful Deployments]] (Mackey and Benedict, 2016)
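
As a worked example of the dom0 sizing formula from the Sizing section, here is a minimal shell sketch that reproduces the three figures quoted there. It assumes all values are in MB, and the function name ''dom0_mem_mb'' is a hypothetical helper, not part of any Xen tooling. The factor 0.0205 is written as 205/10000 so that the shell's truncating integer division stands in for ''int()''.

```shell
#!/bin/sh
# Sketch of the sizing rule: dom0_mem = 502 + int(physical_mem * 0.0205)
# All values are in MB. dom0_mem_mb is a hypothetical helper name.
dom0_mem_mb() {
    physical_mb=$1
    # 0.0205 == 205/10000; integer division truncates, matching int()
    echo $((502 + physical_mb * 205 / 10000))
}

# Print the recommended dom0 RAM for a few common host sizes
for gb in 64 128 512; do
    echo "${gb} GB physical -> $(dom0_mem_mb $((gb * 1024))) MB dom0"
done
```

Call it with the host's physical memory in MB, for example ''dom0_mem_mb 65536'' for a 64GB machine.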