
Xen

Several virtualisation products are based on the Xen Project hypervisor. Useful tools:

  • xentop – top for Xen to see all domains' CPU and I/O stats
  • xl – Xen management tool, based on libxenlight
    • xl info – show a general overview of the hypervisor

According to Oracle [1], dom0 RAM should be calculated with the formula dom0_mem = 502+int(physical_mem*0.0205), with both values in MB. For 64GB of physical memory this gives 1845MB of dom0 RAM, for 128GB it gives 3188MB, and for 512GB it gives 11249MB.
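
The formula is easy to reproduce in shell. This is a minimal sketch, assuming physical_mem is given in MB with 1GB = 1024MB (which matches the figures above); dom0_mem_mb is a hypothetical helper name, not a Xen tool:

```shell
#!/bin/sh
# dom0_mem_mb PHYS_MB (hypothetical helper): print the dom0 RAM in MB
# for a host with PHYS_MB megabytes of physical RAM.
# Implements dom0_mem = 502 + int(physical_mem * 0.0205) with integer
# arithmetic: 0.0205 = 205/10000, and integer division truncates like int().
dom0_mem_mb() {
    echo $(( 502 + $1 * 205 / 10000 ))
}

# Reproduce the three examples (assumption: 1GB = 1024MB).
for gb in 64 128 512; do
    echo "${gb}GB -> $(dom0_mem_mb $(( gb * 1024 )))MB"
done
```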

If an SR cannot be added, the old disk may still be in the device list without being displayed in Xen Orchestra or whatever other GUI you are running.

Check with xe pbd-list and xe sr-list. Output could look something like this:

uuid ( RO)                : a862334e-f51c-feb9-6086-7d42c30293bb 
         name-label ( RW): Local HDD (RAID 10) 
   name-description ( RW): Local HDD Storage 
               host ( RO): <not in database> 
               type ( RO): ext
       content-type ( RO): user

In that case (host <not in database>), just forget the SR with xe sr-forget uuid=a862334e-f51c-feb9-6086-7d42c30293bb and try to add the SR again.
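
On a host with many SRs, the listing can be filtered for such entries. This is a sketch only, assuming the default xe sr-list record layout shown above (a uuid line followed by indented fields); orphaned_sr_uuids is a hypothetical helper name:

```shell
#!/bin/sh
# orphaned_sr_uuids (hypothetical helper): read `xe sr-list` output on
# stdin and print the UUID of every SR whose host is "<not in database>".
# Assumes each record starts with a "uuid ( RO) : <uuid>" line.
orphaned_sr_uuids() {
    awk -F': ' '
        # Remember the UUID of the record we are currently in.
        /^uuid / { uuid = $2; sub(/[ \t]+$/, "", uuid) }
        # Print it if this record has no host in the database.
        /host \( RO\)/ && /<not in database>/ { print uuid }
    '
}

# Usage on the Xen host:
#   xe sr-list | orphaned_sr_uuids
```

Review the printed UUIDs by hand before passing any of them to xe sr-forget.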

Find out if coalesce is currently running:

ps axf | grep '[c]oalesce'
 
 6491 ?        R      0:05      \_ /usr/bin/vhd-util coalesce --debug -n /var/run/sr-mount/ed842b8c-49a7-a186-96cd-c18430404bf6/10014af7-27cf-4bb9-b3cd-a081ca470694.vhd

If nothing shows up, rescan the SR (xe sr-scan uuid=<SR_UUID>) and run the ps command again.

Force coalesce

If the lock is active, remove it:

ls /var/lock/sm/<SR_UUID>/gc_active
rm /var/lock/sm/<SR_UUID>/gc_active

Rescan the SR and wait up to 10 minutes while tailing SMlog:

tail --follow -n 500 /var/log/SMlog

If coalesce errors

Scan all VDIs on the SR:

vhd-util scan -f -c -p -m /var/run/sr-mount/<SR_UUID>/*.vhd

If the scan reports no error, look in SMlog for coalesce exceptions like these:

Jan 28 15:56:11 xen01 SMGC: [23350]          ***********************
Jan 28 15:56:11 xen01 SMGC: [23350]          *  E X C E P T I O N  *
Jan 28 15:56:11 xen01 SMGC: [23350]          ***********************
Jan 28 15:56:11 xen01 SMGC: [23350] coalesce: EXCEPTION <class 'util.SMException'>, VHD *811e7004(80.000G/2.155G) corrupted
Jan 28 15:56:11 xen01 SMGC: [23350]   File "/opt/xensource/sm/cleanup.py", line 1542, in coalesce
Jan 28 15:56:11 xen01 SMGC: [23350]     self._coalesce(vdi)
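
In a long SMlog, the affected VDIs can be pulled out with a small filter. This is a sketch, assuming the exception lines have exactly the format shown above and that the short ID in the log is the beginning of the VDI UUID; corrupted_vdi_prefixes is a hypothetical helper name:

```shell
#!/bin/sh
# corrupted_vdi_prefixes (hypothetical helper): read SMlog lines on stdin
# and print the short VDI ID from every line of the form
#   ... coalesce: EXCEPTION ..., VHD *<id>(<sizes>) corrupted
# Duplicates are collapsed, because the GC retries and logs repeatedly.
corrupted_vdi_prefixes() {
    sed -n 's/.*coalesce: EXCEPTION.*VHD \*\([0-9a-f]*\)(.*corrupted.*/\1/p' | sort -u
}

# Usage on the Xen host:
#   corrupted_vdi_prefixes < /var/log/SMlog
#   ls /var/run/sr-mount/<SR_UUID>/<short id>*.vhd
```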

Find the offending VDI UUID (here 811e7004…) and check it:

vhd-util check --debug -n /var/run/sr-mount/<SR_UUID>/<VDI_UUID>.vhd

Example output:

primary footer invalid: invalid cookie

Repair it:

vhd-util repair -n /var/run/sr-mount/<SR_UUID>/<VDI_UUID>.vhd

Check it again; it should be OK now.
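
The check, repair and recheck steps can be wrapped into one function. This is a sketch under two assumptions that are not confirmed by the notes above: that vhd-util check exits non-zero for a damaged file, and that a repair is worth attempting on any failed check. check_and_repair_vhd and the VHD_UTIL variable are hypothetical names:

```shell
#!/bin/sh
# VHD_UTIL can be overridden, e.g. to point at a stub for a dry run.
VHD_UTIL=${VHD_UTIL:-vhd-util}

# check_and_repair_vhd FILE (hypothetical helper): check FILE, repair it
# only if the check fails, then check again. Returns the result of the
# final check, or 1 if the repair itself fails.
check_and_repair_vhd() {
    vhd=$1
    # Assumption: a non-zero exit status means the VHD is damaged.
    if "$VHD_UTIL" check --debug -n "$vhd"; then
        echo "ok: $vhd"
        return 0
    fi
    echo "check failed, repairing: $vhd" >&2
    "$VHD_UTIL" repair -n "$vhd" || return 1
    "$VHD_UTIL" check --debug -n "$vhd"
}

# Usage on the Xen host:
#   check_and_repair_vhd /var/run/sr-mount/<SR_UUID>/<VDI_UUID>.vhd
```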

Remove the lock and wait a bit; coalesce should start.

If that doesn't help, investigate further and possibly move (mv, not rm!) the corresponding VDI out of the way to see whether a rescan can then coalesce.

dmidecode -s system-serial-number 
# or:
xenstore read vm
  • Last modified: 2023-11-02 17:37