
Xen v4.8 on Fedora v26 IRQ balance for xen_netback (vif)

A Xen dom0 host isn't distributing network (xen_netback) interrupts across cores, resulting in poor network performance (video is broken and exhibits pixelation). All interrupts are serviced on the first dom0 core.

The dom0 host is given 4G of RAM and 4 cores. All netback (vif) interfaces are serviced by the first core of dom0. The irqbalance daemon (v1.2) is running in both dom0 and domU.

Theory

The kernel and the tools in Fedora with kernel v4.13.4 disagree about the format of '/proc/interrupts'. The kernel output has a slightly different format, where the name of Xen event-based interrupts is split from "xen-dyn-event" into "xen-dyn    -event". This means that IRQBalance doesn't recognise those interrupts. For example, the '/proc/interrupts' from Fedora with the v4.13.4 kernel:

 18:      71817          0   xen-dyn    -event     eth0-q0-tx
 19:     108047          0   xen-dyn    -event     eth0-q0-rx
 20:      52184          0   xen-dyn    -event     eth0-q1-tx
 21:      37035          0   xen-dyn    -event     eth0-q1-rx
241:        258     158517          0          0   xen-dyn    -event     vif26.0-q0-tx
242:          1          0          0          0   xen-dyn    -event     vif26.0-q0-rx
243:        303          0          0     157885   xen-dyn    -event     vif26.0-q1-tx
244:          1          0          0          0   xen-dyn    -event     vif26.0-q1-rx
245:        133     378112          0          0   xen-dyn    -event     vif26.1-q0-tx
246:          1          0          0          0   xen-dyn    -event     vif26.1-q0-rx
247:       1400          0          0    3457479   xen-dyn    -event     vif26.1-q1-tx
248:          1          0          0          0   xen-dyn    -event     vif26.1-q1-rx

In the dom0 VM the vif interface interrupts are not distributed intelligently across the available cores (e.g. they are not spread evenly by VIF domain, VIF interface index, VIF queue index or VIF tx/rx direction).
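
To see how the vif interrupts are actually spread, the per-CPU counts can be totalled from '/proc/interrupts'. The following is a minimal sketch, not part of the original workaround; the function name `vif_irq_totals` is made up for illustration, and it assumes the interrupt name is the last field on each line and that the count columns are the purely numeric fields after the IRQ label.

```bash
#!/bin/bash
# Sum the per-CPU interrupt counts of all vif (xen_netback) interrupts.
# Reads /proc/interrupts-style text on stdin.
vif_irq_totals() {
    awk '
        $NF ~ /^vif[0-9]+\.[0-9]+/ {
            # Count columns are the numeric fields that follow the "NNN:" label.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) {
                total[i - 2] += $i
                if (i - 1 > ncpu) ncpu = i - 1
            }
        }
        END {
            for (c = 0; c < ncpu; c++) printf "CPU%d %d\n", c, total[c]
        }
    '
}

# Typical use on dom0:
#   vif_irq_totals < /proc/interrupts
```

If nearly all of the total lands on one CPU, the vif interrupts are not being balanced.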

Workaround

Add a small IRQBalance policy script that very crudely distributes 'xen-dyn' interrupts across the available cores of dom0.

Steps:

  1. Put the script in place as '/usr/local/bin/irqbalance-policyscript'
  2. Change the irqbalance settings in '/etc/sysconfig/irqbalance'
  3. Restart irqbalance

Install the script (see below).

Change the '/etc/sysconfig/irqbalance' settings:

IRQBALANCE_ARGS="--policyscript=/usr/local/bin/irqbalance-policyscript"
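
The settings change and restart could be scripted roughly as follows. This is a sketch under assumptions: the helper name `set_policyscript` is hypothetical, it relies on GNU sed (`-i`), it assumes the script path contains no `|` or `&` characters, and it overwrites any IRQBALANCE_ARGS value that is already set, so merge other options by hand if you use them.

```bash
#!/bin/bash
# Set IRQBALANCE_ARGS in a sysconfig-style file. An existing assignment
# (commented out or not) is replaced; otherwise a new line is appended.
set_policyscript() {
    local conf=$1 script=$2
    local line="IRQBALANCE_ARGS=\"--policyscript=${script}\""
    if grep -Eq '^#?IRQBALANCE_ARGS=' "$conf"; then
        sed -E -i "s|^#?IRQBALANCE_ARGS=.*|${line}|" "$conf"
    else
        printf '%s\n' "$line" >> "$conf"
    fi
}

# Typical use, as root on dom0:
#   install -m 0755 irqbalance-policyscript /usr/local/bin/irqbalance-policyscript
#   set_policyscript /etc/sysconfig/irqbalance /usr/local/bin/irqbalance-policyscript
#   systemctl restart irqbalance
```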

The script will:

  • only change the balance for 'xen-dyn' interrupts
    • it will not make any changes for devices on a PCI bus
    • it will not make changes for 'xen-pirq' or 'xen-percpu' interrupts
  • will distribute the interrupts statically (i.e. only once)
  • won't distribute based on the number of interrupts serviced by each core/CPU
  • makes a very crude guess at the number of CPUs active
    • assumes that the CPUs are numbered 0...(n-1) (i.e. zero-based CPU numbers)
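
The zero-based, sequential CPU numbering assumption could be relaxed by picking from the kernel's list of online CPUs instead. The sketch below is not what the script in the appendix does; `expand_cpu_list` and `pick_cpu` are hypothetical helper names, and the input is assumed to be in the "0-2,5" list format used by '/sys/devices/system/cpu/online'.

```bash
#!/bin/bash
# Expand a kernel CPU list such as "0-2,5" into "0 1 2 5".
expand_cpu_list() {
    local list=$1 part lo hi cpu out=()
    local IFS=','
    for part in $list; do
        lo=${part%-*}     # "0-2" -> 0, "5" -> 5
        hi=${part#*-}     # "0-2" -> 2, "5" -> 5
        for (( cpu = lo; cpu <= hi; cpu++ )); do
            out+=("$cpu")
        done
    done
    IFS=' '
    echo "${out[*]}"
}

# Pick a CPU for an IRQ: IRQ number modulo the number of listed CPUs,
# used as an index into the list rather than as the CPU number itself.
pick_cpu() {
    local irq=$1; shift
    local cpus=("$@")
    echo "${cpus[irq % ${#cpus[@]}]}"
}

# Typical use inside a policy script:
#   pick_cpu "$IRQ" $(expand_cpu_list "$(cat /sys/devices/system/cpu/online)")
```

This still spreads the interrupts statically and ignores the load each interrupt generates.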

 

The script has many limitations and assumptions, but it is much better than not having the script.

 


 

Appendices

IRQbalance Policy Script

#!/bin/bash
#
#  Arguments are
#    $1 PCI device name (or /sys if unknown)
#    $2 IRQ number
#
# Xen on Fedora seems to be unable to spread the interrupts for
# the xen backend devices across the available cores. This script
# takes a simple approach of using the interrupt number modulo the
# number of CPU cores and assigning the smp_affinity to that core.
#
# This makes a number of assumptions (some of which are known to be
# bad). For example:
#
#  - that the available CPUs are numbered sequentially from zero
#  - that each IRQ will have an even load on the system
#
# Devices that are backed by PCI devices are not modified.

DEVICE=$1
IRQ=$2
CPU_COUNT=$( find /sys/devices/system/cpu/ -maxdepth 1 -type d -name 'cpu[0-9]*' | wc -l )

if [ "${DEVICE}" == "/sys" ] ; then

   CPU_NUMBER=$(( IRQ % CPU_COUNT ))
   CHIP_NAME=$( cat "/sys/kernel/irq/${IRQ}/chip_name" )

   #
   # This should handle device names like:
   #   blkif-backend
   #   vif<domain>.<if>[-q<#>[-rx|-tx]]
   #   evtchn:
   #   evtchn:xenstored
   #   evtchn:xenconsoled
   #   evtchn:qemu-system-i<id>
   #
   if [ "${CHIP_NAME}" == "xen-dyn" ] ; then
      # Pin the IRQ to the chosen core, then tell irqbalance to leave it alone.
      echo "${CPU_NUMBER}" > "/proc/irq/${IRQ}/smp_affinity_list"
      echo ban=true
   fi
fi
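
As a worked example of the modulo assignment, the sketch below applies the same arithmetic to the vif IRQ numbers from the '/proc/interrupts' sample in the Theory section, using the 4 dom0 cores. The function name `irq_to_cpu` is only for illustration.

```bash
#!/bin/bash
# Same arithmetic as the policy script: IRQ number modulo the CPU count.
irq_to_cpu() {
    local irq=$1 cpu_count=$2
    echo $(( irq % cpu_count ))
}

# The vif IRQ numbers from the dom0 sample, with the 4 dom0 cores.
for irq in 241 242 243 244 245 246 247 248; do
    echo "IRQ ${irq} -> CPU $(irq_to_cpu "$irq" 4)"
done
```

The tx interrupts 241, 243, 245 and 247 map to CPUs 1, 3, 1 and 3, which appears consistent with where the counts accumulate in that sample. It also shows how crude the spread is: with this numbering every tx interrupt lands on an odd core, and cores 0 and 2 only receive the nearly idle rx interrupts.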

Host info

# xl info
host                   : blue.lucidsolutions.co.nz
release                : 4.13.4-200.fc26.x86_64
version                : #1 SMP Thu Sep 28 20:46:39 UTC 2017
machine                : x86_64
nr_cpus                : 12
max_cpu_id             : 11
nr_nodes               : 2
cores_per_socket       : 6
threads_per_core       : 1
cpu_mhz                : 2600
hw_caps                : 178bf3ff:80802001:efd3fbff:000037ff:00000000:00000000:00000000:00000100
virt_caps              : hvm
total_memory           : 57343
free_memory            : 13888
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 8
xen_extra              : .2
xen_version            : 4.8.2
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          :
xen_commandline        : placeholder dom0_mem=4G,max:8G dom0_max_vcpus=4 dom0_vcpus_pin
cc_compiler            : gcc (GCC) 7.1.1 20170622 (Red Hat 7.1.1-3)
cc_compile_by          : mockbuild
cc_compile_domain      : [unknown]
cc_compile_date        : Tue Sep 12 21:57:03 UTC 2017
build_id               : baca4c8c5a903568230d6f6d45411c4a15ae92f2
xend_config_format     : 4

 

Fedora 26 dom0

The Fedora 26 kernel (and netback driver) supports multiple queues and separate tx & rx interrupts:

# modinfo xen-netfront
filename:       /lib/modules/4.13.4-200.fc26.x86_64/kernel/drivers/net/xen-netfront.ko.xz
alias:          xennet
alias:          xen:vif
license:        GPL
description:    Xen virtual network device frontend
depends:
intree:         Y
name:           xen_netfront
vermagic:       4.13.4-200.fc26.x86_64 SMP mod_unload
signat:         PKCS#7
signer:
sig_key:
sig_hashalgo:   md4
parm:           max_queues:Maximum number of queues per virtual interface (uint)

# grep . /sys/module/xen_netback/parameters/*
/sys/module/xen_netback/parameters/fatal_skb_slots:20
/sys/module/xen_netback/parameters/hash_cache_size:64
/sys/module/xen_netback/parameters/max_queues:4
/sys/module/xen_netback/parameters/rx_drain_timeout_msecs:10000
/sys/module/xen_netback/parameters/rx_stall_timeout_msecs:60000
/sys/module/xen_netback/parameters/separate_tx_rx_irq:Y

CentOS v6.x domU

The CentOS v6 kernel (and netfront driver) does not support multiqueue. Thus each NIC is serviced by a single core in each VM (irqbalance is changing the smp_affinity to balance the number of interrupts).

$ cat /proc/interrupts
           CPU0       CPU1
272:     642971     803006   xen-dyn-event     eth1
273:     334965   38472827   xen-dyn-event     eth0
274:        492          3   xen-dyn-event     blkif
275:      17177      31729   xen-dyn-event     blkif
276:         27          0   xen-dyn-event     hvc_console
277:        504          0   xen-dyn-event     xenbus
278:          0      11235  xen-percpu-ipi       callfuncsingle1
279:          0          0  xen-percpu-virq      debug1
280:          0          0  xen-percpu-ipi       callfunc1
281:          0     547309  xen-percpu-ipi       resched1
282:          0   10189651  xen-percpu-virq      timer1
283:       2454          0  xen-percpu-ipi       callfuncsingle0
284:          0          0  xen-percpu-virq      debug0
285:          0          0  xen-percpu-ipi       callfunc0
286:    1031998          0  xen-percpu-ipi       resched0
287:    7013506          0  xen-percpu-virq      timer0
NMI:          0          0   Non-maskable interrupts
LOC:          0          0   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RES:    1031998     547309   Rescheduling interrupts
CAL:       2454      11235   Function call interrupts
TLB:          0          0   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          0          0   Machine check polls
ERR:          0
MIS:          0

CentOS v7.x domU

The CentOS v7 kernel (and netfront driver) supports both multiqueue and separate tx and rx interrupts:

# grep . /proc/irq/*/smp_affinity_list
/proc/irq/16/smp_affinity_list:0
/proc/irq/17/smp_affinity_list:0
/proc/irq/18/smp_affinity_list:0
/proc/irq/19/smp_affinity_list:0
/proc/irq/20/smp_affinity_list:0
/proc/irq/21/smp_affinity_list:0
/proc/irq/22/smp_affinity_list:0
/proc/irq/23/smp_affinity_list:1
/proc/irq/24/smp_affinity_list:1
/proc/irq/25/smp_affinity_list:1
/proc/irq/26/smp_affinity_list:1
/proc/irq/27/smp_affinity_list:1
/proc/irq/28/smp_affinity_list:1
/proc/irq/29/smp_affinity_list:1
/proc/irq/30/smp_affinity_list:1
/proc/irq/31/smp_affinity_list:0
/proc/irq/32/smp_affinity_list:1
/proc/irq/33/smp_affinity_list:0
/proc/irq/34/smp_affinity_list:1
/proc/irq/35/smp_affinity_list:1
/proc/irq/36/smp_affinity_list:1
/proc/irq/37/smp_affinity_list:1

 

# cat /proc/interrupts
           CPU0       CPU1
 16:    1900857          0  xen-percpu-virq      timer0
 17:          0          0  xen-percpu-ipi       spinlock0
 18:    1940532          0  xen-percpu-ipi       resched0
 19:          0          0  xen-percpu-ipi       callfunc0
 20:          0          0  xen-percpu-virq      debug0
 21:       1175          0  xen-percpu-ipi       callfuncsingle0
 22:          0          0  xen-percpu-ipi       irqwork0
 23:          0    1917993  xen-percpu-virq      timer1
 24:          0          0  xen-percpu-ipi       spinlock1
 25:          0    1906799  xen-percpu-ipi       resched1
 26:          0          0  xen-percpu-ipi       callfunc1
 27:          0          0  xen-percpu-virq      debug1
 28:          0       1340  xen-percpu-ipi       callfuncsingle1
 29:          0          0  xen-percpu-ipi       irqwork1
 30:        724          0   xen-dyn-event     xenbus
 31:        500         64   xen-dyn-event     hvc_console
 32:       9582     156749   xen-dyn-event     eth0-q0-tx
 33:       4518       7091   xen-dyn-event     eth0-q0-rx
 34:       4448     166296   xen-dyn-event     eth0-q1-tx
 35:      24514     387232   xen-dyn-event     eth0-q1-rx
 36:     366572       9422   xen-dyn-event     eth1-q0-tx
 37:       1572        407   xen-dyn-event     eth1-q0-rx
 38:     913320    2606663   xen-dyn-event     eth1-q1-tx
 39:    2491893    2283177   xen-dyn-event     eth1-q1-rx
 40:       8173      27177   xen-dyn-event     blkif
 41:        569        860   xen-dyn-event     blkif
NMI:          0          0   Non-maskable interrupts
LOC:          0          0   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:    1940532    1906799   Rescheduling interrupts
CAL:        793        941   Function call interrupts
TLB:        382        399   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
DFR:          0          0   Deferred Error APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:        281        281   Machine check polls
ERR:          0
MIS:          0
PIN:          0          0   Posted-interrupt notification event
PIW:          0          0   Posted-interrupt wakeup event

Fedora 26 domU

The domU host supports both multi-queue and separate tx and rx interrupts, but they are all being serviced by the first of two cores (even though the smp affinity allows both cores). The irqbalance daemon (irqbalance-1.2.0-2.fc26) is running.

$ cat /proc/interrupts
           CPU0       CPU1
  0:       9597          0  xen-percpu    -virq      timer0
  1:       2489          0  xen-percpu    -ipi       resched0
  2:          0          0  xen-percpu    -ipi       callfunc0
  3:          0          0  xen-percpu    -virq      debug0
  4:        396          0  xen-percpu    -ipi       callfuncsingle0
  5:          1          0  xen-percpu    -ipi       irqwork0
  6:          0      12066  xen-percpu    -virq      timer1
  7:          0       4403  xen-percpu    -ipi       resched1
  8:          0          0  xen-percpu    -ipi       callfunc1
  9:          0          0  xen-percpu    -virq      debug1
 10:          0       1297  xen-percpu    -ipi       callfuncsingle1
 11:          0          0  xen-percpu    -ipi       irqwork1
 12:        572          0   xen-dyn    -event     xenbus
 13:         27          0   xen-dyn    -event     hvc_console
 14:       2720          0   xen-dyn    -event     blkif
 15:       2119          0   xen-dyn    -event     blkif
 16:         80          0   xen-dyn    -event     blkif
 17:         51          0   xen-dyn    -event     blkif
 18:         29          0   xen-dyn    -event     eth0-q0-tx
 19:        251          0   xen-dyn    -event     eth0-q0-rx
 20:        304          0   xen-dyn    -event     eth0-q1-tx
 21:         24          0   xen-dyn    -event     eth0-q1-rx
NMI:          0          0   Non-maskable interrupts
LOC:          0          0   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          1          0   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:       2489       4403   Rescheduling interrupts
CAL:        396       1297   Function call interrupts
TLB:          0          0   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
DFR:          0          0   Deferred Error APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          1          1   Machine check polls
ERR:          0
MIS:          0
PIN:          0          0   Posted-interrupt notification event
NPI:          0          0   Nested posted-interrupt event
PIW:          0          0   Posted-interrupt wakeup event

 

$ grep . /proc/irq/*/smp_affinity_list
/proc/irq/0/smp_affinity_list:0
/proc/irq/10/smp_affinity_list:1
/proc/irq/11/smp_affinity_list:1
/proc/irq/12/smp_affinity_list:0-1
/proc/irq/13/smp_affinity_list:0-1
/proc/irq/14/smp_affinity_list:0-1
/proc/irq/15/smp_affinity_list:0-1
/proc/irq/16/smp_affinity_list:0-1
/proc/irq/17/smp_affinity_list:0-1
/proc/irq/18/smp_affinity_list:0-1
/proc/irq/19/smp_affinity_list:0-1
/proc/irq/1/smp_affinity_list:0
/proc/irq/20/smp_affinity_list:0-1
/proc/irq/21/smp_affinity_list:0-1
/proc/irq/2/smp_affinity_list:0
/proc/irq/3/smp_affinity_list:0
/proc/irq/4/smp_affinity_list:0
/proc/irq/5/smp_affinity_list:0
/proc/irq/6/smp_affinity_list:1
/proc/irq/7/smp_affinity_list:1
/proc/irq/8/smp_affinity_list:1
/proc/irq/9/smp_affinity_list:1
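
A quick way to spot this symptom is to list the 'xen-dyn' interrupts whose counts have all landed on a single CPU. This is a rough sketch with a made-up function name (`single_cpu_irqs`); it matches both the "xen-dyn-event" and the split "xen-dyn    -event" spellings, and assumes the interrupt name is the last field of each line.

```bash
#!/bin/bash
# Print "IRQ NAME CPUn" for each xen-dyn interrupt that has only ever
# been serviced by one CPU. Reads /proc/interrupts-style text on stdin.
single_cpu_irqs() {
    awk '
        /xen-dyn/ {
            busy = 0; last = -1
            # Count columns are the numeric fields after the "NNN:" label.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) {
                if ($i > 0) { busy++; last = i - 2 }
            }
            if (busy == 1) {
                irq = $1; sub(/:$/, "", irq)
                printf "%s %s CPU%d\n", irq, $NF, last
            }
        }
    '
}

# Typical use:
#   single_cpu_irqs < /proc/interrupts
```

An interrupt pinned to one core by the policy script will also show up here, so the output is only meaningful when compared against the smp_affinity_list values.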

 
