`tc-hfsc` ( 7 )

Hierarchical Fair Service Curve

LINUX AND TIMER RESOLUTION


In certain situations, the scheduler can throttle itself and set up a so-called watchdog to wake up the dequeue function at some later time. In the case of HFSC this happens, for example, when no packet is eligible for scheduling and a UL service curve is used to limit the speed at which the LS criterion is allowed to dequeue packets. This is called throttling, and its accuracy depends on how the kernel is compiled.

There are 3 important options in modern kernels as far as timer resolution goes: 'tickless system', 'high resolution timer support' and 'timer frequency'.

If 'tickless system' is enabled, the timer interrupt triggers as rarely as possible, but each time a scheduler throttles itself (or any other part of the kernel needs better accuracy), the rate is increased as far as needed and possible. The ceiling is either 'timer frequency', if 'high resolution timer support' is not available or not compiled in, or it is hardware dependent and can go far beyond the highest 'timer frequency' setting available.

If 'tickless system' is not enabled, the timer triggers at a fixed rate specified by 'timer frequency', regardless of whether high resolution timers are available.
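To see which of these options a given kernel was built with, its config file can be inspected. The following is a minimal sketch, not part of the original page. It assumes the config is available at /boot/config-$(uname -r) (many distributions ship it there, some expose /proc/config.gz instead) and that the options appear under the symbol names CONFIG_NO_HZ, CONFIG_NO_HZ_IDLE, CONFIG_NO_HZ_FULL, CONFIG_HIGH_RES_TIMERS and CONFIG_HZ, which vary with kernel version. The helper name timer_config is hypothetical.

```shell
#!/bin/sh
# timer_config [CONFIG_FILE]
# Print the timer-related options that are set in a kernel config file.
# Assumption: the default path below exists on this system.
timer_config() {
    cfg="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg" >&2
        return 1
    fi
    # Only enabled options match; "# CONFIG_... is not set" lines are skipped.
    grep -E '^CONFIG_(NO_HZ|NO_HZ_IDLE|NO_HZ_FULL|HIGH_RES_TIMERS|HZ)=' "$cfg"
}
```

Calling timer_config with no argument reads the running kernel's config, if it is present at the assumed path. An option that is missing from the output is either disabled or named differently in that kernel version.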

It is important to keep those settings in mind. In a scenario such as no tickless, no HR timers, and frequency set to 100 Hz, throttling accuracy would be 10 ms. That doesn't automatically mean you would be limited to ~0.8Mbit/s (assuming packets of ~1KB, one per tick), as long as your queues are prepared to cover for the timer inaccuracy. Of course, for e.g. locally generated UDP traffic, an appropriate socket buffer size is needed as well. A short example to make this clearer (assume settings hostile to accurate scheduling: HZ=100, no HR timers, no tickless):

tc qdisc add dev eth0 root handle 1:0 hfsc default 1
tc class add dev eth0 parent 1:0 classid 1:1 hfsc rt m2 10Mbit

Assuming packets of ~1KB and HZ=100, one packet per tick averages to ~0.8Mbit/s. Anything beyond that (e.g. the example above, with a specified rate more than 10x larger) requires appropriate queuing and causes bursts every ~10 ms. As you can imagine, any of HFSC's RT guarantees will be seriously invalidated by that. This example matters mainly if you deal with old hardware, which is particularly popular for home server chores. Even then, you can easily set HZ=1000 and get very accurate scheduling for typical ADSL speeds.
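The arithmetic behind these figures can be checked with a few lines of shell. This is an illustrative sketch only. It assumes a packet of exactly 1000 bytes (the text says ~1KB) and exactly one dequeue opportunity per timer tick. The function names are hypothetical.

```shell
#!/bin/sh
# tick_limited_kbit HZ PKT_BYTES
# Rate in kbit/s when exactly one packet leaves per timer tick.
tick_limited_kbit() {
    echo $(( $1 * $2 * 8 / 1000 ))
}

# burst_bytes RATE_KBIT HZ
# Bytes that must be queued and sent per tick to sustain RATE_KBIT.
burst_bytes() {
    echo $(( $1 * 1000 / 8 / $2 ))
}

tick_limited_kbit 100 1000    # prints 800   (~0.8Mbit/s at HZ=100)
burst_bytes 10000 100         # prints 12500 (bytes per 10 ms tick at 10Mbit)
```

At 10Mbit the class therefore has to release roughly twelve 1000-byte packets per tick, which is the source of the ~10 ms bursts.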

Anything modern (APIC or even HPET MSI based timers + 'tickless system') will provide enough accuracy for superb 1Gbit scheduling. For example, on one of my cheap dual-core AMD boards I have the following settings:

tc qdisc add dev eth0 parent root handle 1:0 hfsc default 1
tc class add dev eth0 parent 1:0 classid 1:1 hfsc rt m2 300mbit

And a simple:

nc -u dst.host.com 54321 </dev/zero
nc -l -p 54321 >/dev/null

...will yield the following effects over a period of ~10 seconds (taken from /proc/interrupts):

319: 42124229 0 HPET_MSI-edge hpet2 (before)
319: 42436214 0 HPET_MSI-edge hpet2 (after 10s.)

That's roughly 31000/s. Now compare it with the HZ=1000 setting. The obvious drawback is that CPU load can be rather high when servicing that many timer interrupts. The example with a 300Mbit RT service curve on a 1Gbit link is particularly ugly, as it requires a lot of throttling with minuscule delays.
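The interrupt rate quoted here follows directly from the two /proc/interrupts samples. The sketch below reproduces that calculation. It is only an illustration: the helper names are hypothetical, irq_count reads only the first CPU column, and IRQ number 319 is specific to the machine in the example.

```shell
#!/bin/sh
# irq_count IRQ [FILE]
# Print the first per-CPU counter for the given IRQ line.
irq_count() {
    awk -v irq="$1:" '$1 == irq { print $2 }' "${2:-/proc/interrupts}"
}

# irq_rate BEFORE AFTER SECONDS
# Average interrupts per second between two samples (integer division).
irq_rate() {
    echo $(( ($2 - $1) / $3 ))
}

irq_rate 42124229 42436214 10    # prints 31198
```

On a live system, two irq_count samples taken 10 seconds apart (with sleep 10 in between) can be passed to irq_rate in the same way.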

Also note that it's just an example showing the capabilities of current hardware. The above example (essentially a 300Mbit TBF emulator) is pointless on an internal interface to begin with: you will pretty much always want a regular LS service curve there, and in such a scenario HFSC simply doesn't throttle at all.
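For comparison, an LS-only variant of the same setup might look like the sketch below. It is not taken from the original page. It reuses the eth0 device and 1:1 class from the examples above and swaps the rt curve for an ls curve. With only one LS class, the 300mbit figure acts as a relative share rather than a limit.

```shell
# Assumed LS-only counterpart of the RT example above (requires root).
tc qdisc add dev eth0 root handle 1:0 hfsc default 1
tc class add dev eth0 parent 1:0 classid 1:1 hfsc ls m2 300mbit
```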

300Mbit RT service curve (selected columns from mpstat -P ALL 1):

10:56:43 PM  CPU   %sys   %irq  %soft  %idle
10:56:44 PM  all  20.10   6.53  34.67  37.19
10:56:44 PM    0  35.00   0.00  63.00   0.00
10:56:44 PM    1   4.95  12.87   6.93  73.27

So, in the rare case that you need those speeds with only an RT service curve, or with a UL service curve: remember the drawbacks.

Source text at man7.org
