
Linux Kernel 5.12 looks very promising: I/O performance boost and idmap mounts!

Linux Kernel 5.12 got off to a bumpy start, as chief maintainer Linus Torvalds had to cope with power outages in the area where he lives. But work has resumed and Kernel development is back on track.

Kernel 5.12 is currently available as Release Candidate 1. Gone are the romantic days of "Valentine's Day": Torvalds nicknamed the 5.12 Kernel "Frozen Wasteland".

I/O improvements

Besides the usual improvements (such as new or improved hardware drivers), one big change deserves a closer look: NAPI polling can now be moved from softirq context to kernel threads (kthreads).

Under the current implementation, NAPI polling (the processing of received network packets) runs in softirq context. As a result, the scheduler has poor visibility into the CPU cycles spent in softirq context and cannot make optimal scheduling decisions, which costs performance.

With napi poll moved to kthread, scheduler is in charge of scheduling both the kthreads handling network load, and the user threads, and is able to make better decisions.

Wei Wang
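As a rough illustration of how the kthread mode could be switched on per network device, here is a minimal sketch. It assumes that the kernel exposes a `threaded` attribute under /sys/class/net/<device>/, which is not mentioned in the announcement quoted above. The helper names (`threaded_path`, `get_threaded`, `set_threaded`) and the `root` parameter are made up for this example, and writing the attribute requires root privileges.

```python
from pathlib import Path

# Assumption: the running kernel provides /sys/class/net/<dev>/threaded
SYSFS_NET = Path("/sys/class/net")


def threaded_path(dev: str, root: Path = SYSFS_NET) -> Path:
    """Return the path of the 'threaded' attribute for a network device."""
    return Path(root) / dev / "threaded"


def get_threaded(dev: str, root: Path = SYSFS_NET) -> bool:
    """Return True if the attribute holds anything other than '0'."""
    return threaded_path(dev, root).read_text().strip() != "0"


def set_threaded(dev: str, enabled: bool, root: Path = SYSFS_NET) -> None:
    """Write '1' to request kthread-based polling, '0' for softirq polling."""
    threaded_path(dev, root).write_text("1" if enabled else "0")
```

On a real system this might be called as `set_threaded("eth0", True)`, where `eth0` is only a placeholder device name.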

A benchmark using netperf tcp_rr compared the three implementations (softirq, kthread and workq). With NAPI running as a kthread, tail latency drops sharply for small requests: for 1B/1B, the 99.9th percentile falls from 3.69ms to 550us. Throughput (QPS) and median latency, on the other hand, are slightly worse than with softirq:

         req/resp   QPS     50%tile   90%tile   99%tile   99.9%tile
softirq  1B/1B      2.75M   337us     376us     1.04ms    3.69ms
kthread  1B/1B      2.67M   371us     408us     455us     550us
workq    1B/1B      2.56M   384us     435us     673us     822us

softirq  5KB/5KB    1.46M   678us     750us     969us     2.78ms
kthread  5KB/5KB    1.44M   695us     789us     891us     1.06ms
workq    5KB/5KB    1.34M   720us     905us     1.06ms    1.57ms

softirq  1MB/1MB    11.0K   79ms      166ms     306ms     630ms
kthread  1MB/1MB    11.0K   75ms      177ms     303ms     596ms
workq    1MB/1MB    11.0K   79ms      180ms     303ms     587ms

Only at the largest request/response size (1MB) does the difference between the implementations largely disappear.
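To put the tail latency numbers into perspective, the following small sketch computes how much lower the 99.9th percentile is with kthread than with softirq. The values are copied from the benchmark table above and converted to microseconds; the names `P999_US` and `tail_latency_ratio` are chosen just for this example.

```python
# 99.9th percentile latencies in microseconds, taken from the benchmark table
P999_US = {
    "1B/1B": {"softirq": 3690, "kthread": 550, "workq": 822},
    "5KB/5KB": {"softirq": 2780, "kthread": 1060, "workq": 1570},
    "1MB/1MB": {"softirq": 630000, "kthread": 596000, "workq": 587000},
}


def tail_latency_ratio(size: str, baseline: str = "softirq",
                       candidate: str = "kthread") -> float:
    """Return how many times lower the candidate's p99.9 latency is."""
    row = P999_US[size]
    return row[baseline] / row[candidate]


for size in P999_US:
    ratio = tail_latency_ratio(size)
    print(f"{size}: softirq p99.9 is {ratio:.1f}x the kthread p99.9")
# 1B/1B: softirq p99.9 is 6.7x the kthread p99.9
# 5KB/5KB: softirq p99.9 is 2.6x the kthread p99.9
# 1MB/1MB: softirq p99.9 is 1.1x the kthread p99.9
```

The ratio shrinks from roughly 6.7x at 1B to about 1.1x at 1MB, which matches the observation that large transfers barely benefit.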

5.12 introduces idmap mounts

For everyone using containers, this is big news. Starting with Kernel 5.12, a file system can be mounted with an ID mapping applied (an idmapped mount), which makes it much easier to share files between different users.

It is possible to share files from the host with unprivileged containers without having to change ownership permanently through chown(2).

It is possible to share files between containers with non-overlapping idmappings.

They allow users to efficiently changing ownership on a per-mount basis without having to (recursively) chown(2) all files.

Idmapped mounts allow to change ownership locally, restricting it to specific mounts, and temporarily as the ownership changes only apply as long as the mount exists.

Christian Brauner
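The effect described in the quote can be sketched with a simplified model of ID mappings. This is not the kernel's implementation: the class `IdMapping`, the function `visible_owner`, the example ranges (100000 and 200000) and the assumption that the underlying file system itself uses an identity mapping are all chosen for illustration. The only values taken from real systems are the range triple format known from /proc/<pid>/uid_map and the default overflow ID 65534.

```python
from dataclasses import dataclass
from typing import Optional

OVERFLOW_ID = 65534  # default ID shown for owners that cannot be mapped


@dataclass(frozen=True)
class IdMapping:
    upper_first: int  # first ID as seen inside the namespace
    lower_first: int  # first corresponding ID on the host
    count: int        # number of consecutive IDs in the range

    def map_down(self, upper_id: int) -> Optional[int]:
        """Translate a namespace-side ID to a host-side ID."""
        offset = upper_id - self.upper_first
        return self.lower_first + offset if 0 <= offset < self.count else None

    def map_up(self, lower_id: int) -> Optional[int]:
        """Translate a host-side ID to a namespace-side ID."""
        offset = lower_id - self.lower_first
        return self.upper_first + offset if 0 <= offset < self.count else None


def visible_owner(on_disk_id: int, caller: IdMapping,
                  mount: Optional[IdMapping] = None) -> int:
    """Return the owner ID a process in the 'caller' mapping would see."""
    host_id: Optional[int] = on_disk_id  # assumed identity mapping on disk
    if mount is not None:
        # The idmapped mount shifts ownership for this mount only
        host_id = mount.map_down(on_disk_id)
    if host_id is None:
        return OVERFLOW_ID
    seen = caller.map_up(host_id)
    return OVERFLOW_ID if seen is None else seen


# Two containers with non-overlapping host ID ranges (hypothetical values)
container_a = IdMapping(0, 100000, 65536)
container_b = IdMapping(0, 200000, 65536)

print(visible_owner(1000, container_a))                     # 65534
print(visible_owner(1000, container_a, mount=container_a))  # 1000
print(visible_owner(1000, container_b, mount=container_b))  # 1000
```

In this model, a file owned by ID 1000 on disk is unmapped for container A through a plain mount, but shows up as 1000 in both containers once each one gets its own idmapped mount, and no chown(2) is needed on the file itself.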

This is particularly useful for unprivileged containers, which are common in the LXC and LXD projects. These containers typically run with "fake" ownership, mapped to an unprivileged local Unix user.

Application containers (Docker/Kubernetes) can benefit from this feature as well. The application container runtime (containerd) has added experimental support for idmapped mounts.

This could become the basis for running containers without any privileged rights on the host, which would be a great security improvement.

Claudio Kuenzler
Claudio has written well over 1000 articles on his own blog since 2008. He is fascinated by technology, especially Open Source Software. As a Senior Systems Engineer he has seen and solved a lot of problems, and he writes about them.
