UAPI Compatibility Checker: Automated Tooling to Detect Userspace Breakage in the Linux Kernel

Monday 1/29/24 02:25am
|
Posted By Trilok Soni

Snapdragon and Qualcomm branded products are products of
Qualcomm Technologies, Inc. and/or its subsidiaries.

How do you maintain backward compatibility with userspace apps in the Linux kernel? Most maintainers rely on code review and pushing changes out for testing. But with experienced reviewers and testing farms in short supply, it can be helpful to add automated tooling at build time to check for compatibility.

At the Qualcomm Innovation Center, Inc. (QUIC), we’ve developed a shell script that performs application binary interface (ABI) analysis of patches. We use it in house to detect at build time whether a particular change we’ve made to the Linux kernel will break compatibility with the userspace application programming interface (UAPI). We’ve upstreamed the tool to the Linux kernel community, and they’ve helped us refine it.

In this post, I’ll describe how the script works and give examples you can follow to automate the detection of userspace breakage in your own Linux kernel development. You’ll also see ideas for some of the more difficult boundaries between user and kernel space, like module parameters and sysfs.

(This post is a summary of my presentation “Improving UAPI Compatibility Review with Automated Tooling” at the Linux Plumbers Conference. Details at bottom.)

What is UAPI?

For a long time, the Linux kernel community has rigorously enforced a policy of backward compatibility in its userspace interfaces. As a result, users have been able to upgrade their kernels without worrying about their userspace programs breaking. They’ve enjoyed stability across kernel upgrades without the need to recompile anything.

But that stability comes at the cost of kernel developers intercepting any changes that may break UAPI, before those changes affect users. UAPI includes any interface between userspace and the kernel, such as system calls, data structures (used in IOCTLs), module parameters, sysfs files and procfs files. In short, if it’s something that can change in the kernel and break compatibility with a program running in userspace, it’s considered UAPI.

Traditionally, kernel developers have two ways to detect userspace breakage. One is code review, in which the maintainers examine each patch to see whether it will break anything in userspace. The other is putting the merged patch out to the whole Linux testing universe; if any userspace programs break, the patch is reported back upstream and reverted if necessary.

That makes the hunt very labor-intensive, and therefore ripe for automation. We set out to add tooling and documentation to help kernel developers find areas of potential userspace breakage more easily.

Will the patch break compatibility with UAPI headers?

We’ve upstreamed check-uapi.sh, a script for checking UAPI header stability across git commits. Its default behavior is to check whether the latest commit (or current dirty changes) introduces any ABI changes compared to HEAD^1.

At QUIC, we use this script in our continuous integration (CI) system to block Linux kernel changes that fail the check. Added to the kernel tree as part of our build pipeline, it automates the analysis of patches for UAPI breakages before the code is even executed. It gives us useful, immediate feedback we can use to codify the Linux kernel’s UAPI stability policy.

The logic of the script is to:

  1. run make headers_install before and after the patch
  2. populate two parallel header trees
  3. run abidiff (a libabigail command line tool) on all of the UAPI headers in those trees

The script compares the ABI of a modified header before and after the patch is applied. If an existing UAPI is modified in a way that's not backward-compatible, the script exits non-zero and generates a message.

For example, suppose we add a member z between x and y in a struct foo. The script reports that the offset of y changed and that the size of the structure changed. Those changes break UAPI because they are not backward-compatible.
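
To make the struct foo case concrete, here is a minimal sketch in plain C. The member types (uint32_t) and the names foo_before and foo_after are assumptions for illustration only; just the member names x, y and z come from the example above. It computes the same two facts the script reports: how far y moved and how much the structure grew.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout before the patch (member types are assumed). */
struct foo_before {
	uint32_t x;
	uint32_t y;
};

/* Layout after the patch: z is inserted between x and y. */
struct foo_after {
	uint32_t x;
	uint32_t z;
	uint32_t y;
};

/* Compile-time view of where y lives in each layout. */
static_assert(offsetof(struct foo_before, y) == 4, "y starts at byte 4 before the patch");
static_assert(offsetof(struct foo_after, y) == 8, "y starts at byte 8 after the patch");

/* Number of bytes by which y moved; nonzero means old binaries touch the wrong field. */
size_t foo_y_offset_shift(void)
{
	return offsetof(struct foo_after, y) - offsetof(struct foo_before, y);
}

/* Number of bytes by which the whole structure grew. */
size_t foo_size_growth(void)
{
	return sizeof(struct foo_after) - sizeof(struct foo_before);
}
```

A userspace binary compiled against the old layout still writes y at byte 4, which a kernel built with the new header interprets as z.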

We found that abidiff can be overzealous when examining kernel code, sometimes objecting to very common kernel patterns that don’t impair backward compatibility. To get abidiff to suppress these findings, we pass it suppression rules that use regular expressions to match those patterns (for example, an enumerator whose name ends in COLOR_MAX).
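
As a sketch of the kind of pattern that is harmless but still looks like an ABI change, consider a trailing sentinel enumerator. The enum names and the V1_/V2_ prefixes below are hypothetical and exist only so both versions can sit in one file; the assumption is that the COLOR_MAX case above is a sentinel of this sort.

```c
#include <assert.h>

/* Before the patch: the sentinel counts the real enumerators. */
enum color_v1 {
	V1_COLOR_RED,
	V1_COLOR_GREEN,
	V1_COLOR_MAX,
};

/* After the patch: a new color is added just before the sentinel. */
enum color_v2 {
	V2_COLOR_RED,
	V2_COLOR_GREEN,
	V2_COLOR_BLUE,
	V2_COLOR_MAX,
};

/* Existing enumerators keep their values, so old binaries keep working. */
static_assert((int)V1_COLOR_RED == (int)V2_COLOR_RED, "RED is unchanged");
static_assert((int)V1_COLOR_GREEN == (int)V2_COLOR_GREEN, "GREEN is unchanged");

/* Returns 1 when every enumerator that userspace could already use is unchanged. */
int color_values_preserved(void)
{
	return (int)V1_COLOR_RED == (int)V2_COLOR_RED &&
	       (int)V1_COLOR_GREEN == (int)V2_COLOR_GREEN;
}

/* How far the sentinel moved; this is the change a diff tool would see. */
int color_max_shift(void)
{
	return (int)V2_COLOR_MAX - (int)V1_COLOR_MAX;
}
```

Only the sentinel's value changes, which is why a name-based suppression for such enumerators is reasonable.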

Below are seven examples of using check-uapi.sh to detect potential UAPI breakage in your Linux kernel development.

Example 1 – Adding a simple #define

Suppose we add #define FOO to a header file.

This change won’t break anything, so the script goes through all 912 UAPI header files installed by make headers_install and generates a message saying so.

Note also that you can run the script with no arguments. If you have a dirty git tree, it will compare HEAD to the dirty git tree. If you have a clean tree, it will compare HEAD to HEAD^1.

Example 2 – Changing a type

Suppose you change the last member of a structure from a signed 32-bit integer to an unsigned 32-bit integer.

The size of the structure does not change, so your IOCTL code won’t be affected. But because of the chance of an incompatibility, such as a userspace program passing in a negative number, the script reports the change.

On the other hand, if you know it's impossible for userspace to pass in a negative number, then you can ignore this. You can pass the -i parameter so that the script will ignore ambiguous changes.

Note also that the script gives you statistics (“1/912 UAPI headers...”) on the diff.
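
To see why the type change in this example is ambiguous rather than outright breaking, here is a small userspace sketch. The struct and member names (req_before, req_after, id, value) are hypothetical; the only thing taken from the example is the change of the last member from signed to unsigned 32-bit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Header as old userspace binaries were compiled against it. */
struct req_before {
	uint32_t id;
	int32_t value;
};

/* Header after the patch: same layout, different signedness. */
struct req_after {
	uint32_t id;
	uint32_t value;
};

static_assert(sizeof(struct req_before) == sizeof(struct req_after),
	      "the type change does not alter the structure size");

/* What code built with the new header sees when an old binary passes old_value. */
uint32_t value_seen_after_patch(int32_t old_value)
{
	struct req_before from_user = { .id = 1, .value = old_value };
	struct req_after in_kernel;

	/* Stand-in for copying the raw bytes across the user/kernel boundary. */
	memcpy(&in_kernel, &from_user, sizeof(in_kernel));
	return in_kernel.value;
}
```

Non-negative values pass through unchanged, while -1 arrives as the largest unsigned value. Whether that matters depends on what userspace actually sends, which is the judgment call the -i flag leaves to you.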

Example 3 – Re-ordering a member

Moving a destination register below a source register is simply a move of a struct member.

Even with the -i flag, the script finds a breaking change, because any userspace program using either of those registers will fail.
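
The effect of the reordering can be sketched in userspace C. The struct below is a hypothetical, simplified instruction layout with whole-byte members named dst_reg and src_reg; it is not the actual kernel structure.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout old userspace binaries were compiled against. */
struct insn_before {
	uint8_t opcode;
	uint8_t dst_reg;
	uint8_t src_reg;
};

/* Layout after the patch: dst_reg moved below src_reg. */
struct insn_after {
	uint8_t opcode;
	uint8_t src_reg;
	uint8_t dst_reg;
};

static_assert(sizeof(struct insn_before) == sizeof(struct insn_after),
	      "reordering leaves the size unchanged");

/* The dst_reg that new code reads when an old binary filled in dst and src. */
uint8_t dst_reg_seen_after_patch(uint8_t dst, uint8_t src)
{
	struct insn_before from_user = { .opcode = 0, .dst_reg = dst, .src_reg = src };
	struct insn_after in_kernel;

	/* Stand-in for copying the raw bytes across the user/kernel boundary. */
	memcpy(&in_kernel, &from_user, sizeof(in_kernel));
	return in_kernel.dst_reg;
}
```

The size is identical, yet the two registers silently swap, so there is no reading of the change under which old binaries keep working.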

Example 4 – Architecture-specific headers

Suppose you make a change to an arm64 header, like adding a new member to the end of a structure.

When run on an x86 machine, the script reports no changes to UAPI headers.

That’s because when you run make headers_install, the parallel trees don't include the arm64 headers. But if you pass in a cross compiler and set the ARCH variable, then the script will run make headers_install for arm64 and check those headers.

Note that arm64 has 884 headers compared to 912 for x86.

You can also pass -i here, and the script won’t report this as a breaking change. Why not? Adding a new member to the end of the structure does change the size of the struct, but it is technically possible to handle that expansion in kernel space. You can map your IOCTLs correctly to go to the same command. And, if you use copy_struct_from_user() and the _IOC_SIZE macro, then the kernel will automatically zero-extend or truncate your structure to deal with the size discrepancy.

So if your kernel driver checks the new member and treats a value of zero as meaning that userspace had no notion of it, then the change is okay. That is why we treat it as an ambiguously breaking change.
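
The size handling described above can be approximated in userspace as follows. This is a sketch, not the kernel implementation: copy_struct_sketch() and the cfg_v1/cfg_v2 structures are hypothetical names, and the rejection of nonzero trailing bytes with -E2BIG reflects my understanding of how copy_struct_from_user() treats a structure that is larger than the kernel expects.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct cfg_v1 { uint32_t flags; };                     /* older header */
struct cfg_v2 { uint32_t flags; uint32_t new_field; }; /* member appended */

/* Copy usize bytes of caller data into a destination that expects ksize bytes. */
int copy_struct_sketch(void *dst, size_t ksize, const void *src, size_t usize)
{
	size_t size = usize < ksize ? usize : ksize;
	const unsigned char *tail = (const unsigned char *)src + size;

	if (usize > ksize) {
		/* Caller is newer: accept only if the bytes we would drop are zero. */
		for (size_t i = 0; i < usize - ksize; i++)
			if (tail[i] != 0)
				return -E2BIG;
	} else if (usize < ksize) {
		/* Caller is older: zero-fill the members it does not know about. */
		memset((unsigned char *)dst + size, 0, ksize - usize);
	}
	memcpy(dst, src, size);
	return 0;
}

/* new_field as seen by v2 code when a v1 caller passes its smaller struct. */
uint32_t new_field_from_old_caller(uint32_t flags)
{
	struct cfg_v1 from_user = { .flags = flags };
	struct cfg_v2 in_kernel = { .flags = 0, .new_field = 0xdeadbeef };

	if (copy_struct_sketch(&in_kernel, sizeof(in_kernel), &from_user, sizeof(from_user)))
		return UINT32_MAX;
	return in_kernel.new_field;
}

/* Result when v1 code receives a v2 struct whose new_field is set as given. */
int old_kernel_accepts(uint32_t new_field)
{
	struct cfg_v2 from_user = { .flags = 1, .new_field = new_field };
	struct cfg_v1 in_kernel;

	return copy_struct_sketch(&in_kernel, sizeof(in_kernel), &from_user, sizeof(from_user));
}
```

An old caller always yields new_field == 0, which is what lets a driver treat zero as "userspace does not know about this member".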

Example 5 – Cross-dependencies

What happens if you change __poll_t in types.h to unsigned short?

The script detects differences in eventpoll.h.

That’s noteworthy because eventpoll.h is not the header file you modified. But it underscores why we want the script to check every header file in the tree: to catch cross-dependencies among headers.

The script also determines that eventpoll.h did not change between the revs. But it points out that something broke and suggests that perhaps one of the headers it includes changed.
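
Here is a sketch of how a typedef change in one header ripples into a structure defined in another. The typedef and struct names are hypothetical, the flags member is invented to make the layout shift visible, and the sizes assume a common ABI with a 4-byte unsigned int and a 2-byte unsigned short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The typedef as it might appear in a shared types header, before and after. */
typedef unsigned int poll_before_t;
typedef unsigned short poll_after_t;

/* A structure in a different header whose own text never changes. */
struct event_before {
	poll_before_t events;
	uint16_t flags;
};

/* The same source text, compiled against the modified typedef. */
struct event_after {
	poll_after_t events;
	uint16_t flags;
};

static_assert(sizeof(struct event_before) != sizeof(struct event_after),
	      "the typedef change alters a structure in another header");

size_t event_size_before(void)
{
	return sizeof(struct event_before);
}

size_t event_size_after(void)
{
	return sizeof(struct event_after);
}

/* Bytes by which flags moved even though its declaration was never edited. */
size_t event_flags_offset_shift(void)
{
	return offsetof(struct event_before, flags) - offsetof(struct event_after, flags);
}
```

A diff of the second header alone would show nothing, which is why every installed header has to be compared.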

Example 6 – UAPI header removals

If you comment out, say, the termios.h entry in Kbuild, then make headers_install will skip that header and the script will report it.

You’ve removed a UAPI header, so the script can’t identify if you’ve broken userspace and will flag the change.

Example 7 – Checking swaths of history

You can use the script to check a large swath of history by passing in the parameters -b and -p, for example to compare tag 6.1 against tag 6.0.

In that case, the script flags UAPI breakage in 37 headers between those tags.

Will the patch break compatibility with module parameters?

Besides checking UAPI headers, we’ve upstreamed check-module-params.sh in the same patch set. It’s a script to help determine whether a patch will break compatibility with module parameters. Its logic is to grep for all module_param.*() calls, then compare their arguments before and after a change is applied. Like check-uapi.sh, it checks whether the latest commit (or current dirty changes) introduces any ABI changes compared to HEAD^1.

If an argument to one of those calls changes, the script flags it.

It's useful to have an automated process for spotting changes made to module parameters – for example, by inexperienced developers inadvertently modifying something.

Community input on the scripts has been helpful. Now, send us more!

When upstreaming to the Linux kernel, there are no shortcuts. You have to convince the kernel maintainers that your code won’t break backward compatibility, and building robust automation to detect breakage is non-trivial. The Linux kernel community has provided a great deal of helpful, detailed input that has improved the tool immensely.

Here are other areas in which we hope to generate discussion:

Sysfs and procfs

To ensure a stable kernel upgrade, changes to sysfs and procfs files must also be backward-compatible. Similar to UAPI headers, they are boundaries between userspace and kernel space, and there are plenty of ways to break userspace by changing the files. But because they do not offer a clean C API, abidiff cannot help to analyze them.

Mind you, with scripts/get_abi.pl, some automation is already in place. Many sysfs and procfs interfaces are in Documentation/ABI, and get_abi.pl can parse the documentation. It can also look up your running sysfs nodes in the documentation directory and flag any undocumented nodes.

We have some preliminary ideas:

  • Extending the documentation interface (e.g., to something like device tree bindings)
  • If the documentation becomes a de facto API to sysfs and procfs, then writing automated tooling to generate test stubs
  • Performing static analysis on the files

We’re open to other ideas.

Edge cases

There are many edge cases (or corner cases) when it comes to backward compatibility. The way to see what works and what doesn’t is to use vast amounts of development history, especially when working on the Linux kernel. Our check-uapi.sh script can traverse the kernel history and find breakages over time. With that data, we can analyze certain breakage patterns and improve the detection algorithms in our scripts.

You can help us find more edge cases by using the tool and examining its findings over the kernel history and as new patches come in to the mailing lists. Let us know what you’ve found.

Author intent

Our check-uapi.sh script has no understanding of and makes no assumptions about the author’s intent. To the script, breakage is breakage, and it needs to be flagged whether deliberate or accidental. But one variable is that the kernel itself sometimes refactors things – for example, by deleting unused interfaces. Those actions are perfectly valid, but they are part of the intent that the script doesn't understand, so it may flag them, even if they don’t affect userspace. For that matter, if a file is moved, the script counts it as a removal.

One of the goals of the Linux kernel community is to put Linux in places where it doesn't yet exist; for example, in Level-A, highly hazardous situations. But that requires developers and architects to produce design and architecture documentation that the Linux kernel just doesn't have. That documentation would normally cover developer and author intent sufficiently.

We’ve heard that having meta-information associated with kernel changes to present the intent of the author is important to community members working in safety-critical environments. We’d support efforts to make that happen if there’s a way to do so without increasing maintainer burden substantially.

Your turn: Try the upstreamed scripts

We’ve now upstreamed the scripts and you’re welcome to use them in your own environment. We’ll be grateful if you follow the links and take part in reviewing the patch with the rest of the community.

Our goal has been to share this implementation with other Linux kernel developers who want to integrate it into their build pipelines and processes. To cap off the months of work we’ve put into upstreaming it, we took the step of presenting “Improving UAPI Compatibility Review with Automated Tooling” at the Linux Plumbers Conference. Take a look at the deck and video for more details about detecting UAPI breakage before it gets to your users.
