Patching the Double-Role Quote

My first patch to the Linux kernel (from last year) was really small and had nothing to do with the kernel. Still, I find it interesting for two reasons: first, it’s an example for an unusual entrypoint to kernel contribution. The second aspect is the deceptive nature of the bug it fixes.

Let’s start with the first point. For the general public, “Linux” means some GNU/Linux distro (an entire operating system). But for hardcore GNU/Linux users and Free Software proponents, “Linux” is specifically the Linux kernel. This, however, intensifies the surprise when one learns that the Linux kernel source tree does not consist of the kernel alone – it contains some userspace components. The patch we’re talking about fixes a bug in such a component called libbpf. No knowledge about the inner workings of the kernel was needed to fix this bug. Yes, you need to be a bit familiar with the convention and the workflow of the Linux kernel development process (it’s different from contributing to a typical GitHub-based project), but that’s all there is. I believe this fact should be inspiring to anyone who wants to start contributing to the kernel.

Now let’s talk about the bug. I was casually going through the code of libbpf as part of learning more about eBPF, and I immediately noticed the use of potentially unsafe functions like strcpy. So I started inspecting the code here and there until I stumbled upon this:

len = strlen(value);
if (value[len - 1] != '"') {
    pr_warn("extern (kcfg) '%s': invalid string config '%s'\n",
        ext->name, value);
    return -EINVAL;
}

/* strip quotes */
len -= 2;

It doesn’t matter if you don’t know what libbpf is for or what it does; all that the snippet does is some basic string processing. It is placed inside a function that deals with configurations given as strings, the if check is there to detect malformed strings that do not end with a quote.

Did you see an out-of-bound read there? i.e., value[len - 1] resulting in an invalid read in case value is an empty string and len is zero? That’s where the deception is! Enough checking is done earlier (not shown above) to make sure that the string value passed to this function is at least one character long and starts with a quote. Yes, this is not future-proof, but the point is, value[len - 1] without an explicit length check was valid in this case.

So the code is safe? No! The real problem manifests with the line len -= 2, which is there to strip the quotes. This is obviously written with the assumption that the string is at least two characters long. There is earlier code to make sure that value starts with a quote and there is code inside this function to make sure value ends with a string. So even for an empty configuration value like "", the string must be at least two characters long, right?

That’s where things get tricky. Earlier checks only make sure that the string is at least one character long and starts with a quote. That means the check if (value[len - 1] != '"') placed inside this function would pass even for a single-character string that consists of just one quote (in other words, the opening quote could be mistaken for the closing quote). Here, len -= 2 will result in an underflow, leading to out-of-bound accesses later.

The issue was fixed with a simple change:

-   if (value[len - 1] != '"') {
+   if (len < 2 || value[len - 1] != '"') {

Yes, one more example in favour of higher abstractions and languages like Rust.

N.B.:

  • I have another patch in the kernel that improves the precision of a component run in the kernel space. Not writing much about it here because it’s part of an academic work awaiting publication.

  • Last year, I gave a talk titled From GitHub Projects to the Linux Kernel: How FOSS Contributions Work at Open Source India. It discusses the process of submitting patches to the kernel. The slides are available here.


Tags: linux, kernel, contribution, foss, bugs, security, experience

Read more from Nandakumar at nandakumar.org/blog/