Filesystem Isolation in NGINX Unit

In November 2019 (which already seems like half a century ago) we announced the addition of namespace isolation to NGINX Unit. Now we’re here to discuss the most recent addition to the isolation mechanism, namely the rootfs option of the isolation object.

As mentioned in the blog post announcing NGINX Unit 1.18.0, the rootfs option enables you to designate an arbitrary directory as the filesystem root of an application. Thus, you can configure and run apps as lightweight, on‑demand containers, improving their security, isolating them from each other and the underlying OS, and enhancing the granularity of your infrastructure.

Yet, this power doesn’t come without responsibilities.

Caveats and Technicalities

Perhaps the most important thing to mention is that the rootfs feature is available only on Linux and Unix‑based systems that support bind mounts or the nullfs filesystem.

Moreover, if the isolation object defines a new mount namespace, NGINX Unit uses the pivot_root system call instead of chroot to increase safety. Consequently, using rootfs also requires NGINX Unit’s main process to have system administrator capabilities, specifically CAP_SYS_ADMIN, so it can make all the necessary system calls.

In practice, these two considerations mean that the main process needs to run as root. While it’s highly probable that you already meet this requirement, that’s not 100% certain: in some installations, NGINX Unit’s main process can run as a different system user. The requirement to run as root creates no additional risks for the apps you run, however, because the main process doesn’t handle client connections or run application code; all of that is achieved with separate processes that run with non‑privileged credentials.

Having spelled all this out, let’s proceed to review some scenarios where rootfs comes in handy.

Securing Applications

This may seem too obvious to mention, but the primary goal of the entire isolation object is to make your apps physically unable to venture outside the limits you’ve drawn for them. When the object was first introduced, it notably lacked the ability to restrict filesystem access, but the rootfs option closes this gap.

Suppose we have a Python app that has been compromised and now can be injected with arbitrary code (via static file upload or other means). First, let’s put on the attacker’s hat and check our opportunities:
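A minimal sketch of what such an injected scan might look like, assuming plain Python with no web framework around it. The function name `find_setuid` is hypothetical; in the scenario described here, similar logic would sit behind a request handler that takes the `path` query parameter.

```python
import os
import stat


def find_setuid(path):
    """Return regular files under `path` that have the setuid bit set."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                # lstat() so that symlinks are not followed
                mode = os.lstat(full).st_mode
            except OSError:
                continue  # entry vanished or is unreadable: skip it
            if stat.S_ISREG(mode) and mode & stat.S_ISUID:
                found.append(full)
    return sorted(found)
```

Calling `find_setuid("/usr/bin")` returns a sorted list of matching paths, or an empty list if the directory does not exist or holds no setuid files.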

This code scans the system for setuid executables that can be exploited further, something the app should not be able to do in the first place. If any such executable can be tricked into providing access to sensitive data, your system is seriously compromised. Unfortunately, such misconfigurations occur on a regular basis, so let’s see how a typical situation plays out, starting with a basic setup:
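As a rough illustration, a baseline without any isolation could look like the sketch below, written as a Python dictionary that serializes to the JSON NGINX Unit accepts. The application name `flask_app`, the paths, and the module name `wsgi` are assumptions made for this example.

```python
import json

# Hypothetical baseline: one listener, one Python app, no isolation object.
# The app name, paths, and module name are placeholders for illustration.
BASIC_CONFIG = {
    "listeners": {
        "*:80": {"pass": "applications/flask_app"},
    },
    "applications": {
        "flask_app": {
            "type": "python",
            "path": "/var/app/sandbox/",       # directory holding the app module
            "home": "/var/app/sandbox/venv/",  # virtual environment
            "module": "wsgi",
        },
    },
}

print(json.dumps(BASIC_CONFIG, indent=4))
```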

Thus configured, the compromised app yields the following:

$ curl http://localhost/find/?path=/usr/bin
/usr/bin/passwd
/usr/bin/fusermount
/usr/bin/sudo
/usr/bin/chfn
/usr/bin/umount
/usr/bin/pkexec
/usr/bin/sg
/usr/bin/cp
/usr/bin/atrm

As you can see, our initial scan worked nicely: we’ve managed to obtain an attack vector with the incautiously configured /usr/bin/cp executable. Let’s continue by adding a second step to extract some valuable data, presumably by the same injection method we used earlier:

By running this code, we can peek at some sensitive information:

$ curl http://localhost/exfiltrate/?file=/etc/shadow
root:$6$QF7EX8XQ4BnLFVo/$f3hqo1vdWqK77kEuY4NOKsvgP1.XBtcO4fOND78IV/jP1i6/PtG/RHWZAqL3PQ3AVvwXwgBUbmAeOVtYDSg2o/:18471:0:99999:7:::

Now, let’s add rootfs into the mix to stop evil from prevailing again. Here’s our new configuration:
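A sketch of the change, again as Python that emits JSON: the helper `with_rootfs` is hypothetical and simply adds an isolation object with a rootfs entry to an application definition. The sandbox path and the application options are assumptions; because path-based options are interpreted relative to the new root, they are written here as seen from inside the sandbox.

```python
import copy
import json


def with_rootfs(app, rootfs):
    """Return a copy of an application object with isolation.rootfs set."""
    isolated = copy.deepcopy(app)
    isolated.setdefault("isolation", {})["rootfs"] = rootfs
    return isolated


# Hypothetical app definition; "path" and "home" are relative to the new root.
app = {"type": "python", "path": "/", "home": "/venv/", "module": "wsgi"}

print(json.dumps(with_rootfs(app, "/var/app/sandbox/"), indent=4))
```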

How does the compromised app now handle the two phases of the same attack? Let’s see:

$ curl http://localhost/find/?path=/usr/bin

The traversal yields nothing because the system directories aren’t mapped to the new filesystem root.

$ curl http://localhost/exfiltrate/?file=/etc/shadow
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>500 Internal Server Error</title>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>

In other words: “Not so fast, cowboy!” As intended, we can’t reach the sensitive file now: the rootfs option bars the app from accessing anything outside its new root. We can neither read the coveted /etc/shadow nor traverse the system directories at will; the app is restricted to its sandbox directory.

But what if the attacker knows the app is going to be sandboxed and accounts for that? Long story short, there are quite a few well‑known ways to escape a chroot environment, and they can be put to use with sufficient preparation. However, NGINX Unit guards against them by providing a way to employ the pivot_root system call instead of chroot. All you need to do is enable the mount namespace along with the rootfs option:
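The isolation object for this case might look like the sketch below; the rootfs path is a placeholder. The helper `root_switch_method` is hypothetical and only restates the rule described in this post (a mount namespace plus rootfs means pivot_root, rootfs alone means chroot); it is not NGINX Unit's own code.

```python
import json

# Isolation object only; the rootfs path is an assumption for illustration.
isolation = {
    "namespaces": {"mount": True},
    "rootfs": "/var/app/sandbox/",
}


def root_switch_method(isolation_obj):
    """Name the system call implied by an isolation object, per the rule above."""
    if "rootfs" not in isolation_obj:
        return None  # the filesystem root is not changed at all
    if isolation_obj.get("namespaces", {}).get("mount"):
        return "pivot_root"
    return "chroot"


print(json.dumps({"isolation": isolation}, indent=4))
print(root_switch_method(isolation))
```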

Finally, another neat feature of rootfs is that you can automatically map language‑specific dependencies to the new filesystem. That’s why we didn’t have to do anything to make our import directives valid after enabling rootfs. They still look like this:

Note that if rootfs only changed the filesystem root, the second import directive would become invalid – the standard modules just wouldn’t be anywhere to be found. Also note that we didn’t have to do anything to get our Flask virtual environment going; that happened automatically. Unfortunately, this type of mapping is currently available only for some of the languages NGINX Unit supports, namely Java, Python, and Ruby.
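To see why the mapping matters, the following sketch prints where the interpreter keeps its standard library and checks whether that directory lies inside a given root. The sandbox path and the helper name `is_inside` are assumptions for illustration; on a typical installation the standard library lives under a system prefix, outside any application directory.

```python
import os
import sysconfig

# Directory that holds the standard library modules of this interpreter.
stdlib_dir = sysconfig.get_paths()["stdlib"]


def is_inside(path, root):
    """True if `path` is `root` itself or located somewhere below it."""
    path = os.path.realpath(path)
    root = os.path.realpath(root)
    return os.path.commonpath([path, root]) == root


app_root = "/var/app/sandbox"  # hypothetical rootfs value
print(stdlib_dir, is_inside(stdlib_dir, app_root))
```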

Speaking of mappings, let’s briefly discuss another issue that can be solved with rootfs.

Manipulating Global Dependencies

As noted above, NGINX Unit enables an elaborate internal mechanism to ensure language‑specific dependencies remain available to your app after rootfs toggles it to a new filesystem root. But what about other dependencies?

Usually, in situations where the app relies on custom libraries or modules that can be placed anywhere on the system, the working_directory option and the environment object, along with such language‑specific options as root, allow you to switch between custom dependencies as you see fit.

Moreover, some languages themselves offer a toolset to manipulate such dependencies with virtual environments or other types of versioning. NGINX Unit in fact makes use of this, natively supporting virtual environments in Python. However, if your app has fixed system‑wide dependencies that need to be located at absolute predefined paths, the situation can get messy: using different versions of a dependency side by side can become complicated, if not outright impossible. Nevertheless, you can employ rootfs to implement a basic runtime toggling mechanism.

Imagine the following PHP app (though you can perform the same trick with other languages):

Now, let’s toy a little with our dependency, module.php. Say we have two versions of it (nothing too fancy – just to show they’re different from each other):

Now we place two identical copies of our app in /www/data/a/ and /www/data/b/ and apply the following configuration:
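A sketch of such a configuration, expressed as a Python dictionary that serializes to JSON. The application name `ab_app` and the `/www/data/a/` path come from this walkthrough; the value of the root option (`/app/`) is an assumption.

```python
import json

AB_CONFIG = {
    "listeners": {
        "*:80": {"pass": "applications/ab_app"},
    },
    "applications": {
        "ab_app": {
            "type": "php",
            "root": "/app/",  # assumed location of the PHP files inside the sandbox
            "isolation": {"rootfs": "/www/data/a/"},
        },
    },
}

print(json.dumps(AB_CONFIG, indent=4))
```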

Note that the root option of the application is relative to the rootfs value. In fact, this applies to all application path‑based options when rootfs is used.
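The path arithmetic can be sketched as follows. The helper `host_path` is hypothetical and only illustrates how a path seen from inside the sandbox maps back to the host filesystem; the `/app/` value is an assumed root setting.

```python
import posixpath


def host_path(rootfs, inner_path):
    """Map a path as seen inside the sandbox to its location on the host."""
    # Strip the leading slash so join() appends instead of replacing rootfs.
    return posixpath.normpath(posixpath.join(rootfs, inner_path.lstrip("/")))


print(host_path("/www/data/a/", "/app/"))  # prints /www/data/a/app
```

Switching rootfs to `/www/data/b/` leaves the root option untouched while moving the resolved host directory to `/www/data/b/app`.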

With this configuration, the curl command yields the following:

$ curl http://localhost
Implementation A, legacy: How do you like this?

Next, we want to switch our module.php to version B. Instead of painstakingly reinstalling it or (to be more realistic) firing up another container, we need only run the following command:

$ curl -X PUT -d '"/www/data/b/"' --unix-socket /var/run/control.unit.sock http://localhost/config/applications/ab_app/isolation/rootfs/

This updates the rootfs setting (note the config API path in the URL), leaving all other parts of NGINX Unit’s configuration intact. Now the curl query results in a different response:

$ curl http://localhost
Implementation B, fancy: How do you like this?
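The same update can also be scripted. The sketch below uses only the Python standard library to send the PUT request over the control socket; the class and function names are hypothetical, and the socket path is the one used in the curl command above.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks over a Unix domain socket."""

    def __init__(self, socket_path, timeout=5.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def rootfs_update(app_name, rootfs):
    """Build the (method, URI, body) triple for a rootfs change."""
    uri = f"/config/applications/{app_name}/isolation/rootfs/"
    return "PUT", uri, json.dumps(rootfs)


def apply_update(socket_path, method, uri, body):
    """Send the request to the control socket; return (status, response text)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request(method, uri, body=body)
        response = conn.getresponse()
        return response.status, response.read().decode()
    finally:
        conn.close()
```

With NGINX Unit running, `apply_update("/var/run/control.unit.sock", *rootfs_update("ab_app", "/www/data/b/"))` would perform the switch, and passing `/www/data/a/` instead would switch back.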

This is only a brief example, but it shows how with NGINX Unit you can avoid the tedium of creating and maintaining multiple virtual machines or containers just to account for varied combinations of system‑wide dependencies. Instead you can replicate and regress through any variations of standard libraries and custom‑built modules that your application or language runtime expects at predefined paths, just by altering a single setting in NGINX Unit’s configuration. Unfortunately, this ability can behave somewhat erratically when layered, hidden, and indirect dependencies are present, but that’s what we’re actively working on now.

Conclusion

We’ve discussed a couple of use cases that benefit from the new rootfs feature. This, we believe, sufficiently outlines the way to transform NGINX Unit from a modest but robust web server into a potent lightweight containerization engine. We don’t want it to seem like we’re overpromising though – many features are still to be delivered in full, like the language‑specific dependency mapping mentioned above.

We also recognize that we need to add another feature before we can claim a seriously versatile set of isolation capabilities, namely directory bindings. To fully deliver on NGINX Unit’s containerization promise, we need to support mounting arbitrary directories not just at the root, but at any point in our make‑believe filesystem. Look for it in an upcoming release!

As always, we invite you to check out our roadmap, where you can find out whether your favorite feature is going to be implemented any time soon, and rate and comment on our in‑house initiatives. Feel free to open new issues in our repo on GitHub and share your ideas for improvement.