In this post, we want to highlight one of the less visible NGINX projects from 2024, which we completed with today’s release of NGINX 1.27.4. This project focuses on optimizing load time and resource usage for configurations with a large number of SSL contexts. To see every update in this release, head over to nginx.org/en/CHANGES.
Let’s start with the problem: In NGINX, each block in the configuration file, whether for server or location, is processed independently. While we inherit directive values from the upper level if no definition exists in the current context, more complex objects like `SSL_CTX` from the OpenSSL library are usually created from scratch. That allows us to avoid complex inheritance logic and keep configuration handlers simpler and more maintainable. Any departure from this rule requires a very good justification.
Creating the SSL context appears to be just that – a very expensive operation that requires a special approach. That’s why the first enhancement was committed back in 2022. We started by checking whether any of the proxy configuration directives that directly affect the SSL context were redefined in the current location. If not, we allowed the entire `SSL_CTX` to be reused from the parent block.
As a result, the following configuration block now reads the certificates and key, and creates the necessary OpenSSL structures only once, as the nested location is allowed to inherit them.
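A configuration exercising this optimization might look like the following sketch (the upstream names and certificate paths are illustrative, not from the original post):

```nginx
location /api/ {
    proxy_pass https://backend;

    # The SSL context is created once here (illustrative file names).
    proxy_ssl_certificate         /etc/ssl/certs/client.crt;
    proxy_ssl_certificate_key     /etc/ssl/private/client.key;
    proxy_ssl_trusted_certificate /etc/ssl/certs/ca.crt;

    location /api/v2/ {
        proxy_pass https://backend;
        # No proxy_ssl_* directives are redefined here, so the entire
        # SSL_CTX is inherited from the parent location.
    }
}
```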
Now, if you copy any of the directives above from the parent location to the nested one, the optimization will stop working. We could have checked if the directive parameters had changed, but this would add an extra, unavoidable, and sometimes significant cost for every location block.
A year later, we encountered another interesting case: A configuration that takes nearly a minute to load, with most of the time spent repeatedly reading and parsing the same set of SSL certificates, secret keys, and trusted certificate authority lists.
While the optimization above could’ve partially alleviated this issue, the reality is that any NGINX configuration that takes dozens of seconds to parse is typically on the larger side, compiled from multiple sources and managed by many people from different parts of the organization. Rewriting such a configuration in a production environment would be an impossible goal. So, we returned to the drawing board, trying to figure out how to load such a configuration faster.
Optimizing the Single Configuration Cycle
The next approach we took was more direct: Ensure that a single SSL file is read only once during a single configuration (re)load cycle, regardless of how many references to the file exist.
OpenSSL and its related libraries already handle part of this job for us. All objects – whether `X509` certificates, `EVP_PKEY` private keys, or `X509_CRL` structures with certificate revocation status – are reference-counted. This allows us to keep a single copy of a parsed object and reuse it as needed.
Our task is to add a cache lookup whenever we need to load a certificate, private key, or any other SSL object, and continue using the obtained copy. While it sounds simple, it involves reimplementing all the subtle behavioral differences across OpenSSL versions and the other supported SSL libraries, and extensive testing is necessary to catch regressions.
The first part of the changes was released in NGINX 1.27.2 last October and included caching SSL objects during a single configuration load cycle.
Performance
Now, let’s look at the performance. Our regular test environment consists of Dell PowerEdge M630 servers with Intel Xeon E5-2650 v3 CPUs, running Ubuntu 24.04 LTS with OpenSSL 3.0.13. The configuration test command `nginx -t` performs a full configuration load and then exits, giving us an accurate estimate.
We run this command on one of the servers and collect two metrics:
- Retired instructions counter – Reported by the `perf stat` utility, this is the number of instructions required to execute the program, excluding the mispredicted branches. The counter is generally proportional to the execution time but is more deterministic and does not depend on the hardware or the system load.
- Memory increase per configuration – Reported by `mallinfo2()` from the GNU libc allocator implementation, this measures the difference in allocated memory before and after loading the configuration.
The test configuration is a single server block with 10,000 locations containing `proxy_ssl_certificate` directives:
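A sketch of such a configuration (the paths and ports are illustrative; in practice the 10,000 location blocks would be generated by a script):

```nginx
http {
    server {
        listen 127.0.0.1:8080;

        location /0/ {
            proxy_pass https://127.0.0.1:8443;
            proxy_ssl_certificate     /etc/ssl/certs/cert0.crt;
            proxy_ssl_certificate_key /etc/ssl/private/cert0.key;
        }

        location /1/ {
            proxy_pass https://127.0.0.1:8443;
            proxy_ssl_certificate     /etc/ssl/certs/cert1.crt;
            proxy_ssl_certificate_key /etc/ssl/private/cert1.key;
        }

        # ... and so on, up to location /9999/
    }
}
```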
We’ll also introduce a variable parameter called the “uniqueness percentage.” At the low end, all the locations in the configuration use the same SSL certificate and key pair. At the high end, every SSL certificate is unique. Both extremes are uncommon; real-world usage typically falls somewhere in between.
And here’s the result:
As demonstrated, both the configuration load time and the memory usage now depend on the number of unique `proxy_ssl_certificate` entries, rather than the total number of certificates being loaded. And the cost is a small increase in memory usage starting from 90% uniqueness.
I mentioned that we ran the tests with OpenSSL 3.0. It is well known that subsequent OpenSSL 3.x releases improved performance, but our gains remain significant even with OpenSSL 3.4, especially in terms of memory savings. And some of our supported operating systems will continue using OpenSSL 3.0 for the foreseeable future.
Later, we learned that the configuration that prompted this work contained a rather small number of unique objects, and its reload time was reduced to just a few seconds.
Further Optimization of Configuration Reloads
At this point, we have a lookup table that stores all the SSL objects along with the current configuration. With all this data, can we make any further improvements?
In NGINX, a configuration reload involves reading all the configuration files and creating a new, complete instance of the whole configuration. However, in modern environments the changes between reloads are usually minimal, often limited to a few servers or locations. The bulk of the configuration remains unchanged, as do the referenced SSL objects. We can now peek into the previous configuration data and reuse the already parsed representation of objects that haven’t changed.
To determine when to reuse the objects, we established the following criteria:
- `data:<base64-encoded-object>` is always inherited, because the data string, which serves as a key in the lookup table, is a full representation of the object.
- `engine:…` is always reloaded, as it usually refers to an object in external storage, and we can’t determine whether it has changed since the last read. We still cache engine keys within a single configuration cycle.
- Regular files are revalidated based on their metadata – the last modification time and the inode number, a unique identifier assigned by the underlying filesystem. If both are unchanged, it is safe to reuse the data from the previous configuration.
- There are situations when the file metadata is not available. In this case, we consider the file modified and reload it, just to be safe.
- Encrypted private keys are exempt from caching. Otherwise, it was possible to obtain a reference to an encrypted key in a context that did not contain an appropriate `ssl_password_file` configuration. This change was also retroactively applied to the full configuration reload.
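The three reference forms these criteria distinguish can be written like this (all values below are illustrative placeholders):

```nginx
# Regular file: revalidated by modification time and inode between reloads
ssl_certificate     /etc/ssl/certs/example.crt;

# Inline object: the base64 string itself is the lookup key,
# so it is always inherited
ssl_certificate_key data:LS0tLS1CRUdJTi...;

# Engine reference: always reloaded across configuration cycles,
# but still cached within a single cycle
ssl_certificate_key engine:pkcs11:key_id;
```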
As a last resort, inheritance between configuration cycles can be switched off entirely with the `ssl_object_cache_inheritable off` top-level directive. We hope this directive will rarely be needed, but it’s difficult to predict every possible configuration. It’s also a bit tricky to apply – the directive marks the configuration currently being loaded as available for inheritance by the next configuration reload. Therefore, it takes two reloads or a full restart to completely clear the cache after setting it to off.
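The directive goes in the top-level (main) context of the configuration file:

```nginx
# Main context of nginx.conf: the configuration loaded with this setting
# will not be offered for inheritance on the next reload.
ssl_object_cache_inheritable off;

http {
    # ...
}
```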
This enhancement was released in 1.27.4 and effectively skips re-reading all already known SSL objects that haven’t changed, making reloads even faster.
Caching SSL Certificates and Secret Keys with Variables
Back in 2019, with the release of NGINX 1.15.10, we introduced the ability to load SSL certificates and secret keys from variables. The feature enabled lazy loading of the certificates and allowed the configuration of virtual servers to serve multiple domain names, with distinct certificates for each domain:
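Such a virtual server can be sketched roughly as follows, with the certificate paths derived from the `$ssl_server_name` variable (the directory layout here is an illustrative assumption):

```nginx
server {
    listen 443 ssl;
    server_name example.com www.example.com;

    # The certificate pair is chosen lazily, per handshake,
    # based on the SNI name sent by the client.
    ssl_certificate     /etc/ssl/certs/$ssl_server_name.crt;
    ssl_certificate_key /etc/ssl/private/$ssl_server_name.key;
}
```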
In combination with NGINX JavaScript (njs), the variable support for SSL certificates enabled some interesting scenarios like njs-acme. This feature also made possible several advanced cases in our commercial product, NGINX Plus.
At the time of its introduction, we estimated the performance penalty at around 20-30% per TLS handshake. This impact is negligible for long-lived HTTP/2 or HTTP/3 sessions, but becomes more visible if your traffic pattern includes short sessions with a small number of requests per connection.
Recently, we measured the impact again and were surprised to see the overhead increase with OpenSSL 3.0. This prompted us to take action. We decided to adapt the same framework to cache certificates loaded at runtime, even if only for a short period. This is a significant change in behavior, so we added a set of new directives to enable the optimization for the current configuration scope.
`ssl_certificate_cache max=N [inactive=time] [valid=time]` is a directive that defines a new cache that can store N SSL certificates or secret keys specified with variables. When the cache overflows, the least recently used elements are replaced. Certificates and keys are counted as separate cache elements.

The cache can be set at the http block level and inherited by all server contexts, configured for each server individually, or disabled with `ssl_certificate_cache off`.
After the period set with `valid=time` expires, the cached object is reloaded or revalidated, following the rules established in the configuration reload optimization section.
Similarly, `proxy_ssl_certificate_cache`, `grpc_ssl_certificate_cache`, and `uwsgi_ssl_certificate_cache` enable caching of the authentication certificates and keys for HTTP, gRPC, and uwsgi upstream servers.
Here’s a basic usage example:
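A minimal sketch, assuming per-SNI certificate files on disk (the cache sizes, timeouts, and paths are illustrative, not recommendations):

```nginx
http {
    # Cache up to 1000 certificate/key objects loaded from variables;
    # drop entries unused for 30s, revalidate entries after 1m.
    ssl_certificate_cache max=1000 inactive=30s valid=1m;

    server {
        listen 443 ssl;
        server_name example.com;

        # Loaded from variables, so eligible for runtime caching.
        ssl_certificate     /etc/ssl/certs/$ssl_server_name.crt;
        ssl_certificate_key /etc/ssl/private/$ssl_server_name.key;
    }
}
```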
Estimating the performance impact of the change was tricky. To avoid hitting bottlenecks like network interface throughput or traffic generator capacity, we had to deoptimize the configuration as much as possible.
The final configuration included:
- Reducing the number of worker processes to 1
- Disabling TLS session resumption with `ssl_session_tickets off`
- Disabling HTTP keepalive
These changes ensure that we measured the worst possible scenario for an HTTP server, with a full TLS handshake on every request.
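A hypothetical sketch of such a deoptimized benchmark configuration (directive values are assumptions for illustration):

```nginx
# Force every request through a full TLS handshake.
worker_processes 1;

http {
    server {
        listen 443 ssl;
        server_name example.com;

        ssl_certificate     /etc/ssl/certs/$ssl_server_name.crt;
        ssl_certificate_key /etc/ssl/private/$ssl_server_name.key;

        ssl_session_tickets off;  # no TLS session resumption
        keepalive_timeout   0;    # no HTTP keepalive: one request per connection
    }
}
```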
With all these changes applied, we observed the following results:
The performance is almost back to the level of a statically configured certificate pair, while still retaining the benefits of using variables in the configuration.
Try It Out and Share Your Feedback
This change was released in NGINX 1.27.4 and we hope you find it useful for your configuration. If you have any feedback, suggestions, or further feature requests for NGINX, please let us know in GitHub Discussions or via GitHub Issues.