Securing Custom Protocols With Noise

Backend services have to provide secure communication channels. Traditionally, in services like those provided by Amazon, Microsoft, or Google, the externally accessible interface is an HTTPS endpoint, and hopefully the TLS connection is configured to allow only secure cipher suites and to present the proper certificates.

In my time doing security reviews with teams across different parts of AWS services, I’ve seen that it’s easy for teams to follow standard guidelines on how to secure the customer surface, but it becomes harder to get the right security and confidentiality properties in loosely coupled internal services. Of course, guidelines exist there as well, and teams follow them very well. But behind the APIs is where the business logic is implemented and where the services need to scale. This is where the ingenuity of the engineers is required to come up with scalable architectures and efficient solutions. In many cases, this requires some kind of communication between service components, and TLS is not the right answer in every one of these cases.

Background

The last time I was dealing with such a scenario, we had the following setup. Multiple parties were communicating through a routing proxy. The proxy was providing basic infrastructure routing capability and very limited protocol inspection. The endpoints were loosely coupled and needed end-to-end security and integrity.

Simple Proxy Connection

There were multiple alternatives for end-to-end encryption, for example nesting TLS connections through the proxy or protecting the payloads with symmetric or asymmetric keys. None of these approaches felt elegant or scalable.

In this context, I started looking at how TLS works under the hood and how it provides the necessary security properties. Here, I learned about Diffie-Hellman key exchanges and how they are embedded into TLS. Using the low-level OpenSSL functions, I was able to quickly draw up a protocol that builds upon DH key exchanges using ephemeral and static public keys, and I was happy.

The downside of this approach, however, was that it required quite a bit of ugly OpenSSL code, there was no guarantee that it would behave identically across programming languages, and there was some desire not to “roll our own crypto”. The idea was abandoned at this point and replaced with something readily available that fit the problem less well.

A couple of months down the road I came across Noise, a framework for building secure protocols based on DH key exchanges, designed to make it very hard to mess up the communication channel.

The Noise Protocol Framework

Initially, I thought this might be interesting but did not look into it much. But browsing through the specification, I liked the simplicity of the approach and continued reading. Essentially, Noise is built upon handshake patterns that are used to establish the secure communication channel. Different elements of the patterns can be combined according to the scenario. Then it struck me when looking at the following handshake pattern:

KK:
     -> s
     <- s
     ...
     -> e, es, ss
     <- e, ee, se

In Noise-speak, this means that two parties (Alice and Bob) communicate and implement the protocol with the following agreed-upon handshake. The KK pattern represents a scenario where the two parties have previously exchanged their static public keys. To initiate the communication, Alice executes a series of DH and key-derivation steps that Bob mirrors on his side. Bob then returns a message with keys derived similarly.
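One way to read the two-letter tokens: the first letter names a key that belongs to the initiator (Alice) and the second a key that belongs to the responder (Bob), so es pairs Alice's ephemeral key with Bob's static key. The short Python sketch below encodes the KK pattern as plain data and prints which key halves each side would combine. The names KK_PATTERN, KEY_NAMES and dh_inputs are made up for illustration and do not come from any Noise library.

```python
# Illustrative only: the KK pattern as plain data. KK_PATTERN, KEY_NAMES and
# dh_inputs are hypothetical names, not taken from a Noise library.
KK_PATTERN = {
    "pre_messages": [("->", ["s"]), ("<-", ["s"])],
    "messages": [("->", ["e", "es", "ss"]), ("<-", ["e", "ee", "se"])],
}

KEY_NAMES = {"e": "ephemeral", "s": "static"}


def dh_inputs(token: str, initiator: bool) -> tuple[str, str]:
    """Return (own private key, peer public key) used for a two-letter token.

    The first letter refers to the initiator's key and the second to the
    responder's key, so each side uses its own private half and the other
    side's public half.
    """
    first, second = token
    if initiator:
        return (f"own {KEY_NAMES[first]} private", f"peer {KEY_NAMES[second]} public")
    return (f"own {KEY_NAMES[second]} private", f"peer {KEY_NAMES[first]} public")


for direction, tokens in KK_PATTERN["messages"]:
    for token in tokens:
        if len(token) == 2:  # single-letter tokens only transmit a public key
            print(direction, token,
                  "| Alice:", dh_inputs(token, initiator=True),
                  "| Bob:", dh_inputs(token, initiator=False))
```

Keeping the pattern as data rather than hard-coding the steps mirrors how the specification describes patterns, and it makes it easy to see that both sides always touch only their own private keys.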

The cool thing is that this particular pattern was very similar to the protocol that I came up with in my project, but much better thought through and more secure because of the additional encryption with authenticated data.

And I even found a confirmation why this approach would have been very beneficial in our case:

  1. 0-RTT Encryption (0 roundtrip encryption) - communication can directly start with the first message since all components are known ahead of time, in this case the public keys.
  2. Sender authentication is resistant to key-compromise impersonation (KCI).
  3. Encryption to a known recipient, strong forward secrecy.

But this doesn’t explain how and why it works. In the next section, I’m going to try to explain for the above case how the keys are generated and try to shed some light on how the message protocol ends up working.

Noise Basics

Noise combines a small set of well-known, well-researched primitives in an elegant way that leaves little room for confusion and mistakes. The most important parts that need to be understood are:

  • Diffie-Hellman key exchange, in particular elliptic-curve Diffie-Hellman (ECDH). The very short summary is that ECDH lets two parties derive the same shared secret: each side combines its own private key with the other side's public key.
  • Hashing using well-known and well-understood hash algorithms like SHA-256.
  • Key derivation using an HMAC-based key derivation function (HKDF), where the HMAC is built on the same hash function that is used in the hashing steps.
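To make the first building block tangible, here is a minimal Python sketch of a Diffie-Hellman exchange. It rests on a loud assumption: the Python standard library has no elliptic-curve DH, so the sketch swaps in classic modular-exponentiation DH over the prime 2^255 - 19 with generator 2. That is a toy stand-in, not what Noise uses, and not secure. The only point is that both sides end up with the same bytes while exchanging nothing but public keys. The helper names generate_keypair and dh are my own.

```python
import hashlib
import secrets

# Toy parameters (assumption for illustration): modular DH instead of the
# elliptic-curve DH that Noise uses. Do not use this for real security.
P = 2**255 - 19
G = 2


def generate_keypair() -> tuple[int, int]:
    """Return (private, public) for the toy group."""
    private = secrets.randbelow(P - 3) + 2  # uniformly in [2, P - 2]
    return private, pow(G, private, P)


def dh(own_private: int, peer_public: int) -> bytes:
    """Combine our private key with the peer's public key."""
    return pow(peer_public, own_private, P).to_bytes(32, "big")


alice_priv, alice_pub = generate_keypair()
bob_priv, bob_pub = generate_keypair()

# Only the public halves cross the network; each side computes the same bytes.
alice_secret = dh(alice_priv, bob_pub)
bob_secret = dh(bob_priv, alice_pub)
print(alice_secret == bob_secret)  # prints True

# Hashing maps arbitrary-length input to a fixed-size digest (32 bytes here).
print(hashlib.sha256(alice_pub.to_bytes(32, "big")).hexdigest())
```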

With a rudimentary understanding of what happens using the above building blocks, we can now take a deeper look into the message handshake protocol. As mentioned before, the handshake protocol is an exact specification of how to build an internal state so that both parties end up with the same symmetric encryption keys.

However, instead of just using Diffie-Hellman on long-lived key-pairs, Noise allows several combinations of static and ephemeral key pairs to achieve the desired security properties.

KK Handshake - e, es, ss pattern.

In the diagram below, I walk you through the first important sequence of the “KK” handshake pattern. As a quick recap, the “KK” message pattern relies on the fact that the two parties have already exchanged their public keys, but want to establish a secure communication channel with forward secrecy.

The key element of the KK pattern is that both parties know each other’s public keys. The exchange of the necessary public keys has happened before in a controlled environment. For example, imagine a fleet of backend hosts where you generate a dedicated key pair for each host. In a cloud environment, you can use a key management service to provide these keys and make sure they’re properly bound to only authorized hosts.

A quick legend for the image below:

  • The blue left side is Alice, this actor uses one static key pair for authentication and an ephemeral key pair for connection establishment. The ephemeral key pair should be generated for every new session.
  • The green right side is Bob, this actor has a similar static key pair and an ephemeral key pair for connections.
  • Green arrows indicate the exchange of public keys from Alice to Bob (solid green arrow) or from Bob to Alice (dotted green arrow).
  • Red arrows indicate when the private keys are accessed by either party.
  • The dotted line in the middle symbolizes the network interface between the two.

Graphical Explanation of what happens during the handshake.

Now, let’s walk through the first part of the handshake. Extending the diagram further would make it particularly messy, but the first part is enough to get the gist of the chaining process. The Noise specification mentions three state contexts: handshake state, symmetric state, and cipher state.

  • The handshake state contains and builds the public and private keys needed to process the messages.
  • The symmetric state contains a hash value h and a chaining key ck that are continuously updated to build the internal state.
  • The cipher state contains a symmetric encryption key k and a nonce n that is incremented every time the encryption key k is used. k can be used to encrypt payloads that are part of the handshake messages and is particularly useful for zero roundtrip encryption.
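As a rough mental model, the three states nest inside each other. The following Python sketch shows one possible layout using dataclasses. The field names k, n, h, ck, s, e, rs and re follow the variable names in the Noise specification; the class layout itself, and the use of plain bytes as a placeholder for keys, are my assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CipherState:
    k: Optional[bytes] = None   # symmetric key; empty until the first DH is mixed in
    n: int = 0                  # nonce, incremented after every encryption with k


@dataclass
class SymmetricState:
    h: bytes                    # running hash over everything processed so far
    ck: bytes                   # chaining key, updated with every DH result
    cipher: CipherState = field(default_factory=CipherState)


@dataclass
class HandshakeState:
    symmetric: SymmetricState
    initiator: bool
    s: Optional[bytes] = None   # own static key pair (placeholder type)
    e: Optional[bytes] = None   # own ephemeral key pair (placeholder type)
    rs: Optional[bytes] = None  # remote party's static public key
    re: Optional[bytes] = None  # remote party's ephemeral public key


# A fresh state: all-zero placeholders for h and ck, and no key yet.
state = HandshakeState(SymmetricState(h=bytes(32), ck=bytes(32)), initiator=True)
print(state.symmetric.cipher)
```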

The first part of the handshake is to process the pre-messages. Certain handshake patterns do not have pre-messages, others do. In the case of the “KK” pattern, the pre-message contains the previously exchanged public keys of the two parties.

  1. To initialize the state, the full protocol name (which names the pattern, the DH function, the cipher, and the hash function) is hashed or, if it already fits into one hash output, zero-padded to that length. The result initializes the h and ck variables, which are the most important ones for tracking the cryptographic state. After this initial step (1), h and ck have the same value.
  2. In this step, the static public key of Alice is hashed and h updated appropriately.
  3. Now, the static public key of Bob is hashed and h is updated. This concludes the pre-message handling.
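The three initialization steps can be sketched in a few lines of Python using only hashlib. Two details come from the Noise specification rather than from the description above: a protocol name that already fits into one hash output is zero-padded instead of hashed, and a prologue (empty here) is mixed into h before the pre-message keys. The function names and the two placeholder public keys are assumptions for illustration.

```python
import hashlib

HASHLEN = 32  # output size of SHA-256 in bytes


def initialize_symmetric(protocol_name: bytes) -> tuple[bytes, bytes]:
    """Return the initial (h, ck) for a protocol name."""
    if len(protocol_name) <= HASHLEN:
        h = protocol_name.ljust(HASHLEN, b"\x00")  # short names are zero-padded
    else:
        h = hashlib.sha256(protocol_name).digest()
    return h, h  # ck starts out equal to h


def mix_hash(h: bytes, data: bytes) -> bytes:
    """Fold data into the running hash: h = HASH(h || data)."""
    return hashlib.sha256(h + data).digest()


# Placeholder public keys (assumption): real ones come from key generation.
alice_static_pub = bytes([0xA1]) * 32
bob_static_pub = bytes([0xB0]) * 32

h, ck = initialize_symmetric(b"Noise_KK_25519_AESGCM_SHA256")  # step 1
h = mix_hash(h, b"")                # empty prologue, mixed in per the spec
h = mix_hash(h, alice_static_pub)   # step 2: pre-message "-> s"
h = mix_hash(h, bob_static_pub)     # step 3: pre-message "<- s"
print(h.hex())
print(ck.hex())
```

Note that only h changes during pre-message processing; ck stays at its initial value until the first DH result is mixed in.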

After processing the pre-message, it is now time to process all the handshake symbols of the first part of the handshake. Namely, e, es, ss:

  1. Processing e means hashing the public ephemeral key of Alice and updating h. Also, it will append the ephemeral public key of Alice to the message buffer that is sent to Bob. This marks the exchange of the public key.
  2. Processing es: This is the first time we perform a Diffie-Hellman operation to derive a key. We use the ephemeral private key of Alice and the static public key of Bob to derive a new temporary key.
  3. The chaining key ck and the temporary key are fed into an HMAC-based key derivation function. HMAC with the selected hash function is applied several times to create two new secret values. At the end of this step, ck is replaced by the first output, the secret encryption key k is set to the second output, and the nonce n is reset to 0.
  4. The final step is to process ss. This will perform a similar operation as in the previous step but using a different set of keys. It will use the static private key of Alice and the static public key of Bob. First, it uses Diffie-Hellman to generate a temporary symmetric key.
  5. This is the final step of processing the ss pattern. It uses the temporary output value of the previous step as the input for the key derivation function. Again, this will update ck and k of the cryptographic state.

Bob’s side performs more or less the same operations, mirrored and with the opposite private keys. Due to the way that Diffie-Hellman works, Bob’s side builds the same cryptographic state.
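The following Python sketch walks through the e, es, ss tokens on both sides and checks that Alice and Bob arrive at the same h, ck and k. It is a sketch under stated assumptions: the DH is a toy modular-exponentiation group standing in for X25519 (not secure), the starting values of h and ck are placeholders rather than the result of real pre-message processing, and the encryption of the handshake payload is left out. The class State and the helpers keypair, dh and hkdf2 are names I made up.

```python
import hashlib
import hmac
import secrets

P, G = 2**255 - 19, 2  # toy DH group (assumption), stands in for X25519


def keypair() -> tuple[int, int]:
    priv = secrets.randbelow(P - 3) + 2
    return priv, pow(G, priv, P)


def dh(priv: int, pub: int) -> bytes:
    return pow(pub, priv, P).to_bytes(32, "big")


def hkdf2(ck: bytes, ikm: bytes) -> tuple[bytes, bytes]:
    """Two-output HKDF as used by Noise, built from HMAC-SHA-256."""
    temp = hmac.new(ck, ikm, hashlib.sha256).digest()
    out1 = hmac.new(temp, b"\x01", hashlib.sha256).digest()
    out2 = hmac.new(temp, out1 + b"\x02", hashlib.sha256).digest()
    return out1, out2


class State:
    def __init__(self, h: bytes, ck: bytes):
        self.h, self.ck, self.k, self.n = h, ck, None, 0

    def mix_hash(self, data: bytes) -> None:
        self.h = hashlib.sha256(self.h + data).digest()

    def mix_key(self, dh_output: bytes) -> None:
        self.ck, self.k = hkdf2(self.ck, dh_output)
        self.n = 0  # a new key restarts the nonce counter


# Placeholder state after the pre-messages (assumption for this sketch).
start_h = hashlib.sha256(b"placeholder h after pre-messages").digest()
start_ck = hashlib.sha256(b"placeholder ck").digest()

alice_s_priv, alice_s_pub = keypair()  # static keys, exchanged beforehand
bob_s_priv, bob_s_pub = keypair()

# Alice writes "-> e, es, ss".
alice = State(start_h, start_ck)
alice_e_priv, alice_e_pub = keypair()        # fresh for every session
message = alice_e_pub.to_bytes(32, "big")    # e: the public key goes on the wire
alice.mix_hash(message)
alice.mix_key(dh(alice_e_priv, bob_s_pub))   # es
alice.mix_key(dh(alice_s_priv, bob_s_pub))   # ss

# Bob reads the same tokens using his own private key.
bob = State(start_h, start_ck)
received_e_pub = int.from_bytes(message, "big")
bob.mix_hash(message)
bob.mix_key(dh(bob_s_priv, received_e_pub))  # es
bob.mix_key(dh(bob_s_priv, alice_s_pub))     # ss

print((alice.h, alice.ck, alice.k) == (bob.h, bob.ck, bob.k))  # prints True
```

Because every DH result is chained into ck, the final key depends on both the ephemeral and the static key pairs, which is where the combination of authentication and per-session freshness comes from.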

A Tabular summary of the modifications applied to the cryptographic states.

Continuing The Handshake

The above only covers the first message pattern of the handshake. The second step is to process the second part of the pattern: e, ee, se. The processing follows the same rules as before and keeps modifying the internal state, in particular h and ck. Once all patterns are processed, both sides call Split(), which calls the key derivation function again to produce two keys, one for sending messages and one for receiving messages.
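Split() itself is small. In the Noise specification it is the same HMAC-based key derivation, run with the final chaining key and empty input key material, and the first output is used for messages from the initiator to the responder. A minimal Python sketch, with a placeholder chaining key as a labeled assumption:

```python
import hashlib
import hmac


def split(ck: bytes) -> tuple[bytes, bytes]:
    """Derive two transport keys from the final chaining key.

    This is the two-output HKDF with empty input key material.
    """
    temp = hmac.new(ck, b"", hashlib.sha256).digest()
    k1 = hmac.new(temp, b"\x01", hashlib.sha256).digest()
    k2 = hmac.new(temp, k1 + b"\x02", hashlib.sha256).digest()
    return k1, k2


# Placeholder (assumption): after a real handshake both sides hold the same ck.
final_ck = hashlib.sha256(b"placeholder final chaining key").digest()

# The initiator sends with the first key and receives with the second;
# the responder uses them the other way round.
alice_send, alice_recv = split(final_ck)
bob_recv, bob_send = split(final_ck)
print(alice_send == bob_recv, alice_recv == bob_send)  # prints True True
```

Using a separate key per direction means the two nonce counters never collide, even though both sides start counting at 0.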

Once we have these two keys, Alice and Bob can communicate securely. Because ephemeral keys are combined with the static keys, the resulting keys are specific to the session.

Conclusion

Of course, in principle Noise and TLS are relatively similar, and the process of the exchange pattern is very similar to what TLS does. Certificates contain public keys and in TLS ephemeral keys are similarly used per session.

However, the benefit is that the configuration space is significantly smaller, which leaves less room to mess something up. Besides, it becomes much simpler to use Noise in places where traditional TLS is harder to implement. Noise does not aim to be exactly one protocol; it provides a framework for building protocols that are secure from the ground up but flexible enough to adapt to specific use cases.

In the introductory example, I mentioned the case where a routing layer sits between several backend instances and needs to inspect certain properties of the traffic but must not be able to read the traffic between the endpoints. This would have been an amazing opportunity to leverage Noise to secure the traffic in a more flexible and problem-specific way than relying on standard TLS.

But of course your mileage may vary and it’s always necessary to have good reasons to justify using Noise instead of TLS.

To conclude, I have probably missed certain features of Noise that deserve additional highlighting, but I leave these for you to find out directly from the specification.

Further Reading