TLSv1.2 vs TLSv1.3: A Deep Dive into Handshake Efficiency

Cloud providers, load balancer vendors, security experts, and architects all have one thing in common: at some point, they will advise you to use the TLSv1.2 and/or TLSv1.3 protocols and to abandon legacy protocols like TLSv1.0 and TLSv1.1, even though those legacy protocols are still very much in use to this day. Some of those advisors will provide a set of cipher suites to configure for these protocols, but what performance impact do these cipher suites have?

The goal of this research is to determine the efficiency of TLSv1.3 handshakes compared to TLSv1.2 using the same settings and hardware and how cipher suites impact these results.

Since TLS is a core technology for most data in the IT industry, the results can provide evidence that drives choices that both reduce ecological impact and improve the efficiency of IT environments.

Introduction

Transport Layer Security (TLS) is the successor to Secure Sockets Layer (SSL). Despite this, many still refer to TLS as SSL due to its historical roots and familiarity within the cryptography community. TLS provides a standardized method for securing data communication between two parties over a network connection.

TLS operates using algorithms that encrypt data through computational processes. The algorithms available vary by protocol version and required security level. A ‘Cipher Suite’ refers to the specific set of algorithms used in a connection and their configurations.

The encryption and decryption of a TLS connection consumes computational power on a computer’s Central Processing Unit (CPU), which performs the computations for the algorithms. These connections, from beginning to end, are often referred to as ‘SSL transactions’. The computational power required for a single transaction is insignificant in modern IT, but what if there are millions of connections to decrypt simultaneously? In that case, the load can saturate the CPU of a server and potentially delay responses.

TLS has gone through multiple versions, and anything before TLSv1.2 is considered legacy (TLSv1.1, TLSv1.0, SSLv3, etc.). Currently, TLSv1.2 and TLSv1.3 are the widely used protocols. Some of the improvements of TLSv1.3 over its predecessor relate to efficiency, and one of those concerns the approach to cipher suites.

Cipher suites

SSL/TLS communication cannot occur without a cipher suite. The cipher suite is a combination of algorithms that together secure the communication. Cipher suites have a complex naming convention. To understand the naming convention properly, let’s dissect one.

Cipher suites are described with multiple abbreviations, for example: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384_P384

Figure: Cipher suite structure (source: https://learn.microsoft.com/en-au/windows/win32/secauthn/cipher-suites-in-schannel)
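The naming convention can also be illustrated in code. The following is a minimal Python sketch, not an official parser: the function name `split_cipher_suite` and the returned field names are made up for this illustration, and it only covers the naming pattern discussed in this article, including the optional curve suffix (such as P384) that Schannel appends.

```python
import re

def split_cipher_suite(name: str) -> dict:
    """Split a cipher suite name into its building blocks (illustrative only)."""
    parts = name.split("_")
    if parts[0] == "TLS":
        parts = parts[1:]              # drop the protocol prefix

    curve = None
    if re.fullmatch(r"P\d+", parts[-1]):
        curve = parts.pop()            # optional elliptic-curve suffix, e.g. P384

    hash_alg = parts.pop()             # last remaining token, e.g. SHA256

    key_exchange = signature = None
    if "WITH" in parts:                # TLSv1.2-style name
        idx = parts.index("WITH")
        left, parts = parts[:idx], parts[idx + 1:]
        key_exchange = left[0]
        signature = left[1] if len(left) > 1 else None

    return {
        "key_exchange": key_exchange,
        "signature": signature,
        "cipher": "_".join(parts),     # bulk encryption algorithm and mode
        "hash": hash_alg,
        "curve": curve,
    }

print(split_cipher_suite("TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"))
print(split_cipher_suite("TLS_AES_256_GCM_SHA384"))
```

The second call shows a TLSv1.3-style name: it carries no key exchange or signature part, so those fields stay empty.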

If you would like to learn more in-depth about cipher suites, consider the following Wikipedia article.

TLSv1.2 vs TLSv1.3

TLSv1.2 and TLSv1.3 function differently, with TLSv1.3 being the newer version. A key improvement in TLSv1.3 is its streamlined design, which uses fewer cipher suites and delivers a more efficient handshake process. In TLSv1.3, only five cipher suites are available, three of which are enabled by default.

TLSv1.2 has multiple key exchange algorithms. TLSv1.3 simplifies this by allowing only ECDHE/DHE for the key exchange; there is no choice of other key exchange algorithms.

Key Exchange Algorithm  TLSv1.2  TLSv1.3 
ECDHE/DH 
AES-GCM 
SHA-2 
ECDSA  
CHACHA20 

Fast Handshake

When using TLSv1.2, multiple packets are exchanged to determine a compatible cipher suite, with the initial packets sent in clear text until the cipher suite is agreed upon. In contrast, TLSv1.3 lets the client send its key share along with its first message, so a full handshake needs one round trip instead of two, and the messages after the ServerHello, including the server certificate, are encrypted.

Methodology

Overview

The logical structure of the methodology can be found in the image below. A handshake is simulated, and the RTT data is sent back for processing.

Achilles

A key challenge in this research is simulating connections in a way that ensures each handshake is unique, preventing results from being skewed by optimization techniques. Scalability is also crucial, as testing under variable loads provides better insights into handshake efficiency under stress. It proved to be a challenge to find software that meets these needs. I opted to develop custom software called Achilles instead.

Achilles uses a coordinator/agent architecture with a central command center that distributes data to its agents. This data includes the target and the specified cipher suite/protocol, enabling the simulation of many unique handshakes. From a research perspective, it is essential to make both the results and the methodology accessible, so the code has been made publicly available. Please note that Achilles is intended solely for research purposes and is not a commercial product. The Achilles repository can be found here.
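To make the coordinator/agent idea more concrete, here is a minimal Python sketch. It is not taken from Achilles: the names `run_agent`, `run_test`, and `fake_handshake` are hypothetical, threads stand in for the agents, and the handshake itself is replaced by a placeholder that returns a fixed RTT.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Callable, List

def run_agent(handshake: Callable[[], int], handshake_count: int) -> List[int]:
    """One agent performs its handshakes sequentially and returns the RTTs in ns."""
    return [handshake() for _ in range(handshake_count)]

def run_test(handshake: Callable[[], int], agent_count: int, handshake_count: int) -> float:
    """The coordinator starts all agents in parallel and averages every RTT."""
    with ThreadPoolExecutor(max_workers=agent_count) as pool:
        futures = [
            pool.submit(run_agent, handshake, handshake_count)
            for _ in range(agent_count)
        ]
        rtts = [rtt for future in futures for rtt in future.result()]
    return mean(rtts)

def fake_handshake() -> int:
    """Placeholder for a real handshake: pretends it took 1 ms."""
    return 1_000_000  # nanoseconds

average = run_test(fake_handshake, agent_count=10, handshake_count=10)
print(f"average RTT: {average} ns")
```

Passing the handshake in as a function keeps the coordination logic separate from the measurement, so the same structure works for any target, protocol, or cipher suite.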

Hardware Setup

While the hardware specification is not important at its core for a relative comparison, an incorrect setup can impact results. The tables below provide a short overview of how the hardware is set up.

Machine A

Shared Resource Pool | Achilles Command Center | NetScaler CXP
CPU (1 Socket / 8 Cores) | 3.70 GHz | 3.70 GHz
RAM | 48 GB | 48 GB
Disk | 2 TB | 2 TB

Machine B

Shared Resource Pool | Proxmox | Ubuntu VM (Podman)
CPU (1 Socket / 12 Cores) | 1.60 GHz | 1.60 GHz
RAM | 32 GB | 19.5 GB
Disk | 512 GB | 200 GB

Network

The network connection between machines A and B is limited to CAT5e (1000 Mbps). It is a local connection routed via a Ubiquiti UniFi Dream Machine (1 Gbps).

NetScaler

The NetScaler will host a virtual server and will only be adjusted for the applicable cipher suite negotiation. Achilles can negotiate TLSv1.2 ciphers from the client side, but TLSv1.3 negotiation is not supported due to library restrictions. Therefore, TLSv1.3 ciphers must be enforced through the NetScaler when testing. To ensure that the testing scenarios are equal, TLSv1.2 ciphers will be configured on the NetScaler side in the same way, even though this is not required.

The settings for the ciphers can be seen in the screenshot below. During the testing, irrelevant ciphers will be removed based on the scenario being tested. Besides the minimally required configuration mentioned above, all NetScaler settings are default.

OpenSSL

OpenSSL is used to generate the certificate. The certificate can be reproduced with the following command:

openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 3650 -nodes -subj "/C=NL/ST=Noord-Holland/L=Amsterdam/O=GooseIT/OU=GO-EUC/CN=goosecipher.com"

Scenarios

This research focuses on round-trip time (RTT), measured in nanoseconds, which reflects the total time taken for data to travel from source to destination and back. Here, RTT measures the time required to complete a full handshake. After each run, the agent returns its results to the Achilles command center, where a report is generated. This report includes the average RTT for all handshakes under that command, expressed in nanoseconds (ns).
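As an illustration of what timing a single handshake in nanoseconds can look like, here is a minimal sketch using Python's standard `ssl` module. This is not how Achilles is implemented; the function names `make_context` and `time_handshake` are hypothetical, and certificate verification is switched off on the assumption that the lab target uses a self-signed certificate.

```python
import socket
import ssl
import time
from typing import Optional

def make_context(protocol: str, tls12_cipher: Optional[str] = None) -> ssl.SSLContext:
    """Build a client context pinned to a single protocol version."""
    version = {
        "TLSv1.2": ssl.TLSVersion.TLSv1_2,
        "TLSv1.3": ssl.TLSVersion.TLSv1_3,
    }[protocol]
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.minimum_version = version
    context.maximum_version = version
    # Lab assumption: the target presents a self-signed certificate.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    if tls12_cipher is not None:
        # OpenSSL cipher name; set_ciphers() does not control TLSv1.3 suites.
        context.set_ciphers(tls12_cipher)
    return context

def time_handshake(host: str, port: int, context: ssl.SSLContext) -> int:
    """Return the duration of one TLS handshake in nanoseconds."""
    with socket.create_connection((host, port), timeout=5) as raw:
        start = time.perf_counter_ns()
        # wrap_socket() completes the handshake before it returns.
        with context.wrap_socket(raw, server_hostname=host):
            return time.perf_counter_ns() - start
```

A call such as `time_handshake("192.0.2.10", 443, make_context("TLSv1.3"))`, with a placeholder address, would return one RTT sample. The TCP connection is opened before the timer starts, so only the TLS handshake is measured. Python's `set_ciphers()` cannot select TLSv1.3 cipher suites either, which is why in a setup like this one the TLSv1.3 suite has to be enforced on the server side.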

The matching cipher suites between TLSv1.2 and TLSv1.3 will be selected. The excluded protocols (SSL, TLSv1.0, and TLSv1.1) are formally deprecated and, therefore, no longer relevant to most IT environments (source: https://datatracker.ietf.org/doc/rfc8996/).

The scenarios to be tested can be found in the list below. Comparing these similar cipher suites allows for a relative comparison of the impact of using either TLSv1.2 or TLSv1.3 on RTT. The absolute numbers only have meaning relative to each other on the same hardware, which is exactly what this research aims to achieve. Having multiple agents scales the number of handshakes, each of which represents a unique connection. More handshakes mean more load on the hardware, generating results for both low-load and high-load scenarios.

  • TLSv1.2
    • ECDHE_RSA_WITH_AES_128_GCM_SHA256
    • ECDHE_RSA_WITH_AES_256_GCM_SHA384
    • ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
  • TLSv1.3
    • AES_128_GCM_SHA256
    • AES_256_GCM_SHA384
    • CHACHA20_POLY1305_SHA256

Each of the three cipher suites is tested under both protocols, resulting in six test runs (3 * 2 = 6).

The table below lists the test cases that each cipher suite will go through:

Agent Count Handshake Count (per Agent) Total handshakes
10 10 100
10 50 500
10 100 1000
10 250 2500
25 10 250
25 50 1250
25 100 2500
25 250 6250
50 10 500
50 50 2500
50 100 5000
50 250 12500
TOTAL   34.850 * 6 = 209.100
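The totals in the table can be reproduced with a few lines of Python. This is only a check of the arithmetic; the variable names are chosen for this example.

```python
agent_counts = [10, 25, 50]
handshakes_per_agent = [10, 50, 100, 250]
test_runs = 6  # 3 cipher suites x 2 protocol versions

# Every agent count is combined with every handshake count.
handshakes_per_run = sum(a * h for a in agent_counts for h in handshakes_per_agent)
total_handshakes = handshakes_per_run * test_runs

print(handshakes_per_run)  # 34850
print(total_handshakes)    # 209100
```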

Quality control of the results will be achieved by including the standard deviation. An overview will be generated per handshake test. The averages per test will then be combined into an average for each cipher suite, and a standard deviation will be generated for it. The individual results per test will not be published here but are available through GO-EUC, which stores the data indefinitely. The data is stored in separate databases to keep the output organized.

See the screenshot for an example of a 10 * 10 handshake test. The averages of all those tests are consolidated into a new average with a newly calculated standard deviation.
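One possible reading of this consolidation step, sketched in Python: take the per-test averages and compute their mean and standard deviation. How the standard deviation is regenerated in the actual tooling is an assumption here; the sketch uses the sample standard deviation over the per-test averages, and the four input values are the 10-agent TLSv1.3 averages for AES_128_GCM_SHA256 from the results further down.

```python
from statistics import mean, stdev

# Average RTT (ns) of the four 10-agent TLSv1.3 tests for AES_128_GCM_SHA256.
per_test_averages = [13593146, 11842348, 11096100, 10974147]

consolidated_average = mean(per_test_averages)
spread_between_tests = stdev(per_test_averages)  # sample standard deviation

print(f"consolidated average: {consolidated_average / 1e6:.2f} ms")
print(f"standard deviation:   {spread_between_tests / 1e6:.2f} ms")
```

Note that a standard deviation over per-test averages describes the spread between tests, which is not the same as the spread of the individual handshakes within a test.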

Simulation

Hypothesis

TLSv1.3 is expected to be more efficient than TLSv1.2, as it was designed to improve on its predecessor’s performance and security. Many TLSv1.2 ciphers are considered legacy and insecure due to exploitable vulnerabilities. Since the main focus is on handshake efficiency and the key exchange algorithm, ECDHE, is used consistently across tests, deviations in handshake results based on the cipher suites are unlikely. The NetScaler, optimized for SSL offloading, is expected to yield consistent outcomes. The revised handshake process that TLSv1.3 uses is expected to result in a 30-40% relative improvement because it uses about 30-40% fewer steps to conclude the handshake process.

Results

These are the results for each test scenario described in the methodology. A bit further down, you will find the data plotted in graphs in milliseconds for readability. The “Average RTT ns” is the average handshake time, measured in nanoseconds. The standard deviation is added to give a sense of the quality of the data: lower standard deviation numbers indicate more stable handshake times around the average. It is crucial to realize that the RTT does not represent anything on its own, as factors like the hardware used determine this number.

AES_128_GCM_SHA256

A table with the overview of all generated results for the scenarios relating to AES_128.

    TLSv1.2   TLSv1.3  
Agent Count Handshakes Average RTT ns ±SD Average RTT ns ±SD
10 10 24124672 11588853 13593146 5691270
10 50 24302336 11960147 11842348 4497673
10 100 22156082 10236698 11096100 4297257
10 250 24046093 11255481 10974147 3915799
25 10 55620390 29122670 20220804 14006490
25 50 43333236 28185829 20507794 12000659
25 100 38248831 19672761 21640990 12874166
25 250 43735924 25327631 21524655 12572572
50 10 64738606 35685054 29956731 20120825
50 50 95391979 65374118 32480341 22601449
50 100 68026407 42795512 31313028 21806728
50 250 95941910 178922473 31683775 21646617

Comparison chart

In this bar chart, you can see the differences between TLSv1.2 and TLSv1.3 on average handshake across all the results combined for AES-128.

In this line chart, you can see the average RTT per agent count in the testing scenarios for AES-128.

AES_256_GCM_SHA384

A table with the overview of all generated results for the scenarios relating to AES_256.

    TLSv1.2   TLSv1.3  
Agent Count Handshakes Average RTT ns ±SD Average RTT ns ±SD
10 10 23472740 10404468 11220805 4492969
10 50 24819355 12251272 12067919 4597934
10 100 21678316 9877222 12306595 4751224
10 250 22559280 10331732 12089709 4568950
25 10 67743392 34264429 22903666 13784750
25 50 51597147 32301946 21249985 12161088
25 100 37603421 19143934 20479910 11548197
25 250 40625497 23121576 20664233 12101581
50 10 34692573 18404494 27931368 20539579
50 50 68555682 42915297 33022688 21865491
50 100 65344589 38457757 32036442 21590571
50 250 69411967 43665152 32459247 22592934

Comparison chart

In this bar chart, you can see the differences between TLSv1.2 and TLSv1.3 on average handshake across all the results combined for AES-256.

In this line chart, you can see the average RTT per agent count in the testing scenarios for AES-256. An important observation is that TLSv1.2 is faster at the highest agent count here than it was for AES-128, most visibly in the 50 * 10 test. This is probably attributable to the lower stability of the data at higher volumes, as can be seen in the standard deviation.

CHACHA20_POLY1305_SHA256

A table with the overview of all generated results for the scenarios relating to CHACHA20_POLY1305.

    TLSv1.2   TLSv1.3  
Agent Count Handshakes Average RTT ns ±SD Average RTT ns ±SD
10 10 23665354 10277814 13356534 5692718
10 50 27054232 17376683 13188812 5064375
10 100 24350979 14993304 12696442 5179193
10 250 23985602 12112042 12621275 5053705
25 10 74524403 43166485 20122366 9533544
25 50 66876352 38603930 19465379 10253255
25 100 49618250 29930404 21962350 12385663
25 250 45523500 28662715 21050463 11903326
50 10 67086032 37835345 30050714 21080453
50 50 92409521 72047236 32334239 21679841
50 100 101408514 66245119 32855143 22810535
50 250 75325677 47905519 32744349 32744349

Comparison chart

In this bar chart, you can see the differences between TLSv1.2 and TLSv1.3 in the average handshake across all the results combined for CHACHA20_POLY1305.

In this line chart, you can see the average RTT per agent count in the testing scenarios for CHACHA20_POLY1305.

Ciphers relative comparison

The research question is about comparing cipher suites in TLSv1.2 and TLSv1.3. When the data is plotted and compared side by side, a clear difference can be seen. When the data is converted to percentages, with TLSv1.2 functioning as the baseline (100%), the TLSv1.3 handshakes take 43%, 49%, and 39% of the TLSv1.2 time for AES-128, AES-256, and CHACHA20_POLY1305 respectively. That is an average of (43 + 49 + 39) / 3 ≈ 44%, which corresponds to an average handshake improvement of 100% - 44% = 56%.

The comparison graphs indicate that TLSv1.3 ciphers maintain a consistent response time across all tests. This is especially impressive given the small time scales involved, where minor fluctuations in hardware availability can easily influence the measurements.
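The percentages can be re-derived from the result tables. The following Python sketch assumes that each percentage is the ratio between the overall TLSv1.3 average and the overall TLSv1.2 average for a cipher suite; the exact method used to produce the original graphs may differ, but this reading reproduces the same rounded numbers.

```python
from statistics import mean

# Average RTT in ns per test, copied from the result tables: (TLSv1.2, TLSv1.3).
results = {
    "AES_128_GCM_SHA256": (
        [24124672, 24302336, 22156082, 24046093, 55620390, 43333236,
         38248831, 43735924, 64738606, 95391979, 68026407, 95941910],
        [13593146, 11842348, 11096100, 10974147, 20220804, 20507794,
         21640990, 21524655, 29956731, 32480341, 31313028, 31683775],
    ),
    "AES_256_GCM_SHA384": (
        [23472740, 24819355, 21678316, 22559280, 67743392, 51597147,
         37603421, 40625497, 34692573, 68555682, 65344589, 69411967],
        [11220805, 12067919, 12306595, 12089709, 22903666, 21249985,
         20479910, 20664233, 27931368, 33022688, 32036442, 32459247],
    ),
    "CHACHA20_POLY1305_SHA256": (
        [23665354, 27054232, 24350979, 23985602, 74524403, 66876352,
         49618250, 45523500, 67086032, 92409521, 101408514, 75325677],
        [13356534, 13188812, 12696442, 12621275, 20122366, 19465379,
         21962350, 21050463, 30050714, 32334239, 32855143, 32744349],
    ),
}

# TLSv1.3 handshake time as a percentage of the TLSv1.2 baseline.
shares = {}
for suite, (tls12, tls13) in results.items():
    shares[suite] = mean(tls13) / mean(tls12) * 100
    print(f"{suite}: TLSv1.3 takes {shares[suite]:.0f}% of the TLSv1.2 time")

improvement = 100 - mean(shares.values())
print(f"average improvement: {improvement:.0f}%")
```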

Conclusion

The results lead to a clear conclusion: on average, a TLSv1.3 handshake takes about 56% less time than a TLSv1.2 handshake. While the hypothesis anticipated improved speed with TLSv1.3, the results exceeded expectations, with nearly all test cases showing a time reduction of more than half of the TLSv1.2 baseline. This improvement is likely due to the reduced number of packets required to complete the handshake, which reduces CPU usage and enables better performance with larger handshake volumes. Remember that the results are based on a lab scenario; in a real environment, additional factors like optimization techniques can lead to different results.

During testing, the hardware was pushed to its limits; the 50 * x results required re-runs due to crashes and inaccuracies caused by external factors (e.g., host OS pop-ups, errors, RDP disconnects). All runs were monitored, and results were discarded if any external factor had an impact. TLSv1.2 was notably more unstable while the results were being generated, and the machines became slower to respond. Because the limited hardware capacity played a role here, the expectation is that TLSv1.2 would behave more stably in terms of overall RTT on different hardware.

The generated results are important as they show that stability increases on the same hardware. This suggests that less computational power is required too, which is to be confirmed in separate research. In the same light, it would be interesting to test competing load balancers to confirm whether that affects the overall RTT. The results are promising and support further investigation with Achilles.

As for what this means for your environment, the upside of negotiation is that you don’t need to pick between TLSv1.3 and TLSv1.2 if you have a network product that can negotiate between protocols (like NetScaler). Negotiation is important because not every device supports TLSv1.3. Often, legacy software (e.g. Windows Server 2012) or cheaper/EOS devices are not capable of using TLSv1.3, which would prevent secure data transfers if no fallback is available. The recommendation is to enable TLSv1.3 with a higher preference and to allow TLSv1.2 as a fallback. If you are in a modern environment where TLSv1.3 is fully supported and there is no device like NetScaler, the recommendation is to choose TLSv1.3 only and profit from the improved handshake times.

The next step for GO-EUC is to generate the computational power results for a cross-reference on efficiency, to validate what impact this could have on your IT environment’s hardware.

Credits

Special thanks to GO-EUC members Leee and Ryan for their support in the form of hardware, transport, and in-depth feedback. Without them, this research would not have been possible. If you enjoyed it, please consider buying them a beer.

Photo by Chris Liverani on Unsplash