TLSv1.2 vs TLSv1.3: A Deep Dive into Handshake Efficiency

Introduction
Methodology
OpenSSL
- Scenarios
- Hypothesis
Results
Ciphers relative comparison
Conclusion
- Credits

Cloud Providers, Load Balancers, Security Experts, Architects, etc., all have one thing in common. At some point, they will advise you to utilize TLSv1.2 and/or TLSv1.3 protocols and abandon legacy protocols like TLSv1.0 and TLSv1.1. Legacy protocols are still very much in use to this day. Some of those advisors will provide a set of cipher suites to configure for these protocols, but what is the performance impact these cipher suites have?

The goal of this research is to determine the efficiency of TLSv1.3 handshakes compared to TLSv1.2 using the same settings and hardware and how cipher suites impact these results.

Since TLS is a core technology for most data in the IT industry, the results can provide evidence that drives choices that both reduce ecological impact and optimize the efficiency of IT environments.

Introduction

Transport Layer Security (TLS) is the successor to Secure Sockets Layer (SSL). Despite this, many still refer to TLS as SSL due to its historical roots and familiarity within the cryptography community. TLS provides a standardized method for securing data communication between two parties over a network connection.

TLS operates using algorithms that encrypt data through computational processes. The algorithms available vary by protocol version and required security level. A ‘Cipher Suite’ refers to the specific set of algorithms used in a connection and their configurations.

The encryption and decryption of a TLS connection consume computational power from a computer’s Central Processing Unit (CPU), which performs the computations for the algorithms. These connections, from beginning to end, are often referred to as ‘SSL transactions’. The computational power required for a single transaction is insignificant in modern IT, but what if there are millions of connections to decrypt simultaneously? In that case, the load can saturate a server’s CPU and potentially cause delays in response.

TLS has gone through multiple versions, and anything before TLSv1.2 is considered legacy (TLSv1.1, TLSv1.0, SSLv3, etc.). Currently, TLSv1.2 and TLSv1.3 are the widely used protocols. Some of the improvements of TLSv1.3 over its predecessor relate to efficiency. One of those improvements relates to the approach to cipher suites.

Cipher suites

SSL/TLS communication cannot occur without a cipher suite. The cipher suite is a combination of algorithms to secure communication. Cipher suites have a complex naming convention. To understand the naming convention properly, let’s dissect one.

Cipher suites are described with multiple abbreviations, for example: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA256_P384

Cipher Suite Structure

Source: https://learn.microsoft.com/en-au/windows/win32/secauthn/cipher-suites-in-schannel
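As a rough illustration of the naming convention, the sketch below splits a suite name of this style into its parts. The function name `parse_cipher_suite` and the dictionary keys are made up for this example, and the sketch only handles names that follow the `..._WITH_...` pattern shown above (TLSv1.3 suite names do not).

```python
def parse_cipher_suite(name: str) -> dict:
    """Split a '<PROTO>_<KX>_<AUTH>_WITH_<BULK>_<HASH>[_<CURVE>]' style name."""
    left, sep, right = name.partition("_WITH_")
    if not sep:
        raise ValueError(f"not a '_WITH_' style suite name: {name}")

    protocol, *kx_auth = left.split("_")
    if not kx_auth:
        raise ValueError(f"no key exchange found in: {name}")
    # A single token (e.g. RSA) serves as both key exchange and authentication.
    key_exchange = kx_auth[0]
    authentication = kx_auth[-1]

    parts = right.split("_")
    curve = None
    # Schannel-style names may end with an elliptic curve such as P256 or P384.
    if len(parts) > 2 and parts[-1].startswith("P") and parts[-1][1:].isdigit():
        curve = parts.pop()
    hash_name = parts.pop()
    bulk_encryption = "_".join(parts)

    return {
        "protocol": protocol,
        "key_exchange": key_exchange,
        "authentication": authentication,
        "bulk_encryption": bulk_encryption,
        "hash": hash_name,
        "curve": curve,
    }


if __name__ == "__main__":
    print(parse_cipher_suite("TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA256_P384"))
```

Reading the result from left to right mirrors the structure in the figure: protocol, key exchange, authentication (signature), bulk encryption, hash, and an optional curve.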

If you would like to learn more in-depth about cipher suites, consider the following Wikipedia article.

TLSv1.2 vs TLSv1.3

TLSv1.2 and TLSv1.3 function differently, with TLSv1.3 being the newer version. A key improvement in TLSv1.3 is its streamlined design, which uses fewer cipher suites and delivers a more efficient handshake process. In TLSv1.3, only five ciphers are available, three of which are set as default.

TLSv1.2 has multiple key exchange algorithms. TLSv1.3 attempted to optimize this process with only ECDHE/DHE for the key exchange algorithms. There is no choice for other key exchange algorithms.

Key Exchange Algorithm	TLSv1.2	TLSv1.3
ECDHE/DHE	Y	Y
RSA (static)	Y	N
DH (static)	Y	N
ECDH (static)	Y	N

Fast Handshake

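When using TLSv1.2, multiple packets are exchanged to determine a compatible cipher suite and exchange keys, with the initial packets sent in clear text until the cipher suite is agreed upon. A full handshake takes two round trips. In contrast, TLSv1.3 has the client send its key share in the very first message, so the handshake completes in a single round trip, and everything after the ServerHello (including the server certificate) is encrypted.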
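The difference in packet exchange can be modelled very roughly by listing the message flights of a full handshake for each version. This is a simplified sketch: it ignores the TCP handshake, session resumption, 0-RTT, and TLS False Start, and the constant and function names are invented for the illustration.

```python
# Message flights of a full handshake, as (sender, messages).
TLS12_FLIGHTS = [
    ("client", ["ClientHello"]),
    ("server", ["ServerHello", "Certificate", "ServerKeyExchange", "ServerHelloDone"]),
    ("client", ["ClientKeyExchange", "ChangeCipherSpec", "Finished"]),
    ("server", ["ChangeCipherSpec", "Finished"]),
]

TLS13_FLIGHTS = [
    ("client", ["ClientHello (with key_share)"]),
    ("server", ["ServerHello", "EncryptedExtensions", "Certificate",
                "CertificateVerify", "Finished"]),
    ("client", ["Finished"]),
]


def round_trips_before_app_data(flights):
    """Count the server flights the client has to wait for before it may send data."""
    return sum(1 for sender, _ in flights if sender == "server")


if __name__ == "__main__":
    for label, flights in (("TLSv1.2", TLS12_FLIGHTS), ("TLSv1.3", TLS13_FLIGHTS)):
        print(label, "round trips:", round_trips_before_app_data(flights))
```

In this model the TLSv1.3 client waits for one server flight instead of two, because the key share already travels in the ClientHello.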

Methodology

Overview

The logical structure of the methodology is shown in the image below. A handshake is simulated, and the RTT data is sent back for processing.

Achilles

A key challenge in this research is simulating connections in a way that ensures each handshake is unique, preventing results from being skewed by optimization techniques. Scalability is also crucial, as testing under variable loads provides better insights into handshake efficiency under stress. It proved to be a challenge to find software that meets these needs. I opted to develop custom software called Achilles instead.

Achilles uses a coordinator/agent architecture with a central command center that distributes data to its agents. This data includes the target and the specified cipher suite/protocol, enabling the simulation of many unique handshakes. From a research perspective, it is essential to make both the results and the methodology accessible, so the code has been made publicly available. Please note that Achilles is intended solely for research purposes and is not a commercial product. The Achilles repository can be found here.
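To make the coordinator/agent idea concrete, here is a minimal sketch of the pattern. It is not taken from the Achilles code base: the names `Job`, `run_agent`, and `coordinate` are hypothetical, threads stand in for what would be separate agents, and the handshake itself is passed in as a function so the sketch stays independent of any TLS library.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Job:
    """What the coordinator hands to every agent (hypothetical structure)."""
    target: str
    protocol: str
    cipher_suite: str
    handshakes: int


def run_agent(job: Job, do_handshake: Callable[[Job], int]) -> List[int]:
    """One agent: perform job.handshakes handshakes and collect each RTT in ns."""
    return [do_handshake(job) for _ in range(job.handshakes)]


def coordinate(job: Job, agent_count: int, do_handshake: Callable[[Job], int]) -> List[int]:
    """Coordinator: give the same job to every agent and gather all RTTs."""
    with ThreadPoolExecutor(max_workers=agent_count) as pool:
        futures = [pool.submit(run_agent, job, do_handshake) for _ in range(agent_count)]
        return [rtt for future in futures for rtt in future.result()]


if __name__ == "__main__":
    # Placeholder target from the documentation address range, fake 1 ms handshake.
    job = Job("192.0.2.10:443", "TLSv1.2", "ECDHE_RSA_WITH_AES_128_GCM_SHA256", 10)
    rtts = coordinate(job, agent_count=10, do_handshake=lambda _job: 1_000_000)
    print(len(rtts), "handshakes collected")
```

Injecting `do_handshake` keeps the distribution logic testable without a live target.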

Hardware Setup

While the hardware specification is not important at its core for a relative comparison, an incorrect setup can impact results. The tables below provide a short overview of how the hardware is set up.

Machine A

Shared Resource Pool	Achilles Command Center	NetScaler CXP
CPU (1 Socket / 8 Cores)	3.70 GHz	3.70 GHz
RAM	48 GB	48 GB
Disk	2 TB	2 TB

Machine B

Shared Resource Pool	Proxmox	Ubuntu VM (Podman)
CPU (1 Socket / 12 Cores)	1.60 GHz	1.60 GHz
RAM	32 GB	19.5 GB
Disk	512 GB	200 GB

Network

The network connection between machines A and B is limited to CAT5e (1000 Mbps). It is a local connection routed via a Ubiquiti UniFi Dream Machine (1 Gbps).

NetScaler

The NetScaler will host a virtual server and will only be adjusted for the applicable cipher suite negotiation. Achilles can negotiate TLSv1.2 ciphers from the client side, but TLSv1.3 negotiation is not supported due to library restrictions. Therefore, TLSv1.3 ciphers must be enforced through the NetScaler when testing. To keep the testing scenarios equal, TLSv1.2 ciphers will be configured on the NetScaler side in the same way, even though this is not required.

The settings for the Ciphers can be seen in the screenshot below. During the testing, irrelevant ciphers will be removed based on the scenario being tested. Besides the minimally required configuration mentioned above, all NetScaler settings are default.

OpenSSL

OpenSSL is used to generate the certificates. The generation of the OpenSSL certificate is reproducible with the following command:

openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 3650 -nodes -subj "/C=NL/ST=Noord-Holland/L=Amsterdam/O=GooseIT/OU=GO-EUC/CN=goosecipher.com"

Scenarios

This research focuses on round-trip time (RTT), measured in nanoseconds (ns). RTT normally reflects the total time taken for data to travel from source to destination and back; here, it measures the time required to complete a full handshake. After each test, the agent returns its results to the Achilles command center, where a report is generated. This report includes the average RTT for all handshakes under that command, expressed in nanoseconds.
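For readers who want to reproduce a single measurement, the following sketch times one handshake with Python's standard `ssl` module. It is not how Achilles measures: the function names are made up, certificate verification is switched off only because the lab certificate is self-signed, and the timer here covers the TLS handshake but not the TCP connect, which may differ from the original tooling.

```python
import socket
import ssl
import time


def make_context(version: ssl.TLSVersion) -> ssl.SSLContext:
    """Client context pinned to exactly one protocol version."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Lab-only assumption: the target uses a self-signed certificate.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def handshake_rtt_ns(host: str, port: int, ctx: ssl.SSLContext):
    """Return (handshake time in ns, negotiated version, negotiated cipher)."""
    with socket.create_connection((host, port), timeout=5) as raw:
        start = time.perf_counter_ns()
        # wrap_socket performs the handshake on an already connected socket.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            elapsed = time.perf_counter_ns() - start
            return elapsed, tls.version(), tls.cipher()[0]
```

A call such as `handshake_rtt_ns("192.0.2.10", 443, make_context(ssl.TLSVersion.TLSv1_3))` would return one sample; the address is a placeholder. A new connection per call avoids reusing a session, in line with the requirement that every handshake is unique.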

The matching cipher suites between TLSv1.2 and TLSv1.3 will be selected. The excluded protocols (SSL/TLSv1.0/TLSv1.1) are formally deprecated and, therefore, no longer relevant to most IT environments (source: https://datatracker.ietf.org/doc/rfc8996/).

The scenarios to be tested can be found in the table below. Comparing these similar cipher suites allows for a relative comparison of the impact of using either TLSv1.2 or TLSv1.3 on RTT. The numbers provide insight into a relative comparison on the same hardware, which is exactly what this research aims to achieve. Having multiple agents scales the number of handshakes, each of which represents a unique connection. More handshakes mean more load on the hardware, generating results for both low-load and high-load scenarios.

TLSv1.2
- ECDHE_RSA_WITH_AES_128_GCM_SHA256
- ECDHE_RSA_WITH_AES_256_GCM_SHA384
- ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
TLSv1.3
- AES_128_GCM_SHA256
- AES_256_GCM_SHA384
- CHACHA20_POLY1305_SHA256

Each scenario requires both cipher suites to go through the test cases, resulting in six tests (3 * 2 = 6).

The table below lists the test cases that each cipher suite will go through:

Agent Count	Handshake Count (per Agent)	Total handshakes
10	10	100
10	50	500
10	100	1000
10	250	2500
25	10	250
25	50	1250
25	100	2500
25	250	6250
50	10	500
50	50	2500
50	100	5000
50	250	12500
TOTAL		34,850 * 6 = 209,100
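The totals in the table follow directly from multiplying the agent counts by the handshake counts. A short sketch of that arithmetic, with variable names chosen for this example:

```python
from itertools import product

AGENT_COUNTS = (10, 25, 50)
HANDSHAKES_PER_AGENT = (10, 50, 100, 250)
CIPHER_SUITES_UNDER_TEST = 6  # three suites for each of the two protocols

# One row per test case: (agents, handshakes per agent, total handshakes).
test_cases = [(a, h, a * h) for a, h in product(AGENT_COUNTS, HANDSHAKES_PER_AGENT)]

per_suite_total = sum(total for _, _, total in test_cases)
grand_total = per_suite_total * CIPHER_SUITES_UNDER_TEST

if __name__ == "__main__":
    for agents, handshakes, total in test_cases:
        print(f"{agents}\t{handshakes}\t{total}")
    print("per suite:", per_suite_total, "all suites:", grand_total)
```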

Quality control of the results will be achieved by including the standard deviation. An overview will be generated per handshake test. The averages per test will then be combined to create an average for each cipher suite, and a standard deviation will be generated. The individual results per test will not be published here but are available through GO-EUC, as they store the data indefinitely. The data is stored in separate databases to keep the output organized.

See the screenshot for an example. That is an example of a 10*10 handshake test. As a result of all those averages, a new average will be created as a consolidation effort with a regenerated standard deviation.
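The two-step consolidation can be sketched as follows. The RTT samples are made up for illustration, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the research does not state whether the sample or population formula was used.

```python
from statistics import mean, stdev


def summarize(values_ns):
    """Return (average, sample standard deviation) of a list of values in ns."""
    return mean(values_ns), stdev(values_ns)


# Made-up RTT samples (ns) for two handshake tests of one cipher suite.
tests = {
    "10 x 10": [11_000_000, 13_000_000, 12_000_000],
    "25 x 10": [20_000_000, 22_000_000, 24_000_000],
}

# Step 1: average and standard deviation per handshake test.
per_test = {name: summarize(rtts) for name, rtts in tests.items()}

# Step 2: average of the per-test averages, with a regenerated standard deviation.
suite_avg, suite_sd = summarize([avg for avg, _ in per_test.values()])

if __name__ == "__main__":
    print(per_test)
    print(suite_avg, suite_sd)
```

Note that the second step treats every test equally, regardless of how many handshakes it contains.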

Simulation

Hypothesis

TLSv1.3 is expected to be more efficient than TLSv1.2, as it was designed with improvements to its predecessor’s performance and security. Many TLSv1.2 ciphers are considered legacy and insecure due to exploitable vulnerabilities. Since the main focus is on handshake efficiency and the key exchange algorithm, ECDHE, is used consistently across tests, deviations in handshake results between the cipher suites are unlikely. The NetScaler, optimized for SSL offloading, is expected to yield consistent outcomes. The revised handshake process that TLSv1.3 uses is expected to result in a 30-40% relative improvement because it uses about 30-40% fewer steps to conclude the handshake process.

Results

These are the results for each test scenario described in the methodology. A bit further down in the research, you will find the data plotted in graphs in milliseconds for readability. The “Average RTT ns” is the handshake time average, measured in nanoseconds. The standard deviation is added to give a sense of how stable the data is. Lower standard deviation numbers indicate higher stability around the handshake average. It is crucial to realize that the RTT does not represent anything on its own, as factors like the hardware used determine this number.

AES_128_GCM_SHA256

A table with the overview of all generated results for the scenarios relating to AES_128.

		TLSv1.2		TLSv1.3
Agent Count	Handshakes	Average RTT ns	±SD	Average RTT ns	±SD
10	10	24124672	11588853	13593146	5691270
10	50	24302336	11960147	11842348	4497673
10	100	22156082	10236698	11096100	4297257
10	250	24046093	11255481	10974147	3915799
25	10	55620390	29122670	20220804	14006490
25	50	43333236	28185829	20507794	12000659
25	100	38248831	19672761	21640990	12874166
25	250	43735924	25327631	21524655	12572572
50	10	64738606	35685054	29956731	20120825
50	50	95391979	65374118	32480341	22601449
50	100	68026407	42795512	31313028	21806728
50	250	95941910	178922473	31683775	21646617

Comparison chart

In this bar chart, you can see the differences between TLSv1.2 and TLSv1.3 on average handshake across all the results combined for AES-128.

In this line chart, you can see the average RTT per agent count in the testing scenarios for AES-128.

AES_256_GCM_SHA384

A table with the overview of all generated results for the scenarios relating to AES_256.

		TLSv1.2		TLSv1.3
Agent Count	Handshakes	Average RTT ns	±SD	Average RTT ns	±SD
10	10	23472740	10404468	11220805	4492969
10	50	24819355	12251272	12067919	4597934
10	100	21678316	9877222	12306595	4751224
10	250	22559280	10331732	12089709	4568950
25	10	67743392	34264429	22903666	13784750
25	50	51597147	32301946	21249985	12161088
25	100	37603421	19143934	20479910	11548197
25	250	40625497	23121576	20664233	12101581
50	10	34692573	18404494	27931368	20539579
50	50	68555682	42915297	33022688	21865491
50	100	65344589	38457757	32036442	21590571
50	250	69411967	43665152	32459247	22592934

Comparison chart

In this bar chart, you can see the differences between TLSv1.2 and TLSv1.3 on average handshake across all the results combined for AES-256.

In this line chart, you can see the average RTT per agent count in the testing scenarios for AES-256. An important observation is that TLSv1.2 appears faster in the 50 * 10 test than in the 25 * 10 test (34692573 ns versus 67743392 ns). This is probably caused by the lower stability of the data at higher volumes, as can be seen in the standard deviation.

CHACHA20_POLY1305_SHA256

A table with the overview of all generated results for the scenarios relating to CHACHA20_POLY1305.

		TLSv1.2		TLSv1.3
Agent Count	Handshakes	Average RTT ns	±SD	Average RTT ns	±SD
10	10	23665354	10277814	13356534	5692718
10	50	27054232	17376683	13188812	5064375
10	100	24350979	14993304	12696442	5179193
10	250	23985602	12112042	12621275	5053705
25	10	74524403	43166485	20122366	9533544
25	50	66876352	38603930	19465379	10253255
25	100	49618250	29930404	21962350	12385663
25	250	45523500	28662715	21050463	11903326
50	10	67086032	37835345	30050714	21080453
50	50	92409521	72047236	32334239	21679841
50	100	101408514	66245119	32855143	22810535
50	250	75325677	47905519	32744349	32744349

Comparison chart

In this bar chart, you can see the differences between TLSv1.2 and TLSv1.3 in the average handshake across all the results combined for CHACHA20_POLY1305.

In this line chart, you can see the average RTT per agent count in the testing scenarios for CHACHA20_POLY1305.

Ciphers relative comparison

The research question is about comparing cipher suites in TLSv1.2 and TLSv1.3. When the data is plotted and compared side by side, a clear difference can be seen. If the data is converted to percentages, with TLSv1.2 as the 100% baseline, a TLSv1.3 handshake takes 43%, 49%, and 39% of the TLSv1.2 time for AES-128, AES-256, and CHACHA20 respectively. That is an average of about 44% ((43 + 49 + 39) / 3 ≈ 44), or an average handshake improvement of 56% (100% - 44% = 56%).
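The percentages can be recomputed from the result tables. The sketch below uses the "Average RTT ns" columns as published and takes the unweighted mean of the twelve test averages per protocol; that averaging method is an assumption, although it does reproduce the 43, 49, and 39 figures.

```python
from statistics import mean

# Average RTT (ns) per test from the result tables: (TLSv1.2, TLSv1.3).
RESULTS = {
    "AES_128_GCM_SHA256": (
        [24124672, 24302336, 22156082, 24046093, 55620390, 43333236,
         38248831, 43735924, 64738606, 95391979, 68026407, 95941910],
        [13593146, 11842348, 11096100, 10974147, 20220804, 20507794,
         21640990, 21524655, 29956731, 32480341, 31313028, 31683775],
    ),
    "AES_256_GCM_SHA384": (
        [23472740, 24819355, 21678316, 22559280, 67743392, 51597147,
         37603421, 40625497, 34692573, 68555682, 65344589, 69411967],
        [11220805, 12067919, 12306595, 12089709, 22903666, 21249985,
         20479910, 20664233, 27931368, 33022688, 32036442, 32459247],
    ),
    "CHACHA20_POLY1305_SHA256": (
        [23665354, 27054232, 24350979, 23985602, 74524403, 66876352,
         49618250, 45523500, 67086032, 92409521, 101408514, 75325677],
        [13356534, 13188812, 12696442, 12621275, 20122366, 19465379,
         21962350, 21050463, 30050714, 32334239, 32855143, 32744349],
    ),
}


def relative_percent(tls12, tls13):
    """TLSv1.3 average as a rounded percentage of the TLSv1.2 average."""
    return round(100 * mean(tls13) / mean(tls12))


relative = {suite: relative_percent(*pair) for suite, pair in RESULTS.items()}
average_relative = round(mean(list(relative.values())))
improvement = 100 - average_relative

if __name__ == "__main__":
    print(relative, average_relative, improvement)
```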

The comparison graphs indicate that TLSv1.3 ciphers maintain a consistent response time across all tests. This is especially impressive given the small time measurements, which minor fluctuations in hardware availability can easily influence.

Conclusion

The results lead to a clear conclusion: TLSv1.3 is, on average, about 56% more efficient than TLSv1.2, measured as the reduction in handshake time. While the hypothesis anticipated improved speed with TLSv1.3, the results exceeded expectations, with nearly all test cases showing a time reduction of more than half of the TLSv1.2 baseline. This improvement is likely due to the reduced number of packets required to complete the handshake, reducing CPU usage and enabling better performance with larger handshake volumes. Remember that the results are based on a lab scenario, and in a real environment additional factors like optimization techniques can create different results.

During testing, the hardware was pushed to its limits; the 50 * x results required re-runs due to crashes and inaccuracies caused by external factors (e.g., host OS pop-ups, errors, RDP disconnects). The runs were all monitored, and the results were discarded if any external factor had an impact. TLSv1.2 was notably more unstable during the generation of the results, and the machines became slower to respond. Because of the impact of the limited hardware capacity, the expectation is that with different hardware, TLSv1.2 would behave more stably in terms of overall RTT.

The generated results are important as they show that stability is increased on the same hardware. This suggests that less computational power is required, too, which is to be confirmed in separate research. In the same light, it would be interesting to test different load balancer competitors to confirm whether that affects the overall RTT. The results are promising and support further investigation with Achilles.

As for what this means for your environment, the upside of negotiation is that you don’t need to pick between TLSv1.3 and TLSv1.2 if you have a network product available that negotiates between protocols (like NetScaler). Negotiation is important because not every device supports TLSv1.3. Often, legacy software (e.g. Windows Server 2012) or cheaper/EOS devices are not capable of using TLSv1.3, which would prevent secure data transfers if no fallback is available. It is recommended that TLSv1.3 be enabled with a higher preference and that TLSv1.2 be allowed as a fallback. If you are in a modern environment without a device like NetScaler, where TLSv1.3 is fully supported, choosing TLSv1.3 exclusively and profiting from the improved handshake times is recommended. The next step for GO-EUC is to generate the computational power results for a cross-reference on efficiency, to validate what impact this could have on your IT environment’s hardware.
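The "TLSv1.3 preferred, TLSv1.2 as fallback" recommendation is product-specific on a NetScaler, but the idea can be sketched with Python's standard `ssl` module. This is only an illustration, not a NetScaler configuration: the function name is made up, the cipher names are the OpenSSL spellings of the TLSv1.2 suites tested here, and loading the certificate is left as a comment.

```python
import ssl

# OpenSSL names for the three TLSv1.2 suites used in this research.
TLS12_SUITES = ":".join([
    "ECDHE-RSA-AES128-GCM-SHA256",
    "ECDHE-RSA-AES256-GCM-SHA384",
    "ECDHE-RSA-CHACHA20-POLY1305",
])


def make_server_context() -> ssl.SSLContext:
    """Server context that offers TLSv1.3 and accepts TLSv1.2 as a fallback."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Nothing older than TLSv1.2 is accepted; the highest version both
    # sides support is negotiated, so TLSv1.3 wins when the client has it.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_3
    # set_ciphers only restricts TLSv1.2 and below; TLSv1.3 suites stay enabled.
    ctx.set_ciphers(TLS12_SUITES)
    # In a real setup: ctx.load_cert_chain("cert.pem", "key.pem")
    return ctx


if __name__ == "__main__":
    context = make_server_context()
    print(sorted(c["name"] for c in context.get_ciphers()))
```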

Credits

Special thanks to GO-EUC members Leee and Ryan for their support in the form of hardware, transport, and in-depth feedback. Without them, this research would not have been possible. If you enjoyed it, please consider buying them a beer.

Photo by Chris Liverani on Unsplash

Mick Hilhorst

Mick Hilhorst is a freelance consultant at Goose IT who takes care of challenges customers may face regarding the Citrix portfolio.
