Cloud Providers, Load Balancers, Security Experts, Architects, etc., all have one thing in common. At some point, they will advise you to utilize TLSv1.2 and/or TLSv1.3 protocols and abandon legacy protocols like TLSv1.0 and TLSv1.1. Legacy protocols are still very much in use to this day. Some of those advisors will provide a set of cipher suites to configure for these protocols, but what is the performance impact these cipher suites have?
The goal of this research is to determine the efficiency of TLSv1.3 handshakes compared to TLSv1.2 using the same settings and hardware and how cipher suites impact these results.
Since TLS is a core technology for most data in the IT industry, the results can provide evidence that drives choices which both reduce ecological impact and optimize the efficiency of IT environments.
Introduction
Transport Layer Security (TLS) is the successor to Secure Sockets Layer (SSL). Despite this, many still refer to TLS as SSL due to its historical roots and familiarity within the cryptography community. TLS provides a standardized method for securing data communication between two parties over a network connection.
TLS operates using algorithms that encrypt data through computational processes. The algorithms available vary by protocol version and required security level. A ‘Cipher Suite’ refers to the specific set of algorithms used in a connection and their configurations.
The encryption and decryption of a TLS connection consumes computational power from a computer’s Central Processing Unit (CPU), which does the computation for the algorithms. These connections, from beginning to end, are often referred to as ‘SSL transactions’. The computational power required for a single transaction is insignificant in modern IT, but what if there are millions of connections to decrypt simultaneously? In this case, the load can saturate the CPU of a server and potentially cause delays in response.
TLS has gone through multiple versions, and anything before TLSv1.2 is considered legacy (TLSv1.1, TLSv1.0, SSLv3, etc.). Currently, TLSv1.2 and TLSv1.3 are the widely used protocols. Some of the improvements of TLSv1.3 over its predecessor relate to efficiency. One of those improvements relates to the approach to cipher suites.
Cipher suites
SSL/TLS communication cannot occur without a cipher suite. The cipher suite is a combination of algorithms to secure communication. Cipher suites have a complex naming convention. To understand the naming convention properly, let’s dissect one.
Cipher suites are described with multiple abbreviations, for example: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA256_P384
Source: https://learn.microsoft.com/en-au/windows/win32/secauthn/cipher-suites-in-schannel
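To make the naming convention concrete, the sketch below splits a cipher suite name into its parts. It is only an illustration: the `CipherSuiteParts` class and `parse_cipher_suite` function are hypothetical helpers invented here, and the parsing rule (split on `_WITH_`, treat a trailing `P<digits>` token as a Schannel-style curve suffix) is an assumption that covers the names used in this article, not every registered suite.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CipherSuiteParts:
    protocol: str                   # e.g. "TLS"
    key_exchange: Optional[str]     # e.g. "ECDHE" (absent in TLSv1.3 names)
    authentication: Optional[str]   # e.g. "ECDSA" (absent in TLSv1.3 names)
    bulk_cipher: str                # e.g. "AES_256_GCM"
    hash_algorithm: str             # e.g. "SHA256"
    curve: Optional[str]            # e.g. "P384" (Schannel-style suffix)


def parse_cipher_suite(name: str) -> CipherSuiteParts:
    """Split a cipher suite name into its components (illustrative only)."""
    protocol, _, rest = name.partition("_")
    key_exchange = authentication = None
    if "_WITH_" in rest:
        # TLSv1.2-style name: <key exchange>_<authentication>_WITH_<cipher>_<hash>
        left, _, rest = rest.partition("_WITH_")
        kx_auth = left.split("_")
        key_exchange = kx_auth[0]
        # Names such as TLS_RSA_WITH_... use one algorithm for both roles.
        authentication = kx_auth[1] if len(kx_auth) > 1 else kx_auth[0]
    tokens = rest.split("_")
    curve = None
    if tokens[-1].startswith("P") and tokens[-1][1:].isdigit():
        curve = tokens.pop()
    hash_algorithm = tokens.pop()
    return CipherSuiteParts(
        protocol, key_exchange, authentication,
        "_".join(tokens), hash_algorithm, curve,
    )


example = parse_cipher_suite("TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA256_P384")
tls13_example = parse_cipher_suite("TLS_AES_128_GCM_SHA256")
```

Note how the TLSv1.3-style name carries no key exchange or authentication component: only the bulk cipher and hash remain in the name.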
If you would like to learn more in-depth about cipher suites, consider the following Wikipedia article.
TLSv1.2 vs TLSv1.3
TLSv1.2 and TLSv1.3 function differently, with TLSv1.3 being the newer version. A key improvement in TLSv1.3 is its streamlined design, which uses fewer cipher suites and delivers a more efficient handshake process. In TLSv1.3, only five cipher suites are available, three of which are set as default.
TLSv1.2 supports multiple key exchange algorithms. TLSv1.3 streamlines this by offering only the ephemeral ECDHE/DHE key exchanges; static RSA and static Diffie-Hellman key exchange are no longer available.
Key Exchange Algorithm | TLSv1.2 | TLSv1.3 |
---|---|---|
ECDHE/DHE | Y | Y |
RSA (static) | Y | N |
DH (static) | Y | N |
ECDH (static) | Y | N |
Fast Handshake
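When using TLSv1.2, multiple packets are exchanged to determine a compatible cipher suite, and the handshake messages, including the server certificate, are sent in clear text until the keys are agreed upon. In contrast, TLSv1.3 sends the client's key share in the very first message, so a full handshake completes in one round trip instead of two, and everything after the server's first reply, including the server certificate, is encrypted.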
Methodology
Overview
The logical structure of the methodology can be found in the below image. A handshake is simulated, and the RTT data is sent back for processing.
Achilles
A key challenge in this research is simulating connections in a way that ensures each handshake is unique, preventing results from being skewed by optimization techniques. Scalability is also crucial, as testing under variable loads provides better insights into handshake efficiency under stress. It proved to be a challenge to find software that meets these needs. I opted to develop custom software called Achilles instead.
Achilles uses a coordinator/agent architecture with a central command center that distributes data to its agents. This data includes the target and the specified cipher suite/protocol, enabling the simulation of many unique handshakes. From a research perspective, it is essential to make both the results and the methodology accessible, so the code has been made publicly available. Please note that Achilles is intended solely for research purposes and is not a commercial product. The Achilles repository can be found here.
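As a rough illustration of the coordinator/agent idea, the sketch below fans a single task out to several agents and collects their RTT samples. This is not the Achilles code: `HandshakeTask`, `run_agent`, `coordinate`, and `fake_handshake` are hypothetical names, the agents are simulated as threads in one process, and the handshake itself is replaced by a placeholder that returns a random duration.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class HandshakeTask:
    target: str        # host:port of the virtual server (hypothetical value below)
    protocol: str      # e.g. "TLSv1.3"
    cipher_suite: str  # e.g. "AES_128_GCM_SHA256"
    handshakes: int    # number of handshakes each agent performs


def run_agent(task: HandshakeTask,
              do_handshake: Callable[[HandshakeTask], int]) -> List[int]:
    """One agent: perform the requested handshakes and return RTTs in ns."""
    return [do_handshake(task) for _ in range(task.handshakes)]


def coordinate(task: HandshakeTask, agent_count: int,
               do_handshake: Callable[[HandshakeTask], int]) -> List[int]:
    """Coordinator: hand the same task to every agent and gather all results."""
    with ThreadPoolExecutor(max_workers=agent_count) as pool:
        futures = [pool.submit(run_agent, task, do_handshake)
                   for _ in range(agent_count)]
        results: List[int] = []
        for future in futures:
            results.extend(future.result())
    return results


def fake_handshake(task: HandshakeTask) -> int:
    """Placeholder for a real handshake; returns a made-up duration in ns."""
    return random.randint(10_000_000, 30_000_000)


task = HandshakeTask("192.0.2.10:443", "TLSv1.3", "AES_128_GCM_SHA256", 10)
rtts = coordinate(task, agent_count=10, do_handshake=fake_handshake)
```

With 10 agents and 10 handshakes per agent, the coordinator ends up with 100 samples, which corresponds to the smallest test case used later in this research.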
Hardware Setup
While the hardware specification is not important at its core for a relative comparison, an incorrect setup can impact results. The tables below provide a short overview of how the hardware is set up.
Machine A
Shared Resource Pool | Achilles Command Center | NetScaler CXP |
---|---|---|
CPU (1 Socket / 8 Cores) | 3.70GHz | 3.70GHz |
RAM | 48GB | 48GB |
Disk | 2TB | 2TB |
Machine B
Shared Resource Pool | Proxmox | Ubuntu VM (Podman) |
---|---|---|
CPU (1 Socket / 12 Cores) | 1.60GHz | 1.60GHz |
RAM | 32GB | 19.5GB |
Disk | 512GB | 200GB |
Network
The network connection between machine A/B is limited to CAT5e (1000Mbps). It’s a local connection routing via a Ubiquiti UniFi Dream Machine (1Gbps).
NetScaler
The NetScaler will host a virtual server and will only be adjusted for the applicable cipher suite negotiation. Achilles can negotiate on TLSv1.2 ciphers from the client perspective, but TLSv1.3 negotiation is not supported due to library restrictions. Therefore, TLSv1.3 ciphers must be enforced through the NetScaler when testing. To ensure that the testing scenarios are equal, TLSv1.2 Ciphers will be configured on the NetScaler side similarly, even if it is not required.
The settings for the Ciphers can be seen in the screenshot below. During the testing, irrelevant ciphers will be removed based on the scenario being tested. Besides the minimally required configuration mentioned above, all NetScaler settings are default.
OpenSSL
OpenSSL is used to generate the certificates. The generation of the OpenSSL certificate is reproducible with the following command:
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 3650 -nodes -subj "/C=NL/ST=Noord-Holland/L=Amsterdam/O=GooseIT/OU=GO-EUC/CN=goosecipher.com"
Scenarios
This research focuses on round-trip time (RTT), which normally reflects the total time taken for data to travel from source to destination and back. Here, RTT measures the time required to complete a full handshake. After each run, the agent reports back to the Achilles command center, where a report is generated. This report includes the average RTT for all handshakes under that command, expressed in nanoseconds (ns).
Only cipher suites that have a matching counterpart in both TLSv1.2 and TLSv1.3 are selected. The excluded protocols (SSL, TLSv1.0, and TLSv1.1) are formally deprecated and, therefore, no longer relevant to most IT environments. Source: https://datatracker.ietf.org/doc/rfc8996/
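For readers who want to see what timing a single handshake could look like, here is a minimal sketch using Python's standard `ssl` module. It is not how Achilles measures RTT; `build_context` and `handshake_rtt_ns` are hypothetical helpers, certificate verification is disabled on the assumption of a lab setup with a self-signed certificate, and the timer covers only the TLS handshake, not the TCP connect.

```python
import socket
import ssl
import time
from typing import Optional


def build_context(version: ssl.TLSVersion,
                  tls12_cipher: Optional[str] = None) -> ssl.SSLContext:
    """Create a client context pinned to exactly one protocol version."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Lab assumption: the server uses a self-signed certificate.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = version
    ctx.maximum_version = version
    if tls12_cipher is not None:
        # OpenSSL cipher name; this setting applies to TLSv1.2 and older only.
        ctx.set_ciphers(tls12_cipher)
    return ctx


def handshake_rtt_ns(host: str, port: int, ctx: ssl.SSLContext,
                     timeout: float = 5.0) -> int:
    """Open a fresh connection and return the handshake duration in ns."""
    with socket.create_connection((host, port), timeout=timeout) as raw:
        start = time.perf_counter_ns()
        # wrap_socket performs the full handshake before it returns.
        with ctx.wrap_socket(raw, server_hostname=host):
            elapsed = time.perf_counter_ns() - start
    return elapsed


tls12_ctx = build_context(ssl.TLSVersion.TLSv1_2)
```

A call such as `handshake_rtt_ns("goosecipher.com", 443, tls12_ctx)` would then return one sample, assuming that host name resolves to the test server. Because every call opens a new connection and no session is passed in, each sample is a full handshake rather than a resumed one.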
The scenarios to be tested can be found in the list below. Comparing these similar cipher suites allows for a relative comparison of the impact of using either TLSv1.2 or TLSv1.3 on RTT. The numbers can provide insight into a relative comparison on the same hardware, which is exactly what this research aims to achieve. Having multiple agents scales the number of handshakes, each of which represents a unique connection. More handshakes equal more load on the hardware, generating results for both low-load and high-load scenarios.
- TLSv1.2
  - ECDHE_RSA_WITH_AES_128_GCM_SHA256
  - ECDHE_RSA_WITH_AES_256_GCM_SHA384
  - ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
- TLSv1.3
  - AES_128_GCM_SHA256
  - AES_256_GCM_SHA384
  - CHACHA20_POLY1305_SHA256
Each scenario requires both cipher suites to go through the test cases, resulting in six tests (3 * 2 = 6).
The table below lists the test cases that each cipher suite will go through:
Agent Count | Handshake Count (per Agent) | Total handshakes |
---|---|---|
10 | 10 | 100 |
10 | 50 | 500 |
10 | 100 | 1000 |
10 | 250 | 2500 |
25 | 10 | 250 |
25 | 50 | 1250 |
25 | 100 | 2500 |
25 | 250 | 6250 |
50 | 10 | 500 |
50 | 50 | 2500 |
50 | 100 | 5000 |
50 | 250 | 12500 |
TOTAL | | 34,850 * 6 = 209,100 |
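The totals in the table can be reproduced with a few lines of Python. The sketch below simply enumerates the agent counts, handshake counts, and cipher suites listed above; the variable names are chosen here for illustration.

```python
from itertools import product

AGENT_COUNTS = (10, 25, 50)
HANDSHAKES_PER_AGENT = (10, 50, 100, 250)
SUITES = {
    "TLSv1.2": (
        "ECDHE_RSA_WITH_AES_128_GCM_SHA256",
        "ECDHE_RSA_WITH_AES_256_GCM_SHA384",
        "ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256",
    ),
    "TLSv1.3": (
        "AES_128_GCM_SHA256",
        "AES_256_GCM_SHA384",
        "CHACHA20_POLY1305_SHA256",
    ),
}

# Every combination of agent count and handshakes per agent is one test case.
test_cases = [
    (agents, handshakes, agents * handshakes)
    for agents, handshakes in product(AGENT_COUNTS, HANDSHAKES_PER_AGENT)
]

# Handshakes needed to take one cipher suite through all test cases.
per_suite_total = sum(total for _, _, total in test_cases)

# Three suites under two protocols give six runs through the matrix.
runs = [(protocol, suite) for protocol, suites in SUITES.items() for suite in suites]
grand_total = per_suite_total * len(runs)

print(len(test_cases), per_suite_total, len(runs), grand_total)
# prints: 12 34850 6 209100
```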
Quality control of the results will be achieved by including the standard deviation. An overview will be generated per handshake test. The averages per test will then be combined to create an average for each cipher suite, and a standard deviation will be generated. The individual results per test will not be published here but are available through GO-EUC, as they store the data indefinitely. The data is stored in separate databases to keep the output organized.
See the screenshot for an example of a 10*10 handshake test. From all those per-test averages, a new consolidated average is created, with a regenerated standard deviation.
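One way to read this consolidation step is sketched below: take the per-test averages for one cipher suite and compute their mean and sample standard deviation. Whether the regenerated standard deviation is calculated exactly this way is an assumption. The input values are the TLSv1.3 AES_256_GCM_SHA384 per-test averages reported in the Results section, and `consolidate` is a hypothetical helper name.

```python
from statistics import fmean, stdev
from typing import Sequence, Tuple

# Per-test average RTT in ns (TLSv1.3, AES_256_GCM_SHA384, from the Results tables).
tls13_aes256_avg_rtt_ns = [
    11220805, 12067919, 12306595, 12089709,
    22903666, 21249985, 20479910, 20664233,
    27931368, 33022688, 32036442, 32459247,
]


def consolidate(per_test_averages: Sequence[float]) -> Tuple[float, float]:
    """Return the mean of the per-test averages and their sample SD."""
    return fmean(per_test_averages), stdev(per_test_averages)


mean_ns, sd_ns = consolidate(tls13_aes256_avg_rtt_ns)

# Convert to milliseconds for readability, as done for the charts.
mean_ms = mean_ns / 1_000_000
sd_ms = sd_ns / 1_000_000
print(f"consolidated average: {mean_ms:.1f} ms (SD {sd_ms:.1f} ms)")
```

Note that an average of averages weights every test case equally, regardless of how many handshakes it contains; weighting by handshake count would be an alternative design choice.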
Hypothesis
TLSv1.3 is expected to be more efficient than TLSv1.2, as it was designed with improvements to its predecessor’s performance and security. Many TLSv1.2 ciphers are considered legacy and insecure due to exploitable vulnerabilities. Because the main focus is on handshake efficiency and the key exchange algorithm, ECDHE, is used consistently across tests, deviations in handshake results between the cipher suites are unlikely. The NetScaler, optimized for SSL offloading, is expected to yield consistent outcomes. The revised handshake process that TLSv1.3 uses is expected to result in a 30-40% relative improvement because it uses about 30-40% fewer steps to conclude the handshake process.
Results
These are the results for each test scenario described in the methodology. A bit further down, you will find the data plotted in graphs, in milliseconds for readability. The “Average RTT ns” is the handshake time average, measured in nanoseconds. The standard deviation is added to provide a sense of the stability of the data. Lower standard deviation numbers indicate higher stability throughout the handshake average. It is crucial to realize that the RTT does not represent anything on its own, as factors like the hardware used determine this number.
AES_128_GCM_SHA256
A table with the overview of all generated results for the scenarios relating to AES_128.
Agent Count | Handshakes | TLSv1.2 Average RTT ns | TLSv1.2 ±SD | TLSv1.3 Average RTT ns | TLSv1.3 ±SD |
---|---|---|---|---|---|
10 | 10 | 24124672 | 11588853 | 13593146 | 5691270 |
10 | 50 | 24302336 | 11960147 | 11842348 | 4497673 |
10 | 100 | 22156082 | 10236698 | 11096100 | 4297257 |
10 | 250 | 24046093 | 11255481 | 10974147 | 3915799 |
25 | 10 | 55620390 | 29122670 | 20220804 | 14006490 |
25 | 50 | 43333236 | 28185829 | 20507794 | 12000659 |
25 | 100 | 38248831 | 19672761 | 21640990 | 12874166 |
25 | 250 | 43735924 | 25327631 | 21524655 | 12572572 |
50 | 10 | 64738606 | 35685054 | 29956731 | 20120825 |
50 | 50 | 95391979 | 65374118 | 32480341 | 22601449 |
50 | 100 | 68026407 | 42795512 | 31313028 | 21806728 |
50 | 250 | 95941910 | 178922473 | 31683775 | 21646617 |
Comparison chart
In this bar chart, you can see the differences between TLSv1.2 and TLSv1.3 on average handshake across all the results combined for AES-128.
In this line chart, you can see the average RTT per agent count in the testing scenarios for AES-128.
AES_256_GCM_SHA384
A table with the overview of all generated results for the scenarios relating to AES_256.
Agent Count | Handshakes | TLSv1.2 Average RTT ns | TLSv1.2 ±SD | TLSv1.3 Average RTT ns | TLSv1.3 ±SD |
---|---|---|---|---|---|
10 | 10 | 23472740 | 10404468 | 11220805 | 4492969 |
10 | 50 | 24819355 | 12251272 | 12067919 | 4597934 |
10 | 100 | 21678316 | 9877222 | 12306595 | 4751224 |
10 | 250 | 22559280 | 10331732 | 12089709 | 4568950 |
25 | 10 | 67743392 | 34264429 | 22903666 | 13784750 |
25 | 50 | 51597147 | 32301946 | 21249985 | 12161088 |
25 | 100 | 37603421 | 19143934 | 20479910 | 11548197 |
25 | 250 | 40625497 | 23121576 | 20664233 | 12101581 |
50 | 10 | 34692573 | 18404494 | 27931368 | 20539579 |
50 | 50 | 68555682 | 42915297 | 33022688 | 21865491 |
50 | 100 | 65344589 | 38457757 | 32036442 | 21590571 |
50 | 250 | 69411967 | 43665152 | 32459247 | 22592934 |
Comparison chart
In this bar chart, you can see the differences between TLSv1.2 and TLSv1.3 on average handshake across all the results combined for AES-256.
In this line chart, you can see the average RTT per agent count in the testing scenarios for AES-256. An important observation is that, in one test (50 agents with 10 handshakes each), TLSv1.2 is faster than in the tests with 25 agents. This is probably attributable to the lower stability of the data at higher volumes, as can be seen in the standard deviation.
CHACHA20_POLY1305_SHA256
A table with the overview of all generated results for the scenarios relating to CHACHA20_POLY1305.
Agent Count | Handshakes | TLSv1.2 Average RTT ns | TLSv1.2 ±SD | TLSv1.3 Average RTT ns | TLSv1.3 ±SD |
---|---|---|---|---|---|
10 | 10 | 23665354 | 10277814 | 13356534 | 5692718 |
10 | 50 | 27054232 | 17376683 | 13188812 | 5064375 |
10 | 100 | 24350979 | 14993304 | 12696442 | 5179193 |
10 | 250 | 23985602 | 12112042 | 12621275 | 5053705 |
25 | 10 | 74524403 | 43166485 | 20122366 | 9533544 |
25 | 50 | 66876352 | 38603930 | 19465379 | 10253255 |
25 | 100 | 49618250 | 29930404 | 21962350 | 12385663 |
25 | 250 | 45523500 | 28662715 | 21050463 | 11903326 |
50 | 10 | 67086032 | 37835345 | 30050714 | 21080453 |
50 | 50 | 92409521 | 72047236 | 32334239 | 21679841 |
50 | 100 | 101408514 | 66245119 | 32855143 | 22810535 |
50 | 250 | 75325677 | 47905519 | 32744349 | 32744349 |
Comparison chart
In this bar chart, you can see the differences between TLSv1.2 and TLSv1.3 in the average handshake across all the results combined for CHACHA20_POLY1305.
In this line chart, you can see the average RTT per agent count in the testing scenarios for CHACHA20_POLY1305.
Ciphers relative comparison
The research question is about comparing cipher suites in TLSv1.2 and TLSv1.3. When the data is plotted and compared side by side, a clear difference can be seen. With TLSv1.2 as the 100% baseline, the average TLSv1.3 handshake takes 43%, 49%, and 39% of the TLSv1.2 time for AES_128, AES_256, and CHACHA20_POLY1305 respectively, or about 44% on average ((43 + 49 + 39) / 3 ≈ 44). That is an average handshake improvement of 56% (100% - 44% = 56%).
The comparison graphs indicate that TLSv1.3 ciphers maintain a consistent response time across all tests. This is especially impressive given the small time measurements, which minor fluctuations in hardware availability can easily influence.
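The percentage calculation can be reproduced as follows. The sketch derives the TLSv1.3 share of the TLSv1.2 baseline for one cipher suite from the AES_128_GCM_SHA256 table, and then averages the three percentages quoted above. `share_of_baseline` is a hypothetical helper, and using the mean of the per-test averages as the consolidated figure is an assumption about how the chart values were produced.

```python
from statistics import fmean
from typing import Sequence

# Per-test average RTT in ns from the AES_128_GCM_SHA256 results table.
tls12_aes128 = [
    24124672, 24302336, 22156082, 24046093,
    55620390, 43333236, 38248831, 43735924,
    64738606, 95391979, 68026407, 95941910,
]
tls13_aes128 = [
    13593146, 11842348, 11096100, 10974147,
    20220804, 20507794, 21640990, 21524655,
    29956731, 32480341, 31313028, 31683775,
]


def share_of_baseline(baseline_ns: Sequence[float],
                      candidate_ns: Sequence[float]) -> float:
    """Candidate average as a percentage of the baseline average."""
    return 100 * fmean(candidate_ns) / fmean(baseline_ns)


aes128_share = share_of_baseline(tls12_aes128, tls13_aes128)

# Percentages as quoted in the text, with TLSv1.2 as the 100% baseline.
reported_shares = [43, 49, 39]
average_share = fmean(reported_shares)
improvement = 100 - average_share

print(f"AES-128 share: {aes128_share:.0f}%")
print(f"average share: {average_share:.0f}%, improvement: {improvement:.0f}%")
```

The AES-128 share derived this way rounds to 43%, which matches the first of the three quoted percentages.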
Conclusion
The results lead to a clear conclusion: TLSv1.3 is, on average, about 56% more efficient than TLSv1.2 in terms of handshake time. While the hypothesis anticipated improved speed with TLSv1.3, the results exceeded expectations, with nearly all test cases showing a time reduction of more than half of the TLSv1.2 baseline. This improvement is likely due to the reduced number of packets required to complete the handshake, which reduces CPU usage and enables better performance with larger handshake volumes. Remember that the results are based on a lab scenario, and in a real environment additional factors like optimization techniques can create different results.
During testing, the hardware was pushed to its limits; the 50 * x results required re-runs due to crashes and inaccuracies caused by external factors (e.g., host OS pop-ups, errors, RDP disconnects). The runs were all monitored, and the results were discarded if any external factor had an impact. TLSv1.2 was notably more unstable during the generation of the results, and the machines became slower to respond. Because of the impact of the limited hardware capacity, the expectation is that with different hardware, TLSv1.2 would be more stable in terms of overall RTT.
The generated results are important as they show that stability is increased on the same hardware. This suggests that less computational power is required, too, which is to be confirmed in separate research. In the same light, it would be interesting to test competing load balancers to confirm whether that affects the RTT overall. The results are promising and support further investigation with Achilles.
As for what this means for your environment, the upside of negotiation is that you don’t need to pick between TLSv1.3 and TLSv1.2 if you have a network product that can negotiate between protocols (like NetScaler). Negotiation is important because not every device supports TLSv1.3. Often, legacy software (e.g. Windows Server 2012) or cheaper/EOS devices are not capable of using TLSv1.3, which would prevent secure data transfers if no fallback is available. The recommendation is to enable TLSv1.3 with a higher preference and to allow TLSv1.2 as a fallback. If you are in a modern environment without a device like NetScaler, where TLSv1.3 is fully supported, the recommendation is to choose TLSv1.3 only and profit from the improved handshake times.
The next step for GO-EUC is to generate the computational power results for a cross-reference on efficiency, to validate what impact this could have on your IT environment’s hardware.
Credits
Special thanks to GO-EUC members Leee and Ryan for their support in the form of hardware, transport, and in-depth feedback. Without them, this research would not have been possible. If you enjoyed it, please consider buying them a beer.
Photo by Chris Liverani on Unsplash