The content we introduced before mainly focused on symmetric encryption. Starting from this article, our focus has shifted to asymmetric encryption algorithm, also known as public key encryption algorithm. There is a key link in the whole algorithm: key exchange. As the name implies, the key exchange "person" has to solve the essential problem of how to exchange keys safely. Let's invite our old friends Queen Alice and Lord Bob to explain. Assuming that Queen Alice and Lord Bob want to communicate securely, then Queen Alice and Lord Bob will send their respective keys to each other. In this way, both communication parties hold the key shared by * * *, which can be used for subsequent information security interaction.
In order to make the follow-up discussion more grounded, we assume that Queen Alice and Lord Bob have never met, so how can we make the Queen and Lord communicate safely? This problem is also the most primitive application scenario of key exchange algorithm. In order to ensure the privacy of communication between the queen and the Lord, both parties need a shared key, but the shared key for secure communication is not as simple as expected. If a malicious attacker eavesdrops on the telephone communication or e-mail communication between the Queen and the Lord (assuming that the message is transmitted in plain text), then the malicious attacker will steal the key shared by the Queen and the Lord, and all subsequent communication contents between the two parties can be cracked by this key, so all private communication between the Queen and the Lord is unsafe, and the communication content has become the talk of the whole kingdom after dinner.
How to solve this problem? This is the core problem to be solved by key exchange algorithm. Simply put, through the key exchange algorithm, Queen Alice and Lord Bob can exchange keys safely. Even if malicious attackers are listening to all communication lines, they can't get the keys from both sides, so the queen and the Lord can finally gossip without worry.
The secret key exchange begins with the communication parties generating their own secret keys. Usually, asymmetric encryption algorithms generate two keys: a public key and a private key. Then the two parties send their own public keys to each other, and the "public" of the public key means public here, which means that malicious attackers can also obtain the public key information generated by the two parties. Then the queen and the Lord combine the received public key with their own private key, and the result is the communication key shared by * * *. You can look at this public key from the perspective of a malicious attacker. Because the malicious attacker does not have the private key of either party, the malicious attacker cannot obtain the "* * *" key for communication between the queen and the Lord. About the process of * * * enjoying the secret key on the side of the Queen and Lord, as shown in the following figure:
After understanding the general working mechanism of the key exchange algorithm, let's take a look at how the key exchange algorithm solves the problem of secure communication between Queen Alice and Lord Bob. In the process shown in the above figure, the queen and the Lord determine the key that can be used for secure communication through the key exchange algorithm, which can be used as the key for authentication and encryption. Therefore, even if MITM (man-in-the-middle attacker) intercepts the communication data between the Queen and the Lord, the communication content will not be cracked because there is no key, and the Queen and the Lord can communicate safely, as shown in the following figure:
However, the content described here is a bit imprecise. At best, we can only call this situation passive MITM. In the vernacular, it refers to the passive listening of malicious attackers, but the main difference from active MITM is that the active middleman intercepts the data exchanged by the key exchange algorithm, and then simulates the roles of both parties. Specifically, the middleman will actively exchange keys with the Queen and the Lord at the same time, and both parties "think" that they have reached a * * * knowledge about * * * with each other, but in essence, the Queen and the Lord only reached a * * * knowledge about * * * with the middleman. You can think about what caused this misunderstanding.
In fact, the reason behind it is not difficult to understand, because the communication parties have no other means to judge the holding relationship between the received public key and the communication counterpart. We also refer to this kind of key exchange as "unauthorized" key exchange, as shown in the following figure:
So how to solve the active attack of MITM? I believe you can guess the authentication key exchange. Let's see why we need this authentication mode through a specific business scenario. Suppose we have developed a set of service to provide time information, which is deployed in Alibaba Cloud. In order to prevent the time data from being modified by malicious attackers, we use MAC (Message Authentication Code). If you have no idea about MAC, you can refer to the author's previous article.
MAC needs a key to protect the confidentiality and integrity of data, so we will generate a key when deploying the application, and then this key will be illegally given to all client users in some way. The application runs very stably, and because of the existence of the key, all law-abiding clients can read the exact time. However, after learning this article, a customer found that this secret key can be used to tamper with data, so our website was complained by a large number of customers that the reading time was inaccurate, which led to problems in the business operation and data processing of the system.
You ask the architect to deal with it quickly. The architect gives a scheme in which each user can generate a unique key. Although it can stop the bleeding, you will soon find that this scheme is not feasible and cannot be operated and maintained. With the rapid increase of users, how to configure and manage these keys has become a big problem, not to mention changing them regularly. The key exchange algorithm can come in handy here. What we need to do is to generate a key on the server and then provide public key information for each new user. Because the client knows the public key information of the server, it is impossible to simulate MITM attacks in two directions. We also call this mode authentication key exchange.
Let's continue to analyze this scene. Although the man-in-the-middle indicates that he can also exchange keys with the service, at this time, the man-in-the-middle is no different from ordinary clients, so he can't carry out active MITM attacks.
With the development of science and technology, the Internet is almost everywhere in our life, so it is extremely important to establish the secret key between the two communication parties safely. However, the model we introduced earlier is not very scalable, because the client needs to preset the public key of the server in advance, which is particularly obvious in the Internet scenario. For example, as users, we hope to communicate with many bank websites and social security websites safely. If we need to preset the public key information of each website for mobile phones, tablets and desktops to access safely, then you can consider how bad the convenience will be and how we can access the newly developed websites safely.
Therefore, readers need to understand a very important point. Key exchange is very important, but there is the scalability problem mentioned above, and the solution of this problem depends on digital signature technology. The combination of digital signature and key exchange is the basis of SSL technology that we will introduce later. It takes a long time to make it clear, so we will use a special chapter to introduce SSL principle later. However, for the sake of smooth introduction, let's talk about several specific key exchange algorithms and the mathematical principles behind them.
Let's talk about the Diffie-Hellman key exchange algorithm mentioned in the first article of the author's series. Whitfield Diffie and Martin E. Hellman published a groundbreaking paper in 1976, introducing DH(Diffie-Hellman) key exchange algorithm, which is called "the new direction of cryptography". This paper is called groundbreaking mainly because it has two firsts: the first key exchange algorithm and the first public key encryption algorithm (or asymmetric encryption algorithm).
The data principle of DH algorithm is group theory, which is also the cornerstone of all security mechanisms we come into contact with today. Because I didn't graduate from math major and my math foundation is not too solid, I have been hesitant about how to continue to deepen from the perspective of safety. In order to make this algorithm easier for readers to understand, the following contents will involve some basic knowledge of data, which I believe students with high school mathematics knowledge should be able to understand.
When introducing group theory, the first problem is to clearly define what a group is. The author consulted relevant materials and found that groups in the field of mathematics have the following two characteristics:
1, which consists of a set of elements.
2. Special binary operators are defined between elements (such as? Or)
Based on the above definition, if this group of elements and the binary operator defined above meet certain attributes, then we call these elements a group. For DH algorithm, the group used later is called multiplication group: the collection of elements that define multiplication binary operators. Readers may ask, so what attributes does this multiplication group satisfy? Because this part of the content is more, we will list them one by one:
-It's over. After two elements in the collection are calculated by the defined operator, the result is also in the collection. For example, if we have a set M with elements A, B and c(c=a*b), then these three elements conform to the closure property, and the operator defined on the set is multiplication.
-Associativity. This is consistent with the concept of combination in middle school mathematics. You can understand the mathematical formula A× (b× c) = (a× b )× c
-Identity element. The set contains unit elements, and the values of any element and unit element will not change after being calculated by operators. For example, if we have the set m, which contains elements (1, a, b, c ...), then 1 is the unit element, because 1*a = a, a and the unit element 1 are all calculated by operators, and the result remains unchanged.
-Inverse element. Any element in the collection has an inverse element. For example, we have an element A of a set, and the inverse element of this element is 1/a, and the result of the operator calculating the element A and the inverse element is 1.
The author must admit that his superficial mathematical knowledge may lead to the explanation of the above four attributes, which makes everyone more confused, so I have prepared the following picture, hoping to have a more detailed supplementary explanation of the four attributes owned by the group.
After introducing the related information such as groups and group attributes, let's take a concrete look at what groups used in DH algorithm look like. The group used in DH algorithm is composed of positive integers. In most cases, the elements that make up the group are prime numbers, such as this group (1, 2, 3, ..., p- 1), where p is generally a very large prime number, in order to ensure the security of the algorithm.
Note: The mathematical definition of a prime number is a number that can only be divisible by itself and 1, such as 2, 3, 5, 7, 1 1 and so on. Prime numbers are widely used in asymmetric encryption algorithms. Computer majors should have written a program to find and print prime numbers during their college years. The core of the algorithm is to enumerate all the numbers in order to judge whether they are prime numbers, and if so, print them out. However, from the point of view of algorithm, this exhaustive model is inefficient, so many efficient algorithms have appeared in the industry, and relatively large prime numbers can be found soon.
In addition to prime numbers, another property of the group used by DH algorithm is modular operator, which is specifically called modular multiplication operation. Let's talk about modular operation first, which is very similar to some high-altitude math problems of primary school students, focusing on quotient and remainder. For example, if we set the modulus to 5, when the number is greater than 5, the wrap will start again from 1. For example, the result of the number 6 modulo 5 is 1, the result of 7 is 2, and so on. The most classic example of modular operation is the clock, which has 24 hours a day, so when we count with 12 hours, the 13 points in the afternoon are called 1 points, because 1 3 =1*12+.
Then let's look at the definition of modular multiplication. Let's take 6 as an example, and 6 = 3 2. If the modulus is 5, we know that all 6 are equal to (Congo)1moduo5, then our formula can be written as:
3 × 2 = 1 mod 5
We draw a very important conclusion from the above equation. If mod 5 is removed, we get 3 × 2 = 1, so 3 and 2 are reciprocal elements.
Finally, let's summarize two characteristics of DH algorithm basic group:
-commutative law (commutative law), two elements in a group have commutative law, that is, ab=ba. Usually, we call a group with commutative law Galois Group.
-finite field (finite field). Below we will introduce the characteristics of finite fields in detail.
The group defined by DH algorithm is also called FFDH (Refine Field Diffie-Hellman), and subgroup refers to a subset of the group. After we manipulate the elements in the subset through the defined operators, we will go to another subgroup.
A very important concept in group theory is cyclic subgroup. In the vernacular, you can multiply yourself by a generator (or radix). Generator 4 can be used for subgroups 1 and 4 for the following examples:
4 mod 5 = 4
4 × 4 mod 5 = 1
4 × 4 × 4 mod 5 = 4 (starting again is also the embodiment of cyclic subgroups)
4 × 4 × 4 × 4 mod 5 = 1
etc
When our module is a prime number, then every element in the group is a generator, and a cyclic subgroup can be generated, as shown in the following figure:
Finally, we completely summarize the Galois group defined by group and DH:
-A group is a set of elements that define binary operations, and has the properties of closure, association, identity and inverse.
The group defined by -DH is called Galois Group, the elements that make up this group are prime numbers, and the modular multiplication operation is defined on this group.
-In the group defined by DH, each element is a generator, which is multiplied by itself repeatedly to produce a subgroup.
Groups are the basis of many encryption algorithms. The author only introduces a little superficial knowledge. If readers are interested in this part, they can consult relevant materials. With a preliminary understanding of the group, the next article will introduce the working principle behind DH algorithm, so stay tuned!