Who Is This For?
For anyone interested in learning how to source code audit a cryptographic API or must use a pre-existing production-ready cryptographic API in their own projects. Such people are expected to have professional programming experience and the math background of a high school student that passed precalculus.
It is not good enough to programming experience and the math background in Cryptography to translate the theory into working code. Said person must have a good command of the target language and instruction set architecture to protect the code against common security exploits affecting code bases. One major class of attacks relevant for this article are Integer Overflows/ Underflow Attacks. Throughout this blog article I walk through how the math behind modular arithmetic works and how to translate that into code that is Integer Safe.
Outline
Introduction to Programming Modular Arithmetic for Cryptography (Part 1)
Why I Chose C++ as The Programming Language for Language Exercises
How to Compile Sample Programs in the C++23 Standard
Definition of Modulus Operation: What Is It?
Integer Safe Program of Modulus Operation Without Risk of Integer Overflow/Underflow
Modular Arithmetic
Why Are We Learning To Code This?
Integer Safe Modular Addition
Integer Safe Modular Subtraction
Modular Multiplication
Modular Multiplicative Inverse (Its Modular Division)
Greatest Common Divisor Algorithm
Extended Euclidean Algorithm
Optimized Binary Extended Euclidean Algorithm
Constant Time Binary Extended Euclidean Algorithm
Modular Exponentiation
Optimized Binary Modular Exponentiation
Square-and-Multiply Algorithm
Montgomery Ladder
Application of Chinese Remainder Theorem for RSA
Legendre and Jacobi Symbol
Why are We Learning This
Legendre Algorithm
Jacobi Algorithm
Quadratic Reciprocity
Why are We Learning to Code This?
Tonelli-Shanks Algorithm
Finding Square Roots Modulo a prime number p
Finding Square Roots Modulo a prime number p When s is Large
Chinese Remainder Theorem
Why are We Learning to Code This?
Gauss's Algorithm
Introduction to Programming Modular Arithmetic for Cryptography (Part 1)
Modular Arithmetic limits the results of calculations to be an element within a finite set of consecutive elements. Almost all cryptographic primitives rely on some form of modular arithmetic. One reason why is to limit the range of possible values used in computation. This is important since we live in a world where we are constrained by finite resources. Modular Arithmetic also helps ensure every bit in the results of our calculations are statistically significant and not wasted. Cryptographic primitives are computationally expensive. Not using modular arithmetic would make them even more inefficient since bits in calculations would be excessive and there would be loss in precision and thus loss in the security assurance.
This blog post explores what the modulus operation is defined as, how it works, practice exercises to test your understanding of the theory, and coding exercises where you translate math formulas featuring modular arithmetic to practical code.
Why I Chose C++ As the Programming Language for Programming Exercises
C++ offers several language features that will help developers ensure their code is correct, easy to study and learn from, and therefore easy for others to use in the future. C++ also boasts excellent documentation and community support for security-focused coding. Two excellent works I strongly encourage you to read are Secure Coding in C/C++ Second Edition by Robert C Seacord and The CERT C++ Coding Standard. C++ also allows one to use pre-existing legacy cryptographic code written in C--and much production-ready crypto code is written in C for historical reasons.
How to Compile Sample Programs in the C++23 Standard
All programs featured in this blog post feature the C++23 standard using the g++ compiler. That is because the newest versions of the language have up-to-date compiler checks and security features.
All programs you see in this blog post can be compiled by copying and pasting the code to a new file and applying the following to the command line:
g++ --std=c++23 source_code.cpp -o output_file
Definition of Modulus Operation
The modulus operation is defined by the following:
Often we are taught in beginner cryptography courses that a mod n is simply the remainder of division when dividing a by n. The problem with this is that students often get confused when calculating the remainder when a < 0. This is why I present the above formula. It is guaranteed to give you the right answer even if a < 0.
Let us do some practice exercises to help us understand the modulus operation. Only use a scientific calculator that supports the modulus operation:
Here is the Solutions Manual to the above Modular Arithmetic Exercises:
Program of Modulus Operation
We can translate the math formula for the modulus operation into a computer program and can use the solutions to the modulus arithmetic arithmetic exercises above as test cases. Your program must be written in such a way that Integer Overflows/Underflows are impossible. We must be concerned about preventing Integer Overflows/Underflows when someone uses our code! Integer Overflows/Underflows take place when the result of an arithmetic operation, such as modular arithmetic, requires more memory than what was allocated to store the result. If this happens the final calculated result will be wrong--causing potential unpredictable behavior in the machine!
So try it now:
Caution: Please be aware that the "%" operator in C/C++ will not always yield the correct result for the modulus operation when a < 0. It will give the negative version of the answer. That is because the "%" operator in C/C++ uses truncated division and not floored division as the definition of modulus operation features. Here is a Stack Overflow Post about the difference.
After you have finished writing and testing your program you can test it against these test cases:
Does your code work against all test cases :)
If so, please compare it to mine. If our results differ at least one of us wrote our code wrong :)
Here is the link to my code.
See if yours is more reader-friendly than mine. Hopefully someone makes a solution that rectifies a possible bug in my solution and is more user-friendly than mine. The idea is by competing to write a sound, usable solution for others we work together to distill a more pristine coding solution with time:
NOTE: Please do not cheat and look at my solution before trying to make your own! By comparing your own solution to mine we can both learn from each other's good and bad coding habits. Please feel free to send your solution to me and let me know what tricks you used to solve the problem at: contact@programcryptography.com
What is important about this solution is that it has no risk of Integer Overflow/Underflow. You can feed it any value for both a and n without having to worry about such problems! Detecting Integer Overflow/Underflow before it happens in a program is important. Of course its better to avoid the situation altogether as the above program does.
Below is the complete program complete with test assertions:
Let's break down why this code works:
Earlier I gave a "Caution" that you need to be careful with using the "%" operator in C/C++ since it yields negative results when a < 0. But pay attention to line 8. If m < 0 we simply add m to n to get the positive result. Otherwise we leave m as is.
This code works because:
So if an integer a % n in C/C++ yields -(n-1). And -(n-1) + n = 1 and this is the lowest possible number you can get from a % n + n where a < 0. The largest possible result takes place when a % n = -1. And (a % n = -1) + n = n - 1. So the results of a % n + n are within the original range for a mod n I defined earlier:
This may not be the original solution you made. It may have been a more direct solution of the original math formula for a mod n that I have shown earlier. If so you may have done something similar to the following code:
Note: Please don't feel the need to understand all of the code. Only understand the lines of code that deal with Integer Overflow/Underflow detection.
The above code for modulus_legacy actually works for all test cases the same as the one derived from the Stack Overflow solution except for the one test case at line 115 above. This test case causes an overflow in calculation and therefore the error code -1 is returned.
Notice how messy the translation of the math formula to working code is. This teaches us an important lesson:
It is not good enough to simply hand math formulas or pseudocode to coders and expect that to be enough when coding cryptography. We have to consider the limits of the language in expressing the math.
In C++'s case for the test case at line 115 causes an overflow when multiplying the result of floor division "floordiv" to n at line 35. The code in lines 37 - 41 catches the error.
I decided that instead of explaining how the code works it is a better use of our time to instead learn more about detecting Integer Overflows / Underflows. Throughout the messier solution above we see several conditional statements designed to catch these errors at lines 26, 37, 45, and 63.
These checks for Integer Overflow / Underflow are based on the following, excellent articles made by GeeksforGeeks.
Please keep note of the following:
When calculating sums of integers and you need to check for Integer Overflows/Underflows:
When doing multiplication of integers and you need to check for Integer Overflows / Underflows:
For more information on Integer Overflows / Underflows I strongly recommend the book Secure Coding in C/C++, Second Edition.
The messier code is messy and suffers from overflow at the test case from line 115. So of course its best to use the first solution I should be used at all times.
However even writing code such as this is not good enough for cryptography in production!
In the real world cryptography demands working with numbers much larger than 64 bits in size. When doing arithmetic in computers for numbers larger than what fits in our CPU registers we call such arithmetic Multi-Precision Arithmetic.
You will not have the luxury of the "%" operator that you get in C/C++ when you need to program that later. Instead, you will have to program the equivalent of modulus on your own when dealing with numbers larger than 64 bits in size.
So here is your second programming challenge:
Here are some test cases you can use to test out your program:
Does your program pass all the test cases? If so, please compare your solution to mine:
Here is the link to my solution for a program that does the equivalent of the % operator
in C/C++.
Notice the solution looks similar to the actual formula for modulus. However it is not identical. That is because (a / n) is truncation division and not floor division. There is a difference.
It would be tedious having to make function calls to modulus all the time. So let us encapsulate this in a class definition in C++. This makes our code much easier to refactor, reuse, and recycle for future cryptographic engineers. Much easier than mere function calls in C.
The above code is a bare-bones skeleton of the Modulus class definition we will use them for the rest of this article as we expand on what this class definition for us. If you are rusty on C++ class definitions just as I was while writing this article I strongly recommend the following GeeksForGeeks article for a quick review.
Modular Arithmetic
Why Are We Learning To Code This?
We have just reviewed the mathematical definition of the modulus operation in number theory and have translated it into a re-factorable computer program that our colleagues can count on to function correctly.
Of course simply having the result of modulus does not do much. We have to do actual arithmetic with the result of modulus.
Modular Addition:
The formula for modular addition is simple:
Here are some exercises to test your understanding:
You did all of them and have the right answers, right?
Good because here are the solutions:
As you do these problems you may have noticed the following:
The above rules are essential to avoid Integer Overflow/Underflow problems when performing modular addition using signed arithmetic.
Why Signed Arithmetic is Better Than Unsigned Arithmetic
I argue it is best to allow representation of modulo numbers using signed arithmetic than unsigned arithmetic for two reasons:
Future algorithms you will see such as the Extended Euclidean Algorithm will return results as negative numbers. So it is important to support values for a < 0.
You can avoid having to deal with Integer Overflow/Underflow when using signed arithmetic. You can try any value for value "a" that is positive or negative as a parameter for the above function and it will return the correct positive modulo result without risk of Integer Overflow/Underflow.
As a further example consider the addition of:
Think carefully about the problem you see above. The number, 2^64 - 4 is greater than 2^63 - 1. There is no way to reduce (2^63 - 2) \mod (2^63 - 1) to a smaller positive version when using signed 64-bit arithmetic. Instead in signed 64-bit arithmetic we can represent (2^63 - 2) \mod (2^63 - 1) as -1 \mod (2^63 - 1).
Now that we have (2^63 - 2) \mod (2^63 - 1) represented as -1 \mod (2^63 - 1) we can add it to itself in signed 64-bit addition without worrying about Integer Overflow.
So here is your second programming exercise:
Remember you can find the class definition template for Mod here.
You can use the exercises for Modular Addition as test cases.
So by now you got your program, tested it, and ensured it works right? Great!
Here is my solution complete with test cases. Does yours pass all of them?
The solution simply adds to a_add with b_add. This is Integer Safe to do since b_add chooses the values between the positive modulo representation of a and b respectively versus their negative modulo representation that has a smaller absolute value.
For example to calculate (30 + 30) mod 31 is equivalent to (-1 + -1) mod 31 since 30 mod 31 is equivalent to -1 mod 31. It is wiser to represent 30 mod 31 as -1 mod 31 when doing modular arithmetic since the results of the arithmetic fit within the range of modulus 31.
Modulus addition without risk of Integer Overflow/Underflow is an essential concept that must be mastered. This will ensure your code is safe from Integer Overflow class of attacks as well as ensure your code is functionally correct meaning it yields the correct final result for all possible input cases.
Modular Subtraction
In the next coding exercise we will write a program for modulus subtraction.
So you wrote your own solution and tested that it works, right? Great! Because here is mine:
The solution subtracts b_add from a_add unlike the operator+() function. This is Integer Safe to do since a_add and b_add each choose the value between the positive modulo representation versus their negative modulo representation that has a smaller absolute value.
Call to Action
I hope you found the above explanations and exercises helpful in your journey as a source code auditor or user of crypto APIs. They contain helpful tips and tricks on how to translate the math for modulus and modular addition that you will not find in most books on cryptography--such as avoiding Integer Overflow/Underflow when doing calculations.
These lessons will serve you well as you study and use code that performs modular arithmetic in your programming library.
If you enjoyed this blog post you may be interested in checking out my in-progress book:
where I explain how to translate math necessary for cryptographic primitives into working code.
Please feel free to leave comments and thoughts anywhere on the webpage of my beta draft.
If you made it this far thanks for reading!
Comentarios