Outline
Introduction to Constant-Time Programming Techniques
Branch-free Comparison Predicates
Equals Comparison
Not Equal to Comparison
Greater Than Comparison
Greater Than or Equal To Comparison
Less Than Comparison
Less Than or Equal To Comparison
Storing Big Numbers as Vectors in C++
Comparison Predicates with Big Numbers
Addition with Big Numbers
Subtraction with Big Numbers
Multiplication with Big Numbers
Grade School Multiplication
Karatsuba Multiplication
After the last blog post, many readers said it's best to use unsigned arithmetic with multi-precision numbers when doing arithmetic for cryptography. In this blog post I will walk you through a basic multi-precision library. I have decided not to worry about speed in this article; instead, I will focus on comprehension, and I will discuss the speed-oriented techniques of multi-precision libraries later. I will also present introductory techniques for constant-time programming--meaning the execution time of the program is supposed to be constant regardless of the value of any secret data.
Introduction to Constant-Time Programming
I strongly recommend you read the article Constant Time Arithmetic for Safer Cryptography. It summarized constant-time coding with three general rules:
For this article we will focus on rules (1.) and (3.).
There is one more rule to remember:
Assume hardware instructions for arithmetic on CPU registers are constant-time, except for division--this includes the '%' operator in C/C++. For most instruction set architectures, such as Intel x86_64, it is okay to assume this is the case. For a full list of which architectures provide constant-time arithmetic on CPU registers, please visit Thomas Pornin's webpage on the matter.
For a more in-depth introduction to constant-time rules for programming please see my past blog article.
Division not executing in constant time does pose a problem. Modular arithmetic needs the remainder from division, yet the division operation you get by default whenever you use the '/' symbol does not run in constant time. Luckily for us, Peter Montgomery has us covered. In 1985, Montgomery discovered techniques for modular multiplication and exponentiation that need no trial division--and you need those for public-key cryptography! What is so important about these techniques is that they can be designed to resist side-channel attacks, including timing attacks. We will not program them directly in this article, but it is important to be aware that the Big Integer library we are about to make must be able to support these essential techniques.
I am aware the compiler can transform source code that is logically constant-time into assembled code that is not. We will ignore this problem throughout this article.
Branch-Free Comparison Predicates
Comparison predicates are the math symbols for comparing numbers that you learned in programming class: <, >, <=, >=, !=, and ==. But the threat model for constant-time programming above says any conditional expression reveals which branch was taken, so we have to write these comparison predicates in a manner that is branch-free.
This forces us to change a lot of our normal coding habits. Often we must avoid the if-else, else-if, and comparison operators you grew up programming with.
One of the most important uses of conditional statements is returning error codes based on a condition, but we can no longer use the comparison operators we are used to. To mimic this with branch-free expressions, we should ensure each comparison predicate outputs 0 if the comparison is false and the bitwise negation of zero (all 0 bits flipped to 1 bits) if the comparison is true. This lets us return a specific number when a comparison holds without the traditional if-else statements: simply apply the bitwise '&' operator to the output of our custom comparison predicate and the number we want to return when the comparison is true.
So for example, let's say we want to check if the length of a string of bits is 0.
Usually we would do something like this:
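For instance (the error code value and function name here are made up for illustration):

```cpp
#include <cstdint>

// A hypothetical error code, chosen arbitrarily for this example.
constexpr std::uint64_t ERR_EMPTY_STRING = 57;

// The habit we need to break: an explicit branch on the length.
std::uint64_t check_length_branchy(std::uint64_t bit_length) {
    if (bit_length == 0) {
        return ERR_EMPTY_STRING; // which branch ran leaks through timing
    }
    return 0;
}
```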
Instead we can write the following:
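Something along these lines (again, the names are illustrative; the zero test is a classic bit trick):

```cpp
#include <cstdint>

constexpr std::uint64_t ERR_EMPTY_STRING = 57;

// Returns the bitwise negation of zero (all 1 bits) if x == 0, else 0.
// For x != 0, (x | (0 - x)) always has its most significant bit set, so
// the shift yields 1 and subtracting 1 gives 0; for x == 0 it wraps to ~0.
std::uint64_t eq_zero_uint64_t(std::uint64_t x) {
    return ((x | (0 - x)) >> 63) - 1;
}

std::uint64_t check_length_branch_free(std::uint64_t bit_length) {
    // The mask is all 1 bits exactly when bit_length == 0, so the '&'
    // either passes the error code through or zeroes it -- no branch.
    return eq_zero_uint64_t(bit_length) & ERR_EMPTY_STRING;
}
```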
It's okay to use a comparison operator against the error code in the example above, since the error code is not secret data.
I strongly recommend you acquire a copy of the book Hacker's Delight, Second Edition by Henry S. Warren, Jr. It contains several bitwise tricks you should find useful throughout your cryptographic programming days. If you are not in the mood to purchase it (you should!), you will have to stick with running around the Internet searching for these binary hacks.
Here is my solution:
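In condensed form, the approach looks like this. Every predicate returns the bitwise negation of zero for true and 0 for false; the formulas are Warren's 32-bit ones widened to 64 bits, and apart from equint64_t() (which this article uses later) the names are illustrative:

```cpp
#include <cstdint>

// Branch-free comparison predicates on unsigned 64-bit words. Each returns
// all 1 bits (~0) when the relation holds and 0 otherwise, so the result
// can be used directly as a mask.

std::uint64_t equint64_t(std::uint64_t x, std::uint64_t y) {
    std::uint64_t d = x ^ y;           // d == 0 exactly when x == y
    return ((d | (0 - d)) >> 63) - 1;  // 0 - 1 wraps around to all 1 bits
}

std::uint64_t nequint64_t(std::uint64_t x, std::uint64_t y) {
    return ~equint64_t(x, y);
}

std::uint64_t ltuint64_t(std::uint64_t x, std::uint64_t y) {
    // The most significant bit of this expression is the borrow of x - y,
    // i.e. 1 exactly when x < y (Hacker's Delight, section 2-12).
    std::uint64_t borrow = ((~x & y) | ((~x | y) & (x - y))) >> 63;
    return 0 - borrow; // stretch the single bit into a full mask
}

std::uint64_t gtuint64_t(std::uint64_t x, std::uint64_t y) {
    return ltuint64_t(y, x);
}

std::uint64_t geuint64_t(std::uint64_t x, std::uint64_t y) {
    return ~ltuint64_t(x, y);
}

std::uint64_t leuint64_t(std::uint64_t x, std::uint64_t y) {
    return ~ltuint64_t(y, x);
}
```

For example, geuint64_t(x, y) & ERR_CODE evaluates to ERR_CODE exactly when x >= y, with no branch taken.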
All my code is a logical extension of Warren's code snippets, which work on 32-bit unsigned numbers; mine work on unsigned 64-bit numbers--and all of these code samples are branch-free according to Henry S. Warren. To ensure your solutions work, you may compare them against the test cases in the source code files for each of the above functions at this link.
Storing Big Numbers as Vectors in C++
This is the first version of multi-precision arithmetic that we will see. In this version, our class "Big" will accept a C++ string of binary digits as input and store them in a vector of uint8_t values, where each uint8_t stores a single binary digit. As inefficient as this is, it will make the rest of the exercises simple and easy to follow for a complete beginner.
The following is the code snippet of the Big constructor (and destructor) that you must implement.
Here is my solution:
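As a rough sketch of the shape such a constructor can take (the member names and error convention here are illustrative choices, and equint64_t() is the predicate from the previous section):

```cpp
#include <cstdint>
#include <string>
#include <vector>

class Big {
public:
    std::vector<std::uint8_t> bits; // one binary digit per element,
                                    // most significant digit first
    std::uint64_t error = 0;        // nonzero if construction failed

    explicit Big(const std::string& s) {
        std::uint64_t bad = 0;
        for (char c : s) {
            std::uint64_t d = static_cast<std::uint64_t>(c) - '0';
            // Accumulate, branch-free, whether any character was not 0/1.
            bad |= ~(equint64_t(d, 0) | equint64_t(d, 1));
            bits.push_back(static_cast<std::uint8_t>(d & 1));
        }
        error = bad & 1;
    }

    ~Big() = default;
};
```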
Note that I used equint64_t() when testing for the error case where the user supplies an invalid string as input to the Big() constructor. This is branch-free code that executes in constant time.
Now that we have a proper constructor (and destructor) for our Big Integer library let us make some functions that allow us to perform logical bitwise operations '&' (bitwise and) and '|' (bitwise or).
Here is a quick review of what the '|' and '&' symbols do.
The following code snippet:
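A minimal example (the values are arbitrary):

```cpp
#include <bitset>
#include <cstdint>
#include <iostream>

int main() {
    std::uint8_t a = 0b1100;
    std::uint8_t b = 0b1010;
    // '&' yields a 1 bit only where BOTH operands have a 1 bit.
    std::cout << "a & b = " << std::bitset<4>(a & b) << '\n';
    // '|' yields a 1 bit where EITHER operand has a 1 bit.
    std::cout << "a | b = " << std::bitset<4>(a | b) << '\n';
}
```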
Outputs the following:
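```
a & b = 1000
a | b = 1110
```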
When you have finished your implementation, you may test it against mine:
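In outline, an implementation can look like this (a sketch assuming the Big class sketched earlier, with operand lengths treated as public data; bitwise AND is identical with '&' in place of '|'):

```cpp
#include <algorithm>
#include <cstddef>

Big operator|(const Big& a, const Big& b) {
    const std::size_t n = std::max(a.bits.size(), b.bits.size());
    std::vector<std::uint8_t> out(n, 0);
    for (std::size_t i = 0; i < n; ++i) {
        // Walk from the least significant digit; a missing digit in the
        // shorter operand is treated as 0.
        std::uint8_t da = (i < a.bits.size()) ? a.bits[a.bits.size() - 1 - i] : 0;
        std::uint8_t db = (i < b.bits.size()) ? b.bits[b.bits.size() - 1 - i] : 0;
        out[n - 1 - i] = da | db;
    }
    Big r("0");
    r.bits = out;
    return r;
}
```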
Simply put, my solution applies the (|/&) operation to each digit of the first and second operands, from the least significant binary digit to the most significant. If one operand has a longer bitsize than the other, the program simply applies the (|/&) operator with 0 as the second operand's digit and prepends the result to the final answer.
Next we can make the functions for xor (^) and negation (~). The xor function simply tests whether two bits differ: if so, the output bit is 1; otherwise it is 0. The bitwise negation function (~) simply flips each bit: if the bit is 1 the output bit is 0--and vice versa. You can read the GeeksforGeeks article for a more in-depth explanation.
Here is the code snippet file:
Once you have made and tested your solution, please compare it to mine:
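As a sketch, under the same assumptions as the previous snippet:

```cpp
Big operator^(const Big& a, const Big& b) {
    const std::size_t n = std::max(a.bits.size(), b.bits.size());
    std::vector<std::uint8_t> out(n, 0);
    for (std::size_t i = 0; i < n; ++i) {
        std::uint8_t da = (i < a.bits.size()) ? a.bits[a.bits.size() - 1 - i] : 0;
        std::uint8_t db = (i < b.bits.size()) ? b.bits[b.bits.size() - 1 - i] : 0;
        out[n - 1 - i] = da ^ db; // 1 only when the two digits differ
    }
    Big r("0");
    r.bits = out;
    return r;
}

Big operator~(const Big& a) {
    Big r = a;
    for (auto& d : r.bits) d ^= 1; // flip every stored binary digit
    return r;
}
```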
Lastly, we need to implement the >> and << operators for Big in C++. >> means shift right bitwise: for example, 0b1001 >> 2 == 0b10, and 0b1001 << 1 == 0b10010. The >> operator is an easy way to divide a binary number by a power of 2, and the << operator is an easy way to multiply a number by a power of 2.
Use the following code snippet to make your solution:
And once you have written and tested your solution, here is mine:
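A sketch of the two shift operators (shift amounts are treated as public data, and the Big class is the one sketched earlier):

```cpp
Big operator>>(const Big& a, std::size_t k) {
    Big r = a;
    // Dropping the k least significant digits divides by 2^k.
    r.bits.resize(r.bits.size() > k ? r.bits.size() - k : 0);
    if (r.bits.empty()) r.bits.push_back(0); // keep at least one digit
    return r;
}

Big operator<<(const Big& a, std::size_t k) {
    Big r = a;
    // Appending k zero digits multiplies by 2^k.
    r.bits.insert(r.bits.end(), k, 0);
    return r;
}
```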
Now that we have the basic bitwise logical operators for Big we are ready to write the operator overload functions for addition and subtraction. For a quick review of binary addition you can see this article. For a great article on binary subtraction check out GeeksForGeeks' excellent article.
Here is the code snippet to start from:
Please review the articles on binary addition and subtraction if you are confused by them. I also strongly recommend you have a copy of Hacker's Delight, Second Edition by Henry S. Warren; it's a book of binary hacks you should find helpful as you work to develop code that is efficient and constant-time.
Here is my solution:
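In condensed form, the approach described below looks roughly like this. The digit-level carry and borrow logic follows the grade-school and two's-complement methods, and as before the sketch assumes the Big class from earlier:

```cpp
Big operator+(const Big& a, const Big& b) {
    const std::size_t n = std::max(a.bits.size(), b.bits.size()) + 1;
    std::vector<std::uint8_t> out(n, 0);
    std::uint8_t carry = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::uint8_t da = (i < a.bits.size()) ? a.bits[a.bits.size() - 1 - i] : 0;
        std::uint8_t db = (i < b.bits.size()) ? b.bits[b.bits.size() - 1 - i] : 0;
        out[n - 1 - i] = da ^ db ^ carry;        // sum digit
        carry = (da & db) | (carry & (da ^ db)); // carry out
    }
    Big r("0");
    r.bits = out;
    return r;
}

Big operator-(const Big& a, const Big& b) {
    // Compute a + (~b + 1), i.e. add the two's complement of b. For
    // unsigned operands the final carry out is 1 exactly when a >= b,
    // so a missing carry signals the error case (b was larger than a).
    const std::size_t n = std::max(a.bits.size(), b.bits.size());
    std::vector<std::uint8_t> out(n, 0);
    std::uint8_t carry = 1; // the "+ 1" of the two's complement
    for (std::size_t i = 0; i < n; ++i) {
        std::uint8_t da = (i < a.bits.size()) ? a.bits[a.bits.size() - 1 - i] : 0;
        std::uint8_t db = (i < b.bits.size()) ? b.bits[b.bits.size() - 1 - i] : 0;
        db ^= 1;                                 // ~b, digit by digit
        out[n - 1 - i] = da ^ db ^ carry;
        carry = (da & db) | (carry & (da ^ db));
    }
    Big r("0");
    r.bits = out;
    r.error = carry ^ 1; // flag the underflow case
    return r;
}
```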
My solution for addition is straightforward: it is a simple translation of grade-school binary addition. Binary subtraction was trickier; I decided to use the two's complement algorithm for it. There is also a second problem with binary subtraction: this article is about unsigned numbers, so if the second operand is larger than the first, the code has to detect this and return an error code.
With the algorithms for binary addition and subtraction done, we move on to the second most difficult problem: binary multiplication. There are two methods: the grade-school method (warning: it is slow, with a time complexity of O(n^2); technically it is O(size of first operand * size of second operand)) and the Karatsuba multiplication method, which is only faster once the numbers reach a certain bitsize. What that bitsize is depends on your implementation and the machine you are running on; you will have to find the optimal threshold through experimentation.
Cuemath gives an excellent primer on grade-school binary multiplication. Brilliant also gives an excellent explanation of Karatsuba. The most important part of Brilliant's article is its diagram of how Karatsuba works.
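In symbols (with base b = 2 and n the common, even bitsize of the operands), the decomposition that diagram illustrates is the standard Karatsuba identity:

$$
\begin{aligned}
x &= x_H\, b^{n/2} + x_L, \qquad y = y_H\, b^{n/2} + y_L,\\
xy &= x_H y_H\, b^{n} + \bigl[(x_H + x_L)(y_H + y_L) - x_H y_H - x_L y_L\bigr]\, b^{n/2} + x_L y_L.
\end{aligned}
$$

The bracketed middle term needs only one multiplication of its own, so Karatsuba performs three half-size multiplications where a naive split would perform four.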
x_H and y_H mean the first halves of x and y respectively, also known as the most significant halves. For example, if x is 0b1001 and y is 0b1100, then x_H == 0b10 and y_H == 0b11. Likewise, x_L and y_L mean the lower halves of x and y, aka the least significant halves. Since our base b is 2, multiplying by b^n and b^(n/2) can easily be done with left (<<) binary shifts; for example, a * b^n == a << n. Using b^(n/2) requires that each operand have an even number of bits, which we can guarantee by prepending 0 bits to the beginning of an operand until its bitsize is even.
There is one problem with the above explanation: as stated, the algorithm recursively halves x and y until each is exactly 1 bit in size. This is not efficient! I discovered through experimentation that after halving x and y down past a certain size, Karatsuba is actually slower than grade-school multiplication. You must figure out the threshold bitsize below which grade-school multiplication is faster: whenever the bitsizes of x and y fall under it, fall back to grade-school multiplication instead of recursing further.
With these notes you should be ready to program multiplication for Big:
Here is the code snippet for multiplication:
Once you have written and tested your solution, here is mine:
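In condensed form, the two methods and the cutoff switch can look like this (a sketch assuming the Big class and the operators sketched earlier; the 20-bit cutoff matches the figure discussed below):

```cpp
// Assumes the Big class and the operator sketches from earlier sections.
constexpr std::size_t KARATSUBA_CUTOFF = 20;

Big gradeschool(const Big& a, const Big& b) {
    Big acc("0");
    for (std::size_t i = 0; i < b.bits.size(); ++i) {
        std::size_t pos = b.bits.size() - 1 - i; // weight of this digit of b
        Big partial = a << pos;
        // Mask every digit of the partial product by b's digit, so a 0
        // digit contributes nothing -- no data-dependent branch.
        for (auto& d : partial.bits) d &= b.bits[i];
        acc = acc + partial;
    }
    return acc;
}

Big karatsuba(Big x, Big y) {
    // Pad both operands to a common, even bitsize n.
    std::size_t n = std::max(x.bits.size(), y.bits.size());
    n += n & 1;
    x.bits.insert(x.bits.begin(), n - x.bits.size(), 0);
    y.bits.insert(y.bits.begin(), n - y.bits.size(), 0);
    if (n <= KARATSUBA_CUTOFF) return gradeschool(x, y);

    // Split each operand into high and low halves of n/2 digits.
    Big xh = x, xl = x, yh = y, yl = y;
    xh.bits.resize(n / 2);
    xl.bits.erase(xl.bits.begin(), xl.bits.begin() + n / 2);
    yh.bits.resize(n / 2);
    yl.bits.erase(yl.bits.begin(), yl.bits.begin() + n / 2);

    Big hh = karatsuba(xh, yh);
    Big ll = karatsuba(xl, yl);
    Big mid = karatsuba(xh + xl, yh + yl) - hh - ll; // three recursive calls
    return (hh << n) + (mid << (n / 2)) + ll;
}
```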
My solution above, even with Karatsuba, is very slow compared to production-ready programs. I deliberately designed the Big object the way you see it--despite its slowness--so that first-time learners can focus on how the math of the exercises works. Hopefully this blog article was simple enough for you to learn the math these multi-precision algorithms require and then program them easily. When you read my solution you will notice that when the bitsize of each operand is less than 20 bits, my program switches from Karatsuba to grade-school multiplication. When I tested the program on my own machine, this technique made multiplication roughly twice as fast as pure grade-school--though still slow compared to production-ready implementations.
The implementation you made--like mine--is far too slow for binary exponentiation with 4096-bit unsigned integers, something you will deal with in RSA-4096. In the next blog article we will write a faster version of Big--though the price is that the exercises will be more difficult to solve.