Jason L Causey

Epsilon Comparison

This post was originally written for the A-State CS Department wiki in November, 2013.

The Numerical (In)accuracy Problem

Computers use binary (base-2) numbers internally. This means that our familiar decimal (base-10) numbers must be converted to binary before they can be stored in the computer. The problem is that some—many, actually—base-10 real numbers do not have an exact representation in base-2. This may seem odd until you consider that we deal with lots of numbers that do not have an exact representation in base-10 either. Two examples are $\pi$ and $e$. Neither of these values can be written exactly in decimal no matter how many digits you allow after the decimal point. Numbers of this variety are called irrational numbers. It turns out that many numbers that are easily written in decimal become irrational numbers when translated to binary. A simple example is the number 0.1. It is easy enough to write in decimal, but in binary it becomes impossible to represent exactly.

So what do we do when we can’t exactly represent a number? We round. If you were asked to write down $\pi$, what would you write? You would probably write down some value between 3.14 and 3.14159, depending on how precise you wanted to be. Neither number is exactly correct, but both are reasonable representations given the number of digits used. In a computer, there is a limit to the number of binary digits (bits) that may be used to represent a number. Because of this, numbers like 0.1 must be rounded to some limited precision. This rounding produces a “pretty good”, but inexact representation of 0.1.

At this point, you are probably asking yourself “why should I care?”, and that is a reasonable question. The answer is: Because it will cause errors in your programs! To help illustrate why, open the interactive Python interpreter and follow along with the following example:

First, let’s see how 0.1 is approximated in Python (Python is used for this example because it’s default numeric output mode shows more decimal places than C++). At the interactive prompt, type 0.1 and <ENTER>. You should see something like the following:

>>> 0.1
0.10000000000000001

Hmm. It looks like 0.1 gained an extra 1 in the 17^th^ decimal position! Now enter this relational expression and note the result:

0.1 == 3.1 - 3

Here’s the output that was produced on the computer used to write this page:

>>> 0.1 == 3.1 - 3
False

Why was the result False? To see the answer, let’s enter each operand into the interpreter separately and look at the result:

>>> 0.1
0.10000000000000001
>>> 3.1 - 3
0.10000000000000009

Notice that 0.1 is represented internally as 0.10000000000000001 (as we saw before). Now notice that 3.1 - 3 (which should also be 0.1) is represented internally as 0.10000000000000009. There is a difference in the 17^th^ digit following the decimal point! When we compare these two values with the == operator, the expression will return False since the two numbers (internally) are not exactly the same! We refer to this as rounding error, and if we want to compare real numbers for equality in a computer program, we must learn how to deal with rounding error properly.

A C++ Example

In C++, showing the inaccuracy is a little more difficult due to differences in the internal representation of floating-point values. The following example may or may not show the “error”, but it is likely to do so:

double pi1 = acos(-1);
double pi2 = (pi1 * (0.1 * 0.1)) / (0.1 * 0.1);
if(pi1 == pi2){
    cout << "OK, " << pi1 << " == " << pi2 << "!\n";
}
else{
    cout << "An ERROR occurred: " << pi1 << " != " << pi2 << "!\n";
}

Here, a very accurate value for $\pi$ is calculated using the inverse cosine function acos(). Then, this version of $\pi$ is used to calculate another representation (by calculating the area of a circle of radius 0.1 and then solving for $\pi$ by dividing out the $r^2$ term). In theory, these two values of $\pi$ should be exactly the same — but in the process of doing imprecise floating-point operations (the operations involving 0.1), rounding error was (possibly) introduced. Therefore, it is likely that the following output is produced:

An error occurred: 3.14159 != 3.14159!

Even worse, since the default behavior of the << operator in C++ is to round the output, you cannot even see the difference that caused the two values to be considered “not equal” (it was beyond the 5^th^ decimal place).

The Right Way to Compare Real Numbers for Equality

Ok, if we can’t use the == operator by itself to compare real numbers, how can we hope to determine if two real number values are equal? The answer is that we must accept the fact that we can’t compare real numbers exactly with a computer—not by using the built-in float type anyway. Once we have accepted that limitation, we can find a way to compare real numbers that takes the rounding error into account. We will decide on some amount of error that we will consider acceptable, then we will determine if the two values are reasonably close to equal. This makes sense in the real world, since you would probably say that both 3.1416 and 3.14159 are reasonable representations of $\pi$. To determine if two numbers are functionally equal (or “close enough”), we can ask the following question:

“Is the difference between the two values so small that we can ignore it?”

If the answer to this question is “Yes”, we will consider the numbers equal. Otherwise, we will consider them to be unequal. To answer this question, we can use an expression like the following (which compares 0.1 and 3.1 - 3 ):

In Python

abs(0.1 - (3.1 - 3)) < 1e-11

In C++

fabs(0.1 - (3.1 - 3)) < 1e-11

In the expression above, we take the difference of the two values (by subtracting them), then we look at the magnitude of that number (by taking its absolute value with abs() (in Python), or fabs() (in C++) to remove any negative sign). If the magnitude of the difference of the two numbers is less than a very small value ($1\times10^{-11}$ in this case), we consider the numbers equal. We usually refer to the small number representing allowable error as EPSILON. Thus, a comparison of this type is usually referred to as an EPSILON comparison. At the Python interactive prompt, enter the expression above and note the result. You should see something like the following:

>>> abs(0.1 - (3.1 - 3)) < 1e-11
True

For the C++ example, consider the same code testing the two values of $\pi$ as shown above, but now using an EPSILON comparison:

double pi1 = acos(-1);
double pi2 = (pi1 * (0.1 * 0.1)) / (0.1 * 0.1);

if(fabs(pi1 - pi2) < 1e-11){
    cout << "OK, " << pi1 << " == " << pi2 << "!\n";
}
else{
    cout << "An ERROR occurred: " << pi1 << " != " << pi2 << "!\n";
}    

Now, you can be confident that the outcome will always be as expected:

OK, 3.14159 == 3.14159!