Finding prime numbers with Eratosthenes
A prime is a natural number greater than 1 that has no positive divisors other than 1 and itself. For example:
- 6 is not prime because it has many divisors: 1, 2, 3, and 6.
- 7 is prime because it has only 2 divisors: 1 and itself.
Here's an interesting problem: write a program that finds all prime numbers between 0 and N. For N = 20, the solution would be: 2, 3, 5, 7, 11, 13, 17, 19.
In this article we will cover 2 solutions. One naive, and one very elegant called the "sieve of Eratosthenes". Both will be written in Python.
The naive solution
The best place to start when confronted with a problem is with a naive solution. It means we will code a straightforward program with no performance considerations.
First, we write a function to check whether or not the parameter
x is prime.
def is_prime(x): # If x is less than 2, it is not prime by definition if x < 2: return False # Go though all the numbers between 2 and x (x excluded) for i in range(2, x): # If the remainder of x divided by i is zero if x % i == 0: # Then i is a divisor of x, so x is not prime return False # If we found no divisor, then x is prime return True
Second, we go though all numbers between 1 and N one by one, and call
is_prime to test if each number is prime or not.
def all_primes(n): # Variable that will contain all the prime numbers primes =  # Go though all the numbers between 0 and n (n included) for i in range(0, n+1): # If the number i is prime if is_prime(i): # Then add it to the list primes.append(i) # Return all prime numbers return primes
And we are done! Let's run our program a few times while measuring how long it takes.
all_primes(10**3) # 0.2ms all_primes(10**4) # 9ms all_primes(10**5) # 870ms all_primes(10**6) # 115,000ms
It works, but it's pretty slow for large values of N. That's because we have 2 nested
for loops that need to go though many iterations. The time complexity of this algorithm is O(N2).
The sieve of Eratosthenes algorithm
Eratosthenes is a Greek mathematician that was born in 276 BC. He discovered a clever algorithm named after him that will solve our problem in an elegant and efficient way.
Below is a description of how the algorithm works for N = 29.
First we list all numbers between 2 and 29. Each number can have 2 states: green or red. By default they are all green.
We start from the beginning of the list, and pick the first green number. In our case that's 2.
Then we go trough the remaining of the list to find all numbers that are multiples of 2, and we set them in red. They are: 4, 6, 8, ..., 24, 26, and 28.
Now we start again: find the first green number (3), and mark in red all its multiples (6, 9, 12, 15, 18, 21, 24, 27). Note that some of them where already red, so they won't change color.
And we repeat the same thing, this time with 5 (10, 15, 20, 25).
We continue this process for all remaining numbers, and we end up with this.
As you may have guessed, all numbers in green are primes. Pretty cool!
Here's a short explanation of why this works:
- Red numbers are multiples of the green ones. It means that they have at least one divisor and cannot be primes.
- Green numbers were never set in red. It means they have no divisors, so they must be primes.
The sieve of Eratosthenes code
Let's code the sieve of Eratosthenes. But before that, there are 2 things to keep in mind:
- To store the colors, we will use a boolean:
Truefor green and
- To store both the numbers and the booleans, we will use a simple list. For example
[ False, False, True ]would mean:
Falsein 0th position → 0 is not prime.
Falsein 1st position → 1 is not prime.
Truein 2nd position → 2 is prime.
Here's the full code of the algorithm, with many comments to explain how it works.
def all_primes(n): # Variable that will contain all the prime numbers primes =  # Initialize our table with n times the boolean True table = [ True ] * n # The numbers 0 and 1 are not primes by definition # So we set them as False in the table table = table = False # Go though all the elements of the table for i, boolean in enumerate(table): # If the boolean is set to True if boolean: # It means i is a prime number, so we add it to our list primes.append(i) # Go though all multiples of i # That's i+i, i+i+i, and so on up to n for j in range(i+i, n, i): # And set their flags to False table[j] = False # Return all primes return primes
The time complexity of this algorithm is O(N log log N), which is much better than the previous O(N2). So if we try our new program with our previous tests, we get vastly improved performances.
all_primes(10**3) # 0.001ms (previously 0.2ms) all_primes(10**4) # 0.4ms (previously 9ms) all_primes(10**5) # 3.5ms (previously 870ms) all_primes(10**6) # 36.5ms (previously 115,000ms)
Prime numbers are an interesting topic, and the sieve of Erastosthenes is a very efficient and elegant solution to find them. That's why it's one of my favorite algorithms.
For your information, all the code shown in this article is available on GitHub.
If you have any questions or feedback: email@example.com :-)