lesscake.com

Finding prime numbers with Eratosthenes

A prime is a natural number greater than 1 that has no positive divisors other than 1 and itself. For example:

Here's an interesting problem: write a program that finds all prime numbers between 0 and N. For N = 20, the solution would be: 2, 3, 5, 7, 11, 13, 17, 19.

In this article we will cover 2 solutions. One naive, and one very elegant called the "sieve of Eratosthenes". Both will be written in Python.

The naive solution

The best place to start when confronted with a problem is with a naive solution. It means we will code a straightforward program with no performance considerations.

First, we write a function to check whether or not the parameter x is prime.

def is_prime(x):
  # If x is less than 2, it is not prime by definition
  if x < 2:
    return False

  # Go though all the numbers between 2 and x (x excluded)
  for i in range(2, x):
    # If the remainder of x divided by i is zero
    if x % i == 0:
      # Then i is a divisor of x, so x is not prime
      return False

  # If we found no divisor, then x is prime
  return True

Second, we go though all numbers between 1 and N one by one, and call is_prime to test if each number is prime or not.

def all_primes(n):
  # Variable that will contain all the prime numbers
  primes = []

  # Go though all the numbers between 0 and n (n included)
  for i in range(0, n+1):
    # If the number i is prime
    if is_prime(i):
      # Then add it to the list
      primes.append(i)

  # Return all prime numbers
  return primes

And we are done! Let's run our program a few times while measuring how long it takes.

all_primes(10**3) # 0.2ms

all_primes(10**4) # 9ms

all_primes(10**5) # 870ms

all_primes(10**6) # 115,000ms

It works, but it's pretty slow for large values of N. That's because we have 2 nested for loops that need to go though many iterations.

The sieve of Eratosthenes algorithm

Eratosthenes is a Greek mathematician that was born in 276 BC. He discovered a clever algorithm named after him that will solve our problem in an elegant and efficient way.

Below is a description of how the algorithm works for N = 29.

First we list all numbers between 2 and 29. Each number can have 2 states: green or red. By default they are all green.

We start from the beginning of the list, and pick the first green number. In our case that's 2.

Then we go trough the remaining of the list to find all numbers that are multiples of 2, and we set them in red. They are: 4, 6, 8, ..., 24, 26, and 28.

Now we start again: find the first green number (3), and mark in red all its multiples (6, 9, 12, 15, 18, 21, 24, 27). Note that some of them where already red, so they won't change color.

And we repeat the same thing, this time with 5 (10, 15, 20, 25).

We continue this process for all remaining numbers, and we end up with this.

As you may have guessed, all numbers in green are primes. Pretty cool!

Here's a short explanation of why this works:

The sieve of Eratosthenes code

Let's code the sieve of Eratosthenes. But before that, there are 2 things to keep in mind:

Here's the full code of the algorithm, with many comments to explain how it works.

def all_primes(n):
  # Variable that will contain all the prime numbers
  primes = []

  # Initialize our table with n times the boolean True
  table = [ True ] * n

  # The numbers 0 and 1 are not primes by definition
  # So we set them as False in the table
  table[0] = table[1] = False

  # Go though all the elements of the table
  for i, boolean in enumerate(table):
    # If the boolean is set to True
    if boolean:
      # It means i is a prime number, so we add it to our list
      primes.append(i)

      # Go though all multiples of i
      # That's i+i, i+i+i, and so on up to n
      for j in range(i+i, n, i):
        # And set their flags to False
        table[j] = False

  # Return all primes
  return primes

If we try our new program with our previous tests, we get vastly improved performances.

all_primes(10**3) # 0.001ms (previously 0.2ms)

all_primes(10**4) # 0.4ms (previously 9ms)

all_primes(10**5) # 3.5ms (previously 870ms)

all_primes(10**6) # 36.5ms (previously 115,000ms)

Conclusion

Prime numbers are an interesting topic, and the sieve of Erastosthenes is a very efficient and elegant solution to find them. That's why it's one of my favorite algorithms.

For your information, all the code shown in this article is available on GitHub.

If you have any questions or feedback: thomas@lesscake.com :-)