The algorithm starts out by assuming all numbers are prime, and marking them as such.
At the end of the algorithm, only prime numbers up to an upper limit will still be marked.
The number 1 is a special case, so we start off by unmarking it.
Then we go through the numbers one by one.
For every non-prime number we find, skip to the next number.
If a number is still marked as prime when we get to it, that means it is prime.
Before moving on to the next number, we first unmark every multiple of the found prime.
Those multiples are divisible by the prime number we just found, so by definition they aren’t prime.
We repeat this process until we reach the upper limit.
Every number that is still marked as prime at that point is truly prime.
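The steps above can be sketched in Python, before any optimization. This is only a minimal illustration: the function name `basic_sieve` and the parameter `limit` are made up for this sketch, and index 0 is unmarked along with 1 because Python lists start at 0.

```python
def basic_sieve(limit):
    """Return all primes up to and including limit (unoptimized sketch)."""
    if limit < 2:
        return []
    # Assume every number is prime to begin with.
    is_prime = [True] * (limit + 1)
    # 1 is the special case; 0 is unmarked too because the list starts at index 0.
    is_prime[0] = is_prime[1] = False
    for number in range(2, limit + 1):
        if not is_prime[number]:
            # Already unmarked, so skip to the next number.
            continue
        # number is prime: unmark every multiple of it (but not number itself).
        for multiple in range(2 * number, limit + 1, number):
            is_prime[multiple] = False
    # Whatever is still marked is prime.
    return [n for n in range(limit + 1) if is_prime[n]]


print(basic_sieve(30))
```

Calling `basic_sieve(30)` should give the ten primes from 2 through 29.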
Optimizations
By using some math we can do significantly less work while still getting the same result.
Repeat until the square root
While iterating through all numbers, we can stop at the square root of the upper limit.
Any non-prime can be expressed as the product of 2 numbers that are not 1 or itself.
n=a∗b
a and b are factors of n.
n=√n∗√n, so one factor has to be less than or equal to √n while the other is greater than or equal to that square root.
a≤√n≤b
For any upper limit n, every multiple (up to n) of a number bigger than √n must also have a factor smaller than √n.
As a result that multiple will already be unmarked.
This means that all the non-primes ≥√limit will be unmarked in the process of checking every number ≤√limit.
Example
√21≈4.58
Any number up to 21 that is a multiple of a number larger than 4.58 will have a factor smaller than 4.58.
Take 18, which is a number up to 21.
It is also a multiple of a number that is bigger than 4.58, for example 6.
That means 18 must have a factor smaller than 4.58.
That checks out, 3 is a factor!
Because 3 is a factor of 18, 18 is unmarked, at the latest, while the algorithm is going through the multiples of 3.
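The square root claim can be checked by brute force. The sketch below is an illustration only, with made-up helper names (`smallest_factor`, `composites_have_small_factor`): it verifies that every non-prime up to a limit has a factor greater than 1 that does not exceed the square root of that limit.

```python
from math import isqrt


def smallest_factor(m):
    """Return the smallest factor of m greater than 1 (m itself if m is prime).

    Assumes m >= 2.
    """
    for candidate in range(2, m + 1):
        if m % candidate == 0:
            return candidate


def composites_have_small_factor(limit):
    """Check that every non-prime in 4..limit has a factor <= sqrt(limit)."""
    root = isqrt(limit)  # integer part of the square root, 4 for limit 21
    composites = [m for m in range(4, limit + 1) if smallest_factor(m) != m]
    return all(smallest_factor(m) <= root for m in composites)


print(composites_have_small_factor(21))
print(smallest_factor(18))  # 18 has the even smaller factor 2, next to 3
```

For the limit 21 from the example, every non-prime (4, 6, 8, 9, and so on up to 21) has 2 or 3 as its smallest factor, both below 4.58.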
Start unmarking at the square
During the step where the algorithm unmarks all multiples of a number, we can start unmarking at that number squared.
Every smaller multiple was already unmarked in a previous iteration.
Why?
A multiple can be written as a multiplier times a number.
m=multiple
k=multiplier
p=prime
m=k∗p
The number that is now p was previously the multiplier k for every smaller prime number.
Because k∗p=p∗k, every multiple smaller than p∗p has already been unmarked in a previous iteration.
Example
Say our currently detected prime is p=5.
5 was previously the multiplier for every smaller prime number.
5∗2 was unmarked when p was 2, we don’t need to calculate 2∗5.
5∗3 was unmarked when p was 3, we don’t need to calculate 3∗5.
5∗4 was also unmarked when p was 2, because 4 is itself a multiple of 2.
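To see this for more than one prime, here is a small sanity check, written as a sketch with a hypothetical function name. It runs the sieve while only ever unmarking from p∗p onward and, each time it reaches a prime p, confirms that every smaller multiple of p is already unmarked.

```python
def smaller_multiples_already_unmarked(limit):
    """Return True if, for every prime p, all k*p with 2 <= k < p are
    already unmarked by the time the sieve reaches p."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, limit + 1):
        if not sieve[p]:
            continue
        # Multiples of p below p*p (and within the limit) should be unmarked.
        for multiple in range(2 * p, min(p * p, limit + 1), p):
            if sieve[multiple]:
                return False
        # Only now unmark, starting at the square.
        for multiple in range(p * p, limit + 1, p):
            sieve[multiple] = False
    return True


print(smaller_multiples_already_unmarked(1000))
```

For p=5 the check looks at 10, 15 and 20, which match the 5∗2, 5∗3 and 5∗4 cases.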
Step by step in code
The goal is to write a function that returns a list of prime numbers, up to upper_bound.
We initialise a list of booleans, all set to True, with a length that is 1 bigger than the given upper_bound, and call it sieve.
These booleans tell us if the number at that index is prime or not. (True for prime, False for not)
Smart people decided programmers start counting at 0, so that’s why that list is 1 bigger than upper_bound.
It’s also the reason why we have to unmark the index 0 along with the index 1 before we start our loop.
This works out perfectly, because now every index exactly matches the number it represents.
You want to know if the number 69 is prime? The boolean at index 69 will tell you. Nice!
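The setup described so far might look like the sketch below. The helper name `make_sieve` is invented for illustration, and it assumes upper_bound is at least 1.

```python
def make_sieve(upper_bound):
    """Build the initial sieve: every index marked True except 0 and 1.

    Assumes upper_bound >= 1 (otherwise index 1 would not exist).
    """
    # One extra slot so that index upper_bound itself exists.
    sieve = [True] * (upper_bound + 1)
    # 0 and 1 are not prime, so unmark them before the loop starts.
    sieve[0] = sieve[1] = False
    return sieve


sieve = make_sieve(10)
print(len(sieve))  # 11 entries, for the indexes 0 through 10
print(sieve[:4])   # [False, False, True, True]
```

At this point everything from index 2 upward is still marked True; the loop is what unmarks the non-primes.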
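Then we loop over the numbers, starting at 2 and stopping at the square root of upper_bound.
If the boolean at the current number’s index is True, the number is prime and we unmark every multiple before moving on to the next step of our loop.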
Do this by skip counting.
Start at the number squared and add the number until you hit upper_bound.
For every encountered multiple, set sieve at that number’s index to False.
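In Python, skip counting maps naturally onto `range` with a step. The snippet below is a small sketch with a made-up helper name, `multiples_from_square`, that lists the indexes that would be set to False for one prime.

```python
def multiples_from_square(number, upper_bound):
    """List the multiples of number from number squared up to upper_bound."""
    # range's end is exclusive, hence upper_bound + 1 to include upper_bound.
    return list(range(number * number, upper_bound + 1, number))


print(multiples_from_square(5, 50))  # [25, 30, 35, 40, 45, 50]
print(multiples_from_square(7, 50))  # [49]
```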
At the end of the outer loop, sieve will be full of booleans that tell us, for every index in that list, whether that number is prime.
Use your favourite method to loop over a list while also getting the index, put the indexes with a True into a new list, and presto, primes.
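Putting the whole walkthrough together, one possible version of the function could look like this. It is a sketch rather than the only way to write it: the name `primes_up_to`, the early return for bounds below 2, and the choice of `math.isqrt` and `enumerate` are assumptions made for this example.

```python
from math import isqrt


def primes_up_to(upper_bound):
    """Return a list of all primes up to and including upper_bound."""
    if upper_bound < 2:
        return []
    # One boolean per number, index == number, everything marked prime at first.
    sieve = [True] * (upper_bound + 1)
    sieve[0] = sieve[1] = False
    # Only numbers up to the square root of upper_bound need to be checked.
    for number in range(2, isqrt(upper_bound) + 1):
        if sieve[number]:
            # Skip count from the square, unmarking every multiple.
            for multiple in range(number * number, upper_bound + 1, number):
                sieve[multiple] = False
    # Collect the indexes that are still marked True.
    return [index for index, is_prime in enumerate(sieve) if is_prime]


print(primes_up_to(30))
```

`isqrt` returns the integer part of the square root, so the `+ 1` makes sure the square root itself is included when upper_bound is a perfect square.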