Prime Generation and Manipulation
(bolded sections are more interesting)
toggle TOC (ctrl + ⇔)
sal/algo/prime.h
Sieve | segmented sieve class that generates primes |
---|---|
.next_prime | next prime in the sequence or after a guess |
.closest_prime | the nearest prime to a guess |
.cur_prime | current prime in the sequence |
.nth_prime | nth prime, starting from 2(n=1) |
.primes_upto | sequence of primes up to a upper limit |
.is_prime | check if guess is a prime number |
.count | number of primes in sequence below current prime or an upper limit |
Sieve of Eratosthenes ¶
Declaration
template <typename big_int = unsigned long long>
class Sieve;
// constructor
Sieve(big_int init_limit = 0, size_t seg_size = L1D_CACHE_SIZE);
Parameters
init_limit | initial limit to sieve up to |
seg_size | the batch size that primes are generated in |
Example
// keep a sieve object if continuous work with primes is needed
using big_int = size_t;
Sieve<big_int> sieve;
// or to initialize with primes up to 1000 already generated
Sieve<> alt_sieve(1000);
Discussion
Its working mechanism is depicted in the image below, where multiples of first encountered primes are marked.
Although not the theoretical fastest prime generating algorithm (with complexity O(nlglg(n))
),
an optimized Sieve of Eratosthenes is computationally the fastest way to do so. The most important optimization
is to split the sieve into segments, the most efficient size being the L1 CPU cache. This vastly reduces cache
misses, but most importantly lifts the memory limit on the largest prime sievable. Supposing 4GB of free memory,
a traditional sieve using a 4GB large bit array (8 bits per byte) could only treat primes under 16 billion.
The implementation is influenced by primesieve, which boasts more extensive optimizations than SAL's, but I hope that my version is more readable and usable as part of larger projects.

Next prime | Sieve::
next_prime¶
Declaration
big_int next_prime();
big_int next_prime(big_int guess);
Parameters
guess | a number immediately before the desired prime; returns the prime after if guess is a prime |
Example
// using sieve from before
while (true) sieve.next_prime();
// generate infinite stream of primes (each of big_int)
sieve.next_prime(500);
// prime immediately after 500 -> 503
sieve.next_prime(503);
// if guess is a prime, will return next prime 509
Discussion
Often either a sequence of primes or a prime "next to" a number is needed. This method provides one means of doing so, the other one being closest_prime. Using next_prime to generate primes will result in uneven performance as the primes are sieved in segments at a time, with most of the work being at the boundary of a segment.
Closest prime | Sieve::
closest_prime¶
Declaration
big_int closest_prime(big_int guess);
Parameters
guess | number around which a prime is wanted |
Example
sieve.closest_prime(50000);
// big_int 49999
Discussion
Primes often have use in information encoding due to its unfactorable nature. One application is finding acceptable sizes for hash tables. If the hash function is any number of Universal hashes, then the size should be a prime close to a power of two (such as Mersenee primes) to replace division with shifts and additions but also while reducing clustering.
Current prime | Sieve::
cur_prime¶
Declaration
big_int cur_prime() const;
Example
Sieve<> sieve;
for (int i = 0; i < 100; ++i)
sieve.next_prime();
sieve.cur_prime();
// size_t 100th prime (n = 100)
Discussion
Current prime in the sequence.
Nth prime | Sieve::
nth_prime¶
Declaration
big_int nth_prime(big_int nth);
Parameters
nth | prime index starting with 2(n=1), 3(n=2) |
Example
sieve.nth_prime(1000);
// big_int 7919 (1000th prime)
Discussion
Prime numbering starts with an index of 1; nth = 0 or nth < 0 returns 0, but beware of underflows if using an unsigned type such as size_t by default.
Primes up to | Sieve::
primes_upto¶
Declaration
const std::vector<big_int>& primes_upto(big_int lim);
Parameters
lim | upper limit for largest prime |
Example
sieve.nth_prime(1000);
// big_int 7919 (1000th prime)
Discussion
Returns a reference to the internal prime vector. Note that the last member
is not guranteed to be the prime just under lim!
(it will be if lim is greater than internal limit prior to running)
A separate copy guranteeing the last element is not done as that would be very inefficient and unncessary.
If the last prime is too large, use sieve.count(lim) - 1
to retrieve index of the desired back.
Check if prime | Sieve::
is_prime¶
Declaration
bool is_prime(big_int guess);
Example
sieve.is_prime(sieve.nth_prime(420000));
// bool true (the 420000th prime is a prime)
Discussion
Accomplished by first filtering out smooth numbers (ones with small factors) then performing a binary search on existing primes or sieving up to the number;
Count primes under | Sieve::
count¶
Declaration
size_t count();
size_t count(big_int upper);
Parameters
upper | upper limit for largest prime |
Example
sieve.count(1000000);
// size_t 78498 primes below a million
Discussion
Counting primes is more efficient than generating them and then counting since a bit sieve is used for counting only (8 byte statuses in 1 byte). Counting will not increase the current prime index or load primes into the internal vector.