Prime Generation and Manipulation

Sieve set_limit next_prime closest_prime nth_prime cur_prime primes_upto count is_prime

(bolded sections are more interesting)

toggle TOC (ctrl + ⇔)

sal/algo/prime.h

Sievesegmented sieve class that generates primes
.next_primenext prime in the sequence or after a guess
.closest_primethe nearest prime to a guess
.cur_primecurrent prime in the sequence
.nth_primenth prime, starting from 2(n=1)
.primes_uptosequence of primes up to a upper limit
.is_primecheck if guess is a prime number
.countnumber of primes in sequence below current prime or an upper limit

Sieve of Eratosthenes

Declaration

template <typename big_int = unsigned long long>
class Sieve;

// constructor
Sieve(big_int init_limit = 0, size_t seg_size = L1D_CACHE_SIZE);

Parameters

init_limitinitial limit to sieve up to
seg_sizethe batch size that primes are generated in

Example

// keep a sieve object if continuous work with primes is needed
using big_int = size_t;
Sieve<big_int> sieve;

// or to initialize with primes up to 1000 already generated
Sieve<> alt_sieve(1000);

Discussion

Its working mechanism is depicted in the image below, where multiples of first encountered primes are marked. Although not the theoretical fastest prime generating algorithm (with complexity O(nlglg(n))), an optimized Sieve of Eratosthenes is computationally the fastest way to do so. The most important optimization is to split the sieve into segments, the most efficient size being the L1 CPU cache. This vastly reduces cache misses, but most importantly lifts the memory limit on the largest prime sievable. Supposing 4GB of free memory, a traditional sieve using a 4GB large bit array (8 bits per byte) could only treat primes under 16 billion.

The implementation is influenced by primesieve, which boasts more extensive optimizations than SAL's, but I hope that my version is more readable and usable as part of larger projects.

Next prime | Sieve::next_prime

Declaration

big_int next_prime();
big_int next_prime(big_int guess);

Parameters

guessa number immediately before the desired prime; returns the prime after if guess is a prime

Example

// using sieve from before
while (true) sieve.next_prime();
// generate infinite stream of primes (each of big_int)

sieve.next_prime(500);
// prime immediately after 500 -> 503
sieve.next_prime(503);
// if guess is a prime, will return next prime 509

Discussion

Often either a sequence of primes or a prime "next to" a number is needed. This method provides one means of doing so, the other one being closest_prime. Using next_prime to generate primes will result in uneven performance as the primes are sieved in segments at a time, with most of the work being at the boundary of a segment.

Closest prime | Sieve::closest_prime

Declaration

big_int closest_prime(big_int guess);

Parameters

guessnumber around which a prime is wanted

Example

sieve.closest_prime(50000);
// big_int 49999

Discussion

Primes often have use in information encoding due to its unfactorable nature. One application is finding acceptable sizes for hash tables. If the hash function is any number of Universal hashes, then the size should be a prime close to a power of two (such as Mersenee primes) to replace division with shifts and additions but also while reducing clustering.

Current prime | Sieve::cur_prime

Declaration

big_int cur_prime() const;

Example

Sieve<> sieve;
for (int i = 0; i < 100; ++i)
	sieve.next_prime();

sieve.cur_prime();
// size_t 100th prime (n = 100)

Discussion

Current prime in the sequence.

Nth prime | Sieve::nth_prime

Declaration

big_int nth_prime(big_int nth);

Parameters

nthprime index starting with 2(n=1), 3(n=2)

Example

sieve.nth_prime(1000);
// big_int 7919 (1000th prime)

Discussion

Prime numbering starts with an index of 1; nth = 0 or nth < 0 returns 0, but beware of underflows if using an unsigned type such as size_t by default.

Primes up to | Sieve::primes_upto

Declaration

const std::vector<big_int>& primes_upto(big_int lim);

Parameters

limupper limit for largest prime

Example

sieve.nth_prime(1000);
// big_int 7919 (1000th prime)

Discussion

Returns a reference to the internal prime vector. Note that the last member is not guranteed to be the prime just under lim! (it will be if lim is greater than internal limit prior to running) A separate copy guranteeing the last element is not done as that would be very inefficient and unncessary. If the last prime is too large, use sieve.count(lim) - 1 to retrieve index of the desired back.

Check if prime | Sieve::is_prime

Declaration

bool is_prime(big_int guess);

Example

sieve.is_prime(sieve.nth_prime(420000));
// bool true (the 420000th prime is a prime)

Discussion

Accomplished by first filtering out smooth numbers (ones with small factors) then performing a binary search on existing primes or sieving up to the number;

Count primes under | Sieve::count

Declaration

size_t count();
size_t count(big_int upper);

Parameters

upperupper limit for largest prime

Example

sieve.count(1000000);
// size_t 78498 primes below a million 

Discussion

Counting primes is more efficient than generating them and then counting since a bit sieve is used for counting only (8 byte statuses in 1 byte). Counting will not increase the current prime index or load primes into the internal vector.