Parallel Processing in Algebraic Number Theory
Bill Hart, February 1, 2007

SLIDE 1

Outline Introduction to FLINT

Parallel Processing in Algebraic Number Theory

Bill Hart February 1, 2007

Bill Hart Parallel Processing in Algebraic Number Theory

SLIDE 2

Introduction to FLINT: Fast Library for Number Theory

SLIDE 3

FLINT: Fast Library for Number Theory

◮ Jointly maintained by David Harvey (Harvard) and Bill Hart (Warwick)

SLIDE 10

FLINT Design Philosophy

◮ Faster than all available alternatives.
◮ Asymptotically Fast Algorithms.
◮ Library written in C.
◮ Based on GMP.
◮ Extensively Tested.
◮ Extensively Profiled.
◮ Support for Parallel Processing.

SLIDE 15

What does FLINT currently do?

◮ All GMP integer functions (mpz_add → Z_add).
◮ Additional functions for Z and modulo arithmetic.
◮ Integer Factorisation (Multiple Polynomial Quadratic Sieve).
◮ Some polynomial arithmetic, including asymptotically fast polynomial multiplication.
◮ Approximately 21,000 lines of C code so far (including profiling and test code).

SLIDE 25

What FLINT will eventually do

◮ Z - Integer Arithmetic.
◮ Zmod - Arithmetic in Z/nZ.
◮ Zpoly - Polynomials over Z.
◮ Zvec - Vectors over Z.
◮ Zmat - Linear algebra over Z.
◮ Z_p - p-adics.
◮ GF2 - Sparse and dense matrices over GF2.
◮ QNF - Quadratic number fields.
◮ NF - General number fields.
◮ ?? - Whatever people contribute.

SLIDE 34

Additional Functions Available in FLINT

◮ Exponentiation.
◮ Modular multiplication, modular inversion, modular square root (mod p or mod p^k), CRT, modular exponentiation.
◮ Next prime, random prime, extended GCD, GCD.
◮ Integer multiplication (faster than GMP 4.2.1 with Pierrick Gaudry's AMD64 patches for operands of more than 164000 bits).
◮ Block Lanczos code (Jason Papadopoulos).
◮ Polynomial root finding code (Jason P. - not yet integrated).
◮ SQUFOF factoring algorithm (Jason P. - not yet integrated).
◮ Self-initialising multiple polynomial quadratic sieve (for integer factorization).
◮ Memory management for single mpz_t's and arrays of mpz_t's, arrays of limbs.
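To make the modular functions listed on this slide concrete, here is a minimal C sketch of modular exponentiation, extended GCD, modular inversion and a two-modulus CRT on machine words. It is not FLINT's code: the names mod_mul, mod_pow, ext_gcd, mod_inv and crt2 are hypothetical, and the restriction to moduli below 2^32 is an assumption made so that no multiprecision arithmetic is needed.

```c
#include <assert.h>
#include <stdint.h>

/* (a * b) mod m. Assumes m < 2^32 so the product fits in 64 bits. */
uint64_t mod_mul(uint64_t a, uint64_t b, uint64_t m)
{
    assert(m > 0 && m < (UINT64_C(1) << 32));
    return ((a % m) * (b % m)) % m;
}

/* base^exp mod m by binary (square-and-multiply) exponentiation. */
uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = mod_mul(result, base, m);
        base = mod_mul(base, base, m);
        exp >>= 1;
    }
    return result;
}

/* Extended GCD: returns g = gcd(a, b) and sets x, y with a*x + b*y = g. */
int64_t ext_gcd(int64_t a, int64_t b, int64_t *x, int64_t *y)
{
    int64_t x0 = 1, y0 = 0, x1 = 0, y1 = 1;
    while (b != 0) {
        int64_t q = a / b, t;
        t = a - q * b;   a = b;   b = t;
        t = x0 - q * x1; x0 = x1; x1 = t;
        t = y0 - q * y1; y0 = y1; y1 = t;
    }
    *x = x0;
    *y = y0;
    return a;
}

/* Inverse of a modulo m (m > 0), or -1 if a is not invertible. */
int64_t mod_inv(int64_t a, int64_t m)
{
    int64_t x, y;
    int64_t g = ext_gcd(((a % m) + m) % m, m, &x, &y);
    if (g != 1)
        return -1;
    return ((x % m) + m) % m;
}

/* CRT for coprime m1, m2: the r in [0, m1*m2) with r = r1 (mod m1)
   and r = r2 (mod m2). Assumes 0 <= r1 < m1 and small moduli. */
int64_t crt2(int64_t r1, int64_t m1, int64_t r2, int64_t m2)
{
    int64_t inv = mod_inv(m1 % m2, m2);
    assert(inv >= 0);
    int64_t diff = (((r2 - r1) % m2) + m2) % m2;
    int64_t k = (diff * inv) % m2;
    return r1 + m1 * k;
}
```

For example, crt2(2, 3, 3, 5) recovers 8, the residue that is 2 mod 3 and 3 mod 5. A library built on GMP would use multiprecision integers here rather than 64-bit words.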

SLIDE 40

Polynomial Arithmetic Available so far

◮ Allocate, deallocate, copy, clear.
◮ Maximum coefficient size, whether coefficients are signed or unsigned, maximum length.
◮ Add, subtract, multiply by scalar.
◮ Truncate, rotate.
◮ Polynomial multiplication (including Karatsuba, RadixMul, Schoenhage-Strassen, Kronecker-Schoenhage).
◮ Many test and profiling functions.

SLIDE 43

Why do we need a new Library?

◮ What about Pari, NTL, LiDIA, others?
◮ What about MAGMA, MAPLE, Mathematica, etc?
◮ SAGE seems to be doing just fine building in functionality from NTL and Pari and others.

SLIDE 44

Sieve timing comparisons

Digits   Msieve   FLINT    Pari
C41      0.33s    0.24s    0.34s
C51      1.4s     1.4s     3.78s
C61      9s       15.6s    61.3s
C71      90s      187s     392s
C81      820s     2160s    7985s
C86      4200s    7380s    Umm yeah

Timings for a 1.8GHz Opteron (sage.math)

SLIDE 45

Sieve timing comparisons

Digits   Msieve   FLINT    Pari
C41      0.44s    0.40s    1.1s
C51      1.97s    1.82s    5.5s
C61      13s      18s      90s
C71      133s     187s     690s
C76      568s     898s     2970s
C81      1045s    2320s    7920s
C86      5880s    8580s    Ahem

Timings for an Athlon XP 2000+ (laptop)

SLIDE 46

Polynomial Multiplication: NTL vs Pari (Pari = Red)

SLIDE 47

Polynomial Multiplication: MAGMA vs NTL (MAGMA = Red)

SLIDE 52

Algorithms for Polynomial Multiplication

◮ Radix Multiplication (used by NTL - old algorithm)
◮ Schoenhage-Strassen (based on FFTs)
◮ Kronecker-Schoenhage (combine into large integers and multiply)
◮ Karatsuba
◮ Toom
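The "combine into large integers and multiply" idea behind Kronecker-Schoenhage can be sketched in a few lines of C. Each polynomial is evaluated at x = 2^k so that its coefficients occupy separate k-bit slots of one integer; a single integer multiplication then yields the product's coefficients in the same slots. This is a toy illustration, not FLINT's implementation: the names ks_pack, ks_unpack, ks_mul and the slot width KS_BITS are hypothetical, and a 64-bit word stands in for the multiprecision integers a real library would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KS_BITS 16 /* hypothetical slot width: every coefficient must stay below 2^16 */

/* Pack f(x) into the integer f(2^KS_BITS), highest coefficient first. */
uint64_t ks_pack(const uint32_t *coeffs, size_t len)
{
    uint64_t packed = 0;
    for (size_t i = len; i-- > 0; ) {
        assert(coeffs[i] < (UINT32_C(1) << KS_BITS));
        packed = (packed << KS_BITS) | coeffs[i];
    }
    return packed;
}

/* Read len coefficients back out of the KS_BITS-wide slots. */
void ks_unpack(uint64_t packed, uint32_t *coeffs, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        coeffs[i] = (uint32_t)(packed & ((UINT64_C(1) << KS_BITS) - 1));
        packed >>= KS_BITS;
    }
}

/* Multiply two polynomials with non-negative coefficients via one integer
   multiplication. The caller must ensure every coefficient of the product
   is below 2^KS_BITS, otherwise neighbouring slots overlap. */
void ks_mul(const uint32_t *a, size_t len_a,
            const uint32_t *b, size_t len_b, uint32_t *out)
{
    size_t len_out = len_a + len_b - 1;
    assert(len_a > 0 && len_b > 0 && len_out * KS_BITS <= 64);
    uint64_t product = ks_pack(a, len_a) * ks_pack(b, len_b);
    ks_unpack(product, out, len_out);
}
```

With a = 1 + 2x and b = 3 + 4x, ks_mul fills out with 3, 10, 8, the coefficients of 3 + 10x + 8x^2. The appeal of the method is that all the hard work is delegated to a fast integer multiplication routine.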

SLIDE 53

Karatsuba Method

(a_1 + a_2 x^n)(b_1 + b_2 x^n) = a_1 b_1 + a_2 b_2 x^{2n} + (a_1 + a_2)(b_1 + b_2) x^n - a_1 b_1 x^n - a_2 b_2 x^n
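The identity above replaces four half-size products with three: a_1 b_1, a_2 b_2 and (a_1 + a_2)(b_1 + b_2). A minimal recursive sketch in C follows. It assumes word-sized coefficients that do not overflow and a length that is a power of two; the function names are hypothetical and this is not FLINT's routine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Schoolbook product of two length-n polynomials; out receives 2n-1 coefficients. */
void poly_mul_classical(const int64_t *a, const int64_t *b, size_t n, int64_t *out)
{
    memset(out, 0, (2 * n - 1) * sizeof *out);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            out[i + j] += a[i] * b[j];
}

/* Karatsuba product of two length-n polynomials, n a power of two.
   Splits a = a1 + a2*x^h and b = b1 + b2*x^h with h = n/2 and uses
   three half-size products instead of four. */
void poly_mul_karatsuba(const int64_t *a, const int64_t *b, size_t n, int64_t *out)
{
    assert(n >= 1 && (n & (n - 1)) == 0);
    if (n <= 2) {
        poly_mul_classical(a, b, n, out);
        return;
    }
    size_t h = n / 2, plen = 2 * h - 1;
    int64_t *buf = malloc((2 * h + 3 * plen) * sizeof *buf);
    assert(buf != NULL);
    int64_t *sa = buf, *sb = sa + h;                      /* a1 + a2, b1 + b2 */
    int64_t *lo = sb + h, *hi = lo + plen, *mid = hi + plen;

    for (size_t i = 0; i < h; i++) {
        sa[i] = a[i] + a[i + h];
        sb[i] = b[i] + b[i + h];
    }
    poly_mul_karatsuba(a, b, h, lo);          /* a1*b1 */
    poly_mul_karatsuba(a + h, b + h, h, hi);  /* a2*b2 */
    poly_mul_karatsuba(sa, sb, h, mid);       /* (a1 + a2)(b1 + b2) */

    memset(out, 0, (2 * n - 1) * sizeof *out);
    for (size_t i = 0; i < plen; i++) {
        out[i] += lo[i];                          /* a1*b1 */
        out[i + 2 * h] += hi[i];                  /* a2*b2 * x^(2h) */
        out[i + h] += mid[i] - lo[i] - hi[i];     /* cross terms * x^h */
    }
    free(buf);
}
```

Multiplying 1 + 2x + 3x^2 + 4x^3 by 5 + 6x + 7x^2 + 8x^3 this way gives the coefficients 5, 16, 34, 60, 61, 52, 32. A production version would switch to the schoolbook method at a tuned cutoff rather than at length 2.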

SLIDE 60

Schoenhage-Strassen Method

◮ A polynomial of degree n is completely determined by its values at n + 1 distinct points.
◮ g(x) = f1(x) ∗ f2(x) is determined by its value at 2n points, if f1, f2 have length n.
◮ Discrete Fourier Transform chooses 2n-th roots of unity as the points to evaluate at.
◮ Compute DFT of coefficients of f1, compute DFT of coefficients of f2, multiply the 2n values, perform an inverse transform.
◮ FFT is a method for computing the DFT quickly.
◮ Schoenhage-Strassen technique works in the ring Z/(2^n + 1)Z, for which 2 is a 2n-th root of unity.
◮ Multiplications by roots of unity are now just bitshifts.
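The last two points can be checked on a small case. In Z/(2^n + 1)Z we have 2^n = -1, so 2^(2n) = 1 and every power of 2 is a 2n-th root of unity. Multiplying by such a root is then a left shift followed by folding the overflowing bits back with alternating signs. The sketch below takes n = 16 (modulus 65537) as an assumed toy size; the name ss_mul_2exp and the constants are hypothetical, and a real implementation would apply the same idea to multi-limb integers.

```c
#include <assert.h>
#include <stdint.h>

#define SS_N 16                                  /* assumed toy exponent n */
#define SS_MOD ((UINT64_C(1) << SS_N) + 1)       /* 2^16 + 1 = 65537 */

/* Compute x * 2^k mod (2^n + 1) for 0 <= x <= 2^n and 0 <= k < 2n,
   using only shifts, masks, additions and subtractions. */
uint64_t ss_mul_2exp(uint64_t x, unsigned k)
{
    assert(x < SS_MOD && k < 2 * SS_N);
    uint64_t t = x << k;                         /* below 2^48, fits in 64 bits */
    uint64_t mask = (UINT64_C(1) << SS_N) - 1;
    int64_t d0 = (int64_t)(t & mask);            /* bits 0 .. n-1      */
    int64_t d1 = (int64_t)((t >> SS_N) & mask);  /* bits n .. 2n-1     */
    int64_t d2 = (int64_t)(t >> (2 * SS_N));     /* bits 2n and above  */
    /* t = d0 + d1*2^n + d2*2^(2n), and 2^n = -1, 2^(2n) = 1 in this ring */
    int64_t r = d0 - d1 + d2;
    if (r < 0)
        r += (int64_t)SS_MOD;
    if (r >= (int64_t)SS_MOD)
        r -= (int64_t)SS_MOD;
    return (uint64_t)r;
}
```

Here ss_mul_2exp(1, 16) returns 65536, which is -1 in the ring, and shifting that result by another 16 bits returns 1, confirming that 2 has order 2n = 32.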

SLIDE 61

FFT(A, m, w):

A = vector of length m, w = primitive m-th root of unity

if (m == 1) return vector (a_0)
else {
    A_even = (a_0, a_2, ..., a_{m-2})
    A_odd = (a_1, a_3, ..., a_{m-1})
    F_even = FFT(A_even, m/2, w^2)
    F_odd = FFT(A_odd, m/2, w^2)
    F = new vector of length m
    x = 1
    for (j = 0; j < m/2; ++j) {
        F[j] = F_even[j] + x*F_odd[j]
        F[j+m/2] = F_even[j] - x*F_odd[j]
        x = x * w
    }
    return F
}
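The pseudocode above translates almost line for line into C once a concrete ring is chosen. The sketch below assumes the ring Z/65537Z, where 2 has order 32, so 2^(32/m) is a primitive m-th root of unity for any power of two m up to 32. The names FFT_P, pow_mod, fft_mod and poly_mul_fft are hypothetical, ordinary modular multiplication is used instead of shifts for brevity, and this is not the FLINT implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define FFT_P UINT64_C(65537) /* assumed toy modulus 2^16 + 1 (prime) */

/* base^exp mod FFT_P by square-and-multiply. */
uint64_t pow_mod(uint64_t base, uint64_t exp)
{
    uint64_t result = 1;
    base %= FFT_P;
    while (exp > 0) {
        if (exp & 1)
            result = result * base % FFT_P;
        base = base * base % FFT_P;
        exp >>= 1;
    }
    return result;
}

/* In-place recursive radix-2 FFT of a[0..m-1] over Z/FFT_P.
   m must be a power of two and w a primitive m-th root of unity. */
void fft_mod(uint64_t *a, size_t m, uint64_t w)
{
    if (m == 1)
        return;
    size_t h = m / 2;
    uint64_t *even = malloc(m * sizeof *even);
    assert(even != NULL);
    uint64_t *odd = even + h;
    for (size_t j = 0; j < h; j++) {
        even[j] = a[2 * j];
        odd[j] = a[2 * j + 1];
    }
    fft_mod(even, h, w * w % FFT_P);
    fft_mod(odd, h, w * w % FFT_P);
    uint64_t x = 1;
    for (size_t j = 0; j < h; j++) {
        uint64_t t = x * odd[j] % FFT_P;
        a[j] = (even[j] + t) % FFT_P;
        a[j + h] = (even[j] + FFT_P - t) % FFT_P;
        x = x * w % FFT_P;
    }
    free(even);
}

/* Product of f1 and f2 modulo x^m - 1 with coefficients modulo FFT_P.
   Both inputs are zero-padded to length m (a power of two, at most 32).
   The result is the true product when its length is at most m and its
   coefficients are below FFT_P. */
void poly_mul_fft(const uint64_t *f1, const uint64_t *f2, size_t m, uint64_t *out)
{
    assert(m >= 1 && m <= 32 && (m & (m - 1)) == 0);
    uint64_t w = pow_mod(2, 32 / m);        /* primitive m-th root of unity */
    uint64_t w_inv = pow_mod(w, m - 1);     /* w^(m-1) = w^(-1) */
    uint64_t m_inv = pow_mod(m, FFT_P - 2); /* 1/m by Fermat's little theorem */
    uint64_t *t = malloc(m * sizeof *t);
    assert(t != NULL);
    for (size_t i = 0; i < m; i++) {
        out[i] = f1[i] % FFT_P;
        t[i] = f2[i] % FFT_P;
    }
    fft_mod(out, m, w);                     /* evaluate f1 at the roots */
    fft_mod(t, m, w);                       /* evaluate f2 at the roots */
    for (size_t i = 0; i < m; i++)
        out[i] = out[i] * t[i] % FFT_P;     /* pointwise products */
    fft_mod(out, m, w_inv);                 /* inverse transform ... */
    for (size_t i = 0; i < m; i++)
        out[i] = out[i] * m_inv % FFT_P;    /* ... scaled by 1/m */
    free(t);
}
```

Padding 1 + 2x + 3x^2 + 4x^3 and 5 + 6x + 7x^2 + 8x^3 to length 8 and calling poly_mul_fft yields 5, 16, 34, 60, 61, 52, 32, 0. The scratch allocation at every level mirrors the pseudocode; an optimised version would work in place.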

SLIDE 62

What does MAGMA use?

SLIDE 66

What we do

◮ Variants of Schoenhage-Strassen and Kronecker-Schoenhage.
◮ Trick suggested by David Harvey and Paul Zimmermann for KS.
◮ Bailey's four-step algorithm.
◮ Truncated FFT (with 2-step) - Joris van der Hoeven.

SLIDE 67

FLINT vs MAGMA

SLIDE 72

Parallelisation

◮ No global or static variables.
◮ Memory management (needs to support multiple threads requesting memory).
◮ POSIX threads.
◮ The very next version of GCC will support OpenMP.
◮ Quadratic sieve can use disk-based parallelism.
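The first and third points combine naturally: if every thread receives all of its state through an argument struct, no global or static variables are needed. The sketch below splits a coefficient-wise polynomial addition across POSIX threads. The names add_job_t, add_worker, poly_add_parallel and the MAX_THREADS limit are hypothetical choices for illustration, not part of FLINT.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 8 /* hypothetical upper bound for this sketch */

/* Everything a worker needs travels in this struct: no globals, no statics. */
typedef struct {
    const int64_t *a, *b;
    int64_t *out;
    size_t start, end; /* half-open range of coefficients to process */
} add_job_t;

static void *add_worker(void *arg)
{
    add_job_t *job = arg;
    for (size_t i = job->start; i < job->end; i++)
        job->out[i] = job->a[i] + job->b[i];
    return NULL;
}

/* out[i] = a[i] + b[i] for i < len, split over nthreads POSIX threads.
   Returns 0 on success and -1 on a bad thread count or if a thread
   could not be created. */
int poly_add_parallel(const int64_t *a, const int64_t *b, int64_t *out,
                      size_t len, unsigned nthreads)
{
    pthread_t threads[MAX_THREADS];
    add_job_t jobs[MAX_THREADS];
    unsigned started = 0;
    int status = 0;

    if (nthreads == 0 || nthreads > MAX_THREADS)
        return -1;
    size_t chunk = (len + nthreads - 1) / nthreads;
    for (unsigned t = 0; t < nthreads; t++) {
        size_t start = (size_t)t * chunk;
        if (start > len)
            start = len;
        size_t end = start + chunk;
        if (end > len)
            end = len;
        jobs[t] = (add_job_t){ a, b, out, start, end };
        if (pthread_create(&threads[t], NULL, add_worker, &jobs[t]) != 0) {
            status = -1;
            break;
        }
        started++;
    }
    for (unsigned t = 0; t < started; t++)
        pthread_join(threads[t], NULL);
    return status;
}
```

Build with the compiler's pthread option (for example -pthread with GCC). Because the threads write to disjoint ranges of out, no locking is required in this particular case.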

SLIDE 75

Our hackish attempt at pthreads

◮ Frustration at the lack of open source mathematics software that uses pthreads.
◮ Read that the kernel can start 200,000 threads per second.
◮ Threads may take some time to be scheduled (real-time threads).

SLIDE 78

Some solutions?

◮ Queue of jobs from which threads can pull tasks.
◮ Threads go to sleep when there is no work and wake up when a condition is met.
◮ For some problems, threads should not be used.
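The first two solutions can be sketched together with a mutex and a POSIX condition variable: workers sleep in pthread_cond_wait while the queue is empty and are woken when a job is pushed or the queue shuts down. This is a minimal illustration under assumed names (job_queue_t, queue_push, queue_worker, square_all) and a fixed capacity, not the scheme used in FLINT.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 64   /* hypothetical fixed capacity */
#define MAX_WORKERS 8  /* hypothetical worker limit */

typedef void (*job_fn)(void *);

typedef struct {
    job_fn fn[QUEUE_CAP];
    void *arg[QUEUE_CAP];
    size_t head, count;
    int shutting_down;
    pthread_mutex_t lock;
    pthread_cond_t has_work;
} job_queue_t;

void queue_init(job_queue_t *q)
{
    q->head = q->count = 0;
    q->shutting_down = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->has_work, NULL);
}

/* Add a job and wake one sleeping worker. Returns -1 if the queue is full. */
int queue_push(job_queue_t *q, job_fn fn, void *arg)
{
    int status = -1;
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAP) {
        size_t tail = (q->head + q->count) % QUEUE_CAP;
        q->fn[tail] = fn;
        q->arg[tail] = arg;
        q->count++;
        status = 0;
        pthread_cond_signal(&q->has_work);
    }
    pthread_mutex_unlock(&q->lock);
    return status;
}

/* Tell workers to exit once the queue has been drained. */
void queue_shutdown(job_queue_t *q)
{
    pthread_mutex_lock(&q->lock);
    q->shutting_down = 1;
    pthread_cond_broadcast(&q->has_work);
    pthread_mutex_unlock(&q->lock);
}

/* Worker loop: sleep while there is no work, run jobs as they arrive. */
void *queue_worker(void *qp)
{
    job_queue_t *q = qp;
    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (q->count == 0 && !q->shutting_down)
            pthread_cond_wait(&q->has_work, &q->lock);
        if (q->count == 0) { /* shutting down and nothing left to do */
            pthread_mutex_unlock(&q->lock);
            break;
        }
        job_fn fn = q->fn[q->head];
        void *arg = q->arg[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        pthread_mutex_unlock(&q->lock);
        fn(arg); /* run the job outside the lock */
    }
    return NULL;
}

/* Example job: square a long in place. */
void square_job(void *arg)
{
    long *v = arg;
    *v = *v * *v;
}

/* Demo driver: square n values using nworkers threads. Returns 0 on success. */
int square_all(long *values, size_t n, unsigned nworkers)
{
    job_queue_t q;
    pthread_t workers[MAX_WORKERS];
    unsigned started = 0;
    if (n > QUEUE_CAP || nworkers == 0 || nworkers > MAX_WORKERS)
        return -1;
    queue_init(&q);
    while (started < nworkers &&
           pthread_create(&workers[started], NULL, queue_worker, &q) == 0)
        started++;
    for (size_t i = 0; i < n; i++)
        queue_push(&q, square_job, &values[i]);
    queue_shutdown(&q);
    for (unsigned t = 0; t < started; t++)
        pthread_join(workers[t], NULL);
    pthread_mutex_destroy(&q.lock);
    pthread_cond_destroy(&q.has_work);
    return started == nworkers ? 0 : -1;
}
```

Keeping a pool of sleeping workers avoids paying thread start-up and scheduling latency for every small task. The third point still applies: for very small jobs the locking overhead can outweigh any gain from running in parallel.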

SLIDE 79

Two Threads versus None

SLIDE 80

Four Threads versus Two Threads