slide-1
SLIDE 1

Statistics of the Universe: Exa-calculations and Cosmology's Data Deluge

Debbie Bard Matt Bellis

slide-2
SLIDE 2
Cosmology: the study of the nature and history of the Universe

  • History of the Universe is driven by competing forces:
    ○ gravitational attraction
    ○ repulsive dark energy

slide-3
SLIDE 3

How we study cosmology

  • Use computer simulations of the Universe to compare theoretical models to data.
  • Comparison of a dark-matter simulation (Bolshoi) to galaxy locations from the Sloan Digital Sky Survey (SDSS).

image credit: Nina McCurdy/University of California, Santa Cruz; Ralf Kaehler and Risa Wechsler/Stanford University; Sloan Digital Sky Survey; Michael Busha/University of Zurich

slide-4
SLIDE 4
  • Two-point function: counting galaxy pairs as a function of distance.

[Figure: two-point function, data vs. simulation, as a function of distance between galaxy pairs]

slide-5
SLIDE 5

Cosmology

  • Three-point function: counting galaxy triplets.

[Figure: three-point function, data vs. simulation, as a function of the opening angle of the triangle]

slide-6
SLIDE 6

Cosmology

  • Three-point function: counting galaxy triplets.

slide-7
SLIDE 7

Three-point function

  • New information about the topology of the Universe becomes accessible in the three-point function.
  • Can use it to distinguish between different theoretical models of cosmology.

[Figure: two-point function vs. distance between galaxy pairs and three-point function vs. opening angle of triangle, data and simulation]

Kulkarni et al., MNRAS 378 3 (2007)

slide-8
SLIDE 8

How we calculate these functions

  • Histogram according to:
    ○ distance between galaxies (two-point) → 1D histogram
    ○ triangle configuration (three-point) → 3D histogram!
  • Count pairs and triplets of galaxies (a minimal sketch follows this list):
    ○ two-point function: O(N²) [Bard, Bellis et al., Astronomy and Computing 1 (2013) 17]
    ○ three-point function: O(N³)!
  • Previously relied on approximation codes...
    ○ insufficient for precision cosmology
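To make the brute-force counting concrete, here is a minimal CPU-side sketch of the two-point step: a double loop over all galaxy pairs, histogrammed by separation. This is not the authors' code; the array layout, linear binning, and function name are illustrative assumptions.

#include <cmath>
#include <vector>

// Brute-force two-point counting: O(N^2) over all galaxy pairs, histogrammed
// by pair separation into nbins linear bins out to rmax. Illustrative sketch.
std::vector<long long> two_point_histogram(const std::vector<float> &x,
                                           const std::vector<float> &y,
                                           const std::vector<float> &z,
                                           int nbins, float rmax)
{
    std::vector<long long> hist(nbins, 0);
    const size_t n = x.size();
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = i + 1; j < n; ++j) {            // each pair counted once
            const float dx = x[i] - x[j];
            const float dy = y[i] - y[j];
            const float dz = z[i] - z[j];
            const float r = std::sqrt(dx * dx + dy * dy + dz * dz);
            const int bin = static_cast<int>(nbins * r / rmax);
            if (bin >= 0 && bin < nbins)
                ++hist[bin];
        }
    }
    return hist;
}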

slide-9
SLIDE 9

Computational challenges are growing... can GPUs help?

2015: # of galaxies = 100,000 → O(N³) = 10¹⁵ calculations (1 quadrillion)
2025: # of galaxies = 1,000,000 → O(N³) = 10¹⁸ calculations (1 quintillion! Exa-scale!)

slide-10
SLIDE 10

Histogramming

Volume of calculations. Each point represents the 3 numbers that describe the triangle formed by the galaxies indexed along each axis.

slide-11
SLIDE 11

Histogramming

One slice of the histogram calculations represents all the triangles that use one common galaxy.

slide-12
SLIDE 12

Histogramming

Each thread does calculations for one "pivot" galaxy.
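A minimal CUDA sketch of this thread-per-pivot layout (not the original kernel: the coordinate arrays, the number of bins, and binning by the three side lengths rather than the two-distances-plus-opening-angle parameterisation used elsewhere in this talk are illustrative assumptions):

#define NBINS_R 16           // illustrative number of bins per triangle side
#define RMAX    100.0f       // illustrative maximum separation

// Bin one triangle side length into a linear bin (illustrative binning).
__device__ int side_bin(float dx, float dy, float dz)
{
    float r = sqrtf(dx * dx + dy * dy + dz * dz);
    int b = (int)(NBINS_R * r / RMAX);
    return b < NBINS_R ? b : NBINS_R - 1;
}

// One thread per "pivot" galaxy i: loop over the remaining pairs (j, k) to
// complete the triplets, and increment the 3D histogram (flattened to 1D)
// of the three side-length bins. The i < j < k ordering counts each triplet once.
__global__ void three_point_pivot(const float *x, const float *y, const float *z,
                                  int n, unsigned int *hist)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // the pivot galaxy
    if (i >= n) return;

    for (int j = i + 1; j < n; ++j) {
        int b_ij = side_bin(x[i] - x[j], y[i] - y[j], z[i] - z[j]);
        for (int k = j + 1; k < n; ++k) {
            int b_ik = side_bin(x[i] - x[k], y[i] - y[k], z[i] - z[k]);
            int b_jk = side_bin(x[j] - x[k], y[j] - y[k], z[j] - z[k]);
            atomicAdd(&hist[(b_ij * NBINS_R + b_ik) * NBINS_R + b_jk], 1u);
        }
    }
}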

slide-13
SLIDE 13

Histogramming

Each thread does calculations for one "pivot" galaxy.

slide-14
SLIDE 14

Histogramming

Each thread does calculations for one "pivot" galaxy.

slide-15
SLIDE 15

Histogramming

Each thread does calculations for one "pivot" galaxy.

slide-16
SLIDE 16

Histogramming

Each thread does calculations for one "pivot" galaxy.

slide-17
SLIDE 17

Histogramming

We can choose to break up the volume of calculations into subvolumes. These subvolumes can be farmed out to different CPUs/GPUs, and the results combined.
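A hedged host-side sketch of the combining step, assuming each subvolume (e.g. each CPU or GPU chunk of the calculation) has already produced its own partial histogram; the function and variable names are illustrative:

#include <vector>

// Host-side sketch: each subvolume fills its own partial histogram; because
// the histograms are simple counts, the combined result is just the
// element-wise sum of the partial histograms.
std::vector<unsigned long long>
combine_partial_histograms(const std::vector<std::vector<unsigned long long>> &partial)
{
    std::vector<unsigned long long> total(partial.at(0).size(), 0);
    for (const auto &h : partial)                   // one histogram per subvolume
        for (size_t b = 0; b < total.size(); ++b)
            total[b] += h[b];
    return total;
}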

slide-18
SLIDE 18

Histogramming

Challenges arise if multiple threads are trying to increment the same bin.

slide-19
SLIDE 19

Histogramming issues: binning matters!

  • Finer bins are good!
    ○ discern more structure
    ○ less thread blocking
  • Finer bins are bad!
    ○ limited shared memory

Kulkarni et al., MNRAS 378 3 (2007)

Shared memory is capped at 48 kB, but 50 x 16 x 32 bins x (4 bytes) = 102 kB! (binning used in previous measurements)

slide-20
SLIDE 20

Histogramming

Large number of bins to fill if everything is kept. Do we need to keep everything?

slide-21
SLIDE 21

Histogramming

Only record part of the calculations. These samples of the full calculation are enough to test different Cosmologies.

slide-22
SLIDE 22

CPU vs GPU

CPU (desktop): Intel(R) Xeon(R) E5-1620 v2 @ 3.70 GHz
GPU: NVIDIA Tesla K40

  # galaxies    CPU time (minutes)                  GPU time (minutes)
  1,000         3.2                                 0.15
  5,000         500 (8.25 hours)                    19
  10,000        2,790 (46.5 hours)                  120
  50,000        480,000 (8,000 hours / 333 days)    20,400 (340 hours / 14 days)

  • Speedup of ~20x compared to CPU.
  • 50k-galaxy sample run on:
    ○ SLAC, 7,000 CPUs
    ○ XSEDE/Stampede, 128 GPUs
    ○ turnaround time for the researcher is 3-4 days.

slide-23
SLIDE 23

Comparison to approximation code

  • GPU is faster than the KD-tree approximation method
    ○ and it's precise!

  # galaxies    CPU time (minutes)    KD-tree (minutes)    GPU time (minutes)
  1,000         3.4                   0.9                  0.2
  5,000         300                   22                   14

  • Can improve precision in the KD-tree by using smaller leaves, but it then runs much slower (~10x).

KD-tree approximates to a level of 0.05 in each triangle parameter.

slide-24
SLIDE 24

Summary

  • Cosmology is entering the Big Data era
  • Cosmological calculations do not scale well to Big Data!
    ○ 3-point correlation function: O(N³)
  • GPUs enable precise calculations in a reasonable timeframe
    ○ 20x faster than CPU
    ○ faster than approximation code!
  • Interesting issues with histogramming.
  • Easily scales up to multi-GPU clusters
    ○ Exa-scale calculations feasible!

https://github.com/djbard/ccogs

slide-25
SLIDE 25

References

  • Fosalba, P. et al. "The MICE Grand Challenge Lightcone Simulation I: Dark matter clustering." arXiv preprint arXiv:1312.1707 (2013).
  • Kulkarni, Gauri V. et al. "The three-point correlation function of luminous red galaxies in the Sloan Digital Sky Survey." Monthly Notices of the Royal Astronomical Society 378.3 (2007): 1196-1206.
  • Kayo, Issha et al. "Three-Point Correlation Functions of SDSS Galaxies in Redshift Space: Morphology, Color, and Luminosity Dependence." Publications of the Astronomical Society of Japan 56.3 (2004): 415-423.
  • Podlozhnyuk, Victor. "Histogram calculation in CUDA." NVIDIA Corporation, White Paper (2007).
  • Bard, Deborah et al. "Cosmological calculations on the GPU." Astronomy and Computing 1 (2013): 17-22.

slide-26
SLIDE 26

Extra Slides

slide-27
SLIDE 27

Cosmology: the study of the nature and history of the Universe

  • The nature of the Dark Universe is the biggest puzzle facing scientists today.

slide-28
SLIDE 28

Dark Energy and the growth of structure

  • Dark energy affects the growth of structure over time.

These simulations were carried out by the Virgo Supercomputing Consortium using computers based at Computing Centre of the Max-Planck Society in Garching and at the Edinburgh Parallel Computing Centre. The data are publicly available at www.mpa-garching.mpg.de/galform/virgo/int_sims

slide-29
SLIDE 29

Examples of the reduced 3-point function in different triangle parameterisation binnings.

slide-30
SLIDE 30

How we calculate these functions

  • Use estimators (Landy & Szalay 1993; Szapudi & Szalay 1998), applied bin by bin (a minimal sketch follows below):

    ξ = (DD - 2DR + RR) / RR
    ζ = (DDD - 3DDR + 3DRR - RRR) / RRR

    where D denotes counts involving the data galaxies and R counts involving a random catalogue.

  • Count pairs and triplets of galaxies:
    ○ two-point function: O(N²) [Bard, Bellis et al., Astronomy and Computing 1 (2013) 17]
    ○ three-point function: O(N³)!
  • Histogram according to:
    ○ distance between galaxies (two-point) → 1D histogram
    ○ triangle configuration (three-point) → 3D histogram!

[Figure: two-point function vs. distance between galaxy pairs; three-point function vs. opening angle of triangle]
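A minimal host-side sketch of applying these estimators bin by bin, assuming the DD/DR/RR (and DDD/DDR/DRR/RRR) counts have already been normalised to comparable totals; the function and array names are illustrative:

#include <vector>

// Landy-Szalay two-point estimator, per separation bin:
//   xi = (DD - 2*DR + RR) / RR
std::vector<double> landy_szalay_xi(const std::vector<double> &dd,
                                    const std::vector<double> &dr,
                                    const std::vector<double> &rr)
{
    std::vector<double> xi(dd.size(), 0.0);
    for (size_t b = 0; b < dd.size(); ++b)
        if (rr[b] > 0.0)
            xi[b] = (dd[b] - 2.0 * dr[b] + rr[b]) / rr[b];
    return xi;
}

// Szapudi-Szalay three-point estimator, per triangle-configuration bin:
//   zeta = (DDD - 3*DDR + 3*DRR - RRR) / RRR
std::vector<double> szapudi_szalay_zeta(const std::vector<double> &ddd,
                                        const std::vector<double> &ddr,
                                        const std::vector<double> &drr,
                                        const std::vector<double> &rrr)
{
    std::vector<double> zeta(ddd.size(), 0.0);
    for (size_t b = 0; b < ddd.size(); ++b)
        if (rrr[b] > 0.0)
            zeta[b] = (ddd[b] - 3.0 * ddr[b] + 3.0 * drr[b] - rrr[b]) / rrr[b];
    return zeta;
}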

slide-31
SLIDE 31

Binning matters: histogramming is non-trivial!

  • Podlozhnyuk, Victor. "Histogram calculation in CUDA." NVIDIA Corporation, White Paper (2007).

We take a naive but maintainable/implementable approach:

  • Use shared memory for a histogram on each block.
  • Collect the entries at the end of the kernel launch.
  • Sum each block's histogram on the CPU.

We use atomicAdd so that no counts are lost when multiple threads increment the same bin (at the cost of those threads being serialized).

slide-32
SLIDE 32

Challenges of histogramming

Multiple threads want to increment the same bin.
SOLUTION: use atomics and increase the granularity of the bins.
but... increasing the granularity for the 3pt CF goes as granularity³!
Shared memory is capped at 48 kB, but 24 x 24 x 24 bins x (4 bytes) = 55 kB! Yikes!

slide-33
SLIDE 33

Histogramming bottlenecks

Unique issues with histogramming. We've tried:

  • global memory
    ○ can have very fine bins (avoids thread blocking), but data transfer is slow (see the sketch after this list).
  • shared memory
    ○ limited # of bins, so thread blocking is an issue
    ○ nevertheless, faster than using global memory!
  • __shfl
    ○ can share data between threads: only one thread per warp writes to the histogram, which avoids atomicAdd thread-locking within the warp.
    ○ but it actually takes longer to sum across the warp for all histogram bins!
  • randomising the data was vital!
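For reference, a minimal sketch of the global-memory variant in the first bullet above (the precomputed bin indices and names are illustrative assumptions): each thread issues an atomicAdd directly into a device-global histogram, which allows very fine bins but pays for slow global-memory atomics.

// Global-memory histogram variant: every increment is an atomicAdd straight
// into device global memory, so arbitrarily fine binning fits, but each
// update pays for a slow global-memory atomic. bin_idx holds precomputed
// bin indices (illustrative).
__global__ void hist_global(const int *bin_idx, int n, unsigned int *dev_hist)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAdd(&dev_hist[bin_idx[i]], 1u);
}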

slide-34
SLIDE 34

Within the kernel...

// On each block, create a histogram that is visible to all the
// threads in that block. (The shared histogram must be zeroed before
// accumulation; that initialisation is omitted from this snippet.)
__shared__ int shared_hist[NUMBER_OF_BINS];

// Run over all the calculations: i2 is the bin index computed from the
// pair/triangle parameters. Increment the appropriate bin.
atomicAdd(&shared_hist[i2], 1);

__syncthreads();

// Copy each block's shared histogram to its own section of dev_hist
// (global memory). The summation will take place on the CPU.
if (threadIdx.x == 0) {
    for (int i = 0; i < tot_hist_size; i++) {
        dev_hist[i + (blockIdx.x * NUMBER_OF_BINS)] = shared_hist[i];
    }
}
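And a hedged sketch of the CPU-side summation mentioned in the final comment above, assuming dev_hist has been copied back to host memory with one sub-histogram per block (host_hist, final_hist, and num_blocks are illustrative names, not from the original code):

// CPU-side summation: add each block's sub-histogram into the final
// histogram. host_hist holds num_blocks consecutive sub-histograms of
// NUMBER_OF_BINS bins each.
void sum_block_histograms(const int *host_hist, int num_blocks, long long *final_hist)
{
    for (int bin = 0; bin < NUMBER_OF_BINS; ++bin)
        for (int block = 0; block < num_blocks; ++block)
            final_hist[bin] += host_hist[block * NUMBER_OF_BINS + bin];
}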

slide-35
SLIDE 35

__shfl

  • The __shfl command allows register values to be shared between threads within a warp.
  • Potential value in histogramming: accumulate the bin count in only one thread
    ○ atomicAdd only once instead of once per thread.
    ○ 1 instruction for each shfl, compared to 3 for an atomicAdd.
  • Usual way: in a 32-thread warp, each thread could be incrementing any one of 640 histogram bins.
    ○ atomicAdd once on each thread → 1920 instructions.
  • With shfl: in a 32-thread warp, shuffle each histogram bin value to the lane-0 thread (see the sketch below).
    ○ still have to atomicAdd once for each histogram bin → 1952 instructions.
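A hedged sketch of the shuffle-based reduction described above (not the original kernel; the precomputed bin indices, the NUM_BINS value, and the use of __shfl_down_sync are illustrative): each bin's count is summed across the warp so that only lane 0 issues the atomicAdd.

#define NUM_BINS 640   // illustrative bin count, taken from the slide

// Shuffle-based histogram: for each bin, sum the per-thread counts across the
// warp with __shfl_down_sync so that only lane 0 issues the atomicAdd. The
// loop over all NUM_BINS per warp is exactly the extra work that made this
// approach slower in practice.
__global__ void hist_shfl(const int *bin_idx, int n, unsigned int *dev_hist)
{
    int tid  = blockIdx.x * blockDim.x + threadIdx.x;
    int lane = threadIdx.x & 31;                     // lane index within the warp
    int my_bin = (tid < n) ? bin_idx[tid] : -1;      // -1: thread has no entry

    for (int b = 0; b < NUM_BINS; ++b) {
        unsigned int v = (my_bin == b) ? 1u : 0u;    // this thread's count for bin b
        for (int offset = 16; offset > 0; offset >>= 1)
            v += __shfl_down_sync(0xffffffff, v, offset);   // warp-wide sum into lane 0
        if (lane == 0 && v > 0)
            atomicAdd(&dev_hist[b], v);              // one atomicAdd per warp per bin
    }
}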

slide-36
SLIDE 36

MICE Grand Challenge

http://arxiv.org/abs/1312.1707

slide-37
SLIDE 37

XSEDE, TACC, Stampede

Stampede cluster at TACC (Texas Advanced Computing Center), part of XSEDE. 6,400 compute nodes; 128 of the nodes have an NVIDIA K20 GPU. https://portal.tacc.utexas.edu/user-guides/stampede