www.csiro.au
Accelerating Science Platforms for Machine Learning, Big Data, and Earth System Science
John Taylor, John Zic, Jose Alvarez, Oliver Obst, George Opletal, Maciej Golebiewski, Amanda Barnard, Emlyn Jones, Josh Bowden
August 2015
university degrees
In partnership with universities, we develop 650 postgraduate research students
institutions in 14 of 22 research fields
[Map: CSIRO sites across Australia: Darwin, Alice Springs, Geraldton (2 sites), Atherton, Townsville (2 sites), Rockhampton, Toowoomba, Gatton, Myall Vale, Narrabri, Mopra, Parkes, Griffith, Belmont, Geelong, Hobart, Sandy Bay, Wodonga, Newcastle, Armidale (2 sites), Perth (3 sites), Adelaide (2 sites), Sydney (5 sites), Canberra (7 sites), Murchison, Cairns, Irymple, Melbourne (5 sites), Werribee (2 sites), Brisbane (6 sites), Bribie Island]
CSIRO Computational and Simulation Sciences/IMT
100+ accelerated systems now on Top500 list
1/3 of total FLOPS powered by accelerators
NVIDIA Tesla GPUs sweep 23 of 24 new accelerated supercomputers
Tesla supercomputers growing at 50% CAGR
[Figure: Top500: # of Accelerated Supercomputers, 2013 to 2015. Source: NVIDIA, TOP500 List]
[Figure: TOP500 rank and Green500 rank, 2010/11 to 2015/11]
*Alvarez and Petersson, Simplifying ConvNets for End-to-End Learning. To appear
NETWORK          NUMBER OF PARAMETERS   NUMBER OF   TOP-1 ACCURACY
AlexNet OWT Bn   61M                    5           57.9%
B-NET (VGG-B)    133M                   10          62.5%
OURS*            15M                    16          66.6%
NETWORK          NUMBER OF PARAMETERS   NUMBER OF   TOP-1 ACCURACY
AlexNet OWT Bn   58.6M                  5           44.5%
B-NET (VGG-B)    130M                   10          44.0%
OURS*            10.2M                  16          47.4%
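As a rough reading of the two comparison tables above, the Python sketch below works out how many times fewer parameters the OURS* network uses than each baseline, and the difference in top-1 accuracy in percentage points. The figures are copied from the tables; the labels `table_1` and `table_2` and the function name `compare_to_ours` are illustrative choices for this sketch, not names from the original work.

```python
# (parameters in millions, top-1 accuracy in %) copied from the two tables.
# "table_1" and "table_2" are labels chosen for this sketch only.
TABLES = {
    "table_1": {
        "AlexNet OWT Bn": (61.0, 57.9),
        "B-NET (VGG-B)": (133.0, 62.5),
        "OURS": (15.0, 66.6),
    },
    "table_2": {
        "AlexNet OWT Bn": (58.6, 44.5),
        "B-NET (VGG-B)": (130.0, 44.0),
        "OURS": (10.2, 47.4),
    },
}


def compare_to_ours(table):
    """Map each baseline to (parameter ratio, accuracy gain in points)."""
    ours_params, ours_acc = table["OURS"]
    return {
        name: (params / ours_params, ours_acc - acc)
        for name, (params, acc) in table.items()
        if name != "OURS"
    }


if __name__ == "__main__":
    for label, table in TABLES.items():
        for name, (ratio, gain) in compare_to_ours(table).items():
            print(f"{label}: OURS uses {ratio:.1f}x fewer parameters "
                  f"than {name}, top-1 {gain:+.1f} points")
```

In both tables the smaller network is also the more accurate one, which is the point the slides appear to make.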
modelling of nanoparticle self-assembly is computationally prohibitive.
nanoparticles are dominated by surface electrostatic forces, and thus internal bonding can be neglected.
grained surface point mesh model.
with Protoparticles (SNAP) package.
[Figure: atomistic nanoparticle and its surface mesh representation]
to drug delivery, J. Med. Applied. Sci. 2(2) 2012, 31-40.
Analyser: Analysis of the final configuration and dynamical evolution of particle assembly.
Simulator: Usually an NVT simulation quenched to produce a particle aggregate.
Generator: Designs particles, initial configuration and potentials.
[Figure: interfacial probability versus time (ps) for (100)|(100) and (111)|(111) facet pairs]
different orientations calculated via ab-initio methods.
potentials with parameters for each pair of facet combination interactions. The parameters are then distributed over a facet’s points.
surfaces (hydroxylation, hydrogenation etc).
harmonic potential.
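The fragments above mention pair potentials whose parameters are spread over a facet's points, and a harmonic potential. The Python sketch below illustrates both ideas under stated assumptions: the even split across point pairs is a guess made for illustration (the slides do not say how the parameters are actually distributed), and the function names and numbers are hypothetical rather than taken from the SNAP package.

```python
def per_point_pair_energy(facet_pair_energy, n_points_a, n_points_b):
    """Evenly split one facet-pair energy across all point pairs.

    Assumed scheme, for illustration only: every point on facet A interacts
    with every point on facet B, giving n_points_a * n_points_b pairs.
    """
    if n_points_a <= 0 or n_points_b <= 0:
        raise ValueError("each facet needs at least one point")
    return facet_pair_energy / (n_points_a * n_points_b)


def harmonic_energy(r, k, r0):
    """Harmonic potential U(r) = 0.5 * k * (r - r0)**2."""
    return 0.5 * k * (r - r0) ** 2


def harmonic_force(r, k, r0):
    """Force along r: F(r) = -dU/dr = -k * (r - r0)."""
    return -k * (r - r0)


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the slides.
    share = per_point_pair_energy(-2.0, 10, 10)
    print("per point-pair share:", share)
    print("sum over all pairs:", share * 10 * 10)
    print("harmonic energy and force:",
          harmonic_energy(1.2, 5.0, 1.0), harmonic_force(1.2, 5.0, 1.0))
```

With an even split, summing the point-pair terms recovers the facet-pair value, which is one simple way to keep the coarse-grained model consistent with the ab-initio input.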
[Figure panels: Clean; Hydrogen Passivation; Hydroxyl functionalization]
[Figure: 9 GPUs CUDA-MPI Versus Serial CPU Code. Steps/sec versus number of nanodiamonds (1000 to 100000); series: CUDA-MPI 9 GPUs, Serial CPU]
Reads in output from the Simulator and performs a variety of analyses, including:
Often analysis is dynamical (as a function of time)
[Figure: interfacial probability versus time (ps) for (100)|(100) and (110)|(110) facet pairs]
Void locations where a 3.2 nm particle could fit
CUBE OCTAHEDRON RHOMBIC DODECAHEDRON
Particle Geometry Particle Size Particle Density Particle Geometry Composition Surface Functionalization
100-100 facet binding energy
CUBE, 100 facets
CSIRO Bragg GPU Cluster (6 GPUs used over 15 hours), 5832 particles × 570 points, 664 Å, 150,000 × 1 fs steps
OCTAHEDRON, 111 facets
CSIRO Bragg GPU Cluster (6 GPUs used over 15 hours), 5832 particles × 544 points, 664 Å, 150,000 × 1 fs steps
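To put the run descriptions above in perspective, the Python sketch below derives a few totals from the quoted figures: the total number of surface points, the simulated time, the GPU-hours consumed, and an average step rate. The step rate assumes the whole 15 hours was spent time-stepping, which the slides do not state; the function and key names are illustrative.

```python
FS_PER_PS = 1000.0
SECONDS_PER_HOUR = 3600.0


def run_summary(particles, points_per_particle, steps, timestep_fs, gpus, hours):
    """Derive headline totals for one simulation run."""
    return {
        "surface_points": particles * points_per_particle,
        "simulated_ps": steps * timestep_fs / FS_PER_PS,
        "gpu_hours": gpus * hours,
        # Assumes the full wall time was spent stepping (not stated in the slides).
        "steps_per_second": steps / (hours * SECONDS_PER_HOUR),
    }


# Figures quoted on the slides for the cube and octahedron runs.
cube = run_summary(5832, 570, 150_000, 1.0, 6, 15)
octahedron = run_summary(5832, 544, 150_000, 1.0, 6, 15)

if __name__ == "__main__":
    for name, summary in (("cube", cube), ("octahedron", octahedron)):
        print(name, summary)
```

Both runs cover 150 ps of simulated time and use 90 GPU-hours; the cube run tracks 3,324,240 surface points and the octahedron run 3,172,608.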
(30%)
interaction points)
6 GPUs over 130 hours
RED – (100) BLUE – (111) GREEN – (110)
[Figure: interfacial probability by nanoparticle facet: 22Å (100), 22Å (111), 22Å (110), 27Å (100), 27Å (111), 27Å (110), 32Å (100), 32Å (111); series: All 32Å, Mixed Sizes]
[Figure: pore size distribution (cm3/g.Å) versus pore diameter (Å)]
[Figure: number of nanoparticles (out of 5000) versus number of q6·q6 interactions; series: All 32Å, Mixed Sizes]
Mixtures produce larger pore sizes
Mixtures are more “random”
All 32Å Mixed Sizes
Largest ‘111’ facets dominate interaction
Polydisperse aggregate
Porous Cages Mega-clusters
“We performed the largest self-assembly simulation of organic cages”
Evans et al. Journal of Physical Chemistry C, 2015, DOI: 10.1021/jp512944r
Wall time reduced from 100 to 15 hours using GPUs
DAP pulsar repository Compute on Bragg Cluster
More information: josh.bowden@csiro.au
[Figure: MAGMA magma_2stage_syevdx() and MAGMAMIC magma_dsyevd() speedup over 16 core Sandybridge MKL dsyevr() R function eigen(). Speedup versus problem size (N = M = K, 5000 to 45000); series: 3 K20, 2 K20, 1 K20, MIC (7120)]
The functionality is being incorporated into an R package used for predictive genomic modelling from large sequencing datasets.
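The benchmark above concerns the symmetric eigenvalue problem, the operation behind R's eigen() for symmetric input. As a minimal CPU-only sketch of what is being timed, the Python code below builds a random symmetric matrix, decomposes it with `numpy.linalg.eigh` (which, per the NumPy documentation, calls the LAPACK `_syevd` family of routines), checks the result, and defines speedup as baseline time divided by accelerated time. It does not use MAGMA, a GPU, or a Xeon Phi; the helper names and the matrix size are assumptions for illustration.

```python
import time

import numpy as np


def random_symmetric(n, seed=0):
    """Build a reproducible n x n real symmetric test matrix."""
    rng = np.random.default_rng(seed)
    m = rng.standard_normal((n, n))
    return (m + m.T) / 2.0


def timed_eigh(a):
    """Return eigenvalues (ascending), eigenvectors, and elapsed seconds."""
    start = time.perf_counter()
    eigenvalues, eigenvectors = np.linalg.eigh(a)
    elapsed = time.perf_counter() - start
    return eigenvalues, eigenvectors, elapsed


def relative_residual(a, eigenvalues, eigenvectors):
    """||A V - V diag(w)|| / ||A||, small when the decomposition is accurate."""
    return (np.linalg.norm(a @ eigenvectors - eigenvectors * eigenvalues)
            / np.linalg.norm(a))


def speedup(baseline_seconds, accelerated_seconds):
    """Speedup as plotted: baseline time divided by accelerated time."""
    return baseline_seconds / accelerated_seconds


if __name__ == "__main__":
    a = random_symmetric(500)  # far smaller than the sizes in the figure
    w, v, seconds = timed_eigh(a)
    print(f"n=500 took {seconds:.3f} s, "
          f"residual {relative_residual(a, w, v):.2e}")
```

Timing the same matrix with two different back ends and passing both times to `speedup` gives the quantity shown on the figure's vertical axis.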
[Figure: soil carbon (t/ha) versus year, 1980 to 2010, in three panels: Field 1 (Wheat-Wheat), Field 2 (Wheat-Fallow), Field 3 (Wheat-Pasture)]
CSIRO Computational and Simulation Sciences
“We’ve started to use the GPU cluster to speed up modelling of nuclear analysers such as CSIRO’s air cargo scanner. The speed is up to 5,000 to 10,000 times that of a normal desktop computer if we use most of the cluster. With this performance increase, simulations that normally take hours can be run interactively in real-time. We expect this interactivity to significantly benefit the design and
instruments.”
Data61
John A. Taylor
t +61 2 6216 7077
e John.Taylor@data61.csiro.au
w www.csiro.au