NANOMATERIALS DISCOVERY Michael Fernandez | OCE-Postdoctoral Fellow

SLIDE 1

www.data61.csiro.au

DATA ANALYTICS IN NANOMATERIALS DISCOVERY

Michael Fernandez | OCE-Postdoctoral Fellow September 2016

SLIDE 2

Data Analytics for Nanomaterials| Michael Fernandez 3 |

Materials Discovery Process

Integrating computational methods and information with sophisticated computational and analytical tools to shorten the duration of materials development from 10-20 years to 2 or 3 years.

Materials Genome Project

SLIDE 3

Materials and Molecular Modeling

  • Deep learning and GPU computations

Chris Watkins

  • Big data and HPC integration

Piotr Szul Yulia Arzhaeva

Collaborators Team leader

Amanda Barnard Sun Baichuan

SLIDE 4

Outline

  • Materials Discovery Process
  • Methods for atomistic simulations of materials
  • Experimental confirmation
  • Data-driven Computational Nanomaterials Discovery
  • Hypothetical material space sampling (structure generation and statistics)
  • In silico high-throughput characterization (atomistic simulations and machine learning)

  • Data storage, analytics, exploitation and integration

SLIDE 5

Atomistic Simulations of Materials

Theory can predict properties of materials

$$H = E_{\mathrm{nuclei}}(\{R_I\}) - \sum_{i=1}^{N_e} \nabla_i^2 + V_{\mathrm{nuclei}}(r_i) + \frac{1}{2} \sum_{i \neq j}^{N_e} \frac{1}{|r_i - r_j|}$$

  • Quantum chemistry methods can cover any chemistry
  • Empirical potentials exist for a large number of elements
  • Computation is scalable and generically deployable

SLIDE 6

Atomistic Simulations of Materials

Computational predictions later confirmed by experiments

  • Self-assembly mechanism of nanodiamonds

Barnard, A. S. et al. Nanoscale 3, 958–62 (2011)

The arrow indicates (111)|(111) interface between two 4 nm sized nanodiamonds. DFTB simulations of the surface electrostatic potential of dodecahedral diamond nanoparticles of a) 2.2 nm and b) 2.5 nm

SLIDE 7

Modern materials discovery cycle

[Figure labels: Materials libraries; Experiment design; Database; Performance measurement; Lead materials scale up; Knowledge for rational design; Banks of materials; Hypothetical materials; In-silico screening]

  • Polydispersive systems
  • Exponential increase in complexity and diversity
  • Nearly infinite combinatorial problem

Potyrailo, R. et al. ACS Comb. Sci. 13, 579–633 (2011).

Nano-

Theory, modeling and informatics

SLIDE 8

Nanomaterials Screening

Departing from the Edisonian approach

vs.

We can accurately predict a property, so it can be computed for entire materials spaces

Combinatorial in silico design

In silico structure generation

SLIDE 9

Nanomaterials Screening

  • Systematic and extensive materials performance ranking.
  • “Big data” discovery of structure-property relationships in unknown materials domains.
  • Accelerated identification of high potential candidates and rational design principles.

SLIDE 10

Data Analytics Challenges

Information representation

  • Fingerprints

Information extraction

  • Multivariate statistical analysis

Knowledge discovery

  • Data mining and machine learning

Knowledge representation

  • Visualization

SLIDE 11

Polydispersity Challenge in Nanomaterials

[Figure labels: Polydispersive sample; Quasi-monodisperse sample; Purification; Controlled synthesis; $$$]

  • Polydispersity can be detrimental for high-performance applications
  • Purification of polydispersive nanoparticle samples is expensive

SLIDE 12

Data Analytics of Nanocarbons

Nanodiamonds Graphenes

Virtual structures relaxed using DFTB

SLIDE 13

Archetypal Analysis (AA)

The predictors of X_i are finite mixtures of archetypes Z_j, which are convex combinations of the observations.

Archetypal analysis finds a k × m matrix Z that corresponds to the archetypal or “pure patterns” in the data, in such a way that each data point can be represented as a mixture of those archetypes. In other words, it yields the two n × k coefficient matrices α and β that minimize the residual sum of squares.

Cutler, A. & Breiman, L. Archetypal Analysis. Technometrics 36, 338–347 (1994).
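
A minimal sketch of the objective described on this slide, not the authors' implementation. It assumes the usual formulation RSS = ||X - αZ||² with archetypes Z = βᵀX, written here so that the shapes match the k × m and n × k sizes quoted above. The function name `archetypal_rss` and the toy data are illustrative only, and the code evaluates the RSS for given weights rather than optimizing them.

```python
import numpy as np

def archetypal_rss(X, alpha, beta):
    """Residual sum of squares for an archetypal decomposition.

    X     : (n, m) data matrix, one observation per row
    beta  : (n, k) weights; each column is a convex combination of observations
    alpha : (n, k) weights; each row is a convex mixture of archetypes
    """
    X, alpha, beta = (np.asarray(a, dtype=float) for a in (X, alpha, beta))
    # Convexity constraints: non-negative weights that sum to one.
    assert np.all(alpha >= 0) and np.allclose(alpha.sum(axis=1), 1.0)
    assert np.all(beta >= 0) and np.allclose(beta.sum(axis=0), 1.0)
    Z = beta.T @ X            # (k, m) archetypes
    residual = X - alpha @ Z  # (n, m) reconstruction error
    return Z, float(np.sum(residual ** 2))

# Toy example: three points on a line; the two end points act as archetypes.
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
beta = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])   # pick rows 0 and 2
alpha = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])  # middle point is a 50/50 mix
Z, rss = archetypal_rss(X, alpha, beta)
```

In this toy case the middle point is an exact mixture of the two extremes, so the reconstruction is exact and the RSS vanishes. Real data would need an optimizer that alternates over α and β under the same constraints.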

SLIDE 14

Archetypal Analysis of Nanocarbons

Nanodiamonds

Fernandez, M. & Barnard, A. S. ACS Nano 9, 11980–11992 (2015).

Graphene nanoflakes

SLIDE 15

Nanocarbons Prototypes

Nanodiamonds prototypes

Fernandez, M. & Barnard, A. S. ACS Nano 9, 11980–11992 (2015).

Graphene prototypes

SLIDE 16

Estimation of Nanodiamonds Properties

Fernandez, M. & Barnard, A. S. ACS Nano 9, 11980–11992 (2015).

SLIDE 17

Structural Diversity Challenge

Defects, oxidation and edge passivation yield large structural diversity

Graphene nanoflakes

Trigonal Rectangular Hexagonal

SLIDE 18

P P P

Silicon qubits

Structural Diversity Challenge

Single Si substitution by P yields a large structural diversity

SLIDE 19

Metal-Organic Framework (MOF)

nitroimidazole benzimidazole

Zn2+

CO2 capture and sequestration (Science, 2008)

ZIF-68

Structural Diversity Challenge

SLIDE 20

Metal-Organic Framework (MOF) In-silico Combinatorial design

Structural Diversity Challenge

SLIDE 21

Metal-Organic Framework (MOF)

Modification with 35 functional groups gives a total of ~1.5 million (35^4) unique combinations
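
The count on this slide can be checked with a few lines of Python. This is only a sketch: it assumes four independent substitution sites (read off the exponent in 35^4), and the labels `G0` to `G34` are hypothetical placeholders for the real functional groups.

```python
from itertools import islice, product

N_GROUPS = 35   # functional groups quoted on the slide
N_SITES = 4     # substitution sites implied by the exponent in 35^4

total = N_GROUPS ** N_SITES  # 1,500,625, i.e. ~1.5 million

def enumerate_candidates(groups, n_sites):
    """Lazily yield every assignment of a functional group to each site."""
    return product(groups, repeat=n_sites)

# Hypothetical labels G0..G34 stand in for the real functional groups.
groups = [f"G{i}" for i in range(N_GROUPS)]
first_three = list(islice(enumerate_candidates(groups, N_SITES), 3))
```

Because `product` is lazy, the full space can be streamed into a screening step without holding all 1.5 million tuples in memory.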

MOF ZBP Organic Linkers

Structural Diversity Challenge

SLIDE 22

Machine Learning Approach

Machine learning prediction of functional properties

Machine learning Feature fingerprints

SLIDE 23

Data Analytics Challenge

Binary decision tree of the Band Gap of graphene

Fernandez, M., Shi, H. & Barnard, A. S. Carbon (2016). doi:10.1016/j.carbon.2016.03.005

Features

  • Surface area
  • Number of atoms
  • Shape aspect ratio

Accuracy 80%

SLIDE 24

Machine Learning vs. Atomistic Simulations

Estimation of the graphene Band Gap from topological features

P_i and P_j are the bond-order values of carbon atoms i and j in graphene, L is the topological distance, and δ_ij is a delta function.

Fernandez, M. et al. ACS Comb. Sci. (2016) doi:10.1021/acscombsci.6b00094
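
The formula itself did not survive on this slide, so the following is a sketch of a generic topological autocorrelation descriptor, not necessarily the exact definition used in the cited paper. It assumes the delta function selects atom pairs whose shortest bond path has length L and sums P_i · P_j over those pairs. The function names, the toy chain and the property values are all made up for illustration.

```python
from collections import deque

def topological_distances(adjacency, source):
    """Breadth-first search: number of bonds from `source` to every atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in dist:
                dist[neighbour] = dist[atom] + 1
                queue.append(neighbour)
    return dist

def topological_autocorrelation(adjacency, prop, max_lag):
    """Sum P_i * P_j over unordered atom pairs whose bond-path length equals L."""
    scores = [0.0] * (max_lag + 1)
    for i in adjacency:
        for j, d in topological_distances(adjacency, i).items():
            if i < j and 1 <= d <= max_lag:   # count each pair once
                scores[d] += prop[i] * prop[j]
    return scores[1:]   # entries for L = 1 .. max_lag

# Toy 4-atom chain 0-1-2-3 with made-up per-atom property values.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
prop = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}
ta = topological_autocorrelation(adjacency, prop, max_lag=3)
```

The result is a fixed-length vector (one entry per L) regardless of the number of atoms, which is what makes such descriptors usable as machine learning features.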

SLIDE 25

Machine Learning of Graphene

Fernandez, M.; Shi, H.; Barnard, A. et al. J. Chem. Inf. Model. (2015), 55, 2500-2506

Radial Distribution Function (RDF) scores for graphene

The summation runs over the N atom pairs in the graphene structure, r_ij is the distance between the atoms of each pair, and B is a smoothing parameter set to 10.
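
The RDF formula is missing from the extracted slide. The sketch below assumes a commonly used Gaussian-smoothed form, RDF(R) = f · Σ exp(-B (R - r_ij)²) over atom pairs, with B = 10 as stated above. The scaling factor `scale`, the function name and the toy coordinates are assumptions for illustration.

```python
import math
from itertools import combinations

def rdf_score(coords, R, B=10.0, scale=1.0):
    """Gaussian-smoothed count of atom pairs whose separation is close to R."""
    total = 0.0
    for a, b in combinations(coords, 2):   # every unordered atom pair
        r_ij = math.dist(a, b)
        total += math.exp(-B * (R - r_ij) ** 2)
    return scale * total

# Toy fragment: three collinear atoms 1.42 apart (roughly a C-C bond length in angstrom).
coords = [(0.0, 0.0, 0.0), (1.42, 0.0, 0.0), (2.84, 0.0, 0.0)]
scores = [rdf_score(coords, R) for R in (1.42, 2.0, 2.84)]
```

Evaluating the score on a grid of R values gives a fixed-length fingerprint: here it peaks near 1.42 (two pairs) and 2.84 (one pair) and is small in between.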

SLIDE 26

Machine Learning vs. Atomistic Simulations

Ionization Potential

Fernandez, M.; Shi, H.; Barnard, A et al. J. Chem. Inf. Model. (2015), 55, 2500-2506

Energy of the Fermi level From RDF scores

SLIDE 27

Machine Learning vs. Atomistic Simulations

Machine learning prediction of gas adsorption in MOFs

Fernandez, M., et al. J. Phys. Chem. C 117, 14095–14105 (2013).

SLIDE 28

Accuracy and Coverage Challenges

Accuracy of electronic calculation methods vs. system size

  • DFTB / Semiempirical: 5000 atoms
  • Density Functional: 500 atoms
  • Quantum Monte Carlo: 100 atoms
  • Coupled Cluster: 20 atoms

[Figure axes: Accuracy vs. System size]

SLIDE 29

Data-driven Challenge

Machine learning for large material spaces ∆3

Machine learning predictions:

  • Functional property value or threshold
  • Accuracy of different quantum-chemistry methods

[Diagram labels: = f(structure); Machine Learning; Partial Database Screening; Full Database Predictions]

Ramakrishnan, R. et al. J. Chem. Theory Comput. 11, 2087–2096 (2015).
Fernandez, M. et al. J. Chem. Inf. Model. (2015), 55, 2500-2506

SLIDE 30

Challenges and Limitations

Accuracy Gap Between Different Levels of Theory

[Figure: energy (E) at the QMC and B3LYP levels for 6,095 isomers of C7H10O2; the difference between the two levels is labelled "big gap".]

SLIDE 31

Accuracy Gap Predictions

Predictions Machine learning calibration
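
One way to read "machine learning calibration" is as learning the gap between a cheap and an accurate level of theory on a small subset, then adding the predicted gap to the cheap results everywhere. The sketch below illustrates that idea only: the data are synthetic, the single descriptor is hypothetical, and ordinary linear least squares stands in for whatever model the talk actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: one descriptor per structure, a cheap-method energy and
# an accurate-method energy whose gap depends linearly on the descriptor.
n = 200
descriptor = rng.uniform(0.0, 1.0, size=n)
e_cheap = rng.normal(0.0, 1.0, size=n)
e_accurate = e_cheap + 0.5 + 2.0 * descriptor   # assumed gap: 0.5 + 2.0 * x

# Only a small subset is "computed" at the accurate level.
train = np.arange(20)
A = np.column_stack([np.ones(train.size), descriptor[train]])
gap = e_accurate[train] - e_cheap[train]
coef, *_ = np.linalg.lstsq(A, gap, rcond=None)   # fit gap = c0 + c1 * x

# Calibrate the cheap results for the full set with the learned gap.
e_calibrated = e_cheap + coef[0] + coef[1] * descriptor
max_error = float(np.max(np.abs(e_calibrated - e_accurate)))
```

The synthetic gap is noise-free and exactly linear, so the fit recovers it and `max_error` is at the level of floating-point rounding. With real data the gap model would be nonlinear and the residual error would need to be validated on held-out structures.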

SLIDE 32

Data-driven High-throughput Screening

[Workflow diagram labels: Input jobs; Queue management; Computational resources; Finished runs; Resubmit or kill failed runs; Outputs; Data storage; Accuracy refinement]
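
The "resubmit or kill failed runs" step of the workflow can be sketched as a small retry loop. This is an illustration under stated assumptions, not the group's actual infrastructure: `run_queue`, `fake_simulation`, the job names and the limit of three attempts are all hypothetical.

```python
from collections import deque

def run_queue(jobs, run_job, max_attempts=3):
    """Run jobs from a FIFO queue; resubmit failures until max_attempts, then kill."""
    queue = deque((job, 1) for job in jobs)
    finished, killed = {}, []
    while queue:
        job, attempt = queue.popleft()
        try:
            finished[job] = run_job(job, attempt)   # store output of a finished run
        except RuntimeError:
            if attempt < max_attempts:
                queue.append((job, attempt + 1))    # resubmit
            else:
                killed.append(job)                  # give up on this run
    return finished, killed

def fake_simulation(job, attempt):
    """Hypothetical stand-in for an atomistic simulation."""
    if job == "flaky" and attempt < 2:
        raise RuntimeError("did not converge")
    if job == "broken":
        raise RuntimeError("always fails")
    return f"{job}: ok after {attempt} attempt(s)"

finished, killed = run_queue(["stable", "flaky", "broken"], fake_simulation)
```

In a production setting the in-memory queue would be replaced by a scheduler on the HPC resource, and the `finished` dictionary by the data storage layer.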

SLIDE 33

Self-Organizing Map (SOM) of NPs

Ag-NP

SOM

Electrostatic Potential

SLIDE 34

Deep Learning for Nanomaterials

Feature extraction

  • Input image 32x32
  • 5x5 convolution: C1 feature maps 28x28
  • 2x2 subsampling: S1 feature maps 14x14
  • 5x5 convolution: C2 feature maps 10x10
  • 2x2 subsampling: S2 feature maps 5x5

Classification

  • Fully connected: B1, B2
  • Output
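
The feature-map sizes on this slide follow from simple arithmetic. The sketch below assumes stride-1 convolutions without padding and non-overlapping subsampling windows; these assumptions are not stated on the slide, but they reproduce its numbers. The helper names are illustrative.

```python
def conv_output(size, kernel):
    """Side length after a 'valid' convolution with stride 1 and no padding."""
    return size - kernel + 1

def subsample_output(size, window):
    """Side length after non-overlapping subsampling (pooling)."""
    return size // window

layers = [("C1", conv_output, 5), ("S1", subsample_output, 2),
          ("C2", conv_output, 5), ("S2", subsample_output, 2)]

size = 32                      # input image is 32x32
sizes = {"input": size}
for name, op, k in layers:
    size = op(size, k)
    sizes[name] = size
```

Running it gives 28, 14, 10 and 5 for C1, S1, C2 and S2, matching the feature-map sizes listed above.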

SLIDE 35

www.data61.csiro.au

Data61
Michael Fernandez
e michael.fernandezllamosa@csiro.au
w https://research.csiro.au/mmm/

Thank you