AI: From Phenomics to Genomics. Xin Wu, CTO, GrandOmics, 2019-12-16



SLIDE 1

Xin Wu, CTO, GrandOmics 2019-12-16

AI: From Phenomics to Genomics

www.grandomics.com, Room 311, Qidian Center (奇点中心), Changping District, Beijing

SLIDE 2

AI: From Phenomics to Genomics

1. GrandOmics
2. Extremely Simplified Life Science
3. AI in Phenomics
4. AI in Genomics

SLIDE 3

GrandOmics

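Timeline axis: 2008, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019. Events, in the order they appear on the slide:

  • Helicos BioSciences launches the first single-molecule sequencer, HeliScope
  • Pacific Biosciences begins commercial sales of the PacBio RS system (75,000 ZMWs)
  • Developing third-generation sequencing technology and preparing third-generation sequencing services
  • Oxford Nanopore Technologies (ONT) launches its first nanopore sequencer, MinION
  • March 11, 2013: officially becomes China's first third-generation sequencing service company
  • PacBio releases the PacBio RS II (150,000 ZMWs)
  • GrandOmics (希望组) founded
  • PacBio releases the new Sequel sequencer (150,000 ZMWs)
  • November 2016: third-generation sequencing precision medicine launched
  • Chinese Family No. 1 (中华家系1号) reference material project launched
  • Becomes the world's first genetic disease diagnostics company based on third-generation sequencing
  • Becomes, in turn, the world's largest PacBio sequencing center and nanopore sequencing center
  • ONT launches the new benchtop nanopore sequencer GridION
  • May 2018: Huaxia 10,000-person SV project (华夏万人SV项目) launched
  • Strengthening clinical deployment and opening up international markets
  • ONT launches its first high-throughput nanopore sequencer, PromethION
  • The world's first PromethION48 is installed at GrandOmics and sets the world record for the highest single-day, single-instrument sequencing output (actual production output)
  • Strategic alliance established with ONT
  • ONT launches the PromethION48, which runs 48 flow cells at once with a single-run yield of 7.6T
  • Nextomics (未来组) founded
  • November 2015: first Chinese reference genome from third-generation sequencing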

SLIDE 4

GrandOmics (Cont.)

SLIDE 5

Extremely Simplified Life Science

Genomics Proteomics Phenomics

SLIDE 6

Extremely Simplified Life Science (Cont.)

Chr1: 250M bp, 3,000+ genes

SLIDE 7

Extremely Simplified Life Science (Cont.)

SLIDE 8

Extremely Simplified Life Science (Cont.)

SLIDE 9

Phenomics

SLIDE 10

AI, Machine learning and deep learning

SLIDE 11

We Are Talking About Deep Learning

SLIDE 12

SLIDE 13

SLIDE 14

SLIDE 15

SLIDE 16

SLIDE 17

SLIDE 18

AI in Phenomics

SLIDE 19

SLIDE 20

SLIDE 21

SLIDE 22

Genomics

SLIDE 23

Genomics (Cont.)

SLIDE 24

AI in Genomics

The read and reference data are encoded as an image for each candidate variant site. A trained CNN calculates the genotype likelihoods for each site. A variant call is emitted if the most likely genotype is heterozygous or homozygous non-reference.

Training the CNN reuses the DeepVariant machinery to generate pileup images for a sample with known genotypes. These labeled image + genotype pairs, along with an initial CNN, which can be a random model, a CNN trained for other image classification tests, or a prior DeepVariant model, are used to optimize the CNN parameters to maximize genotype prediction accuracy using a stochastic gradient descent algorithm. After a maximum number of cycles or time has elapsed or the model’s performance has converged, the final trained model is frozen and can then be used for variant calling.

The reference and read bases, quality scores, and other read features are encoded into a red–green–blue (RGB) pileup image at a candidate variant. This encoded image is provided to the CNN to calculate the genotype likelihoods for the three diploid genotype states of homozygous reference (hom-ref), heterozygous (het) or homozygous alternate (hom-alt).
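A minimal sketch of the two ends of the pipeline described above: encoding a pileup as RGB pixels, and turning the CNN's three outputs into a call. The CNN itself is omitted. The channel layout (red for base identity, green for base quality, blue for match with the reference), the intensity values, and the function names `encode_pileup` and `call_variant` are all illustrative assumptions, not DeepVariant's actual encoding or API.

```python
import math

# The three diploid genotype states named in the slide text.
GENOTYPES = ("hom-ref", "het", "hom-alt")

# Hypothetical per-base intensities for the red channel (assumption).
BASE_INTENSITY = {"A": 250, "G": 180, "T": 100, "C": 30}


def encode_pileup(reference, reads, quals, max_qual=40):
    """Encode aligned reads as rows of (R, G, B) pixels, one row per read.

    R = base identity, G = base quality scaled to 0..255,
    B = 255 where the read base matches the reference, else 0.
    This layout is an assumption chosen for illustration.
    """
    image = []
    for read, read_quals in zip(reads, quals):
        row = []
        for ref_base, base, q in zip(reference, read, read_quals):
            red = BASE_INTENSITY.get(base, 0)
            green = 255 * min(q, max_qual) // max_qual
            blue = 255 if base == ref_base else 0
            row.append((red, green, blue))
        image.append(row)
    return image


def softmax(logits):
    """Convert raw network outputs into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def call_variant(logits):
    """Pick the most likely genotype; emit a call unless it is hom-ref."""
    likelihoods = softmax(logits)
    best = max(range(len(GENOTYPES)), key=lambda i: likelihoods[i])
    genotype = GENOTYPES[best]
    emit = genotype != "hom-ref"
    return genotype, emit, likelihoods


if __name__ == "__main__":
    reference = "ACGT"
    reads = ["ACGT", "ATGT", "ATGT"]
    quals = [[40, 40, 40, 40], [30, 20, 30, 30], [40, 40, 10, 40]]
    image = encode_pileup(reference, reads, quals)
    print(len(image), "rows x", len(image[0]), "columns")

    # Stand-in for CNN output at this site (made-up numbers).
    genotype, emit, likelihoods = call_variant([0.2, 2.5, 0.1])
    print(genotype, emit)
```

In a real system the image would be fed to a trained CNN, and the made-up logits passed to `call_variant` here would be that network's output for the candidate site.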

SLIDE 25

AI in Genomics (Cont.)

SLIDE 26

AI in Genomics (Cont.)

SLIDE 27

AI in Genomics (Cont.)

Training Set

SLIDE 28

AI in Genomics (Cont.)

Training Set More Accurate Illumina Data

SLIDE 29

AI in Genomics (Cont.)

SLIDE 30

AI in Genomics (Cont.)

SLIDE 31

SLIDE 32

SLIDE 33

SLIDE 34

AI in Genomics (Cont.)

SLIDE 35

AI in Genomics (Cont.)

SLIDE 36

AI in Genomics (Cont.)

SLIDE 37

AI in Genomics (Cont.)

SLIDE 38

SLIDE 39

THANK YOU!

www.grandomics.com Bring Hope To Life