Uncovering Proteins Functions Through Multi-Layer Tissue Networks
Marinka Zitnik
marinka@cs.stanford.edu
Joint work with Jure Leskovec

Why tissues?
A unified view of cellular functions across human tissues is essential for understanding biology, interpreting genetic variation, and developing therapeutic strategies [Greene et al. 2015, Yeger & Sharan 2015, GTEx and others].
Marinka Zitnik, Stanford, ISMB/ECCB 2017

What Does My Protein Do?
Goal: Given a set of proteins and possible functions, predict each protein’s association with each function in each tissue:
Proteins × (Functions, Tissues) → [0,1]
[Figure: WNT1 in the PPI network of substantia nigra tissue (midbrain development); RPT6 in the PPI network of blood tissue (angiogenesis)]
WNT1 × (Midbrain development, Substantia nigra) → 0.9
RPT6 × (Angiogenesis, Blood) → 0.05

Existing Research
§ Guilty by association: a protein’s function is determined based on who it interacts with [Zuberi et al. 2013, Radivojac et al. 2013, Kramer et al. 2014, Yu et al. 2015, and many others]
§ No tissue-specificity
§ Protein functions are assumed constant across organs and tissues:
  § Functions in heart are the same as in skin
Lack of methods for predicting protein functions in different biological contexts

Challenges
§ Tissues have inherently multiscale, hierarchical organization
§ Tissues are related to each other:
  § Proteins in biologically similar tissues have similar functions [Greene et al. 2015, ENCODE 2016]
§ Proteins are missing in some tissues
§ Interaction networks are tissue-specific
§ Many tissues have no annotations

Machine Learning in Networks
[Figure: a PPI network with proteins WNT1, DLG5, INA, RHOA, ETS1, GPR4, NDNF, and HPSE; machine learning labels nodes with functions such as midbrain development and angiogenesis]
Multi-label node classification: midbrain development, angiogenesis, etc.

Machine Learning Lifecycle
§ Machine learning lifecycle: this feature, that feature
§ Every single time!
[Figure: Raw networks → Node and edge profiles → Learning algorithm → Model → Prediction (downstream task: protein function prediction). Feature engineering is replaced by automatically learning the features.]

Feature Learning in Multi-Layer Graphs
OhmNet: Unsupervised feature learning for multi-layer networks
[Figure: network layers arranged under scales “3”, “2”, and “1”; each layer and each scale has its own function $f$ that maps a node $u$ to a vector (node embedding) in $\mathbb{R}^d$]

Features in Multi-Layer Tissue Network
§ Given: layers $\{G_i\}_i$, hierarchy $\mathcal{M}$
§ Layers $\{G_i\}_{i=1..T}$ are in the leaves of $\mathcal{M}$
§ Goal: learn functions $f_i : V_i \to \mathbb{R}^d$
§ Multi-scale model:
  § Learn node embeddings at each possible scale
  § Layers $i$, $j$, $k$, $l$
  § Scales “3”, “2”, “1”

OhmNet Learning Approach
OhmNet has two components:
1. Single-layer objectives: nodes with similar network neighborhoods in each layer are embedded close together
2. Hierarchical dependency objectives: nodes in nearby network layers in the hierarchy share similar features

Single-Layer Objectives
§ Intuition: For each layer, embed nodes to $d$ dimensions by preserving their similarity
§ Two nodes are similar if their neighborhoods are similar
§ For node $u$ in layer $i$, we define nearby nodes as the nodes in $G_i$ visited by random walks starting at $u$
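
The notion of "nearby nodes" on this slide can be illustrated with a minimal sketch. It assumes uniform (unbiased) random walks and a toy adjacency list; the slide does not specify the walk strategy, and the edges, walk counts, and function names below are illustrative assumptions rather than OhmNet's actual settings.

```python
import random
from collections import Counter

def random_walk(adj, start, length, rng):
    """Return a uniform random walk of at most `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj.get(walk[-1], [])
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk

def nearby_nodes(adj, start, num_walks=10, length=5, seed=0):
    """Count how often each node other than `start` is visited by walks from `start`."""
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(num_walks):
        walk = random_walk(adj, start, length, rng)
        visits.update(node for node in walk[1:] if node != start)
    return visits

# Toy single-layer network (edges are illustrative, not real interactions).
layer = {
    "WNT1": ["DLG5", "INA"],
    "DLG5": ["WNT1", "INA"],
    "INA": ["WNT1", "DLG5", "RHOA"],
    "RHOA": ["INA"],
    "ETS1": ["GPR4"],
    "GPR4": ["ETS1"],
}
neighborhood = nearby_nodes(layer, "WNT1")
```

Nodes in a different connected component (here ETS1 and GPR4) are never visited, so they never count as nearby to WNT1 in this layer.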

Dependencies Between Network Layers
§ Intuition: Proteins in biologically similar tissues share similar features
§ Use the tissue hierarchy to recursively regularize features at $i$ to be similar to features in $i$’s parent
[Figure: “2” is a parent of $G_i$ and $G_j$]
OhmNet generates multi-scale node embeddings
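
A minimal sketch of what such a regularizer could look like, assuming the penalty is half the squared Euclidean distance between a node's features in a hierarchy element and in that element's parent. The slide only says features are regularized "to be similar", so the exact penalty form, the names `hierarchy_penalty` and `parent`, and the toy vectors are assumptions.

```python
def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def hierarchy_penalty(embeddings, parent):
    """Sum 0.5 * ||f_child(u) - f_parent(u)||^2 over child-parent pairs.

    `embeddings[element][node]` is the feature vector of `node` in a
    hierarchy element (a layer or an internal scale). Nodes missing from
    the parent are skipped.
    """
    total = 0.0
    for child, par in parent.items():
        for node, vec in embeddings[child].items():
            if node in embeddings[par]:
                total += 0.5 * squared_distance(vec, embeddings[par][node])
    return total

# "2" is a parent of layers G_i and G_j, as on the slide; vectors are made up.
parent = {"G_i": "2", "G_j": "2"}
embeddings = {
    "G_i": {"WNT1": [1.0, 0.0]},
    "G_j": {"WNT1": [0.0, 1.0], "RHOA": [1.0, 1.0]},
    "2": {"WNT1": [0.5, 0.5], "RHOA": [1.0, 1.0]},
}
penalty = hierarchy_penalty(embeddings, parent)
```

In this toy case the parent vector for WNT1 sits midway between the two layer vectors, and RHOA contributes nothing because its layer and parent features coincide.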

Data: 107 Tissue Layers
§ Layers are PPI nets:
  § Nodes: proteins
  § Edges: tissue-specific PPIs
§ Node labels:
  § “Cortex development” in renal cortex tissue
  § “Artery morphogenesis” in artery tissue
[Figure: tissue hierarchy in which each node is one layer. Visible tissue names: ParietalLobe, CorpusCallosum, Placenta, Oviduct, TemporalLobe, Lens, FemaleReproductiveSystem, Hindbrain, Spermatid, Glia, Eye, Retina, Integument, Pons, SpinalCord, ReproductiveSystem, Choroid, NervousSystem, EndocrineGland, BloodPlasma, Pancreas, Hepatocyte, Basophil, PancreaticIslet]

Experimental Setup
§ Protein function prediction is a multi-label node classification task
§ Every node (protein) is assigned one or more labels (functions)
§ Setup:
  § Learn OhmNet embeddings for the multi-layer tissue network
  § Train a classifier for each function based on a fraction of proteins and all their functions
  § Predict functions for new proteins
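
The setup above can be sketched as one binary classifier per function, trained on embeddings of a labeled fraction of proteins. The slide does not name the classifier, so logistic regression fit by gradient descent is an assumption here, and the 2-D embeddings, protein names `P1` to `P5`, and hyperparameters are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_binary(X, y, epochs=2000, lr=1.0):
    """Fit one logistic-regression classifier by batch gradient descent."""
    dim = len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, target in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - target
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(models, x):
    """Return a score in [0, 1] for every function."""
    return {fn: sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for fn, (w, b) in models.items()}

# Made-up 2-D embeddings and labels for the training fraction of proteins.
train = {
    "P1": ([1.0, 0.1], {"midbrain development"}),
    "P2": ([0.9, 0.2], {"midbrain development"}),
    "P3": ([0.1, 1.0], {"angiogenesis"}),
    "P4": ([0.2, 0.9], {"angiogenesis"}),
    "P5": ([0.9, 0.9], {"midbrain development", "angiogenesis"}),
}
functions = ["midbrain development", "angiogenesis"]
X = [vec for vec, _ in train.values()]

# One binary classifier per function (one-vs-rest).
models = {
    fn: train_binary(X, [1.0 if fn in labels else 0.0 for _, labels in train.values()])
    for fn in functions
}

# Score a held-out protein whose embedding was not used for training.
scores = predict(models, [0.95, 0.15])
```

Because each function gets its own classifier, a protein such as `P5` can carry several labels at once, which is what makes the task multi-label rather than multi-class.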

Tissue-Specific Protein Functions
[Figure: bar chart comparing OhmNet (0.756) with protein function prediction methods, mono-layer network embeddings, and tensor decompositions]
§ >10% improvement over function prediction methods
§ >18% improvement over non-hierarchical versions of the dataset
§ >15% improvement over matrix-based methods

Case Study: 9 Brain Tissues
[Figure: tissue hierarchy. Top: Brain. Middle: Brainstem, Cerebellum, Frontal lobe, Parietal lobe, Occipital lobe, Temporal lobe. Bottom: Midbrain, Substantia nigra, Pons, Medulla oblongata]
9 brain tissue PPI networks in a two-level hierarchy
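
One way to encode this hierarchy is a child-to-parent map, sketched below. Placing Midbrain, Substantia nigra, Pons, and Medulla oblongata under Brainstem is an assumption read off the slide layout, and the helper names are hypothetical.

```python
# Child -> parent map for the brain case study.
# Assumption: the four bottom-row tissues are children of Brainstem.
PARENT = {
    "Brainstem": "Brain",
    "Cerebellum": "Brain",
    "Frontal lobe": "Brain",
    "Parietal lobe": "Brain",
    "Occipital lobe": "Brain",
    "Temporal lobe": "Brain",
    "Midbrain": "Brainstem",
    "Substantia nigra": "Brainstem",
    "Pons": "Brainstem",
    "Medulla oblongata": "Brainstem",
}

def leaf_layers(parent):
    """Tissues that are never a parent, i.e. the leaves holding PPI networks."""
    internal = set(parent.values())
    return sorted(t for t in parent if t not in internal)

def ancestor_scales(tissue, parent):
    """Internal hierarchy elements above `tissue`, nearest first."""
    scales = []
    while tissue in parent:
        tissue = parent[tissue]
        scales.append(tissue)
    return scales

layers = leaf_layers(PARENT)
pons_scales = ancestor_scales("Pons", PARENT)
```

Under this encoding the leaves are the nine tissue layers, and Brainstem and Brain are the two internal scales at which multi-scale embeddings would be learned.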

Multi-Scale Node Embeddings
[Figure: node embeddings at the Brainstem and Brain scales]

Annotating Proteins in a New Tissue
§ Transfer protein functions to an unannotated tissue
§ Task: Predict functions in the target tissue without access to any annotation/label in that tissue

Target tissue    Tissue-specific (OhmNet)    Tissue non-specific    Improvement
Placenta         0.758                       0.684                  11%
Spleen           0.779                       0.712                  10%
Liver            0.741                       0.553                  34%
Forebrain        0.755                       0.632                  20%
Blood plasma     0.703                       0.540                  25%
Smooth muscle    0.729                       0.583                  21%
Average          0.746                       0.617

[Unlabeled value on slide: 40%]
Reported are AUROC values (see paper for other metrics)

Conclusions
§ Unsupervised feature learning for multi-layer networks
§ Learned embeddings can be used for any downstream prediction task: node classification, node clustering, link prediction
§ OhmNet predicts protein functions across biological contexts
A shift from flat networks to large multiscale systems in biology

snap.stanford.edu/ohmnet
Poster A-294
Travel Award
