Differential gene expression: General Introduction
Swiss Institute of Bioinformatics - LF 11.2010

Overview (1)
- Reminder of biology
- Major steps in microarray analysis
  - Microarray preparation: design, clone/probe selection
  - RNA extraction, hybridization on chip
  - Scanning, data extraction from image
  - “Low-level” quality control
  - Summarization of per-chip information (one number per feature)
  - “High-level” analysis
- High-throughput RNA-level technologies
  - Microarrays
  - Affymetrix chips
  - SAGE
  - MPSS

Biology Fundamentals - Genes

Biology Fundamentals - Expression
- Microarrays
- Transcriptome: Genes
- Proteome: Proteins

Genomics Fundamentals - Complexity
mRNA purification
Difficulties:
- Contaminations
- Alternative splicing
- Alternative polyadenylation

RNA abundance in mammalian cells
Figure labels: rRNA 80%; tRNA; mRNA; 1%; 3 x 10^6 molecules/cell; 3 x 10^5 molecules/cell; 1-2 x 10^4 different genes; abundance classes of 500+, 50-500 and 1-50 molecules/cell.

Expression analysis
- Low throughput
  - Northern blot
  - Differential display
  - Quantitative PCR
- High throughput
  - DNA arrays / chips
    - Spotted arrays (Stanford arrays)
    - Affymetrix (photolithography inspired)
    - Oligo-arrays (Agilent, NimbleGen)
  - Serial Analysis of Gene Expression (SAGE)
  - RNASeq

What are DNA Microarrays?
Microarray analysis is a technology that allows scientists to simultaneously detect thousands of genes in a small sample and to analyze the expression of those genes.
Microarrays are simply ordered sets of DNA molecules of known sequence. Usually rectangular in shape, they can consist of a few hundred to hundreds of thousands of sets. Each individual sequence goes on the array at a precisely defined location.

Potential application domains
- Identification of complex genetic diseases
- Drug discovery and toxicology studies
- Mutation/polymorphism detection (SNPs)
- Pathogen analysis
- Differing expression of genes over time, between tissues, and disease states
- Preventive medicine
- Specific genotype (population) targeted drugs
- More targeted drug treatments - AIDS
- Genetic testing and privacy

The challenge
The big revolution here is in the "micro" term. New slides will contain a survey of the human genome on a 2 cm^2 chip!
The use of this large-scale method tends to create phenomenal amounts of data, which then have to be analyzed, processed and stored.
This is a job for … Bioinformatics!

General overview
- Making the chip
  - Experiment design, clone/probe selection, collection maintenance, PCR, spotting, printing, synthesis (wet lab)
- Sample hybridization
  - Sample purification, labelling, hybridization, washing
- Scanning and image treatment
  - Fluorescence correction, find spots, background
- Analysing the data
  - Filtering, normalisation
  - Clustering (hierarchical, centroid, …)
- Representation, storage
  - Graphics, databases, web public resources

Scientific Process
1- Biological question (e.g. differentially expressed genes, sample class prediction, etc.)
2- Experimental design
3- Microarray experiment
4- Pre-processing steps: image analysis / quality assessment (with a "failed" branch), normalization
5- Data analysis: estimation, testing, clustering, discrimination
6- Biological verification and interpretation
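The "Filtering, normalisation" step of the overview can be sketched as follows. This is a minimal illustration, not the method used in the course: it assumes one positive intensity per feature per chip, uses median centring on the log2 scale as the normalisation, and filters on the range of values across chips. The function names, the `min_range` threshold and the intensities are all hypothetical.

```python
import math
from statistics import median


def log2_transform(intensities):
    """Convert raw (positive) intensities to the log2 scale."""
    return [math.log2(x) for x in intensities]


def median_center(log_values):
    """Subtract the chip median so that every chip is centred on 0."""
    m = median(log_values)
    return [v - m for v in log_values]


def filter_features(chips, min_range=1.0):
    """Return indices of features whose values span at least min_range across chips."""
    kept = []
    for i in range(len(chips[0])):
        values = [chip[i] for chip in chips]
        if max(values) - min(values) >= min_range:
            kept.append(i)
    return kept


# Hypothetical raw intensities: two chips, four features each.
raw = [
    [100.0, 200.0, 400.0, 800.0],
    [200.0, 400.0, 3200.0, 1600.0],
]
normalised = [median_center(log2_transform(chip)) for chip in raw]
kept = filter_features(normalised, min_range=1.0)
print(kept)
```

Median centring assumes that most features do not change between chips; when that does not hold, another normalisation would be needed.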

Questions addressed by microarrays
- What are the differences (in gene expression) between two cell lines?
- What is the difference between knock-out and wild-type mice?
- What is the difference between a tumor and a healthy tissue?
- Are there different tumor types?
- Key concept: compare gene expression in two (or more) cell/tissue types
  - Gene expression is assessed by measuring the number of RNA transcripts.
  - No absolute measurement.

THE EXPERIMENT: making the chip
1- Designing the chip: choosing genes of interest for the experiment and/or selecting the samples
- Selection of sequences that represent the investigated genes.
- Finding sequences, usually in the EST database.
- Problems: sequencing errors, alternative splicing, chimeric sequences, contamination …

Clone/probe selection
- General
  - Not too short (sensitivity, selectivity)
  - Not too long (viscosity, surface properties)
  - Not too heterogeneous (robustness)
  - Degree of importance depends on method
- Single strand methods (oligos, ss-cDNA)
  - Orientation must be known
  - ss-cDNA methods are not perfect
  - ds-cDNA methods don't care
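The general criteria above (not too short, not too long, not too heterogeneous) can be turned into a simple candidate filter. The sketch below is only one possible reading: it treats "heterogeneous" as variation in GC content, which is an assumption, and the length and GC bounds are illustrative defaults rather than values from the slides.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def acceptable_probe(seq, min_len=20, max_len=80, gc_low=0.4, gc_high=0.6):
    """Check a candidate probe against hypothetical length and GC bounds."""
    if not (min_len <= len(seq) <= max_len):
        return False
    return gc_low <= gc_fraction(seq) <= gc_high


# Hypothetical candidate sequences.
candidates = ["ATGC" * 6, "AT" * 12, "GC" * 5]
selected = [s for s in candidates if acceptable_probe(s)]
print(selected)
```

Keeping the GC content of all probes in a narrow window is one way to make them behave similarly under a single set of hybridization conditions.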

Probe selection approaches
Figure: approaches placed between accuracy and throughput. Labels: selected gene regions, selected genes, cluster representatives, anonymous ESTs.

Selection of gene regions
Figure labels: 5' UTR, ORF, 3' UTR

Alternative polyadenylation
Particular problem with Affymetrix

Alternative splicing

Alternative promoter usage

Selection of gene regions - summary
- Coding region (ORF)
  - Annotation relatively safe
  - No problems with alternative polyA sites
  - No repetitive elements or other funny sequences
  - danger of close isoforms
  - danger of alternative splicing
  - might be missing in short RT products
- 3' untranslated region
  - Annotation less safe
  - danger of alternative polyA sites
  - danger of repetitive elements
  - less likely to cross-hybridize with isoforms
  - little danger of alternative splicing
- 5' untranslated region
  - close linkage to promoter
  - frequently not available

A checklist n Pick a gene n Try to get a complete cDNA sequence n Verify sequence architecture (e.g. cross-species comparison) n Mask repetitive elements (and vector!) n If possible, discard 3’-UTR beyond first polyA signal n Look for alternative splice events n Use remaining region of interest for similarity searches n Mask regions that could cross-hybridize n Use the remaining region for probe amplification or EST selection n When working with ESTs, use sequence-verified clones Swiss Institute of Bioinformatics - LF 11.2010 THE EXPERIMENT : making the chip 2- Spotting the sequences on the substrate - Substrate : usually glass, but also nylon membranes, plastic, ceramic … - Sequences : cDNA (500-5000 nucleotides), oligonucleotides (20~80-mer oligos), genomic DNA ( ~50 ’ 000 bases) - Printing methods : microspotting, ink-jetting or in-situ printing, photolithography Swiss Institute of Bioinformatics - LF 11.2010 Microarrays: the making of Microspotting and ink-jetting Swiss Institute of Bioinformatics - LF 11.2010 8

Array Production: Spotting

Array Production: "photolithography"
- Affymetrix
  - Each probe 25 bp long
  - 22-40 probes per gene
  - Perfect Match (PM) as well as MisMatch (MM) probes
- Febit/NimbleGen
  - Probe length: 24-mer to 70-mer
  - Genes per array: up to 38,000
  - Probes per gene: 10-25
  - Only perfect match probes

Array Production: "Inkjet"
Agilent (HP SurePrint technology)
- cDNA printing
- 60 bp oligo in-situ synthesis
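The relationship between a Perfect Match probe and its MisMatch partner can be illustrated with a short sketch. It assumes the usual Affymetrix design, in which the MM probe is the 25-mer PM probe with its middle (13th) base replaced by the complementary base. The function name and the example sequence are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_probe(pm):
    """Derive an MM probe from a 25-mer PM probe by complementing the 13th base."""
    pm = pm.upper()
    if len(pm) != 25:
        raise ValueError("expected a 25-mer perfect match probe")
    mid = 12  # zero-based index of the 13th base
    return pm[:mid] + COMPLEMENT[pm[mid]] + pm[mid + 1:]


# Hypothetical PM probe sequence.
pm = "ACGTACGTACGTACGTACGTACGTA"
mm = mismatch_probe(pm)
print(pm)
print(mm)
```

The MM signal is intended as an estimate of non-specific hybridization for the corresponding PM probe.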

1- Samples
2- Extracting mRNA
3- Labeling
4- Hybridizing
5- Scanning
6- Visualizing

Spotted array preparation
- “Average” mouse mRNA
- RT-PCR (conversion mRNA-cDNA, amplification)
- cDNA isolation
- Test sequence (probe) production, ~100 - ~2000 bp

Oligo array preparation
- Millions of experiments worldwide
- Sequence databases
- Probe (sequence) design
  - known genes
  - putative genes
  - alternative splicing
  - GC contents
- Gene-specific sequences, ~60 bp
- In-situ synthesis

Spotted and oligo array usage
- cy3 labeled cDNA, cy5 labeled cDNA
- Mix
- Hybridization, washing
- Scanning
- Relative mRNA levels

Affymetrix chip preparation
- Millions of experiments worldwide
- Sequence databases
- ~100s of bp “consensus” sequences
- Bioinformatics thinking yields gene-specific sequences (3'-end)
- Probe (sequence) design
  - known genes
  - putative genes
  - alternative splicing
  - GC contents
- 25 bp sequences
- In-situ synthesis

Affymetrix chip usage
- Hybridization, washing
- Relative mRNA levels
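For the two-colour usage above, relative mRNA levels are commonly expressed as log ratios of the two channels. The sketch below assumes background-corrected, positive intensities per spot and uses the conventional definitions M = log2(Cy5/Cy3) and A = (log2 Cy5 + log2 Cy3) / 2. Which sample carries which dye, and the intensity values themselves, are hypothetical.

```python
import math


def ma_values(cy5, cy3):
    """Return per-spot (M, A): log2 ratio and mean log2 intensity of the two channels."""
    result = []
    for r, g in zip(cy5, cy3):
        m = math.log2(r / g)                        # relative level, Cy5 versus Cy3
        a = 0.5 * (math.log2(r) + math.log2(g))     # overall spot brightness
        result.append((m, a))
    return result


# Hypothetical spot intensities for three genes.
cy5 = [512.0, 256.0, 1024.0]
cy3 = [128.0, 256.0, 4096.0]
for m, a in ma_values(cy5, cy3):
    print(f"M = {m:+.1f}, A = {a:.1f}")
```

An M of +2 means a four-fold higher signal in the Cy5 channel, 0 means no difference, and -2 means four-fold lower. Only this ratio is interpreted, which matches the point that no absolute measurement is made.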
