Genomics extravaganza Genomics overview Genomics analysis of the - - PowerPoint PPT Presentation



slide-1
SLIDE 1

A Genomics Extravaganza

slide-2
SLIDE 2

Genomics Overview

Genomics is the analysis of the structure and function of very large numbers of genes (often the entire genome), undertaken simultaneously.

Structural genomics includes the genetic mapping, physical mapping and sequencing of entire genomes.

Functional genomics determines the functions of genes and other non-coding parts of the genome. The sequencing of the human genome will leave many questions unanswered, including the function of most of the estimated 30,000-40,000 human genes.

Comparative genomics is the analysis and comparison of genomes from different species. The nature and significance of differences between genomes also provide a powerful tool for determining the relationship between genotype and phenotype through comparative genomics and morphological and physiological studies.

slide-3
SLIDE 3

The Genome

A genome is all* of a living thing's genetic material. It is the entire set of hereditary instructions for building, running, and maintaining an organism, and passing life on to the next generation.

* Mitochondria? Plasmids?

slide-4
SLIDE 4

The Genome

Bacterial Genome Eukaryotic Genome

http://www.genomenewsnetwork.org

karyotype

Borrelia burgdorferi

Genes VII, Benjamin Lewin

slide-5
SLIDE 5

The Genome

Size and Complexity

Genes VII, Benjamin Lewin

slide-6
SLIDE 6

The Genome

http://ww2.mcgill.ca/biology/undergra/c200a/sec2-3.htm Genes VII, Benjamin Lewin

Bacterial Genome Eukaryotic Genome

slide-7
SLIDE 7

The Genome

C-Value and G-Value Paradoxes; I-Value?

The high incidence of alternative splicing and post-translational chemical modification may provide about three times as many proteins per gene in humans as in the fly and the roundworm.

Genes VII, Benjamin Lewin

slide-8
SLIDE 8

The Genome

Human

Typed in 10-pitch font, the sequence stretches for more than 5,000 miles.

  • Coding sequences comprise less than 5% of the genome
  • Repeat sequences account for at least 50% (Cot):
    (1) transposon-derived repeats
    (2) pseudogenes
    (3) simple sequence repeats
    (4) segmental duplications
    (5) blocks of tandemly repeated sequences

Repeats are often described as junk but contain:

  • an evolutionary record
  • passive markers
  • structural components

Martina McGloughlin, UC Davis

3200 Mb
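
As a rough check on the 5,000-mile figure, the arithmetic can be sketched in Python. This assumes that "10-pitch" means 10 characters per inch, that each base is typed as one character, and that the genome size is the 3200 Mb quoted on this slide. The function name is illustrative only.

```python
INCHES_PER_MILE = 63_360  # 5,280 feet x 12 inches


def typed_length_miles(genome_bases, chars_per_inch=10):
    """Length in miles of a sequence typed one character per base."""
    inches = genome_bases / chars_per_inch
    return inches / INCHES_PER_MILE


human_genome_bases = 3_200_000_000  # 3200 Mb
# Comes out just over 5,000 miles under the assumptions above.
print(typed_length_miles(human_genome_bases))
```

Under these assumptions the result is consistent with the "more than 5,000 miles" claim.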

slide-9
SLIDE 9

Genomics

Historical Timeline

www.genomenewsnetwork.org

1856  Gregor Mendel: Principles of Heredity
1953  Watson & Crick: Molecular Structure of DNA
1961  Marshall Nirenberg: Triplet Code
1970  Hamilton Smith: Site-specific restriction enzyme
1977  Gilbert and Sanger: DNA sequencing
1983  Kary Mullis: Polymerase Chain Reaction
1986  Leroy Hood: Automated Sequencer
2001  Human Genome Published

slide-10
SLIDE 10

Structural Genomics

Sequencing

Sequenced Genomes: over 100 completed

Aeropyrum pernix, Agrobacterium tumefaciens, Anabaena, Anopheles gambiae, Aquifex aeolicus, Arabidopsis thaliana, Archaeoglobus fulgidus, Bacillus anthracis, Bacillus cereus, Bacillus halodurans, Bacillus subtilis, Bacteroides thetaiotaomicron, Bifidobacterium longum, Blochmannia floridanus, Bordetella bronchiseptica, Bordetella parapertussis, Bordetella pertussis, Borrelia burgdorferi, Bradyrhizobium japonicum, Brucella melitensis, Brucella suis, Buchnera aphidicola, Caenorhabditis elegans, Campylobacter jejuni, Caulobacter crescentus, Chlamydia muridarum, Chlamydia trachomatis, Chlamydophila caviae, Chlamydophila pneumoniae, Chlorobium tepidum, Ciona intestinalis, Clostridium acetobutylicum, Clostridium perfringens, Clostridium tetani, Corynebacterium efficiens, Coxiella burnetii, Deinococcus radiodurans, Drosophila melanogaster, Encephalitozoon cuniculi, Enterococcus faecalis, Escherichia coli, Fugu rubripes, Fusobacterium nucleatum, Guillardia theta, Haemophilus ducreyi, Haemophilus influenzae, Halobacterium, Helicobacter hepaticus, Helicobacter pylori, Homo sapiens, Lactobacillus plantarum, Lactococcus lactis, Leptospira interrogans, Listeria innocua, Listeria monocytogenes, Magnaporthe grisea, Mesorhizobium loti, Methanobacterium thermoautotrophicum, Methanococcoides burtonii, Methanococcus jannaschii, Methanogenium frigidum, Methanopyrus kandleri, Methanosarcina acetivorans, Methanosarcina mazei, Mus musculus, Mycobacterium bovis, Mycobacterium leprae, Mycobacterium paratuberculosis, Mycobacterium tuberculosis, Mycoplasma gallisepticum, Mycoplasma genitalium, Mycoplasma penetrans, Mycoplasma pneumoniae, Mycoplasma pulmonis, Neisseria meningitidis, Neurospora crassa, Nitrosomonas europaea, Oceanobacillus iheyensis, Oryza sativa, Pasteurella multocida, Plasmodium falciparum, Plasmodium yoelii yoelii, Prochlorococcus marinus, Pseudomonas aeruginosa, Pseudomonas putida, Pseudomonas syringae, Pyrobaculum aerophilum, Pyrococcus abyssi, Pyrococcus furiosus, Pyrococcus horikoshii, Ralstonia solanacearum, Rhodopirellula baltica, Rickettsia conorii, Rickettsia prowazekii, Rickettsia sibirica, Saccharomyces cerevisiae, Salmonella typhi, Salmonella typhimurium, Schizosaccharomyces pombe, Shewanella oneidensis, Shigella flexneri, Sinorhizobium meliloti, Staphylococcus aureus, Staphylococcus epidermidis, Streptococcus agalactiae, Streptococcus mutans, Streptococcus pneumoniae, Streptococcus pyogenes, Streptomyces avermitilis, Streptomyces coelicolor, Sulfolobus solfataricus, Sulfolobus tokodaii, Synechococcus, Synechocystis, Thermoanaerobacter tengcongensis, Thermoplasma acidophilum, Thermoplasma volcanium, Thermosynechococcus elongatus, Thermotoga maritima, Treponema pallidum, Tropheryma whipplei, Ureaplasma urealyticum, Vibrio cholerae, Vibrio parahaemolyticus, Vibrio vulnificus, Wigglesworthia glossinidia, Xanthomonas axonopodis, Xanthomonas campestris, Xylella fastidiosa, Yersinia pestis

slide-11
SLIDE 11

Structural Genomics

Sequencing

Some Notable Sequences

http://gnn.tigr.org/sequenced_genomes/genome_guide_p1.shtml

slide-12
SLIDE 12

Structural Genomics

Sequencing

[Figure: GenBank size plotted against GenBank release number; the vertical axis runs to 1,400,000,000]

Growth in GenBank is exponential.

http://www.ncbi.nlm.nih.gov/Genbank/GenbankOverview.html

slide-13
SLIDE 13

Structural Genomics

Human Genome

Human Genome Project (HGP)

1990: DOE and NIH provide $3 billion for a 15-year HGP. Two major goals:

1) Map the human genome as well as several model organisms

  • E. coli
  • Yeast
  • C. elegans
  • Drosophila
  • Arabidopsis thaliana

2) Sequence the entire genomes of these species

Celera Genomics

1998: founded by Applera Corporation and Dr. J. Venter. Primary goal: to sequence and assemble the human genome in 3 years.

slide-14
SLIDE 14

Structural Genomics

HGP Mapping

2 Types:

  • Genetic Linkage Maps: based on marker recombination frequency (inheritance)
  • Physical Linkage Maps: built by joining pieces with matching overlapping regions

http://www.web-books.com/MoBio/Free/Ch8D1.htm
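
To make the genetic linkage map idea concrete, here is a minimal Python sketch of how a recombination frequency between two markers can be turned into a map distance. It assumes the simplest convention, 1% recombination equals 1 centimorgan, which only holds approximately for closely linked markers. The function names and offspring counts are hypothetical.

```python
def recombination_frequency(recombinant_offspring, total_offspring):
    """Fraction of offspring in which the two markers were separated."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return recombinant_offspring / total_offspring


def map_distance_cm(frequency):
    """Approximate map distance in centimorgans (1% recombination = 1 cM)."""
    return 100.0 * frequency


# Hypothetical cross: 120 recombinants among 1000 offspring.
rf = recombination_frequency(120, 1000)
print(rf, map_distance_cm(rf))
```

Markers that are inherited together more often give a smaller frequency and therefore sit closer together on the map.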

slide-15
SLIDE 15

Structural Genomics

HGP Mapping

Resolution

slide-16
SLIDE 16

Structural Genomics

HGP: From Map to Sequencing

Clone by Clone Technique

  • Top-down approach
  • Map
  • Break genome into 200,000 bp pieces
  • Make BAC library and align BACs to map
  • Make sub-clones of 500+ bp pieces
  • DNA of sub-clones is sequenced
  • Sequenced DNA is ordered (contigs) and aligned to the map to produce the highest-resolution genomic map

Oversampling is essential.

http://www.ornl.gov/TechResources/Human_Genome/publicat/tko/04a_img.html
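
Why oversampling matters can be sketched with a little arithmetic. The Python below is an illustration only: it uses the 200,000 bp piece size and 500 bp sub-clone size from this slide, borrows the 8x over-sampling figure from the Errors slide, and assumes reads land uniformly at random (a Poisson approximation that is not stated in the slides). The function names are hypothetical.

```python
import math


def reads_needed(target_bp, read_bp, coverage):
    """Number of reads needed to sample the target `coverage` times over."""
    return math.ceil(coverage * target_bp / read_bp)


def expected_uncovered_fraction(coverage):
    """Expected fraction of bases hit by no read, assuming random placement."""
    return math.exp(-coverage)


print(reads_needed(200_000, 500, 8))     # reads for one 200,000 bp piece at 8x
print(expected_uncovered_fraction(1))    # single coverage leaves large gaps
print(expected_uncovered_fraction(8))    # 8x leaves very little uncovered
```

Under this model, sampling each base once on average would still leave over a third of the piece unread, which is why the same region is sequenced many times.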

slide-17
SLIDE 17

Bacterial Artificial Chromosome

Cloning

  • needs origin of replication
  • selectable marker: ampicillin / LacZ (β-galactosidase)

http://images.google.ca/imgres?imgurl=www.blc.arizona.edu/courses/181gh

slide-18
SLIDE 18

Structural Genomics

HGP Sequencing

Automated Sanger method: chain termination / shotgun

Sequence, read, assemble. Capillary electrophoresis.

http://gnn.tigr.org/whats_a_genome/Chp2_2.shtml

slide-19
SLIDE 19

Structural Genomics

Celera Sequencing

Celera’s Method: Whole Genome Shotgun

  • Bottom-up approach
  • Genome randomly fractured (sonication)
  • Size selection using gel electrophoresis
  • Pieces up to 800 bp inserted in bacteria
  • Sequencing
  • Computer assembly utilizing overlapping regions

Genes VII, Benjamin Lewin

slide-20
SLIDE 20

Structural Genomics

Assembly
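
Computer assembly from overlapping regions can be illustrated with a toy greedy merger in Python. This is a sketch of the general idea only, not the assembler used by Celera or the HGP: it ignores sequencing errors, repeats and reverse-complement reads, and the function names and example reads are made up.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is a prefix of `b` (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no overlaps left: the remaining pieces stay separate contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

When no overlap remains, the function returns several contigs instead of one, which mirrors the gaps left when sampling is too shallow.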

slide-21
SLIDE 21

Structural Genomics

HGP Versus Celera

Whole Genome Shotgun

  • Faster but more difficult to reassemble?
  • No mapping
  • Less expensive (less labor intensive)
  • Requires large IT resources
  • Works best with existing scaffolding
  • Easier as more genomes are sequenced (homologous DNA)

Clone by Clone

  • Slower, labor intensive
  • Allows sequencing to be divided amongst labs/countries
  • Can adapt strategy to specific portions of genome

Two genomes differ at about 1 in 1250 bases.

HGP: 2.84 Gbp, 0.65% unidentified bases
Celera: 2.66 Gbp, 8.7% unidentified bases

slide-22
SLIDE 22

Structural Genomics

Errors

Maximum error rate for HGP: 1 in 10 kb

  • must be smaller than the rate of polymorphisms
  • allows pseudogenes to be identified

Sources of Error

  • DNA contamination (from bacterial clone)
  • Repeats causing incorrect assembly
  • Sampled DNA fragments are not random (WGSS)
  • Sequencing (read) errors

Error Correction

  • Over-sampling 8x
  • Comparative analysis

Genes VII, Benjamin Lewin
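
The requirement that the error rate stay below the polymorphism rate can be checked with simple arithmetic. The Python sketch below combines the 1 in 10 kb maximum error rate from this slide with the "1 in 1250 bases" difference between two genomes and the 3200 Mb genome size quoted on earlier slides; treating both rates as uniform across the genome is an assumption, and the names are illustrative.

```python
GENOME_BP = 3_200_000_000      # 3200 Mb
ERROR_ONE_IN = 10_000          # maximum error rate: 1 in 10 kb
POLYMORPHISM_ONE_IN = 1_250    # two genomes differ at about 1 in 1250 bases


def expected_sites(genome_bp, one_in):
    """Expected number of affected bases at a rate of 1 in `one_in`."""
    return genome_bp // one_in


errors = expected_sites(GENOME_BP, ERROR_ONE_IN)
differences = expected_sites(GENOME_BP, POLYMORPHISM_ONE_IN)
print(errors, differences, differences / errors)
```

With these figures, true differences between two genomes outnumber sequencing errors several-fold, so a real polymorphism is more likely than an error at any given mismatch.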

slide-23
SLIDE 23

Structural Genomics

New high-volume sequencing techniques

DNA chips, pyrosequencing

http://www.pyrosequencing.se/

slide-24
SLIDE 24

Functional Genomics

http://www.ornl.gov/TechResources/Human_Genome/posters/chromosome/chromo20.html

slide-25
SLIDE 25

Functional Genomics

Genome to Protein

Finding Genes

Genes can be identified in the genome sequence from their common structure (open reading frame).

Finding AA Sequence

The amino acid (AA) sequence can be determined from the triplet code.
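
Both steps can be sketched in a few lines of Python. This is a simplified illustration, not a real gene finder: it scans only the three forward reading frames, treats every ATG-to-stop stretch as an open reading frame, ignores introns, and uses the standard genetic code. The function names and the example sequence are made up.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons in TCAG order mapped to one-letter amino acids.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(coding_dna):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(coding_dna[i:i + 3], "X")
        for i in range(0, len(coding_dna) - 2, 3)
    )


def find_orfs(dna, min_codons=2):
    """Return (start, end, protein) for each ATG...stop stretch on the forward strand."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and CODON_TABLE.get(codon) == "*":
                protein = translate(dna[start:i])
                if len(protein) >= min_codons:
                    orfs.append((start, i + 3, protein))
                start = None
    return orfs


print(find_orfs("CCATGGCTTGGTAAGG"))
```

In eukaryotic genomes the coding sequence is interrupted by introns, so practical gene finding needs more than this frame scan.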

slide-26
SLIDE 26

Functional Genomics

Gene Function

Determining Gene Function

  • Recombinant DNA
  • Knockout
  • Directed mutagenesis
  • Comparative (interspecies, between mutant and wild type)
  • Gene expression (spatial and temporal)
  • Gene libraries
  • Data mining
  • RNAi

Examples

Genetic Diseases

  • From genetic linkage map to sequence
  • From mRNA to genetic linkage map
  • Looking for genes with attributes similar to known disease genes

DNA Microarrays

  • testing differential gene expression

http://www.doegenomestolife.org

slide-27
SLIDE 27

Functional Genomics

Gene Function: Microarray

  • Isolate mRNA (capture poly-A tail)
  • Reverse transcriptase (mRNA to cDNA)
  • Add fluorescent label to cDNA
  • Hybridize with microarray containing relevant DNA from BAC
  • Look for differential expression (cancerous versus non-cancerous cells)

* The same method could be used for spatial, temporal, or environmental mRNA expression studies.
* Most traditional gene expression analysis (gene fusion with label) can only be done one cell line at a time.

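
The final step, looking for differential expression, is often summarized as a log2 ratio of the two fluorescence intensities at each spot. The Python sketch below shows that calculation only; the gene names, intensity values, pseudocount and two-fold threshold are all hypothetical, and real analyses add normalization and statistics that are omitted here.

```python
import math


def log2_ratio(sample, reference, pseudocount=1.0):
    """Log2 ratio of two intensities; the pseudocount avoids division by zero."""
    return math.log2((sample + pseudocount) / (reference + pseudocount))


def differentially_expressed(spots, threshold=1.0):
    """Return genes whose absolute log2 ratio meets the threshold (1.0 = two-fold)."""
    hits = {}
    for gene, (sample, reference) in spots.items():
        ratio = log2_ratio(sample, reference)
        if abs(ratio) >= threshold:
            hits[gene] = ratio
    return hits


# Hypothetical (cancerous, non-cancerous) intensities for three spots.
spots = {"geneA": (799, 199), "geneB": (100, 100), "geneC": (49, 399)}
print(differentially_expressed(spots))
```

A positive ratio marks a gene that is more abundant in the cancerous sample, and a negative ratio one that is less abundant.
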
slide-28
SLIDE 28

Functional Genomics

Gene Function- Recombinant DNA

http://www.biologie.uni-hamburg.de/b-online/e34/1.htm

slide-29
SLIDE 29

Functional Genomics

Gene Function- RNAi

Make dsRNA from a gene library, soak the nematode in it, and look for impaired function.

http://www.bio-itworld.com/images/0402_news_RNAi_L.gif

slide-30
SLIDE 30

Functional Genomics

Protein Structure

Primary Secondary Tertiary

http://folding.stanford.edu/education/Assets/

slide-31
SLIDE 31

Functional Genomics

Protein Function (Proteomics)

Determining Protein Function

  • Knockout
  • Labeling (localization)
  • Comparative
  • Phylogenetic profiling: homology beyond the AA sequence (evolutionary and hereditary)
  • Modeling (folding)

Proteins are no longer considered simply to have a function; they exist in context as a network, forming connections with related homologous proteins.

slide-32
SLIDE 32

Comparative Genomics

  • 90%+ of mouse genes have a sequence match in the human genome
  • Easier to experiment with simpler model organisms
  • BLAST: a computer program that identifies homologous genes in different organisms
  • 40% of predicted human genes share similarity with fruit flies or roundworms

slide-33
SLIDE 33

Future of Genomics

Goals

  • identify the protein machines that carry out critical life functions
  • characterize the gene regulatory networks that control these machines
  • characterize the functional repertoire of complex microbial communities in their natural environments
  • develop the computational capabilities to integrate and understand these data and begin to model complex biological systems (Systems Biology)

Major advances in bioinformatics, computing, and large-scale automation will be needed to meet these goals.

http://www.ornl.gov/TechResources/Human_Genome/graphics/slides/

slide-34
SLIDE 34

Future of Genomics

Examining the relationship of SNP variation (and haplotypes) with disease.

Nearly 1.5 million SNP variations found in the HGP.

http://www.ornl.gov/TechResources/Human_Genome/graphics/slides/

slide-35
SLIDE 35

Ethics and Society

Privacy

Risk Analysis

Public Safety

Insurance

Discrimination

Patents

Genes

Procedures

Ownership

Ethics

Discrimination

Prenatal Diagnosis

Genetic Engineering

Designer Babies

slide-36
SLIDE 36

Genomics

Glossary

Bioinformatics: Typically refers to the field concerned with the collection and storage of biological information. All matters concerned with biological databases are considered bioinformatics.

Capillary Electrophoresis: A method for separating molecules extremely rapidly based on their electrophoretic mobility.

Comparative genomics: The analysis and comparison of genomes from different species. The nature and significance of differences between genomes also provide a powerful tool for determining the relationship between genotype and phenotype through comparative genomics and morphological and physiological studies.

Computational biology: Refers to developing the algorithms and statistical models necessary to analyze biological data with the aid of computers.

DNA Microarrays (DNA Chip): Array of DNA (cDNA) anchored on a substrate (often by robot), used for hybridization experiments.

Electrophoresis: Separation of ionic molecules by differential migration through a gel according to the size and ionic charge of the molecules in an electrical field. High-resolution techniques normally use a gel support for the fluid phase.

Exon: The sequences of the primary RNA transcript (or the DNA that encodes them) that exit the nucleus as part of a messenger RNA molecule. In the primary transcript, neighboring exons are separated by introns.

Functional genomics: Determines the functions of genes and other non-coding parts of the genome. The sequencing of the human genome will leave many questions unanswered, including the function of most of the estimated 30,000-40,000 human genes.

Genetic Linkage Map: A map of the relative positions of genetic loci on a chromosome, determined on the basis of how often the loci are inherited together. Distance is measured in centimorgans (cM).

Genomics: The analysis of the structure and function of very large numbers of genes (often the entire genome), undertaken simultaneously.

Genotype: The set of genes that an individual carries; usually refers to a particular pair of alleles (alternative forms of genes) that a person has at a given region of the genome.

Haplotype: The set of alleles on one chromosome or a part of a chromosome, i.e. one set of alleles of linked genes. Its main current usage is in connection with the linked genes of the major histocompatibility complex.

Homology: Two anatomical structures or behavioural traits within different organisms which originated from a structure or trait of their common ancestral organism.

Intron: A noncoding sequence of DNA within a gene that is transcribed into hnRNA but is then cut out of the message by RNA splicing in the nucleus, leaving a mature mRNA that is then translated in the cytoplasm.

Karyotype: The appearance of the chromosomes in a somatic cell of an individual or species, with reference to their number, size, shape, etc.

Knockout: Informal term for the generation of a mutant organism in which the function of a particular gene has been completely eliminated (a null allele).

Phenotype: The observable properties and physical characteristics of an organism (expression of genotype).

Phylogenetic: Relating to phylogenesis, or the evolutionary history of a type of organism.

Physical Linkage Map: A map of the locations of identifiable landmarks on chromosomes. Physical distance is measured in base pairs. The physical map differs from the genetic map, which is based purely on genetic linkage data. In the human genome, the lowest-resolution physical map is the banding patterns of the 24 different chromosomes. The highest-resolution physical map is the complete nucleotide sequence of all chromosomes.

Proteomics: The study and analysis of protein structure and function. Becoming quite an important science with the mapping of several genomes, including the human one, and the discovery of new proteins.

Pyrosequencing: Pyrosequencing™ is sequencing by synthesis, a simple-to-use technique for accurate and consistent analysis of DNA sequences.

Recombinant DNA: Spliced DNA formed from two or more different sources that have been cleaved by restriction enzymes and joined by ligases.

RNAi: dsRNA added to certain organisms silences expression of the gene whose sequence matches the added RNA.

Single Nucleotide Polymorphism (SNP): Variation in a genome based on the alteration of a single base pair.

Splicing: Process that removes introns from transcribed RNA.

Structural genomics: Includes the genetic mapping, physical mapping and sequencing of entire genomes.

slide-37
SLIDE 37

Genomics

General Web References

http://www.genomenewsnetwork.org/whats_a_genome/Chp1_1_1.shtml
http://www.ornl.gov/TechResources/Human_Genome/faq/compgen.html
http://www.jgi.doe.gov/education/genomics_1.html
http://www.yourgenome.org/intermediate/
http://www.ornl.gov/TechResources/Human_Genome/
http://www.genome.gov/
http://www.nature.com/genomics/human/
http://www.ornl.gov/TechResources/Human_Genome/project/benefits.html
http://www.sciencemag.org/content/vol291/issue5507/