A Genomics extravaganza
Genomics overview
Genomics is the analysis of the structure and function of very large numbers of genes (often the entire genome), undertaken simultaneously.
Structural genomics includes the genetic mapping, physical mapping and sequencing of entire genomes.
Functional genomics determines the functions of genes and of non-coding parts of the genome. The sequencing of the human genome will leave many questions unanswered, including the function of most of the estimated 30,000-40,000 human genes.
Comparative genomics is the analysis and comparison of genomes from different species. The nature and significance of differences between genomes also provide a powerful tool for determining the relationship between genotype and phenotype through comparative genomics and morphological and physiological studies.
A genome is all* of a living thing's genetic material. It is the entire set of hereditary instructions for building, running, and maintaining an organism, and passing life on to the next generation.
* Mitochondria? Plasmids?
Bacterial Genome Eukaryotic Genome
http://www.genomenewsnetwork.org
karyotype
Borrelia burgdorferi
Genes VII, Benjamin Lewin
Size and Complexity
Genes VII, Benjamin Lewin
http://ww2.mcgill.ca/biology/undergra/c200a/sec2-3.htm Genes VII, Benjamin Lewin
Bacterial Genome Eukaryotic Genome
C Value and G Value Paradoxes. I Value?
The high incidence of alternative splicing and post-translational chemical modification may provide about 3x as many proteins per gene in humans as in flies and roundworms.
Genes VII, Benjamin Lewin
Human
Typed in 10-pitch font, the human genome sequence stretches for more than 5,000 miles.
Coding sequences comprise less than 5% of the genome
Repeat sequences account for at least 50% (cot)
(1) transposon-derived repeats
(2) pseudogenes
(3) simple sequence repeats
(4) segmental duplications
(5) blocks of tandemly repeated sequences
Repeats are often described as junk but contain:
evolutionary record
passive markers
structural components
Martina McGloughlin, UC Davis
3200 Mb
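The 5,000-mile figure above can be checked with simple arithmetic. The sketch below assumes the 3200 Mb genome size from this slide, one printed character per base, and 10 characters per inch for 10-pitch type. The function name is illustrative only.

```python
# Back-of-envelope check of the "more than 5,000 miles" claim.
# Assumptions: 3200 Mb genome, one printed letter per base,
# 10-pitch type = 10 characters per inch.
GENOME_BASES = 3_200_000_000
CHARS_PER_INCH = 10
INCHES_PER_MILE = 63_360  # 5,280 feet x 12 inches


def printed_length_miles(bases, chars_per_inch=CHARS_PER_INCH):
    """Length in miles of a single line of text holding `bases` letters."""
    inches = bases / chars_per_inch
    return inches / INCHES_PER_MILE


miles = printed_length_miles(GENOME_BASES)
print(f"{miles:,.0f} miles")
```

Under these assumptions the result is roughly 5,050 miles, which agrees with the slide.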
Historical Timeline
www.genomenewsnetwork.org
1856 Gregor Mendel: Principles of Heredity
1953 Watson & Crick: Molecular Structure
1961 Marshall Nirenberg: Triplet Code
1970 Hamilton Smith: Site-specific restriction enzyme
1977 Gilbert and Sanger: DNA sequencing
1983 Kary Mullis: Polymerase Chain Reaction
1986 Leroy Hood: Automated Sequencer
2001 Human Genome Published
Sequencing
Sequenced Genomes: over 100 completed
Aeropyrum pernix, Agrobacterium tumefaciens, Anabaena, Anopheles gambiae, Aquifex aeolicus, Arabidopsis thaliana, Archaeoglobus fulgidus, Bacillus anthracis, Bacillus cereus, Bacillus halodurans, Bacillus subtilis, Bacteroides thetaiotaomicron, Bifidobacterium longum, Blochmannia floridanus, Bordetella bronchiseptica, Bordetella parapertussis, Bordetella pertussis, Borrelia burgdorferi, Bradyrhizobium japonicum, Brucella melitensis, Brucella suis, Buchnera aphidicola, Caenorhabditis elegans, Campylobacter jejuni, Caulobacter crescentus, Chlamydia muridarum, Chlamydia trachomatis, Chlamydophila caviae, Chlamydophila pneumoniae, Chlorobium tepidum, Ciona intestinalis, Clostridium acetobutylicum, Clostridium perfringens, Clostridium tetani, Corynebacterium efficiens, Coxiella burnetii, Deinococcus radiodurans, Drosophila melanogaster, Encephalitozoon cuniculi, Enterococcus faecalis, Escherichia coli, Fugu rubripes, Fusobacterium nucleatum, Guillardia theta,
Haemophilus ducreyi, Haemophilus influenzae, Halobacterium, Helicobacter hepaticus, Helicobacter pylori, Homo sapiens, Lactobacillus plantarum, Lactococcus lactis, Leptospira interrogans, Listeria innocua, Listeria monocytogenes, Magnaporthe grisea, Mesorhizobium loti, Methanobacterium thermoautotrophicum, Methanococcoides burtonii, Methanococcus jannaschii, Methanogenium frigidum, Methanopyrus kandleri, Methanosarcina acetivorans, Methanosarcina mazei, Mus musculus, Mycobacterium bovis, Mycobacterium leprae, Mycobacterium paratuberculosis, Mycobacterium tuberculosis, Mycoplasma gallisepticum, Mycoplasma genitalium, Mycoplasma penetrans, Mycoplasma pneumoniae, Mycoplasma pulmonis, Neisseria meningitidis, Neurospora crassa, Nitrosomonas europaea, Oceanobacillus iheyensis, Oryza sativa, Pasteurella multocida, Plasmodium falciparum, Plasmodium yoelii yoelii, Prochlorococcus marinus, Pseudomonas aeruginosa, Pseudomonas putida, Pseudomonas syringae, Pyrobaculum aerophilum,
Pyrococcus abyssi, Pyrococcus furiosus, Pyrococcus horikoshii, Ralstonia solanacearum, Rhodopirellula baltica, Rickettsia conorii, Rickettsia prowazekii, Rickettsia sibirica, Saccharomyces cerevisiae, Salmonella typhi, Salmonella typhimurium, Schizosaccharomyces pombe, Shewanella oneidensis, Shigella flexneri, Sinorhizobium meliloti, Staphylococcus aureus, Staphylococcus epidermidis, Streptococcus agalactiae, Streptococcus mutans, Streptococcus pneumoniae, Streptococcus pyogenes, Streptomyces avermitilis, Streptomyces coelicolor, Sulfolobus solfataricus, Sulfolobus tokodaii, Synechococcus, Synechocystis, Thermoanaerobacter tengcongensis, Thermoplasma acidophilum, Thermoplasma volcanium, Thermosynechococcus elongatus, Thermotoga maritima, Treponema pallidum, Tropheryma whipplei, Ureaplasma urealyticum, Vibrio cholerae, Vibrio parahaemolyticus, Vibrio vulnificus, Wigglesworthia glossinidia, Xanthomonas axonopodis, Xanthomonas campestris, Xylella fastidiosa, Yersinia pestis
Some Notable Sequences
http://gnn.tigr.org/sequenced_genomes/genome_guide_p1.shtml
Sequencing
[Figure: GenBank size plotted against GenBank release number]
Growth in GenBank is exponential.
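One way to make "exponential" concrete is to express it as a doubling time. The sketch below assumes pure exponential growth between two observations. The function name and the example numbers are hypothetical and are not GenBank data.

```python
import math


def doubling_time(size_start, size_end, elapsed):
    """Doubling time under pure exponential growth, in the units of `elapsed`.

    From size_end = size_start * 2 ** (elapsed / T), solve for T.
    """
    if size_start <= 0 or size_end <= size_start:
        raise ValueError("sizes must be positive and increasing")
    return elapsed * math.log(2) / math.log(size_end / size_start)


# Hypothetical example (not GenBank data): a fourfold increase over 2 years
# corresponds to a doubling time of about 1 year.
example_t = doubling_time(100.0, 400.0, 2.0)
print(example_t)
```

A constant doubling time across successive releases is what distinguishes exponential growth from merely fast growth.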
http://www.ncbi.nlm.nih.gov/Genbank/GenbankOverview.html
Human Genome
1990: DOE and NIH provide $3 billion for the 15-year HGP
2 major goals:
1) Map the human genome as well as several model organisms
Yeast
Drosophila
Arabidopsis thaliana
2) Sequence the entire genomes of these species
Human Genome Project (HGP) Celera Genomics
Founded in 1998 by Applera Corporation and Dr. J. Venter
Primary goal: to sequence and assemble the human genome in 3 years
2 Types:
Genetic Linkage Maps: based on marker recombination frequency (inheritance)
Physical Linkage Maps: splicing together matching overlapping regions
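A genetic linkage map converts recombination frequency between two markers into a distance. The sketch below uses the common approximation that 1% recombination corresponds to about 1 centimorgan (cM), which only holds for closely linked markers. The function names and the offspring counts are hypothetical.

```python
def recombination_frequency(recombinant, total):
    """Fraction of offspring in which the two markers were separated."""
    if total <= 0 or not 0 <= recombinant <= total:
        raise ValueError("need 0 <= recombinant <= total and total > 0")
    return recombinant / total


def map_distance_cm(recombinant, total):
    """Approximate genetic distance in centimorgans (1% recombination ~ 1 cM).

    Assumption: markers are close enough that double crossovers are rare.
    """
    return 100.0 * recombination_frequency(recombinant, total)


# Hypothetical cross: 12 recombinant offspring out of 200 scored.
distance = map_distance_cm(12, 200)
print(f"{distance:.1f} cM")
```

For widely separated markers a mapping function that corrects for multiple crossovers would be needed. That is outside this sketch.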
HGP Mapping
http://www.web-books.com/MoBio/Free/Ch8D1.htm
HGP Mapping
Resolution
HGP: From Map to Sequencing
MAP to produce highest resolution Genomic map
Oversampling is essential
Clone-by-Clone Technique
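Why oversampling matters can be illustrated with a simple coverage estimate. The sketch below assumes reads land uniformly at random on the target, so the number of reads over any base is roughly Poisson distributed. This is an idealized model in the style of Lander and Waterman, not a description of the HGP pipeline. Function names are illustrative.

```python
import math


def coverage(num_reads, read_length, target_size):
    """Average number of reads covering each base (the 'x' in 5x, 10x)."""
    return num_reads * read_length / target_size


def expected_fraction_uncovered(c):
    """Poisson estimate of the fraction of bases hit by zero reads: e^-c.

    Assumption: reads are placed uniformly at random.
    """
    return math.exp(-c)


for c in (1, 5, 10):
    print(f"{c:>2}x coverage: {expected_fraction_uncovered(c):.5%} of bases expected unsequenced")
```

Under this model, 1x coverage leaves more than a third of the bases unread, while 10x leaves only a tiny fraction. That gap is the reason for sampling the same region many times.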
http://www.ornl.gov/TechResources/Human_Genome/publicat/tko/04a_img.html
Cloning
B-galactosidase
http://images.google.ca/imgres?imgurl=www.blc.arizona.edu/courses/181gh
HGP Sequencing
Automated Sanger method- Chain Termination/Shotgun
Sequence, read, assemble
Capillary Electrophoresis
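The "assemble" step of shotgun sequencing can be illustrated with a toy greedy merge of overlapping reads. This is a minimal sketch under strong assumptions: error-free reads, a single strand, no repeats. It is not the assembler used by the HGP or Celera, and all names and reads below are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more: leave separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical reads sampled from one short sequence.
contigs = greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"])
print(contigs)
```

Real genomes contain long repeats, which is where a greedy merge like this one goes wrong and where mapped clones or paired reads become important.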
http://gnn.tigr.org/whats_a_genome/Chp2_2.shtml
Celera Sequencing
Celera’s Method- Whole Genome Shotgun
Genes VII, Benjamin Lewin
Assembly
HGP Versus Celera
Whole Genome Shotgun
(homologous DNA) Clone by Clone
labs/countries
Two genomes differ at about 1 in 1250 bases
HGP: 0.65% unidentified bases, 2.84 Gbp
Celera: 8.7% unidentified bases, 2.66 Gbp
Errors
Maximum error rate for HGP: 1 in 10 kb
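The rates quoted on these slides translate into absolute counts as follows. The sketch assumes the 2.84 Gbp figure belongs to the HGP assembly and reuses the 3200 Mb genome size given earlier. Both pairings are assumptions read off the slide layout.

```python
# Assumed inputs, taken from the slides above.
HGP_ASSEMBLY_BP = 2_840_000_000   # assumed to be the HGP assembly size
GENOME_BP = 3_200_000_000         # 3200 Mb genome size
ERROR_RATE_BP = 10_000            # at most 1 error per 10 kb
DIFFERENCE_RATE_BP = 1_250        # two genomes differ at about 1 base in 1250

# Upper bound on sequencing errors at the maximum allowed rate.
max_errors = HGP_ASSEMBLY_BP // ERROR_RATE_BP

# Approximate number of positions at which two genomes differ.
expected_differences = GENOME_BP // DIFFERENCE_RATE_BP

print(f"Errors allowed across the assembly: {max_errors:,}")
print(f"Differences between two genomes:   {expected_differences:,}")
```

On these assumptions the true differences between two people outnumber the allowed sequencing errors by roughly nine to one.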
Error Correction Sources of Error
Genes VII, Benjamin Lewin
New high-volume sequencing techniques
DNA Chips
Pyrosequencing
http://www.pyrosequencing.se/
http://www.ornl.gov/TechResources/Human_Genome/posters/chromosome/chromo20.html
Genome to Protein
Finding Genes
Genes can be identified in genome sequence from their common structure (open reading frame)
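A minimal sketch of open reading frame (ORF) detection follows. It assumes the simplest definition: an ATG start codon followed in the same frame by a stop codon, on the forward strand only. Real gene finders also handle the reverse strand, introns and statistical signals. The function name and test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for each ATG...stop ORF.

    Scans the three forward reading frames. `min_codons` counts the start
    and stop codons. `end` is exclusive, so dna[start:end] is the ORF.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None
    return orfs


# Hypothetical sequence with one complete ORF in the third frame.
orfs = find_orfs("CCATGAAATGGTAACC")
print(orfs)
```

In practice a much larger `min_codons` is used, because short ORFs occur by chance in any sequence.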
Finding AA Sequence
The amino acid (AA) sequence can be determined from the triplet code
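Reading a coding sequence three bases at a time gives the amino acid sequence. The sketch below assumes the standard genetic code and a sequence that already starts at the first codon. The function name is illustrative, and unknown codons are reported as "X" by choice.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence into one-letter amino acids.

    Stops at the first stop codon. Accepts DNA or RNA letters.
    """
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # "X" for unrecognized codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGAAATGGTAA"))
```

Some organelles and organisms use variant codes, so the table is an assumption that should be checked for the genome in question.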
Gene Function
Determining Gene Function
Recombinant DNA
Knockout
Directed Mutagenesis
Comparative (interspecies, between mutant and wild type)
Gene Expression (Spatial and Temporal)
Gene Libraries
Data Mining
RNAi
Examples
Genetic Diseases
known disease genes
DNA Micro Arrays
http://www.doegenomestolife.org
Gene Function-Micro array
DNA from BAC
versus non-cancerous cells)
* The same method could be used for spatial, temporal or environmental mRNA expression studies
* Most traditional gene expression analysis (gene fusion with a label) can
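Microarray comparisons of this kind are often summarized as a log2 ratio of the two signals for each gene. The sketch below assumes background-corrected intensities are already available. The gene names, intensities and pseudocount are hypothetical.

```python
import math


def log2_ratios(sample, reference, pseudocount=1.0):
    """Per-gene log2(sample / reference) for genes present in both inputs.

    The pseudocount avoids division by zero for spots with no signal.
    """
    return {
        gene: math.log2((sample[gene] + pseudocount) / (reference[gene] + pseudocount))
        for gene in sample
        if gene in reference
    }


# Hypothetical spot intensities for cancerous and non-cancerous cells.
cancer = {"geneA": 799.0, "geneB": 99.0}
normal = {"geneA": 99.0, "geneB": 399.0}

ratios = log2_ratios(cancer, normal)
for gene, value in ratios.items():
    print(f"{gene}: {value:+.1f}")
```

A positive value means higher expression in the first sample, a negative value lower expression. Each unit is a twofold change.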
Gene Function- Recombinant DNA
http://www.biologie.uni-hamburg.de/b-online/e34/1.htm
Gene Function- RNAi
Make dsRNA from a gene library, soak nematodes in it and look for impaired function
http://www.bio-itworld.com/images/0402_news_RNAi_L.gif
Protein Structure
Primary Secondary Tertiary
http://folding.stanford.edu/education/Assets/
Protein Function (Proteomics)
Determining Protein Function
Knockout
Labeling (localization)
Comparative
Phylogenetic profiling: homology beyond the AA sequence (evolutionary and hereditary)
Proteins are no longer considered simply to have a function; they exist in context as a network, forming connections with related homologous proteins
Modeling (folding)
90%+ of mouse genes have a sequence match in the human genome
Easier to experiment with simpler model organisms
BLAST: a computer program that identifies homologous genes in different organisms
40% of predicted human genes share similarity with fruit flies or roundworms
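Sequence-similarity searches typically begin by finding short words shared between two sequences. The sketch below shows only that seeding idea on toy strings. It is not BLAST and omits extension, scoring and statistics. The function name, word size and sequences are hypothetical.

```python
from collections import defaultdict


def shared_words(query, subject, k=4):
    """Return (query_pos, subject_pos) pairs where the same k-letter word occurs."""
    index = defaultdict(list)
    for j in range(len(subject) - k + 1):
        index[subject[j:j + k]].append(j)

    hits = []
    for i in range(len(query) - k + 1):
        for j in index.get(query[i:i + k], []):
            hits.append((i, j))
    return hits


# Toy example: consecutive hits on one diagonal suggest a shared region.
hits = shared_words("GATTACA", "TTGATTAC", k=4)
print(hits)
```

Runs of hits with a constant offset between the two positions mark candidate regions that a real tool would then extend and score.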
Goals
critical life functions
that control these machines
complex microbial communities in their natural environments
integrate and understand these data and begin to model complex biological systems (Systems Biology)
Major advances in bioinformatics, computing and large-scale automation will be needed to meet these goals
http://www.ornl.gov/TechResources/Human_Genome/graphics/slides/
Examining the relationship of SNP variations (and haplotypes) with disease
Nearly 1.5 million SNP variations found in HGP
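A SNP is a single-base difference at a given position, so the simplest way to list candidates is to compare two already-aligned sequences position by position. The sketch below assumes equal-length, gap-free sequences. The function name and sequences are hypothetical.

```python
def snp_positions(seq_a, seq_b):
    """Return (position, base_a, base_b) for each single-base difference.

    Assumption: the sequences are aligned, equal in length and gap-free.
    Positions are 0-based.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()))
        if a != b
    ]


snps = snp_positions("ACGTACGT", "ACGTTCGT")
print(snps)
```

Telling true SNPs apart from sequencing errors requires many reads or many individuals, which this sketch does not attempt.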
http://www.ornl.gov/TechResources/Human_Genome/graphics/slides/
Privacy
Risk Analysis
Public Safety
Insurance
Discrimination
Patents
Genes
Procedures
Ethics
Discrimination
Prenatal Diagnosis
Genetic Engineering
Designer Babies
Glossary
Bioinformatics: Typically refers to the field concerned with the collection and storage of biological information. All matters concerned with biological databases are considered bioinformatics.
Capillary Electrophoresis: A method for separating molecules extremely rapidly based on their electrophoretic mobility.
Comparative genomics: The analysis and comparison of genomes from different species. The nature and significance of differences between genomes also provide a powerful tool for determining the relationship between genotype and phenotype through comparative genomics and morphological and physiological studies.
Computational biology: Refers to the development of algorithms and statistical models necessary to analyze biological data with the aid of computers.
DNA Micro Arrays (DNA Chip): Array of DNA (cDNA) anchored on a substrate (often by robot), used for hybridization experiments.
Electrophoresis: Separation of ionic molecules by differential migration through a gel in an electrical field, according to the size and ionic charge of the molecules. High-resolution techniques normally use a gel support for the fluid phase.
Exon: The sequences of the primary RNA transcript (or the DNA that encodes them) that exit the nucleus as part of a messenger RNA molecule. In the primary transcript, neighboring exons are separated by introns.
Functional genomics: Determines the functions of genes and of non-coding parts of the genome. The sequencing of the human genome will leave many questions unanswered, including the function of most of the estimated 30,000-40,000 human genes.
Genetic Linkage Map: A map of the relative positions of genetic loci on a chromosome, determined on the basis of how often the loci are inherited together. Distance is measured in centimorgans (cM).
Genomics: The analysis of the structure and function of very large numbers of genes (often the entire genome), undertaken simultaneously.
Genotype: The set of genes that an individual carries; usually refers to the particular pair of alleles (alternative forms of a gene) that a person has at a given region of the genome.
Haplotype: The set of alleles on one chromosome or a part of a chromosome, i.e. one set of alleles of linked genes. Its main current usage is in connection with the linked genes of the major histocompatibility complex.
Homology: Two anatomical structures or behavioural traits within different organisms which originated from a structure or trait of their common ancestral organism.
Intron: A noncoding sequence of DNA within a gene that is transcribed into hnRNA but is then cut out of the message by RNA splicing in the nucleus, leaving a mature mRNA that is then translated in the cytoplasm.
Karyotype: The appearance of the chromosomes in a somatic cell of an individual or species, with reference to their number, size, shape, etc.
Knockout: Informal term for the generation of a mutant organism in which the function of a particular gene has been completely eliminated (a null allele).
Phenotype: The observable properties and physical characteristics of an organism (expression of genotype).
Phylogenetic: Relating to phylogenesis, or the evolutionary history of a type of organism.
Physical Linkage Map: A map of the locations of identifiable landmarks on chromosomes. Physical distance is measured in base pairs. The physical map differs from the genetic map, which is based purely on genetic linkage data. In the human genome, the lowest-resolution physical map is the banding patterns of the 24 different chromosomes.
Proteomics: The study and analysis of protein structure and function. Becoming quite an important science with the mapping of several genomes, including the human genome.
Pyrosequencing: Sequencing by synthesis, a simple-to-use technique for accurate and consistent analysis of DNA sequences.
Recombinant DNA: Spliced DNA formed from two or more different sources that have been cleaved by restriction enzymes and joined by ligases.
RNAi: dsRNA added to certain organisms inhibits expression of the gene whose sequence matches the added RNA.
Single Nucleotide Polymorphism (SNP): Variation in a genome based on the alteration of a single base pair.
Splicing: Process that removes introns from transcribed RNA.
Structural genomics: Includes the genetic mapping, physical mapping and sequencing of entire genomes.
General Web References
http://www.genomenewsnetwork.org/whats_a_genome/Chp1_1_1.shtml
http://www.ornl.gov/TechResources/Human_Genome/faq/compgen.html
http://www.jgi.doe.gov/education/genomics_1.html
http://www.yourgenome.org/intermediate/
http://www.ornl.gov/TechResources/Human_Genome/
http://www.genome.gov/
http://www.nature.com/genomics/human/
http://www.ornl.gov/TechResources/Human_Genome/project/benefits.html
http://www.sciencemag.org/content/vol291/issue5507/