SLIDE 1
Information Storage and Processing in Biological Systems: A seminar course for the Natural Sciences
Sept 16  Introduction / DNA, Gene regulation
Sept 18  Translation and Proteins
Sept 23  Enzymes and Signal transduction
Sept 25  Biochemical Networks
Sept 30  Simple Genetic Networks
Oct 2    Adventures in Multicellularity
Nov 6    Evolution, Evolvability and Robustness
SLIDE 2
Reading List for Part 1
- Chapters 1-3, “The Thread of Life”. S. Aldridge. Cambridge University Press. 1996.
- “Genes & Signals”. Mark Ptashne and Alexander Gann. (2002) CSHL Press.
- From molecular to modular cell biology. (1999) L. H. Hartwell, J. J. Hopfield, S. Leibler and A. W. Murray. Nature 402 (Suppl.): C47-C52.
- It’s a noisy business! Genetic regulation at the nanomolar scale. (1999) H. H. McAdams and A. Arkin. Trends in Genetics 15 (2), February 1999.
- The challenges of in silico biology. (2000) B. Palsson. Nature Biotechnology 18: 1147-1150.
SLIDE 3
What is “biological information” and how is it “stored” and “processed”?
M.C. Escher Spirals
SLIDE 4
What is “biological information”? Genetic
(DNA and RNA)
SLIDE 5
What is “biological information”? Genetic
(DNA and RNA)
Epigenetic
(DNA modification)
SLIDE 6
What is “biological information”? Genetic
(DNA and RNA)
Epigenetic
(DNA modification)
Non-Genetic Inheritance
(template-dependent replication; paragenetic)
SLIDE 7
Global patterning of organelles and cilia in Paramecium relies on paragenetic information and is template dependent. Another example is mad cow disease (a prion disease).
SLIDE 8
What is “biological information”? Genetic
(DNA and RNA)
Epigenetic
(DNA modification)
Non-Genetic Inheritance
(template dependent replication)
Physiological-Cellular Level (Structural/Metabolism/Signal Transduction)
SLIDE 9
Simplified Connectivity Map of Metabolism
Each node represents a chemical in the cell (E. coli). Each connection represents an enzymatic step or steps.
SLIDE 10
What is “biological information”? Genetic
(DNA and RNA)
Epigenetic
(DNA modification)
Non-Genetic Inheritance
(template dependent replication)
Physiological-Cellular Level
(Structural/Metabolism/Signal Transduction)
Physiological-Organism Level
(Structural/Metabolism/Signal Transduction, Development, Immune System)
SLIDE 11
What is “biological information”? Genetic
(DNA and RNA)
Epigenetic
(DNA modification)
Non-Genetic Inheritance
(template dependent replication)
Physiological-Cellular Level
(Structural/Metabolism/Signal Transduction)
Physiological-Organism Level
(Structural/Metabolism/Signal Transduction, Development, Immune System)
Populations
(Population dynamics, Evolution)
SLIDE 12
What is “biological information”? Genetic
(DNA and RNA)
Epigenetic
(DNA modification)
Non-Genetic Inheritance
(template dependent replication)
Physiological-Cellular Level
(Structural/Metabolism/Signal Transduction)
Physiological-Organism Level
(Structural/Metabolism/Signal Transduction, Development, Immune System)
Populations
(Population dynamics, Evolution)
Ecosystem
(Interacting Populations, environment ↔ populations)
SLIDE 13
The “Central Dogma”
The central dogma relates to the flow of ‘genetic’ information in biological systems.
DNA → (transcription) → mRNA → (translation) → Protein
DNA ↔ RNA → Protein
SLIDE 14
Overview of Biological Systems
Organization of the Tree of Life
Three evolutionary branches of life: Eubacteria, Archaebacteria, Eukaryotes.
The macroscopic world represents a small portion of the tree.
SLIDE 15
The Eubacteria (bacteria), Archaebacteria (archaea), and Eukaryotes represent three fundamental branches of life and show two fundamental differences in the organization of the cell.
Major Similarities:
- Genetic code
- Basic machinery for interpreting the code
Major Differences:
- Organization of genes
- Organization of the cell
  - sub-cellular organelles in Eukaryotes *
  - cytoskeletal structure in Eukaryotes **
- No true multicellular organization in bacteria and archaea (debatable). There are also many single-celled eukaryotes.
* compartmentalization of function
** morphologically distinct cell structure
SLIDE 16
Bacteria
Morphologically “simple”: shape is defined by cell surface structure. Transcription (reading the genetic message) and Translation (converting the genetic message into protein) are coupled: they take place within the same compartment (cytoplasm).
SLIDE 17
Compartmentalization of Function in eukaryotic cells
Transcription (reading the genetic message) and Translation (converting the genetic message into protein) occur in different compartments in the eukaryotic cell.
SLIDE 18
Examples of single-celled eukaryotic organisms
Morphological diversity (cytoskeleton as well as cell surface structures)
SLIDE 19
There are many distinct morphological cell types within a multicellular organism. Morphological diversity arises from cytoskeletal networks - architectural proteins
SLIDE 20
Some ‘Model’ Experimental Eukaryotic Organisms
- Caenorhabditis elegans (round worm)
- Saccharomyces cerevisiae
- Drosophila melanogaster (fruit fly)
- mouse
- Antirrhinum majus (snapdragons)
- Arabidopsis thaliana
- Zebrafish
SLIDE 21
Bacteriophage (Phage) and Viruses
1) genetic material / nucleic acid
2) protective coat protein
They carry the information for their own replication and the means to “target” the correct cell/host, but no interpretive machinery.
SLIDE 22
Genotype
The genetic constitution of an organism.
Phenotype
The appearance or other characteristic of an organism resulting from the interaction of its genetic constitution with the environment.
SLIDE 23 Constraints in Biological Systems
Chemical/Physical constraints
- stability of biological material
- reaction rates and diffusion rates
- properties of biochemical reactions (enzymes) differ from chemical reactions
- time dependency of many steps: time scales span many orders of magnitude for different steps
  - receptor-ligand binding: msec
  - biochemical response: sec
  - genetic response: minutes, hours, days
- statistical properties of ‘small-scale’ chemistry, i.e. where the concentration of reacting molecules is low
Evolutionary constraints
- a biological system is constrained by its own evolutionary history (and also ‘biological’ history)
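To make the ‘small-scale’ chemistry point concrete, the arithmetic below converts a molar concentration into a mean number of molecules per cell. It is a rough sketch: the cell volume of 1 femtolitre is an assumed order-of-magnitude figure for E. coli, not a value from the slides, and `molecules_per_cell` is a name chosen here for illustration.

```python
AVOGADRO = 6.022e23  # molecules per mole


def molecules_per_cell(conc_molar: float, volume_litres: float = 1e-15) -> float:
    """Mean number of molecules at a given molar concentration in one cell.

    The default volume (1 femtolitre) is an assumed, order-of-magnitude
    volume for an E. coli cell.
    """
    return conc_molar * volume_litres * AVOGADRO


if __name__ == "__main__":
    for label, conc in [("1 nM", 1e-9), ("10 nM", 1e-8), ("1 uM", 1e-6)]:
        print(f"{label}: about {molecules_per_cell(conc):.1f} molecules per cell")
```

Under this assumption a 1 nM concentration corresponds to less than one molecule per cell on average, which is why statistical fluctuations matter at this scale.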
SLIDE 24
“Alarm clock” from the movie Brazil Evolution of new functions is rarely de novo invention but is typically due to the modification of pre-existing functions/structures.
SLIDE 25 Modularity
- Is the cell/organism designed in a modular fashion?
- Can we approximate cell behavior into modules?
- Can interactions of cells, individuals, organisms be treated in a similar way?
Coarse graining
- At what level of detail do we need to study/model a system to extract information about the underlying mechanisms?
- What level of detail is required to define the “state” of the cell, the individual, the population and ecosystem…?
- Can we define the “state” of the cell or only “states” of modules?
SLIDE 26 Stochastic variations and Individuality
- What is the source of stochastic variation (independent of genetic variation)?
- In genetically identical populations, does this play a role in adaptation?
- What role do stochastic processes play in development?
Robustness
- Despite stochastic variations, many cellular processes are extremely robust
(genetic networks, biochemical networks, cell divisions, development,…)
- How does the cell overcome the limitations imposed by stochastic variations?
- Where does robustness arise? Is it a network property?
SLIDE 27 Redundancy
- Many biological processes are duplicated so that the same function is present in multiple elements. Mutations (changes in genotype) may have no apparent phenotype or one that is less severe than expected.
- Many biological systems are degenerate: the same outcome can be reached by alternative pathways.
Complexity
“the whole is greater than the sum of its parts.”
SLIDE 28
Genotype → Phenotype
Can we understand the mechanisms and processes that shape the expression of genetic variation in phenotypes?
SLIDE 29
The Natural History of Dictyostelium discoideum
Adventures in Multicellularity The social amoeba (a.k.a. slime molds)
SLIDE 32
DNA Basics
Four bases:
A - adenine
T - thymine
C - cytosine
G - guanine
Anti-parallel double-stranded structure with specific bonding between the two strands:
A - T base pairing
C - G base pairing
SLIDE 33 DNA Structure
[Figure: ladder of paired bases: A - T, C - G, G - C, A - T, T - A, G - C, G - C, G - C, T - A]
- DNA is composed of two strands
- Each strand is composed of a sugar phosphate backbone with one of four bases attached to each sugar
- The arrangement of bases along a strand is aperiodic
- The two strands are arranged anti-parallel
- There is base-specific pairing between the strands such that A pairs with T and C with G; consequently, knowing the sequence of one strand gives us the sequence of the other
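The pairing rule can be written as a small lookup, which shows how one strand determines its partner. This is a minimal sketch: the strand ACGATGGGT is read off the left column of the ladder on this slide, and treating it as written 5’ to 3’ is an assumption.

```python
# Watson-Crick pairing rule: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Partner base at each position, in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Partner strand written 5'->3', since the two strands are anti-parallel."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    top = "ACGATGGGT"  # assumed to be written 5'->3'
    print("top strand        :", top)
    print("paired bases      :", complement(top))
    print("partner, 5'->3'   :", reverse_complement(top))
```

Applying `reverse_complement` twice returns the original strand, which is another way of saying that each strand carries the full sequence information.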
SLIDE 34
Chemical Structure of DNA The Double Helix
SLIDE 35 DNA Replication
- Template copying
- Semi-conservative
[Figure: the parental duplex (strands ACGATGGGT and TGCTACCCA) separates into single strands; each strand templates a new complementary strand, giving two daughter duplexes identical to the parent.]
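Template copying and semi-conservative replication can be sketched in a few lines. In this illustration newly synthesised strands are written in lower case so that the parental strand in each daughter duplex stays visible; the lower-case convention and the function names are choices made here, not part of the slides.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Partner base at each position (position-aligned, not reversed)."""
    return "".join(PAIR[base] for base in strand)


def replicate(duplex: tuple[str, str]) -> tuple[tuple[str, str], tuple[str, str]]:
    """Return two daughter duplexes from one parental duplex.

    Each daughter keeps one parental strand (upper case) and gains one
    newly copied strand (lower case): semi-conservative replication.
    """
    top, bottom = duplex
    daughter_1 = (top, complement(top).lower())
    daughter_2 = (complement(bottom).lower(), bottom)
    return daughter_1, daughter_2


if __name__ == "__main__":
    parent = ("ACGATGGGT", "TGCTACCCA")  # position-aligned strands from the slide
    for daughter in replicate(parent):
        print(daughter)
```

Ignoring case, both daughters carry the same sequence as the parent.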
SLIDE 36 The Genetic Code – Triplet Code
- directional (always read 5’ → 3’)
- each triplet of bases codes one amino acid (Codon)
- degenerate (many AA have more than one codon)
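The degeneracy of the code can be counted directly. The sketch below builds the standard genetic code from a compact 64-letter string (bases ordered T, C, A, G at each codon position); this table is brought in as an assumption, since the slides do not reproduce the codon table itself.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, codons ordered TTT, TTC, TTA, ...
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Number of codons that specify each amino acid ("*" marks a stop codon).
codons_per_aa = Counter(CODON_TABLE.values())

if __name__ == "__main__":
    for aa, count in sorted(codons_per_aa.items(), key=lambda item: -item[1]):
        print(aa, count)
```

With 64 triplets and only 20 amino acids plus stop, most amino acids are necessarily specified by more than one codon; in this table only methionine (M) and tryptophan (W) have a single codon.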
SLIDE 37
For a given sequence there are three possible reading frames
DNA contains information about the start and end of the gene, as well as when, or whether, to transcribe that information.
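The three reading frames of a single strand can be listed by shifting the starting position by zero, one or two bases. The example below is a sketch; it uses the first 20 bases shown at position 5051 on the annotated sequence slide, and `reading_frames` is a helper name chosen here.

```python
def reading_frames(seq: str) -> dict[int, list[str]]:
    """Split one strand into complete codons for each of its three frames."""
    return {
        offset + 1: [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        for offset in range(3)
    }


if __name__ == "__main__":
    seq = "ATATGCAATTCCTGCAGTTT"
    for frame, codons in reading_frames(seq).items():
        print(f"frame {frame}: {' '.join(codons)}")
```

Only the third frame of this fragment begins with ATG, the usual start codon, which illustrates why the correct frame has to be specified by information in the sequence itself.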
SLIDE 38 DNA as an information molecule
- DNA sequence itself
- DNA sequence as a code for protein (sequence/properties of the protein)
- DNA sequence as controlling elements and recognition sites for cellular machinery
- DNA secondary structure and chemical modifications (e.g. methylation)
- genetic networks from multiple controlling elements and recognition sites with multiple genes and feedback and/or feedforward systems
SLIDE 39
[Annotated sequence: both DNA strands, with translations in the +1 and -2 reading frames]

5001  CATAAACCGG GGTTAATTTA AATACTGGAA CCGCTTACCA ATAAGACTAA
      GTATTTGGCC CCAATTAAAT TTATGACCTT GGCGAATGGT TATTCTGATT

      → ? gene start
+1    MetGlnPhe LeuGlnPhe PhePheArgGln ArgGlnLeu PheIleAla
5051  ATATGCAATT CCTGCAGTTT TTCTTTCGGC AGCGCCAGCT CTTTATTGCT
      TATACGTTAA GGACGTCAAA AAGAAAGCCG TCGCGGTCGA GAAATAACGA
-2    leHisLeuGlu GlnLeuLys GluLysProLeu AlaLeuGlu LysAsnSer

+1    hrProAspArg ArgArgLeu HisProGlyMet IleAspCys GluAlaIle
5501  CCCCGGACCG CCGGCGCTTG CATCCGGGTA TGATCGACTG CGAAGCTATC
      GGGGCCTGGC GGCCGCGAAC GTAGGCCCAT ACTAGCTGAC GCTTCGATAG
-2    lyArgValAla ProAlaGln MetArgThrHis AspValAla PheSerAsp

+1    *** end of ? gene
5551  TAATAATGGC ATTTAGTCAC CTCCGATAAT TTTTTAAAAA TAAACTGAAC
      ATTATTACCG TAAATCAGTG GAGGCTATTA AAAAATTTTT ATTTGACTTG
      ← luxS start
SLIDE 40
Two ways of thinking about “information” in DNA
1) DNA has sequence information which is TRANSCRIBED into RNA (i.e. it is a template) and TRANSLATED from RNA into protein (Genetic Code).

DNA      5’---CTCAGCGTTACCAT---3’
         3’---GAGTCGCAATGGTA---5’
              (Transcription)
RNA      5’---CUCAGCGUUACCAU---3’
              (Translation)
PROTEIN  N---Leu-Ser-Val-Thr---C

- In RNA, T’s are replaced by U’s
- Some gene products are RNA, i.e. they are not translated (e.g. tRNA, rRNA)
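The example on this slide can be reproduced in code. The sketch below transcribes the coding strand by replacing T with U and translates the RNA codon by codon using the standard genetic code; the compact codon table is an assumption brought in from outside the slides. As on the slide, translation simply starts at the first base of the fragment rather than at a start codon.

```python
from itertools import product

BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code keyed by DNA codon ("*" marks a stop codon).
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}


def transcribe(coding_strand: str) -> str:
    """mRNA has the sequence of the coding strand with U in place of T."""
    return coding_strand.replace("T", "U")


def translate(mrna: str) -> str:
    """Read complete codons 5'->3' from the first base; stop at a stop codon."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return "-".join(residues)


if __name__ == "__main__":
    mrna = transcribe("CTCAGCGTTACCAT")
    print(mrna)
    print(translate(mrna))
```

The trailing two bases (AU) do not form a complete codon and are ignored, matching the four residues shown on the slide.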
SLIDE 41
Two ways of thinking about “information” in DNA
2) DNA has sequence information at a structural level. This form of information directs the ‘interpretative machinery’ in the cell (protein complexes), in most instances through binding sites for proteins. This type of ‘information’ is important, for example, in determining where (along a sequence of DNA) and when a gene may be turned on, in the initiation of DNA replication, in the packaging of DNA, etc., i.e. regulation.
SLIDE 42
DNA Transcription Machinery
The Basic Transcription Components (Bacterial)
Promoter - binding site for RNA polymerase; defines where the process will begin.
RNA Polymerase - α2ββ′ holoenzyme, σ factor
[Figure label: start]
SLIDE 43
Promoter Binding → Open Complex Formation → Promoter Clearance
Messenger RNA (mRNA)
SLIDE 44
Regulation of Gene Expression: The Basics
Transcriptional Regulators are proteins that act to modulate gene expression. Proteins that negatively regulate expression (i.e. decrease transcription) are called Repressors and those that act positively (i.e. increase transcription of a gene) are called Activators. These proteins act by binding at specific DNA sites and modulating RNA polymerase function. These binding sites are called operators.
[Figure labels: start, promoter]
SLIDE 45
[Figure labels: start, X, Repressor]
Repression can be viewed as a competition for binding between the polymerase and the repressor (an oversimplification).
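The competition picture can be turned into a simple equilibrium calculation. The sketch below assumes that polymerase and repressor bind the promoter region in a mutually exclusive way, each with its own dissociation constant; this two-competitor model and all parameter names are illustrative assumptions, not something stated in the slides.

```python
def polymerase_occupancy(p: float, r: float, k_p: float, k_r: float) -> float:
    """Equilibrium probability that the promoter is bound by polymerase.

    p, r     : free concentrations of polymerase and repressor
    k_p, k_r : their dissociation constants (same units as p and r)
    Binding is assumed mutually exclusive, so the promoter is either
    empty, polymerase-bound, or repressor-bound.
    """
    weight_p = p / k_p
    weight_r = r / k_r
    return weight_p / (1.0 + weight_p + weight_r)


if __name__ == "__main__":
    for repressor in (0.0, 1.0, 10.0, 100.0):
        occ = polymerase_occupancy(p=1.0, r=repressor, k_p=1.0, k_r=1.0)
        print(f"repressor = {repressor:6.1f} x Kd  ->  polymerase occupancy = {occ:.3f}")
```

In this model, with polymerase at its own dissociation constant, occupancy falls from 0.5 with no repressor to 1/12 when the repressor is at ten times its dissociation constant.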
SLIDE 46
[Figure labels: start, promoter, Activator]
An Activator promotes RNA polymerase binding activity through direct protein-protein interactions (an oversimplification).
SLIDE 47
- Any DNA-binding protein with an appropriately placed binding site can act as a repressor. Activation requires a specific protein-protein interaction between the activator and RNA polymerase.
- Typically, bacterial promoters are regulated by a few proteins at most and the control regions tend to be quite small.
- Eukaryotic gene regulatory regions can be very large and involve many transcriptional regulators.
- Activation and repression depend on positioning of operator sites.
- Multiple inputs can be integrated at the level of gene expression.
SLIDE 48
Consensus Binding Sites
The interaction of a DNA-binding protein (such as RNA polymerase or a transcriptional regulator) with DNA depends on the ‘affinity’ of the protein for the binding site. The extent of binding will vary under different physiological conditions as the concentration of the protein changes, and it also depends on the binding site itself. The optimal binding site is usually close to the consensus sequence for that site, obtained by aligning all the known binding sites. One can thus have a range of ‘activity’ at different promoters/operators by having differences in DNA binding sites.

E. coli Promoters
             -35 box   spacer   -10 box
Consensus    TTGACA    N17      TATAAT
Examples     TTGATA    N16      TATAAT
             TTCCAA    N17      TATACT
             TGTACA    N19      CATAAT
             TTGATC    N17      TACTAT
             TTGACA    N17      TAGCTT
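One simple way to compare the example promoters with the consensus is to count mismatched positions in the -35 and -10 boxes and note the difference in spacer length. This is only a rough sketch: a mismatch count is a crude stand-in for binding affinity, and the function names are chosen here for illustration.

```python
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"
CONSENSUS_SPACER = 17

# (-35 box, spacer length, -10 box) for the example promoters on this slide.
PROMOTERS = [
    ("TTGATA", 16, "TATAAT"),
    ("TTCCAA", 17, "TATACT"),
    ("TGTACA", 19, "CATAAT"),
    ("TTGATC", 17, "TACTAT"),
    ("TTGACA", 17, "TAGCTT"),
]


def mismatches(site: str, consensus: str) -> int:
    """Number of positions at which a site differs from the consensus."""
    return sum(a != b for a, b in zip(site, consensus))


def total_mismatches(box_35: str, box_10: str) -> int:
    """Combined mismatch count over the -35 and -10 boxes."""
    return mismatches(box_35, CONSENSUS_35) + mismatches(box_10, CONSENSUS_10)


if __name__ == "__main__":
    for box_35, spacer, box_10 in PROMOTERS:
        print(
            f"{box_35}-N{spacer}-{box_10}: "
            f"{total_mismatches(box_35, box_10)} mismatches, "
            f"spacer differs from consensus by {spacer - CONSENSUS_SPACER:+d}"
        )
```

None of the five examples matches the consensus exactly in both boxes and spacer length, consistent with the idea that promoters span a range of activities.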
SLIDE 49
“Activity” of Transcriptional Regulators in Response to ‘Signals’
Case 1. Affinity of the protein for DNA may be modified by binding a ‘ligand’ (allosteric mechanism).
Case 2. Affinity of the protein may be affected by covalent modification such as phosphorylation.
Both of these mechanisms (ligand binding and post-translational modification) are common themes in the regulation of proteins, not just in transcription control.
[Figure labels: DNA, R, R-DNA, x, Rx, Rx-DNA]
SLIDE 50
Regulation of Gene Expression
DNA: RNA polymerase binding, open complex formation, transcription
mRNA: mRNA stability, translation
Protein: polypeptide folding, protein stability
Both positive and negative regulation can occur at any step in this process.
SLIDE 51 General Principles of Regulation of Gene Expression
- Regulation occurs through recruitment, or prevention of recruitment, of the transcription machinery.
- Repressors typically prevent recruitment of polymerase.
- Activators increase recruitment of polymerase.
- Multiple inputs from different transcription factors (TFs) can be integrated or compete.
- Protein-DNA interactions (TF, RNAP) can have different affinities, i.e. the same protein can act differently at different promoters at the same level of activity.
SLIDE 52 Eukaryotic Gene Expression
- the same principles but added complexity
SLIDE 53 Eukaryotic Gene Expression
- the same principles but added complexity
The ‘simple’ RNA polymerase is replaced by a large transcription complex (as many as 50 proteins).
SLIDE 54 Eukaryotic Gene Expression
- the same principles but added complexity
Regulatory regions that are relatively compact in bacteria are spread over larger regions in eukaryotes, with more transcription factors and more inputs / signal integration.
[Figure labels: O1, O3, O2, CAP]
e.g. E. coli lac: ~250 bp, 2 inputs
Drosophila eve stripe 2 enhancer: >1000 bp, multiple TFs
SLIDE 55 Eukaryotic Gene Expression
- the same principles but added complexity
The added regulatory components increase the potential complexity of gene regulation in eukaryotic cells.
Organism complexity ∝ number of genes
Organism complexity ∝ regulatory elements
SLIDE 56
Organism                      # of genes
Mycoplasma genitalium         750
Escherichia coli              5000
Pseudomonas aeruginosa        6000
Saccharomyces cerevisiae      6000
Caenorhabditis elegans        19,000
Drosophila melanogaster       15,000
Homo sapiens (man)            40,000