Systems Biology
- Structural Biology
- Dr. Shaila C. Rössle
Our life is maintained by molecular network systems
Molecular network system in a cell
(From ExPASy Biochemical Pathways; http://www.expasy.org/cgi-bin/show_thumbnails.pl?2
Hiroaki Kitano, Science 2010
KEGG
[Figure: KEGG pathway map; nodes are labelled with enzyme EC numbers such as 1.2.1.24, 1.4.1.13, 1.4.1.3 and 2.6.1.1]
within the cell
controlling their rates
– Catalysis:
Almost all chemical reactions in a living cell are catalyzed by protein enzymes.
– Transport:
Some proteins transport various substances, such as oxygen and ions.
– Information transfer:
For example, hormones.
Alcohol dehydrogenase oxidizes alcohols to aldehydes or ketones
Haemoglobin carries oxygen
Insulin controls the amount of sugar in the blood
Amino group Carboxylic acid group
acids in length
Charged amino acids Polar but uncharged Special cases hydrophobic
From the book: “DNA: The Secret of Life” by James Watson and Andrew Berry
5’ 3’
AUG AUG UAA RIBOSOME
N N N N C
mRNA
The start codon AUG codes for methionine and can occur several times in the mRNA sequence.
Initiation, Elongation, Termination
In prokaryotes: a special sequence (Shine-Dalgarno) identifies the starting AUG (multiple proteins on the same mRNA).
In eukaryotes: it is the first AUG sequence starting from the 5’ terminus (only one protein for each mRNA).
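The eukaryotic rule above (start at the first AUG from the 5’ end, read codon by codon, stop at a stop codon) can be sketched in a few lines of Python. This is a simplified illustration, not a model of real initiation: the function name `translate_from_first_aug` is hypothetical, the standard genetic code is assumed, and the Shine-Dalgarno mechanism of prokaryotes is not handled.

```python
from itertools import product

# Standard genetic code, codons ordered U, C, A, G at each position.
# '*' marks the stop codons UAA, UAG and UGA.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_first_aug(mrna: str) -> str:
    """Translate an mRNA (5'->3') starting at the first AUG.

    Returns the protein in one-letter code, or "" if there is no AUG.
    Translation stops at the first in-frame stop codon or when fewer
    than three bases remain.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # termination
            break
        protein.append(amino_acid)  # elongation
    return "".join(protein)


# The second AUG lies after the stop codon and is never reached.
print(translate_from_first_aug("GGAUGGCUUAAAUG"))
```

Internal AUG codons that follow the start are read as ordinary methionine codons, which is why the start codon can appear several times in one message.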
Molten globule
ψ (psi), φ (phi), ω (omega); peptide bond
Secondary structures, α-helix and β-sheet, have regular hydrogen-bonding patterns.
α-helix β-sheet
antiparallel β-strands (they form the binding sites of antibodies).
enzymes
Pairwise sequence comparison of proteins led to strange results
that can be made from the 20 natural aa’s
regulation
Zn Cys Cys Cys Cys
The zinc finger
Proteins (or segments of proteins) that lack a well-structured three-dimensional fold are referred to as “natively denatured/unfolded” or “intrinsically unstructured/unfolded”.
About 35-51% of proteins have unstructured regions longer than 50 residues; 6-17% of the proteins in Swiss-Prot are probably fully disordered, as determined by neural-network predictors (based on the protein sequence).
Predicted α-helices in free peptide Experimentally determined α-helices in complex
KID domain of CREB pKID bound to KIX domain
protein).
Hif1α peptide bound to the TAZ1 domain of the CREB-binding protein. Here the peptide forms an α-helix.
Hif1α peptide bound to asparagine hydroxylase. Here the peptide binds in an extended conformation.
Haemoglobin Enzyme - HIV-1 protease K+ channel Transport protein
Helix-turn-Helix: a basic nucleic acid binding structure. This motif (green on left) and the exact relationship between the helices is conserved from bacteria to man. H T H
[Figure: example of an enzyme reaction: enzyme and substrates A and B; matching the shape to A, binding to A, digestion. Also labelled: hormone receptor, antibody.]
HIV gp120 / CD4 / FAB
PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.
PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.
The Cell
A molecule contains a cavity exactly complementary in shape to a protruding group of another molecule
Wikipedia
In a leucine zipper DNA-binding protein the two helices are held together by hydrophobic interactions between, mainly, leucine sidechains
The Cell
responsible for DNA binding according to Duval et al (2002)
Glaser et al. (2003) Bioinformatics 19:163-164
Cytokines: secreted proteins that regulate cellular function.
OmpF porin is a non-specific transport channel that allows for the passive diffusion of small, polar molecules (600-700 Da in size) through the cell's
nutrients as well as waste products (Cowan et al., 1995).
– Chou-Fasman method
– GOR method
– Machine learning

– Ab-initio
– Comparative/homology modeling
– Threading
Advances in Artificial Intelligence, 2010
The coordinates of all the known structures of proteins can be found in the Protein Data Bank: http://www.rcsb.org/pdb
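Coordinate files in the classic PDB text format store one atom per fixed-width ATOM (or HETATM) line. The sketch below shows how such lines could be read, assuming the standard column layout (atom name in columns 13-16, residue name in 18-20, chain in 22, residue number in 23-26, x/y/z in 31-54). The names `Atom`, `parse_atom_line`, `read_atoms` and `format_atom_line` are illustrative, and the formatter only builds demonstration lines; it does not reproduce every detail of the official format.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    name: str
    residue: str
    chain: str
    residue_number: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one fixed-width ATOM/HETATM record (0-based slices)."""
    return Atom(
        name=line[12:16].strip(),
        residue=line[17:20].strip(),
        chain=line[21],
        residue_number=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


def read_atoms(pdb_text: str) -> list[Atom]:
    """Collect all atoms from the text of a PDB-format file."""
    return [
        parse_atom_line(line)
        for line in pdb_text.splitlines()
        if line.startswith(("ATOM", "HETATM"))
    ]


def format_atom_line(serial: int, atom: Atom) -> str:
    """Build a minimal demo ATOM line (coordinates only, no B-factors)."""
    return (
        f"ATOM  {serial:5d} {atom.name:<4s} {atom.residue:>3s} "
        f"{atom.chain}{atom.residue_number:4d}    "
        f"{atom.x:8.3f}{atom.y:8.3f}{atom.z:8.3f}"
    )


# Round trip on a made-up alpha carbon (coordinates are not from a real entry).
demo = Atom("CA", "MET", "A", 1, 27.34, 24.43, 2.614)
print(read_atoms(format_atom_line(1, demo)))
```

Slicing by column rather than splitting on whitespace matters because adjacent fields in real files can touch each other without a separating space.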
CLUSTAL W (1.7) multiple sequence alignment

Human-Zcr MATGQKLMRAVRVFEFGGPEVLKLRSDIAVPIPKDHQVLIKVHACGVNPVETYIRSGTYS
Ecoli-QOR ------MATRIEFHKHGGPEVLQA-VEFTPADPAENEIQVENKAIGINFIDTYIRSGLYP
          : :...:.******: ::: . * :::: :: :* *:* ::****** *.

Human-Zcr RKPLLPYTPGSDVAGVIEAVGDNASAFKKGDRVFTSSTISGGYAEYALAADHTVYKLPEK
Ecoli-QOR -PPSLPSGLGTEAAGIVSKVGSGVKHIKAGDRVVYAQSALGAYSSVHNIIADKAAILPAA
          * ** *::.**::. **.... :* ****. :.: *.*:. ... **

Human-Zcr LDFKQGAAIGIPYFTAYRALIHSACVKAGESVLVHGASGGVGLAACQIARAYGLKILGTA
Ecoli-QOR ISFEQAAASFLKGLTVYYLLRKTYEIKPDEQFLFHAAAGGVGLIACQWAKALGAKLIGTV
          :.*:*.** : :*.* * :: :*..*..*.*.*:***** *** *:* * *::**.

Human-Zcr GTEEGQKIVLQNGAHEVFNHREVNYIDKIKKYVGEKGIDIIIEMLANVNLSKDLSLLSHG
Ecoli-QOR GTAQKAQSALKAGAWQVINYREEDLVERLKEITGGKKVRVVYDSVGRDTWERSLDCLQRR
          ** : : .*: ** :*:*:** : ::::*: .* * : :: : :.. . .:.*. *.:

Human-Zcr GRVIVVG-SRGTIEINPRDTMAKES----SIIGVTLFSSTKEEFQQYAAALQAGMEIGWL
Ecoli-QOR GLMVSFGNSSGAVTGVNLGILNQKGSLYVTRPSLQGYITTREELTEASNELFSLIASGVI
          * :: .* * *:: . : ::. : .: : :*:**: : : * : : * :

Human-Zcr KPVIGSQ--YPLEKVAEAHENIIHGSGATGKMILLL
Ecoli-QOR KVDVAEQQKYPLKDAQRAHE-ILESRATQGSSLLIP
          * :..* ***:.. .*** *:.. .: *. :*:
Asterisks indicate identical residues and dots indicate conservative substitutions
Human zeta crystallin vs E. coli quinone oxidoreductase
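The fraction of identical residues in a pairwise alignment such as the one above can be computed directly from the two gapped rows. The sketch below is one possible definition, not the one CLUSTAL W reports: it assumes that columns containing a gap in either sequence are excluded from the denominator (other conventions divide by the alignment length or by the shorter sequence), and the function name `percent_identity` is illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over the gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if residue_a == residue_b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Last block of the alignment (Human-Zcr vs Ecoli-QOR).
human = "KPVIGSQ--YPLEKVAEAHENIIHGSGATGKMILLL"
ecoli = "KVDVAEQQKYPLKDAQRAHE-ILESRATQGSSLLIP"
print(f"{percent_identity(human, ecoli):.1f}% identity in this block")
```

A value computed this way describes sequence identity only; it says nothing by itself about whether two proteins share a common ancestor.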
Homologs: proteins that evolved from the same common ancestor; they almost always share the same function and adopt the same three-dimensional structure (they may have less than 20% sequence identity, i.e. distant homologs). Homology ≠ sequence identity.
Analogs: proteins that adopt the same three-dimensional fold but have less than 20% sequence identity (they did not evolve from a common ancestor).
Paralogs: homologous proteins within the same species (gene duplication) that may perform different functions. They have very similar three-dimensional structures. In paralogous proteins that perform distinct functions only
Orthologs: homologous proteins that have similar biological functions in different species. They have very similar three-dimensional structures but may have quite different amino acid sequences (they may show functional adaptations and modifications in some species). Residues important for structure and/or function are conserved.
requires a small volume of concentrated protein solution that is placed in a strong magnetic field. The NMR method is especially useful when a protein of interest has resisted attempts at crystallization, a common problem for many membrane proteins. Because NMR studies are performed in solution, this method also offers a convenient means of monitoring changes in protein structure, for example during protein folding or when a substrate binds to the protein. NMR is also used widely to investigate molecules other than proteins and is valuable, for example, as a method to determine the three-dimensional structures of RNA molecules and the complex carbohydrate side chains of glycoproteins.
protein.
groups depending on the sidechain chemistry. This defines where in the 3D protein structure a given aa is likely to be found.
defined by the dihedral angles φ and ψ, is hindered with certain values being preferred.
secondary through quaternary. You should know what each refers to.
multiple, independently folding domains that not only provide specific functions, but interact to add a further level of regulation to protein function.
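The dihedral angles φ and ψ mentioned in this summary are each defined by four consecutive backbone atoms: φ by C(i-1), N(i), Cα(i), C(i), and ψ by N(i), Cα(i), C(i), N(i+1). As a minimal sketch, the function below computes the signed dihedral angle for any four points in space; the name `dihedral` and the plain-tuple representation of coordinates are choices made for this illustration.

```python
import math

Vector = tuple[float, float, float]


def _sub(a: Vector, b: Vector) -> Vector:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vector, b: Vector) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vector, b: Vector) -> Vector:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0: Vector, p1: Vector, p2: Vector, p3: Vector) -> float:
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    length = math.sqrt(_dot(b1, b1))
    axis = (b1[0] / length, b1[1] / length, b1[2] / length)
    # Components of the outer bonds perpendicular to the central bond.
    scale0 = _dot(b0, axis)
    scale2 = _dot(b2, axis)
    v = _sub(b0, (scale0 * axis[0], scale0 * axis[1], scale0 * axis[2]))
    w = _sub(b2, (scale2 * axis[0], scale2 * axis[1], scale2 * axis[2]))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


# Toy geometry: the fourth point is rotated a quarter turn about the z axis.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applied to the backbone atoms of every residue, this yields the (φ, ψ) pairs whose allowed combinations are restricted by steric hindrance.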
Tertiary structure Quaternary structure
[Figure: descendants of the sequence GATTACCA arising by substitution, insertion and deletion; sequences shown: GATGACCA, GATCATCA, GATTGATCA, GATTATCA, and GAT ACCA with a deleted T]
The term homology implies a common ancestry, which may be inferred from observations of sequence similarity.
Derivation from a common ancestor through incremental change due to DNA replication errors, mutations, damage, or unequal crossing-over.
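One simple way to quantify how far two sequences have drifted apart through substitutions, insertions and deletions is the edit (Levenshtein) distance: the smallest number of single-character changes that turns one sequence into the other. The sketch below is a textbook dynamic-programming version with unit costs for all three operations; it is an illustration only, since real alignment tools use substitution matrices and gap penalties instead.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions and deletions from a to b."""
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


ancestor = "GATTACCA"
for descendant in ("GATGACCA", "GATACCA", "GATTGATCA"):
    print(descendant, edit_distance(ancestor, descendant))
```

A small distance between two sequences is the kind of similarity from which common ancestry may be inferred, although similarity alone does not prove it.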
SIGNAL PEPTIDES DOMAINS Ca_BINDING GPI ANCHORS GLYCOSYLATION SITES PHYSICO-CHEMICAL PROPERTIES
DNA_BINDING MOTIF
TRANSMEMBRANE ZN_FINGER MOTIFS SIMILAR REGIONS REPEAT REGIONS HYDROPHOBICITY SOLVENT ACCESSIBILITY
Methods Sequence Identity with known structures
CPU time to model Quality of model & Loop modeling Errors in the sequence alignment Detection of homology
the remote homology has to be detected; Q and T have to be aligned correctly; homology modeling procedure has to be tailored to the harder problem of extremely low sequence identity.
the Swedish chemist Jöns Jakob Berzelius in 1838. Early nutritional scientists such as the German Carl von Voit believed that protein was the most important nutrient for maintaining the structure of the body, because it was generally believed that "flesh makes flesh." The central role of proteins as enzymes in living organisms was however not fully appreciated until 1926, when James B. Sumner showed that the enzyme urease was in fact a protein. The first protein to be sequenced was insulin, by Frederick Sanger, who won the Nobel Prize for this achievement in 1958. The first protein structures to be solved were hemoglobin and myoglobin, by Max Perutz and Sir John Cowdery Kendrew, respectively, in 1958. The three-dimensional structures of both proteins were first determined by X-ray diffraction analysis; Perutz and Kendrew shared the 1962 Nobel Prize in Chemistry for these discoveries.
Proteins may be purified from other cellular components using a variety of techniques such as ultracentrifugation, precipitation, electrophoresis, and chromatography; the advent of genetic engineering has made possible a number of methods to facilitate purification. Methods commonly used to study protein structure and function include immunohistochemistry, site-directed mutagenesis, nuclear magnetic resonance and mass spectrometry.