Systems Biology: Structural Biology, Dr. Shaila C. Rössle (PowerPoint presentation)



slide-1
SLIDE 1

Systems Biology

  • Structural Biology
  • Dr. Shaila C. Rössle


slide-2
SLIDE 2

Our life is maintained by molecular network systems

Molecular network system in a cell

(From ExPASy Biochemical Pathways; http://www.expasy.org/cgi-bin/show_thumbnails.pl?2)

slide-3
SLIDE 3

Systems Biology

Hiroaki Kitano, Science 2010

slide-4
SLIDE 4

Glutamate Metabolism

KEGG

slide-5
SLIDE 5

[Figure: KEGG glutamate metabolism pathway map. Enzymes are labelled by EC number: 1.2.1.24, 1.4.1.13, 1.4.1.3, 2.6.1.1, 2.6.1.2, 6.3.2.2, 6.3.2.3, 6.1.1.17, 1.5.1.12, 1.8.1.7, 4.1.1.15, 5.1.1.3, 6.3.5.5, 6.1.1.18, 6.3.5.2, 6.3.5.7, 6.3.5.1, 2.6.1.16, 2.3.1.4, 2.7.1.59, 2.6.1.19, 2.4.2.14, 3.5.1.3, 6.3.1.2, 2.7.2.2, 3.5.1.38]

slide-6
SLIDE 6

Proteins


slide-7
SLIDE 7

Proteins

Proteins have a variety of roles that they must fulfil:

  • They are the enzymes that rearrange chemical bonds
  • They carry signals to and from the outside of the cell, and within the cell
  • They transport small molecules
  • They form many of the cellular structures
  • They regulate cell processes, turning them on and off and controlling their rates

slide-8
SLIDE 8

Proteins play key roles in a living system

  • Three examples of protein functions

– Catalysis: Almost all chemical reactions in a living cell are catalyzed by protein enzymes. Example: alcohol dehydrogenase oxidizes alcohols to aldehydes or ketones.

– Transport: Some proteins transport various substances, such as oxygen and ions. Example: haemoglobin carries oxygen.

– Information transfer: For example, hormones. Insulin controls the amount of sugar in the blood.

slide-9
SLIDE 9

Proteins

A protein is a polymer of a fixed length, composition and structure made by a combination of the 20 naturally occurring amino acids.

slide-10
SLIDE 10

Amino acid: Basic unit of protein

[Figure: an amino acid. A central carbon (C) carries an amino group (NH3+), a carboxylic acid group (COO-), a hydrogen (H) and a side chain (R).]

Different side chains, R, determine the properties of the 20 amino acids.

slide-11
SLIDE 11

Proteins – amino acids

  • There are 20 different types of amino acids
  • Different sequences of amino acids fold into different 3D shapes
  • Proteins can range from fewer than 20 to more than 5000 amino acids in length
  • Each protein that an organism can produce is encoded in a piece of the DNA called a “gene”
  • The single-celled bacterium E. coli has about 4300 different genes
  • Humans are believed to have about 30,000 different genes
slide-12
SLIDE 12

[Figure: the 20 amino acids grouped by side chain: charged, polar but uncharged, special cases, hydrophobic]
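The four side-chain groups named on this slide can be turned into a small lookup table. The sketch below is illustrative only: the assignment of each residue to a group is an assumption based on common textbook figures (borderline residues such as His, Tyr and Cys are placed differently by different authors), and the names `GROUPS` and `group_composition` are hypothetical.

```python
from collections import Counter

# Assumed grouping by one-letter code; the slide's own figure is not available,
# and textbooks differ on borderline residues (e.g. His, Tyr, Cys).
GROUPS = {
    "charged": "DEKRH",
    "polar uncharged": "STNQ",
    "special cases": "CGP",
    "hydrophobic": "AVILMFYW",
}

# Invert the table: one-letter residue code -> group name.
RESIDUE_TO_GROUP = {aa: group for group, letters in GROUPS.items() for aa in letters}


def group_composition(sequence: str) -> Counter:
    """Count how many residues of a sequence fall into each side-chain group."""
    return Counter(RESIDUE_TO_GROUP.get(aa, "unknown") for aa in sequence.upper())


if __name__ == "__main__":
    # Short made-up peptide, used only to exercise the function.
    print(group_composition("DEKAG"))
```

A composition like this is a crude first indicator of where a segment is likely to sit in the folded protein: hydrophobic-rich stretches tend to be buried, charged and polar ones exposed.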

slide-13
SLIDE 13


From the book: “DNA: The Secret of Life” by James Watson and Andrew Berry

slide-14
SLIDE 14

Translation - detail

[Figure: a ribosome on an mRNA (5’ to 3’) that carries several AUG codons and a UAA stop codon; the stages are labelled Initiation, Elongation and Termination.]

The starting sequence AUG codes for methionine and is present several times in the mRNA sequence.

  • In prokaryotes: a special sequence (Shine-Dalgarno) identifies the starting AUG (multiple proteins on the same mRNA).
  • In eukaryotes: it is the first AUG sequence starting from the 5’ terminus (only one protein for each mRNA).

slide-15
SLIDE 15
Proteins

  • Proteins are key players in our living systems.
  • Each protein folds into a unique three-dimensional structure defined by its amino acid sequence.
  • Protein structure is closely related to its function.
  • Protein structure prediction is a grand challenge of computational biology.

slide-16
SLIDE 16

Protein Structure

slide-17
SLIDE 17

The levels of protein structure

Molten globule

slide-18
SLIDE 18

Each protein has a unique structure

Amino acid sequence: NLKTEWPELVGKSVEEAKKVILQDKPEAQIIVLPVGTIVTMEYRIDRVRLFVDKLDNIAEVPRVG

Folding!

slide-19
SLIDE 19


slide-20
SLIDE 20

Peptide bond: the φ (phi), ψ (psi) and ω (omega) angles
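φ, ψ and ω are dihedral (torsion) angles, each defined by four consecutive backbone atoms: φ by C(i-1), N, Cα, C; ψ by N, Cα, C, N(i+1); and ω by Cα, C, N(i+1), Cα(i+1). The sketch below computes a signed dihedral angle from four 3D points using the standard atan2 formulation. The function name `dihedral` and the test coordinates are illustrative assumptions, not taken from the slides.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points (cis = 0, trans = 180)."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Idealised geometry: the fourth point is rotated a quarter turn about the
    # central bond relative to the first point.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applying this function to every residue of a structure and plotting ψ against φ gives a Ramachandran plot.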

slide-21
SLIDE 21

The levels of protein structure

Molten globule

slide-22
SLIDE 22


Secondary structures, α-helix and β-sheet, have regular hydrogen-bonding patterns.

slide-23
SLIDE 23

Supersecondary structure

α-helix β-sheet

slide-24
SLIDE 24

Ramachandran plot


slide-25
SLIDE 25

Loops and Turns

  • Connect elements of secondary structure. Hairpin loops connect two antiparallel β-strands (they form the binding sites of antibodies).
  • Various sizes and irregular shapes
  • Located on the protein surface (rich in polar and charged residues)
  • Frequently take part in forming binding sites and enzyme active sites
  • An important modelling problem (they have preferred structures).
slide-26
SLIDE 26

The levels of protein structure

Molten globule

slide-27
SLIDE 27

Tertiary structure


slide-28
SLIDE 28


slide-29
SLIDE 29

Protein domains

Pairwise sequence comparison of proteins led to strange results

  • A domain is an independent folding unit
  • A domain is the next step up in complexity from a motif
  • There appear to be a limited number of folds (domains) that can be made from the 20 natural aa’s
  • A domain is a unit of evolution
  • Mixing and matching can create new function and regulation
  • Most proteins involved in cell signalling consist exclusively of small domains interspersed by linker regions.
slide-30
SLIDE 30

Database of protein domains – Search tools

  • Prosite - expasy.org/prosite/
  • Smart - smart.embl-heidelberg.de
  • ProDom - prodom.prabi.fr/
  • InterPro - www.ebi.ac.uk/interpro/
  • Pfam - pfam.sanger.ac.uk/

slide-31
SLIDE 31

Tertiary structure. DNA-binding proteins.

[Figure: the zinc finger, in which a Zn ion is coordinated by four Cys residues]

slide-32
SLIDE 32
Disordered Regions

  • Not all proteins are structured: Intrinsically Unstructured Proteins

These are proteins (or segments of proteins) that lack a well-structured three-dimensional fold. They are referred to as “natively denatured/unfolded” or “intrinsically unstructured/unfolded”. About 35-51% of proteins have unstructured regions longer than 50 residues, and 6-17% of the proteins in Swiss-Prot are probably fully disordered, as determined by neural network predictors (based on the protein sequence).

slide-33
SLIDE 33

Coupling of folding to target binding

[Figure: the KID domain of CREB, and pKID bound to the KIX domain of CBP (CREB binding protein). Predicted α-helices in the free peptide are compared with experimentally determined α-helices in the complex.]

  • Can provide tighter binding than similar sized, folded proteins
  • Enthalpy-entropy compensation.
  • Allows post-translational modification.

slide-34
SLIDE 34

Unstructured proteins can adopt multiple structures upon target binding: they are “plastic”

Hif1α peptide bound to the TAZ1 domain of the CREB binding protein. Here the peptide forms an α-helix.

Hif1α peptide bound to asparagine hydroxylase. Here the peptide binds in an extended conformation.

slide-35
SLIDE 35

The levels of protein structure

Molten globule

slide-36
SLIDE 36

Quaternary structures.

[Figure: examples of quaternary structure: haemoglobin (transport protein), HIV-1 protease (enzyme), K+ channel]

slide-37
SLIDE 37

Peptidase

slide-38
SLIDE 38

A few examples of common structural motifs

Helix-turn-helix (HTH): a basic nucleic acid binding structure. This motif (green on left) and the exact relationship between the helices are conserved from bacteria to man.

slide-39
SLIDE 39

Structural Classification of Proteins

  • SCOP
  • CATH


slide-40
SLIDE 40

Close relationship between protein structure and its function

[Figure: example of an enzyme reaction. The enzyme matches the shape of substrate A (but not B), binds A and digests it. Hormone receptors and antibodies are shown as further examples of shape matching.]

slide-41
SLIDE 41

Molecular Interaction types

HIV gp120 / CD4 / FAB

PD Kwong, R Wyatt, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (1998) Nature 393, 648-659.

PD Kwong, R Wyatt, S Majeed, J Robinson, RW Sweet, J Sodroski & WA Hendrickson (2000) Structure 8, 1329-1339.

  • Internal
  • Domain-domain
  • Homo-obligomer
  • Homo-oligomer
  • Hetero-oligomer
slide-42
SLIDE 42

How Proteins interact


The Cell

slide-43
SLIDE 43

Interactions forces


slide-44
SLIDE 44

Example - van der Waals force


A molecule contains a cavity exactly complementary in shape to a protruding group of another molecule

slide-45
SLIDE 45


slide-46
SLIDE 46


Wikipedia

slide-47
SLIDE 47
  • Inhibitor for anti-apoptotic protein


In a leucine zipper DNA-binding protein the two helices are held together by hydrophobic interactions between, mainly, leucine sidechains

slide-48
SLIDE 48


The Cell

slide-49
SLIDE 49


responsible for DNA binding according to Duval et al (2002)

slide-50
SLIDE 50


slide-51
SLIDE 51


Glaser et al. (2003) Bioinformatics 19:163-164

slide-52
SLIDE 52

Some 4 helix bundle proteins

Cytokines: secreted proteins that regulate cellular function.

slide-53
SLIDE 53

The β-barrel

OmpF porin is a non-specific transport channel that allows for the passive diffusion of small, polar molecules (600-700 Da in size) through the cell's outer membrane. Such molecules include water, ions, glucose, and other nutrients as well as waste products (Cowan et al., 1995).

slide-54
SLIDE 54

Analyzing Protein Structure and Function

  • Methods currently used to characterize protein structure and function
  • Techniques used to determine the three-dimensional structure of proteins
  • Methods that are used to predict how a protein functions, based on its homology to other known proteins
  • Methods to predict its location inside the cell
  • Techniques for detecting protein-protein interactions

slide-55
SLIDE 55
Methods currently used to characterize protein structure and function

  • Site-directed mutagenesis
  • Nuclear magnetic resonance
  • Mass spectrometry
  • Proteomics (2D electrophoresis)
  • Protein microarrays

slide-56
SLIDE 56

Techniques used to determine the three dimensional structure of proteins

  • X-ray crystallography
  • Nuclear Magnetic Resonance (NMR)
  • Circular dichroism
  • Dual polarisation interferometry
  • Cryo-electron microscopy

slide-57
SLIDE 57

Methods that are used to predict how a protein functions, based on its homology to other known proteins

  • Secondary structure

– Chou-Fasman method
– GOR method
– Machine learning

  • Tertiary structure

– Ab initio
– Comparative/homology modeling
– Threading

  • Quaternary structure

– Docking
slide-58
SLIDE 58

Techniques for detecting protein-protein interactions

  • Co-immunoprecipitations
  • Surface Plasmon Resonance
  • Nuclear Magnetic Resonance (NMR)
  • X-ray Crystallography
  • Label Transfer
  • FRET
  • Far Western Blot

Advances in Artificial Intelligence, 2010

slide-59
SLIDE 59

Methods to predict its location inside the cell

  • Genetic engineering
  • Immunohistochemistry
  • Immunofluorescence
  • Immunoelectron microscopy
  • Isopycnic centrifugation
  • Site-directed mutagenesis

slide-60
SLIDE 60


slide-61
SLIDE 61

The structure of proteins can be defined in a hierarchical way:

  • Primary structure: the AA sequence (Thr-Gly-Leu-Pro-…)
  • Secondary structure: local repetitive motifs common to most classes of protein structures
  • Tertiary structure: the 3D arrangement of the secondary structure motifs to form a compact protein
  • Quaternary structure: the arrangement of several protein units to form a functional multimeric structure

The coordinates of all the known structures of proteins can be found in the Protein Data Bank: http://www.rcsb.org/pdb

slide-62
SLIDE 62

CLUSTAL W (1.7) multiple sequence alignment

Human-Zcr MATGQKLMRAVRVFEFGGPEVLKLRSDIAVPIPKDHQVLIKVHACGVNPVETYIRSGTYS
Ecoli-QOR ------MATRIEFHKHGGPEVLQA-VEFTPADPAENEIQVENKAIGINFIDTYIRSGLYP
          : :...:.******: ::: . * :::: :: :* *:* ::****** *.

Human-Zcr RKPLLPYTPGSDVAGVIEAVGDNASAFKKGDRVFTSSTISGGYAEYALAADHTVYKLPEK
Ecoli-QOR -PPSLPSGLGTEAAGIVSKVGSGVKHIKAGDRVVYAQSALGAYSSVHNIIADKAAILPAA
          * ** *::.**::. **.... :* ****. :.: *.*:. ... **

Human-Zcr LDFKQGAAIGIPYFTAYRALIHSACVKAGESVLVHGASGGVGLAACQIARAYGLKILGTA
Ecoli-QOR ISFEQAAASFLKGLTVYYLLRKTYEIKPDEQFLFHAAAGGVGLIACQWAKALGAKLIGTV
          :.*:*.** : :*.* * :: :*..*..*.*.*:***** *** *:* * *::**.

Human-Zcr GTEEGQKIVLQNGAHEVFNHREVNYIDKIKKYVGEKGIDIIIEMLANVNLSKDLSLLSHG
Ecoli-QOR GTAQKAQSALKAGAWQVINYREEDLVERLKEITGGKKVRVVYDSVGRDTWERSLDCLQRR
          ** : : .*: ** :*:*:** : ::::*: .* * : :: : :.. . .:.*. *.:

Human-Zcr GRVIVVG-SRGTIEINPRDTMAKES----SIIGVTLFSSTKEEFQQYAAALQAGMEIGWL
Ecoli-QOR GLMVSFGNSSGAVTGVNLGILNQKGSLYVTRPSLQGYITTREELTEASNELFSLIASGVI
          * :: .* * *:: . : ::. : .: : :*:**: : : * : : * :

Human-Zcr KPVIGSQ--YPLEKVAEAHENIIHGSGATGKMILLL
Ecoli-QOR KVDVAEQQKYPLKDAQRAHE-ILESRATQGSSLLIP
          * :..* ***:.. .*** *:.. .: *. :*:

Stars indicate identical residues and dots indicate conservative substitutions.

Human zeta crystallin vs E. coli quinone oxidoreductase
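Percent sequence identity can be computed directly from a pairwise alignment such as the one above. The sketch below uses the first 60-column block of the alignment as input. It counts identical residues over the columns where neither sequence has a gap, which is only one of several conventions in use (others divide by the full alignment length or by the shorter sequence length). The function name `percent_identity` is hypothetical.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical residues over aligned columns without gaps."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns where both sequences have a residue.
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # First block of the Human-Zcr / Ecoli-QOR alignment shown on the slide.
    human = "MATGQKLMRAVRVFEFGGPEVLKLRSDIAVPIPKDHQVLIKVHACGVNPVETYIRSGTYS"
    ecoli = "------MATRIEFHKHGGPEVLQA-VEFTPADPAENEIQVENKAIGINFIDTYIRSGLYP"
    print(f"{percent_identity(human, ecoli):.1f}% identity over the first block")
```

A value for one block is not the identity of the whole alignment; concatenate all blocks before calling the function to get the overall figure.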
slide-63
SLIDE 63

Homologs: proteins that evolved from the same common ancestor. They almost always share the same function and adopt the same three-dimensional structure (they may have less than 20% sequence identity, i.e. distant homologs). Homology ≠ sequence identity.

Analogs: proteins that adopt the same three-dimensional fold but have less than 20% sequence identity (they did not evolve from a common ancestor).

Paralogs: homologous proteins within the same species (gene duplication) that may perform different functions. They have very similar three-dimensional structures. In paralogous proteins with distinct functions, only the residues important for the structure are conserved.

Orthologs: homologous proteins that have similar biological functions in different species. They have very similar three-dimensional structures but may have quite different amino acid sequences (they may show functional adaptations and modifications in some species). Residues important for structure and/or function are conserved.

slide-64
SLIDE 64


slide-65
SLIDE 65
X-ray crystallography

  • The main technique that has been used to discover the three-dimensional structure of molecules
  • Technique for determining the three-dimensional arrangement of atoms in a molecule based on the diffraction pattern of X-rays passing through a crystal of the molecule

slide-66
SLIDE 66

Nuclear Magnetic Resonance (NMR)

NMR requires a small volume of concentrated protein solution that is placed in a strong magnetic field. The NMR method is especially useful when a protein of interest has resisted attempts at crystallization, a common problem for many membrane proteins. Because NMR studies are performed in solution, this method also offers a convenient means of monitoring changes in protein structure, for example during protein folding or when a substrate binds to the protein. NMR is also used widely to investigate molecules other than proteins and is valuable, for example, as a method to determine the three-dimensional structures of RNA molecules and the complex carbohydrate side chains of glycoproteins.

slide-67
SLIDE 67

Introduction to Structural Biology

Goals of the lecture

  • To become familiar with protein structures and how these structures define the functional role of the protein.
  • To understand how and why proteins fold.

slide-68
SLIDE 68

Take-Home Lessons

  • Proteins are polymers of 20 naturally occurring L-amino acids (aa).
  • The sequence of aa’s defines the structure, and hence the function, of a protein.
  • The aa’s can be divided into hydrophobic, polar and charged groups depending on the sidechain chemistry. This defines where in the 3D protein structure a given aa is likely to be found.
  • Because of the sidechain, the rotation around the backbone bonds, defined by the dihedral angles φ and ψ, is hindered, with certain values being preferred.
  • Proteins fold into different levels of structure referred to as secondary through quaternary. You should know what each refers to.
  • Large proteins generally do not consist of one large structure but of multiple, independently folding domains that not only provide specific functions, but interact to add a further level of regulation to protein function.

slide-69
SLIDE 69

Amino acid sequence is encoded by DNA base sequence in a gene

[Figure: DNA molecule and its base sequence, shown as two complementary strands]

・CGCGAATTCGCG・
・GCGCTTAAGCGC・

slide-70
SLIDE 70

Three-dimensional structure of proteins

Tertiary structure Quaternary structure

slide-71
SLIDE 71

HOMOLOGY

Derivation from a common ancestor through incremental change due to DNA replication errors, mutations, damage, or unequal crossing-over.

[Figure: sequences derived from GATTACCA by substitution, insertion and deletion; the variants shown include GATGACCA, GATTATCA, GATCATCA, GATTGATCA and GAT-ACCA (T deleted).]

The term homology implies a common ancestry, which may be inferred from observations of sequence similarity.
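The three kinds of change named on this slide (substitution, insertion, deletion) are exactly the operations counted by the Levenshtein edit distance, so a small sketch can show how many such steps separate two sequences. This is a generic dynamic-programming implementation, not a method taken from the slides, and the name `edit_distance` is an assumption. Real homology detection uses scored alignments rather than raw edit counts.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions and deletions turning a into b."""
    # previous[j] holds the distance between the current prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + (char_a != char_b), # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    ancestor = "GATTACCA"
    for variant in ("GATGACCA", "GATACCA", "GATTGATCA"):
        print(variant, edit_distance(ancestor, variant))
```

A small distance between two sequences is evidence of similarity; whether that similarity reflects common ancestry is the separate inference the slide describes.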

slide-72
SLIDE 72

LEVEL OF FUNCTION INFORMATION IN PROTEIN SEQUENCES

  • SUPERFAMILY
  • FAMILY
  • DOMAIN
  • MOTIF
  • REGIONS
  • RESIDUE
  • SECONDARY STRUCTURE
  • 3D STRUCTURE

slide-73
SLIDE 73

Regions:

  • SIGNAL PEPTIDES
  • DOMAINS
  • Ca_BINDING
  • GPI ANCHORS
  • GLYCOSYLATION SITES
  • PHYSICO-CHEMICAL PROPERTIES
  • DNA_BINDING MOTIF
  • TRANSMEMBRANE
  • ZN_FINGER MOTIFS
  • SIMILAR REGIONS
  • REPEAT REGIONS
  • HYDROPHOBICITY
  • SOLVENT ACCESSIBILITY

slide-74
SLIDE 74

Why Modeling?

  • Experimental determination of structure is still a time-consuming and expensive process.
  • The number of known sequences exceeds the number of known structures.
  • Structure information is essential in understanding function.

slide-75
SLIDE 75

Sequence identities & Molecular Modeling methods

Method: sequence identity with known structures

  • Ab initio: 0-20%
  • Fold recognition: 20-35%
  • Homology modeling: >35%
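The thresholds on this slide amount to a simple decision rule. The sketch below encodes it as a function. How the boundary values (exactly 20% and exactly 35%) are assigned is an assumption, since the slide's ranges touch at 20%, and the name `suggest_method` is hypothetical. The cut-offs are rules of thumb rather than hard limits.

```python
def suggest_method(identity_percent: float) -> str:
    """Map sequence identity to a known structure onto a modeling approach."""
    if not 0 <= identity_percent <= 100:
        raise ValueError("identity must be between 0 and 100")
    if identity_percent > 35:
        return "homology modeling"
    if identity_percent >= 20:  # assumption: 20% itself counts as fold recognition
        return "fold recognition"
    return "ab initio"


if __name__ == "__main__":
    for identity in (12.0, 28.0, 60.0):
        print(f"{identity:.0f}% -> {suggest_method(identity)}")
```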

slide-76
SLIDE 76

What is Homology or Comparative Modeling?

  • Comparisons of the tertiary structures of homologous proteins have shown that 3-D structures have been better conserved during evolution than protein primary structures.
  • In the absence of experimental data, model-building on the basis of the known three-dimensional structure of homologous protein(s) is the only reliable method to obtain structural information.

slide-77
SLIDE 77

Difference between Homology and Similarity

  • Homology does not necessarily imply similarity.
  • Homology has a precise definition: having a common evolutionary origin.
  • Since homology is a qualitative description of the relationship, the term “% homology” has no meaning.
  • Supporting data for a homologous relationship may include sequence or structural similarities, which can be described in quantitative terms.

slide-78
SLIDE 78

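Sequence identity ∝ model accuracy

[Figure: sequence-identity scale (100%, 75%, 50%, 25%) with the rate-limiting factors in modeling marked along it]

Rate-limiting factors in modeling:

  • CPU time to model
  • Quality of model & loop modeling
  • Errors in the sequence alignment
  • Detection of homology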

slide-79
SLIDE 79

What is Remote Homology Modeling?

  • Modeling based on low levels of sequence identity (<30%).
  • Has 3 obstacles to overcome:

– the remote homology has to be detected;
– the query (Q) and template (T) have to be aligned correctly;
– the homology modeling procedure has to be tailored to the harder problem of extremely low sequence identity.

slide-80
SLIDE 80

Steps in Homology Modeling

template recognition alignment backbone generation generation of canonical loops (data based) side chain generation & optimisation model optimisation (energy minimisation) model verification Optional repeat of previous steps: Generating more

slide-81
SLIDE 81
Proteins were first described by the Dutch chemist Gerhardus Johannes Mulder and named by the Swedish chemist Jöns Jakob Berzelius in 1838. Early nutritional scientists such as the German Carl von Voit believed that protein was the most important nutrient for maintaining the structure of the body, because it was generally believed that "flesh makes flesh."[3] The central role of proteins as enzymes in living organisms was however not fully appreciated until 1926, when James B. Sumner showed that the enzyme urease was in fact a protein.[4] The first protein to be sequenced was insulin, by Frederick Sanger, who won the Nobel Prize for this achievement in 1958. The first protein structures to be solved were hemoglobin and myoglobin, by Max Perutz and Sir John Cowdery Kendrew, respectively, in 1958.[5][6] The three-dimensional structures of both proteins were first determined by x-ray diffraction analysis; Perutz and Kendrew shared the 1962 Nobel Prize in Chemistry for these discoveries.

Proteins may be purified from other cellular components using a variety of techniques such as ultracentrifugation, precipitation, electrophoresis, and chromatography; the advent of genetic engineering has made possible a number of methods to facilitate purification. Methods commonly used to study protein structure and function include immunohistochemistry, site-directed mutagenesis, nuclear magnetic resonance and mass spectrometry.