CSE182-L7 Protein structure Basics - PowerPoint PPT Presentation



SLIDE 1

CSE182-L7

Protein structure Basics
Protein sequencing via MS

SLIDE 2

Quiz

  • What research won the Nobel prize in Chemistry in 2004?
  • In 2002?

SLIDE 3

A structural view of proteins

SLIDE 4

CS view of a protein

  • >sp|P00974|BPT1_BOVIN Pancreatic trypsin inhibitor precursor (Basic protease inhibitor) (BPI) (BPTI) (Aprotinin) - Bos taurus (Bovine).
  • MKMSRLCLSVALLVLLGTLAASTPGCDT SNQAKAQRPDFCLEPPYTGPCKARIIRYF YNAKAGLCQTFVYGGCRAKRNNFKSAED CMRTCGGAIGPWENL
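As a hedged illustration of this "CS view", the record above can be treated as a header line plus a string over the amino-acid alphabet. The sketch below assumes the header follows the layout `>db|accession|entry_name description`; the function name `parse_fasta_record` is hypothetical and the description is shortened.

```python
from collections import Counter

# The record from the slide, with the description shortened for readability.
RECORD = """>sp|P00974|BPT1_BOVIN Pancreatic trypsin inhibitor precursor
MKMSRLCLSVALLVLLGTLAASTPGCDT
SNQAKAQRPDFCLEPPYTGPCKARIIRYF
YNAKAGLCQTFVYGGCRAKRNNFKSAED
CMRTCGGAIGPWENL"""


def parse_fasta_record(text):
    """Split a single FASTA record into header fields and one sequence string."""
    lines = text.strip().splitlines()
    if not lines or not lines[0].startswith(">"):
        raise ValueError("a FASTA record must start with '>'")
    header = lines[0][1:]
    # Assumed header layout: db|accession|entry_name, then free-text description.
    ident, _, description = header.partition(" ")
    db, accession, entry_name = ident.split("|")
    sequence = "".join(line.strip() for line in lines[1:])
    return {
        "db": db,
        "accession": accession,
        "entry_name": entry_name,
        "description": description,
        "sequence": sequence,
    }


record = parse_fasta_record(RECORD)
print(record["accession"], record["entry_name"], len(record["sequence"]))
# Residue composition: the protein is just a string, so counting is trivial.
print(Counter(record["sequence"]).most_common(3))
```

Once the protein is a plain string, the rest of the lecture (motif matching, digestion, fragment masses) reduces to string and arithmetic operations on it.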

SLIDE 5

Protein structure basics

SLIDE 6

Side chains determine amino-acid type

  • The residues may have different properties.
  • Aspartic acid (D) and Glutamic acid (E) are acidic residues.

SLIDE 7

Bond angles form structural constraints

SLIDE 8

Various constraints determine 3D structure

  • Constraints
      • Structural constraints due to physicochemical properties
      • Constraints due to bond angles
      • H-bond formation
  • Surprisingly, a few conformations are seen over and over again.
SLIDE 9

Alpha-helix

  • 3.6 residues per turn
  • H-bonds between residue i and residue i+4 (every fourth residue along the chain) stabilize the structure.
  • First discovered by Linus Pauling

SLIDE 10

Beta-sheet

  • Each strand by itself has 2 residues per turn, and is not stable.
  • Adjacent strands hydrogen-bond to form stable beta-sheets, parallel or anti-parallel.
  • Beta sheets have long-range interactions that stabilize the structure, while alpha-helices have local interactions.

SLIDE 11

Domains

  • The basic structures (helix, strand, loop) combine to form complex 3D structures.
  • Certain combinations are popular. Many sequences, but only a few folds.

SLIDE 12

3D structure

  • Predicting tertiary structure is an important problem in Bioinformatics.
  • Premise: Clues to structure can be found in the sequence.
  • While de novo tertiary structure prediction is hard, there are many intermediate and tractable goals.
  • The PDB database is a compendium of structures.

[Figure: PDB]

SLIDE 13

Protein Domains

  • An important realization (in the last decade) is that proteins have a modular architecture of domains/folds.
  • Example: The zinc finger domain is a DNA-binding domain.
  • What is a domain?
      • Part of a sequence that can fold independently, and is present in other sequences as well
SLIDE 14

Proteins containing zf domains

How can we find a motif corresponding to a zf domain?

SLIDE 15

Domain review

  • What is a domain?
  • How are domains expressed?
      • Motifs (Regular expression & others)
      • Multiple alignments
      • Profiles
      • Profile HMMs

SLIDE 16

Protein Domain databases

  • Motifs
      • PROSITE: Regular Expressions & Profiles
      • BLOCKS: Multiple Alignments
      • Pfam: HMMs

Prosite: http://us.expasy.org/prosite/
PFAM: http://www.sanger.ac.uk/Software/Pfam/
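To make the "motif as regular expression" idea concrete, here is a minimal sketch that converts a PROSITE-style pattern into a Python regular expression and scans a sequence with it. The pattern string is, to the best of my knowledge, the C2H2 zinc finger signature listed in PROSITE, but treat it as an illustrative assumption; the converter handles only a subset of the pattern syntax (`x`, `[..]`, `{..}`, and repeat counts), and both `prosite_to_regex` and the test sequence are hypothetical.

```python
import re

# Assumed PROSITE-style pattern for a C2H2 zinc finger (illustrative).
C2H2_PATTERN = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"


def prosite_to_regex(pattern):
    """Translate a subset of PROSITE pattern syntax into a Python regex."""
    parts = []
    for token in pattern.rstrip(".").split("-"):
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", token)
        if m is None:
            raise ValueError("cannot parse token: %r" % token)
        core, lo, hi = m.groups()
        if core == "x":                 # x = any residue
            core = "."
        elif core.startswith("{"):      # {ABC} = any residue except A, B, C
            core = "[^" + core[1:-1] + "]"
        # [ABC] and single letters are already valid regex fragments.
        if lo and hi:
            core += "{%s,%s}" % (lo, hi)
        elif lo:
            core += "{%s}" % lo
        parts.append(core)
    return "".join(parts)


regex = prosite_to_regex(C2H2_PATTERN)
# Hypothetical test sequence containing one zinc-finger-like stretch.
sequence = "YACPVESCDRRFSRSDELTRHIRIHTG"
match = re.search(regex, sequence)
print(regex)
if match:
    print(match.span(), match.group())
```

A regular expression is all-or-nothing: one unexpected residue and the motif is missed, which is one motivation for the profile and profile-HMM representations listed on the domain review slide.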

SLIDE 17
SLIDE 18
SLIDE 19

How are Proteins Sequenced? Mass Spec 101:

SLIDE 20

Nobel Citation 2002

SLIDE 21

Nobel Citation, 2002

SLIDE 22

Mass Spectrometry

SLIDE 23

Sample Preparation

Enzymatic Digestion (Trypsin) + Fractionation
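The digestion step can be mimicked in silico. The sketch below assumes the commonly used rule that trypsin cleaves after K or R unless the next residue is P; the slide itself does not state the rule, and `tryptic_digest` is a hypothetical helper name.

```python
def tryptic_digest(protein):
    """Split a protein string into peptides using a simple trypsin rule.

    Assumed rule: cleave on the C-terminal side of K or R,
    except when the following residue is P.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Whatever remains after the last cleavage site is the final peptide.
        peptides.append(protein[start:])
    return peptides


# Toy sequence: cleaves after the first K and the second R,
# but not after the R that is followed by P.
print(tryptic_digest("AAKGGRPCCRDD"))
```

Real digests also contain missed cleavages, so a practical tool would typically enumerate peptides spanning one or two uncleaved sites as well.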

SLIDE 24

Single Stage MS

Mass Spectrometry LC-MS: 1 MS spectrum / second

SLIDE 25

Tandem MS

Secondary Fragmentation

Ionized parent peptide

SLIDE 26

The peptide backbone

N-terminus  H...-HN-CH(R_{i-1})-CO-NH-CH(R_i)-CO-NH-CH(R_{i+1})-CO-…OH  C-terminus
(AA residue i-1, AA residue i, AA residue i+1)

The peptide backbone breaks to form fragments with characteristic masses.

SLIDE 27

Ionization

N-terminus  H...-HN-CH(R_{i-1})-CO-NH-CH(R_i)-CO-NH-CH(R_{i+1})-CO-…OH  C-terminus
(AA residue i-1, AA residue i, AA residue i+1)

The peptide backbone breaks to form fragments with characteristic masses.

Ionized parent peptide (carries H+)

SLIDE 28

Fragment ion generation

N-terminus  H...-HN-CH(R_{i-1})-CO   NH-CH(R_i)-CO-NH-CH(R_{i+1})-CO-…OH  C-terminus
(AA residue i-1, AA residue i, AA residue i+1; the backbone is broken between residues i-1 and i)

The peptide backbone breaks to form fragments with characteristic masses.

Ionized peptide fragment (carries H+)

SLIDE 29

Tandem MS for Peptide ID

Residue   y ions   b ions
K           147     1166
L           260     1020
E           389      907
D           504      778
E           633      663
E           762      534
L           875      405
F          1022      292
G          1080      145
S          1166       88

[Figure: tandem MS spectrum; x-axis m/z (ticks 100, 250, 500, 750, 1000), y-axis % Intensity; precursor peak labeled [M+2H]2+]

SLIDE 30

Peak Assignment

Residue   y ions   b ions
K           147     1166
L           260     1020
E           389      907
D           504      778
E           633      663
E           762      534
L           875      405
F          1022      292
G          1080      145
S          1166       88

[Figure: the same spectrum with peaks annotated y2 to y9, b3 to b9, and [M+2H]2+; x-axis m/z (ticks 100, 250, 500, 750, 1000), y-axis % Intensity]

Peak assignment implies Sequence (Residue tag) Reconstruction!
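The reconstruction claim can be checked with simple arithmetic: once peaks are assigned to consecutive b ions, the differences between neighbouring peaks are residue masses. The sketch below uses the integer b-ion values from the slide and a table of nominal (integer) residue masses that is my assumption, not part of the slide; `residue_tag` is a hypothetical name.

```python
# Nominal (integer) residue masses, assumed here for illustration.
NOMINAL_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def residue_tag(ladder):
    """Read residues off a ladder of consecutive fragment-ion masses.

    Each gap between neighbouring peaks is looked up in the mass table.
    Residues with equal nominal mass (I/L, K/Q) cannot be told apart,
    so they are reported together; unknown gaps are reported as '?'.
    """
    by_mass = {}
    for residue, mass in NOMINAL_MASS.items():
        by_mass.setdefault(mass, []).append(residue)
    tag = []
    for low, high in zip(ladder, ladder[1:]):
        candidates = by_mass.get(high - low)
        tag.append("/".join(sorted(candidates)) if candidates else "?")
    return tag


# b1..b9 as listed on the slide (read from the S row up to the L row).
b_ladder = [88, 145, 292, 405, 534, 663, 778, 907, 1020]
print(residue_tag(b_ladder))
```

With real data the ladder has gaps and noise peaks, so only short stretches (tags) are usually recoverable this way, which is why the slide speaks of a residue tag rather than the full sequence.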

SLIDE 31

Database Searching for peptide ID

  • For every peptide from a database
      • Generate a hypothetical spectrum
      • Compute a correlation between the hypothetical and experimental spectra
      • Choose the best
  • Database searching is very powerful and is the de facto standard for MS.
      • Sequest, Mascot, and many others
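The three-step loop above can be sketched in a few lines. This is a toy under stated assumptions, not how Sequest or Mascot score: the "correlation" here is a plain shared-peak count, the hypothetical spectrum contains only singly charged b and y ions (prefix mass + 1 and suffix mass + 19), the residue masses are standard monoisotopic values supplied by me, and the three-peptide database is invented.

```python
# Monoisotopic residue masses (assumed standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def theoretical_spectrum(peptide):
    """Hypothetical spectrum: b and y ions for every internal cleavage."""
    masses = [MONO[aa] for aa in peptide]
    total = sum(masses)
    peaks = []
    prefix = 0.0
    for mass in masses[:-1]:
        prefix += mass
        peaks.append(prefix + 1.0)            # b ion: prefix mass + 1
        peaks.append(total - prefix + 19.0)   # y ion: suffix mass + 19
    return sorted(peaks)


def shared_peak_count(observed, theoretical, tol=1.0):
    """Number of theoretical peaks with an observed peak within tol."""
    return sum(any(abs(o - t) <= tol for o in observed) for t in theoretical)


def best_match(observed, database, tol=1.0):
    """Score every candidate peptide and return the highest-scoring one."""
    return max(
        database,
        key=lambda pep: shared_peak_count(observed, theoretical_spectrum(pep), tol),
    )


# Observed peaks: the b and y values listed on the peptide ID slide.
observed = [88, 145, 292, 405, 534, 663, 778, 907, 1020,
            147, 260, 389, 504, 633, 762, 875, 1022, 1080]
database = ["AGFLEEDELK", "SGFLEDEELK", "SGFLEEDELK"]  # invented candidates
print(best_match(observed, database))
```

A shared-peak count ignores intensities and noise, so production tools replace it with more discriminating scores; the overall generate, score, and pick-the-best structure is the same.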

SLIDE 32

Spectra: the real story

  • Noise peaks
  • Ions, not prefixes & suffixes
  • Mass-to-charge ratio, and not mass
      • Multiply charged ions
  • Isotope patterns, not single peaks

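SLIDE 33

Peptide fragmentation possibilities (ion types)

[Figure: peptide backbone -HN-CH-CO-NH-CH-CO-NH- with side chain R_i and a branched side chain CH-R'R", showing cleavage sites labeled a_i, b_i, c_i and x_{n-i}, y_{n-i}, z_{n-i}, plus y_{n-i-1}, b_{i+1}, and d_{i+1}, v_{n-i}, w_{n-i}. Legend: low energy fragments, high energy fragments.]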

SLIDE 34

Ion types, and offsets

  • P = prefix residue mass
  • S = Suffix residue mass
  • b-ions = P+1
  • y-ions = S+19
  • a-ions = P-27
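These offsets are enough to generate the fragment masses of any peptide. The sketch below applies them to SGFLEEDELK, the peptide that the b and y values on the peptide ID slide appear to correspond to (my reading of that table). The monoisotopic residue masses are assumed standard values, and `fragment_ions` is a hypothetical helper.

```python
# Monoisotopic residue masses (assumed standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def fragment_ions(peptide):
    """Apply the slide's offsets: b = P + 1, y = S + 19, a = P - 27.

    P is the summed residue mass of a prefix, S that of a suffix.
    Index k in each list corresponds to a prefix (or suffix) of
    length k + 1; the full-length peptide is left out.
    """
    masses = [MONO[aa] for aa in peptide]
    n = len(masses)
    prefixes = [sum(masses[:k]) for k in range(1, n)]
    suffixes = [sum(masses[n - k:]) for k in range(1, n)]
    return {
        "b": [p + 1 for p in prefixes],
        "y": [s + 19 for s in suffixes],
        "a": [p - 27 for p in prefixes],
    }


ions = fragment_ions("SGFLEEDELK")
for k, (b, y) in enumerate(zip(ions["b"], ions["y"]), start=1):
    print("b%d = %7.2f   y%d = %7.2f" % (k, b, k, y))
```

Rounded to integers, the b and y values reproduce the numbers in the peptide ID table, which is a useful sanity check on the offsets.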

SLIDE 35

Mass-Charge ratio

  • The X-axis is (M+Z)/Z
      • Z=1 implies that peak is at M+1
      • Z=2 implies that peak is at (M+2)/2
  • M=1000, Z=2, peak position is at 501
      • Suppose you see a peak at 501. Is the mass 500, or is it 1000?
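The ambiguity in the last question is easy to see numerically. The sketch below uses the slide's simplification that each charge adds a mass of exactly 1; the function names are hypothetical.

```python
def peak_position(mass, charge):
    """Position on the x-axis: (M + Z) / Z, as on the slide."""
    return (mass + charge) / charge


def candidate_masses(peak, max_charge=3):
    """Invert (M + Z) / Z for each charge state: M = peak * Z - Z."""
    return {z: peak * z - z for z in range(1, max_charge + 1)}


print(peak_position(1000, 2))   # the slide's example
print(candidate_masses(501))    # one peak, several possible masses
```

A single peak at 501 is therefore consistent with mass 500 at Z=1 and mass 1000 at Z=2; extra information, such as the spacing of the isotopic peaks discussed on the following slides, is needed to decide.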

SLIDE 36

Isotopic peaks

  • Ex: Consider peptide SAM
  • Mass = 308.12802
  • You should see: [Figure: a single peak at 308.13]
  • Instead, you see: [Figure: a cluster of peaks, axis labels 308.13 and 310.13]

SLIDE 37

Isotopes

  • C-12 is the most common. Suppose C-13 occurs with probability 1%.
  • EX: SAM
      • Composition: C11 H22 N3 O5 S1
  • What is the probability that you will see a single C-13?

    $\binom{11}{1} \cdot 0.01 \cdot 0.99^{10} \approx 0.0995$

  • Note that C, S, O, N all have isotopes. Can you compute the isotopic distribution?
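The single-C-13 question is a binomial calculation, and the same formula gives the carbon-only part of the isotopic distribution. This is a minimal sketch using the 1% figure from the slide text and the 11 carbons of SAM; it ignores the isotopes of H, N, O and S, which a full answer would fold in by convolving the per-element distributions. `c13_distribution` is a hypothetical name.

```python
from math import comb


def c13_distribution(n_carbons, p13=0.01):
    """P(exactly k of the n carbons are C-13), for k = 0..n (binomial)."""
    return [
        comb(n_carbons, k) * p13 ** k * (1 - p13) ** (n_carbons - k)
        for k in range(n_carbons + 1)
    ]


dist = c13_distribution(11)
for k in range(3):
    # Each extra C-13 shifts the peak up by roughly one mass unit.
    print("P(%d x C-13) = %.4f" % (k, dist[k]))
```

Even for this small peptide about one molecule in ten carries a C-13, so the peak one unit above the monoisotopic mass is clearly visible, and it grows with peptide size.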

SLIDE 38

All atoms have isotopes

  • O16 & O18

SLIDE 39

Tandem MS summary

  • The basics of peptide ID using tandem MS are simple.
      • Correlate experimental with theoretical spectra
  • In practice, there might be many confounding problems.
  • A toolkit that resolves some of these problems will be useful.