Computational Systems Biology TUM WS 2010/11 Lecture 4: Protein Structure and Disorder in Complete Genomes



slide-1
SLIDE 1

Computational Systems Biology

TUM WS 2010/11

Lecture 4: Protein Structure and Disorder in Complete Genomes

2010-11-11

  • Dr. Arthur Dong
slide-2
SLIDE 2

How To Read A Paper

Focus: Technical details or the big picture? Within the paper:

  • What's the whole point, the take-home lesson?
  • Why did they do what they did? (historical perspective)
  • Any parts problematic and could be improved?
  • Expected versus unexpected

Go beyond the paper:

  • Observation – Question – Hypothesis – Investigation – Application
  • What's the next obvious step?
  • Can I apply the same ideas/techniques in other areas?

Turn any question into a project (and possibly a paper)!

slide-3
SLIDE 3
Proteins are the worker molecules in a cell

  • Catalysis:

Almost all chemical reactions in a living cell are catalyzed by protein enzymes. Example: alcohol dehydrogenase oxidizes alcohols to aldehydes or ketones.

  • Transport:

Some proteins transport substances such as oxygen and ions. Example: haemoglobin carries oxygen.

  • Information transfer:

For example, hormones: insulin controls the amount of sugar in the blood.

slide-4
SLIDE 4

Levels of Protein Structure

slide-5
SLIDE 5

Secondary structures, α-helix and β-sheet, have regular hydrogen-bonding patterns.

Sometimes we don't have a choice...

slide-6
SLIDE 6

Tertiary structure


slide-7
SLIDE 7

Protein Structure in Complete Genomes

1990s – The start of complete-genome sequencing

  • Sequencing and Assembly
  • Gene Prediction
  • Proteome – the “parts” list of all proteins (our starting point)

Early complete genomes:

  • H. influenzae – 1995 (bacteria)
  • M. jannaschii – 1996 (archaea)
  • S. cerevisiae – 1996 (eukarya)

Comparison of living organisms at different scales:

  • At the level of atoms and amino acids (physics and chemistry) they are all the same.
  • At the species level they are all different.
  • Find the happy medium – molecular biology (individual proteins etc.) and systems biology (the interaction of proteins etc.)

3 diverse organisms from 3 kingdoms of life

Expect significant differences in their genomes – what are those? What is actually similar?

  • Method – sequence analysis
  • Object – protein structure
  • Perspective – genome-wide, systems-level
slide-8
SLIDE 8

Compare secondary structures across genomes

  • The expected
  • The unexpected
  • Why? Possible explanations
slide-9
SLIDE 9

Comparison of super-secondary structures

slide-10
SLIDE 10

Protein Tertiary Structure: PDB and SCOP

  • PDB – depository of all solved structures (can be multi-domain or multi-protein)
  • SCOP – classification of domains/proteins by structural and evolutionary relatedness

SCOP hierarchy:

  • Family: homologs (evolutionarily related, >30% sequence identity, similar function)
  • Superfamily: likely homologs (low sequence identity but similar function)
  • Fold:
      • Similar tertiary structure – same secondary elements arranged in the same way in space
      • Difference mainly in flanking and connecting regions, e.g. loops/turns
      • Possibly no evolutionary relation and low sequence identity
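The family / superfamily / fold levels above can be sketched as a small lookup. This is a minimal sketch, assuming domains are labelled with SCOP-style dotted identifiers of the form class.fold.superfamily.family (for example `c.1.8.1`); the helper names `split_sccs` and `shared_level` are hypothetical and not part of any SCOP tool.

```python
from typing import NamedTuple


class ScopLevels(NamedTuple):
    fold: str
    superfamily: str
    family: str


def split_sccs(sccs: str) -> ScopLevels:
    """Split an identifier like 'c.1.8.1' into fold, superfamily and family keys."""
    parts = sccs.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected class.fold.superfamily.family, got {sccs!r}")
    return ScopLevels(
        fold=".".join(parts[:2]),         # class + fold number
        superfamily=".".join(parts[:3]),  # ... + superfamily number
        family=".".join(parts),           # full identifier
    )


def shared_level(a: str, b: str) -> str:
    """Return the most specific level two domains share, or 'none'."""
    la, lb = split_sccs(a), split_sccs(b)
    if la.family == lb.family:
        return "family"
    if la.superfamily == lb.superfamily:
        return "superfamily"
    if la.fold == lb.fold:
        return "fold"
    return "none"
```

Two domains that share only the fold key are structurally similar but, as the slide notes, not necessarily evolutionarily related.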

slide-11
SLIDE 11
slide-12
SLIDE 12

Folds across genomes

  • Bias → Structural Genomics
  • Ancient folds
  • Prevalence of mixed folds

slide-13
SLIDE 13

5 Most Common Folds Present in All 3 Genomes

  • Similar architecture!
  • Similar function (basic metabolism)
  • Why are they common? (evolution, folding energy, ...)
slide-14
SLIDE 14

Application: Whole-genome trees based on fold occurrence
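One way such a tree could be built is sketched below. The slide does not say which distance or clustering method was used, so the Jaccard distance on fold presence/absence and the average-linkage (UPGMA-style) clustering here are assumptions, and the genome names and fold sets in `toy_folds` are made up purely for illustration.

```python
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """1 - |shared folds| / |all folds seen in either genome|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def fold_tree(folds_by_genome: dict) -> tuple:
    """Average-linkage clustering; returns the tree as nested tuples."""
    # Each cluster is (subtree, tuple of member genome names).
    clusters = [(name, (name,)) for name in sorted(folds_by_genome)]

    def avg_dist(c1, c2):
        pairs = [(x, y) for x in c1[1] for y in c2[1]]
        total = sum(
            jaccard_distance(folds_by_genome[x], folds_by_genome[y])
            for x, y in pairs
        )
        return total / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Made-up fold sets, only to show the call.
toy_folds = {
    "genome_A": {"c.1", "c.37", "a.4"},
    "genome_B": {"c.1", "c.37", "a.4", "d.58"},
    "genome_C": {"c.1", "b.1"},
}
tree = fold_tree(toy_folds)
```

With the toy input, `genome_A` and `genome_B` share most folds and are joined first; `genome_C` attaches last.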

slide-15
SLIDE 15

Protein Disorder

What is protein disorder?

  • Not everything folds into compact 3D structure
  • Abundance of “floppy”, extended regions
  • Conformation ensemble rather than fixed structure

What is its function?

  • Coupled binding (“induced fit” rather than “lock-and-key”)
  • High specificity, low affinity (easily reversible)
  • Interaction with a large number of targets

Can you predict disorder from sequence?

  • Low sequence complexity
  • Amino acid compositional bias
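The two sequence signals listed above can be computed directly. This is a minimal sketch, not a disorder predictor: it scores low complexity as Shannon entropy in a sliding window and compositional bias as the fraction of a chosen residue set. The window size, the entropy threshold and the residue set passed in are arbitrary illustrative choices, not values from the lecture.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Entropy in bits of the residue distribution in a segment."""
    n = len(segment)
    counts = Counter(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(seq: str, window: int = 12, threshold: float = 2.2) -> list:
    """Start positions of windows whose entropy falls below the threshold."""
    return [
        i
        for i in range(len(seq) - window + 1)
        if shannon_entropy(seq[i:i + window]) < threshold
    ]


def residue_fraction(seq: str, residues: str) -> float:
    """Fraction of the sequence made up of the given residue types."""
    return sum(seq.count(r) for r in residues) / len(seq)


# A serine repeat followed by a varied stretch (made-up sequence).
example = "SSSSSSSSSSSS" + "ACDEFGHIKLMN"
flagged = low_complexity_windows(example)
```

Windows covering the serine repeat are flagged, while the window over the twelve distinct residues is not. A real predictor would combine such features with many others.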
slide-16
SLIDE 16

Coupling of folding to target binding

Predicted α-helices in free peptide

Experimentally determined α-helices in complex

  • Can provide tighter binding than similarly sized, folded proteins.
  • Enthalpy-Entropy compensation.
  • Allows post-translational modification.
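The enthalpy-entropy compensation bullet can be read through the standard binding free-energy relation. The split of the entropy into a folding and a remaining term is an illustrative assumption, not taken from the slide:

```latex
\Delta G^{\circ}_{\text{bind}} = \Delta H^{\circ}_{\text{bind}} - T\,\Delta S^{\circ}_{\text{bind}},
\qquad
K_d = c^{\circ}\exp\!\left(\frac{\Delta G^{\circ}_{\text{bind}}}{RT}\right),
\qquad
\Delta S^{\circ}_{\text{bind}} \approx \Delta S_{\text{folding}} + \Delta S_{\text{other}}
```

When a disordered region folds on binding, $\Delta S_{\text{folding}} < 0$, so $-T\,\Delta S_{\text{folding}}$ is an unfavourable contribution; it can be offset by a favourable (negative) $\Delta H^{\circ}_{\text{bind}}$ from the contacts formed at the interface.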

KID domain of CREB

pKID bound to KIX domain of CBP (CREB binding protein).

slide-17
SLIDE 17
slide-18
SLIDE 18

Protein Disorder in Complete Genomes

Which kinds of proteins tend to be disordered?

slide-19
SLIDE 19

Gene Ontology – A Unifying Vocabulary Across Organisms

slide-20
SLIDE 20

Clustering of Genes – mRNA versus GO

slide-21
SLIDE 21

GO: Molecular Function

Un/expected?

slide-22
SLIDE 22

GO: Cellular Component

Consistent?

slide-23
SLIDE 23

Some obvious questions:

  • Is disorder conserved?
  • Do disordered proteins have more protein interactions?