Specificity of Protein-DNA recognition of a long DNA binding motif



slide-1
SLIDE 1

Specificity of Protein-DNA recognition of a long DNA binding motif

Francisco Melo Ledermann

Molecular Genetics and Microbiology Department, School of Biological Sciences, Pontificia Universidad Católica de Chile, Santiago, Chile.

http://melolab.org

EMBO Global Exchange Lecture Course: Structural and biophysical methods for biological macromolecules in solution
Pontificia Universidad Católica de Chile, Santiago, Chile
16 October 2019

slide-2
SLIDE 2

The general problem we want to address …

Protein-DNA interactions: binding specificity, binding affinity, function upon binding

Our question(s) …

  • Can we identify / modulate the key specificity and affinity determinants that define a functional event?
  • Can we really “understand” how this process operates?

slide-3
SLIDE 3

What determines protein-DNA binding specificity ?

Adapted from:

  • Rohs et al. (2010) "Origins of Specificity in Protein-DNA Recognition" Annu. Rev. Biochem. 79, 233–269.
  • Jen-Jacobson et al. (2000) "Structural and Thermodynamic Strategies for Site-Specific DNA Binding Proteins" Structure 8, 1015–1023.

slide-4
SLIDE 4

Our methodological approach: general scheme, part I

Diagram elements:

  • Known protein-DNA complex structure (PDB); PDIdb
  • New DNA sequence (ACGTGGTAGAAACGTGAGCCT)
  • 3D modeling; 3D protein-DNA complex model
  • Energy scoring functions (knowledge-based potentials)
  • Optimization method: Metropolis Monte Carlo
  • Low energy score sequence ensemble
  • HMMs; sequence patterns (AYYNCACAAWTTRRTTN, ATTNCRCRACTTAGTTW); position weight matrix
  • Experimental binding data (EMSA, NGS, UPBMs)
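The Metropolis Monte Carlo search over DNA sequences named in this scheme can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_energy` is a hypothetical stand-in for the knowledge-based scoring of a 3D model (it only counts mismatches to an arbitrary target), and the temperature `kT`, step count and single-base mutation move are assumptions.

```python
import math
import random

BASES = "ACGT"

def toy_energy(seq, target="ACGTGGTAGAAACGTGAGCCT"):
    """Hypothetical score: number of mismatches to a target (lower is better)."""
    return sum(a != b for a, b in zip(seq, target))

def metropolis_search(energy, length, steps=5000, kT=0.2, seed=0):
    """Sample sequences with the Metropolis criterion; return accepted states and scores."""
    rng = random.Random(seed)
    seq = [rng.choice(BASES) for _ in range(length)]
    current = energy("".join(seq))
    ensemble = {}
    for _ in range(steps):
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(BASES.replace(old, ""))  # propose a point mutation
        proposed = energy("".join(seq))
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if proposed <= current or rng.random() < math.exp(-(proposed - current) / kT):
            current = proposed
            ensemble["".join(seq)] = current
        else:
            seq[pos] = old  # reject: restore the previous base
    return ensemble

ensemble = metropolis_search(toy_energy, 21)
low_energy = sorted(ensemble, key=ensemble.get)[:10]
```

The ten lowest-scoring sequences in `low_energy` play the role of the "low energy score sequence ensemble" from which patterns or a position weight matrix could then be derived.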

slide-5
SLIDE 5

Our methodological approach: general scheme, part II

Diagram elements:

  • HMMs; sequence patterns (AYYNCACAAWTTRRTTN, ATTNCRCRACTTAGTTW); position weight matrix
  • Genome scanning
  • Complete genome
  • Predicted binding sites
  • Annotation tables
  • Comparison with experimental data

Experimental validation of predictions:

  • Isothermal Titration Calorimetry
  • Surface Plasmon Resonance
  • Fluorescence Anisotropy
  • SAXS
  • in vitro transcription assays
  • DNA binding microarrays
  • One round SELEX
  • EMSA
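As a rough illustration of scanning a sequence with a degenerate pattern such as AYYNCACAAWTTRRTTN, the sketch below translates IUPAC nucleotide codes into a regular expression. The function names and the test sequence are hypothetical, and only the forward strand is scanned; a real genome scan would also need the reverse complement.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def pattern_to_regex(pattern):
    """Compile an IUPAC pattern; the lookahead allows overlapping matches."""
    body = "".join(IUPAC[code] for code in pattern)
    return re.compile("(?=(" + body + "))")

def scan(sequence, pattern):
    """Return (0-based start, matched site) for every forward-strand match."""
    regex = pattern_to_regex(pattern)
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]

# Hypothetical sequence containing one site that fits the pattern.
hits = scan("GGACTGCACAAATTAGTTCTT", "AYYNCACAAWTTRRTTN")
```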

slide-6
SLIDE 6

PDIdb Database

http://melolab.org/pdidb

(2010) Norambuena and Melo. BMC Bioinformatics 11, 262, 1-12.

slide-7
SLIDE 7

PDIdb: PDB complexes and interfaces definition

slide-8
SLIDE 8

Knowledge-based potentials for protein-DNA interactions

Protein DNA Interface database (PDIdb)

  • Used to derive a non-redundant list of protein-DNA complexes
  • NR list of complexes used to calculate knowledge-based potentials

[Plot: potential Ecc(d) as a function of distance d]

$$\Delta E^{d}_{ij} = -kT \ln\left(\frac{f^{d}_{ij}}{f^{d}_{kk}}\right)$$

Sippl, M., J. Comp. Aided Mol. Des. 1993

Observations → Relative frequency → Inverse Boltzmann law → Knowledge-based potential

Protein, DNA. Capriotti et al. Bioinformatics 27, 2011. Norambuena et al. (unpublished work)
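The chain from observations to a knowledge-based potential via the inverse Boltzmann law can be sketched as below. This is only an illustration under stated assumptions: the distance-bin counts are invented, the reference state is taken to be a second histogram, and the small pseudocount is added just to avoid taking the logarithm of zero.

```python
import math

def inverse_boltzmann(pair_counts, reference_counts, kT=1.0, pseudocount=1e-6):
    """Energy per distance bin: -kT * ln(f_pair(d) / f_reference(d))."""
    n_pair = sum(pair_counts)
    n_ref = sum(reference_counts)
    energies = []
    for c_pair, c_ref in zip(pair_counts, reference_counts):
        f_pair = c_pair / n_pair + pseudocount  # relative frequency for this atom pair
        f_ref = c_ref / n_ref + pseudocount     # relative frequency in the reference state
        energies.append(-kT * math.log(f_pair / f_ref))
    return energies

# Hypothetical histograms over four distance bins.
energies = inverse_boltzmann([10, 40, 30, 20], [25, 25, 25, 25])
```

Bins observed more often than in the reference state get a negative (favorable) energy, and bins observed less often get a positive one.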

slide-9
SLIDE 9

Knowledge-based potentials for protein-DNA interactions

[Plots: potentials Ecc(d), Edb(d), Eba(d) and Ebd(d) as a function of distance d; example residues: ARG, ASN]

slide-10
SLIDE 10

Comparative Modeling of DNA duplexes

Full Atom Comparative Modeling of Protein-DNA Complexes

  • Conformational degrees of freedom in DNA duplexes
  • Strategy and geometrical restraints to model 3D structure of DNA duplexes
  • Example case (static or interactive display of the modeling process)

FULL ATOM COMPARATIVE THREE-DIMENSIONAL MODELING OF DNA DUPLEXES

Ibarra and Melo (unpublished work)

slide-11
SLIDE 11

Full Atom Comparative Modeling of protein-DNA complexes

5'-…ATGCCACGTTTT…-3' / 5'-…TACGGTGCAAAA…-3'
5'-…ATGGACCGTTTT…-3' / 5'-…TACCTGGCAAAA…-3'

Flowchart: Start → Template selection → Template-target alignment → Model building → Model assessment → OK? (Yes: End; No: repeat)

Target, Template

5'-…ATGGACCGTTTT…-3' / 5'-…TACCTGGCAAAA…-3'

slide-12
SLIDE 12

Building a non-redundant database of duplex DNA

Selection of DNA structures: N = 34

…
 CGCAAATTTGCG CGCGAAAAAACG AGGGGCCCCT AGGGGCGGGGCT ACCGACGTCGGT
 …

Non-redundant dataset: N = 86

Multiple filters (pH, temperature, R-crys, resolution, sequence, structure clustering)

slide-13
SLIDE 13

Non-redundant database of full duplex DNA

Sequences (5'-3') of both strands, grouped by DNA form: A (N = 12), B (N = 17), Z (N = 5)

A-form (N = 12)
Strand 1          Strand 2
GCGGGCCCGC        GCGGGCCCGC
GAAGCTTC          GAAGCTTC
CATGGGCCCATG      CATGGGCCCATG
TCTGCGGTC         TGACCGCAG
CCCGGCCGGG        CCCGGCCGGG
CCGGGCCCGG        CCGGGCCCGG
AGGGGCGGGGCT      TAGCCCCGCCCC
GGCATGCC          GGCATGCC
GGTATACC          GGTATACC
CGCGGGTACCCGCG    CGCGGGTACCCGCG
GGGGCGCCCC        GGGGCGCCCC
AGGGGCCCCT        AGGGGCCCCT

B-form (N = 17)
Strand 1          Strand 2
CGCGTTAACGCG      CGCGTTAACGCG
CCGGCGCCGG        CCGGCGCCGG
CGCAGAATTCGCG     CGCAGAATTCGCG
CGCAAATTTGCG      CGCAAATTTGCG
CGCGAAAAAACG      CGTTTTTTCGCG
ACCGAATTCGGT      ACCGAATTCGGT
GCTTAATTCG        CGAATTAAGC
CCATTAATGG        CCATTAATGG
CCGATATCGG        CCGATATCGG
CCTGCGCAGG        CCTGCGCAGG
CCGAGCTCGG        CCGAGCTCGG
CCTAATTAGG        CCTAATTAGG
CCGCTAGCGG        CCGCTAGCGG
CGCGATATCGCG      CGCGATATCGCG
ACCGACGTCGGT      ACCGACGTCGGT
ACCGGTACCGGT      ACCGGTACCGGT
CGCGAATTCGCG      CGCGAATTCGCG

Z-form (N = 5)
Strand 1          Strand 2
CGCACG            CGTGCG
CACGCG            CGCGTG
CGCGCG            CGCGCG
CCCGGG            CCCGGG
CACACG            CGTGTG

slide-14
SLIDE 14

Derivation of geometrical restraints for the 3D modeling of full duplex DNA

Flowchart elements:

  • BEGIN
  • Non-redundant database of duplex DNA
  • Counting of occurrences for each geometrical restraint
  • Histogram building
  • Fitting: Gaussian, poly-Gaussian, spline
  • Calculation
  • Two "OK?" decision points (Yes / No)
  • END

slide-15
SLIDE 15

Fitting of continuous mathematical functions to discrete experimental data

slide-16
SLIDE 16

Conformational degrees of freedom in DNA strands

There are many degrees of freedom in a single RNA or DNA chain.

  • For a single DNA strand, we have 12 rotatable bonds:
  • Sugar phosphate backbone: α, β, γ, δ, ε, ζ, ν0, ν1, ν2, ν3, ν4
  • Glycosidic bond: χ
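
Each of the torsions listed above (α to ζ, ν0 to ν4, χ) is a dihedral angle defined by four consecutive atoms. A minimal sketch of that calculation, using plain tuples for coordinates, is shown here; the example points are hypothetical and the sign convention is an assumption of this sketch.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3D points."""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)      # normals of the two planes
    norm_b1 = math.sqrt(dot(b1, b1))
    m1 = cross(n1, tuple(x / norm_b1 for x in b1))
    return math.degrees(math.atan2(dot(m1, n2), dot(n1, n2)))

# Hypothetical planar arrangements: cis gives 0 degrees, trans gives 180 degrees.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
```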
slide-17
SLIDE 17

Conformational degrees of freedom in DNA duplexes

  • Base pairing parameters: 3 angles, 3 distances
  • Base pair step parameters: 3 angles, 3 distances

slide-18
SLIDE 18

ProteinDNA and FreeDNA backbone dihedral angle restraints

slide-19
SLIDE 19

ProteinDNA and FreeDNA ribose dihedral angle restraints

slide-20
SLIDE 20

ProteinDNA and FreeDNA glycosidic dihedral angle restraints

slide-21
SLIDE 21

ProteinDNA and FreeDNA glycosidic dihedral angle restraints

slide-22
SLIDE 22

Geometrical restraints to model 3D DNA duplex conformation

Distance restraints that define the geometry of a base pair

[Figure: labeled atoms, with H-bond donors and acceptors marked.]

  • AT base pair: O4, N6, N3, N1, C1', C1'
  • CG base pair: N4, N3, O2, O6, N1, N2, C1', C1'

slide-23
SLIDE 23

Comparative modeling: detailed flowchart

Alignment:

>P1;M1dsz
sequence:M1dsz:.:. :. : .: .:::
jelltjeeelltjel/jtlejjtttlejjtl*
>P1;1kb2
structureX:1kb2:.:. :. : .: .:::
jllttjejlellttj/leejjtjltleejjl*

Inputs and tools: alignment, template structure, Modeller, DNAUtils, DNAModel

Steps: Generate topology → Transfer template coordinates → Model building → Optimization of nucleotide stereochemistry → Optimization of base pair geometry → Refinement → Optimized model

slide-24
SLIDE 24

Comparative modeling: model optimization

Non-bonded terms (from knowledge-based potential)

CHARMM forcefield

Dihedral restraints

slide-25
SLIDE 25

DNAviz PyMol Plugin

slide-26
SLIDE 26

Calculation of Solvent Accessible Surface Area (SASA) and ΔSASA

[Figure: SASA(i) of the individual structures (DNA, protein) and SASA(i) of the complex structure give ΔSASA, or buried surface area (BSA); atoms with ΔSASA > 0 are shown.]
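
A per-atom ΔSASA of this kind can be sketched as the difference between the SASA of an atom in the isolated structure and in the complex. The atom labels and SASA values below are hypothetical, and computing the SASA values themselves (with a dedicated tool) is outside this sketch.

```python
def delta_sasa(sasa_free, sasa_complex):
    """Per-atom ΔSASA: SASA in the individual structure minus SASA in the complex."""
    return {atom: sasa_free[atom] - sasa_complex[atom] for atom in sasa_free}

def interface_atoms(sasa_free, sasa_complex, tol=1e-6):
    """Atoms that lose accessible surface on binding (ΔSASA > 0)."""
    deltas = delta_sasa(sasa_free, sasa_complex)
    return {atom: d for atom, d in deltas.items() if d > tol}

# Hypothetical values in square angstroms, keyed by (chain, residue, atom).
free = {("B", "DA5", "N7"): 12.0, ("B", "DA5", "OP1"): 30.0, ("A", "ARG46", "NH1"): 25.0}
bound = {("B", "DA5", "N7"): 2.0, ("B", "DA5", "OP1"): 30.0, ("A", "ARG46", "NH1"): 5.0}

buried = interface_atoms(free, bound)
bsa = sum(buried.values())  # total buried surface area over these atoms
```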

slide-27
SLIDE 27

Calculation of ΔSASA for DNA bases and backbone atoms

[Figure: SASA of the individual structures (DNA bases, DNA backbone, protein) and SASA of the complex (DNA bases, DNA backbone); atoms with ΔSASA > 0 are shown.]

(2015) Ribeiro, J., Melo, F. and Schüller A “PDIviz: analysis and visualization of protein–DNA binding interfaces” Bioinformatics 31, 2751-2753.

slide-28
SLIDE 28

MarA and Rob: our experimental working models (AraC/XylS TF family)

Why have we chosen MarA as a working model?

  • 1. Monomeric transcription factor
  • 2. Single structural domain (2 HTH motifs)
  • 3. Asymmetric (not palindromic) binding site
  • 4. Long binding site, known as marbox (21 bp)
  • 5. 28 experimentally known marboxes (+5 putative)
  • 6. MarA is ambidextrous
  • 7. Crystal structure of the MarA-mar complex available
  • 8. Several in vitro transcription assays with many promoters
  • 9. Complete alanine scanning and functional tests
  • 10. DNA microarrays with MarA expressed constitutively in vivo
  • 11. EMSA assays of MarA and several promoters
  • 12. Restricted only to Enterobacteria (highly specific binding mode?)
  • 13. Clinically relevant (antibiotic resistance, stress tolerance)

MarA-mar complex

(crystal structure 1BL0)

Rob-micF complex

(crystal structure 1D5Y)

slide-29
SLIDE 29

Marboxes known to be bound by MarA proteins with high/medium affinity

slide-30
SLIDE 30

Marboxes known: fuzzy pattern AYNGCACNNWNNRYYAAACN

slide-31
SLIDE 31

MarA and Rob proteins complexed to mar and micF DNA marboxes

slide-32
SLIDE 32

MarA, Rob, RobDBD and Chimera protein constructs

Natural proteins (E. coli K12):

  • MarA: MarA, 128 AAs
  • Rob: Rob DBD, 128 AAs + Rob Regulatory Domain, 170 AAs

Artificial proteins:

  • RobDBD: RobDBD, 128 AAs
  • Chimera: MarA, 128 AAs + Rob Regulatory Domain, 170 AAs

slide-33
SLIDE 33

Marboxes DNA sequence logo and box A, spacer and box B definitions

DNA base interaction

DNA backbone interaction

Interaction with base and backbone

slide-34
SLIDE 34

mar and micF duplex DNA used in EMSAs with MarA and Rob proteins

DNA (35 bp)   SEQUENCE
mar           GACCGATGCCACGTTTTGCTAAATCGAGGTGTTAG
micF          GACCGACAGCACTGAATGTCAAAACGAGGTGTTAG
mut           GACCGACATTGTTTTTTGCACTCAAGAGGTGTTAG
marmicFAB     GACCGACAGCACGTTTTGTCAAATCGAGGTGTTAG
micFmarAB     GACCGATGCCACTGAATGCTAAAACGAGGTGTTAG
marmutB       GACCGATGCCACGTTTTGCACTCTCGAGGTGTTAG
micFmutB      GACCGACAGCACTGAATGCACTCACGAGGTGTTAG
marmutA       GACCGACATTGTGTTTTGCTAAATCGAGGTGTTAG
micFmutA      GACCGACATTGTTGAATGTCAAAACGAGGTGTTAG
marmicFsp     GACCGATGCCACTGAATGCTAAATCGAGGTGTTAG
micFmarsp     GACCGACAGCACGTTTTGTCAAAACGAGGTGTTAG
mar-bs        GACCGATGCTAAAGTTTTGCCACTCGAGGTGTTAG
micF-bs       GACCGATGTCAAATGAACAGCACACGAGGTGTTAG

Regions marked in the sequences: BoxA, Sp (spacer), BoxB

slide-35
SLIDE 35

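EMSAs of mar and micF oligos with MarA and Rob proteins: MarA-mar context

[Gel image. Lanes: C-, 1, 2, 3, 4, 5, 6, 7. DNA labels: mar, mut, marmutA, marmutB, marmicFsp, marmicFAB, mar-bs, mar. Annotations: "+ MarA protein", "No protein". Four lanes are marked with an asterisk.]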

slide-36
SLIDE 36

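EMSAs of mar and micF oligos with MarA and Rob proteins: Rob-mar context

[Gel image. Lanes: C-, 1, 2, 3, 4, 5, 6, 7. DNA labels: mar, mut, marmutA, marmutB, marmicFsp, marmicFAB, mar-bs, mar. Annotations: "+ Rob protein", "No protein". Four lanes are marked with an asterisk.]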

slide-37
SLIDE 37

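EMSAs of mar and micF oligos with MarA and Rob proteins: MarA-micF context

[Gel image. Lanes: C-, 1, 2, 3, 4, 5, 6, 7. DNA labels: micF, mut, micFmutA, micFmutB, micFmarsp, micFmarAB, micF-bs, micF. Annotations: "+ MarA protein", "No protein". Three lanes are marked with an asterisk.]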

slide-38
SLIDE 38

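EMSAs of mar and micF oligos with MarA and Rob proteins: Rob-micF context

[Gel image. Lanes: C-, 1, 2, 3, 4, 5, 6, 7. DNA labels: micF, mut, micFmutA, micFmutB, micFmarsp, micFmarAB, micF-bs, micF. Annotations: "+ Rob protein", "No protein". Four lanes are marked with an asterisk.]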

slide-39
SLIDE 39

SAXS (batch mode) and visit to EMBL Hamburg (June 2016)

[Plots: theoretical scattering for PDB IDs 1bl0 and 1d5y; curves for protein, complex and DNA; x-axis: s (Å⁻¹), 0.0 to 0.5; y-axis: log I(s), relative]

slide-40
SLIDE 40

Design and production of 5 new MarA mutants with increased solubility

  • MarA-M3Q (M)
  • MarA-L28S (L)
  • MarA-M111S-M114S (MM)
  • MarA-L28S-M111S-M114S (LMM)
  • MarA-M3Q-I11S-L28S-M111S-M114S (LMMQI)

slide-41
SLIDE 41

SEC-SAXS, new visit to EMBL Hamburg (September 2017)

slide-42
SLIDE 42

Molecular Dynamics, AMBER force field, explicit water (1 microsecond)

slide-43
SLIDE 43

Radius of gyration distributions: MarA-mar and Rob-micF (and B-DNA)
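
For reference, the radius of gyration of one trajectory frame is the mass-weighted root-mean-square distance of the atoms from their center of mass. The sketch below assumes coordinates are given as plain (x, y, z) tuples and uses equal masses unless others are supplied; reading frames from a trajectory file is not covered.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted RMS distance of the atoms from their center of mass."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal masses when none are given
    total = sum(masses)
    center = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    weighted_sq = sum(
        m * sum((c[k] - center[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)

# Hypothetical two-atom "frame": both atoms lie 1 unit from the center.
rg = radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
```

Applying the function to every frame of a trajectory yields the distribution summarized on this slide.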

slide-44
SLIDE 44

Representative Conformers in each trajectory by CSA similarity measure

slide-45
SLIDE 45

Trajectory and radius of gyration: MarA-B-DNA

slide-46
SLIDE 46

Trajectory and CSA with H3 and H6: MarA-B-DNA

slide-47
SLIDE 47

Trajectory and radius of gyration: Rob-micF

slide-48
SLIDE 48

Trajectory and CSA with H3 and H6: Rob-micF

slide-49
SLIDE 49

Trajectory and CSA with H3 and H6: Rob-B-DNA

slide-50
SLIDE 50

EMSA of Rob with micF marbox DNA wobbles ([Rob]: 0.47 µM; [DNA]: 50 nM)

[Gel image. Lanes: micF, mut, micF_2-5_wb, C+, C-. Bands: protein-DNA complexes (bound DNA) and free duplex DNA (unbound DNA).]

Protein-DNA binding assay (EMSA experiment):

  • Protein + duplex DNA wobbles (256 variants each)
  • Protein-DNA complexes (Wb-Bound)
  • Free duplex DNA (Wb-Unbound)
  • Free duplex DNA (Wb-Free)

slide-51
SLIDE 51

5'-Z5-AATTGACCGATGCCACNNNNTGCTAAATCGGTTCAAG-Z3-3'

Labels: Box A, Box B, Spacer, Upstream, Downstream, Primer Z5, Primer Z3

256 different DNA sequences + pure MarA protein → Binding assay, chromatographic separation and purification of the protein-DNA complexes formed → DNA sequences recognized by MarA / DNA sequences NOT recognized by MarA → DNA purification and deep sequencing → Computational sequence analysis
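
The 256 sequences follow from the four degenerate N positions in the oligo (4^4 = 256). A minimal sketch of enumerating them is given below; the function name is hypothetical and the primer tags Z5 and Z3 are left out.

```python
from itertools import product

def expand_wobble(template, alphabet="ACGT"):
    """Enumerate every sequence obtained by filling the N positions of a template."""
    n_positions = [i for i, base in enumerate(template) if base == "N"]
    variants = []
    for combo in product(alphabet, repeat=len(n_positions)):
        seq = list(template)
        for i, base in zip(n_positions, combo):
            seq[i] = base
        variants.append("".join(seq))
    return variants

# Oligo from this slide, without the Z5 and Z3 primer tags.
variants = expand_wobble("AATTGACCGATGCCACNNNNTGCTAAATCGGTTCAAG")
```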

slide-52
SLIDE 52

Capture of synthetic DNA wobbles, NGS digital reading (Ion Torrent PGM)

slide-53
SLIDE 53

EMSA of Rob with micF marbox DNA wobbles ([Rob]: 0.47 µM; [DNA]: 50 nM)

[Gel image. Lanes: micF, mut, wobble regions 2-5, 4-7, 6-9, 8-11, 10-13, 12-15, 14-17, 16-19, C+, C-.]

Rob protein (MarA, Rob-DBD, Chimera, Initial Inputs (2)) + duplex DNA wobbles (256 variants each)

slide-54
SLIDE 54

Position Weight Matrices obtained (represented as sequence logos)

[Figure: sequence logos for mar wobbles and micF wobbles; proteins: MarA, Rob, RobDBD, Chimera]
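
A position weight matrix of this kind can be derived from aligned sequences by counting bases per column. The sketch below is a generic illustration, not the pipeline used for these logos: the four input reads are hypothetical, and the uniform background of 0.25 and the optional pseudocount are assumptions.

```python
import math
from collections import Counter

BASES = "ACGT"

def position_frequencies(seqs, pseudocount=0.0):
    """Per-position base frequencies from equal-length aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    total = len(seqs) + 4 * pseudocount
    matrix = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        matrix.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return matrix

def log_odds(matrix, background=0.25):
    """Log2 odds of each frequency against a uniform background."""
    return [
        {b: math.log2(f / background) if f > 0 else float("-inf") for b, f in col.items()}
        for col in matrix
    ]

# Hypothetical aligned reads.
freqs = position_frequencies(["ACGT", "ACGA", "ACCT", "TCGT"])
pwm = log_odds(freqs)
```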

slide-55
SLIDE 55

Comparison of obtained dominant sequences and mar / micF marboxes

In each alignment the first row is the dominant sequence obtained and the last row is the reference marbox (| match, * mismatch). Region line for all alignments: AAAAAASSSSBBBBBBB-

mar wobbles

MarA mar
TGCCACTTTTTGTTAATA
||||||*|||||*|||**
TGCCACGTTTTGCTAAAT

Rob mar
TGCCACTATTTGTTAATA
||||||**||||*|||**
TGCCACGTTTTGCTAAAT

RobDBD mar
TGCTATTATTTGTTAATA
|||*|***||||*|||**
TGCCACGTTTTGCTAAAT

Chimera mar
CGCCACATTTTGTTAATT
*|||||*|||||*|||*|
TGCCACGTTTTGCTAAAT

micF wobbles

MarA micF
CAGCACTACTTGTTAAAC
|||||||***|||*|||*
CAGCACTGAATGTCAAAA

Rob micF
CAGCACTACTAGTTAAAC
|||||||****||*|||*
CAGCACTGAATGTCAAAA

RobDBD micF
TAGCACTACTAGTTAAAC
*||||||****||*|||*
CAGCACTGAATGTCAAAA

Chimera micF
CAGCACTAATTGTTAAAC
|||||||*|*|||*|||*
CAGCACTGAATGTCAAAA

slide-56
SLIDE 56

Comparison between dominant sequences for each protein and context

mar wobbles

mar        TGCCACGTTTTGCTAAAT
           *||*|***||||*|||**
MarA       TGCCACTTTTTGTTAATA
Rob        TGCCACTATTTGTTAATA
RobDBD     TGCTATTATTTGTTAATA
Chimera    CGCCACATTTTGTTAATT
           *||*|***|||||||||*
consensus  YGCYAYWWTTTGTTAATW
           AAAAAASSSSBBBBBBB-

micF wobbles

micF       CAGCACTGAATGTCAAAA
           *||||||****||*|||*
MarA       CAGCACTACTTGTTAAAC
Rob        CAGCACTACTAGTTAAAC
RobDBD     TAGCACTACTAGTTAAAC
Chimera    CAGCACTAATTGTTAAAC
           *|||||||*|*|||||||
consensus  YAGCACTAMTWGTTAAAC
           AAAAAASSSSBBBBBBB-

Consensus comparison

consensus mar wb   YGCYAYWWTTTGTTAATW
                   |**.|...*|.|||||**
consensus micF wb  YAGCACTAMTWGTTAAAC
BoxA+Sp+BoxB       AAAAAASSSSBBBBBBB-
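
The consensus rows on this slide use IUPAC ambiguity codes (Y = C or T, W = A or T, M = A or C). A small sketch of how such a consensus can be computed column by column from the four dominant sequences is shown below; the function name is hypothetical.

```python
# IUPAC code for each set of bases observed in an alignment column.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}

def iupac_consensus(seqs):
    """Column-wise IUPAC consensus of equal-length sequences."""
    return "".join(IUPAC_CODES[frozenset(column)] for column in zip(*seqs))

# Dominant sequences for MarA, Rob, RobDBD and Chimera, as listed on this slide.
mar_wb = ["TGCCACTTTTTGTTAATA", "TGCCACTATTTGTTAATA",
          "TGCTATTATTTGTTAATA", "CGCCACATTTTGTTAATT"]
micf_wb = ["CAGCACTACTTGTTAAAC", "CAGCACTACTAGTTAAAC",
           "TAGCACTACTAGTTAAAC", "CAGCACTAATTGTTAAAC"]

mar_consensus = iupac_consensus(mar_wb)
micf_consensus = iupac_consensus(micf_wb)
```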

slide-57
SLIDE 57

Pairwise comparison of dominant sequences for each protein and sequence context (number of mismatches in parentheses)

mar wobbles

  • MarA vs Rob (1)

TGCCACTTTTTGTTAATA
|||||||*||||||||||
TGCCACTATTTGTTAATA

  • MarA vs RobDBD (3)

TGCCACTTTTTGTTAATA
|||*|*|*||||||||||
TGCTATTATTTGTTAATA

  • MarA vs Chimera (3)

TGCCACTTTTTGTTAATA
*|||||*||||||||||*
CGCCACATTTTGTTAATT

  • Rob vs RobDBD (2)

TGCCACTATTTGTTAATA
|||*|*||||||||||||
TGCTATTATTTGTTAATA

  • Rob vs Chimera (4)

TGCCACTATTTGTTAATA
*|||||**|||||||||*
CGCCACATTTTGTTAATT

  • RobDBD vs Chimera (6)

TGCTATTATTTGTTAATA
*||*|***|||||||||*
CGCCACATTTTGTTAATT

micF wobbles

  • MarA vs Rob (1)

CAGCACTACTTGTTAAAC
||||||||||*|||||||
CAGCACTACTAGTTAAAC

  • MarA vs RobDBD (2)

CAGCACTACTTGTTAAAC
*|||||||||*|||||||
TAGCACTACTAGTTAAAC

  • MarA vs Chimera (1)

CAGCACTACTTGTTAAAC
||||||||*|||||||||
CAGCACTAATTGTTAAAC

  • Rob vs RobDBD (1)

CAGCACTACTAGTTAAAC
*|||||||||||||||||
TAGCACTACTAGTTAAAC

  • Rob vs Chimera (2)

CAGCACTACTAGTTAAAC
||||||||*|*|||||||
CAGCACTAATTGTTAAAC

  • RobDBD vs Chimera (3)

TAGCACTACTAGTTAAAC
*|||||||*|*|||||||
CAGCACTAATTGTTAAAC
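
The pairwise comparisons on this slide amount to a position-by-position match line and a mismatch count for two equal-length sequences. A minimal sketch, with hypothetical function names and the same | and * symbols, could look like this:

```python
def match_line(seq_a, seq_b):
    """Return '|' for each matching position and '*' for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return "".join("|" if a == b else "*" for a, b in zip(seq_a, seq_b))

def mismatches(seq_a, seq_b):
    """Number of differing positions (Hamming distance)."""
    return match_line(seq_a, seq_b).count("*")

# MarA vs Rob dominant sequences in the mar wobbles context, from this slide.
line = match_line("TGCCACTTTTTGTTAATA", "TGCCACTATTTGTTAATA")
count = mismatches("TGCCACTTTTTGTTAATA", "TGCCACTATTTGTTAATA")
```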

slide-58
SLIDE 58

MarA, Rob, RobDBD and Chimera PWMs in mar wobbles context

The PWMs of the four proteins (MarA, Rob, RobDBD and Chimera) in the marbox mar sequence context (mar DNA wobbles) are very similar.

slide-59
SLIDE 59

MarA-RobDBD and Rob-Chimera PWMs cluster together

PWM-based hierarchical clustering shows that, in the marbox mar sequence context, sequence binding preferences are most alike for MarA and RobDBD on the one hand, and for Rob and Chimera on the other.

slide-60
SLIDE 60

Specificity analysis based on 256 wobble variants (mar sequence context)

[Heatmaps of column Z-scores for the 256 four-nucleotide wobble variants in mar DNA wobble regions 4-7, 8-11 and 12-15; columns: R_DBD, Rob, MarA, Chimera; rows: variants (GTTC, GATC, GACC, …)]

...
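
A column Z-score of the kind shown in the heatmap legends standardizes the values of one column (here, one protein) across all variants. The sketch below assumes the sample standard deviation is used and that the input is a list of per-variant values such as read counts; both are assumptions, since the slide does not state them.

```python
import statistics

def column_z_scores(values):
    """Standardize one column: subtract its mean, divide by its sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample (n - 1) standard deviation
    return [(v - mean) / sd for v in values]

# Hypothetical per-variant values for one protein.
z = column_z_scores([1.0, 2.0, 3.0])
```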

slide-61
SLIDE 61

Specificity analysis based on 256 wobble variants (mar sequence context)

[Heatmap of column Z-scores for the 256 four-nucleotide wobble variants in mar DNA wobble region 4-7; columns: R_DBD, Rob, MarA, Chimera]

List name    Number of elements    Number of unique elements
33_mbxes     33                    14
Chimera      95                    95
MarA         109                   109
Rob          83                    83
RobDBD       110                   110

Overall number of unique elements: 148

Elements shared by 33_mbxes, Chimera, MarA, Rob and RobDBD (total: 8):
CCAA GCTC TCAT GCAC GCAT CCAC ACAC GCAA

Elements shared by Chimera, MarA, Rob and RobDBD (total: 54):
CAAC CAAT GCTT CTCC ACAA CGAT CTGC ACCA AGTC CTTT CCCC ATGC CTCT ACTT ATTC CCCT CCGC CGAC AATC TCAC ATCC ACGC AGAC ATAT ACCC ACGT CATC CCAT TTAT CTAT TAAC AAAT CATT ATTT ATAC GAAC CCTT CGTC ACTC ACCT GTAC CGTT AAAC AGAT CTAC CTTC TTAC ACAG CCTC CCGT CGCC ACAT GAAT GTAT

slide-62
SLIDE 62

Intrinsic Specificity Ratio (ISR) for MarA, Rob, RobDBD, Chimera (mar wobbles)

[Plot: Intrinsic Specificity Ratio (ISR), y-axis 0.0 to 16.0, by wobble region in marbox mar (02-05, 04-07, 06-09, 08-11, 10-13, 12-15, 14-17, 16-19); series: MarA, Rob, Rob-DBD, Chimera]

slide-63
SLIDE 63

Conclusions

  • 1. Based on the positional analysis of the DNA wobble capture and NGS assays, the selectivity or specificity of MarA and Rob is quite similar, especially in the BoxB region of marboxes.

  • 2. Molecular dynamics simulations of Rob and MarA with mar, micF and B-DNA conformations suggest that binding of H6 to BoxA always occurs for the MarA and Rob proteins. This may indicate that the crystal structure of Rob complexed with micF may be an artifact and that binding between H6 of Rob and BoxB of micF takes place. This is also supported by our EMSA experiments with the Chimera and RobDBD proteins.

  • 3. The most relevant positions for high-affinity binding (nanomolar range) seem to be located at the boundary of the Spacer and BoxB regions in the marboxes.

  • 4. It is possible that MarA and Rob have two independent or cooperative binding modes: one of medium affinity that involves only part of the DNA sequence, and one of high affinity that involves the complete marbox sequence (BoxA+Spacer+BoxB). SAXS experiments should be useful to address this hypothesis. However, the poor solubility of the MarA and Rob proteins in low-salt buffers hinders the use of SAXS to study the details of their binding modes.

  • 5. In vivo directed evolution data (which we are now generating in our lab) may prove useful to better understand the specificity of DNA recognition by these proteins.

slide-64
SLIDE 64

Thank you for your attention! Questions? Comments?

http://melolab.org