SLIDE 1

Multivariate Data Analysis in Microbial Ecology


New Skin for the old Ceremony

Jean Thioulouse
UMR 5558 CNRS « Biométrie, Biologie Évolutive », CNRS – University of Lyon, France

Jean Thioulouse - useR! 2008

SLIDE 2

Headlines

  • Topic: Environmetrics and Ecology
      - descriptive exploratory multivariate data analysis ("Geometric Data Analysis")
      - ade4 and ade4TkGUI packages
      - case studies

  • The EcoMic – RMQS project
  • Mycorrhizal symbiosis in tropical soils
SLIDE 3

The EcoMic – RMQS project

  • Analyse the relationships between soil microbial molecular diversity and environmental factors at the regional and national scales in France.

[Diagram labels: "Molecular diversity of soil bacterial communities" and "Environmental factors"]

SLIDE 4

  • Coordinator: L. Ranjard, UMR INRA/U-Bourgogne Microbiologie et Géochimie des Sols, Dijon (Microbial Ecology)
  • UMR CNRS/UCBL Biométrie Biologie Evolutive, Lyon (Data analysis)
  • LBE INRA-Narbonne, Analyse des Systèmes et Biométrie (Modelling)
  • Unité INRA INFOSOL, Orléans (Soil Science)
  • DREAM Unit, CEFE-CNRS Montpellier (Soil Science)
  • UMR CNRS Génomique Microbienne Environnementale, Lyon (Microbial Ecology)

Multi-institutional (INRA, CNRS, Universities)

Multidisciplinary (Soil science, Microbial ecology, Modelling, Data analysis)

SLIDE 5

The EcoMic – RMQS project

  • Large (2M€) ANR project on the microbial ecology of French soils
  • Microbial diversity in soil
      - Evaluate beta diversity
      - Processes generating and maintaining this diversity
      - Large spatial scale (France)
      - Molecular tools (PCR, DNA fingerprints, DNA microarrays)

  • Based on the RMQS soil library
SLIDE 6

The RMQS

[Figure: RMQS sampling design at each site. Labels: soil profile ("Profil pédologique"), sampling area ("Surface d'échantillonnage"), north arrow (N), sub-plots numbered 1 to 4, dimensions 5 m, 2 m, 2 m and 20 m.]

  • RMQS (Réseau de Mesures de la Qualité des Sols): soil quality measurement network
  • Started in 2002 by Infosol - INRA Orléans
  • Square sampling grid (16 x 16 km) covering all of France
  • 2200 sampling points, finished in 2009
  • Renewed every 10 years.
SLIDE 7

The RMQS

Many parameters are measured:

  • Physico-chemical parameters (pedology)
      - granulometry, pH
      - C, N, Ca, Na, heavy metals, etc.
  • Vegetation cover, landscape, agricultural practices, etc.
  • Molecular data (DNA extraction from raw soil samples)
SLIDE 8

Six regions

[Map: the six regions, numbered 1 to 6]

  • Based on vegetation, landscape, climate, pedology, and available samples (578)
      - Region 1: North, intermediate
      - Region 2: Brittany, low diversity
      - Region 3: Greater Paris, highly urbanized
      - Region 4: Center, intermediate
      - Region 5: Landes, very low diversity, sand dunes and pine forests
      - Region 6: South Alps, highest diversity, mountains, contrasted climate

SLIDE 9

Molecular data

  • RISA: Ribosomal Intergenic Spacer Analysis
      - length polymorphism of the intergenic sequence between the large and small ribosomal subunit genes
      - no information on taxonomic level (OTU: Operational Taxonomic Unit)
      - data already available for almost 1000 sites
  • Sequencer data must be processed before analysis
      - prepRISA package: rectangular data tables (sites x OTU)
      - hundreds of OTU
      - typical data table size ≈ 2200 x 500
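
The kind of sites x OTU table described here can be illustrated with a minimal sketch. It does not reproduce prepRISA: the input layout (a list of fragment length and peak intensity pairs per site), the function name `peaks_to_table` and the 10-unit bin width are all assumptions made for the illustration.

```python
from collections import defaultdict


def peaks_to_table(peaks_by_site, bin_width=10):
    """Bin fragment lengths into size classes (one class = one OTU column).

    peaks_by_site: dict mapping a site name to (length, intensity) pairs.
    Returns (sites, bins, table), with table[i][j] the summed intensity
    of site i in size class j. Hypothetical layout, not prepRISA's.
    """
    binned = {}
    for site, peaks in peaks_by_site.items():
        row = defaultdict(float)
        for length, intensity in peaks:
            # Lower bound of the size class the fragment falls into
            row[(length // bin_width) * bin_width] += intensity
        binned[site] = row
    sites = sorted(binned)
    bins = sorted({b for row in binned.values() for b in row})
    table = [[binned[s].get(b, 0.0) for b in bins] for s in sites]
    return sites, bins, table


# Toy input: two sites with a few peaks each (made-up values)
peaks = {
    "site_A": [(203, 1.5), (207, 0.5), (455, 2.0)],
    "site_B": [(209, 1.0), (611, 3.0)],
}
sites, bins, table = peaks_to_table(peaks)
```

Every site gets the same columns, so the result is rectangular even when sites do not share any peak.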

SLIDE 10

Molecular data

  • DNA microarrays
      - probes can be specific to particular taxonomic levels
      - data not yet available
      - thousands of probes
      - typical data table size ≈ 2200 x 10 000

SLIDE 11

Microbiogeography questions

  • "Everything is everywhere, but, the environment selects." Baas Becking, 1934.
  • "Are microbial communities a black box with no spatial structure or, like macroorganisms do they exhibit a particular distribution with predictable aggregates patterns from local to regional scales?" Horner-Devine et al., Nature, 2004.
  • "Identify the environmental factors (edaphic, climatic, anthropogenic…) which exert the strongest influences on microbial communities in nature." Martiny et al., Nature Reviews, 2006.

SLIDE 12

Biological objectives

  • Inventory of bacterial diversity in French soils
  • Spatial components of this diversity and ecological models to explain it (species-area relationship)
  • Mechanisms determining this diversity (pedology, climate, vegetation, etc.)
  • Quantify the impact of human activities (agriculture, industrial sites, wastes)
  • Microbiological markers of soil evolution in various ecological situations

SLIDE 13

Species - area relationship

  • S = number of species, A = area under study

$$S = C A^{z}$$

  • When using RISA, OTU ≠ species
      - OTU_A and OTU_a: number of OTU in areas A and a
      - χ_D and χ_d: Sørensen index for 2 samples at distances D and d
      - Green et al. 2004, Nature, 432, 747-750

$$\mathrm{OTU}_a = \mathrm{OTU}_A \left(\frac{a}{A}\right)^{z}$$

$$\chi_d = \chi_D \left(\frac{d}{D}\right)^{-2z}$$

SLIDE 14

Species - area relationship

  • Computations: (packages: vegan, labdsv)
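
The talk used the R packages named above. As a language-neutral illustration only, the distance-decay form of the relationship (Sørensen similarity proportional to distance raised to the power -2z) suggests one possible computation: regress log similarity on log distance over all pairs of sites and take z as minus half the slope. The sketch below assumes presence/absence OTU sets and planar coordinates in km; the function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def sorensen(otu_a, otu_b):
    """Sørensen similarity between two sets of OTU: 2|A∩B| / (|A| + |B|)."""
    return 2 * len(otu_a & otu_b) / (len(otu_a) + len(otu_b))


def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def distance_decay(sites):
    """sites: list of (x_km, y_km, set_of_otu). Returns (slope, z)."""
    log_d, log_chi = [], []
    for (x1, y1, otu1), (x2, y2, otu2) in combinations(sites, 2):
        d = math.hypot(x1 - x2, y1 - y2)
        chi = sorensen(otu1, otu2)
        if d > 0 and chi > 0:  # logarithms need strictly positive values
            log_d.append(math.log(d))
            log_chi.append(math.log(chi))
    slope = ols_slope(log_d, log_chi)
    return slope, -slope / 2  # chi ~ d^(-2z), so z = -slope / 2


# Toy data: four sites on a 16 km transect sharing fewer OTU with distance
example_sites = [
    (0, 0, {1, 2, 3, 4}),
    (16, 0, {1, 2, 3, 5}),
    (32, 0, {1, 2, 6, 7}),
    (48, 0, {1, 8, 9, 10}),
]
slope, z = distance_decay(example_sites)
```

Pairs with zero similarity are dropped here because their logarithm is undefined; other treatments are possible and would change the estimate.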
SLIDE 15

Species - area relationship

Region     Slope      p
Region 1   -0.00590   0.031
Region 2   -0.00022   0.002
Region 3   +0.00494   0.018
Region 4   -0.00683   0.008
Region 5   -0.00416   0.690
Region 6   -0.01280   0.000
Total      -0.02657   0.000

[Map: the six regions, numbered 1 to 6]

  • Slopes are negative: diversity increases
  • Region 5: very homogeneous (p-value not significant)
  • Region 3: slope is positive (urbanized)
  • Region 6: high landscape diversity
  • Region 2: low diversity
  • Regions 1 and 4: intermediate
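
If the slopes reported for the regions are slopes of log Sørensen similarity against log distance (an assumption: the slide does not say which regression was fitted), the distance-decay relationship gives z = -slope / 2. The short sketch below only does that arithmetic on the values from the slide.

```python
# Slopes as reported on the slide; their interpretation as log-log
# distance-decay slopes is an assumption of this sketch.
slopes = {
    "Region 1": -0.00590,
    "Region 2": -0.00022,
    "Region 3": +0.00494,
    "Region 4": -0.00683,
    "Region 5": -0.00416,
    "Region 6": -0.01280,
    "Total": -0.02657,
}

# chi ~ d^(-2z) implies slope = -2z, hence z = -slope / 2
z_values = {region: -slope / 2 for region, slope in slopes.items()}

for region, z in z_values.items():
    print(f"{region}: z = {z:.5f}")
```

Under this reading, a negative slope corresponds to a positive z, and Region 3 is the only region with a negative z.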

SLIDE 16

Multivariate analysis: data sets

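[Diagram: two data tables share the sampling sites (578) as rows: RISA bands (331) and pedological variables (24). Each table is analysed by PCA, and the pair of tables by coinertia analysis. Panels: Region I, Region V, Region VI.]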

  • Computations: (prepRISA, ade4 + ade4TkGUI)
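
The idea behind coinertia analysis can be sketched independently of ade4: given two tables with the same rows (sites), look for one axis in each table such that the site scores on the two axes have maximal covariance. The first such pair is the leading singular vector pair of the cross-covariance matrix. The sketch below assumes centred but unscaled variables and uniform site weights, and uses plain power iteration; ade4 offers more general weighting and scaling options, and the toy data are made up.

```python
import math


def center(table):
    """Subtract the column means from a list-of-rows table."""
    n = len(table)
    means = [sum(col) / n for col in zip(*table)]
    return [[v - m for v, m in zip(row, means)] for row in table]


def cross_covariance(x, y):
    """p x q matrix X^T Y / n for two centred tables with the same rows."""
    n = len(x)
    return [[sum(xr[j] * yr[k] for xr, yr in zip(x, y)) / n
             for k in range(len(y[0]))]
            for j in range(len(x[0]))]


def normalise(vec):
    norm = math.sqrt(sum(t * t for t in vec))
    return [t / norm for t in vec]


def first_coinertia_axes(x, y, iterations=500):
    """Return (u, v, cov): unit axes maximising cov(Xu, Yv)."""
    c = cross_covariance(center(x), center(y))
    p, q = len(c), len(c[0])
    u = normalise([1.0] * p)  # arbitrary start, assumed not degenerate
    for _ in range(iterations):
        v = normalise([sum(c[j][k] * u[j] for j in range(p)) for k in range(q)])
        u = normalise([sum(c[j][k] * v[k] for k in range(q)) for j in range(p)])
    cov = sum(u[j] * c[j][k] * v[k] for j in range(p) for k in range(q))
    return u, v, cov


# Toy tables: 4 sites x 2 soil variables and 4 sites x 3 OTU
soil = [[1, 2], [2, 1], [3, 4], [4, 3]]
otu = [[2, 1, 4], [4, 0, 3], [6, 1, 2], [8, 0, 1]]
u, v, cov = first_coinertia_axes(soil, otu)
```

The returned `cov` is the covariance between the site scores on the two axes; its square is the first eigenvalue of the coinertia analysis under the assumptions stated above.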
SLIDE 17

PCA of RISA data: OTU distribution

[Figure: panels for PC1 and PC2]

SLIDE 18

PCA of RISA data (PC1)

SLIDE 19

PCA of pedological data (PC1)

SLIDE 20

Coinertia analysis (pedo/RISA)

SLIDE 21

Biological interpretations

  • The spatial structures of physico-chemical pedological variables and RISA data are very similar, and this similarity exists at both regional and national scales (within and between regions)
  • Ecological processes responsible for spatial structures in animals and plants (differentiation, extinction, endemism) either do not exist for bacteria, or may be masked by bacterial characteristics, such as dispersal capacities, stress resistance, or short generation time.
  • "everything is everywhere, but, the environment selects"
  • These results are for RISA OTU, which are mostly unknown. But (hopefully) more to come with DNA microarrays...

SLIDE 22

RISA methodological flow chart

[Flow chart: Sampling site → bulk soil samples → DNA → Sequencer → prepRISA, with the following boxes and R packages]

  • Multivariate data analysis: ade4, ade4TkGUI
  • Species-area relationships: vegan, labdsv
  • Geostatistics: geoR, gstat
  • Arcinfo GIS, postgresql DB: RODBC
  • Sequence databanks, in silico RISA: seqinR

SLIDE 23

Mycorrhizal symbiosis in tropical soils

  • Soil is a very complex environment, with many interactions between plants, mycorrhizal fungi, soil bacterial communities, and abiotic factors
  • Research project started in 2000 in collaboration with several IRD research labs in Africa:
      - Dakar (Senegal)
      - Ouagadougou (Burkina Faso)
      - Marrakech (Morocco)
      - Antananarivo (Madagascar)

SLIDE 24

Mycorrhizal symbiosis in tropical soils

  • Relationships between plants, mycorrhizal symbiosis, and the soil bacterial microflora
  • Review of 15 papers published since 2005 on this topic in microbial ecology journals
      - Effects of mycorrhizal symbiosis
      - Nurse plants
      - Termite mound powder amendment

SLIDE 25

Mycorrhizal symbiosis in tropical soils

  • Data used to study the diversity of soil bacteria
      - RISA and DNA microarrays are too complex to be used
      - Catabolic profiles / SIR (substrate-induced response): profile = measure of CO2 production for ≈ 30 substrates; functional diversity vs. taxonomic diversity
      - DNA fingerprint: DGGE

SLIDE 26

Mycorrhizal symbiosis in tropical soils

  • Statistical data analysis methods (ade4)
      - BGA (between-group analysis): robust alternative to discriminant analysis to separate groups. PCA on group means, with projection of the original data
      - Coinertia analysis: analyses the relationships between two data tables. Robust alternative to Canonical Analysis or Canonical Correspondence Analysis
      - both methods allow the use of low numbers of samples compared to the number of variables
      - permutation test
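
A permutation test for group separation can be sketched as follows. The statistic used here, the ratio of between-group inertia to total inertia, is an assumption of this sketch rather than a statement of what ade4 computes; the group labels are shuffled many times and the p-value is the share of shuffles that reach the observed ratio. The data and the labels "A", "B", "C" are made up.

```python
import random


def between_inertia_ratio(table, groups):
    """Between-group inertia divided by total inertia (uniform weights)."""
    n, p = len(table), len(table[0])
    grand = [sum(row[j] for row in table) / n for j in range(p)]
    total = sum((row[j] - grand[j]) ** 2 for row in table for j in range(p))
    between = 0.0
    for g in sorted(set(groups)):
        members = [row for row, label in zip(table, groups) if label == g]
        mean = [sum(r[j] for r in members) / len(members) for j in range(p)]
        between += len(members) * sum((mean[j] - grand[j]) ** 2
                                      for j in range(p))
    return between / total


def permutation_test(table, groups, n_perm=999, seed=0):
    """Return (observed ratio, p-value) by shuffling the group labels."""
    rng = random.Random(seed)
    observed = between_inertia_ratio(table, groups)
    labels = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if between_inertia_ratio(table, labels) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: 9 samples x 2 variables, three well separated groups
samples = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
           [3.0, 3.0], [3.1, 2.9], [2.9, 3.2],
           [5.0, 1.0], [5.2, 0.8], [4.9, 1.1]]
labels = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
observed, p_value = permutation_test(samples, labels)
```

Because the test only reshuffles labels, it makes no distributional assumption and still works when there are fewer samples than variables, which is the situation the slide describes.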

SLIDE 27

Mycorrhizal symbiosis in tropical soils

  • Statistical data analysis methods
      - Cluster analysis on DGGE fingerprints (hclust)
      - Linear models for various hypothesis tests (lm, nlme)
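
Cluster analysis on fingerprints can be illustrated with a small agglomerative procedure. The sketch below is not R's hclust: it assumes each fingerprint is reduced to a set of band positions, uses Jaccard distance between sets, and merges clusters by average linkage. These choices, the function names and the toy profiles are assumptions for the illustration.

```python
def jaccard_distance(bands_a, bands_b):
    """1 - |A∩B| / |A∪B| for two sets of band positions."""
    union = bands_a | bands_b
    return 1 - len(bands_a & bands_b) / len(union) if union else 0.0


def average_linkage(profiles):
    """profiles: dict name -> set of bands. Returns the merge history
    as (cluster_1, cluster_2, distance) tuples, closest pairs first."""
    clusters = {name: [name] for name in profiles}  # cluster id -> members
    merges = []
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                # Average of all pairwise distances between the two clusters
                pairs = [(m, n) for m in clusters[a] for n in clusters[b]]
                d = sum(jaccard_distance(profiles[m], profiles[n])
                        for m, n in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[a + "+" + b] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, round(d, 3)))
    return merges


# Toy fingerprints: s1 and s2 share bands, as do s3 and s4
profiles = {
    "s1": {1, 2, 3},
    "s2": {1, 2, 4},
    "s3": {7, 8, 9},
    "s4": {7, 8, 10},
}
merges = average_linkage(profiles)
```

The merge history is the information a dendrogram displays: which fingerprints join first, and at what distance.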

SLIDE 28

Results

  • Effects of mycorrhizal symbiosis
      - improves plant growth (Cupressus atlantica, Acacia holosericea, Acacia seyal, Uapaca bojeri)
      - improves P solubilization and assimilation
      - improves seedling survival (C. atlantica) in degraded soils
      - improves the functional diversity of soil microflora
      - counterbalances the negative effect of an exotic plant species on the structure and functioning of soil bacterial communities

SLIDE 29

Results

  • Nurse plant effects
      - Lavandula species (L. stoechas, L. dentata, L. multifida) act as nurse plants for C. atlantica
      - improve plant growth
      - improve mycorrhizal soil infectivity
      - interactions with the functional diversity of soil bacteria
      - Thymus satureioides also improves plant growth and functional diversity

SLIDE 30

Results

  • Termite mound powder amendment effect
      - improves mycorrhizal symbiosis and plant growth of A. holosericea and A. seyal (MHB: mycorrhiza helper bacteria)
      - improves nutrient supplies (effect on soil microflora)
      - can be used for biological control of the phytoparasite Striga hermonthica (effect on soil microflora)
      - decreases Cd toxicity and improves its accumulation in plants (sorghum)

SLIDE 31

Result examples: BGA on SIR

Graphical display (PC2 biplot) of BGA on catabolic profiles. Top: scores of the 33 substrates. Bottom: the three Gauss curves represent the mean and variance of the scores of soil samples (three treatments).

  • CS: crop soil
  • NIS: soil from uninoculated trees
  • IR100S: soil from trees inoculated with the fungus P. albus IR100

SLIDE 32

Result examples: Coinertia on SIR

Graphical display of Coinertia analysis between catabolic profiles and plant growth parameters. Left: F1xF2 factor maps. Right: cross-covariances

SLIDE 33

Conclusion

  • R in the Biometry lab
      - Has become an indispensable tool in just a few years
      - Teaching (all levels)
      - 2 mailing lists: « rpackdev » and « rpourlesnuls »
      - 6 packages (on CRAN): ade4 and spin-off packages
          - ade4TkGUI, adehabitat, adegenet
          - seqinR, nlstools

SLIDE 34

Conclusion

  • Need for GUIs
      - For teaching
      - For biologists
      - ade4TkGUI screenshots

SLIDE 35

Conclusion

Synthetic display

SLIDE 36

Conclusion

Complex graphs

SLIDE 37

Conclusion

Interactive factor map

SLIDE 38

Conclusion

Interactive data analysis

SLIDE 39

Conclusion

Interactive data analysis

SLIDE 40

Conclusion

Interactive data analysis

SLIDE 41

Conclusion

Interactive data analysis

SLIDE 42

Acknowledgments

  • EcoMic - RMQS
      - Lionel Ranjard, Samuel Dequiedt - INRA Lab in Dijon
      - Bruno Saby, Manuel Martin - INRA Lab in Orléans
  • Mycorrhizal symbiosis
      - Robin Duponnois - IRD Lab in Dakar & Ouagadougou
  • ade4 package
      - Stéphane Dray, Anne-Béatrice Dufour - Biometry Lab in Lyon

SLIDE 43

RISA profiles

SLIDE 44

Remigi P., Faye A., Kane A., Deruaz M., Thioulouse J., Cissoko M., Prin Y., Galiana A., Dreyfus B. & Duponnois R. (2008). Applied and Environmental Microbiology, 74(5), 1485-1493.
Andrianjaka Z., Bally R., Lepage M., Thioulouse J., Comte G. & Duponnois R. (2007). Applied Soil Ecology, 37, 175-183.
Kisa M., Sanon A., Thioulouse J., Assigbetse K., Sylla S., Dieng L., Berthelin J., Prin Y., Galiana A., Lepage M. & Duponnois R. (2007). FEMS Microbiology Ecology, 62, 32-44.
Ouahmane L., Hafidi M., Thioulouse J., Ducousso M., Kisa M., Prin Y., Galiana A., Boumezzough A. & Duponnois R. (2007). Journal of Applied Microbiology, 103, 683-690.
Ouahmane L., Thioulouse J., Hafidi M., Prin Y., Galiana A., Plenchette C., Kisa M. & Duponnois R. (2007). Forest Ecology and Management, 241, 200-208.
Ramanankierana N., Ducousso M., Rakotoarimanga N., Prin Y., Thioulouse J., Randrianjohany E., Ramaroson L., Kisa M., Galiana A. & Duponnois R. (2007). Mycorrhiza, 17(3), 195-208.
Duponnois R., Assigbetse K., Ramanankierana H., Kisa M., Thioulouse J. & Lepage M. (2006). FEMS Microbiology Ecology, 56, 292-303.
Duponnois R., Kisa M., Assigbetse K., Prin Y., Thioulouse J., Issartel M., Moulin P. & Lepage M. (2006). Science of the Total Environment, 370, 391-400.
Ouahmane L., Hafidi M., Plenchette C., Kisa M., Boumezouch A., Thioulouse J. & Duponnois R. (2006). Applied Soil Ecology, 34, 190-199.
Ouahmane L., Duponnois R., Hafidi M., Kisa M., Boumezouch A., Thioulouse J. & Plenchette C. (2006). Plant Ecology, 185, 123-134.
Ramanankierana N., Rakotoarimanga N., Thioulouse J., Kisa M., Randrianjohany E., Ramaroson L. & Duponnois R. (2006). International Journal of Soil Science, 1, 8-19.
Sanon A., Martin P., Thioulouse J., Plenchette C., Spichiger R., Lepage M. & Duponnois R. (2006). Mycorrhiza, 16, 125-132.
Assigbetse K., Gueye M., Thioulouse J. & Duponnois R. (2005). Microbial Ecology, 50, 350-359.
Duponnois R., Colombet A., Hien V. & Thioulouse J. (2005). Soil Biology and Biochemistry, 37, 1460-1468.
Duponnois R., Paugy M., Thioulouse J., Masse D. & Lepage M. (2005). Geoderma, 124, 349-361.