Family-Based Association Analyses in Plants Clay Sneller, Ohio - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Family-Based Association Analyses in Plants

Clay Sneller, Ohio State University Kevin Smith, Jon Massman, University of Minnesota

slide-2
SLIDE 2

Association Analyses (AA)

  • Associate – to connect in the mind or imagination
  • Statistically associate marker and phenotypic data
  • Detect a physical linkage of marker and trait loci (QTL)
  • Normally used in complex populations: many parents

  • AA must deal with population structure
slide-3
SLIDE 3

Population Structure: Unequal relationship between individuals

  • 1. Between subgroups
  • 2. Within subgroups

AA must accommodate structure to control type I errors: declaring linkage when none exists

slide-4
SLIDE 4

Population- vs Family-Based AA

                                       Population-based           Family-based
Estimation of association parameter    Over entire population     Within lineages, between relatives, then compiled
Population structure                   Estimated & modeled        Negated by sampling
Inference of linkage                   Implied by significance    Required for significance

slide-5
SLIDE 5

Population-Based AA

  • Commonly used in plants
  • Applicable to many population types
  • Common statistics

    – Main effect of marker: means comparison
    – Covariance for effect of subgroups
    – TASSEL + STRUCTURE, unified mixed model of Yu et al. 2006
slide-6
SLIDE 6

[Figure: lines A to O, each genotyped (marker classes 0, 1, 2) and phenotyped, with class means $\bar{X}_0$, $\bar{X}_1$, $\bar{X}_2$]

H0: $\bar{X}_0 = \bar{X}_1 = \bar{X}_2$

slide-7
SLIDE 7

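[Figure: marker scores (0/1) for lines in a structured population]

            Mean    Freq “1”   Freq “0”
            75      0.5        0.5
            50      0.1        0.9
            100     0.9        0.1

Yi = u + gi + other effects: $\bar{X}_0$ and $\bar{X}_1$ are unequal

Yi = u + Cov + gi + …: $\bar{X}_0 = \bar{X}_1$

            Mean    Freq “1”   Freq “0”
            “75”    0.1        0.9
            “75”    0.9        0.1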
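
The effect of the subgroup covariate can be sketched with a small deterministic example. The data below are hypothetical: two subgroups built to match the means and allele frequencies in the table above, with no true marker effect. Centering on subgroup means stands in for the covariate term; it is an illustration, not the model fitted in the talk.

```python
from statistics import mean

# Hypothetical records (subgroup, marker allele, phenotype); no true marker effect.
lines = (
    [("low", 1, 50.0)] * 1 + [("low", 0, 50.0)] * 9        # mean 50, freq("1") = 0.1
    + [("high", 1, 100.0)] * 9 + [("high", 0, 100.0)] * 1  # mean 100, freq("1") = 0.9
)

def class_means(records):
    """Return the mean phenotype of marker classes 0 and 1."""
    x0 = mean(y for _, m, y in records if m == 0)
    x1 = mean(y for _, m, y in records if m == 1)
    return x0, x1

# Naive model Yi = u + gi: compare marker classes over the whole population.
naive_x0, naive_x1 = class_means(lines)

# Covariate-style adjustment: replace each subgroup mean with the overall mean.
overall = mean(y for _, _, y in lines)
subgroup_mean = {g: mean(y for s, _, y in lines if s == g) for g in ("low", "high")}
adjusted = [(s, m, y - subgroup_mean[s] + overall) for s, m, y in lines]
adj_x0, adj_x1 = class_means(adjusted)

print("naive class means:", naive_x0, naive_x1)
print("adjusted class means:", adj_x0, adj_x1)
```

In this toy data set the naive class means differ only because allele frequency and subgroup mean are confounded; after adjustment both classes sit at the overall mean.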

slide-8
SLIDE 8

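[Figure: marker scores (0/1) for lines in two subgroups with opposite marker-QTL linkage phases]

            Phase     Mean   Freq “1”   Freq “0”   Within subgroup
            1Q, 0q    75     0.5        0.5        $\bar{X}_0 > \bar{X}_1$
            1q, 0Q    75     0.5        0.5        $\bar{X}_0 < \bar{X}_1$

Yi = u + Cov + gi + …: the pooled marker-class means are equal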

slide-9
SLIDE 9

Family-Based AA

  • As individuals become more related, they become more similar
  • Estimate association parameter within lineages
  • Compile and test for significance
slide-10
SLIDE 10

            Phase     Mean   Freq “1”   Freq “0”   Within lineage
            1Q, 0q    75     0.5        0.5        $\bar{X}_0 > \bar{X}_1$
            1q, 0Q    75     0.5        0.5        $\bar{X}_0 < \bar{X}_1$
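
A minimal sketch of why estimating within lineages matters when linkage phase differs between lineages. The two lineages, their sizes, and the assumption that allele Q lowers the phenotype by 10 units are all hypothetical; only the means (75) and allele frequencies (0.5) follow the slide.

```python
from statistics import mean

# Hypothetical records (marker allele, phenotype).
# Assumption for illustration: Q lowers the phenotype by 10 units.
lineage_a = [(1, 70.0)] * 5 + [(0, 80.0)] * 5   # phase 1Q / 0q
lineage_b = [(1, 80.0)] * 5 + [(0, 70.0)] * 5   # phase 1q / 0Q

def marker_effect(records):
    """Mean of marker class 1 minus mean of marker class 0."""
    x1 = mean(y for m, y in records if m == 1)
    x0 = mean(y for m, y in records if m == 0)
    return x1 - x0

# Pooled over lineages, the opposite phases cancel.
pooled = marker_effect(lineage_a + lineage_b)

# Estimated within each lineage, the marker effect is visible (with opposite signs).
within = [marker_effect(lineage_a), marker_effect(lineage_b)]

print("pooled effect:", pooled)
print("within-lineage effects:", within)
```

How the within-lineage estimates are then compiled and tested is not shown here; the pair regression on the following slides is one approach.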

slide-11
SLIDE 11

“Sib” Pair Regression

                   Sib 1    Sib 2    Sib 3
Behavior           Sweet    Sassy    Steady
Hair pigment       2        2        7
Marker             A        A        B

Haseman & Elston, 1972

slide-12
SLIDE 12

Regress squared phenotypic difference on proportion of IBD alleles at the marker

PD = $(X_i - X_j)^2$

            Pair   PD    Marker 1 IBD
            1      0     1 (shared allele)
            2      25    0 (no shared allele)
            3      25    0 (no shared allele)

Regress PD on IBD

slide-13
SLIDE 13

[Figure: regression line of PD (y-axis, up to 25) on IBD (x-axis: 0, 0.5, 1.0)]

B = -25

$B = -2(1 - 2c)^2 \sigma_a^2$

$\sigma_a^2 = 12.5$ if $c = 0$
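
The arithmetic on this slide can be checked with a short sketch using the three sibs from the hair-pigment example. Scoring IBD as 1 when a pair shares the marker allele and 0 otherwise is a simplification assumed here for illustration; variable names are hypothetical.

```python
from itertools import combinations

# Three sibs from the example: hair pigment score and marker allele.
pigment = [2.0, 2.0, 7.0]
marker = ["A", "A", "B"]

# One record per sib pair: (IBD score, squared phenotypic difference).
pairs = []
for i, j in combinations(range(len(pigment)), 2):
    sq_diff = (pigment[i] - pigment[j]) ** 2
    ibd = 1.0 if marker[i] == marker[j] else 0.0  # simplified IBD score
    pairs.append((ibd, sq_diff))

# Ordinary least-squares slope of PD on IBD.
n = len(pairs)
mean_ibd = sum(x for x, _ in pairs) / n
mean_pd = sum(y for _, y in pairs) / n
sxy = sum((x - mean_ibd) * (y - mean_pd) for x, y in pairs)
sxx = sum((x - mean_ibd) ** 2 for x, _ in pairs)
slope = sxy / sxx

# B = -2 * (1 - 2c)^2 * sigma2_a, solved for sigma2_a at recombination fraction c.
c = 0.0
sigma2_a = -slope / (2 * (1 - 2 * c) ** 2)

print("B =", slope)
print("sigma2_a =", sigma2_a)
```

With three pairs the fit is only an arithmetic check of B and the implied additive variance, not a meaningful estimate.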

slide-14
SLIDE 14

Multiple Families: Lineages

Family       n     No. Pairs   Freq 0   Freq 1   Freq 2
Snellers     3     3           0.66     0.33
Vassilyev    69    2346        0.50     0.50
Daad         86    3655        0.35     0.55     0.10
Hatfields    35    595         0.90     0.10
McCoys       35    595         0.90     0.10
Total              7194
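
The pair counts in this table follow from the family sizes: a family of n individuals yields n(n-1)/2 within-family pairs. A quick check, using the family sizes from the slide:

```python
from math import comb

# Family sizes (n) from the slide.
family_sizes = {
    "Snellers": 3,
    "Vassilyev": 69,
    "Daad": 86,
    "Hatfields": 35,
    "McCoys": 35,
}

# Number of within-family pairs: n choose 2.
pair_counts = {name: comb(n, 2) for name, n in family_sizes.items()}
total_pairs = sum(pair_counts.values())

for name, count in pair_counts.items():
    print(name, count)
print("Total", total_pairs)
```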

slide-15
SLIDE 15

Human Genetics

  • Family data is hard to collect, verify parentage
  • Studied populations are not highly structured (random)
  • Careful a priori sampling to minimize effect of structure
  • VERY large population size

[Table: X marks under FBAA and PBAA columns for the points above]

slide-16
SLIDE 16

FBAA Example: 206 Barley Lines, Barley CAP

  • Derived from 65 biparental crosses
  • Average 3.1 progeny per cross
  • DON data from three environments
    – h² = 0.52
  • Genotyped with 2924 SNP markers (BOPA_C(1))

  • Analysis used 676 SNPs (PIC > 0.18)
slide-17
SLIDE 17

PCA of Genetic Similarity Matrix

[Figure: scatter plot of PC2 scores against PC1 scores; clusters labeled ND, ND, and AB & MN]

Average GS = 0.62 +/- 0.13

slide-18
SLIDE 18

Developing Pairs for the Pair-Regression

[Figure: scatter plot of PC2 scores against PC1 scores]

  • 3 lineages
  • Used lineages with PIC > 0.18
  • Used pairs with GS > 0.75
  • Average GS = 0.62 +/- 0.13
  • N = 29
  • 5886 pairs
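
The pair-selection step could be sketched as follows. The genotypes, line names, and lineage labels are hypothetical, and genetic similarity is computed here as a simple-matching coefficient, which is an assumption; the slides do not state which similarity measure was used or exactly how pairs were restricted to lineages.

```python
from itertools import combinations

# Hypothetical marker scores (0/1) for four lines in two lineages.
genotypes = {
    "L1": [1, 1, 0, 1, 0, 1, 1, 0],
    "L2": [1, 1, 0, 1, 0, 1, 0, 0],
    "L3": [0, 0, 1, 0, 1, 0, 0, 1],
    "L4": [0, 0, 1, 0, 1, 0, 1, 1],
}
lineage = {"L1": "A", "L2": "A", "L3": "B", "L4": "B"}

def similarity(a, b):
    """Simple-matching similarity: share of markers with equal scores (assumed measure)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

GS_THRESHOLD = 0.75  # threshold quoted on the slide

# Keep pairs from the same lineage whose similarity exceeds the threshold.
selected = [
    (p, q)
    for p, q in combinations(sorted(genotypes), 2)
    if lineage[p] == lineage[q]
    and similarity(genotypes[p], genotypes[q]) > GS_THRESHOLD
]

print(selected)
```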

slide-19
SLIDE 19

Models

TASSEL (Q+K): Yi = u + Cov + gi + polygene

  • Cov: STRUCTURE

Pair Regression: $PD_i = u + B_1 S_i + B_2 I_i$

  • u: intercept
  • $S_i$: genetic similarity
  • $I_i$: IBD proportion

Covariance of individuals within a lineage
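
A minimal sketch of fitting the pair-regression model by ordinary least squares, using only the standard library. The function name and the data are hypothetical; the data are generated exactly from PD = 40 - 20 S - 10 I so the fit can be checked. This does not reproduce the significance testing used in the study.

```python
def fit_pair_regression(pd_vals, s_vals, i_vals):
    """Least-squares fit of PD_i = u + B1*S_i + B2*I_i; returns (u, B1, B2)."""
    n = len(pd_vals)
    m_pd = sum(pd_vals) / n
    m_s = sum(s_vals) / n
    m_i = sum(i_vals) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m_s) ** 2 for a in s_vals)
    s22 = sum((b - m_i) ** 2 for b in i_vals)
    s12 = sum((a - m_s) * (b - m_i) for a, b in zip(s_vals, i_vals))
    s1y = sum((a - m_s) * (y - m_pd) for a, y in zip(s_vals, pd_vals))
    s2y = sum((b - m_i) * (y - m_pd) for b, y in zip(i_vals, pd_vals))
    # Solve the 2x2 normal equations for the two slopes.
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    u = m_pd - b1 * m_s - b2 * m_i
    return u, b1, b2

# Hypothetical pairs: genetic similarity S, IBD proportion I, and PD built from them.
s_vals = [0.76, 0.80, 0.85, 0.90, 0.95, 0.78]
i_vals = [0.0, 1.0, 0.0, 1.0, 0.5, 0.5]
pd_vals = [40 - 20 * s - 10 * i for s, i in zip(s_vals, i_vals)]

u, b1, b2 = fit_pair_regression(pd_vals, s_vals, i_vals)
print(u, b1, b2)
```

As in the sib-pair regression, a negative coefficient on the IBD proportion means that pairs sharing more alleles at the marker are phenotypically more alike.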

slide-20
SLIDE 20

[Figure: Chromosome 4H. Marker-by-marker significance (asterisks) for pair regression (PR) and TASSEL (T), with LOD and variance (VAR) columns; legend: Prob < .00001; methods compared: Pair Reg, TASSEL]

slide-21
SLIDE 21

TASSEL vs Pair-Regression

# of QTL

            TASSEL & Pair-Regression   16
            TASSEL only                1
            Pair-Regression only       4

Population well suited for both

  • Clear lineages
  • 3 lineages

slide-22
SLIDE 22

[Table: QTL detected on chromosomes 1H, 3H, 5H, 6H and 7H; columns: Xsm, cM, Var, PR, T, (LOD)]

slide-23
SLIDE 23

FBAA is Well Suited for Plant Breeding Populations

  • Populations are EXTREMELY relevant
  • Many lines are phenotyped annually
  • Multiple large lineages are present

    – Full Sibs
    – Half Sibs
    – Other degrees of relationship, lineages

slide-24
SLIDE 24

2009 YR1 Phenotyping: FHB Index

[Figure: FHB Index (%) (y-axis, 5 to 45) by cross (x-axis, crosses 1 to 47); labels S and MR]

  • 570 lines
  • 47 crosses
  • 12 lines/cross
  • Many crosses segregating
  • 4597 full-sib pairs

FBAA to evaluate a marker in a breeding population:

  • 1. Build lineages based on pedigree: full sibs (FS), half sibs (HS)
  • 2. Genotype for marker to be tested
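
Step 1 above (building lineages from pedigree) might look like the following sketch. The pedigree, line names, and parent names are hypothetical, and the rule used (two shared parents = full sibs, one shared parent = half sibs) is the simplest possible reading of FS and HS, not a description of the authors' procedure.

```python
from itertools import combinations

# Hypothetical pedigree: line name -> (parent 1, parent 2).
pedigree = {
    "line1": ("P1", "P2"),
    "line2": ("P1", "P2"),
    "line3": ("P2", "P1"),  # reciprocal cross, treated as the same family
    "line4": ("P1", "P3"),
    "line5": ("P4", "P5"),
}

def relationship(a, b):
    """Classify a pair as full sibs ("FS"), half sibs ("HS"), or None."""
    shared = len(set(pedigree[a]) & set(pedigree[b]))
    if shared == 2:
        return "FS"
    if shared == 1:
        return "HS"
    return None

# Collect every pair of lines by relationship class.
pairs = {"FS": [], "HS": []}
for a, b in combinations(sorted(pedigree), 2):
    kind = relationship(a, b)
    if kind is not None:
        pairs[kind].append((a, b))

print(len(pairs["FS"]), "full-sib pairs")
print(len(pairs["HS"]), "half-sib pairs")
```

Each resulting pair would then be scored for the phenotypic difference and for the marker being tested, as in the pair regression.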

slide-25
SLIDE 25

Other Types of FBAA

  • Quantitative Inbred Pedigree Disequilibrium Test
  • Two-level Haseman-Elston Regression
slide-26
SLIDE 26

Quick Takes on FBAA

  • Only 1 study so far; much more work, including simulations, is needed to assess applications
  • Well suited for breeding populations
  • May circumvent some issues inherent to population-based AA
  • Can handle rare alleles
  • QTL validation & evaluation in breeding populations
  • Stability of QTL effects over lineages
slide-27
SLIDE 27

Thanks

  • Kevin Smith, Jon Massman
  • Barley CAP folks
  • Dr Elston
  • Diane Mather
slide-35
SLIDE 35

Types of Plant Populations and Association Analyses

                          Diverse                Breeding           Biparental
Amount of structure       Lots (evolution)       Lots (breeding)    Very little
Relevance to breeding     Some                   Lots               Variable
Type of analysis          Population-based AA    Family-based AA    CIM
Number of parents         Many (ancestors)       Many (elite)       2 (selected)

slide-36
SLIDE 36

Association Analysis:

  • Associate variation of marker genotypes with variation of phenotypes
  • Imply linkage of marker locus and QTL

Associate: to connect in the mind or imagination

Link: to connect, to tie or bind

[Figure: marker locus M and QTL Q]

slide-37
SLIDE 37

[Figure: parents P1 = 1 Q and P2 = 0 q; progeny carry 1 Q or 0 q; marker-class means $\bar{X}_0$ and $\bar{X}_1$]

Yi = u + gi + other effects

  • 1. Pop0 and Pop1 likely equivalent if large
  • 2. Two alleles
  • 3. High LD
  • 4. Significance requires linkage

slide-38
SLIDE 38

Yi = u + gi + other effects

H0: $\bar{X}_0 = \bar{X}_1$

  • Population
  • Genotyped: 1 marker
  • Phenotyped
  • Test association: parameters are means
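
A means comparison of this kind can be sketched with a two-sample t statistic. The phenotype values are hypothetical, and Welch's unequal-variance form is chosen here only for illustration; the slides do not specify which test statistic was used.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical phenotypes of lines carrying marker allele 0 or allele 1.
pheno_0 = [70.0, 72.0, 68.0, 71.0, 69.0]
pheno_1 = [75.0, 77.0, 74.0, 76.0, 78.0]

def welch_t(a, b):
    """Welch's t statistic for the difference between two group means."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

t_stat = welch_t(pheno_0, pheno_1)
print("mean 0:", mean(pheno_0), "mean 1:", mean(pheno_1), "t:", t_stat)
```

A large absolute t value argues against H0, but as the earlier slides note, in a structured population such a difference can arise without linkage.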

slide-39
SLIDE 39