

SLIDE 1

East Asian mtDNA haplogroup determination in Koreans:

Haplogroup-level coding region SNP analysis and subhaplogroup-level control region sequence analysis

Hwan Young Lee, Ji-Eun Yoo, Myung Jin Park, Ukhee Chung, Kyoung-Jin Shin, Chong-Youl Kim

Department of Forensic Medicine, College of Medicine, Yonsei University, Seoul, Korea
Human Identification Research Institute, Yonsei University, Seoul, Korea

ISFG 2005

Korean mtDNA database establishment and haplogroup assignment

A high quality mtDNA control region sequence database was established from 593 Koreans (http://forensic.yonsei.ac.kr/).
Based on shared haplogroup-specific polymorphisms in the control region sequence, 592 mtDNAs (99.8%) were classified into various East Asian haplogroups or subhaplogroups.

Statistical parameters were calculated using “mtDNA Star” (K-J Shin, Yonsei University, unpublished).

SLIDE 2

mtDNA haplogroup determination has practical value in the forensic field

Sequencing and documenting processes are prone to copying errors (e.g. base shift, reference bias, phantom mutations, base mis-scoring, artifactual recombination).
As mtDNA evolves along a tree, assigning new mtDNA types to a spot in the global mtDNA tree can prevent potential errors in an mtDNA database.
Phylogenetic analysis is the key tool in understanding the structure of the mtDNA data.

The ideal approach is confirmation of diagnostic coding region SNPs

Previously identified control region mutation motifs cannot exactly define major haplogroups and their subhaplogroups unless they are complemented with coding region information.
As an example, D4, G and M9 mtDNAs cannot be distinguished using control region sequence polymorphisms alone:

Sample | Haplogroup | Control region sequence
BF4271 | D4 | 16172Y 16189 16223 16278 16362 73 263 309.1C 315.1C 489 573.1C 573.2C 573.3C 573.pC
409 | G | 16189 16223 16269 16278 16362 73 260 263 284 309.1C 309.2C 315.1C 489
BF4102 | M9 | 16223 16234 16274 16362 73 153 263 315.1C 489
476 | G | 16078T 16179 16223 16234 16362 16519 73 152 263 309.1C 309.2C 315.1C 489
385 | G | 16223 16362 16519 73 263 309.1C 315.1C 489
BF4229 | D4 | 16223 16362 16519 73 263 309.1C 315.1C 489 523d 524d
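The overlap between these six samples can be checked mechanically. The following is a minimal sketch, not part of the original slides: it copies the variants from the table above into a hypothetical `SAMPLES` dictionary, intersects them, and tests whether the 16223-16362-489 motif is present in every sample regardless of its confirmed haplogroup.

```python
# Control region variants copied from the table above.
# SAMPLES is a hypothetical name: sample ID -> (confirmed haplogroup, variants).
SAMPLES = {
    "BF4271": ("D4", "16172Y 16189 16223 16278 16362 73 263 309.1C 315.1C 489 "
                     "573.1C 573.2C 573.3C 573.pC"),
    "409":    ("G",  "16189 16223 16269 16278 16362 73 260 263 284 309.1C 309.2C "
                     "315.1C 489"),
    "BF4102": ("M9", "16223 16234 16274 16362 73 153 263 315.1C 489"),
    "476":    ("G",  "16078T 16179 16223 16234 16362 16519 73 152 263 309.1C "
                     "309.2C 315.1C 489"),
    "385":    ("G",  "16223 16362 16519 73 263 309.1C 315.1C 489"),
    "BF4229": ("D4", "16223 16362 16519 73 263 309.1C 315.1C 489 523d 524d"),
}

# Motif shared by the D4 paragroups according to the slides.
D4_MOTIF = {"16223", "16362", "489"}


def shared_variants(samples):
    """Return the variants present in every sample."""
    variant_sets = [set(variants.split()) for _, variants in samples.values()]
    return set.intersection(*variant_sets)


def motif_in_all(samples, motif):
    """Return True if every sample carries the full motif."""
    return all(motif <= set(variants.split()) for _, variants in samples.values())


if __name__ == "__main__":
    print(sorted(shared_variants(SAMPLES)))
    print(motif_in_all(SAMPLES, D4_MOTIF))
```

Because the motif is present in the G and M9 samples as well as the D4 samples, a control region match alone cannot separate them.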


SLIDE 3

Design of three multiplex systems for coding region SNP scoring

Multiplex I (multiplex PCR followed by multiplex SNaPshot)
Haplogroups scored: M, D4, D5, M9, M10, M11, A, N9
[Figure: multiplex PCR products shown against 100, 115, 130, 145 and 160 bp size labels, and multiplex SNaPshot peaks labeled by haplogroup. The SNP site labels are not recoverable from the extracted text.]

Multiplex II (multiplex PCR followed by multiplex SNaPshot)
Haplogroups scored: D, G, M7, M8, R, R9, B
[Figure: multiplex PCR products shown against 100, 115, 130, 145 and 160 bp size labels, and multiplex SNaPshot peaks labeled by haplogroup. One scored site is labeled "9 bp del"; the other SNP site labels are not recoverable from the extracted text.]

SLIDE 4

Design of three multiplex systems for coding region SNP scoring

Multiplex III (multiplex PCR followed by multiplex SNaPshot)
Haplogroups scored: D4, D4a, D4b, D4e, D4g, D4j
[Figure: multiplex PCR products shown against 100, 115, 130, 145 and 160 bp size labels, and multiplex SNaPshot peaks labeled by haplogroup. The SNP site labels are not recoverable from the extracted text.]

Control region motifs for East Asian haplogroups were identified

Haplogroup | HV1 | HV2 | HV3 etc
D4 | 16223-16362 | | 489
D4a | 16129-16223-16362 | 152 | (16519), 489
D4b1 | 16223-16319-16362 | | 489-523d-524d
D4b2* | 16223-16362 | | 489-523d-524d
D4b2b | (16223)-16362 | 194 | 16519, 489-523d-524d
D4d | 16245-16362 | | 489
D4e* | 16223-16362 | | 489
D4e1 | 16223-16362 | 94 | 489
D4g1 | 16223-16278-16362 | | 489-573.pC
D4h* | 16223-16362 | | 489
D4h1 | 16174-16223-16362 | 146-183 | 489
D4h2 | 16174-16223-16311-16362 | 152 | 489
D4i | 16223-16294-16362 | | 489
D4j* | 16223-16362 | | 489
D4j1 | 16184-16223-16311-16362 | | 489
D4k1 | 16192-16223 | 195 | 489
D4k2 | 16223-16274-16290-16319-16362 | 195 | 489
D4m | 16244-16362 | | 489
D4n | 16223-16355A-16362 | | 489
D5 | 16189-16223-16362 | 150 | 489
D5a1 | 16182C-16183C-16189-16223-16362 | 150-309d | 16390-68-489
D5a2 | 16182Y-16183C-16189-16223-16266-16362 | 150 | 489-523d-524d
D5b | 16189-16223-16362 | 150 | 456-489
D5c | 16188.1C-16193.1C-16362 | 150-152 | 489
G | 16223-16362 | | 489
G1a | 16223-16325-16362 | 150 | 16519, 489
G1b | 16184-16214-16223-16362 | | 489
G2a1 | 16189-16223-16278-16362 | | 489
G2a1a | 16223-16227-16278-(16362) | | 489
G2a2 | 16051-16150-16223-16278-16362 | | 489
G2a3 | 16223-16278-16303-16362 | | 489
G2a4 | 16204-16223-16278-16362 | | 489
G3a | 16223-16274-16362 | 143-152 | 489
M7a | 16209-16223 | | 489
M7a1 | 16209-16223-16324 | | 489-(523d-524d)
M7b1 | 16129-16192-16223-16297 | 150-199 | 489
M7b2 | 16129-16189-16223-16297-16298 | 150-199 | 489
M7c | 16223 | 146-199 | 16519, 489-523d-524d
M7c1 | 16223-16295 | (146)-199 | 16519, 489-523d-524d (or 513d-514d)
M8 | 16223-16298 | | 489
M8a | 16184-16223-16298-16319 | | 489
C | (16223)-16298-16327 | | 456-489
pre-Z | 16223-16260-16298 | 249d | 16519, 489
Z | 16185-16223-16260-16298 | 152-249d | 489
M9 | 16223-16234-16362 | 152-249d (or 247d) | 489
M9a | 16223-16234-16316-16362 | | 489
M10a | 16129-16223-16311-16357 | | 16497, 489, 523d-524d-573.pC
M10b | 16066-16223-16311 | | 489-573.pC
M11 | 16223 | 215-318-326 | 489
F | (16304) | 249d |
F1ac | 16129-16304 | 249d | 16519, 523d-524d
F1a1 | (16129)-16162-16172-(16304) | 249d | (16519), 523d-524d
F1a2 | 16172-16284-16304-16311 | 249d | 16390-16519, 523d-524d
F1c | 16111-16129-16304 | 152-249d | 16519, 523d-524d
F1b | 16183C-16189-16304 | 249d | 16519, 523d-524d
F1b1 | (16182C)-16183C-16189-16232A-16249-16304-16311 | 249d | (16519), 523d-524d
F2a | 16291-16304 | 249d |
R11 | 16189-16311 | 185-189 |
B4 | 16183C-16189-16217 | |
B4a | 16182C-16183C-16189-16217-16261 | | (16519), 523d-524d
B4b1 | 16136-16183C-16189-16217 | | 16519, 499
B4d | 16183C-16185-16189d-16217-16234 | | 546
B4c1a | 16183C-16189-16217-16311 | | 16519
B4c1b | 16140-16183C-16189-16217-16274-16335 | 150 |
B4c1c | 16183C-16189-16217-16311 | 150-195-214 |
B4f | 16168-16172-16183C-16189-16217-16249-16325 | 200 | 16390
B5a1 | 16129-16140-16187-16189-16266R | 93-210 | 16519, 523d-524d
B5b | 16140-16183C-16189-16243 | | 16519, 523d-524d (or 513d-514d)
A | 16223-16290-16319 | 235 |
A4 | 16223-16290-16319-16362 | 235 | 523d-524d
A5 | 16187-16223-16290-16319 | 235 | 523d-524d
N9a | 16223-16257A-16261 | 150 |
N9a1 | 16129-16223-16257A-16261 | 150 |
N9a2 | 16172-16223-16257A-(16261) | 150 |
N9a2a | 16172-16223-16257A-16261 | 150 | 16497
N9b | 16183C-16189-16223 | | 16519
Y1b | 16126-16231-16266 | 146 | 16519
Y2 | 16126-16231-16311 | | 482
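A motif table of this kind lends itself to a simple subset test. The sketch below is only an illustration under stated assumptions and is not the authors' software: `MOTIFS` is a hypothetical five-row excerpt of the table, variants shown in parentheses on the slide are treated as optional and left out, and the "largest fully matched motif wins" rule is an assumption made for the example.

```python
# Hypothetical excerpt of the motif table: haplogroup -> required variants.
# Parenthesized (optional) variants such as (16519) are deliberately omitted.
MOTIFS = {
    "D4":   {"16223", "16362", "489"},
    "D4a":  {"16129", "16223", "16362", "152", "489"},
    "G":    {"16223", "16362", "489"},
    "G2a1": {"16189", "16223", "16278", "16362", "489"},
    "M9a":  {"16223", "16234", "16316", "16362", "489"},
}


def candidates(variants):
    """Return the haplogroups with the largest fully matched motif (ties kept)."""
    observed = set(variants)
    hits = [(len(motif), name) for name, motif in MOTIFS.items() if motif <= observed]
    if not hits:
        return []
    best = max(size for size, _ in hits)
    return sorted(name for size, name in hits if size == best)


if __name__ == "__main__":
    # A profile carrying the D4a motif resolves to a single candidate.
    d4a_like = ["16093", "16129", "16223", "16362", "73", "152", "263",
                "309.1C", "315.1C", "489"]
    print(candidates(d4a_like))

    # A profile carrying only 16223-16362-489 stays ambiguous between D4 and G.
    ambiguous = ["16223", "16362", "16519", "73", "263", "309.1C", "315.1C", "489"]
    print(candidates(ambiguous))
```

The second call returns more than one candidate, which is the situation where coding region SNP scoring is needed to settle the assignment.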


SLIDE 5

Coding region SNP scoring is useful for molecular dissection of the D4 haplogroup

Coding region SNP scoring using Multiplex III

Haplogroup | Freq. (%)
D4* | 8.26
D4a | 5.06
D4b | 0.84
D4j | 2.35
D4b2 | 3.71
D4b1 | 1.69
D4e | 2.52
D4g | 1.00

[Figure: Frequency distribution of haplogroups determined by control region motifs. x-axis: haplogroup; y-axis: frequency (%).]

Control region motifs for D4 subhaplogroups are identified

Haplogroup | HV1 | HV2 | HV3 etc
D4 | 16223-16362 | | 489
D4a | 16129-16223-16362 | 152 | (16519)-489
D4b1 | 16223-16319-16362 | | 489-523d-524d
D4b2* | 16223-16362 | | 489-523d-524d
D4b2b | (16223)-16362 | 194 | 16519-489-523d-524d
D4d | 16245-16362 | | 489
D4e* | 16223-16362 | | 489
D4e1 | 16223-16362 | 94 | 489
D4g1 | 16223-16278-16362 | | 489-573.pC
D4h* | 16223-16362 | | 489
D4h1 | 16174-16223-16362 | 146-183 | 489
D4h2 | 16174-16223-16311-16362 | 152 | 489
D4i | 16223-16294-16362 | | 489
D4j* | 16223-16362 | | 489
D4j1 | 16184-16223-16311-16362 | | 489
D4k1 | 16192-16223 | 195 | 489
D4k2 | 16223-16274-16290-16319-16362 | 195 | 489
D4m | 16244-16362 | | 489
D4n | 16223-16355A-16362 | | 489

SLIDE 6

Coding region SNP scoring is indispensable for some haplogroups

One haplotype assigned to G2a1 from its control region sequence was found to belong to haplogroup D4g, and of the haplotypes assigned to D4, eight turned out to belong to haplogroup G and one to haplogroup M9.
D4 paragroups, e.g., D4*, D4b2*, D4e* and D4j*, which share the mutation motif 16223-16362-489, need coding region SNP scoring for exact haplogroup determination.
Complementing control region polymorphisms with coding region SNP information enables mtDNA data quality control and molecular dissection of haplogroups.

The multiplex systems proved efficient in the analysis of skeletal remains

An efficiency test was performed on 101 skeletal remains from Korean War (1950-1953) victims.
The small amplicon sizes allowed SNPs in old skeletal remains to be scored successfully and without artifacts.

Example
HV1-HV2-HV3 region sequence: 16093-16129-16223-16362, 73-152-263-309.1C-315.1C, 489
Multiplex I: 14668T (D4)
Multiplex II: 4883T (D)
Multiplex III: 14979C (D4a)
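The way the three SNP calls narrow the assignment from D to D4 to D4a can be sketched as a walk down the tree. This is a minimal illustration, not the authors' scoring software: the `DIAGNOSTIC_SNPS` list and the `resolve` function are hypothetical names, the pairing of each SNP with a multiplex system is inferred from the haplogroups each system scores, and only the three sites named on this slide are modelled.

```python
# Diagnostic coding region SNPs named on the slide, ordered from the root of
# the D lineage downward. The multiplex pairing in the comments is an inference.
DIAGNOSTIC_SNPS = [
    ("D",   "4883",  "T"),   # assumed to be scored in Multiplex II
    ("D4",  "14668", "T"),   # assumed to be scored in Multiplex I
    ("D4a", "14979", "C"),   # assumed to be scored in Multiplex III
]


def resolve(calls):
    """Return the deepest haplogroup whose diagnostic allele, and every one above it, is observed.

    `calls` maps a position (string) to the observed base (string).
    Returns None when the D-defining allele is absent.
    """
    assigned = None
    for haplogroup, position, allele in DIAGNOSTIC_SNPS:
        if calls.get(position) != allele:
            break
        assigned = haplogroup
    return assigned


if __name__ == "__main__":
    # SNP calls from the skeletal remain example on this slide.
    print(resolve({"14668": "T", "4883": "T", "14979": "C"}))
```

Stopping at the first missing allele keeps the assignment conservative: a sample is only placed as deep in the tree as its confirmed SNPs allow.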

SLIDE 7

East Asian haplogroups can be determined using “mtDNA Sequence Manager”

We have developed a haplogroup-determining program, “mtDNA Sequence Manager” (K-J Shin, Yonsei University, unpublished), based on the collated control region mutation motifs for East Asian haplogroups and subhaplogroups.
Using this program, the 593 Korean mtDNAs and the 101 Korean War victim mtDNAs can be classified into various East Asian haplogroups or subhaplogroups.

Concluding remarks

East Asian haplogroup determination is efficiently carried out using haplogroup-level coding region SNP analysis and subhaplogroup-level control region sequence analysis.
Identification of control region mutation motifs and molecular dissection of haplogroups can be achieved by coding region SNP analysis.
The three multiplex systems work well even with degraded samples and offer a promising tool for forensic and human genetic studies involving East Asian mtDNA haplogroups.

SLIDE 8

Acknowledgement

We thank our lab members and the Kokuryo Research Foundation for research funding support.