In silico blood genotyping from exome sequencing data
Silvio Tosatto
BioComputing UP, Department of Biology, University of Padova, Italy URL: http://protein.bio.unipd.it/
Today
Personalized genetics has been upon us for some time
from raw sequence (or genotype) data
predicted for real people with genetic data
Personal Genome Project are provided (PGP-10)
Dataset provided by George Church
Numerical traits
(in mg/dL) *
(in mg/dL)
(Blood. 2009;114: 248-256)
Blood group | Genes | Antigens
ABO | ABO | A, B, O
Amino acid residues differing between blood group A- and B-active transferases, respectively (Arg176Gly; Gly235Ser; Leu266Met; Gly268Ala) are shown with the single-letter code and their positions indicated.
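The four residue differences listed above can be read as a small lookup. The following is a minimal sketch, assuming the residues at those positions have already been extracted from the translated ABO sequence; the function name and the input format are hypothetical, and O alleles and all other ABO variants are not covered.

```python
# Residues that distinguish the A- and B-active transferases, as listed above.
# Maps position -> (residue in A, residue in B), single-letter codes.
AB_DIFFERENCES = {
    176: ("R", "G"),  # Arg176Gly
    235: ("G", "S"),  # Gly235Ser
    266: ("L", "M"),  # Leu266Met
    268: ("G", "A"),  # Gly268Ala
}


def classify_transferase(residues):
    """Return 'A', 'B' or 'undetermined' for a dict of position -> residue.

    Hypothetical helper: it only checks the four positions above.
    """
    matches_a = all(residues.get(pos) == a for pos, (a, _) in AB_DIFFERENCES.items())
    matches_b = all(residues.get(pos) == b for pos, (_, b) in AB_DIFFERENCES.items())
    if matches_a:
        return "A"
    if matches_b:
        return "B"
    return "undetermined"


if __name__ == "__main__":
    print(classify_transferase({176: "R", 235: "G", 266: "L", 268: "G"}))
```

A sequence that mixes A-like and B-like residues, or lacks one of the positions, is reported as undetermined rather than forced into either class.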
Blood group | Genes | Antigens
ABO | ABO | A, B, O
RH | RHCE, RHD | D, E, C plus 50 minor
Duffy | DARC | FY(a), FY(b)
Kell | KEL | K1, K2 plus 23 minor
Diego | SLC4A1 | Di(a), Di(b), Wr(a), Wr(b)
Kidd | SLC14A1 | Jk(a), Jk(b)
Lewis | FUT3 | Le(a), Le(b)
Lutheran | BCAM | Lu(a), Lu(b) plus 15 minor
MNS | GYPA, GYPB, GYPE | M, N, S plus 40 minor
Bombay | FUT1, FUT2 | H, secretor
then <phenotype(s)>” form
Relevant variants
– Gene-based annotation of variants
– Select conserved positions
– Remove unrelated genes
(Wang et al., Nucleic Acids Research 2010)
Millions of SNVs
ANNOVAR is used to reduce the SNVs to a manageable number.
Few relevant SNVs
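The reduction from millions of SNVs to a few relevant ones can be pictured as a filter over annotated variants. This is a minimal sketch, assuming each variant has already been annotated with a gene name and a conservation flag; the record fields `gene` and `conserved` and the function name are hypothetical and do not reproduce ANNOVAR's actual output format.

```python
# Genes of the blood group systems considered in this presentation.
BLOOD_GROUP_GENES = {
    "ABO", "RHCE", "RHD", "DARC", "KEL", "SLC4A1", "SLC14A1",
    "FUT3", "BCAM", "GYPA", "GYPB", "GYPE", "FUT1", "FUT2",
}


def select_relevant(variants, genes=BLOOD_GROUP_GENES):
    """Keep variants in blood group genes that fall on conserved positions.

    `variants` is an iterable of dicts with the hypothetical keys
    'gene' (str) and 'conserved' (bool).
    """
    return [v for v in variants if v["gene"] in genes and v["conserved"]]


if __name__ == "__main__":
    # Placeholder records for illustration only, not real calls.
    annotated = [
        {"id": "var1", "gene": "ABO", "conserved": True},
        {"id": "var2", "gene": "TP53", "conserved": True},   # unrelated gene
        {"id": "var3", "gene": "KEL", "conserved": False},   # not conserved
    ]
    print([v["id"] for v in select_relevant(annotated)])
```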
The mission of the PGP is to encourage the development of personal genomics
Personal Genome Project are provided (PGP-10)
1,000 genomes
Unfortunately, only ABO and Rh blood group information is available
Back row (left to right): James Sherley, Misha Angrist, John Halamka, Keith Batchelder, Rosalynn Gill. Front row (left to right): Esther Dyson, George Church, Kirk Maxey. Not shown: Stan Lapidus and Steven Pinker.
System | PGP1 | PGP4 | PGP8
Known | O+ | A- | B+
ABO | O | A | B
Rh | c; e; weak D | c; e; weak D | c; e; weak D
Duffy | FY(a+); FY(b-) | FY(a-); FY(b+) | FY(a-); FY(b+)
Kell | K2; K21+; K4-; K3-; K11; K17; K14; K24; K6+; K7- | K2; K21+; K4-; K3-; K11; K17; K14; K24; K6+; K7- | K2; K21+; K4-; K3-; K11; K17; K14; K24; K6+; K7-
Diego | Dib; Memph neg | Dib; Memph neg | Dib; Memph neg
Kidd | Jk(a-); Jk(b+) | Jk(a-); Jk(b+) | Jk(a+); Jk(b-)
Lewis | negative | negative | negative
Lutheran | Lu(a-); Lu(b+); Lu6+; Lu9-; Lu4; Lu8+; Aua+; Aub- | Lu(a-); Lu(b+); Lu6-; Lu9+; Lu4-; Lu8+; Aua-; Aub+ | Lu(a-); Lu(b+); Lu6+; Lu9-; Lu4-; Lu8+; Aua+; Aub-
MNS | M; S | M; s | M; s
Bombay | H+; secretor | H+; secretor | H+; secretor
BOOGIE correctly predicts all ABO types and all Rh groups except one (PGP-4).
blood group information for a total of 22 individuals
P = predicted R = real
groups relevant for transfusions from sequencing data
– Specialized knowledgebase with 580 genotype-to-phenotype rules
– Novel variants can be easily considered
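A knowledgebase of genotype-to-phenotype rules can be sketched as a list of conditions that are matched against the variants observed in one individual. The sketch below is an assumption about how such rules might be represented, not BOOGIE's actual implementation; the `Rule` type, the `predict` function and the variant identifiers are all hypothetical placeholders.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """One 'if genotype then phenotype' rule (hypothetical representation)."""
    system: str          # blood group system the rule belongs to
    required: frozenset  # variant identifiers that must all be observed
    phenotype: str       # phenotype called when the rule fires


def predict(observed, rules):
    """Return {system: [phenotypes]} for every rule satisfied by `observed`."""
    observed = set(observed)
    calls = {}
    for rule in rules:
        if rule.required <= observed:  # all required variants are present
            calls.setdefault(rule.system, []).append(rule.phenotype)
    return calls


if __name__ == "__main__":
    # Placeholder rules and variant names, for illustration only.
    rules = [
        Rule("SystemX", frozenset({"geneX:v1"}), "X(a+)"),
        Rule("SystemX", frozenset({"geneX:v1", "geneX:v2"}), "X(b+)"),
        Rule("SystemY", frozenset({"geneY:v1"}), "Y+"),
    ]
    # Supporting a novel variant amounts to appending one more rule.
    rules.append(Rule("SystemY", frozenset({"geneY:novel"}), "Y-"))
    print(predict({"geneX:v1", "geneY:novel"}, rules))
```

Keeping the rules as data rather than code is what makes a newly described variant cheap to support: it becomes one more entry in the list.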
Rh blood groups
– The ABO and Rh systems are correctly predicted in 85-100% of cases
– The Rh- type presents some additional difficulties
URL: http://protein.bio.unipd.it/
Funding
FIRB Futuro in Ricerca
Università di Padova CARIPLO AIRC