How do genomes function? Using machine vision and computation to describe genome function at the organismal level.
Tessa Durham Brooks, Ph.D.
Doane College Department of Biology
Anticipation at the dawn of the Genomic Era
“Within the next few years, technologies developed for the Human Genome Project and similar sequencing efforts will revolutionize medicine, agriculture, crimefighting, and other fields.” – Gwynne and Page, Science, 2000
~20,000 genes, 97 million base pairs, sequence finished 1998
~25,000 genes, 100 million base pairs, sequence finished 2000
~22,000 genes, 137 million base pairs, sequence finished 2000
For reference: humans have about 25,000 genes and 3.2 billion base pairs.
The task seems likely to change the nature of biological research, requiring teams of engineers, mathematicians, nanotechnologists and computer programmers, and farms of computers if not a national computer grid.
At a conference this month …, biologists tried to explore how the study of genomes might develop over the next 20 years and what tools might be needed. Central to their vision of the future is a thorough computerization
~25,000 genes
[Figure: tip angle (deg) versus time (h) for glr3.3-1, glr3.3-2, and SalkCol, with first-order (swing rate) and second-order (acceleration) panels comparing glr3.3-1 vs wt and glr3.3-2 vs wt.]
Miller, Durham Brooks, and Spalding 2010, Genetics
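The first-order (swing rate) and second-order (acceleration) curves named in the figure are time derivatives of the tip-angle series. A minimal sketch of how such curves could be estimated with finite differences follows. The function name `central_differences` and the sample values are hypothetical, and the slides do not say which numerical method the authors actually used.

```python
def central_differences(times, values):
    """Estimate d(value)/d(time) at each sample.

    Uses central differences in the interior and one-sided
    differences at the two endpoints.
    """
    n = len(values)
    if n != len(times) or n < 2:
        raise ValueError("need at least two samples, one time per value")
    rates = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        rates.append((values[hi] - values[lo]) / (times[hi] - times[lo]))
    return rates


# Hypothetical tip-angle series (degrees) sampled every 2 h.
# These numbers are illustrative only, not data from the talk.
time_h = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
tip_angle_deg = [0.0, 20.0, 45.0, 65.0, 78.0, 85.0]

swing_rate = central_differences(time_h, tip_angle_deg)  # deg/h
acceleration = central_differences(time_h, swing_rate)   # deg/h^2
```

Differentiation amplifies measurement noise, so a real pipeline would likely smooth the angle series before taking the second difference.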
Doane Phytomorph
Matthieu Reymond, Max-Planck Institute: QTL Analysis
Durham Brooks, Miller, and Spalding 2010, Plant Physiology
One ecotype
[Figure: LOD by position (cM) and time (minutes).]
Seed size: small, large. Seedling age: 2 d, 3 d, 4 d. 164 lines × 15 individuals.
Moore, et al., unpublished result
Data Capture
Data Storage (30 TB)
Data Compression and Analysis
Feature Extraction
Data Storage (X TB)
Schorr Center and OSG
0.5 TB/day
Data Compression
QTL analysis
uncompressed TIFs
lossless-compressed PNGs
least significant bits of the first 14 pixels of each image
using FFV1 codec
intraframe codec
FFV1 codec
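The compression notes above mention lossless compression and the least significant bits of the first 14 pixels of each image. The sketch below shows why the two go together: anything stored in low-order bits survives only if compression is lossless. It makes several assumptions the slides do not confirm: one bit per pixel, most significant bit first, and an integer tag as the payload. The helpers `write_lsb_tag` and `read_lsb_tag` are hypothetical, and `zlib` (DEFLATE, the scheme PNG uses) stands in for real PNG or FFV1 encoding.

```python
import zlib

N_PIXELS = 14  # from the slide: "the first 14 pixels of each image"


def write_lsb_tag(pixels, tag):
    """Return a copy of `pixels` with `tag` stored in the least significant
    bit of each of the first 14 values (assumed layout: MSB first)."""
    if not 0 <= tag < 2 ** N_PIXELS:
        raise ValueError("tag must fit in 14 bits")
    if len(pixels) < N_PIXELS:
        raise ValueError("image must have at least 14 pixels")
    out = list(pixels)
    for i in range(N_PIXELS):
        bit = (tag >> (N_PIXELS - 1 - i)) & 1
        out[i] = (out[i] & ~1) | bit  # clear the low bit, then set it
    return out


def read_lsb_tag(pixels):
    """Rebuild the 14-bit tag from the low bits of the first 14 values."""
    tag = 0
    for value in pixels[:N_PIXELS]:
        tag = (tag << 1) | (value & 1)
    return tag


# Hypothetical 8-bit grayscale pixel values, not data from the talk.
pixels = [(37 * i) % 256 for i in range(64)]
tagged = write_lsb_tag(pixels, 9001)

# A lossless round trip returns every byte unchanged, so the tag survives.
restored = list(zlib.decompress(zlib.compress(bytes(tagged))))
recovered = read_lsb_tag(restored)
```

A lossy codec would be free to alter those low-order bits, which is one reason to prefer a lossless format when pixel values carry information.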
root tip
curvature features are used
ground tissue from image background
methods work well
meristematic tissue
value (e.g. root tip angle) with a genetic element
at each time point for each genetic line
likelihoods of trait data
imputation (0.24 vs 63 CPU years per time point)
randomized data
locally
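The QTL notes above describe associating a trait value (e.g. root tip angle) with a genetic element at each time point, scored by likelihood. A minimal sketch of one common way to compute such a score is single-marker regression, where the LOD compares a per-genotype-mean model against a single-mean model under normally distributed residuals. This is a swapped-in textbook method, not necessarily the one used in the talk; the function name `lod_score` and the sample data are hypothetical.

```python
import math


def lod_score(genotypes, phenotypes):
    """LOD score for one marker by single-marker regression.

    LOD = (n / 2) * log10(RSS0 / RSS1), where RSS0 is the residual sum of
    squares around the grand mean and RSS1 is the residual sum of squares
    around the per-genotype means.
    """
    n = len(phenotypes)
    if n == 0 or n != len(genotypes):
        raise ValueError("need one genotype per phenotype")
    grand_mean = sum(phenotypes) / n
    rss0 = sum((y - grand_mean) ** 2 for y in phenotypes)

    # Group trait values by marker genotype.
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)

    rss1 = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        rss1 += sum((y - group_mean) ** 2 for y in ys)

    if rss1 == 0.0:
        raise ValueError("no within-genotype variance; LOD is undefined")
    return (n / 2) * math.log10(rss0 / rss1)


# Hypothetical tip angles (deg) for six lines at one time point.
marker = ["A", "A", "A", "B", "B", "B"]
tip_angle = [10.0, 11.0, 12.0, 20.0, 21.0, 22.0]
lod = lod_score(marker, tip_angle)
```

Repeating the calculation for every marker position and every time point would give a LOD surface over position (cM) and time. Rerunning it on trait values shuffled across lines is a standard way to set a significance threshold.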
Mike Carpenter (CIO), David Andersen, Dan Schneider
Students: Amy Craig and Brad Higgins (Physics), Autumn Longo and Grant Dewey (Biochemistry), Tracy Guy, Miles Mayer, Halie Smith, Anthony Bieck, Sarah Merithew, Devon Niewohner, Muijj Ghani, Julie Wurdeman (Biology)
Candace Moore, Logan Johnson
Brian Bockelman and Dr. David Swanson
PGRP - 1031416; EPSCoR - URE; NE-INBRE