Visualizing ENCODE Data in the UCSC Genome Browser
Pauline Fujita, Ph.D.
UCSC Genome Bioinformatics Group
Training Resources
genome@soe.ucsc.edu
Genomewiki: genomewiki.ucsc.edu
Mailing list archives: genome.ucsc.edu/FAQ/
Training page:
genome.ucsc.edu/cgi-bin/hgTracks?db=hg19
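A browser link like the one above can also be built programmatically. The sketch below is illustrative only: it assumes the `position` URL parameter for jumping to a region, uses the chr2 coordinates that appear on the screenshot axis, and `browser_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def browser_url(db, chrom, start, end):
    """Return an hgTracks URL for an assembly and a chrom:start-end region."""
    # Assumption: 'db' selects the assembly and 'position' selects the region.
    query = urlencode({"db": db, "position": f"{chrom}:{start}-{end}"})
    return f"{BASE}?{query}"


url = browser_url("hg19", "chr2", 191876000, 191881000)
print(url)
```

The colon in the region is percent-encoded by `urlencode`; pasting the printed URL into a web browser should open that region.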
Track Description Item Description
[Browser screenshot: hg19, chr2, STAT1 locus, axis labels 191,876,000 to 191,881,000, 2 kb scale bar. Tracks shown: Basic Gene Annotation Set from GENCODE Version 19; DNaseI Hypersensitivity by Digital DNaseI from ENCODE/University of Washington; GM12878 DNaseI HS Raw Signal Rep 2 from ENCODE/UW; K562 TFBS Uniform Peaks of Znf143_(16618-1-AP) from ENCODE/Stanford/Analysis; K562 Znf143 IgG-rab ChIP-seq Peaks from ENCODE/SYDH; K562 Znf143 IgG-rab ChIP-seq Signal from ENCODE/SYDH; K562 polyA+ IFNa30 RNA-seq Alignments from ENCODE/SYDH; Exome Aggregation Consortium (ExAC) - Variants from 60,706 Exomes]
Positional annotations (e.g., regions with enriched ChIP-seq signal for TF binding, differential methylation, splice junctions from RNA-seq)
Continuous signal data: number of reads (e.g., DNase I HS and ChIP-seq signals)
Alignments of sequence reads mapped to the genome (e.g., RNA-seq alignments)
Variation data: SNPs, indels, copy number variants, structural variants (e.g., ExAC data)
track name="BED_custom_track"
chr7 127471196 127472363 Gene1
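A minimal sketch of generating custom-track text like the example above. The helper name `make_bed_custom_track` is hypothetical; the only format facts relied on are that a BED custom track is a `track` line followed by one line per feature, and that BED starts are 0-based with exclusive ends.

```python
def make_bed_custom_track(name, features):
    """Build custom-track text: one 'track' line, then one BED line per feature.

    Each feature is (chrom, start, end, label). BED starts are 0-based and
    ends are exclusive, so feature length is end - start.
    """
    lines = [f'track name="{name}"']
    for chrom, start, end, label in features:
        if start < 0 or end <= start:
            raise ValueError(f"bad interval for {label}: {start}-{end}")
        # Fields are tab-separated here; whitespace-separated also works in BED.
        lines.append(f"{chrom}\t{start}\t{end}\t{label}")
    return "\n".join(lines) + "\n"


text = make_bed_custom_track(
    "BED_custom_track", [("chr7", 127471196, 127472363, "Gene1")]
)
print(text, end="")
```

The printed text can be pasted into the custom track submission box or saved to a file and uploaded.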
#ct_SYDHTFBS_4733.chrom ct_SYDHTFBS_4733.chromStart ct_SYDHTFBS_4733.chromEnd ct_SYDHTFBS_4733.name ct_SYDHTFBS_4733.score wgEncodeGencodeBasicV19.name wgEncodeGencodeBasicV19.name2
chr21  33031473  33032186  .  608  ENST00000449339.1  AP000253.1
chr21  33031473  33032186  .  608  ENST00000270142.6  SOD1
chr21  33031473  33032186  .  608  ENST00000389995.4  SOD1
chr21  33031473  33032186  .  608  ENST00000470944.1  SOD1
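Data Integrator output like the rows above repeats each peak once per overlapping transcript, so a common next step is collapsing it to one gene list per peak. The sketch below assumes the downloaded file is tab-separated with a `#`-prefixed header line; `genes_by_peak` is a hypothetical helper, and the sample rows are copied from the output above.

```python
import csv
import io
from collections import defaultdict

HEADER = [
    "#ct_SYDHTFBS_4733.chrom", "ct_SYDHTFBS_4733.chromStart",
    "ct_SYDHTFBS_4733.chromEnd", "ct_SYDHTFBS_4733.name",
    "ct_SYDHTFBS_4733.score", "wgEncodeGencodeBasicV19.name",
    "wgEncodeGencodeBasicV19.name2",
]
ROWS = [
    ["chr21", "33031473", "33032186", ".", "608", "ENST00000449339.1", "AP000253.1"],
    ["chr21", "33031473", "33032186", ".", "608", "ENST00000270142.6", "SOD1"],
    ["chr21", "33031473", "33032186", ".", "608", "ENST00000389995.4", "SOD1"],
    ["chr21", "33031473", "33032186", ".", "608", "ENST00000470944.1", "SOD1"],
]
# Assumption: the real output is tab-separated, one record per line.
SAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS) + "\n"


def genes_by_peak(text, gene_column="wgEncodeGencodeBasicV19.name2"):
    """Map each (chrom, start, end) peak to the sorted gene names overlapping it."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    header[0] = header[0].lstrip("#")  # drop the comment marker on the header
    gene_idx = header.index(gene_column)
    genes = defaultdict(set)
    for row in reader:
        if not row:
            continue  # skip blank lines
        peak = (row[0], int(row[1]), int(row[2]))
        genes[peak].add(row[gene_idx])
    return {peak: sorted(names) for peak, names in genes.items()}


result = genes_by_peak(SAMPLE)
print(result)
```

For the sample, the four transcript rows collapse to a single peak on chr21 associated with AP000253.1 and SOD1.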
myHub/ - directory containing track hub files
  hub.txt - a short description of hub properties
  genomes.txt - list of genome assemblies included
  hg19/ - directory of data for the hg19 human assembly
Data files: BAM, bigBed, bigWig, VCF
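The hub layout above can be sketched as a small script that writes the text files. This is a skeleton under stated assumptions, not a complete hub: the labels, the email address, the `exampleSignal` track, and the `hg19/trackDb.txt` file (not shown on the slide) are all placeholders, and `write_hub_skeleton` is a hypothetical helper.

```python
from pathlib import Path
import tempfile


def write_hub_skeleton(root, hub_name="myHub", email="you@example.org"):
    """Create hub.txt, genomes.txt and hg19/trackDb.txt under root/hub_name."""
    hub = Path(root) / hub_name
    (hub / "hg19").mkdir(parents=True, exist_ok=True)

    # hub.txt: short description of the hub and a pointer to genomes.txt
    (hub / "hub.txt").write_text(
        f"hub {hub_name}\n"
        "shortLabel My Hub\n"
        "longLabel Example track hub (placeholder text)\n"
        "genomesFile genomes.txt\n"
        f"email {email}\n"
    )
    # genomes.txt: one stanza per assembly, pointing at its track definitions
    (hub / "genomes.txt").write_text("genome hg19\ntrackDb hg19/trackDb.txt\n")
    # trackDb.txt: one stanza per track; the bigWig file itself is NOT created
    (hub / "hg19" / "trackDb.txt").write_text(
        "track exampleSignal\n"
        "bigDataUrl exampleSignal.bigWig\n"
        "shortLabel Example signal\n"
        "longLabel Example bigWig signal (placeholder)\n"
        "type bigWig\n"
    )
    return hub


with tempfile.TemporaryDirectory() as tmp:
    hub = write_hub_skeleton(tmp)
    created = sorted(p.relative_to(hub).as_posix() for p in hub.rglob("*"))
    genomes_text = (hub / "genomes.txt").read_text()
    print(created)
```

The data files (BAM, bigBed, bigWig, VCF) would then be placed in `hg19/` and the whole directory served from a web-accessible location.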
UCSC Genome Browser team
– David Haussler – co-PI
– Jim Kent – Browser Concept, BLAT, Team Leader, PI
– Bob Kuhn – Associate Director, Outreach – co-PI
– Donna Karolchik, Ann Zweig – Project Management
Engineering; QA, Docs, Support; Sys-admins:
Angie Hinrichs, Katrina Learned, Jorge Garcia, Pauline Fujita, Erich Weiler, Kate Rosenbloom, Hiram Clawson, Luvina Guruvadoo, Gary Moro, Steve Heitner, Galt Barber, Brian Raney, Brian Lee, Max Haeussler, Jonathan Caspar, Matt Speir
UC Santa Cruz Genomics Institute
National Human Genome Research Institute (NHGRI)
National Cancer Institute (NCI)
National Institute of Dental and Craniofacial Research (NIDCR)
National Institute of Child Health and Human Development (NICHD)
QB3 (UC Berkeley, UCSF, UCSC)
American Recovery and Reinvestment Act (ARRA) stimulus funds
must include the "http" part of this URL or you will get an error) and click [submit].
the display
at the data by pasting that same URL into a web browser:
existing tracks, see if you can find them, turn them on, and observe that the original tracks and custom tracks look the same in the browser:
Track (ENC TF Binding), Track (SYDH TFBS)
(1000G Ph1 Vars)
VCF track): chr21:33,034,804-33,037,719
add]
tracks to the list – ex:
Gene Prediction), track (GENCODE V19), view (Genes), subtrack (Basic) [add]
track (Common SNPs) [add] Choose which fields to include in your output: Output options -> Choose fields [Done] -> [get output]
gene track (Select Genes = “Basic Gene Annotation Set… GENCODE”)
Annotations” click the “+” button to choose which TFs to include (or select none to include all binding sites)