Developing and Using Special Purpose Hidden Markov Model Databases



slide-1
SLIDE 1

Martin Gollery Associate Director of Bioinformatics University of Nevada, Reno Mgollery@unr.edu

Developing and Using Special Purpose Hidden Markov Model Databases

slide-2
SLIDE 2

Today's Tutorial

  • Instructor: Martin Gollery
  • Associate Director of Bioinformatics, University of Nevada, Reno

  • Consultant to several organizations
  • Formerly with TimeLogic
  • Developed several HMM databases
slide-3
SLIDE 3

Hidden Markov Models

  • What HMMs are
  • Which HMM programs are commonly used
  • What HMM databases are available
  • Why you would use one DB over another
  • Integrated resources: InterPro and more
  • How you can build your own HMM DB
  • Problems with building your own
  • Live demonstration
slide-4
SLIDE 4

Hidden Markov Models: What are they, anyway?

  • Statistical description of a protein family's consensus sequence

  • Conserved regions receive highest scores
  • Can be seen as a Finite State Machine
slide-5
SLIDE 5

Representation of Family Members

  • yciH     KDGII
  • ZyciH    KDGVI
  • VCA0570  KDGDI
  • HI1225   KNGII
  • sll0546  KEDCV

Residue probabilities at each position (residues shown: V, N, K, I, G, E, D, C):

  Position   Residue probabilities
  1          K 1.0
  2          D 0.6, N 0.2, E 0.2
  3          G 0.8, D 0.2
  4          I 0.4, V 0.2, D 0.2, C 0.2
  5          I 0.8, V 0.2
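The per-position probabilities on this slide can be reproduced by counting residues in each column of the five family members. The following is a minimal Python sketch of that counting step only; the function name `column_frequencies` is illustrative and not part of any HMM package.

```python
from collections import Counter

# The five family members shown on the slide (already aligned, no gaps).
FAMILY = ["KDGII", "KDGVI", "KDGDI", "KNGII", "KEDCV"]


def column_frequencies(sequences):
    """Return one {residue: frequency} dict per alignment column."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    profile = []
    for column in zip(*sequences):
        counts = Counter(column)
        profile.append({res: n / len(column) for res, n in counts.items()})
    return profile


profile = column_frequencies(FAMILY)
for position, freqs in enumerate(profile, start=1):
    print(position, sorted(freqs.items()))
```

Each column's frequencies sum to 1, and any residue that never appears in a column is simply absent from that column's dictionary.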

slide-6
SLIDE 6

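Representation of gaps in Family Members

  • yciH     KDGII
  • ZyciH    KDGVI
  • VCA0570  KDGDI
  • HI1225   KNGII
  • sll0546  KED-V

Residue probabilities at each position (residues shown: V, N, K, I, G, E, D, C):

  Position   Residue probabilities
  1          K 1.0
  2          D 0.6, N 0.2, E 0.2
  3          G 0.8, D 0.2
  4          I 0.4, V 0.2, D 0.2 (the remaining 0.2 corresponds to the gap in sll0546)
  5          I 0.8, V 0.2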

slide-7
SLIDE 7

For Maximum Sensitivity

[Figure: the position-specific probability model for positions 1-5 over residues V, N, K, I, G, E, D, C, with the gap at position 4.]

No residue at any position should have a zero probability, even if it was not seen in the training data.
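One simple way to guarantee non-zero probabilities is to add a pseudocount to every residue before normalizing. The sketch below uses add-one (Laplace) smoothing over the 20 amino acids, applied to the five-member example family; this is an illustrative simplification, and the priors used by real HMM packages (for example Dirichlet mixtures) are more sophisticated. The function name `smoothed_column` is hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
FAMILY = ["KDGII", "KDGVI", "KDGDI", "KNGII", "KEDCV"]


def smoothed_column(column, pseudocount=1.0):
    """Residue probabilities for one column, with a flat pseudocount added."""
    counts = Counter(column)
    total = len(column) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts.get(aa, 0) + pseudocount) / total for aa in AMINO_ACIDS}


smoothed = [smoothed_column(col) for col in zip(*FAMILY)]

# Position 1 was 100% K in the training data; after smoothing K still
# dominates, but every other residue keeps a small non-zero probability.
print(smoothed[0]["K"], smoothed[0]["A"])
```

With five training sequences the pseudocounts carry a lot of weight; with larger alignments the observed counts dominate.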

slide-8
SLIDE 8

Start with an MSA…

CLUSTAL W (1.7) multiple sequence alignment

  yciH     KDGVIEIQGDKRDLLKSLLEAKGMKVKLAGG
  ZyciH    KDGVIEIQGDKRDLLKSLLEAKGMKVKLAGG
  VCA0570  KDGDIEIQGDVRDQLKTLLESKGHKVKLAGG
  HI1225   KNGIIEIQGEKRDLLKQLLEQKGFKVKLSGG
  sll0546  KEDCVEIQGDQREKILAYLLKQGYKAKISGG
  PA4840   KDGVVEIQGEHVELLIDELLKRGFKAKKSGG
  AF0914   KNGVIELQGNHVNRVKELLIKKGFNPERIKT
           *:. :*:**: : : * :* : :
slide-9
SLIDE 9

Hidden Markov Models

  HMMER2.0
  NAME  example2
  DESC  Small example for demonstration purposes
  LENG  31
  ALPH  Amino
  COM   hmmbuild example2 example2.aln
  NSEQ  7
  DATE  Wed Jan 08 13:33:06 2003
  HMM         A      C      D      E      F      G      H      I      K …
    1     -3217  -3413  -3082  -2664  -4291  -3257  -2104  -4231   3883…
    2     -1938  -3859   2747   1592  -4024  -1857  -1206  -3953  -1455…
    3     -2160  -3144   1834   -953  -4284   3247  -2013  -4362  -2365…
    4     -1255   2750    436  -2789  -1273  -2972  -2049   1510  -2543…
    5     -2035  -1558  -4660  -4320  -2085  -4409  -4229   3081  -4224…
    6     -3264  -3765  -1447   3822  -4535  -2948  -2636  -4814  -2810…
    7     -2423  -1951  -4843  -4395  -1156  -4544  -3680   3291  -4151…
    8     -3220  -3396  -2530  -2667  -3851  -3171  -2735  -4442  -2277…
    9     -3196  -3194  -3915  -4259  -4867   3789  -4005  -5414  -4591…
   10     -1923  -3837   2743   2134  -4005  -1854  -1196  -3929  -1434…
   11      -999  -2164   -952   -353  -2483  -1909   3321  -2139   1730…
   12     -1629  -1909  -2827  -2102  -2279  -2588  -1442  -1012   -488…
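The integers in the model file are easier to read once converted back to odds ratios. The sketch below assumes the HMMER 2 convention that each emission score is a log-odds value in base 2, scaled by 1000 and rounded to an integer, so that a score of 1000 means "twice as likely as under the null model". Treat that scaling as an assumption and check it against the HMMER documentation for your version; the function names are illustrative.

```python
import math

INTSCALE = 1000  # assumed scaling: scores stored in units of 1/1000 bit


def score_to_odds(score):
    """Convert an integer log-odds score back to an odds ratio p / null."""
    return 2.0 ** (score / INTSCALE)


def odds_to_score(odds):
    """Convert an odds ratio to a scaled integer score, rounded to nearest."""
    return int(math.floor(0.5 + INTSCALE * math.log2(odds)))


# Position 1 of example2: the K column versus the A column.
for residue, score in (("K", 3883), ("A", -3217)):
    print(residue, score, round(score_to_odds(score), 3))
```

Under this reading, the strongly positive K score at position 1 reflects the fully conserved lysine in the alignment, while the negative scores mark residues that are less likely than background.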
slide-10
SLIDE 10

Emission Probabilities

  • What is the likelihood that sequence X was emitted by HMM Y?
  • Likelihood is calculated by multiplying the probability of each residue at each position by each of the transition probabilities; with log-odds scores, as stored in the model file, this becomes a sum of the scores
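The relationship between multiplying probabilities and adding log-odds scores can be checked on a toy model. The sketch below is a deliberately simplified, ungapped profile built from the five-member example family: it assumes every transition is match-to-match with probability 1, a uniform background of 1/20, and add-one pseudocounts. All three are illustrative assumptions, and the function names are hypothetical; a real profile HMM also scores insert and delete states.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumed uniform null model
FAMILY = ["KDGII", "KDGVI", "KDGDI", "KNGII", "KEDCV"]


def build_profile(sequences, pseudocount=1.0):
    """Per-column residue probabilities with a flat pseudocount."""
    profile = []
    for column in zip(*sequences):
        counts = Counter(column)
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        profile.append(
            {aa: (counts.get(aa, 0) + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return profile


def likelihood(sequence, profile):
    """Product of the emission probabilities (transitions assumed to be 1)."""
    if len(sequence) != len(profile):
        raise ValueError("sequence length must match the model length")
    return math.prod(profile[i][res] for i, res in enumerate(sequence))


def log_odds_score(sequence, profile):
    """Sum of per-position log2 odds scores, in bits."""
    if len(sequence) != len(profile):
        raise ValueError("sequence length must match the model length")
    return sum(
        math.log2(profile[i][res] / BACKGROUND) for i, res in enumerate(sequence)
    )


profile = build_profile(FAMILY)
for seq in ("KDGII", "KEDCV", "AAAAA"):
    print(seq, round(log_odds_score(seq, profile), 2))
```

Working in log space avoids numerical underflow on long sequences, which is one reason model files store scores rather than raw probabilities.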

slide-11
SLIDE 11

Plan7 from Outer Space

(Well, from St. Louis, anyway!)

slide-12
SLIDE 12

HMMs vs BLAST

  • Position specific scoring vs. general matrix
  • Example:

– dDGVIvIddDKRDLLKSLiEAKkMKVKLAGG
– KDGVIEIQGDKRDLLKSLLEAKGMKVKLAGG
has 80% BLAST similarity, but misses highly conserved regions

  • Scoring emphasizes important locations
  • Clearer score cutoffs
  • However, it is MUCH slower!
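The example pair above can be examined directly. The sketch below computes plain percent identity (a simpler measure than BLAST's similarity score, used here only for illustration) and lists where the two sequences differ; lowercase letters in the first sequence mark the substituted residues. The function names are illustrative.

```python
MUTATED = "dDGVIvIddDKRDLLKSLiEAKkMKVKLAGG"
FAMILY_MEMBER = "KDGVIEIQGDKRDLLKSLLEAKGMKVKLAGG"


def percent_identity(a, b):
    """Percentage of aligned positions with the same residue (case-insensitive)."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    matches = sum(x.upper() == y.upper() for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def mismatch_positions(a, b):
    """1-based positions where the two sequences differ."""
    return [
        i for i, (x, y) in enumerate(zip(a, b), start=1) if x.upper() != y.upper()
    ]


print(round(percent_identity(MUTATED, FAMILY_MEMBER), 1))
print(mismatch_positions(MUTATED, FAMILY_MEMBER))
```

A general substitution matrix treats every position alike, so substitutions in the family's invariant columns cost no more than substitutions anywhere else. A position-specific model penalizes them heavily.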
slide-13
SLIDE 13

HMM programs

  • HMMer - Sean Eddy, Wash U
  • SAM - Haussler, UCSC
  • Wise tools - Birney, EBI
  • SledgeHMMer - Subramaniam, SDSC
  • Meta-MEME - Noble & Bailey
  • PSI-BLAST - NCBI
  • SPSpfam - Southwest Parallel Software
  • Ldhmmer - Logical Depth
  • DeCypherHMM - TimeLogic
slide-14
SLIDE 14

What exactly do you want?

  • Are you searching thousands of sequences with one or a few models?
  • Use hmmsearch
  • Searching a few sequences with thousands of models?
  • Use hmmpfam
  • Thousands of sequences vs. thousands of models?
  • Use an accelerator, if you do it very often
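The decision above can be written down as a small helper. This is a sketch only: the threshold for "a few" (`few=10`) is an arbitrary illustrative value, the file names in the usage line are hypothetical, and the command lists follow the HMMER 2 argument order of model file first, sequence file second. The commands are built but not executed.

```python
def choose_search(n_sequences, n_models, few=10):
    """Pick a search strategy from the rough sizes of the two inputs."""
    if n_models <= few:
        return "hmmsearch"    # one or a few models against many sequences
    if n_sequences <= few:
        return "hmmpfam"      # a few sequences against a model library
    return "accelerator"      # many against many: consider dedicated hardware


def build_command(program, hmm_file, seq_file):
    """Return the argument list for an HMMER 2 search (not executed here)."""
    if program not in ("hmmsearch", "hmmpfam"):
        raise ValueError(f"no command line for strategy {program!r}")
    return [program, hmm_file, seq_file]


strategy = choose_search(n_sequences=5, n_models=7868)
print(strategy, build_command(strategy, "Pfam_ls", "queries.fasta"))
```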
slide-15
SLIDE 15

HMM databases

  • PFAM
  • TIGRFAM
  • Superfamily
  • SMART
  • Panther
  • PRED-GPCR
slide-16
SLIDE 16

HMM databases at the CFB

  • COGfam
  • KinFam
  • HydroHMMer
  • NVfam-pro
  • NVfam-arc
  • NVfam-fun
  • NVfam-pln
slide-17
SLIDE 17

PFAM

  • From Sanger, WashU, KI, INRA
  • Version 17 has 7868 families
  • Most widely used HMM database
  • Good annotation team
slide-18
SLIDE 18

PFAM

  • PFAM-A is hand curated
  • From high quality multiple alignments
  • PFAM-B is built automatically from ProDom
  • Generated using the Domainer algorithm
  • ProDom is built from SP/TREMBL
slide-19
SLIDE 19

PFAM

  • Pfam-ls = global alignments
  • Pfam-fs = local alignments, so that matches may include only part of the model
  • Both the -ls and -fs versions are local with respect to the sequence

slide-20
SLIDE 20

PFAM

  • Note ‘type’ annotation
  • Labeled TP
  • Family
  • Domain
  • Repeat
  • Motif
slide-21
SLIDE 21

TIGRFAMs

  • Available at www.tigr.org/TIGRFAMs/
  • Organized by functional role
  • Equivalogs: a set of homologous proteins that are conserved with respect to function since their last common ancestor
  • Equivalog domains: domains of conserved function

slide-22
SLIDE 22

TIGRFAMs

  • 2453 models in release 4.1
  • Complementary to PFAM, so run both
  • Part of the Comprehensive Microbial Resource (CMR)

slide-23
SLIDE 23

TIGRFAMs

TIGRfam and PFAM alignments for Pyruvate carboxylase. The thin line represents the sequence. The bars represent hit regions.

slide-24
SLIDE 24

SuperFamily

  • By Julian Gough, formerly MRC, now Riken GSC
  • www.supfam.org
  • Provides structural (and hence implied functional) assignments to protein sequences at the superfamily level
  • Built from SCOP (Structural Classification of Proteins) database, which is built from PDB
  • Available in HMMer, SAM, and PSI-BLAST formats

slide-25
SLIDE 25

SuperFamily

  • 1447 SCOP Superfamilies
  • Each represented by a group of HMMs
  • Over 8500 models total
  • Table provides comparison to GO, InterPro, PFAM

slide-26
SLIDE 26

SMART

  • Simple Modular Architecture Research Tool
  • Version 3.4 contains 654 HMMs
  • Emphasis on mobile eukaryotic domains
  • smart.embl-heidelberg.de
  • Annotated with respect to phyletic distributions, functional class, tertiary structures and functionally important residues

slide-27
SLIDE 27

SMART

  • Use for signaling domains or extracellular domains

  • Normal and Genomic mode
slide-28
SLIDE 28

PRED-GPCR

  • Papasaikas et al., U of Athens
  • 265 HMMs in 67 GPCR families
  • Based on TiPs Pharmacological classification.
  • Filters with CAST
  • Signatures regularly updated
  • Entire system redone each year
slide-29
SLIDE 29

PRED-GPCR webserver

slide-30
SLIDE 30

Panther

  • Protein ANalysis THrough Evolutionary Relationships
  • Family and subfamily: families are evolutionarily related proteins; subfamilies are related proteins with the same function
  • Molecular function: the function of the protein by itself or with directly interacting proteins at a biochemical level, e.g. a protein kinase
  • Biological process: the function of the protein in the context of a larger network of proteins that interact to accomplish a process at the level of the cell or organism, e.g. mitosis.
  • Pathway: similar to biological process, but a pathway also explicitly specifies the relationships between the interacting molecules.

slide-31
SLIDE 31

Panther

  • (Thomas et al., Genome Research 2003; Mi et al. NAR 2005)
  • 6683 protein families
  • 31,705 functionally distinct protein subfamilies.

slide-32
SLIDE 32

Panther

  • Due to the size, searches could be slow
  • First, BLAST against consensus seqs
  • Then, search against models represented by those hits
  • With an accelerator, you don’t have to do that…
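The two-step strategy above is a form of target triage: a cheap pre-filter picks candidate families, and the expensive HMM search runs only against those. The sketch below shows the control flow only. Shared 3-mers stand in for the BLAST step, `full_search` is a placeholder for the real HMM search, and the family names and consensus sequences are made up for illustration.

```python
def kmers(sequence, k=3):
    """Set of all overlapping k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def triage(query, consensus_by_model, min_shared=2, k=3):
    """Cheap first pass: keep models whose consensus shares k-mers with the query."""
    query_kmers = kmers(query, k)
    return [
        name
        for name, consensus in consensus_by_model.items()
        if len(query_kmers & kmers(consensus, k)) >= min_shared
    ]


def two_stage_search(query, consensus_by_model, full_search):
    """Run the expensive search only against models that pass the first pass."""
    candidates = triage(query, consensus_by_model)
    return {name: full_search(query, name) for name in candidates}


# Hypothetical consensus sequences for two families.
CONSENSUS = {"famA": "KDGVIEIQGDKRD", "famB": "MSTNPKPQRKTKR"}
searched = []


def fake_full_search(query, model_name):
    """Placeholder for the real HMM search; records which models were searched."""
    searched.append(model_name)
    return len(query)


hits = two_stage_search("KDGVIEIQGEKRDLL", CONSENSUS, fake_full_search)
print(sorted(hits), searched)
```

The trade-off is sensitivity: any family the pre-filter misses is never seen by the HMM search, which is why the pre-filter threshold is usually kept permissive.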

slide-33
SLIDE 33

Panther

  • So, how does it perform?
  • I took 3451 Arabidopsis proteins with no hit to PFAM, Superfamily, SMART or TIGRfam

  • Ran it against Panther
  • Found 160 significant hits!
slide-34
SLIDE 34

COG-HMMs

  • Clusters of Orthologous Groups of proteins
  • www.ncbi.nlm.nih.gov/cog/
  • Each COG is from at least 3 lineages
  • Ancient conserved domain
  • 4873 alignments available
  • Alignments from NCBI, HMMs from me at mgollery@unr.edu

slide-35
SLIDE 35

CDD

  • Conserved Domain Database (NCBI)
  • PSI-BLAST profiles are similar to HMMs
  • 10991 PSSMs - SMART + COG + KOG + Pfam + CD

  • Runs with RPS-BLAST
  • Much faster searches
slide-36
SLIDE 36

KinFam

  • KinFam models represent 53 different classes of PKs

  • Assigns Kinase Class and Group
  • Based on Hanks’ classification scheme
  • Database is small, so searches are fast
slide-37
SLIDE 37

KinFam

  • Categorizes Kinase data
  • Available for download from bioinformatics.unr.edu

  RANK  SCORE   QF  TARGET|ACCESSION   E_VALUE   DESCRIPTION
  1     852.93  1   KinFam||ptkgrp15   9.3e-256  Fibroblast GF recept
  2     479.14  1   KinFam||ptkgrp14   3.1e-143  Platelet derived GF
  3     423.33  1   KinFam||ptkother   1.9e-126  Other membrane-span

slide-38
SLIDE 38

HydroHmmer

  • HydroHmmer finds LEAs, other hydrophilin classes
  • Small target size makes for very fast searches

slide-39
SLIDE 39

NVFAMs

  • HMMs reflect the training data
  • Specific training sets provide better results
  • So… use Archaeal data to study Archaeons, Fungal data to study Fungi, etc.
  • Designed for use with PFAM, not stand-alone
  • Recent redesign, name change
slide-40
SLIDE 40

NVFAMs

  • NVFAM-pro used to study E. faecalis
  • Demonstrated higher scores, better alignments
  • However, PFAM had more total hits
  • P. falciparum used as negative control
  • PFAM showed better scores and alignments, as predicted
  • Automated design by Garrett Taylor; scripts are available!
  • Contact me for input, collaboration, or help to build your own

slide-41
SLIDE 41

Which database to use?

One Comparison Test (Your results may vary…)

  • Compare 563 I. pini sequences to COGhmm, PFAM, PFAMfrag, SMART, TIGRfam, TIGRfamfrag, Superfamily
  • COGs: 9
  • PFAM: 22
  • PFAMfrag: 57
  • SMART: 4
  • Superfamily: 30
  • TIGRfam: 6
  • TIGRfamfrag: 12
slide-42
SLIDE 42

Integrated Resources

  • InterProscan
  • MAGPIE
  • PANAL
  • Make your own!
slide-43
SLIDE 43

InterPro

  • Database built from PFAM, Prints, Prosite, SuperFamily, ProDom, SMART, TIGRFAMs, PANTHER, PIRsf, Gene3D & SP/TrEMBL

  • Version 10.0
  • Nearly 12,000 entries
  • http://www.ebi.ac.uk/interpro/
  • InterProScan can be installed locally
slide-44
SLIDE 44

InterProScan

  • Splits up big jobs & reassembles them
  • Works with SGE, PBS, LSF
  • A free analysis pipeline!
  • Provides GO mappings
  • Written in PERL, so it’s easy to modify
  • Average 4 min. per NT sequence per CPU
slide-45
SLIDE 45

InterPro

InterPro release 10.0 contains 11972 entries, representing 3079 domains, 8597 families, 228 repeats, 27 active sites, 21 binding sites and 20 post-translational modification sites. Overall, there are 7521179 InterPro hits from 1466570 UniProt protein sequences. A complete list is available from the ftp site.

  DATABASE             VERSION  ENTRIES
  SWISS-PROT           46.5     180652
  PRINTS               37.0     1850
  TrEMBL               29.5     1689375
  Pfam                 17.0     7868
  PROSITE patterns     18.45    1800
  PROSITE preprofiles  N/A      120
  ProDom               2004.1   1522
  InterPro             10.0     11972
  SMART                4.0      663
  TIGRFAMs             4.1      2454
  PIRSF                2.52     962
  PANTHER              5.0      438
  SUPERFAMILY          1.65     1160
  Gene3D               3.0      117
  GO Classification    N/A      18705

slide-46
SLIDE 46
slide-47
SLIDE 47
slide-48
SLIDE 48
slide-49
SLIDE 49
slide-50
SLIDE 50

Modifying InterProScan

  • Two ways to add your own HMM database to InterProScan:
  • Modify PERL scripts
  • Concatenate your models onto PFAM
  • Similarly, if you are looking for a specific target, delete all the rest to speed up searches
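Both tricks (appending your own models to PFAM, and trimming a library down to the targets you care about) are simple text operations when the library is in HMMER 2 ASCII format. The sketch below assumes each model record ends with a line containing only `//` and carries a `NAME` line; it does not apply to binary libraries. The function names and the toy records are illustrative, and a real deployment would read and write files rather than strings.

```python
def split_models(text):
    """Split a flat-file HMM library into one string per model record.

    Assumes each record ends with a '//' line; trailing text without a
    terminator is ignored.
    """
    models, current = [], []
    for line in text.splitlines():
        current.append(line)
        if line.strip() == "//":
            models.append("\n".join(current) + "\n")
            current = []
    return models


def model_name(model_text):
    """Return the value of the NAME line of one model record, if present."""
    for line in model_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "NAME":
            return parts[1].strip()
    return None


def concatenate(*libraries):
    """Join several libraries into one, record by record."""
    return "".join(m for lib in libraries for m in split_models(lib))


def keep_only(library, wanted_names):
    """Drop every model whose NAME is not in wanted_names."""
    return "".join(
        m for m in split_models(library) if model_name(m) in wanted_names
    )


# Toy records standing in for real model files.
PFAM_LIKE = "HMMER2.0\nNAME  famA\n//\nHMMER2.0\nNAME  famB\n//\n"
MY_MODELS = "HMMER2.0\nNAME  myfam\n//\n"

combined = concatenate(PFAM_LIKE, MY_MODELS)
print([model_name(m) for m in split_models(combined)])
print(keep_only(combined, {"myfam"}) == MY_MODELS)
```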

slide-51
SLIDE 51

PANAL

  • Simultaneously searches several targets
  • Produces a nice graphical overview
  • Databases:

– PFAM
– SMART
– TIGRFAM
– Prosite
– PRINTS
– BLOCKS

slide-52
SLIDE 52

PANAL

slide-53
SLIDE 53

MAGPIE

  • BLOCKS
  • NCBI public non-redundant DNA and protein
  • NCBI EST databases
  • NCBI Conserved Domain Database (CDD)
  • Protein Identification Resource SuperFamilies
  • PFAM
  • ProDom
  • SCOP SuperFamilies
  • SMART
  • TIGRFam
  • ProSite
slide-54
SLIDE 54

MAGPIE

  • Gives a putative description of the gene
  • Database search result ranking based on user-defined tool precedence and score thresholds.
  • A single graphical summary of the various search results

  • Links to the database source entries
slide-55
SLIDE 55

MAGPIE

  • Gene taxonomic distribution information
  • Reporting of similar sequences in the dataset based on hits to similar database entries

  • Annotated metabolic pathway diagrams
  • Gene Ontology (GO) term assignments
slide-56
SLIDE 56

MAGPIE

Terry Gaasterland et al. Genome Res. 2000; 10: 502-510

slide-57
SLIDE 57

Building Your Own HMM Database

  • Why do it?
  • Greater Specificity
  • Represent your training set
  • Faster searches
  • Focus on the particular aspects that you want

slide-58
SLIDE 58

[Flowchart of the database-building pipeline. Inputs: Your Data, PFAM, Public DB. Search: HMMsearch or BLAST. Steps shown: Cluster Sequences; Discard Singletons; Build Multiple Sequence Alignments; Check Alignments; HMMbuild; HMMcalibrate; Add Description Line; Annotate.]

slide-59
SLIDE 59

First, search against a target…

slide-60
SLIDE 60

Select the hits for the model

slide-61
SLIDE 61

Build the Multiple Sequence Alignment

slide-62
SLIDE 62

Run HMMbuild to make the model

slide-63
SLIDE 63

Iterate Search to Add more distant Members
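The build steps on the preceding slides can be strung together as a list of command lines per family. The sketch below only constructs the commands; it does not run them. It assumes HMMER 2 argument order (`hmmbuild <hmm file> <alignment file>`, `hmmcalibrate <hmm file>`, `hmmsearch <hmm file> <sequence file>`), assumes the alignment for each cluster has already been produced by a separate tool, and uses hypothetical file names and cluster contents throughout.

```python
def drop_singletons(clusters):
    """Discard clusters with a single member; one sequence cannot be aligned."""
    return {name: members for name, members in clusters.items() if len(members) > 1}


def build_commands(family, alignment_file, target_db):
    """Commands to build, calibrate, and re-search with one model."""
    hmm_file = f"{family}.hmm"
    return [
        ["hmmbuild", hmm_file, alignment_file],   # alignment -> model
        ["hmmcalibrate", hmm_file],               # fit score statistics
        ["hmmsearch", hmm_file, target_db],       # look for more distant members
    ]


# Hypothetical clustering result: family name -> member sequence IDs.
CLUSTERS = {
    "kinase1": ["seq1", "seq7", "seq9"],
    "orphan": ["seq4"],
}

plan = {
    family: build_commands(family, f"{family}.aln", "target.fasta")
    for family in drop_singletons(CLUSTERS)
}
for family, commands in plan.items():
    for command in commands:
        print(" ".join(command))
```

New hits from the final search would be added to the alignment and the three steps repeated until no further members appear. Each iteration deserves a manual check of the alignment, since one false positive can pull in unrelated sequences on the next round.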

slide-64
SLIDE 64

Design Decisions:

  • Local or global models?
  • Which sequence weighting scheme?
  • What type of Prior?
slide-65
SLIDE 65

Calibration

  • Hmmcalibrate
  • Improves scoring
  • Compares to random data
  • Can be done on each model, or on the entire collection
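Calibration against random data amounts to fitting an extreme value (Gumbel) distribution to the scores a model gives to random sequences, then using that distribution to turn a raw score into a P-value and E-value. The sketch below illustrates the idea on simulated scores. It uses a method-of-moments fit for brevity, which is a swapped-in simplification and not the fitting procedure hmmcalibrate itself uses, and the parameter values are arbitrary.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def fit_gumbel(scores):
    """Method-of-moments estimate of Gumbel location (mu) and scale (lambda)."""
    lam = math.pi / (math.sqrt(6.0) * statistics.stdev(scores))
    mu = statistics.mean(scores) - EULER_GAMMA / lam
    return mu, lam


def p_value(score, mu, lam):
    """P(S >= score) for a random sequence under the fitted distribution."""
    return 1.0 - math.exp(-math.exp(-lam * (score - mu)))


def e_value(score, mu, lam, n_sequences):
    """Expected number of random hits at or above this score in a database."""
    return n_sequences * p_value(score, mu, lam)


# Simulated "random sequence" scores drawn from a known Gumbel distribution.
TRUE_MU, TRUE_LAMBDA = -20.0, 0.7
rng = random.Random(0)
random_scores = [
    TRUE_MU - math.log(rng.expovariate(1.0)) / TRUE_LAMBDA for _ in range(20000)
]

mu_hat, lambda_hat = fit_gumbel(random_scores)
print(round(mu_hat, 2), round(lambda_hat, 3))
print(e_value(0.0, mu_hat, lambda_hat, n_sequences=100000))
```

The cost of calibration comes from scoring the random sequences, not from the fit, which is why it is slow per model but needs no attention from the researcher.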

slide-66
SLIDE 66

Calibration

  • Very time consuming on CPU, not on researcher

  • No acceleration available
  • Not necessary with SAM
slide-67
SLIDE 67

Meme and Meta-Meme

  • Meme discovers motifs in a group of related DNA or protein sequences
  • Motifs contain no gaps; a gapped motif is split in two instead
slide-68
SLIDE 68

Meta-meme

  • Meta-meme takes meme motifs & related seqs as input
  • Combines motifs into HMMs
  • Regions between motifs are modeled imprecisely

  • Reduction in parameter space
  • Accurate models with fewer training seqs
slide-69
SLIDE 69

Meta-meme

  • mhmm: Build a motif-based HMM from Meme motifs.
  • mhmms: Search a sequence database using a motif-based HMM
  • mhmmscan: Like mhmms, but allows long seqs and multiple matches.

slide-70
SLIDE 70

Using RPS-BLAST

  • Start with PSI-BLAST using -C
  • Prepare files with makemat and copymat
  • Compile target
  • Annotate
  • Search with RPS-BLAST
slide-71
SLIDE 71

IMPALA

  • Also uses profiles database
  • Alignments generated by Smith-Waterman instead of word hit initiated
  • 10-100x slower, might be better than RPS-BLAST

slide-72
SLIDE 72

SPEED

  • PVM version of HMMer is available, MPI is on the way (?)
  • Other solutions: use PSSMs?
  • SPSpfam can speed searches 3-60X
  • SledgeHMMer claims 10X speedup
  • Accelerators
  • Target Triage
slide-73
SLIDE 73

SPSpfam

  • From Southwest Parallel Software
  • Optimized HMMer code
  • Up to 60X faster
  • Works well on cluster
  • Uses binary Pfam, so you can’t drop it into InterProScan

  • This may change soon
slide-74
SLIDE 74

HMM Accelerators

  • Can provide speedups of 100s to 1000s X
  • TimeLogic is the only commercial one left
  • HokieGene from Virginia Tech
  • StarBridge - No HMMs yet
  • Others coming soon
  • An open-source project is in the works: BioFPGA

slide-75
SLIDE 75

HMMs on the Web

  • SAM http://www.cse.ucsc.edu/research/compbio/
  • HMMer http://hmmer.wustl.edu/
  • Several other HMMer servers…
  • SledgeHMMer.sdsc.edu is the only unlimited webserver; most restrict you to one sequence at a time.

slide-76
SLIDE 76

Resources

  • Online Applications:
  • HMMer http://hmmer.wustl.edu/
  • SAM-T02 http://www.soe.ucsc.edu/research/compbio/HMM-apps/HMM-applications.html

  • Pfam http://pfam.wustl.edu/
  • SledgeHMMer sledgehmmer.sdsc.edu
  • Meta-MEME http://metameme.sdsc.edu/
  • PANAL http://web.ahc.umn.edu/panal/
slide-77
SLIDE 77

Resources

  • Commercial vendors of HMM systems
  • SPSpfam (www.spsoft.com)
  • Ldhmmer (www.logicaldepth.com)
  • DeCypherHMM (www.timelogic.com)
slide-78
SLIDE 78

References

  • S. Altschul, et al. Basic Local Alignment Search Tool. JMB, 215:403-410, 1990.
  • C. Barrett, et al. Scoring hidden Markov models. CABIOS, 13(2):191-199, 1997.
  • S. R. Eddy. Profile hidden Markov models. Bioinformatics, 14(9):755-63, 1998.
  • W. N. Grundy, et al. Meta-MEME: Motif-based hidden Markov models of protein families. CABIOS, 13(4):397-406, 1997.
  • M. Gribskov, et al. Profile analysis: Detection of distantly related proteins. PNAS, 84:4355-4358, July 1987.
  • S. Henikoff and J. G. Henikoff. Amino acid substitution matrices from protein blocks. PNAS, 89:10915-10919, November 1992.
  • S. Henikoff and J. G. Henikoff. Position-based sequence weights. JMB, 243(4):574-578, November 1994.
slide-79
SLIDE 79
  • Jeffrey D. et al. Kestrel: A programmable array for sequence analysis. In Application-Specific Array Processors, pages 25-34, Los Alamitos, CA, July 1996. IEEE Computer Society.
  • R. Hughey and A. Krogh. Hidden Markov models for sequence analysis: Extension and analysis of the basic method. CABIOS, 12(2):95-107, 1996.
  • T. Hubbard, et al. SCOP: a structural classification of proteins database. NAR, 25(1):236-9, January 1997.
  • L. Holm and C. Sander. Dali/FSSP classification of three-dimensional protein folds. NAR, 25:231-234, 1 Jan 1997.
  • K. Karplus, et al. Predicting protein structure using only sequence information. Proteins: Structure, Function, and Genetics
  • K. Karplus, et al. Hidden Markov models for detecting remote protein homologies. Bioinformatics, 14(10):846-856, 1998.

slide-80
SLIDE 80
  • A. Krogh, et al. Hidden Markov models in computational biology: Applications to protein modeling. JMB, 235:1501-1531, February 1994.
  • K. Karplus, et al. Predicting protein structure using hidden Markov models. Proteins: Structure, Function, and Genetics, Suppl. 1:134-139, 1997.
  • C. A. Orengo, et al. CATH: a hierarchic classification of protein domain structures. Structure, 5(8):1093-108, August 1997.
  • J. Park, et al. Sequence comparisons using multiple sequences detect twice as many remote homologues as pairwise methods. JMB, 284(4):1201-1210
  • E. L. L. Sonnhammer, et al. Pfam: A comprehensive database of protein families. Proteins, 28:405-420, 1997.
  • K. Sjolander, et al. Dirichlet mixtures: A method for improving detection of weak but significant protein sequence homology. CABIOS, 12(4):327-345, August 1996.
  • R. Schneider and C. Sander. The HSSP database of protein structure-sequence alignments. NAR, 24(1):201-205, 1 Jan 1996.
  • G. Chukkapalli, C. Guda, and S. Subramaniam. SledgeHMMER: A web server for batch searching Pfam database. Nucleic Acids Res., 32:W542-544
  • A. A. Schaffer, Y. I. Wolf, C. P. Ponting, E. V. Koonin, L. Aravind, and S. F. Altschul. IMPALA: Matching a Protein Sequence Against a Collection of PSI-BLAST-Constructed Position-Specific Score Matrices. Bioinformatics,
  • P. K. Papasaikas, P. G. Bagos, Z. I. Litou, V. J. Promponas, and S. J. Hamodrakas. PRED-GPCR: GPCR recognition and family classification server. Nucleic Acids Research, 32(Web Server issue):W380-W382, 2004; doi:10.1093/nar/gkh431
  • K. A. T. Silverstein, A. Kilian, J. L. Freeman, and E. F. Retzel. PANAL: an integrated resource for Protein sequence ANALysis. Bioinformatics, 16:1157-1158, 2000

slide-81
SLIDE 81

Thanks!

  • Garrett Taylor, Brian Beck, Taliah Mittler, Barrett Abel, John Cushman, Lee Weber
  • Contact me at mgollery@unr.edu
  • Bioinformatics.unr.edu