SLIDE 1

Motivation BioNav Framework Navigation & Cost Models Algorithms Experiments Future Work

BioNav: Effective Navigation on Query Results of Biomedical Databases

Abhijith Kashyap (1), Vagelis Hristidis (2), Michalis Petropoulos (1), Sotiria Tavoulari (3)

(1) Dept. of Computer Science and Engineering, University at Buffalo, SUNY

(2) School of Computing and Information Sciences, Florida International University

(3) Department of Pharmacology, Yale University

September 8, 2008

SLIDE 2

MOTIVATION

Exploratory queries are increasingly common in the life sciences

e.g., searching PubMed for citations on a given keyword

These queries return too many results, of which only a small fraction is relevant

The user ends up examining all or most of the result tuples to find the interesting ones

This can happen when the user is unsure about what is relevant

e.g., a user looking for articles on a broad topic such as 'cancer': the query returns over 2 million citations on PubMed

This phenomenon is commonly referred to as 'information overload'

SLIDE 6

COMMON APPROACHES TO AVOID INFORMATION OVERLOAD

Ranking

Categorization

SLIDE 8

CATEGORIZATION IN INFORMATION SYSTEMS

Assumptions:

Tuples in the database are annotated with one or more categories or concepts

The set of concepts is arranged in a concept hierarchy

Example: Each citation in PubMed is associated with several concepts (typically 12 to 20) from the MeSH (Medical Subject Headings) hierarchy

Users querying the database are familiar with the controlled vocabulary of the concept hierarchy

SLIDE 10

QUERY RESULT NAVIGATION: NAIVE APPROACH

GoPubMed

Create the Navigation Tree as follows:

Extract the set S of concepts annotating tuples in the query result set Q

Construct the minimal sub-concept hierarchy tree T that covers all concepts in S
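The two construction steps can be sketched in Python. This is a minimal illustration, not the GoPubMed implementation: the `parent` map, the citation IDs, and the function name `covering_subtree` are hypothetical, and the sketch assumes the covering tree is rooted at the hierarchy root.

```python
def covering_subtree(parent, annotated):
    """Return the hierarchy nodes needed to cover `annotated`.

    `parent` maps each concept to its parent concept (the root is absent).
    Assumption: the covering tree keeps every ancestor up to the root.
    """
    keep = set()
    for concept in annotated:
        node = concept
        # Walk upward until we reach the root or an already-kept node.
        while node is not None and node not in keep:
            keep.add(node)
            node = parent.get(node)
    return keep


# Hypothetical fragment of a concept hierarchy and a two-citation result.
parent = {
    "Cell Physiology": "Biological Phenomena",
    "Cell Death": "Cell Physiology",
    "Apoptosis": "Cell Death",
    "Necrosis": "Cell Death",
    "Cell Growth Processes": "Cell Physiology",
}
results = {"cit-1": {"Apoptosis"}, "cit-2": {"Apoptosis", "Cell Physiology"}}

# Step 1: the set S of concepts annotating the result tuples.
S = set().union(*results.values())
# Step 2: the sub-hierarchy T that covers S.
T = covering_subtree(parent, S)
```

Concepts that annotate no result and have no annotated descendant (here 'Necrosis' and 'Cell Growth Processes') never enter T.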

SLIDE 11

QUERY RESULT NAVIGATION: NAIVE APPROACH

GoPubMed

Example: Section of Navigation Tree for query ’Prothymosin’ (313 results)

SLIDE 12

QUERY RESULT NAVIGATION: NAIVE APPROACH

GoPubMed

Problems:

Massive size of the Navigation Tree

MeSH has over 48,000 concept nodes
The 313 results span over 3,000 of these concepts

Large number of duplicate tuples

Each tuple is annotated with 12 to 20 MeSH concepts
The total tuple count is over 5,000

The effort required to navigate the query results increases!

SLIDE 15

QUERY RESULT NAVIGATION: DYNAMIC APPROACH

BioNav

Example: Navigation steps for query 'Prothymosin'

Only a selective set of descendants is shown

SLIDE 16

QUERY RESULT NAVIGATION: DYNAMIC APPROACH

BioNav

Example: Navigation steps for query 'Prothymosin'

An expand action (>>>) on the root reveals the next relevant set of descendants

SLIDE 17

QUERY RESULT NAVIGATION: DYNAMIC APPROACH

BioNav

Example: Navigation steps for query 'Prothymosin'

The user can choose to expand an internal node to see nodes from the subtree rooted at that node

SLIDE 18

QUERY RESULT NAVIGATION: DYNAMIC APPROACH

BioNav

BioNav Idea:

At each navigation step, for a given node, instead of showing all children, reveal a selective set of descendants

The descendants are chosen, using a formal cost model, so that the overall navigation cost is minimized

SLIDE 19

CONTRIBUTIONS

A comprehensive framework for navigating large query results using extensive concept hierarchies

A formal cost model for measuring the navigation cost incurred by the user

Algorithms and heuristics for minimizing the expected navigation cost

Experimental evaluation and system demo: http://db.cse.buffalo.edu/bionav/

SLIDE 20

FRAMEWORK

DEFINITIONS

1. A Concept Hierarchy H(V, E, r) is a labeled tree consisting of:

A set V of concept nodes
A set E of parent/child edges
A root r

According to the semantics of the MeSH concept hierarchy, a child is more specific than its parent

2. A Navigation Tree T(V, E, r) is created in response to a user query by attaching to each node of the (MeSH) concept hierarchy a list of its associated citations, and then removing all nodes with no attached citations (while preserving the parent/child relationships)
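A small Python sketch of the pruning step in definition 2 follows. It is an illustration under stated assumptions rather than BioNav's code: the function name, the dictionary layout, and the citation IDs are hypothetical. It also reads "preserving parent/child relationship" as re-attaching a retained node to its nearest retained ancestor, and always keeps the root.

```python
def navigation_tree(children, root, citations):
    """Build the navigation tree as a {node: [retained children]} map.

    `children` is the concept hierarchy; `citations` maps a concept to the
    list of citations attached to it for the current query.
    """
    nav = {root: []}

    def visit(node, anchor):
        # `anchor` is the nearest retained ancestor of `node`'s children.
        for child in children.get(node, []):
            if citations.get(child):
                nav[anchor].append(child)
                nav[child] = []
                visit(child, child)
            else:
                # Drop the empty node but keep looking below it.
                visit(child, anchor)

    visit(root, root)
    return nav


# Hypothetical hierarchy fragment and citation lists.
hierarchy = {
    "MeSH": ["Biological Phenomena"],
    "Biological Phenomena": ["Cell Physiology"],
    "Cell Physiology": ["Cell Death", "Cell Growth Processes"],
    "Cell Death": ["Apoptosis"],
}
attached = {"Cell Physiology": ["cit-1"], "Apoptosis": ["cit-1", "cit-2"]}
nav = navigation_tree(hierarchy, "MeSH", attached)
```

In this example 'Cell Death' has no citations, so 'Apoptosis' becomes a direct child of 'Cell Physiology' in the navigation tree.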

SLIDE 21

FRAMEWORK

DEFINITIONS

The user navigates the Navigation Tree by a series of 'expand' actions on concept nodes

Each expand action generates an EdgeCut on the residual navigation tree rooted at the given node

SLIDE 22

FRAMEWORK

EXAMPLE: (NAVIGATION TREE, EDGECUT AND COMPONENT SUBTREES)

[Figure: section of the navigation tree under MESH, with nodes Biological Phenomena…, Cell Physiology, Cell Death, Apoptosis, Autophagy, Necrosis, Cell Growth Processes, Cell Proliferation and Cell Division]

A valid EdgeCut divides the tree into a number of Component Subtrees

SLIDE 23

FRAMEWORK

DEFINITIONS (CONTD):

Not all EdgeCuts are valid; the 'expand' action generates only valid EdgeCuts

3. Valid EdgeCut: An EdgeCut C is valid if no two edges in C appear on the same path from the root to a leaf node
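The validity condition can be checked directly. The following Python sketch assumes a tree stored as a hypothetical `parent` map and an EdgeCut given as a set of (parent, child) pairs. Two cut edges lie on one root-to-leaf path exactly when the child end of one is a proper ancestor of the child end of the other.

```python
def is_valid_edge_cut(parent, cut):
    """Return True if no root-to-leaf path contains two edges of `cut`.

    `parent` maps each non-root node to its parent; `cut` is a set of
    (parent, child) edges.
    """
    cut_children = set()
    for p, c in cut:
        if parent.get(c) != p:
            raise ValueError(f"({p!r}, {c!r}) is not an edge of the tree")
        cut_children.add(c)

    for child in cut_children:
        node = parent.get(child)  # start just above the cut edge
        while node is not None:
            if node in cut_children:
                return False  # another cut edge sits higher on this path
            node = parent.get(node)
    return True


# Hypothetical tree fragment used only for illustration.
parent = {
    "Cell Physiology": "Biological Phenomena",
    "Cell Death": "Cell Physiology",
    "Apoptosis": "Cell Death",
    "Cell Growth Processes": "Cell Physiology",
    "Cell Proliferation": "Cell Growth Processes",
}
side_by_side = {("Cell Physiology", "Cell Death"),
                ("Cell Growth Processes", "Cell Proliferation")}
stacked = {("Cell Physiology", "Cell Death"), ("Cell Death", "Apoptosis")}
```

Here `side_by_side` cuts two different branches and is valid, while `stacked` cuts the same branch twice and is not.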

SLIDE 24

FRAMEWORK

DEFINITIONS (CONTD):

4. An Active Tree TA(V, E, r) is a Navigation Tree where each node n ∈ V is annotated with the node set I(n), consisting of the nodes in the component subtree rooted at n

[Figure: the same section of the tree (MESH, Biological Phenomena…, Cell Physiology, Cell Death, Cell Growth Processes, Cell Proliferation) shown twice, with the node sets below]

First tree:

I(“Biological Phenomena…”) = {“Biological Phenomena…”, “Cell Physiology”, “Cell Death”, “Autophagy”, “Apoptosis”, “Necrosis”, “Cell Growth Processes”, “Cell Proliferation”, “Cell Division”, …}

Second tree:

I(“Biological Phenomena…”) = {“Biological Phenomena…”, “Cell Physiology”, “Cell Growth Processes”, …}

I(“Cell Death”) = {“Cell Death”, “Autophagy”, “Apoptosis”, “Necrosis”}

I(“Cell Proliferation”) = {“Cell Proliferation”, “Cell Division”}

Visualization of the Active Tree as presented to the user
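The node sets I(n) can be computed from a tree and an EdgeCut with a single traversal. The Python sketch below is illustrative only: the function name and the `children` map are hypothetical, and the parent/child structure of the example fragment is an assumption chosen to be consistent with the node sets listed on the slide.

```python
def component_node_sets(children, root, cut_children):
    """Map each component root n to I(n), the nodes of its component subtree.

    `cut_children` holds the child end of every edge in the EdgeCut; each
    such node becomes the root of its own component.
    """
    node_sets = {}
    stack = [(root, root)]  # (node, root of the component it belongs to)
    while stack:
        node, comp = stack.pop()
        node_sets.setdefault(comp, set()).add(node)
        for child in children.get(node, []):
            stack.append((child, child if child in cut_children else comp))
    return node_sets


# Assumed structure for the tree fragment shown on the slide.
children = {
    "Biological Phenomena": ["Cell Physiology"],
    "Cell Physiology": ["Cell Death", "Cell Growth Processes"],
    "Cell Death": ["Apoptosis", "Autophagy", "Necrosis"],
    "Cell Growth Processes": ["Cell Proliferation"],
    "Cell Proliferation": ["Cell Division"],
}
before = component_node_sets(children, "Biological Phenomena", set())
after = component_node_sets(children, "Biological Phenomena",
                            {"Cell Death", "Cell Proliferation"})
```

With an empty cut the root owns every node; after cutting above 'Cell Death' and 'Cell Proliferation' the root keeps only the upper component.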

SLIDE 25

NAVIGATION MODEL

TOP-DOWN

The user explores the Active Tree until she finds every relevant tuple in the query result

In response to a query, BioNav presents the initial active tree to the user

The user navigates the tree by performing one of the following actions:

EXPAND
SHOWRESULTS
IGNORE

SLIDE 26

NAVIGATION MODEL

Model of exploration of a node n in the TOP-DOWN scenario:

Algorithm 1: EXPLORE(n)

if n is not a leaf node then
    CHOOSE one of the following:
        a) SHOWRESULTS(n)
        b) IGNORE(n)
        c) S ← EXPAND(n)
           for each ni ∈ S do
               EXPLORE(ni)
           end for
else
    CHOOSE one of the following:
        a) Examine all tuples of n
        b) IGNORE(n)
end if
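The exploration model can be exercised with a short Python simulation. This is a simplified sketch, not the BioNav navigation subsystem: the `choose` callback stands in for the user, and EXPAND reveals a node's direct children instead of the component subtrees of an EdgeCut. All names are hypothetical.

```python
def explore(node, children, choose, log):
    """Simulate the TOP-DOWN exploration of `node` and record each action.

    `choose(node, options)` plays the role of the user and must return one
    of the offered options.
    """
    kids = children.get(node, [])
    # A leaf offers no EXPAND: its tuples are either examined or ignored.
    options = ("SHOWRESULTS", "IGNORE", "EXPAND") if kids else ("SHOWRESULTS", "IGNORE")
    action = choose(node, options)
    if action not in options:
        raise ValueError(f"{action!r} is not available at {node!r}")
    log.append((node, action))
    if action == "EXPAND":
        for child in kids:
            explore(child, children, choose, log)
    return log


# Hypothetical tree and a fixed user policy.
tree = {"root": ["a", "b"], "a": ["a1"]}
policy = {"root": "EXPAND", "a": "SHOWRESULTS"}
trace = explore("root", tree, lambda n, opts: policy.get(n, "IGNORE"), [])
```

The recorded trace lists the nodes in the order the user meets them, which is the sequence a cost model would charge for.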

SLIDE 27

COST MODEL

Define cost as the total number of items, both tuples and concept labels, examined by the user

Minimizing the cost also minimizes the information overload a user encounters

The choices a given user makes for a given query are not known a priori, but the structure of the active tree and the distribution of results on the tree are known

Use this knowledge to estimate the cost for the average case

SLIDE 32

COST MODEL

INTUITION

Aim: Minimize the overall navigation cost

There is a trade-off between the number of navigation actions (expand actions and viewing labels) and viewing results

Factors affecting cost:

Showing a large number of results up front increases the cost
A large number of navigation actions also increases the cost
The active tree has a large number of duplicates, which add to the cost

Assumption: Tuples in the query results are not ranked; every tuple is assumed to have equal relevance

SLIDE 33

COST MODEL

PROBABILITIES

The user's choices in the navigation model are non-deterministic and not equally likely

However, a cost estimate is needed (to compute the optimal navigation path) even before the user starts navigating

SLIDE 35

COST MODEL

PROBABILITIES

To estimate the cost, we introduce two probabilities to capture the user's intentions:

EXPLORE probability PE(T): the probability that the user is interested in the component subtree T and hence will explore it

1 − PE(T) is the probability that the user would ignore T

EXPAND probability PC(T): the probability that the user executes an EXPAND action on the component subtree T

1 − PC(T) is the probability that the user would choose to see all the tuples of T

SLIDE 38

COST MODEL

COST FORMULA

If the user explores a concept node n, she has two choices:

SHOWRESULTS(n): with cost (1 − PC(n)) × numRes(n)

EXPAND(n): the cost combines:

The expand action itself, with cost 1
Viewing the revealed labels, with cost |S|
EXPLOREing the component subtrees, with cost Σ_{s∈S} cost(s)

Total cost of exploring a node n is:

$$\mathrm{cost}_{\mathrm{EXPLORE}}(n) = (1 - P_C(n)) \times \mathrm{numRes}(n) + P_C(n) \times \Big(1 + |S| + \sum_{s \in S} \mathrm{cost}(s)\Big)$$

numRes(n) is the number of distinct tuples in the component subtree rooted at n

S is the set of component subtrees generated by an EdgeCut
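The exploration cost is a weighted sum of the SHOWRESULTS and EXPAND branches, which a few lines of Python make concrete. The function name and the numbers in the example are hypothetical; the costs of the component subtrees are passed in as already-computed values.

```python
def explore_cost(p_expand, num_res, component_costs):
    """Expected cost of exploring a node n.

    p_expand        -- PC(n), probability of choosing EXPAND
    num_res         -- numRes(n), distinct tuples under n
    component_costs -- cost(s) for each component subtree s in S
    """
    show_results = (1.0 - p_expand) * num_res
    # 1 for the expand action, |S| for the revealed labels, plus the
    # cost of exploring every component subtree.
    expand = p_expand * (1 + len(component_costs) + sum(component_costs))
    return show_results + expand


# Illustrative values: 40 results, two component subtrees costing 5 and 7.
always_show = explore_cost(0.0, 40, [5, 7])
always_expand = explore_cost(1.0, 40, [5, 7])
undecided = explore_cost(0.5, 40, [5, 7])
```

With these numbers, always showing results costs 40 items, always expanding costs 1 + 2 + 12 = 15, and an evenly split user lands halfway between the two.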

SLIDE 39

COST MODEL

COST FORMULA (CONTD)

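For a given node n, a user can either EXPLORE or IGNORE the node

Ignored nodes do not add to the cost, that is, costIGNORE(n) = 0

$$\begin{aligned}
\mathrm{cost}_{\mathrm{TOTAL}} &= (1 - P_E(n)) \times \mathrm{cost}_{\mathrm{IGNORE}}(n) + P_E(n) \times \mathrm{cost}_{\mathrm{EXPLORE}}(n) \\
&= P_E(n) \times \Big[(1 - P_C(n)) \times \mathrm{numRes}(n) + P_C(n) \times \Big(1 + |S| + \sum_{s \in S} \mathrm{cost}(s)\Big)\Big]
\end{aligned}$$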
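Applied recursively, the total cost formula gives an expected navigation cost for a whole tree once the EdgeCut at every node is fixed. The Python sketch below makes two assumptions that the slide does not state: cost(s) inside the sum is taken to be the total cost of s, and a component with no parts cannot be expanded, so its PC is treated as 0. The `Component` class and all numbers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    num_res: int          # numRes(n): distinct tuples in the component
    p_explore: float      # PE(n)
    p_expand: float       # PC(n)
    parts: List["Component"] = field(default_factory=list)  # S after the cut


def total_cost(c: Component) -> float:
    """Expected cost of a component: PE(n) * costEXPLORE(n)."""
    # Assumption: nothing to expand means the user can only show results.
    p_expand = c.p_expand if c.parts else 0.0
    expand = 1 + len(c.parts) + sum(total_cost(s) for s in c.parts)
    explore = (1.0 - p_expand) * c.num_res + p_expand * expand
    return c.p_explore * explore


# Illustrative tree: a root split into two leaf components.
a = Component(num_res=10, p_explore=0.5, p_expand=0.0)
b = Component(num_res=20, p_explore=0.25, p_expand=0.0)
root = Component(num_res=30, p_explore=1.0, p_expand=0.5, parts=[a, b])
```

Each leaf contributes 5 expected items here, so expanding the root costs 1 + 2 + 10 = 13, and the root's total is 0.5 × 30 + 0.5 × 13 = 21.5.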

SLIDE 40

COST MODEL

ESTIMATING EXPLORE PROBABILITY PE

A concept node has a higher EXPLORE probability PE if it has a large number of tuples attached to it

unless the concept is too generic and non-discriminatory, appearing in a large number of tuples in the database, e.g., 'cancer' or 'water'

Therefore, PE is:

proportional to the number of tuples attached to the node for the given query
inversely proportional to the total number of tuples associated with the concept in the database

SLIDE 41

COST MODEL

ESTIMATING EXPLORE PROBABILITY PE

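$$P_E(n) \propto \frac{\mathrm{numRes}_{\mathrm{query}}(n)}{\mathrm{numRes}_{\mathrm{db}}(n)}$$

Normalized over all nodes in the active tree:

$$P_E(n) = \frac{\mathrm{numRes}_{\mathrm{query}}(n) \,/\, \mathrm{numRes}_{\mathrm{db}}(n)}{\sum_{n_i \in N_{\mathrm{tree}}} \mathrm{numRes}_{\mathrm{query}}(n_i) \,/\, \mathrm{numRes}_{\mathrm{db}}(n_i)}$$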
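The normalization is a ratio divided by the sum of all ratios. A minimal Python sketch, with a hypothetical function name and made-up counts for two concepts:

```python
def explore_probabilities(query_counts, db_counts):
    """Normalized EXPLORE probability for every node of the active tree.

    query_counts[n] -- tuples attached to n for the current query
    db_counts[n]    -- tuples associated with n in the whole database
    """
    ratios = {n: query_counts[n] / db_counts[n] for n in query_counts}
    total = sum(ratios.values())
    return {n: r / total for n, r in ratios.items()}


# Illustrative counts: both concepts match 15 query results, but the
# 'generic' concept is ten times more common in the database.
p = explore_probabilities(
    {"specific": 15, "generic": 15},
    {"specific": 30, "generic": 300},
)
```

Both concepts carry the same number of query results, yet the rarer concept receives ten times the probability of the generic one, which is the discriminating behaviour the slide describes.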

SLIDE 42

COST MODEL

ESTIMATING EXPAND PROBABILITY PC

Intuition:

Expanding a component subtree with a 'large' number of tuples decreases the cost, whereas for subtrees with a 'small' number of nodes, expanding increases the cost

For subtrees with a moderate number of tuples:

If the results are widely distributed, expanding may reduce the cost by narrowing down the nodes in the subtree

SLIDE 43

COST MODEL

ESTIMATING EXPAND PROBABILITY PC

If numRes(Tn) > thres_upper, we set PC to 1, that is, always favor EXPAND

If numRes(Tn) < thres_lower, we set PC to 0, that is, always favor SHOWRESULTS

For the remaining cases, we use the (normalized) entropy(Tn) to estimate PC:

$$\mathrm{entropy}(T_n) = \frac{\sum_{n \in T_n} \frac{\mathrm{numRes}(n)}{\mathrm{numRes}(T_n)} \log \frac{\mathrm{numRes}(n)}{\mathrm{numRes}(T_n)}}{\log \frac{1}{\mathrm{numRes}(T_n)}}$$

numRes(n) is the number of results in node n

numRes(Tn) is the number of distinct results in the subtree Tn
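The three cases can be sketched as one Python function. The threshold values are not given on the slide, so the ones used below are hypothetical. The entropy is read here as the sum of p·log p divided by log(1/numRes(Tn)), which is non-negative; clamping the result to [0, 1] is an extra assumption, added because duplicates can push the ratio above 1.

```python
import math


def expand_probability(node_counts, distinct_total, thres_lower, thres_upper):
    """Estimate PC for a component subtree.

    node_counts    -- numRes(n) for every node n in the subtree
    distinct_total -- numRes(Tn), distinct results in the subtree
    """
    if distinct_total > thres_upper:
        return 1.0  # large subtree: always favor EXPAND
    if distinct_total < thres_lower:
        return 0.0  # small subtree: always favor SHOWRESULTS
    if distinct_total <= 1:
        return 0.0  # guard: log(1/1) is zero
    weighted_logs = sum(
        (c / distinct_total) * math.log(c / distinct_total)
        for c in node_counts
        if c > 0
    )
    entropy = weighted_logs / math.log(1.0 / distinct_total)
    return min(1.0, max(0.0, entropy))  # clamp is an added assumption


# Hypothetical thresholds of 10 and 50 results.
large = expand_probability([30, 40], 60, 10, 50)
small = expand_probability([3, 2], 5, 10, 50)
spread = expand_probability([1] * 20, 20, 10, 50)
concentrated = expand_probability([20], 20, 10, 50)
```

Twenty results spread one per node give the maximum estimate, while twenty results sitting on a single node give zero, matching the intuition that expanding helps only when results are widely distributed.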

SLIDE 44

ALGORITHMS FOR EDGECUT

NAIVE

Enumerate all possible sequences of EdgeCuts over the initial active tree

Compute the cost as given by the cost formula, and take the minimum

Complexity: O(22|E|)

SLIDE 45

ALGORITHMS FOR EDGECUT

NAIVE

[Figure: section of the Active Tree under MESH, with nodes Biological Phenomena…, Cell Physiology, Cell Death, Apoptosis, Autophagy, Necrosis, Cell Growth Processes, Cell Proliferation and Cell Division]

Example: Section of the Active Tree with two subsequent cuts and the corresponding component subtrees

SLIDE 46

ALGORITHMS FOR EDGECUT

OPTIMAL

Enumerate only valid EdgeCuts

Use dynamic programming to reduce the computation cost

Complexity: O(|V| × 2^|E|)

Still too slow to be used as a real-time algorithm!
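Valid EdgeCuts can be generated without ever producing an invalid one: on each child edge, either cut that edge or leave it and pick a valid cut inside the child's subtree. The Python sketch below shows only this enumeration step on a tiny hypothetical tree. It is not the dynamic programming algorithm of the talk, and the function name is an assumption.

```python
from itertools import product


def valid_edge_cuts(children, root):
    """List every valid EdgeCut of the tree under `root`.

    Each cut is a frozenset of (parent, child) edges; the empty cut is
    included as the first building block of the recursion.
    """
    options_per_child = []
    for child in children.get(root, []):
        # Either cut the edge above `child` ...
        options = [frozenset({(root, child)})]
        # ... or keep it and use any valid cut inside the child's subtree.
        options.extend(valid_edge_cuts(children, child))
        options_per_child.append(options)
    # Branches are independent, so combine one option per child.
    return [frozenset().union(*combo) for combo in product(*options_per_child)]


# Hypothetical tree: A has children B and C, and B has child D.
tree = {"A": ["B", "C"], "B": ["D"]}
cuts = valid_edge_cuts(tree, "A")
```

This tree has 3 edges and therefore 8 edge subsets, of which 6 are valid; the two subsets that contain both (A, B) and (B, D) are never generated.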

SLIDE 48

ALGORITHMS FOR EDGECUT

OPTIMAL

[Figure: section of the Active Tree under MESH, with nodes Biological Phenomena…, Cell Physiology, Cell Death, Apoptosis, Autophagy, Necrosis, Cell Growth Processes, Cell Proliferation and Cell Division; two cuts are marked 1 and 2]

Example: Section of the Active Tree with two possible cuts

SLIDE 49

ALGORITHMS FOR EDGECUT

HEURISTIC

Idea: Reduce the size of the active tree and run the optimal algorithm

Ensure: The reduced tree 'approximates' the active tree as closely as possible

Method:

We use the equi-partitioning algorithm proposed by [Misra77] to partition the active tree

Partitions are created such that each partitioned subtree has approximately the same SHOWRESULTS cost, that is, the same number of results

A representative node is created for each partition and added to the reduced tree while maintaining the parent/child relationships

[Misra77] J. Misra and M. Kundu: A linear tree partitioning algorithm, SIAM 77
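The partitioning idea can be illustrated with a bottom-up greedy pass in Python. This is a simplified sketch in the spirit of weight-bounded tree partitioning, not a verified reproduction of the [Misra77] algorithm or of the Heuristic-ReducedOpt code. It assumes additive node weights (duplicate citations are ignored), that no single node exceeds the limit, and the function name and example weights are hypothetical.

```python
def partition_tree(children, weight, root, limit):
    """Split a tree into connected partitions of total weight <= limit.

    Returns the list of partition roots. Each partition root would become
    one representative node of the reduced tree.
    """
    roots = [root]

    def residual(node):
        # Weight still attached to `node` after cutting inside its subtree.
        parts = [(residual(c), c) for c in children.get(node, [])]
        total = weight[node] + sum(w for w, _ in parts)
        parts.sort()  # lightest first, so pop() removes the heaviest child
        while total > limit and parts:
            w, child = parts.pop()
            total -= w
            roots.append(child)  # `child` starts a partition of its own
        return total

    residual(root)
    return roots


# Hypothetical tree with per-node result counts and a limit of 6.
tree = {"r": ["a", "b"], "a": ["a1", "a2"]}
counts = {"r": 1, "a": 2, "a1": 4, "a2": 3, "b": 5}
partition_roots = partition_tree(tree, counts, "r", 6)
```

In this example the partitions are {r, a, a2} with weight 6, {a1} with weight 4, and {b} with weight 5, so the reduced tree would have three representative nodes.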

SLIDE 50

ARCHITECTURE

[Figure: BioNav architecture diagram, divided into On-Line and Off-Line parts. Labels in the diagram: BioNav Web Interface; Keyword Query; EXPAND & SHOWRESULTS Actions; Navigation Subsystem; Query Result Citation IDs Retrieval; Navigation Tree Construction; Heuristic-ReducedOpt Algorithm; Active Tree Visualization; MeSH Concepts Lookup; Concepts & Citations; BioNav Database; MeSH Concept Hierarchy; Citations/MeSH Concepts Associations Download; Entrez Programming Utilities (eUtils); MEDLINE Database; PubMed Central]

SLIDE 51

EXPERIMENTAL EVALUATION

SETUP

Experiments to evaluate:

Effect on navigation cost
Performance of the system

Setup:

A total of ten queries were considered for evaluation
Queries and target concepts were sourced from expert users in the biomedical domain
The queries cover a range of use cases, including:

Queries with highly specific keywords and a relevant specific concept
Non-specific queries with a relevant specific concept

SLIDE 52

EXPERIMENTAL EVALUATION

IMPROVEMENT IN OVERALL NAVIGATION COST

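[Figure: bar chart, Navigation Cost of BioNav vs. Static Navigation. Y-axis: Overall Navigation Cost (# of Concepts Revealed + # of EXPAND Actions), 0 to 300. Series: Static Navigation, Heuristic-ReducedOpt]

Data labels, one per query, in legend order:

Static Navigation: 99, 127, 175, 157, 161, 119, 195, 164, 249, 159
Heuristic-ReducedOpt: 20, 18, 14, 18, 26, 39, 11, 31, 32, 17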

SLIDE 53

EXPERIMENTAL EVALUATION

IMPROVEMENT IN NAVIGATION ACTIONS

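[Figure: bar chart, Expand actions of BioNav vs. Static Navigation. Y-axis: # of EXPAND Actions, 0 to 9. Series: Static Navigation, Heuristic-ReducedOpt]

Data labels, one per query, in legend order:

Static Navigation: 5, 3, 7, 3, 4, 3, 5, 4, 7, 6
Heuristic-ReducedOpt: 4, 4, 4, 4, 5, 8, 2, 7, 5, 4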

SLIDE 54

EXPERIMENTAL EVALUATION

QUERY WORKLOAD

| Keyword(s) | # of Citations in Query Result | Navigation Tree Size | Maximum Tree Width | Maximum Tree Height | Tree Citations w/ Duplicates | Target Concept | MeSH Level of Target Concept | \|L(n)\| of Target Concept | \|LT(n)\| of Target Concept |
|---|---|---|---|---|---|---|---|---|---|
| LbetaT2 | 116 | 1947 | 1009 | 10 | 14927 | Mice, Transgenic | 5 | 11 | 90804 |
| melibiose permease | 160 | 1324 | 722 | 8 | 14419 | Substrate Specificity | 3 | 31 | 79470 |
| varenicline | 162 | 1830 | 962 | 6 | 11370 | Nicotinic Agonists | 7 | 81 | 18277 |
| Na+/I symporter | 163 | 2596 | 1367 | 6 | 17146 | Perchloric Acid | 3 | 7 | 4250 |
| prothymosin | 313 | 3941 | 2113 | 10 | 30897 | Histones | 4 | 15 | 22741 |
| ice nucleation | 474 | 3181 | 1776 | 9 | 27440 | Plants, Genetically Modified | 3 | 2 | 12330 |
| vardenafil | 486 | 3424 | 2014 | 8 | 40987 | Phosphodiesterase Inhibitors | 5 | 401 | 69984 |
| dyslexia genetics | 517 | 3056 | 1691 | 9 | 45079 | Polymorphism, Single Nucleotide | 4 | 18 | 18843 |
| syntaxin 1A | 1115 | 6589 | 3764 | 10 | 105503 | GABA Plasma Membrane Transport Protein | 7 | 11 | 650 |
| follistatin | 1183 | 6446 | 3656 | 10 | 102946 | Follicle Stimulating Hormone | 6 | 157 | 34540 |

SLIDE 55

EXPERIMENTAL EVALUATION

PERFORMANCE EXPERIMENTS

[Figure: bar chart, Average execution time of navigation. Y-axis: Average Execution Time (ms), 0 to 3000]

SLIDE 56

EXPERIMENTAL EVALUATION

PERFORMANCE EXPERIMENTS

[Figure: bar chart, Execution time per EXPAND action for query 'prothymosin'. Y-axis: Execution Time per EXPAND (ms), 0 to 1400. Bars: 1st EXPAND (8 Partitions), 2nd EXPAND (7 Partitions), 3rd EXPAND (8 Partitions), 4th EXPAND (10 Partitions), 5th EXPAND (6 Partitions)]

SLIDE 57

BIONAV

FUTURE WORK

Fully integrate categorization and ranking methods

Include user preferences in the cost model

Explore query history as a source of user preferences

Leverage user preferences to suggest better query keywords

Explore an alternate cost model based on work on graph summarization