
FacetAtlas: Multifaceted Visualization for Rich Text Corpora



  1. FacetAtlas: Multifaceted Visualization for Rich Text Corpora. Nan Cao, Jimeng Sun, Yu-Ru Lin, David Gotz, Shixia Liu, Huamin Qu. InfoVis 2010

  2. Introduction

  3. Multiple facets

  4. Multiple facets: Symptoms, Treatments, Causes, Tests & Diagnosis, Prognosis, Prevention, Complications

  5. Diabetes

  6. Type 2 Diabetes, Metabolic Syndrome, Type 1 Diabetes, Gestational Diabetes

  7. Type 2 Diabetes, Metabolic Syndrome, Type 1 Diabetes, Gestational Diabetes. How to visualize the relations of multifaceted document contents?

  8. (Q1) How to model document contents as multifaceted relational data? (Q2) How to intuitively visualize multifaceted document contents and their relations? (Q3) How to find insightful patterns visually, driven by users' interests? (Figure labels: Type 2 Diabetes, Metabolic Syndrome, Type 1 Diabetes, Gestational Diabetes.)

  9. Solution. Goal: visualize both the global (clusters) and local (relations) patterns in rich text corpora with multiple facets. Approach: a multifaceted entity-relational data model; intuitive visual encoding and automatic layout; user-interest-driven interaction for pattern detection.

  10. Demo

  11. Key Challenges: (Q1) How to model document contents as multifaceted relational data? (Q2) How to intuitively visualize multifaceted document contents and their relations? (Q3) How to find insightful patterns visually, driven by users' interests?

  12. (Q1) How to model document contents as multifaceted relational data? Pipeline: document set → facet segmentation → entity extraction → entity set → multifaceted entity-relational data model. Example entities: disease (type 1 diabetes, type 2 diabetes), symptom (thirst, blurred vision), treatment (take medications, blood sugar control). Internal relations connect entities within a facet; external relations connect entities across facets.

  13. Key Challenges: (Q1) How to model document contents as multifaceted relational data? (Q2) How to intuitively visualize multifaceted document contents and their relations? (Q3) How to find insightful patterns visually, driven by users' interests?

  14. (Q2) How to visualize multifaceted document contents and their relations? Three stages: data model → encoding → layout.

  15. (Q2) How to visualize multifaceted document contents and their relations? Three stages: data model → encoding → layout.

  16. Encoding: Multifaceted Entity Relational Model

  17. Encoding: Multifaceted Entity Relational Model. (Figure: disease entities Type 1 Diabetes and Type 2 Diabetes, linked to numbered entities.)

  18. Encoding: Multifaceted Entity Relational Model. Figure labels: multifaceted entities; external relations; internal relations. Facets: disease (Type 1 Diabetes, Type 2 Diabetes), symptoms, treatments.

  19. Encoding: Multifaceted Entity Relational Model. Group entities by external relations; group internal relations. Facets: disease (Type 1 Diabetes, Type 2 Diabetes), symptoms, treatments.

  20. Encoding: the external relation between the disease facet and the symptom facet, encoded as a facet node. (1) Encode external relations by neighborhood. (2) Split overlapped entities into multiple replicas. (3) Group related entities and their replicas into the facet node. (4) Group the related internal linkages in the symptom facet. (Figure labels: grouped internal relation; overlapped entities have multiple replicas; Type 1 Diabetes grouped with 1, 2, 3, 4; Type 2 Diabetes grouped with 3, 4, 5, 6; treatments still listed as pairs <1, 5>, <2, 4>, <3, 4>, <3, 5>.)

  21. Encoding: symptom facet node, disease, treatments. (1) Similarly, group the treatment entities into the treatment facet node. (2) The data model is then encoded into visual form. (Figure: Type 1 Diabetes and Type 2 Diabetes, each with its grouped symptom and treatment entities.)

  22. (Q2) How to visualize multifaceted document contents and their relations? Three stages: data model → encoding → layout.

  23. Layout: 10,000 entities and 30,000 external relations

  24. Layout steps: sampling, entity layout, density estimation, link layout

  25. Layout, sampling step: sampling by DOI. Offline: document set → facet segmentation → entity extraction (disease, symptom, treatment) → build indices. Online: query → related samples.

  26. Layout, entity layout step: stabilized layout, based on the hidden internal relations of the primary facet; keeps users' mental map while the data changes. $\min \sum_{i<j} \frac{1}{d_{ij}^{2}} \left( \lVert X_i - X_j \rVert - d_{ij} \right)^{2} + \sum_{i} \lVert X_i - \mathrm{pre}(X_i) \rVert^{2}$. Annotations: "cluster together" (distance term); "more smoothly" (previous-position term).

  27. Layout, density estimation step: cluster layout using kernel density estimation; RNN.

  28. Layout, link layout step (1): layout of external relations; rotating, swapping.

  29. Layout, link layout step (2): graph partition; edge bundling.

  30. Fever

  31. Diabetes

  32. HIV

  33. HIV: Where are our patterns? What can we find?

  34. Key Challenges: (Q1) How to model document contents as multifaceted relational data? (Q2) How to visualize multifaceted information to reveal both global and local patterns? (Q3) How to find insightful patterns visually, driven by users' interests?

  35. (Q3) How to find insights via user interactions? A set of interactions is designed to address users' interests: keyword query, context switch (for example, between the symptom view and the disease view), filtering, and highlighting.

  36. Visual Patterns (example: symptoms of HIV). Global cluster patterns; local multifaceted relational patterns: co-occurrence pattern and outlier pattern. (Figure labels: Headache, Fatigue, Fever, Shortness of Breath; Outlier; Co-occurrence.)

  37. (Q3) Interviews with domain experts. What did the domain experts (3 physicians) say? “enhance the current thought process of physicians, and help create the subtle associations between different concepts.” “this will be very helpful for nurses who run the self-care education activities to better engage patients.” “this tool has great potential as an education tool for interns and residents who have just started their medical career” “extremely creative and has great potential for clinical therapeutic usage and diagnosis decision support”

  38. Summary • Problem: How to visualize relations of multifaceted document contents? Global / local patterns • Approach: • Result:

  39. FacetAtlas: Multifaceted Visualization for Rich Text Corpora. Nan Cao, Jimeng Sun, Yu-Ru Lin, David Gotz, Shixia Liu, Huamin Qu. InfoVis 2010

  40. Related Work. Visualizing global content patterns: S. Havre, et al. (InfoVis 2000); Tag Cloud; H. Strobelt, et al. (InfoVis 09). Visualizing local relational patterns: F. van Ham, et al. (InfoVis 2009); M. W. Christopher, et al. (Vast 2009); A. Pere, et al. (InfoVis 2006). Search interface: F. van Ham, et al. (InfoVis 2009); G. Smith (TVCG 2006); Grokker.

  41. Related Work. Visualizing global content patterns: S. Havre, et al. (InfoVis 2000); Tag Cloud; H. Strobelt, et al. (InfoVis 09). Visualizing local relational patterns: F. van Ham, et al. (InfoVis 2009); M. W. Christopher, et al. (Vast 2009); A. Pere, et al. (InfoVis 2006). Search interface: F. van Ham, et al. (InfoVis 2009); G. Smith (TVCG 2006); Grokker. Our focus: extract complex relations from document contents by considering different aspects.

  42. Evaluations

  43. User study. Participants: 3 domain experts (2 physicians with 30 years of experience in the healthcare domain and 1 young medical professional); 20 general users without a medical background (2 groups of 10). Tasks: 6 study tasks based on the Google Health online documents, for example T4: identify the facet with the most cross-cluster connections; T6: identify the facet with the most overall connections across entities. Baseline: an enhanced traditional graph visualization, built on the same framework with similar interactions on the same dataset.

  44. Evaluation results from non-experts. Measures: completion time, task success rate, surveys. Results (based on a two-tailed t-test): significant efficiency improvement in visualizing the clusters, showing an overview of multiple connections across clusters, and representing the details of multifaceted connections between entities; slight improvement in finding the most connected facet within a cluster.
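
The data model on slide 12 and the facet-node grouping on slide 20 can be sketched in a few lines. This is only an illustration under stated assumptions: the class and method names (`Entity`, `FacetModel`, `relate`, `facet_node`) are hypothetical and not taken from FacetAtlas, and the links between the example entities are illustrative; only the entity names come from the slides.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    facet: str  # e.g. "disease", "symptom", "treatment"
    name: str


@dataclass
class FacetModel:
    """Hypothetical container for a multifaceted entity-relational model."""
    internal: set = field(default_factory=set)  # relations within one facet
    external: set = field(default_factory=set)  # relations across facets

    def relate(self, a: Entity, b: Entity) -> None:
        if a == b:
            raise ValueError("a relation needs two distinct entities")
        target = self.internal if a.facet == b.facet else self.external
        target.add(frozenset((a, b)))

    def facet_node(self, primary: str, secondary: str) -> dict:
        """Group secondary-facet entities by the primary-facet entity they
        are externally related to. An entity related to several primary
        entities is listed once per group, i.e. as one replica per group."""
        groups = defaultdict(list)
        for edge in self.external:
            a, b = tuple(edge)
            if a.facet == secondary:
                a, b = b, a
            if a.facet == primary and b.facet == secondary:
                groups[a.name].append(b.name)
        return {name: sorted(members) for name, members in groups.items()}


# Entity names come from slide 12; which entities are linked is illustrative.
t1 = Entity("disease", "type 1 diabetes")
t2 = Entity("disease", "type 2 diabetes")
thirst = Entity("symptom", "thirst")
blurred = Entity("symptom", "blurred vision")

model = FacetModel()
model.relate(t1, t2)       # internal relation (same facet)
model.relate(t1, thirst)   # external relations (disease to symptom)
model.relate(t2, thirst)   # "thirst" now overlaps and gets two replicas
model.relate(t2, blurred)

symptom_node = model.facet_node("disease", "symptom")
```

Storing each relation as a `frozenset` keeps it undirected, so `relate(a, b)` and `relate(b, a)` record the same link.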
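
The stabilized layout on slide 26 appears to minimize a stress term, which pulls related entities to their target distances, plus a term that penalizes movement away from each entity's previous position. The sketch below only evaluates such an objective; it does not perform the minimization. The function name `stabilized_stress` and the weight `alpha` are assumptions, since the formula on the slide did not survive extraction cleanly.

```python
import math


def stabilized_stress(X, prev, d, alpha=1.0):
    """Evaluate a stress objective with a stability penalty.

    X, prev: lists of (x, y) positions, current and previous.
    d: dict mapping (i, j) with i < j to a positive target distance d_ij.
    alpha: assumed weight of the stability term (not from the slides).
    """
    stress = 0.0
    for (i, j), dij in d.items():
        dist = math.dist(X[i], X[j])
        stress += (dist - dij) ** 2 / dij ** 2  # weighted by 1 / d_ij^2
    # Squared displacement of every entity from its previous position.
    stability = sum(math.dist(p, q) ** 2 for p, q in zip(X, prev))
    return stress + alpha * stability


previous = [(0.0, 0.0), (3.0, 4.0)]
targets = {(0, 1): 5.0}
unchanged = stabilized_stress(previous, previous, targets)
moved = stabilized_stress([(0.0, 0.0), (6.0, 8.0)], previous, targets)
```

A layout that already matches the target distances and has not moved scores zero; moving an entity raises both terms, which is what keeps successive layouts close to each other.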
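
Slide 27 names kernel density estimation for the cluster layout but gives neither the kernel nor the bandwidth. A minimal sketch, assuming a 2D Gaussian kernel and a hand-picked `bandwidth` (both assumptions, as is the function name `gaussian_kde_2d`):

```python
import math


def gaussian_kde_2d(points, bandwidth):
    """Return a density function estimated from 2D points.

    Uses an isotropic Gaussian kernel; the kernel choice and the
    bandwidth are assumptions, not taken from the slides.
    """
    if not points or bandwidth <= 0:
        raise ValueError("need at least one point and a positive bandwidth")
    norm = 1.0 / (len(points) * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            sq = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-sq / (2.0 * bandwidth ** 2))
        return norm * total

    return density


# Three entities close together and one far away (made-up positions).
layout = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (5.0, 5.0)]
density = gaussian_kde_2d(layout, bandwidth=0.5)
inside_cluster = density(0.0, 0.1)
empty_space = density(2.5, 2.5)
```

Sampling `density` on a grid and drawing its level sets would give cluster regions: the estimate is high where entity positions are dense and falls off in empty space.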
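
Slide 44 reports results based on a two-tailed t-test without stating the variant. The sketch below assumes Welch's unequal-variance test for two independent groups and computes only the t statistic and degrees of freedom; the function name `welch_t` is hypothetical and the sample values are made up, not study data.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, degrees of freedom) for Welch's two-sample t-test."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up completion times in seconds, for illustration only.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(group_a, group_b)
```

For a two-tailed test, the absolute value of `t_stat` is compared against the critical value of the t distribution with `dof` degrees of freedom. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the statistic together with a two-sided p-value.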
