Visualizing Social Media Content with SentenTree Mengdie Hu, Krist - - PowerPoint PPT Presentation

SLIDE 1

Visualizing Social Media Content with SentenTree

Mengdie Hu, Krist Wongsuphasawat, John Stasko. IEEE TVCG 23(1):621-630 2017 (Proc. InfoVis 2016)

Presented by: David Johnson

SLIDE 2

Unstructured Text Documents

  • Twitter and other social media collections consist of many unstructured text documents
  • Unstructured text documents are hard to analyze!
  • Many authors, redundant information
  • Many of these documents can accumulate in a short time

SLIDE 3

Summarizing Unstructured Documents

  • Could extract common information and present a word cloud
  • Word clouds are good at a glance for conveying the overarching theme
  • Word clouds lose concepts and structure
  • How do we maintain semantic representation?

SLIDE 4

SentenTree

SLIDE 5

SentenTree

  • Node-link visualization with force-directed placement
  • An edge between words indicates occurrence in the same tweet
  • Spatial arrangement reflects syntactic ordering
  • Larger font indicates higher frequency of occurrence
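The encoding above can be illustrated with a minimal sketch. It assumes each pattern is a word sequence with a tweet count, links consecutive words, and maps count to font size linearly. The `Node` class, the 10 to 40 px range, and the linear scale are assumptions made for illustration, not details taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Node:
    word: str
    count: int       # number of tweets matching the pattern this word belongs to
    font_px: float   # display size derived from count


def font_size(count, max_count, min_px=10.0, max_px=40.0):
    """Linear map from frequency to font size (the px range is an assumption)."""
    return min_px + (max_px - min_px) * count / max_count


def build_graph(patterns):
    """patterns: list of (word_sequence, tweet_count) pairs."""
    max_count = max(count for _, count in patterns)
    nodes, edges = [], []
    for words, count in patterns:
        start = len(nodes)
        for word in words:
            nodes.append(Node(word, count, font_size(count, max_count)))
        # Link consecutive words left to right so the word order is kept.
        edges.extend((i, i + 1) for i in range(start, start + len(words) - 1))
    return nodes, edges


# Hypothetical example patterns and counts.
nodes, edges = build_graph([
    (["world", "cup"], 100),
    (["goal", "by", "neymar"], 25),
])
```

Unlike the real system, this sketch does not share nodes between patterns; it only shows how frequency and word order could drive size and links.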

SLIDE 6

Frequent Sequential Patterns

Initialization steps:

  • Normalize tweets
  • Perform tokenization
  • The root node of the tree of sequential patterns is the initial pattern
  • The initial pattern contains no words
  • New sequential patterns are grown from the root
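The initialization steps can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a pattern matches a tweet when its words appear in the same order (gaps allowed), and that one growth step inserts a single word at any position and keeps the most frequent result. The functions `tokenize`, `matches`, and `grow` are hypothetical names, and the normalization is a placeholder.

```python
from collections import Counter


def tokenize(tweet):
    # Placeholder normalization: lowercase and split on whitespace.
    return tweet.lower().split()


def matches(pattern, tokens):
    """True if pattern occurs in tokens in order, with gaps allowed."""
    remaining = iter(tokens)
    return all(word in remaining for word in pattern)


def grow(pattern, tweets):
    """Return (new_pattern, support) for the most frequent pattern that is
    exactly one word longer than `pattern`, or (None, 0) if none exists."""
    counts = Counter()
    for tokens in tweets:
        candidates = set()
        for pos in range(len(pattern) + 1):
            for word in set(tokens):
                candidate = pattern[:pos] + (word,) + pattern[pos:]
                if matches(candidate, tokens):
                    candidates.add(candidate)
        counts.update(candidates)  # each tweet counts once per candidate
    if not counts:
        return None, 0
    return counts.most_common(1)[0]


# Hypothetical example tweets.
tweets = [tokenize(t) for t in [
    "World cup final",
    "world cup tonight",
    "World record",
]]

root = ()                        # the initial pattern contains no words
first, n1 = grow(root, tweets)   # grow one word from the root
second, n2 = grow(first, tweets) # grow one more word
```

Starting from the empty root, the first step picks the single most frequent word and the second step extends it. A full implementation would also decide how to partition the tweets between patterns and when to stop growing.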

SLIDE 7

Frequent Sequential Patterns

SLIDE 8

Frequent Sequential Patterns

SLIDE 9

Frequent Sequential Patterns

SLIDE 10

Frequent Sequential Patterns

SLIDE 11

Frequent Sequential Patterns

SLIDE 12

Frequent Sequential Patterns

SLIDE 13

Frequent Sequential Patterns

SLIDE 14

Frequent Sequential Patterns

SLIDE 15

Frequent Sequential Patterns

SLIDE 16

Interaction Demo

https://twitter.github.io/SentenTree/

SLIDE 17

Visual Encoding

  • SentenTree uses a constrained force-directed placement algorithm
  • Placement constraints: word order, vertical, horizontal
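To make the word order constraint concrete, here is a minimal sketch of one way such a constraint could be enforced on x positions. It covers only the constraint step, not the force simulation, and it is not the layout code SentenTree itself uses. The function name `enforce_word_order`, the `gap` parameter, and the simple pairwise push are assumptions.

```python
def enforce_word_order(x, constraints, gap=1.0, iterations=200, tol=1e-9):
    """x: dict mapping node -> x position.
    constraints: list of (left, right) pairs, meaning `left` must sit at
    least `gap` to the left of `right`.
    Repeatedly pushes violating pairs apart until all constraints hold
    (within `tol`) or the iteration budget runs out."""
    x = dict(x)  # work on a copy, leave the caller's positions untouched
    for _ in range(iterations):
        moved = False
        for left, right in constraints:
            violation = x[left] + gap - x[right]
            if violation > tol:
                # Split the correction evenly between the two nodes.
                x[left] -= violation / 2
                x[right] += violation / 2
                moved = True
        if not moved:
            break
    return x


# Three words that start on top of each other but must read left to right.
placed = enforce_word_order(
    {"world": 0.0, "cup": 0.0, "final": 0.0},
    [("world", "cup"), ("cup", "final")],
)
```

In a constrained force-directed layout, a step like this would be interleaved with the force updates so that attraction and repulsion can move nodes freely while the reading order stays intact.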

SLIDE 18

Visual Encoding

Only word order constraint applied

SLIDE 19

Visual Encoding

Only word order constraint applied

Horizontal and vertical constraints added

SLIDE 20

Considerations: Tokenization

  • Stop words and punctuation are removed
  • Numbers, hashtags, URLs, and @handles are matched
  • No stemming is performed
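A tokenizer along these lines might look like the sketch below. The regular expressions and the tiny stop-word list are assumptions chosen for illustration; the paper's exact rules may differ. Lowercasing the whole tweet is also a simplification, since it alters case-sensitive URLs.

```python
import re

# Assumed, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "in", "and", "at", "with"}

# Alternatives are tried in order, so URLs and handles win over plain words.
TOKEN_RE = re.compile("|".join([
    r"https?://\S+",         # URLs
    r"[@#]\w+",              # @handles and hashtags
    r"\d+(?:[.,]\d+)*",      # numbers such as 9, 2.5 or 1,000
    r"[a-z]+(?:'[a-z]+)?",   # plain words, optionally with an apostrophe part
]))


def tokenize(tweet):
    """Lowercase, match tokens, and drop stop words.
    Punctuation is dropped because no pattern matches it.
    No stemming is applied, so 'running' stays 'running'."""
    tokens = TOKEN_RE.findall(tweet.lower())
    return [t for t in tokens if t not in STOP_WORDS]


# Hypothetical example tweet.
tokens = tokenize(
    "Watching the #WorldCup final with @john at 9 https://t.co/abc123 running late..."
)
```

Because nothing is stemmed, "running" and "run" would remain separate tokens.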

SLIDE 21

Critique

The Bad:

  • No stemmer
  • Final visualizations are still sometimes ambiguous

SLIDE 22

Critique

The Good:

  • System accomplishes its design goals
  • Well-written paper with easy-to-understand examples
  • Scalable

SLIDE 23

Thanks!

Questions?
