
SLIDE 1

How Developers Read and Comprehend Stack Overflow Questions for Tag Prediction

Senior Capstone Project By: Ali Morris

SLIDE 2

Objectives

Determine what developers focus on when reading Stack Overflow questions to assign tags, using eye-tracking

Determine valuable areas of interest (AOIs) for tag assignment, especially keywords

Research Questions

RQ1. Which sections of postings are most valuable when assigning tags (code, title, etc.)?

RQ2. How will non-novice developers compare against novice developers with regard to tag assignment accuracy, reading patterns, and areas of interest?

RQ3. How can this information be used to enhance existing techniques for auto-generating tags?


SLIDE 3

Stack Overflow

The largest online community for programmers to learn & share their knowledge

○ 2 million questions, 19 million answers and 47 million comments
○ Available to download; data dump of size 70GB

Forum format where developers can post questions and others can respond
Organization of site dependent on classification scheme driven by tagging system
Why is auto-tagging important? →

○ Users may not know how to correctly categorize questions
○ Stack Overflow depends on this for organization and usefulness

Current auto-generation tag accuracy: 68.47% [1]


SLIDE 4

Related Work

Studies to auto-generate tags for Stack Overflow without eye-tracking
Current approaches all similar:

○ Data Mining & Machine Learning Algorithms [1]-[3]
■ Extract important features by tokenizing many postings
■ Train algorithms on existing data to predict tags for new postings

Can use these concepts for future work
Can improve tag accuracy with eye tracking as implicit feedback
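
To make the tokenize-then-train idea above concrete, here is a minimal sketch of a multinomial Naive Bayes tag ranker in plain Python. It is not the method of [1]-[3]; the toy postings, the tag names, and the helper names (`tokenize`, `train`, `predict`) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict

TOKEN_RE = re.compile(r"[a-z_+#]+")  # keeps tokens such as c++ and c#


def tokenize(text):
    """Lowercase a posting and split it into word-like tokens."""
    return TOKEN_RE.findall(text.lower())


def train(postings):
    """postings: list of (text, tags). Returns per-tag token counts and tag counts."""
    token_counts = defaultdict(Counter)
    tag_counts = Counter()
    for text, tags in postings:
        tokens = tokenize(text)
        for tag in tags:
            tag_counts[tag] += 1
            token_counts[tag].update(tokens)
    return token_counts, tag_counts


def predict(text, token_counts, tag_counts, top_k=2):
    """Rank tags by a Naive Bayes log-score with add-one smoothing."""
    vocab = {tok for counts in token_counts.values() for tok in counts}
    total_labels = sum(tag_counts.values())
    tokens = tokenize(text)
    scores = {}
    for tag, counts in token_counts.items():
        total = sum(counts.values())
        score = math.log(tag_counts[tag] / total_labels)
        for tok in tokens:
            score += math.log((counts[tok] + 1) / (total + len(vocab)))
        scores[tag] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy training data, invented for illustration only.
TRAINING = [
    ("segfault when dereferencing a null pointer after malloc", ["pointers", "memory"]),
    ("how to free memory allocated with malloc", ["memory"]),
    ("pointer arithmetic on an array", ["pointers", "arrays"]),
    ("for loop never terminates", ["loops"]),
    ("while loop condition with break", ["loops"]),
]

token_counts, tag_counts = train(TRAINING)
predicted = predict("infinite while loop", token_counts, tag_counts, top_k=1)
print(predicted)
```

A real system would train on the full data dump and evaluate on held-out postings; the sketch only shows the shape of the pipeline.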


SLIDE 5

Eye-Tracking

Gaze data holds information about visual attention

○ Thought processes, strategies, user technique

A new field: Eye-tracking to study how developers work
Huge amount of data per session:

○ Running at 60Hz → 60 samples per second

Different types of gaze data holding different information
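
A quick back-of-the-envelope calculation shows how fast raw gaze samples accumulate at 60 Hz. The 2-minute task length below is a hypothetical figure chosen for illustration; only the sampling rate, the 9 tasks, and the 7 participants come from these slides.

```python
SAMPLE_RATE_HZ = 60  # sampling rate stated on the slide


def samples_for(duration_seconds, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of raw gaze samples recorded over a given duration."""
    return int(duration_seconds * sample_rate_hz)


# Hypothetical assumption: each task takes about 2 minutes.
per_task = samples_for(120)
per_session = 9 * per_task       # 9 tasks per participant
per_study = 7 * per_session      # 7 participants
print(per_task, per_session, per_study)
```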


SLIDE 6

Eye-Tracking

Types of gaze data & analysis:

○ Fixation: focus point where the eyes remain stationary for some time
○ Duration: total fixation time for an area
○ Saccade: quick eye movement between fixations
○ Scanpath: an interconnected sequence of saccade-fixation-saccade movements
○ Area of Interest (AOI): a specific area of the screen for which quantitative eye-movement measures (fixation counts and durations) are calculated
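
The AOI measures defined above can be sketched in a few lines of Python. This is an illustrative sketch, not how Tobii Studio computes them: it assumes fixations are already detected and given as `(x, y, duration_ms)` tuples, and that AOIs are non-overlapping rectangles. The coordinates and the `AOI` class are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AOI:
    """A rectangular area of interest in screen coordinates (hypothetical layout)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x < self.right and self.top <= y < self.bottom


def aoi_metrics(fixations, aois):
    """Return {aoi_name: (fixation_count, total_duration_ms)}."""
    counts = defaultdict(int)
    durations = defaultdict(float)
    for x, y, duration_ms in fixations:
        for aoi in aois:
            if aoi.contains(x, y):
                counts[aoi.name] += 1
                durations[aoi.name] += duration_ms
                break  # assumption: AOIs do not overlap
    return {aoi.name: (counts[aoi.name], durations[aoi.name]) for aoi in aois}


aois = [
    AOI("Title", 0, 0, 800, 50),
    AOI("Description", 0, 50, 800, 300),
    AOI("Code", 0, 300, 800, 600),
]
fixations = [
    (100, 20, 180),   # on Title
    (200, 100, 250),  # on Description
    (300, 120, 300),  # on Description
    (150, 400, 400),  # on Code
    (900, 700, 120),  # outside every AOI, ignored
]
metrics = aoi_metrics(fixations, aois)
print(metrics)
```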


SLIDE 7

Experiment Design

Conducted in eye-tracking lab utilizing Tobii Studio
7 participants

○ CS, CIS, & EE majors attending Youngstown State University
○ C/C++ coding experience ranging from less than a year to 5 years
○ Each briefed on the study and participated in pre and post surveys

Participants presented with 9 tasks from 3 different categories

○ Sourced directly from Stack Overflow
○ Questions C/C++ relevant
○ Categories increased in complexity & were curated based on defined criteria

Participants assigned up to 5 tags from a Suggested Tags list

○ 10 possible tags: 5 relevant, 5 distractors
○ Participants allowed to suggest tags not in list if necessary

SLIDE 8

Task Categories

Simple: content commonly taught in CS1

Simple data types
Operators
Control structures
Basic properties of C/C++ language.

Average: knowledge beyond the CS1 level that comes from development experience

Specific details of data structures
Involved application of aspects from the simple level

Complex: applications of more difficult/compound topics

Algorithm designs
Complicated memory management techniques
Obscure/intense properties of the C++ language.


SLIDE 9

Figure 1. Sample Task Representation

SLIDE 10

Analysis

AOI groups assigned to each task:

○ Title
○ Description
○ Code
○ Relevant Tags
○ Distractor Tags
○ Keywords

SLIDE 11

Figure 2. AOI Representation

SLIDE 12

Analysis: Tag Accuracy

Average Accuracy: 90.57%
Average Tags per Task: 3
Feedback on overall confidence levels generally reflected accuracy
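
The slides do not state how tag accuracy was scored. One plausible definition, used here purely as an assumption, is the fraction of a participant's assigned tags that belong to the relevant set for that task. The tag names and task data below are invented.

```python
def tag_accuracy(assigned, relevant):
    """Fraction of assigned tags that are relevant (assumed definition)."""
    if not assigned:
        return 0.0
    hits = sum(1 for tag in assigned if tag in relevant)
    return hits / len(assigned)


# Hypothetical relevant tags for one task; any other tag counts as a miss.
relevant = {"c++", "pointers", "arrays", "memory-management", "segmentation-fault"}
assignments = [
    ["c++", "pointers", "arrays"],                     # 3 of 3 relevant
    ["c++", "templates"],                              # 1 of 2 relevant
    ["c++", "pointers", "memory-management", "java"],  # 3 of 4 relevant
]
per_task = [tag_accuracy(tags, relevant) for tags in assignments]
average = sum(per_task) / len(per_task)
print(per_task, average)
```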


SLIDE 13

Analysis: Tag Accuracy

Tag accuracy decreases with difficulty


SLIDE 14

Analysis: Overall Fixation Duration

Averages of all recordings
Relevant and Distractor Tags approximately equal
Most focus time on Description & Code
Least focus time on Title


SLIDE 15

Analysis: Overall Fixation Duration over Categories

Noticeable duration trends on Code & Title fixations


SLIDE 16

Analysis: Overall Fixation Count

Averages of all recordings
Approximately consistent with fixation durations


SLIDE 17

Analysis: Overall Fixation Count over Categories

Same trends appear with changes in Code and Title


SLIDE 18

Analysis: Accuracy Non-novice v. Novice

Non-novice performed slightly better
Where novice excelled:

○ Average-level tasks
○ Also only assigned 1-2 tags in this category

Average Tag Assignment

Non-novice: 3-4 tags
Novice: 2 tags

Non-novice more confident in general in tag assignment

SLIDE 19

Analysis: Fixation Duration Non-Novice v. Novice

Non-novice duration ratios: Code: 32%, Title & Description: 37%
Novice duration ratios: Code: 22%, Title & Description: 46%
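
A duration ratio of this kind can be read as each AOI group's share of the total fixation time. The sketch below assumes that definition; the slides do not state the denominator. The raw durations in seconds are invented, chosen only so that the resulting shares match the first pair of ratios on this slide (32% and 37%).

```python
def duration_ratios(durations_by_aoi):
    """Share of total fixation duration spent on each AOI group."""
    total = sum(durations_by_aoi.values())
    if total == 0:
        return {name: 0.0 for name in durations_by_aoi}
    return {name: d / total for name, d in durations_by_aoi.items()}


# Hypothetical total fixation durations in seconds.
example = {"Code": 16.0, "Title & Description": 18.5, "Other AOIs": 15.5}
ratios = duration_ratios(example)
for name, share in ratios.items():
    print(f"{name}: {share:.0%}")
```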

SLIDE 20

Analysis: Fixation Count Non-Novice v Novice

Non-novice count ratios: Code: 32%, Title & Description: 43% (Code: 13 s, Title & Description: 27 s)
Novice count ratios: Code: 24%, Title & Description: 50% (Code: 13 s, Title & Description: 27 s)

SLIDE 21

Analysis: Keywords

Time to first fixation

○ Tags not evaluated before posting

Notice: on average, a quick first fixation on keywords
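
Time to first fixation can be sketched as the start time of the earliest fixation that lands on each AOI, measured from stimulus onset. The sketch assumes each fixation is already labelled with one AOI name (or `None` when it falls outside every AOI); the timestamps are invented.

```python
def time_to_first_fixation(fixations, stimulus_onset_ms=0):
    """fixations: list of (start_ms, aoi_name or None).

    Returns {aoi_name: milliseconds from stimulus onset to the first fixation}.
    """
    first = {}
    for start_ms, aoi in sorted(fixations, key=lambda f: f[0]):
        if aoi is not None and aoi not in first:
            first[aoi] = start_ms - stimulus_onset_ms
    return first


# Hypothetical fixation log for one task.
fixation_log = [
    (0, "Title"),
    (350, "Description"),
    (600, "Keywords"),
    (900, "Code"),
    (1200, None),          # off-AOI fixation, ignored
    (1500, "Keywords"),    # revisit, does not change the first-fixation time
    (2100, "Relevant Tags"),
]
ttff = time_to_first_fixation(fixation_log)
print(ttff)
```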

SLIDE 22

Analysis: Keywords

Readers often go back to keyword after first fixation
On average, 26% of fixations fall on keywords, although they occupy only a small portion of the screen
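
Revisits and the keyword share of fixations can both be derived from the ordered sequence of AOIs that a reader fixates. The sketch below assumes each fixation carries a single AOI label and treats a run of consecutive fixations on the same AOI as one visit; the sequence is invented.

```python
from itertools import groupby


def visit_counts(aoi_sequence):
    """Count visits per AOI; consecutive fixations on one AOI form a single visit."""
    visits = {}
    for aoi, _run in groupby(aoi_sequence):
        if aoi is not None:
            visits[aoi] = visits.get(aoi, 0) + 1
    return visits


def fixation_share(aoi_sequence, aoi):
    """Fraction of all fixations that land on the given AOI."""
    if not aoi_sequence:
        return 0.0
    return aoi_sequence.count(aoi) / len(aoi_sequence)


# Hypothetical ordered AOI labels for one reader.
sequence = ["Title", "Keywords", "Keywords", "Description",
            "Keywords", "Code", "Code", "Keywords"]
visits = visit_counts(sequence)
keyword_revisits = visits["Keywords"] - 1  # visits after the first one
keyword_share = fixation_share(sequence, "Keywords")
print(visits, keyword_revisits, keyword_share)
```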


SLIDE 23

Fixation Count vs Duration vs Visits [4]


SLIDE 24

Conclusions

Fixation count & duration often correlate
Approximately equal time spent evaluating Relevant and Distractor tags
With an increase in difficulty →

■ Increase of fixations on Code
■ Decrease of fixations on Title (especially true for non-novice programmers)

Non-novice programmers: perform better, assign more tags, focus more on code than novices & use it more as questions become more difficult

Novice programmers: lower accuracy in tag assignment, assign fewer tags, focus mostly on description & title

From visual and statistical analysis: developers tend to evaluate postings first and tags after (sequential pattern)

○ Learning styles & reading patterns can affect outcome [5]

Developers quickly focus on keywords & revisit them frequently throughout evaluation


SLIDE 25

Future Work

Continuation of this project:

Machine learning algorithms (informed by eye-gaze) to predict tags:

○ Linear Support Vector Machines (SVM), Naive Bayes, Random Forest

Keyword Identification:

Identify keywords in text automatically

○ Consider existing models for tag generation compounded with eye-tracking

Recognize code as relevant keywords

○ Will differ with different languages
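
One possible way for eye-gaze to inform the classifiers listed above is to weight tokens by the section they come from before building feature vectors. This is a sketch of that idea under stated assumptions, not a design from the project: the section weights are hypothetical placeholders (in practice they might be derived from measured fixation shares), and `weighted_bag_of_words` is an invented helper.

```python
import re
from collections import Counter

# Hypothetical gaze-informed weights per section of a posting.
SECTION_WEIGHTS = {"title": 0.5, "description": 1.0, "code": 1.5}

TOKEN_RE = re.compile(r"[a-z_+#]+")


def weighted_bag_of_words(posting, weights=SECTION_WEIGHTS):
    """posting: {section_name: text}. Returns token -> gaze-weighted count."""
    features = Counter()
    for section, text in posting.items():
        weight = weights.get(section, 1.0)  # unknown sections get a neutral weight
        for token in TOKEN_RE.findall(text.lower()):
            features[token] += weight
    return features


posting = {
    "title": "Pointer to array",
    "description": "Why does my pointer crash",
    "code": "int *pointer = array;",
}
features = weighted_bag_of_words(posting)
print(features.most_common(3))
```

The resulting weighted counts could then serve as input features for a linear SVM, Naive Bayes, or Random Forest model.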

SLIDE 26

References

[1] A. K. Saha, R. K. Saha, and K. A. Schneider, “A discriminative model approach for suggesting tags automatically for stack overflow questions,” in Proceedings of the 10th Working Conference on Mining Software Repositories, 2013.

[2] C. Stanley and M. D. Byrne, “Predicting tags for stackoverflow posts,” in Proceedings of ICCM, 2013, vol. 2013.

[3] S. Schuster, W. Zhu, and Y. Cheng, “Predicting Tags for Stack Overflow Questions,” 2013.

[4] Tobii AB, “Tobii Studio User’s Manual,” Version 3.4.5, 2016.

[5] A. Goswami, G. Walia, M. McCourt, and G. Padmanabhan, “Using Eye Tracking to Investigate Reading Patterns and Learning Styles of Software Requirement Inspectors to Enhance Inspection Team Outcome,” in Proceedings of ESEM, 2016.