Evaluating How Developers Use General-Purpose Web-Search for Code Retrieval

SLIDE 1

Evaluating How Developers Use General-Purpose Web-Search for Code Retrieval

Date: May 29, 2018

Md Masudur Rahman, Jed Barson, Sydney Paul, Joshua Kayan, Federico Andres Lois, Sebastian Fernandez Quezada, Christopher Parnin, Kathryn T. Stolee, Baishakhi Ray

SLIDE 2

Coding Task

Convert a date string to a time object

SLIDE 3

Query: string to time

SLIDE 4

Query: string to time

Search Log
  • string to time

SLIDE 5

Query: string to time

Refinement: Java

Search Log
  • string to time

SLIDE 6

Query: string to time using java

Search Log
  • string to time

SLIDE 7

Query: string to time using java

Search Log
  • string to time
  • string to time using java

SLIDE 8

Query: string to time using java

Search Log
  • string to time
  • string to time using java

SLIDE 9

Query: string to time using java

Refinement: DateTime

Search Log
  • string to time
  • string to time using java

SLIDE 10

Query: date string to DateTime using java

Search Log
  • string to time
  • string to time using java

SLIDE 11

Query: date string to DateTime using java

Search Log
  • string to time
  • string to time using java
  • date string to DateTime using java

SLIDE 12

Query: date string to DateTime using java

Refinement: Joda Time library

Search Log
  • string to time
  • string to time using java
  • date string to DateTime using java

SLIDE 13

Query: date string to DateTime using Joda Time library

Search Log
  • string to time
  • string to time using java
  • date string to DateTime using java

SLIDE 14

Query: date string to DateTime using Joda Time library

Search Log
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda…

SLIDE 15

Query: date string to DateTime using Joda Time library

SLIDE 16

Query: date string to DateTime using Joda Time library

X X X

SLIDE 17

Query: world cup fixtures

Search Log
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda …

SLIDE 18

Query: world cup fixtures

Search Log
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda …
  • world cup fixtures

SLIDE 19

Query: place to visit in gothenburg

Search Log
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda …
  • world cup fixtures

SLIDE 20

Query: place to visit in gothenburg

Search Log
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda …
  • world cup fixtures
  • place to visit in gothenburg

SLIDE 21

Code Query

Code queries
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda Time library

SLIDE 22

Search Task

Code queries
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda Time library

Search Task: Convert a date string to a DateTime object using Joda Time library

SLIDE 23

Code vs Non-code

Code queries
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda Time library

Non-code queries
  • world cup fixtures
  • place to visit in gothenburg
  • hotel in gothenburg

SLIDE 24

General Purpose Search Engine for Code Retrieval

Code queries
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda Time library

Non-code queries
  • world cup fixtures
  • place to visit in gothenburg
  • hotel in gothenburg

SLIDE 25

Research Goal

  • Query characteristics
  • User behavior

Code queries
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda Time library

Non-code queries
  • world cup fixtures
  • place to visit in gothenburg
  • hotel in gothenburg

SLIDE 26

Dataset

  • Users: 310 (mostly developers)
  • Consists of code and non-code queries
  • Total queries: 150K
  • Chrome plugin

Query Search Log
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda Time library
  • world cup fixtures
  • place to visit in gothenburg
  • hotel in gothenburg

SLIDE 27

Dataset

Query Search Log
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda Time library
  • world cup fixtures
  • place to visit in gothenburg
  • hotel in gothenburg

No label: Code or Non-code?

SLIDE 28

Dataset

Query Search Log
  • string to time
  • string to time using java
  • date string to DateTime using java
  • date string to DateTime using Joda Time library
  • world cup fixtures
  • place to visit in gothenburg
  • hotel in gothenburg

No label: Code or Non-code?

Query Classifier

SLIDE 29

Intent-based Query Classification

SLIDE 30

Code Intent Analysis

Query: javascript function to get mp3 play length

SLIDE 31

Code Intent Analysis

Query: javascript function to get mp3 play length

CodeScore: ?

SLIDE 32

Code Intent Analysis

Query: javascript function to get mp3 play length

Token Code Intent
  S = set of code related tags
  n = popularity of a tag

CodeScore (per token): 17 7 6 5 8 3
CodeScore (query): ?

SLIDE 33

Code Intent Analysis

Query: javascript function to get mp3 play length

Token Code Intent (CodeScore per token): 17 7 6 5 8 3
Query Code Intent: 46
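
The numbers on this slide suggest that the query-level score is the sum of the token-level scores (17 + 7 + 6 + 5 + 8 + 3 = 46). The following is a minimal sketch of that aggregation only. The slide does not say which token each value belongs to, so the `TOKEN_SCORES` mapping is a hypothetical assignment, and treating unscored tokens such as "to" as contributing 0 is likewise an assumption. How the per-token scores are derived from tag popularity is not reproduced here.

```python
from typing import Mapping

# Hypothetical assignment: the slide lists the values 17, 7, 6, 5, 8, 3
# but not which token each one belongs to.
TOKEN_SCORES = {
    "javascript": 17,
    "function": 7,
    "get": 6,
    "mp3": 5,
    "play": 8,
    "length": 3,
}


def query_code_score(query: str, token_scores: Mapping[str, float]) -> float:
    """Sum the per-token scores; tokens without a score add 0 (assumption)."""
    return sum(token_scores.get(token, 0) for token in query.lower().split())


score = query_code_score("javascript function to get mp3 play length", TOKEN_SCORES)
print(score)  # 46
```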

SLIDE 34

Query Code Score

Query                                 Code Score
string to time                        12
string to time using java             20
date string to DateTime using java    22.5
world cup fixtures
messi curly goal                      2.6
place to visit in gothenburg

SLIDE 35

Query Code Score

Query                                 Code Score   Label
string to time                        12           ?
string to time using java             20           ?
date string to DateTime using java    22.5         ?
world cup fixtures                                 ?
messi curly goal                      2.6          ?
place to visit in gothenburg                       ?

SLIDE 36

Query Code Score

Query                                 Code Score   Label
string to time                        12           ?
string to time using java             20           ?
date string to DateTime using java    22.5         ?
world cup fixtures                                 ?
messi curly goal                      2.6          ?
place to visit in gothenburg                       ?

Threshold = 10

Classifier Evaluation
  • Manually annotated 380 queries
  • Precision: 87%
  • Recall: 86%
  • F1-score: 87%

SLIDE 37

Query Code Score

Query                                 Code Score   Label
string to time                        12           Code
string to time using java             20           Code
date string to DateTime using java    22.5         Code
world cup fixtures                                 Non-code
messi curly goal                      2.6          Non-code
place to visit in gothenburg                       Non-code

Threshold = 10

Classifier Evaluation
  • Manually annotated 380 queries
  • Precision: 87%
  • Recall: 86%
  • F1-score: 87%

SLIDE 38

Query Code Score

Query                                 Code Score   Label
string to time                        12           Code
string to time using java             20           Code
date string to DateTime using java    22.5         Code
world cup fixtures                                 Non-code
messi curly goal                      2.6          Non-code
place to visit in gothenburg                       Non-code

Annotated Data
  • Code: 89K (59%)
  • Non-code: 61K (41%)

Threshold = 10

Classifier Evaluation
  • Manually annotated 380 queries
  • Precision: 87%
  • Recall: 86%
  • F1-score: 87%
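
The labeling step can be sketched as a simple threshold on the code score, followed by the usual precision, recall, and F1 computation against manual annotations. This is an illustration under stated assumptions, not the authors' implementation: the deck does not say whether a score equal to the threshold counts as Code (`>=` is assumed here), and queries shown without a score are assumed to score 0. The function names are hypothetical.

```python
from typing import Optional, Tuple

THRESHOLD = 10  # value reported on the slide


def label_query(code_score: Optional[float], threshold: float = THRESHOLD) -> str:
    """Label a query from its code score.

    Assumptions (not stated in the deck): a missing score counts as 0,
    and a score exactly at the threshold counts as Code.
    """
    score = 0.0 if code_score is None else code_score
    return "Code" if score >= threshold else "Non-code"


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard precision, recall, and F1 from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Rows from the slide; None marks queries shown without a score.
rows = [
    ("string to time", 12),
    ("string to time using java", 20),
    ("date string to DateTime using java", 22.5),
    ("world cup fixtures", None),
    ("messi curly goal", 2.6),
    ("place to visit in gothenburg", None),
]
labels = {query: label_query(score) for query, score in rows}
for query, label in labels.items():
    print(f"{query}: {label}")
```

With these assumptions the six example rows receive the same labels as on the slide. The evaluation figures themselves (87%, 86%, 87%) come from the 380 manually annotated queries and cannot be recomputed from the deck, since the confusion counts are not given.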

SLIDE 39

Research Questions

Query Characteristics, User Behavior

  • RQ1. How do query characteristics differ for code and non-code queries?
  • RQ2. How do search behaviors vary for code and non-code related queries?
  • RQ3. How do task sessions vary for code and non-code related search tasks?

SLIDE 40

Results

SLIDE 41

RQ1: Query Characteristics

Code queries are often longer (more tokens) than non-code queries

Code
  • date string to DateTime using java
  • date string to DateTime using Joda Time library
  • javascript function to get mp3 play length

Non-code
  • world cup fixtures
  • messi curly goal
  • hotel in gothenburg

SLIDE 42

RQ1: Query Characteristics

Code
  • date string to DateTime using java
  • date string to DateTime using Joda Time library
  • javascript function to get mp3 play length

Non-code
  • world cup fixtures
  • messi curly goal
  • hotel in gothenburg

SLIDE 43

RQ1: Query Characteristics

Code
  • date string to DateTime using java
  • date string to DateTime using Joda Time library
  • javascript function to get mp3 play length

Non-code
  • world cup fixtures
  • messi curly goal
  • hotel in gothenburg

SLIDE 44

RQ1: Query Characteristics

[Figure: vocabulary of Code vs Non-code queries; values shown: 16K, 12K, 33K]

Code queries use a smaller vocabulary (fewer unique tokens) than non-code queries
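
The two RQ1 measures, query length in tokens and vocabulary size in unique tokens, can be sketched as follows. Whitespace tokenization and lowercasing are assumptions, since the deck does not describe its tokenizer, and the function names are hypothetical. The inputs are the example queries shown on the RQ1 slides.

```python
from typing import Iterable, Set


def average_length(queries: Iterable[str]) -> float:
    """Mean number of whitespace-separated tokens per query."""
    lengths = [len(q.split()) for q in queries]
    return sum(lengths) / len(lengths)


def vocabulary(queries: Iterable[str]) -> Set[str]:
    """Set of unique lowercased tokens across all queries."""
    return {token.lower() for q in queries for token in q.split()}


code_queries = [
    "date string to DateTime using java",
    "date string to DateTime using Joda Time library",
    "javascript function to get mp3 play length",
]
non_code_queries = [
    "world cup fixtures",
    "messi curly goal",
    "hotel in gothenburg",
]

print(average_length(code_queries))      # 7.0
print(average_length(non_code_queries))  # 3.0
print(len(vocabulary(code_queries)))
print(len(vocabulary(non_code_queries)))
```

The length comparison is visible even on these six examples. The vocabulary finding is a property of the full query log and should not be expected to show up in a sample this small.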

SLIDE 45

RQ2: Query Search Behavior

           Query                                              # term added   # term deleted
Code       string to time                                     -              -
           string to time using java                          2              -
           date string to DateTime using Joda Time library    4              2
Non-code   hotel in gothenburg                                -              -
           best hotel in gothenburg                           1              -

SLIDE 46

RQ2: Query Search Behavior

           Query                                              # term added   # term deleted
Code       string to time                                     -              -
           string to time using java                          2              -
           date string to DateTime using Joda Time library    4              2
Non-code   hotel in gothenburg                                -              -
           best hotel in gothenburg                           1              -

(Callout: edited query)

SLIDE 47

RQ2: Query Search Behavior

Users often add/delete more terms in a code query (avg. 2) than in a non-code query (avg. 1)

           Query                                              # term added   # term deleted
Code       string to time                                     -              -
           string to time using java                          2              -
           date string to DateTime using Joda Time library    4              2
Non-code   hotel in gothenburg                                -              -
           best hotel in gothenburg                           1              -
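
Counting added and deleted terms between a query and its edited version can be sketched as a set difference over tokens. This is an assumption about how the counts might be obtained: the deck does not specify the tokenization, case handling, or which earlier query each edit is compared against, so counts for other rows of the table may differ from what this sketch returns. The function name `term_edits` is hypothetical.

```python
from typing import Tuple


def term_edits(previous: str, edited: str) -> Tuple[int, int]:
    """Return (# terms added, # terms deleted) between two queries.

    Assumes whitespace tokenization and case-insensitive comparison.
    """
    before = set(previous.lower().split())
    after = set(edited.lower().split())
    return len(after - before), len(before - after)


print(term_edits("string to time", "string to time using java"))      # (2, 0)
print(term_edits("hotel in gothenburg", "best hotel in gothenburg"))  # (1, 0)
```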

SLIDE 48

RQ2: Query Search Behavior

           Query                                              # term added   # term deleted   Code Score
Code       string to time                                     -              -                12
           string to time using java                          2              -                20
           date string to DateTime using Joda Time library    4              2                30.5

SLIDE 49

RQ2: Query Search Behavior

           Query                                              # term added   # term deleted   Code Score
Code       string to time                                     -              -                12
           string to time using java                          2              -                20
           date string to DateTime using Joda Time library    4              2                30.5

Users edit queries to increase code intent

SLIDE 50

RQ3: Task Search Behavior

Code Task
  Task intent: Converting a date string to a Time object
  # query: 4
  Queries:
  • string to time
  • string to time using java
  • date string to DateTime using Joda Time library

Non-code Task
  Task intent: Hotel booking in Gothenburg
  # query: 2
  Queries:
  • hotel in gothenburg
  • best hotel in gothenburg

More queries are required to complete a code task

SLIDE 51

RQ3: Task Search Behavior

Code Task
  Task intent: Converting a date string to a Time object
  Search duration (minute): 6
  # web visit: 15
  Queries:
  • string to time
  • string to time using java
  • date string to DateTime using Joda Time library

Non-code Task
  Task intent: Hotel booking in Sweden
  Search duration (minute): 2
  # web visit: 5
  Queries:
  • hotel in gothenburg
  • hotel in stockholm

More time and more website visits are required to complete code-related tasks
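
The per-task measures on the RQ3 slides (number of queries, search duration, number of web visits) can be sketched as a small aggregation over the events of one task. Everything in this example is hypothetical: the `SearchEvent` record, its fields, the sample timestamps, and the definition of duration as the gap between the first and last query. The deck does not describe how the log is stored or how queries are grouped into tasks, and that grouping step is not attempted here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class SearchEvent:
    """One query in a task session (hypothetical record layout)."""
    time: datetime
    query: str
    visits: int  # pages opened from this query's results


def summarize_task(events: List[SearchEvent]) -> Dict[str, float]:
    """Summarize one task: query count, duration in minutes, total web visits.

    Duration is taken as last query time minus first query time (assumption).
    """
    ordered = sorted(events, key=lambda e: e.time)
    minutes = (ordered[-1].time - ordered[0].time).total_seconds() / 60
    return {
        "queries": len(ordered),
        "minutes": minutes,
        "visits": sum(e.visits for e in ordered),
    }


# Made-up events for illustration only; not data from the study.
task = [
    SearchEvent(datetime(2018, 5, 29, 10, 0), "string to time", 1),
    SearchEvent(datetime(2018, 5, 29, 10, 2), "string to time using java", 2),
    SearchEvent(datetime(2018, 5, 29, 10, 6), "date string to DateTime using Joda Time library", 3),
]
print(summarize_task(task))
```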

SLIDE 52

Summary

[Diagram: Code and Non-Code queries, General Search Engine]

  • Code queries are linguistically different
  • Users modify code queries more often
  • Users spend significantly more effort on code tasks

SLIDE 53

Summary

[Diagram: Code and Non-Code queries, General Search Engine]

  • Code queries are linguistically different
  • Users modify code queries more often
  • Users spend significantly more effort on code tasks
  • Code search is less effective

SLIDE 54

Summary

[Diagram: Code and Non-Code queries, General Search Engine]

  • Code queries are linguistically different
  • Users modify code queries more often
  • Users spend significantly more effort on code tasks
  • Code search is less effective
  • Special treatment is required to improve code retrieval

SLIDE 55

Questions?

[Diagram: Code and Non-Code queries, General Search Engine]

  • Code queries are linguistically different
  • Users modify code queries more often
  • Users spend significantly more effort on code tasks
  • Code search is less effective
  • Special treatment is required to improve code retrieval