user feedback in software evolution


SLIDE 1

user feedback in software evolution

Andrew J. Ko
with Parmit Chilana

based on

Ko, A.J. and Chilana, P. (2010). How power users help and hinder open bug reporting. ACM Conference on Human Factors in Computing (CHI).

Chilana, P., Ko, A.J., & Wobbrock, J.O. (2010). Understanding expressions of unwanted behaviors in open bug reporting. IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC).

SLIDE 2

Andrew J. Ko – IBM Research Symposium, Sept. 30, 2010

finding and fixing everyday software defects

helping developers diagnose failures
detecting usability defects automatically
helping users diagnose failures

where do defects come from?
what makes debugging difficult?
what do software teams need to fix defects?

SLIDE 3

finding and fixing everyday software defects

software team user community

SLIDE 5

finding and fixing everyday software defects

crowdsourced help in support forums
web-based user feedback
perpetual beta status
open development processes
open bug reporting

users have more opportunities than ever to influence what defects are found and fixed

software team

SLIDE 6

users have more opportunities than ever to influence what defects are found and fixed

who contributes such feedback?
what do they write about?
what feedback is addressed?
why is some feedback ignored?
how can teams get better feedback?
how can teams use feedback better?

SLIDE 7

who contributes such feedback?
what do they write about?
what feedback is addressed?
why is some feedback ignored?
how can teams get better feedback?
how can teams use feedback better?

research from the past two years
research in progress

SLIDE 8

open bug reporting in the Mozilla project

who contributes such feedback?
what do they write about?
what feedback is addressed?
why is some feedback ignored?

research from the past two years

SLIDE 9

outline

a brief introduction to bug reporting

1. who contributes such feedback?
2. what do they write about?
3. what feedback is addressed?
4. why is some feedback ignored?

SLIDE 10

a user’s expectations are violated

“I wanted this to be a new tab, not a new window!”

SLIDE 11

user visits bugzilla.mozilla.org

SLIDE 12

users are encouraged to look for an existing bug report that describes their issue

SLIDE 13

if they don’t find one, they submit a new bug report...

Options for where to open URLs from other applications (reuse tab, new window)

This is a dupe of http://bugzilla.mozilla.org/show_bug.cgi?id=75138 (for Mozilla). There’s a great discussion of it over there. However, that RFE has been listed for over a year now and is still marked for “future”.

SLIDE 14

SLIDE 15

resolution: the reason the report was closed (FIXED, DUPLICATE, INCOMPLETE, WONTFIX, WORKSFORME, INVALID)
reporter: who created the report
assignee: the developer assigned to resolve this report
date: when the report was created
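
A minimal sketch of the report fields named on this slide, written as a Python data class. The class name, field names, and example address are illustrative assumptions, not Bugzilla's actual schema; only the six resolution values and the four field descriptions come from the slide.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Resolution(Enum):
    """The six closing reasons listed on the slide."""
    FIXED = "FIXED"
    DUPLICATE = "DUPLICATE"
    INCOMPLETE = "INCOMPLETE"
    WONTFIX = "WONTFIX"
    WORKSFORME = "WORKSFORME"
    INVALID = "INVALID"


@dataclass
class BugReport:
    # Hypothetical field names chosen for this sketch.
    report_id: int
    reporter: str                            # who created the report
    created: date                            # when the report was created
    assignee: Optional[str] = None           # developer assigned to resolve it
    resolution: Optional[Resolution] = None  # None while the report is still open

    def is_closed(self) -> bool:
        """A report counts as closed once it has a resolution."""
        return self.resolution is not None


# Made-up example record.
report = BugReport(report_id=1, reporter="user@example.org", created=date(2010, 9, 30))
report.resolution = Resolution.DUPLICATE
```

Keeping `resolution` optional lets the same record describe a report both before and after it is closed.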

SLIDE 16

attachment: screenshots, patches, test cases, mockups, etc.
attacher
date

SLIDE 17

comments and commenters

anyone interested in this bug can add information to the report

SLIDE 18

a typical report lifespan

original report → comment → comment → screenshot → comment → comment → patch 1.0 → comment → patch 2.0 → code review → patch 3.0 → closed

SLIDE 19

1. who contributes bug reports?
2. what do they write about?
3. what are the outcomes of their reports?
4. why do their reports have these outcomes?

SLIDE 20

1. who contributes bug reports?

obtained all 496,766 reports except for private security patches
15 years of reports, including Netscape years
152,877 unique e-mail addresses
64% of addresses only authored, attached to, or commented on 1 report

whose e-mail addresses were these?
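
The share of addresses that touched only one report can be computed by grouping activity by address. The sketch below assumes the data is available as (e-mail, report id) pairs, one per authored report, attachment, or comment; that input shape, the function name, and the toy records are assumptions for illustration, not the study's actual pipeline.

```python
from collections import defaultdict


def single_report_share(events):
    """Return the fraction of addresses whose activity touches exactly one report.

    `events` is an iterable of (email, report_id) pairs (hypothetical format).
    """
    reports_by_address = defaultdict(set)
    for email, report_id in events:
        reports_by_address[email].add(report_id)
    if not reports_by_address:
        return 0.0
    singles = sum(1 for ids in reports_by_address.values() if len(ids) == 1)
    return singles / len(reports_by_address)


# Toy data, invented for illustration only.
toy_events = [
    ("a@example.org", 1),
    ("a@example.org", 1),  # a second comment on the same report
    ("b@example.org", 1),
    ("b@example.org", 2),
    ("c@example.org", 3),
]
share = single_report_share(toy_events)  # two of the three addresses touch one report
```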

SLIDE 21

whose e-mail addresses were these?

CORE (1%): developers, drivers, super reviewers, module owners, peers

SLIDE 22

who was behind these 152,877 addresses?

CORE (1%): developers, drivers, super reviewers, module owners, peers
ACTIVE (1%): developers assigned bug reports

SLIDE 23

who was behind these 152,877 addresses?

CORE (1%): developers, drivers, super reviewers, module owners, peers
ACTIVE (1%): developers assigned bug reports
REPORTERS (80%): reported and commented on bug reports; responsible for 54% of reports

SLIDE 24

who was behind these 152,877 addresses?

CORE (1%): developers, drivers, super reviewers, module owners, peers
ACTIVE (1%): developers assigned bug reports
REPORTERS (80%): reported and commented on bug reports; responsible for 54% of reports
USERS (18%): only commented on bug reports
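
One way to picture the four contributor types is as a precedence rule over what each address did. The sketch below is an assumption about how such a rule could look, not the study's actual classification procedure; the function name, its parameters, the order of the checks, and the "UNKNOWN" fallback are all hypothetical.

```python
def classify_contributor(is_core: bool, assigned: int, reported: int, commented: int) -> str:
    """Map one e-mail address's activity to a contributor type from the slides.

    The precedence (CORE, then ACTIVE, then REPORTER, then USER) is an
    assumption made for this sketch.
    """
    if is_core:        # developers, drivers, super reviewers, module owners, peers
        return "CORE"
    if assigned > 0:   # developers assigned bug reports
        return "ACTIVE"
    if reported > 0:   # reported (and possibly commented on) bug reports
        return "REPORTER"
    if commented > 0:  # only commented on bug reports
        return "USER"
    return "UNKNOWN"   # activity the slides do not name, e.g. attachment only


# Made-up activity counts for four addresses.
examples = [
    classify_contributor(True, 5, 2, 9),
    classify_contributor(False, 3, 0, 1),
    classify_contributor(False, 0, 1, 0),
    classify_contributor(False, 0, 0, 4),
]
```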

SLIDE 25

[Chart: # of active contributors by type (reporters, users, active, core), per 6 month period, with release markers Firefox 0.1, Firefox 1, Firefox 2, Firefox 3]

reporters and users fluctuate, with a spike before a release

SLIDE 26

1. who contributes bug reports?
   mostly non-developer, one-time contributors who were active pre-release (REPORTERS)
2. what do they write about?
3. what are the outcomes of their reports?
4. why do their reports have these outcomes?

SLIDE 28

2. what do they write about?

selected a sample of 50 REPORTER reports
inductively analyzed titles and descriptions

SLIDE 29

2. what do they write about?

the most salient feature was the source of expectation: what group or person believed the behavior was defective?

SLIDE 30

2. what do they write about?

iterated through 3 samples of 100 reports, converging towards a set of 7 sources

runtime logic specifications standards individual community genre prior

SLIDE 31

2. what do they write about?

runtime logic specifications standards individual community genre prior

developer intents
user expectations

SLIDE 32

runtime violations

errors, warnings, assertion violations, crashes, hangs, and language-defined invalid states

“…scary deadlock assertions exiting mozilla after referencing nsInstallTrigger…”

language

SLIDE 33

specification violations

an agreed upon functional requirement among the application developers

“There's an incorrectly placed PR_MAX in the code for pref width distribution of colspanning cells.”

application language

SLIDE 34

standards violations

industry-wide functional specifications, reaching beyond the application’s developer community

“'codebase' attribute of the HTML 4.0 OBJECT element is not supported…”

industry application language

SLIDE 35

violations of a reporter’s expectations

a reporter’s personal perspective about what the system should do

“Every time I Sort By Name by Bookmarks Firefox sorts and closes my Bookmark menu... Why does it do this??”

app user

SLIDE 36

violation of a community’s expectations

a reporter’s belief about a “typical” user’s expectations

“The preference to not show the tab bar when only one tab is open could be set to false by default. This would at least alert a new user to the possibility that tabs exist) The old tabbed browsing preferences could be returned.”

app user community app user

SLIDE 37

violation of genre conventions

inconsistency with the behavior of a similar application

“Firefox does not limit the slideshow horizontal size to the window width. The same source works correctly in IE.”

industry user community
app user community app user

SLIDE 38

inconsistency with prior behavior

community expectation that behavior of previous versions would be preserved

“The latest version of Firefox only imports one certificate from each file. I used to import all certificates previously.”

industry user community
app user community app user

SLIDE 39

1. who contributes bug reports?
   mostly non-developer, one-time contributors who were active pre-release (REPORTERS)
2. what do they write about?
   expectations from developer and user communities of varying population scope
3. what are the outcomes of their reports?
4. why do their reports have these outcomes?

SLIDE 41

3. what are the outcomes of their reports?

[Chart: share of each resolution (duplicate, incomplete, wontfix, worksforme, invalid, fixed) by contributor type (reporter, active, core), 0% to 100%]

most DEVELOPER reports are fixed

SLIDE 42

3. what are the outcomes of their reports?

[Chart: share of each resolution (duplicate, incomplete, wontfix, worksforme, invalid, fixed) by contributor type (reporter, active, core), 0% to 100%]

only 13% of REPORTER reports were fixed

SLIDE 43

3. what are the outcomes of their reports?

[Chart: share of each resolution (duplicate, incomplete, wontfix, worksforme, invalid, fixed) by contributor type (reporter, active, core), 0% to 100%]

half of fixed reports were reported by just ~8,000 (6% of) REPORTERs

SLIDE 44

3. what are the outcomes of their reports?

[Chart: share of each resolution (duplicate, incomplete, wontfix, worksforme, invalid, fixed) by contributor type (reporter, active, core), 0% to 100%]

most REPORTER reports were duplicate, worksforme, or invalid

SLIDE 45

were the duplicates useful?

73% of REPORTERs’ duplicates referred to fixed reports
70% of REPORTERs’ duplicates referred to issues known for > 1 month
66% of REPORTER duplicates of fixed reports were created after a patch was attached

most REPORTER reports identified issues that were already known, already patched
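
The first and third duplicate percentages are ratios over two different populations: all duplicates, and only the duplicates of fixed reports. The sketch below shows that distinction on invented records. The dictionary keys and the function name are hypothetical, and the toy data is not from the study.

```python
from datetime import date


def duplicate_stats(duplicates):
    """Summarize duplicate reports.

    Each item is a dict with hypothetical keys:
      'filed'          date the duplicate was created
      'original_fixed' True if the report it duplicates was resolved FIXED
      'first_patch'    date a patch was first attached to the original, or None
    """
    fixed = [d for d in duplicates if d["original_fixed"]]
    after_patch = [
        d for d in fixed
        if d["first_patch"] is not None and d["filed"] > d["first_patch"]
    ]
    return {
        # share of all duplicates that point at a fixed report
        "referred_to_fixed": len(fixed) / len(duplicates) if duplicates else 0.0,
        # share of those duplicates that were filed after a patch existed
        "filed_after_patch": len(after_patch) / len(fixed) if fixed else 0.0,
    }


# Toy records, invented for illustration only.
toy = [
    {"filed": date(2008, 3, 1), "original_fixed": True, "first_patch": date(2008, 2, 1)},
    {"filed": date(2008, 3, 1), "original_fixed": True, "first_patch": date(2008, 4, 1)},
    {"filed": date(2008, 3, 1), "original_fixed": True, "first_patch": date(2008, 1, 15)},
    {"filed": date(2008, 3, 1), "original_fixed": False, "first_patch": None},
]
stats = duplicate_stats(toy)
```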

SLIDE 46

[Chart: # of REPORTER reports by resolution (per 3 months), for duplicate, incomplete, wontfix, worksforme, invalid, and fixed, with release markers Firefox 0.1, Firefox 1, Firefox 2, Firefox 3]

dropped since the alpha version...

SLIDE 47

[Chart: % of REPORTER resolution types (per 3 months), with release markers Firefox 0.1, Firefox 1, Firefox 2, Firefox 3]

likelihood of invalid or incomplete
likelihood of fixed

SLIDE 48

[Chart: % fixed reports by contributor type (core, active, reporter), per 3 months, with release markers Firefox 0.1, Firefox 1, Firefox 2, Firefox 3]

proportion by REPORTERs has dropped since 1.0

SLIDE 49

sampled 1,000 REPORTER reports
selected the most salient expectation cited in each report
classified each as one of the 7 sources
redundant coding agreement was ~75%

sources of expectations: runtime logic specifications standards individual community genre prior

what was the effect of the expectation source in report resolution?
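
The slide does not say how the ~75% coding agreement was measured. Assuming it is simple percent agreement between two coders who labeled the same reports, it could be computed as in the sketch below; the function name and the toy labels are illustrative, not the study's data.

```python
def percent_agreement(coder_a, coder_b):
    """Fraction of items that two coders labeled identically."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must label the same items")
    if not coder_a:
        raise ValueError("need at least one labeled item")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


# Toy labels for four reports, invented for illustration only.
labels_a = ["genre", "prior", "individual", "standards"]
labels_b = ["genre", "prior", "community", "standards"]
agreement = percent_agreement(labels_a, labels_b)  # the coders differ on one of four
```

Percent agreement does not correct for agreement expected by chance, so a chance-corrected statistic would give a lower number on the same labels.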

SLIDE 50

what was the effect of the expectation source on REPORTER report resolutions?

expectation source had a significant association with resolution: χ²(7, N=1000) = 35.8, p < .001
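
The reported statistic is a chi-square test of association on a contingency table. As a sketch of what that computation involves, the function below derives the Pearson chi-square statistic and its degrees of freedom from observed counts. The function name and the 2x2 toy table are assumptions for illustration; the counts are not the study's, and no p-value is computed here.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    `table` is a list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected count if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Toy counts, invented for illustration: rows are two expectation sources,
# columns are fixed vs. not fixed.
toy_table = [[10, 20], [20, 10]]
stat, dof = chi_square(toy_table)
```

A larger statistic at a given number of degrees of freedom means the observed counts sit further from what independence would predict.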

SLIDE 51

what was the effect of the expectation source on REPORTER report resolutions?

genre and individual expectations least likely to be fixed

SLIDE 52

what was the effect of the expectation source on REPORTER report resolutions?

multiple reporters filing duplicates from any source of expectation sometimes led to fixes

SLIDE 53

1. who contributes bug reports?
   mostly non-developer, one-time contributors who were active pre-release (REPORTERS)
2. what do they write about?
   expectations from developer and user communities of varying population scope
3. what are the outcomes of their reports?
   mostly not fixed, unless grounded in community expectations or in large numbers
4. why do their reports have these outcomes?

SLIDE 55

4. why do their reports have these outcomes?

a qualitative analysis of reports with REPORTER and USER comments

100 fixed
100 incomplete
100 invalid
100 worksforme
100 duplicate
100 wontfix
+ 40 reports with USER comments (5% of all reports)

SLIDE 56

fixed reports (13% of REPORTER reports)

terse, productive
an obvious shared understanding of process
usually a single REPORTER followed by a patch
some involved diagnosis by the REPORTER before a patch could be written

SLIDE 57

incomplete, invalid, worksforme reports (38% of REPORTER reports)

problems were ambiguous and ego-centric

1/3 identified issues already resolved in a recent build:
“have you tried the latest nightly build?”

2/3 identified problems caused by exotic configurations:
“so i trashed the preferences and all was fine again. thank you all for your time. everyday mozilla is getting better, thank to people like you!” (104347:7)

SLIDE 58

duplicate reports (42% of REPORTER reports)

most were about widely experienced problems with nightly builds
88/100 were marked duplicate on the same day
most had only 1 comment, reminding reporters to check for duplicates
only 12/100 had attachments, such as logs and screenshots

SLIDE 59

wontfix reports (3% of REPORTER reports)

narrow expert feature requests:
“it would be nice if I could...”

half explained the resolution, saying the feature was not broadly useful to “regular” users
the rest were denied because the request was supported through other means (e.g., plug-ins)

some REPORTERs expressed frustration:
“If you don't change Thunderbird, then Firefox on Mac must be changed, it must be done the same way.” (383036:3)

SLIDE 60

reports with USER comments (5% of all reports)

regarded contentious Firefox design choices: bookmarks, location bar, file handling, keyboard shortcuts, tabs, security, history

most REPORTER and USER contributions expressed
agreement (“me too!”)
frustration (“this is ridiculous!”)
confusion about the process (“why was it closed?”)

SLIDE 61

contentious reports with USERs (5% of all reports, 40 sampled)

REPORTERS trivialized engineering work

“Forcing my users to retype the filename (presuming they even know what it should be) is just plain oppressive, IMHO... The organizations (large choral groups) for which I'm creating sites use the right-click save extensively to download (instead of play in their browsers) audio files for rehearsing music. ... It's just *code*, guys. Figure it out.” (299372:49:reporter)

SLIDE 62

contentious reports with USERs (5% of all reports, 40 sampled)

REPORTERS viewed it as a service

“Over two months ago I gave complete information on when and how I got the error AND spent a great deal of time isolating the messages that caused it ... Which part of that is just saying "me too"? For crying out loud, I'm a nursing student, not a programmer. Do you do your own x-rays before going to the doctor?” (252697:10:user)

SLIDE 63

contentious reports with USERs (5% of all reports, 40 sampled)

REPORTERS expected service for payment

“Mozilla "Foundation", you have cash, you have the resources. FIX IT. PEOPLE DONATED MONEY TO HAVE YOU *FIX THIS KIND-OF SHIT*...This is the *EXACT* sort-of situation that shows why open-source *FAILS*.”

SLIDE 64

contentious reports with USERs (5% of all reports, 40 sampled)

developers tried to explain the process

“OK, calm down everyone. How many times do I have to say it? It's what we want to try out to start with. That is not code for "we've made a decision" or "your arguments all suck and we're going to ignore them"... That's what the trunk is for. Experimentation. We want to do a UI experiment.”

SLIDE 65

1. who contributes bug reports?
   mostly non-developer, one-time contributors who were active pre-release (REPORTERS)
2. what do they write about?
   expectations from developer and user communities of varying population scope
3. what are the outcomes of their reports?
   mostly not fixed, unless grounded in community expectations or in large numbers
4. why do their reports have these outcomes?
   REPORTERS understood little about the context of resolution decisions, leading to missing information and egocentric requests

SLIDE 66

how well did this dialog between users and developers work?

SLIDE 67

the ~8,000 successful REPORTERs act like closed source beta testers

most active before a release
effective at writing good reports
trained over several years

SLIDE 68

✔ the ~8,000 successful REPORTERs act like closed source beta testers

most active before a release
effective at writing good reports
trained over several years

✕ all other REPORTERs were far less effective

most of their issues should have been triaged by tech support
misunderstandings about process sometimes led to friction

SLIDE 69

bug reporting is a skill

open bug reporting communities appear to cultivate this skill
few contributors acquire these skills, perhaps because of early negative experiences
open bug reporting is an ineffective place to gather user feedback

✔ ✕ ✕

SLIDE 70

future work

do these trends occur in non-FLOSS projects via user feedback?
do explicitly user-centered software teams treat user feedback differently?
are there better ways of gathering and aggregating user feedback from help forums, new help tools?
what kind of data would help teams interpret user feedback when triaging bugs?

+ Paul Li