Mining the Mind, Minding the Mine
Grand Challenges in Comprehension and Mining Andy J. Ko, Ph.D.
Andrew J. Ko
Interdisciplinarity: drawing upon two or more branches of knowledge
disciplines
testing skills?
methods from human-computer interaction?
Come to my Most Influential Paper award talk at ICSE on Friday: The Whyline
About me: Associate Professor at the UW Information School (plus 4 years at AnswerDash, a startup I co-founded)
failures at scale?
triage evidence-based?
My history with comprehension and mining
attended my first IWPC in 2003 (Portland, OR, USA)
downloaded my first dump of the Linux, Apache, and Firefox bug repositories
haven’t ever attended MSR!
communities, I’m going to identify weaknesses in each community
same weaknesses.
even greater, we must surface our disciplinary shortcomings.
software engineering more effective, efficient, enjoyable, and successful
discovery, of new tools, processes, insights
(methods), and what we believe will make a difference (phenomena)
program comprehension
developer’s program comprehension
and weaknesses of comprehension tools
foo(); bar(); baz(); foo(); bar(); baz(); foo(); bar(); baz(); foo(); bar(); baz();
process, method, architecture, domain, defects, debt
Two sides of the same phenomenon
Comprehension: perception, cognition, decisions, collaboration, contexts
foo(); bar(); baz(); bar(); foo(); baz();
Mining: code, commits, issues, dependencies, defects
Comprehension = better decisions
Comprehension: perception, cognition, decisions, collaboration, contexts
comprehension
streamline collaboration
theories of comprehension that support design and education
Mining: code, commits, issues, dependencies, defects
software process
software analytics
developers’ understanding of complex systems
developers’ processes
from the other to be valuable
human subjects
ecologically valid contexts to strongly support their claims
Developers like about Static Analysis.” ICPC ’18.
a static analysis tool
about the tool
because it relied on retrospective self-report
ground truth contexts in which to test hypotheses
actually used repositories—just not to understand program comprehension.
comprehension of code
comprehension
comprehension of API semantics, who then write brittle code
don’t understand which calls are asynchronous, which leads to code that seems correct with shallow testing
semantic facts are by counting the number of Stack Overflow questions about that API
Software Projects: Atoms of Confusion in the Wild.” MSR 2018
patterns and bug-fix commits
program comprehension
productivity, and other outcomes
feasibility, correctness, coverage, accuracy of tools
intended for developers, only one evaluated usefulness
questions about how these tools would be used by developers, managers, and teams to actually improve software engineering.
mining tools untested.
Traceability Information to Improve Bug Localization” MSR 2018.
localization!
useful to developers in comprehending, localizing, or repairing defects.
developers on real teams
and teams
used to impact software engineering practice
evaluating how they do and do not support software engineering
use data to make decisions, they use it to confirm prior beliefs
predictions as evidence of their prior knowledge about components, and see little actionable insight
insights from the data
predictions will be viewed as useless
valuable to developers and managers
achieve usefulness
discussed is a perfect example of a human subjects study of developers’ perception of value
were valuable, which rules were not, and why
evaluate tools with people.
LaToza and Margaret Burnett and I have written down many of these skills for you.
Ko, A. J., LaToza, T. D., & Burnett, M. M. (2015). A practical guide to controlled experiments of software engineering tools with human participants. Empirical Software Engineering, 20(1), 110-141.
strategies, effects of tools; few explain.
and trends; few explain.
informal theories that informed tool or empirical study design,
phenomena in software engineering (e.g., comprehension, process, coordination, defects)
fail, why decisions are poor, why projects are late, etc.
hypotheses, and test them in the lab and the field, with developers and with data.
communicate greater truths to industry about software
Coordination (STTC) (Herbsleb 2016).
constraints imposed by technical dependencies
coordination requirements determined by these constraints with actual coordination between individuals.
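One way to make the matching of coordination requirements against actual coordination concrete is a simple congruence ratio: of the developer pairs whose work is technically dependent, what fraction actually coordinated? The sketch below is illustrative only and is not the formulation from the cited theory. The function names, the task and developer names, and the choice to score an empty requirement set as 1.0 are all assumptions made for this example.

```python
def coordination_requirements(task_owner, task_dependencies):
    """Return the developer pairs that need to coordinate.

    task_owner: hypothetical mapping from a unit of code to its developer.
    task_dependencies: hypothetical list of (unit, unit) technical dependencies.
    """
    required = set()
    for a, b in task_dependencies:
        dev_a, dev_b = task_owner[a], task_owner[b]
        if dev_a != dev_b:  # a developer needs no coordination with themselves
            required.add(frozenset((dev_a, dev_b)))
    return required


def congruence(required, actual):
    """Fraction of required coordination pairs that actually coordinated."""
    if not required:
        return 1.0  # assumption: nothing required means nothing is unmet
    return len(required & actual) / len(required)


# Hypothetical data: foo() depends on bar(), and bar() depends on baz().
task_owner = {"foo": "alice", "bar": "bob", "baz": "carol"}
task_dependencies = [("foo", "bar"), ("bar", "baz")]

# Observed coordination, e.g. mined from issue comments or chat logs.
actual = {frozenset(("alice", "bob"))}

required = coordination_requirements(task_owner, task_dependencies)
score = congruence(required, actual)
print(score)  # 0.5
```

Here two pairs need to coordinate (alice and bob, bob and carol) but only one pair did, so the score is 0.5. Note that the dependencies come from mining, while whether a logged exchange counts as real coordination is a comprehension question.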
foo() and bar() to coordinate the dependency.
social and technical constraints causes defects and delays by limiting the information that developers have for decision making.
resolve modification requests
increases in software failures over time
coordination should be attempting to falsify this theory:
greater whole, explaining the work of software engineering
TeX” (1989) is one of my favorite qualitative empirical studies from SE
study of defects
theory of how defects arise in practice
more basic research on human error (Reason 1990), which I adapted into a theory of defects
framework and methodology for studying the causes of software errors in programming
Languages & Computing, 16(1-2), 41-84.
in code completion)
increment template for a decrement problem)
knowledge of an API’s expressiveness)
human behavior in a driverless car context)
When those loops deviate from routine, developers will
theory, we may produce a grand theory of where all defects come from
process, and tools
that developers and teams need help localizing defects and prioritizing testing
empirical or technical should advance or falsify a theory about software engineering
framework in which to combine our individual discoveries into greater truths
(Lo, Nagappan, Zimmermann 2015)
not generalizable, or too costly
and what we cite in research papers
Ko, A. J. (2017, May). A three-year participant observation of software startup software evolution. ICSE, SEIP.
We can answer these, but we need both comprehension and mining
process will help?
developers faster?
faster?
decisions with business priorities?
in the field if no one reports it?
scientists, management scientists, and learning scientists
together interdisciplinary teams to answer these big questions
aren’t productive for months.
2008) that suggest organizational management theories of “newcomer socialization” best explain learning needs
behavior, connections to expertise about architecture, code review practices, norms about meetings, walkthroughs of feature implementations, and much more
which a new hire meets with a developer to learn about:
architectural walkthrough tools that situate features in architectures, provide rationale, reveal business goals, and surface practices and norms around process
information in a feature interview?
produce effective comprehension of architecture?
such a walkthrough?
questions that go well beyond reading code.
comprehending architectural rationale?
predict what code is necessary to discuss?
is qualified to author a feature walkthrough?
and mining that go well beyond repositories.
(organizations, learning, teaching, business decisions)
interdisciplinary expertise
empirical contributions we typically value in software engineering research
learning, designs in HCI, strategies in computing education
than the next paper or promotion
years before we have progress worth reporting
and longer
must value other forms of scholarship (theory, development of instruments, etc.)
HCI, organizational science, management science, cognitive psychology, social psychology, and others are explaining the software engineering phenomena that our field is investigating. We should know what they’ve discovered, and build upon it.
Visit local meetups. Talk to them about what’s hard about their jobs. Discover what questions they have. You’ll be surprised how little their needs align with our questions.
You have more in common than you think. Use this week to find a shared project.
become irrelevant
software engineering matter not only to CS, but software engineering practice
software development:
help, but we put most of our attention on a few specific safety-critical domains
Summary
Andy J. Ko, Ph.D.