When Not to Comment: Questions and Tradeoffs with API Documentation



slide-1
SLIDE 1

When Not to Comment

Questions and Tradeoffs with API Documentation for C++ Projects

Andrew Head, Caitlin Sadowski, Emerson Murphy-Hill, Andrea Knight Google, UC Berkeley, NC State University

slide-2
SLIDE 2

Developers Use APIs

std::string s = absl::FormatTime(
    "My flight lands in Göteborg on %Y-%m-%d at %H:%M",
    landing, timezone);

A programmer calls the function FormatTime from the C++ absl API. Programmers call APIs constantly to save time, keep code consistent, etc.
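
For readers without Abseil at hand, the same kind of call can be tried with the C++ standard library alone. This is only a rough stand-in, not `absl::FormatTime`: the helper name `FormatUtc` is hypothetical, it always formats in UTC, and it accepts a plain `std::time_t` rather than an `absl::Time` and time zone.

```cpp
#include <cassert>
#include <ctime>
#include <string>

// FormatUtc (hypothetical helper, standard library only).
// Formats `when` (seconds since the Unix epoch) in UTC using a
// strftime-style pattern. Returns an empty string if the time cannot be
// converted or the result does not fit in the internal buffer.
std::string FormatUtc(const char* format, std::time_t when) {
  assert(format != nullptr);
  const std::tm* parts = std::gmtime(&when);
  if (parts == nullptr) return std::string();
  char buffer[128];
  const std::size_t n = std::strftime(buffer, sizeof(buffer), format, parts);
  return std::string(buffer, n);
}
```

Even this small wrapper raises the kinds of questions the talk studies: what happens when the output is too long, and which time zone is used.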

slide-3
SLIDE 3

Writing API Documentation

To help developers use APIs, tech writers and maintainers decide when and how to describe:

Behavior
    // Formats the given `absl::Time`...
    std::string FormatTime(const std::string& format, ...);

Usage
    // std::string f = absl::FormatTime("%H:%M:%S", ...

... and best practices, special cases, design rationale, etc.

slide-4
SLIDE 4

Is the documentation answering the right questions? ... We don't know... How can we know?

A Dilemma with Designing Docs

What methods can we use to collect developer questions about API documentation?

slide-5
SLIDE 5

Our Research Questions

  • Q1. Are C++ API header comments answering developers' questions?
  • Q2. Why might answers be missing from the headers?
  • Q3. When does it matter that comments are missing?
slide-6
SLIDE 6
  • Unanswered API Questions. 9 types of questions about low-level usage, high-level usage, and implementation.
  • Why are comments missing? Resistance to updating comments for abandoned or young projects, or concerns about bloat and confusion.
  • When do comments matter? If answers can't be recovered from code, and if developers trust comments.

Findings

slide-7
SLIDE 7

  • Bug reports? Infrequently submitted for docs.
  • Survey? Developers forget their questions.
  • Observation? Time-consuming.

Challenges to Finding API Questions

slide-8
SLIDE 8

When to Prompt API Clients for Questions

slide-9
SLIDE 9

A header file (time.h)

// FormatTime
//
// Formats the given `absl::Time`...
// provided format std::string. U...
// the following extensions:
//
std::string FormatTime(
    const std::string& format, ...);

When to Prompt API Clients for Questions

slide-10
SLIDE 10

A header file (time.h)

// FormatTime
//
// Formats the given `absl::Time`...
// provided format std::string. U...
// the following extensions:
//
std::string FormatTime(
    const std::string& format, ...);

Method signature

When to Prompt API Clients for Questions

slide-11
SLIDE 11

A header file (time.h)

// FormatTime
//
// Formats the given `absl::Time`...
// provided format std::string. U...
// the following extensions:
//
std::string FormatTime(
    const std::string& format, ...);

Comments, low-level usage documentation

When to Prompt API Clients for Questions

slide-12
SLIDE 12

A header file (time.h)

// FormatTime
//
// Formats the given `absl::Time`...
// provided format std::string. U...
// the following extensions:
//
std::string FormatTime(
    const std::string& format, ...);

An implementation file (time.cc)

std::string FormatTime(const std::string& format, ...
  if (t == absl::InfiniteFuture()) return kInfiniteFutureStr;
  if (t == absl::InfinitePast()) return kInfinitePastStr;
  const auto parts = Split(t);
  return cctz::detail::format(format, parts.sec, parts.fem,
                              cctz::time_zone(tz));
}

When to Prompt API Clients for Questions

slide-13
SLIDE 13

A header file (time.h)

// FormatTime
//
// Formats the given `absl::Time`...
// provided format std::string. U...
// the following extensions:
//
std::string FormatTime(
    const std::string& format, ...);

An implementation file (time.cc)

std::string FormatTime(const std::string& format, ...
  if (t == absl::InfiniteFuture()) return kInfiniteFutureStr;
  if (t == absl::InfinitePast()) return kInfinitePastStr;
  const auto parts = Split(t);
  return cctz::detail::format(format, parts.sec, parts.fem,
                              cctz::time_zone(tz));
}

When to Prompt API Clients for Questions

slide-14
SLIDE 14

A header file (time.h)

// FormatTime
//
// Formats the given `absl::Time`...
// provided format std::string. U...
// the following extensions:
//
std::string FormatTime(
    const std::string& format, ...);

An implementation file (time.cc)

std::string FormatTime(const std::string& format, ...
  if (t == absl::InfiniteFuture()) return kInfiniteFutureStr;
  if (t == absl::InfinitePast()) return kInfinitePastStr;
  const auto parts = Split(t);
  return cctz::detail::format(format, parts.sec, parts.fem,
                              cctz::time_zone(tz));
}

This transition sometimes indicates an API question.

When to Prompt API Clients for Questions

slide-15
SLIDE 15

time.h Files std::string FormatTime(const std::string& format, Time t, TimeZone tz);

Code Search interface

Prompting for API Questions in Code Search

slide-16
SLIDE 16

time.h Files std::string FormatTime(const std::string& format, Time t, TimeZone tz);

Code Search interface

Prompting for API Questions in Code Search

Query for code

slide-17
SLIDE 17

time.h Files std::string FormatTime(const std::string& format, Time t, TimeZone tz);

Code Search interface

Prompting for API Questions in Code Search

Navigate files

slide-18
SLIDE 18

time.h Files std::string FormatTime(const std::string& format, Time t, TimeZone tz);

Code Search interface

Prompting for API Questions in Code Search

Inspect Code

slide-19
SLIDE 19

time.h Files std::string FormatTime(const std::string& format, Time t, TimeZone tz);

Code Search interface

Prompting for API Questions in Code Search

slide-20
SLIDE 20

time.h Files std::string FormatTime(const std::string& format, Time t, TimeZone tz);

Which best describes the information you're looking for?

After a header-to-implementation transition, Code Search asked if a searcher had API questions.

Code Search interface

Prompting for API Questions in Code Search

slide-21
SLIDE 21

time.h Files std::string FormatTime(const std::string& format, Time t, TimeZone tz);

Which best describes the information you're looking for?

If API question...
  • What question are you trying to answer about this API?
  • What .cc file are you looking at?
  • What would be the most convenient location for this information?

Code Search interface

Prompting for API Questions in Code Search
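
The branching in this prompt can be sketched as a small function: every respondent gets the first question, and only those who say they have an API question see the follow-ups. The enum `Purpose` and the function `FollowUpPrompts` are hypothetical names chosen for illustration; the slides do not show how Code Search implemented the survey, and the non-API answer options are collapsed here into a single `kOther` value.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Answer to "Which best describes the information you're looking for?"
// (hypothetical enum; the real answer options are not modeled here).
enum class Purpose { kApiUsage, kOther };

// Returns the follow-up prompts for a respondent. The prompt wording is
// quoted from the slide; follow-ups are shown only for API usage questions.
std::vector<std::string> FollowUpPrompts(Purpose purpose) {
  if (purpose != Purpose::kApiUsage) return {};
  return {
      "What question are you trying to answer about this API?",
      "What .cc file are you looking at?",
      "What would be the most convenient location for this information?",
  };
}
```

Keeping the first question short and gating the rest is one way to limit the interruption for searchers who opened the implementation for other reasons.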

slide-22
SLIDE 22

Benefits and Limitations of "Header-to-Implementation" Detection

+ Timely: Captures ephemeral questions.
+ Scalable: Deployable within search infrastructure, and can be run on search logs.
- Imperfect: Needs developer input to confirm the transition was about API usage.
- Incomplete: Currently only covers low-level documentation in header files.

slide-23
SLIDE 23

Monitor Search Behavior

Time      Path
8:00:30   time/clock.h
12:00:00  time/time.h
12:00:10  time/time.cc

Survey in Code Search
  • 1,147 respondents
  • 60 API usage questions (full C++ code base)

Interview Searchers
  • What were you looking for? How?
  • 18 searchers

Interview Maintainers
  • Should your docs answer this question?
  • 8 maintainers

Mixed Methods Study Design
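
A minimal sketch of how a header-to-implementation transition could be spotted in a log of file visits like the one on this slide. Everything here is an assumption made for illustration: the `Visit` record, the rule that the `.h` and `.cc` files must share a path stem, and the `max_gap_seconds` threshold. The slides do not state the actual rule Code Search used.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One file view in a search log: seconds since midnight plus the file path.
struct Visit {
  long seconds;
  std::string path;
};

// Returns the path without its extension, e.g. "time/time.h" -> "time/time".
std::string Stem(const std::string& path) {
  const std::size_t dot = path.rfind('.');
  const std::size_t slash = path.rfind('/');
  if (dot == std::string::npos) return path;
  if (slash != std::string::npos && dot < slash) return path;
  return path.substr(0, dot);
}

bool EndsWith(const std::string& s, const std::string& suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns the indices of visits to a .cc file that directly follow a visit
// to the matching .h file within `max_gap_seconds` (hypothetical threshold).
std::vector<std::size_t> FindHeaderToImplTransitions(
    const std::vector<Visit>& visits, long max_gap_seconds) {
  assert(max_gap_seconds >= 0);
  std::vector<std::size_t> hits;
  for (std::size_t i = 1; i < visits.size(); ++i) {
    const Visit& prev = visits[i - 1];
    const Visit& cur = visits[i];
    const bool matching_pair = EndsWith(prev.path, ".h") &&
                               EndsWith(cur.path, ".cc") &&
                               Stem(prev.path) == Stem(cur.path);
    const long gap = cur.seconds - prev.seconds;
    if (matching_pair && gap >= 0 && gap <= max_gap_seconds) {
      hits.push_back(i);
    }
  }
  return hits;
}
```

On the three log rows above and a 60-second window, only the move from time/time.h to time/time.cc is flagged. As the talk notes, such a hit is only a candidate: the developer still has to confirm that the visit was about API usage.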

slide-24
SLIDE 24

Qualitative Analysis

  • API Questions: Card-sorting (2 authors)
  • Interviews: Verbatim transcription, open and axial coding of themes (1 author, checked by another author)

slide-25
SLIDE 25
  • Q1. Are C++ API header comments answering developers' questions?
  • Q2. Why might answers be missing from the headers?
  • Q3. When does it matter that comments are missing?

Closer Look at Results

slide-26
SLIDE 26

Why Developers Visited Implementation

[Bar chart: % of respondents (axis 0% to 80%) by reason for visiting the implementation: behavior implementation, where to make change, non-functional API details, who's working on the code, API usage.]

API usage: 5% (60 / 1,147 responses)

  • Q1. API questions
slide-27
SLIDE 27

Sample: Collected API Questions

“What does the return value mean and how can this method fail”
“What method to use to convert the current timestamp into a string”
“How this API passes data to TensorFlow session run calls in C”

... and 50+ others

  • Q1. API questions

slide-28
SLIDE 28

Types of API Questions

[Bar chart: # respondents (axis ticks 5, 10, 15) per question type: Input Values, Return Values, How Do I...?, Recommended Use, Hidden Contracts, Implementation Details, Side Effects, Extension Points, Inconsistency.]

  • Q1. API questions
slide-29
SLIDE 29

Types of API Questions

[Bar chart: # respondents (axis ticks 5, 10, 15) per question type: Input Values, Return Values, How Do I...?, Recommended Use, Hidden Contracts, Implementation Details, Side Effects, Extension Points, Inconsistency. Brackets group the types into Low-Level Usage, High-Level Usage, and Implementation.]

  • Q1. API questions
slide-30
SLIDE 30
  • Q1. Do the header comments answer the right questions?

Clearly not always. We collected 60 cases where developers opened implementation code to check on API usage.

Writers should consider at least 9 types of questions. Most of these aren't reported in past studies.
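
As an illustration of what answering several of these question types in a header can look like, here is a sketch of a documented declaration. The function `ParsePort` is hypothetical and not from the talk; the comment labels reuse question types named on the earlier slides (input values, return values, side effects, recommended use).

```cpp
#include <cassert>
#include <string>

// ParsePort()  (hypothetical API, for illustration only)
//
// Parses `text` as a decimal TCP port number.
//
// Input values: `text` must consist only of the ASCII digits 0-9 and be 1 to
//   5 characters long. Whitespace, signs, and hex prefixes are rejected.
//   Leading zeros are accepted ("0080" parses as 80).
// Return values: the port in the range [1, 65535], or -1 if `text` is
//   malformed or out of range.
// Side effects: none. The function does not throw and keeps no state.
// Recommended use: validate user-supplied configuration before opening a
//   socket; check for -1 before using the result.
//
// Usage:
//   int port = ParsePort("8080");
//   if (port == -1) { /* report a configuration error */ }
int ParsePort(const std::string& text) {
  if (text.empty() || text.size() > 5) return -1;
  int value = 0;
  for (char c : text) {
    if (c < '0' || c > '9') return -1;
    value = value * 10 + (c - '0');
  }
  assert(value <= 99999);  // At most five digits were accumulated.
  if (value < 1 || value > 65535) return -1;
  return value;
}
```

The comment is several times longer than the signature, which is exactly the tradeoff between completeness and bloat that maintainers raise later in the talk.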

slide-31
SLIDE 31
  • Q2. Why are comments missing?

Maintainer Interviews

Should your docs answer this question?

7 maintainers

(+1 C++ core libraries maintainer)

Code Search Click Event Analysis + Manual Verification

2 questions 3 questions 1 question

Offline Code Search log analysis

Searcher Interviews
  • What were you looking for? How? Other cases of missing comments?
  • 18 developers

slide-32
SLIDE 32
  • Q2. Why are comments missing?

Reason 1: Not the Right Time

slide-33
SLIDE 33
  • Q2. Why are comments missing?

Reason 1: Not the Right Time

Too late. "It’s unlikely this will ever get changed again... ostensibly it’s my team that’s responsible for it, but... if you didn’t schedule this meeting I would have forgotten this file existed."

slide-34
SLIDE 34
  • Q2. Why are comments missing?

Reason 1: Not the Right Time

Too late. "It’s unlikely this will ever get changed again... ostensibly it’s my team that’s responsible for it, but... if you didn’t schedule this meeting I would have forgotten this file existed."

Too early. Reluctance to invest in comments when the current focus was evolving and fixing the code.

slide-35
SLIDE 35
  • Q2. Why are comments missing?

Reason 2: Minimal Explanations

slide-36
SLIDE 36
  • Q2. Why are comments missing?

Reason 2: Minimal Explanations

Avoiding bloat. "How often do you want to go into details, which can be easily too much?"

slide-37
SLIDE 37
  • Q2. Why are comments missing?

Reason 2: Minimal Explanations

Avoiding bloat. "How often do you want to go into details, which can be easily too much?"

Avoiding misunderstanding. "...if you say something is slow, you’ll get people writing alternatives first of all, or not using it... "

slide-38
SLIDE 38
  • Q2. Why might answers be missing from the headers?

The project may be abandoned, too young, or maintainers believe answers could add bloat or confusion.

slide-39
SLIDE 39

Survey Respondents Preferred Answers in Headers

[Bar chart: # API questions from survey (axis ticks 5, 10, 15) per question type: Input Values, Return Values, How Do I...?, Recommended Use, Hidden Contracts, Implementation Details, Side Effects, Extension Points, Inconsistency.]

  • Q3. When do comments matter?
slide-40
SLIDE 40

Survey Respondents Preferred Answers in Headers

[Bar chart: # API questions from survey (axis ticks 5, 10, 15) per question type: Input Values, Return Values, How Do I...?, Recommended Use, Hidden Contracts, Implementation Details, Side Effects, Extension Points, Inconsistency. Legend for preferred answer location: .h, .cc, g3doc.]

  • Q3. When do comments matter?

In 61.7% of cases, it would have been most convenient to find an answer in a header, even for some high-level usage questions and implementation questions.

slide-41
SLIDE 41

When Comments Could Help Interviewees

  • Q3. When do comments matter?
  • To avoid involved code inspection. 2 / 6 interviewees with API questions searched through multiple files, one of whom gave up.
  • To understand recommended use. Prototyping "messy code" by looking at an API's implementation, then "making it clean" by looking in its comments.

slide-42
SLIDE 42

When Comments Wouldn't Have Helped

  • Q3. When do comments matter?
  • "... I have stopped reading comments, because the comments are just lies."
  • One implementation visit happened because "it was actually documented properly, but I didn’t believe it."

Interviewees often didn't trust comments, and sometimes skipped them, or disregarded them after reading them.

slide-43
SLIDE 43

Trust Depends on Project

  • Q3. When do comments matter?

“... there’s those sorts of [general utilities], and those tend to be very well documented. And then there’s the team-specific internal code, which is all very horribly documented.”

slide-44
SLIDE 44
  • Q3. When does it matter that comments are missing?

When answers can't easily be recovered from code, and when developers trust comments (which isn't always).

slide-45
SLIDE 45

Takeaways

  • Methods. Piloted a technique that collects API questions professional software developers ask.
  • API Questions. Revealed 9 types of questions developers asked about APIs when opening implementation code.
  • Stakes. Helped document costs, benefits, and factors influencing whether maintainers will and should update documentation comments.

slide-46
SLIDE 46

Looking Forward

  • Maintainers should refer to the questions developers ask about APIs when writing documentation.
  • Stakeholders should consider relative gain and barriers to updates when choosing where to answer API questions.
  • Others should extend our methods to gain insight into questions developers ask during their work, and design tools and artifacts that provide the answers.

slide-47
SLIDE 47

Read the paper at https://tinyurl.com/icse18-comment