SLIDE 1

Bimodal Software Documentation

Christoph Treude

SLIDE 2

University of Adelaide

Software Documentation [1985]

SLIDE 9

Software Documentation is everywhere

[C. Parnin and C. Treude. Measuring API Documentation on the Web. Web2SE ’11: 2nd Int’l. Workshop on Web 2.0 for Software Engineering, p. 25-30]

100% 74% 59% 44% 37%

162 different domains in the top 10 for 99 queries

SLIDE 10

Software Documentation is everywhere

[C. Parnin and C. Treude. Measuring API Documentation on the Web. Web2SE ’11: 2nd Int’l. Workshop on Web 2.0 for Software Engineering, p. 25-30]

100% 59% 36%

Tensorflow Python API: 309 different domains in the top 10 for 2,192 queries

SLIDE 11

Software Documentation is everywhere

[C. Parnin and C. Treude. Measuring API Documentation on the Web. Web2SE ’11: 2nd Int’l. Workshop on Web 2.0 for Software Engineering, p. 25-30]

jQuery Event API: 75 different domains in the top 10 for 57 queries

100% 59% 36%

Tensorflow Python API: 309 different domains in the top 10 for 2,192 queries

100% 100% 98%
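The "different domains in the top 10" figures can be reproduced in spirit with a few lines of code. The following is a minimal sketch, not the procedure used in the cited study: the function name `distinct_domains`, the `www.` normalization, and the placeholder URLs are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def distinct_domains(results_by_query, top_n=10):
    """Collect the distinct host names among the first top_n result URLs of each query."""
    domains = set()
    for urls in results_by_query.values():
        for url in urls[:top_n]:
            host = urlparse(url).netloc.lower()
            # Assumption: treat "www.example.com" and "example.com" as the same domain.
            if host.startswith("www."):
                host = host[4:]
            if host:
                domains.add(host)
    return domains


# Made-up search results for two made-up queries (placeholder domains only).
results = {
    "click event": [
        "https://docs.example.org/click",
        "https://forum.example.com/q/1",
        "https://www.tutorials.example.net/click",
    ],
    "hover event": [
        "https://docs.example.org/hover",
        "https://forum.example.com/q/2",
    ],
}

domains = distinct_domains(results)
print(len(domains), "different domains in the top 10 for", len(results), "queries")
```

Grouping by full host name is one design choice among several; grouping by registered domain instead would merge subdomains and give smaller counts.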

SLIDE 13

Navigating documentation is not trivial

[Figure: "Common Tasks" list of eight links]

SLIDE 14

Extracting tasks from documentation

[Figure labels: verb, noun, adjective]

[C. Treude, M. P. Robillard, and B. Dagenais. Extracting Development Tasks to Navigate Software Documentation. IEEE Trans. on Software Engineering, 41, 6, p. 565-581]

SLIDE 15

Grammatical dependencies

direct object: generate confirmation

direct object: generate receipt

[C. Treude, M. P. Robillard, and B. Dagenais. Extracting Development Tasks to Navigate Software Documentation. IEEE Trans. on Software Engineering, 41, 6, p. 565-581]

SLIDE 18

Grammatical dependencies

passive nominal subject: set size

adjective modifier: set thumbnail size

preposition: set thumbnail size in templates

[C. Treude, M. P. Robillard, and B. Dagenais. Extracting Development Tasks to Navigate Software Documentation. IEEE Trans. on Software Engineering, 41, 6, p. 565-581]
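The way a task phrase grows as more grammatical dependencies are taken into account can be sketched as follows. This is an illustrative sketch only, not the extraction approach of the cited paper: the dependency triples are written by hand rather than produced by a parser, and the relation labels (`dobj`, `nsubjpass`, `amod`, `prep_in`) and the function name `extract_tasks` are assumptions.

```python
# Relations whose dependent is treated as the object of the task verb (assumed labels).
OBJECT_RELATIONS = {"dobj", "nsubjpass"}


def extract_tasks(dependencies):
    """Build task phrases from (relation, head, dependent) triples."""
    modifiers = {}      # noun -> modifiers that precede it
    prepositions = {}   # verb -> (preposition, noun) pairs that follow the object
    objects = []        # (verb, noun) pairs that seed a task

    for relation, head, dependent in dependencies:
        if relation in OBJECT_RELATIONS:
            objects.append((head, dependent))
        elif relation == "amod":
            modifiers.setdefault(head, []).append(dependent)
        elif relation.startswith("prep_"):
            preposition = relation[len("prep_"):]
            prepositions.setdefault(head, []).append((preposition, dependent))

    tasks = []
    for verb, noun in objects:
        words = [verb, *modifiers.get(noun, []), noun]
        for preposition, target in prepositions.get(verb, []):
            words += [preposition, target]
        tasks.append(" ".join(words))
    return tasks


# Hand-written triples mirroring the examples on the slides.
print(extract_tasks([
    ("nsubjpass", "set", "size"),
    ("amod", "size", "thumbnail"),
    ("prep_in", "set", "templates"),
]))
print(extract_tasks([
    ("dobj", "generate", "confirmation"),
    ("dobj", "generate", "receipt"),
]))
```

Each additional relation only extends the phrase seeded by a verb and its object, which is why the passive subject alone yields "set size" and the full set of triples yields the longer task.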

SLIDE 19

[C. Treude, M. Sicard, M. Klocke, and M. P. Robillard. TaskNav: Task-based Navigation of Software Documentation. ICSE ’15: 37th Int’l. Conf. on Software Engineering, p. 649-652]

SLIDE 20

Software Documentation is everywhere

[C. Parnin and C. Treude. Measuring API Documentation on the Web. Web2SE ’11: 2nd Int’l. Workshop on Web 2.0 for Software Engineering, p. 25-30]

100% 74% 59% 44% 37%

SLIDE 21

[C. Treude and M. P. Robillard. Augmenting API Documentation with Insights from Stack Overflow. ICSE ’16: 38th Int’l. Conference on Software Engineering, p. 392-403]

SLIDE 22

insight sentence: a sentence from Stack Overflow that is related to a particular API type and that provides insight not contained in the API documentation of that type
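The two conditions in this definition (related to an API type, and not already covered by that type's documentation) can be illustrated with a naive filter. This is only a sketch of the definition and not the extraction technique of the cited paper: the substring match on the type name, the Jaccard word-overlap threshold, the type name `Widget`, and all example sentences are assumptions invented for illustration.

```python
import re


def word_set(text):
    """Lower-cased alphanumeric words of a sentence."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def candidate_insights(api_type, so_sentences, doc_sentences, max_overlap=0.5):
    """Keep sentences that mention api_type and overlap little with every documentation sentence."""
    doc_word_sets = [word_set(sentence) for sentence in doc_sentences]
    candidates = []
    for sentence in so_sentences:
        # Condition 1 (assumed proxy): the sentence mentions the type name.
        if api_type.lower() not in sentence.lower():
            continue
        words = word_set(sentence)
        # Condition 2 (assumed proxy): low Jaccard similarity to all documentation sentences.
        best = max(
            (len(words & doc) / len(words | doc) for doc in doc_word_sets if words | doc),
            default=0.0,
        )
        if best <= max_overlap:
            candidates.append(sentence)
    return candidates


# Hypothetical documentation and Stack Overflow sentences for a made-up type.
documentation = ["Widget stores a label and a size."]
posts = [
    "Widget stores a label and a size.",
    "Reusing a Widget across threads caused crashes for me.",
    "I prefer tabs over spaces.",
]
print(candidate_insights("Widget", posts, documentation))
```

Word overlap is a crude stand-in for "insight not contained in the documentation"; a paraphrase of the documentation would pass this filter even though it adds nothing new.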

SLIDE 23

Supervised Insight Sentence Extractor

Augment API documentation with insights from Stack Overflow

SLIDE 24

Bimodal software documentation

[B. A. Campbell and C. Treude. NLP2Code: Code Snippet Content Assist via Natural Language Tasks. ICSME ’17: 33rd Int’l. Conf. on Software Maintenance and Evolution, to appear]

SLIDE 25

Challenges in Analyzing Documentation

  • Software documentation is technical and often contains references to code elements

  • Natural language text written by software developers may not obey all grammatical rules, e.g.,

– sentences that are grammatically incomplete

– content that has not been authored by a native speaker

[F. N. A. Al Omran and C. Treude. Choosing an NLP Library for Analyzing Software Documentation: A Systematic Literature Review and a Series of Experiments. MSR ’17: 14th Int’l. Conf. on Mining Software Repositories, p. 187-197]

SLIDE 26

Comparing NLP libraries

CoreNLP: Returns the C++ variable.

SyntaxNet: Returns the C++ variable.

spaCy: Returns the C++ variable.

NLTK: Returns the C++ variable.

[F. N. A. Al Omran and C. Treude. Choosing an NLP Library for Analyzing Software Documentation: A Systematic Literature Review and a Series of Experiments. MSR ’17: 14th Int’l. Conf. on Mining Software Repositories, p. 187-197]
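A sentence that embeds a code-like term such as "C++" can be split into tokens in more than one way. The sketch below contrasts two hypothetical regex tokenizers on the example sentence. Neither is the tokenizer of any library named on the slide; the function names and patterns are assumptions chosen only to show how the token sequence can differ.

```python
import re

SENTENCE = "Returns the C++ variable."


def naive_tokenize(text):
    """Split into runs of word characters and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def code_aware_tokenize(text):
    """Same as naive_tokenize, but keep a word followed by '++' as one token."""
    return re.findall(r"\w+\+\+|\w+|[^\w\s]", text)


print(naive_tokenize(SENTENCE))       # ['Returns', 'the', 'C', '+', '+', 'variable', '.']
print(code_aware_tokenize(SENTENCE))  # ['Returns', 'the', 'C++', 'variable', '.']
```

Because later analysis steps such as part-of-speech tagging operate on tokens, a difference at this stage changes both the number of tags and what each tag refers to.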

SLIDE 32

Comparing NLP libraries

CoreNLP: tokens "Returns the C + + variable ." tagged "NNS DT NN JJ CC JJ ."

SyntaxNet: tokens "Returns the C++ variable ." tagged "VBZ DT NNP NN ."

spaCy: tokens "Returns the C++ variable ." tagged "VBZ DT NNP NN ."

NLTK: tokens "Returns the C++ variable ." tagged "NNS DT NN JJ ."

  • 1. different tokenization

  • 2. general part of speech

  • 3. specific part of speech

Only between 60% and 71% of tokens from Stack Overflow, GitHub, and the Java API Documentation were assigned the same part-of-speech tag by all four libraries.

[F. N. A. Al Omran and C. Treude. Choosing an NLP Library for Analyzing Software Documentation: A Systematic Literature Review and a Series of Experiments. MSR ’17: 14th Int’l. Conf. on Mining Software Repositories, p. 187-197]
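Agreement between libraries can be made concrete on the single example sentence from this slide. The sketch below is not the measurement procedure of the cited paper: it assumes the tag sequences belong to the libraries in the order they are listed on the slide, compares only tokens that every library produced, and assumes each such token occurs once in the sentence. The function name `tag_agreement` is hypothetical.

```python
# Token/tag pairs transcribed from the slide, assuming the listed library order.
TAGGED = {
    "CoreNLP": [("Returns", "NNS"), ("the", "DT"), ("C", "NN"), ("+", "JJ"),
                ("+", "CC"), ("variable", "JJ"), (".", ".")],
    "SyntaxNet": [("Returns", "VBZ"), ("the", "DT"), ("C++", "NNP"),
                  ("variable", "NN"), (".", ".")],
    "spaCy": [("Returns", "VBZ"), ("the", "DT"), ("C++", "NNP"),
              ("variable", "NN"), (".", ".")],
    "NLTK": [("Returns", "NNS"), ("the", "DT"), ("C++", "NN"),
             ("variable", "JJ"), (".", ".")],
}


def tag_agreement(tagged_by_library):
    """Return the tokens tagged identically by all libraries and their share of the shared tokens."""
    lookups = [dict(pairs) for pairs in tagged_by_library.values()]
    # Only tokens that every library produced can be compared at all.
    shared = set(lookups[0]).intersection(*lookups[1:])
    agreed = sorted(
        token for token in shared
        if len({lookup[token] for lookup in lookups}) == 1
    )
    return agreed, len(agreed) / len(shared)


agreed_tokens, share = tag_agreement(TAGGED)
print(agreed_tokens, share)
```

On this sentence the libraries disagree both at the general level (noun versus verb for "Returns") and at the specific level (NN versus NNP for "C++"), and "C++" cannot be compared across all four at all because one tokenization splits it.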

SLIDE 34

Bimodal software documentation

[Figure labels: tasks, tasks, code]

[B. A. Campbell and C. Treude. NLP2Code: Code Snippet Content Assist via Natural Language Tasks. ICSME ’17: 33rd Int’l. Conf. on Software Maintenance and Evolution, to appear]

SLIDE 35

Code Snippet Content Assist

[Figure labels: tasks, tasks, code]

[B. A. Campbell and C. Treude. NLP2Code: Code Snippet Content Assist via Natural Language Tasks. ICSME ’17: 33rd Int’l. Conf. on Software Maintenance and Evolution, to appear]

SLIDE 36

[B. A. Campbell and C. Treude. NLP2Code: Code Snippet Content Assist via Natural Language Tasks. ICSME ’17: 33rd Int’l. Conf. on Software Maintenance and Evolution, to appear]

SLIDE 39

The integration of natural language and code in documentation creates challenges & opportunities for software engineering tools.

CoreNLP: Returns the C + + variable .

SyntaxNet: Returns the C++ variable .

spaCy: Returns the C++ variable .

NLTK: Returns the C++ variable .

Thank you! christoph.treude@adelaide.edu.au