Using Tools to Assist Identification of Non-Requirements in Requirements Specifications: A Controlled Experiment
REFSQ'18, Utrecht, The Netherlands
Jonas Paul Winkler, Andreas Vogelsang
DCAITI, Technische Universität Berlin
March 20, 2018
Background – Requirements vs Information
- Information: "The intelligent light system is a system that ensures optimal road illumination …"
- Requirement: "The device must respond within 200ms."
Why is this important?
1) Test case creation
2) Document change management
(Figure: test cases are derived from the SRS; the automotive company and its supplier agree on the SRS)
Background – Classifying Requirements
- Explicit labelling of requirements specification content elements at our industry partner ("object type")
- Quality reviews: requirement documents are manually inspected for defects
– Common quality criteria: correct, unambiguous, complete, verifiable…
– Also: correct labelling regarding object type
- Manual labelling is time-consuming and error-prone

Our goal: Assist requirements engineers in verifying correct labelling of requirements and non-requirements
Background – Automatic Classification
- What we did: integrated the classifier into a tool that issues warnings on incorrectly labelled items ("defects")

Winkler, Jonas P.; Vogelsang, Andreas (2016): Automatic Classification of Requirements Based on Convolutional Neural Networks. In: 3rd IEEE International Workshop on Artificial Intelligence for Requirements Engineering (AIRE). Beijing.
(Pipeline: dataset → NN training → trained NN → classify SRS elements)
- Training data: ~10,000 requirements and ~10,000 information elements
- Extracted from various system requirements specifications at our industry partner
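The cited paper trains a convolutional neural network on these labelled elements. As an illustration of the classify-and-warn idea only, here is a minimal pure-Python stand-in that replaces the trained network with a crude modal-verb heuristic; all names and the tiny example data are ours, not from the study:

```python
import re

# Crude stand-in for the trained CNN: modal verbs hint at requirements.
MODAL_PATTERN = re.compile(r"\b(must|shall|should|will)\b", re.IGNORECASE)

def classify(element: str) -> str:
    """Predict the object type of an SRS content element."""
    return "requirement" if MODAL_PATTERN.search(element) else "information"

def warn_on_mislabelled(elements):
    """Return elements whose author-assigned label disagrees with the
    prediction -- the 'defect' warnings the tool would issue."""
    return [text for text, label in elements if classify(text) != label]

srs_excerpt = [
    # (content element, author-assigned object type)
    ("The device must respond within 200ms.", "information"),  # mislabelled
    ("The intelligent light system ensures optimal road illumination.",
     "information"),
]
```

In the actual tool, `classify` is the trained network; the warning mechanism around it stays the same.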
Main question: Does using such a tool provide benefits?
Research Questions
1. Does the usage of our tool enable users to detect more defects?
2. Does the usage of our tool reduce the number of defects introduced by users?
3. Are users of our tool prone to ignoring actual defects because no warning was issued?
4. Are users of our tool faster in processing the documents?
5. Does our tool motivate users to rephrase requirements and information content elements?
Experiment Design
- Two-by-two crossover study with students
- Students search for and correct defects in a given SRS
- Control group: students without the tool (manual review)
- Treatment group: students with the tool (tool-assisted review)
- Compare the performance of students from both groups
                     Group 1        Group 2
Session 1 (SRS #1)   Manual         Tool-assisted
Session 2 (SRS #2)   Tool-assisted  Manual
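The two-by-two crossover can be sketched programmatically; this illustrative helper (names are ours, not from the study) rotates the condition order between groups so each group sees every condition exactly once:

```python
def crossover_schedule(groups, conditions, documents):
    """Assign each group one condition per session, rotating the
    condition order between groups (2x2 crossover design)."""
    schedule = {}
    for i, group in enumerate(groups):
        order = conditions[i:] + conditions[:i]  # rotate by group index
        schedule[group] = list(zip(documents, order))
    return schedule

plan = crossover_schedule(
    groups=["Group 1", "Group 2"],
    conditions=["manual", "tool-assisted"],
    documents=["SRS #1", "SRS #2"],
)
# plan["Group 1"] -> [("SRS #1", "manual"), ("SRS #2", "tool-assisted")]
# plan["Group 2"] -> [("SRS #1", "tool-assisted"), ("SRS #2", "manual")]
```

This way every subject works both manually and tool-assisted, which controls for individual skill differences between the groups.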
Experiment Materials
- Excerpts from actual work-in-progress SRS
- Size reduced to fit our experiment schedule
- Anonymized names as requested by our industry partner
- Determined true object type of all content elements
- The experiment was repeated after publishing the paper
– Presented in the paper: Wiper Control, Window Lift
– Performed after publishing: Wiper Control, Hands Free Access
Document Name      Total Elements   Classifier Accuracy
Wiper Control      115              82.6%
Window Lift        261              75.8%
Hands Free Access  147              85.0%
Evaluation Metrics & Hypotheses
- Defect Correction Rate:
  DCR = Defects Corrected / Defects Inspected
- Defect Introduction Rate:
  DIR = Defects Introduced / Elements Inspected
- Unwarned Defect Miss Rate:
  UDMR = Unwarned Defects Missed / Unwarned Defects Inspected
- Time Per Element:
  TPE = Total Time Spent / Elements Inspected
- Element Rephrase Rate:
  ERR = Elements Rephrased / Elements Inspected
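The five metrics are simple ratios over per-review counts; a minimal sketch (the field names are ours, chosen to match the definitions above):

```python
from dataclasses import dataclass

@dataclass
class ReviewCounts:
    """Raw counts gathered from one review session (illustrative names)."""
    elements_inspected: int
    defects_inspected: int
    defects_corrected: int
    defects_introduced: int
    unwarned_defects_inspected: int
    unwarned_defects_missed: int
    elements_rephrased: int
    total_time_spent: float  # e.g. in minutes

def metrics(c: ReviewCounts) -> dict:
    """Compute the five evaluation metrics for one review."""
    return {
        "DCR": c.defects_corrected / c.defects_inspected,
        "DIR": c.defects_introduced / c.elements_inspected,
        "UDMR": c.unwarned_defects_missed / c.unwarned_defects_inspected,
        "TPE": c.total_time_spent / c.elements_inspected,
        "ERR": c.elements_rephrased / c.elements_inspected,
    }
```

Note that DCR and UDMR are normalized by defect counts, while DIR, TPE, and ERR are normalized by the number of inspected elements.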
Result Overview
- Total number of students per experiment:
– ~25 (experiment #1), ~20 (experiment #2)
Document                 Manual group            Tool-assisted group
                         # reviews  # elements   # reviews  # elements
Exp #1 (Wiper Control)   7          506          7          749
Exp #1 (Window Lift)     4          772          3          435
Exp #2 (Wiper Control)   5          575          4          460
Exp #2 (Hands Free)      4          588          5          691
Total                    20         2441         19         2335
Defect Correction Rate
Defect Introduction Rate
Unwarned Defect Miss Rate
Time Per Element
Element Rephrase Rate
Summary of Results
- RQ1: Users of our tool detect more defects, provided the classifier accuracy is high enough.
- RQ2: Fewer defects are introduced when our tool is used.
- RQ3: Users are more likely to miss unwarned defects.
- RQ4: For our group of students, review time did not improve significantly.
- RQ5: Students were not inclined to rephrase more elements when the tool was used.
Threats to Validity
- Construct validity
– Number of participants
– Definition of the gold standard
- Internal validity
– Maturation
– Communication between groups
– Time limit
- External validity
– Students are not RE experts
Summary & Future Work
- Tool support enables users to find more defects
- Repeated tool usage may also improve review time (maturation)
- Tool usefulness largely depends on classifier accuracy
- Future Work
– Collect more data points
– Repeat the experiment with RE experts