

  1. Evaluation of Example Tools For Hairy Tasks Presenter: Changsheng Chen CS 846 Project Presentation Department of Computer Science

  2. Outline ▪ Motivation ▪ Introduction ▪ Related work ▪ Case Study 1 ▪ Case Study 2 ▪ Conclusion

  3. Motivation ▪ For some tasks, any tool with less than 100% recall is not helpful, and the user may be better off doing the task entirely manually. ▪ The trade-off between precision and recall may make it difficult to interpret the true result. ▪ Improper use of precision and recall may distort the evaluation. ▪ Different tasks need different weights for the F-measure.

  4. Introduction – Recall and Precision ▪ Precision (P) is the percentage of the tool-returned answers that are correct. ▪ Recall (R) is the percentage of the correct answers that the tool returns. ▪ That is, precision is the percentage of the found stuff that is right, and recall is the percentage of the right stuff that is found.
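For reference (standard definitions, not spelled out on the slide): with TP the number of correct answers the tool returns, FP the number of incorrect answers it returns, and FN the number of correct answers it misses, P = TP / (TP + FP) and R = TP / (TP + FN). A minimal Python sketch of the two measures, assuming answers can be compared as sets:

    # Minimal sketch: precision and recall over sets of answers.
    # 'returned' is what the tool produced, 'correct' is the ground truth.
    def precision_recall(returned, correct):
        returned, correct = set(returned), set(correct)
        tp = len(returned & correct)  # true positives: returned and correct
        precision = tp / len(returned) if returned else 0.0
        recall = tp / len(correct) if correct else 0.0
        return precision, recall

    # Example: 4 returned answers, 3 of them among the 6 correct ones.
    p, r = precision_recall({"a", "b", "c", "d"}, {"a", "b", "c", "e", "f", "g"})
    # p == 0.75, r == 0.5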

  5. Introduction – F-Measure ▪ F-measure: the harmonic mean of Precision and Recall. ▪ Weighted F-measure: for situations in which R and P are not equally important; β is the ratio by which it is desired to weight Recall more than Precision.
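The formulas themselves appeared as images on the original slide; the standard definitions they presumably showed are F1 = 2PR / (P + R) and, in the weighted (van Rijsbergen) form, Fβ = (1 + β^2)PR / (β^2·P + R), where β > 1 weights recall more heavily and β < 1 weights precision more heavily. A sketch in Python, assuming that standard definition:

    # Weighted F-measure, assuming the standard van Rijsbergen definition.
    def f_beta(precision, recall, beta=1.0):
        if precision == 0.0 and recall == 0.0:
            return 0.0
        b2 = beta * beta
        return (1 + b2) * precision * recall / (b2 * precision + recall)

    # With the p, r from above (0.75 and 0.5):
    # f_beta(0.75, 0.5)          -> 0.60   (plain F1)
    # f_beta(0.75, 0.5, beta=2)  -> ~0.54  (weighting recall higher pulls the score down)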

  6. Case Study 1: ▪ Using Tools to Assist Identification of Non-requirements in Requirements Specifications – A Controlled Experiment (Jonas Paul Winkler and Andreas Vogelsang) ▪ The task: categorizing textual fragments into requirements and non-requirements. ▪ In practice, this categorization is performed manually. ▪ The authors developed a tool to assist users in this task by providing warnings based on automatic classification. ▪ They performed a controlled experiment with two groups of students. ▪ The results show that an automated classification approach may provide benefits, given that its accuracy is high enough.

  7. Case Study 1: ▪ Using Tools to Assist Identification of Non-requirements in Requirements Specifications – A Controlled Experiment (Jonas Paul Winkler and Andreas Vogelsang) ▪ Investigation of the effectiveness of automated tools for RE tasks. ▪ Their experiment supports the claim that the accuracy of the tool may have an effect on the observed performance. ▪ A human working with the tool on the task should at least achieve better recall than a human working on the task entirely manually. ▪ The experimental setup follows this idea by comparing tool-assisted and manual reviews.

  8. Case Study 2: ▪ Evaluation of Techniques to Detect Wrong Interaction Based Trace Links (Paul Hubner and Barbara Paech) ▪ Trace links are created and used continuously during development. ▪ The goal is to support developers with an automatic trace link creation approach with high precision. ▪ In a previous study, the authors showed an interaction-based trace link creation approach that is better than traditional IR-based approaches. ▪ They performed the study within a student project. ▪ They evaluated different techniques to identify relevant trace link candidates, such as focusing on edit interactions or applying thresholds to the frequency and duration of trace link candidates.


  10. Conclusion ▪ Most RE and SE tasks involving NL documents are hairy tasks and need tool support. ▪ We may evaluate these tools with differently weighted F-measures, because the relative importance of recall and precision differs from task to task. ▪ We must research and understand which measures are appropriate for evaluating any tool for a given task.
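To make the weighting point concrete, a made-up illustration (not from the talk) using the f_beta sketch above: two hypothetical tools with mirrored scores tie under F1 but separate clearly once recall is weighted higher.

    # Illustrative numbers only: same F1, very different F2.
    f_beta(0.9, 0.5)          # ~0.643 (Tool A: high precision, low recall)
    f_beta(0.5, 0.9)          # ~0.643 (Tool B: low precision, high recall)
    f_beta(0.9, 0.5, beta=2)  # ~0.549
    f_beta(0.5, 0.9, beta=2)  # ~0.776 -> the recall-weighted measure prefers Tool B

For a hairy task where missed answers are costly, β > 1 captures the intuition from the motivation slide that low recall can make a tool useless.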

  11. THANK YOU! QUESTIONS?
