
Do Developers Feel Emotion? An Exploratory Analysis of Emotions. - PowerPoint PPT Presentation



  1. Do Developers Feel Emotion? An Exploratory Analysis of Emotions.

  2. Motivation • Feelings and emotions dictate to a large extent our actions and decisions. • Developers' potential and productivity can be fully unlocked only if people feel safe and happy. • It is important to support managers and project leaders in detecting emotions.

  3. Final Goal • Building a tool for automatic emotion detection. A first step: • Can emotions actually be detected from issue reports? • If so, can humans actually agree on the identified emotions?

  4. Our approach • A significant sample of developers' comments from Apache projects was analyzed based on Parrott's emotional framework. • Can human raters, without any training, agree on the presence of emotions in issue reports? • Does training improve the agreement of human raters? • Does context improve the agreement of human raters?

  5. Related Work Ahmed Hassan et al. tried to answer these questions: • What is the personality type of OSS developers? • Does the language and attitude of a developer change as he or she moves from being a current developer to a departing one?

  6. Related Work • Guzman et al. proposed an approach to improve emotional awareness in software development teams by means of quantitative emotion summaries. • Their approach automatically extracts and summarizes emotions expressed in collaboration artifacts by combining probabilistic topic modelling with lexical sentiment analysis techniques.

  7. Emotion Mining • Emotion mining tries to identify the presence of emotions like joy or fear • Sentiment analysis evaluates a given emotion as being positive or negative

  8. Emotion Mining in Software Engineering • Applied to text artifacts, emotion mining can provide hints on the factors responsible for joy and satisfaction, or fear and anger, among developers. • It provides a different perspective from which to interpret productivity and job satisfaction.

  9. Parrott’s Framework

  10. Issue Tracking System • A repository used by software companies to organize software maintenance and evolution. • Team members submit and discuss issues, including bugs and feature requests, ask for advice, or share opinions. • It might reveal how committers feel towards a bug, feature, project, or even their colleagues. • Each issue is characterized by several attributes such as priority, status, and type (improvement, perfective maintenance, new feature, corrective maintenance, adaptive maintenance).

  11. Experimental Setup • Goal: Understand the kinds of emotions found in issue reports. • Four authors rated issue reports from open source systems. • Analysis of the identified emotions and the raters' agreement.

  12. Dataset • Issue repository of the Apache Software Foundation • Hosts 117 open source projects, ranging from large, long-lived projects to small ones, giving representative data

  13. Dataset • Issue reports from 19 October 2000 until July 2013 • Developers' comments + issue report attributes • No distinction between bugs, new features, and enhancements • Granularity: issue comment level • Enough issue comments sampled to obtain a 95% confidence level.
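
The slides say the sample was large enough for a 95% confidence level but do not give the formula or the margin of error. A minimal sketch of one common way to size such a sample is Cochran's formula with a finite population correction. The 5% margin of error, the worst-case proportion of 0.5, and the population sizes used below are assumptions for illustration, not values from the study.

```python
import math


def sample_size(population: int, z: float = 1.96,
                margin: float = 0.05, p: float = 0.5) -> int:
    """Cochran's sample size with finite population correction.

    z      -- z-score for the confidence level (1.96 for 95%)
    margin -- margin of error (ASSUMED 5%; the slides do not state it)
    p      -- expected proportion (0.5 is the most conservative choice)
    """
    if population <= 0:
        raise ValueError("population must be positive")
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    # Shrink it for a finite population.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical population sizes, only to show how the result saturates.
    for size in (1_000, 100_000, 10_000_000):
        print(size, sample_size(size))
```

Under these assumptions the required sample stops growing once the population is large, which is why a few hundred comments can stand in for a very large repository.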

  14. Emotion Mining • Each rater identified the emotions associated with each comment according to Parrott's six emotions: love, joy, surprise, anger, sadness, fear • Personal rating • Based on a common understanding of Parrott's framework • No ground truth: ratings the raters agree on are considered correct; agreement is decided by majority vote
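
The majority-vote rule above can be sketched as follows. This is an illustration only: the function name `majority_emotions` is hypothetical, and requiring a strict majority (more than half of the raters) is an assumption, since the slides do not say how ties were handled.

```python
from collections import Counter

# Parrott's six primary emotions, as listed on the slide.
EMOTIONS = ("love", "joy", "surprise", "anger", "sadness", "fear")


def majority_emotions(ratings):
    """Return the emotions selected by a strict majority of raters.

    ratings -- one collection of emotion names per rater for a single
               comment; an empty collection means the rater saw no emotion.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    # Count each emotion once per rater.
    counts = Counter(e for rating in ratings for e in set(rating))
    needed = len(ratings) // 2 + 1  # ASSUMPTION: strict majority, ties fail
    return {e for e in EMOTIONS if counts[e] >= needed}


if __name__ == "__main__":
    votes = [{"fear"}, {"fear", "anger"}, set(), {"fear"}]
    print(majority_emotions(votes))
```

An empty result corresponds to a comment treated as neutral.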

  15. Examples • I'm not so convinced that moving all the static methods out is useful (Fear). • How is a bunch of static methods on a utility class easier than a bunch of static methods within the HtmlCalendarRenderer better? (Anger) • The risk of introducing new bugs for no great benefit (Fear). • Previously almost all these helper methods were private; this patch makes them all public [...] (Neutral)

  16. Measuring Agreement • Degree of inter-rater agreement • Cohen's κ for two raters • Fleiss' κ for more than two raters
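
Both statistics compare the observed agreement with the agreement expected by chance. A minimal sketch using the standard textbook definitions is given below; the function names and the toy ratings are illustrative and are not the authors' scripts or data.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(a)
    # Observed agreement: share of items with identical labels.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's label frequencies.
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[c] * cb[c] for c in set(a) | set(b)) / (n * n)
    if p_e == 1.0:  # both raters used one identical label throughout
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters who put item i in category j.

    Every item must be rated by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Mean per-item agreement.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items
    # Chance agreement from overall category proportions.
    p_j = [sum(row[j] for row in counts) / (n_items * n_raters)
           for j in range(n_cats)]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        return 1.0
    return (p_bar - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Toy data: 1 = emotion present, 0 = absent, for ten comments.
    rater_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
    rater_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
    print(round(cohen_kappa(rater_a, rater_b), 4))
```

Because most comments carry no emotion, raw percentage agreement is dominated by agreement on absence; kappa corrects for that by subtracting the chance term.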

  17. Question 1 • Can human raters, without any training, agree on the presence of emotions in issue reports? • Motivation: Emotion mining from software development artifacts is not trivial, since they consist of unstructured data, are relatively short, and are written in an informal way.

  18. Question 1: Approach • Each of 400 issue report comments was arbitrarily assigned to two of the raters. • Each author selected the emotions that were present in the comment. • Once all comments had been annotated, the four files were collected and analyzed using Cohen's κ.

  19. Question 1: Result • In 41% of the comments, the raters agreed on all 6 emotions, whereas 85% of comments do not contain any emotion. • Only for Love did the raters achieve more than slight agreement (a moderate value). • Agreement on the presence of a particular emotion peaked at 6.5% (Love); agreement on absence peaked at 96.75% (Surprise).

  20. Result • While some emotions obtain higher agreement than others, only one emotion obtained moderate agreement, and raters agree the most on the absence of an emotion.
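
The labels "slight", "fair", and "moderate" match the widely used Landis and Koch scale for interpreting κ. The slides do not name the scale, so treating it as the one used here is an assumption. A small sketch of that mapping:

```python
def agreement_label(kappa: float) -> str:
    """Map a kappa value to a Landis and Koch style label.

    ASSUMPTION: the slides use this scale; they only show the labels.
    """
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    for value in (0.15, 0.35, 0.50):
        print(value, agreement_label(value))
```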

  21. Question 2 • Does training improve the agreement of human raters on the presence of emotions in issue reports? • Motivation: Without thorough training, raters achieve only slight agreement. This leads to the current question.

  22. Question 2: Approach • Each rater compiled a list of generic expressions he or she felt insecure about • A general example and an emotion were added to each expression • 144 expressions were obtained • Meeting for discussion • Replication and refinement study performed

  23. Question 2: Replication and Refinement Study • Replicated our study of RQ1 on a second sample. • The refinement study revisited the 235 comments of RQ1 with at least one emotion disagreement; all four authors decided on the occurrence of each emotion. • Why was the refinement done?

  24. Question 2: Results • In 65% of comments, the raters agreed on all 6 emotions. • Four out of six emotions improved from slight to fair agreement: Joy, Anger, Sadness, and Fear. • 4.17% agreed on the presence of an emotion, Love. • 72.76% obtained agreement by at least 3 raters.

  25. Result • Training improves the overall agreement on emotions, as well as for most of the individual emotions. Love, joy and sadness are the most common emotions.

  26. Question 3 • Does context improve the agreement of human raters on the presence of emotions in issue reports? • Motivation: The previous experiments can be compared to eavesdropping on a group and catching just one phrase. • Due to the technical and unstructured nature of software development artifacts, the impact of context might be different than in literary English.

  27. Question 3: Example • Sentence: “yeah right” • “moving to java 8 we solve all problems” • “breaking backward compatibility is risky”

  28. Question 3: Approach • Experiment with two steps: • Replication of study RQ2: 384 comments, two raters • The same analysis with the context of those comments

  29. Question 3: Results • Adding context reduces rater agreement for love. • More raters change their mind for comments with context. • Context seems to make raters doubt their rating, introducing more disagreement.

  30. Discussion • A. Impact of Context: • At first, our findings seem counter-intuitive. • Using a simple yes/no decision as the rating may be too large a simplification; a rating with multiple levels could be used instead. • B. Do Emotions Really Matter for Issue Reports: • Our findings suggest there is a link between emotions and software development. Reports with the “love” emotion tend to have a lower number of comments and a shorter fixing time.

  31. Threats to Validity • Internal validity: We rely on the presence of a causal relationship between a developer's emotions and what he or she writes in issue report comments. • Construct validity: Ambiguity of messages and subjectivity of emotions. To reduce these threats: • Parrott's framework was adopted • the framework was explained and clarified • each comment was analyzed by at least two authors

  32. Threats to Validity • External validity: Replication of this work on other open source systems and on commercial projects is needed to confirm our findings. • Reliability validity: No ground truth exists against which to compare our findings. Different groups of raters should overall obtain the same results as well.

  33. Conclusion • Software development, as a collaborative activity of developers, is influenced by human emotions. • Issue reports do express emotions towards design choices, maintenance activity, or colleagues. • Love, joy, and sadness are easier to agree on. • Emotion mining can improve through training. • Some challenges, such as the impact of context, need to be studied further on more data sources and systems.
