
Automatic Identification of Bug-fix Commits: The Case of GitHub Projects

Yujuan Jiang, Rodrigo Morales, Bram Adams, Foutse Khomh. Outline: case study projects, approach, research questions, result (so far).


  1. Automatic Identification of Bug-fix Commits: The Case of GitHub Projects. Yujuan Jiang, Rodrigo Morales, Bram Adams, Foutse Khomh

  2. • Case study projects • Approach • Research questions • Result (so far)

  3. Case Study Projects. Keywords: GitHub, C language

  4. Approach • Data Collection • Feature Extraction (Text & Source code) • Model Training • Evaluation

  5. Approach: Data collection

  6. Approach: Feature Extraction • Textual Analysis: keywords • Code Analysis

  13. Approach: Feature Extraction 1) Textual Analysis: keywords + feature words (all words → stem + remove stop words → filter)
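The textual-analysis steps named on this slide (all words, stemming, stop-word removal, filtering) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stop-word list, the suffix-stripping `crude_stem` helper (a stand-in for a real stemmer such as Porter's), and the `min_count` filter threshold are all assumptions introduced here.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real study would use a fuller one.
STOP_WORDS = {"a", "an", "the", "in", "on", "of", "to", "and", "for", "is", "when", "with"}

def crude_stem(word):
    """Strip a few common English suffixes (stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def message_to_terms(message):
    """Lowercase, tokenize, drop stop words, then stem what remains."""
    tokens = re.findall(r"[a-z]+", message.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

def feature_words(messages, min_count=2):
    """Filter step (assumed): keep terms seen in at least `min_count` messages."""
    doc_freq = Counter()
    for msg in messages:
        doc_freq.update(set(message_to_terms(msg)))
    return sorted(term for term, n in doc_freq.items() if n >= min_count)
```

Each surviving term could then become one column of the feature matrix, alongside the hand-picked keywords.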

  20. Approach: Feature Extraction 2) Source Code Analysis: commits → parser (patch parser + re script) → commit profile → features (# of while loops, # of ifs, # of boolean ......)

  21. Approach: Feature Extraction

  25. Approach: Model Training • Black data: manually label 300 bug-fixing commits for each project • Grey data: unlabelled • LPU → White data (bottom k) • White data + Black data → SVM, Random Forest
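One way to read this slide is as a two-step positive-unlabelled setup: rank the unlabelled (grey) commits, treat the bottom k as likely non-bug-fixes (white), then train a standard classifier on black plus white data. The sketch below illustrates only the selection step, under that assumed reading. It is not the LPU tool: ranking by cosine similarity to the centroid of the black data is a simple stand-in chosen here, and all function names are hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def pick_white(black, grey, k):
    """Return the k grey vectors least similar to the black centroid."""
    center = centroid(black)
    ranked = sorted(grey, key=lambda v: cosine(v, center))
    return ranked[:k]

def build_training_set(black, grey, k):
    """Label black data 1 and the bottom-k grey data 0."""
    white = pick_white(black, grey, k)
    X = black + white
    y = [1] * len(black) + [0] * len(white)
    return X, y
```

The resulting `X` and `y` could then be passed to any SVM or random forest implementation. A larger k yields more negative examples but raises the chance that some of them are unlabelled bug fixes.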

  26. Approach: Evaluation

  27. Research Questions • Does our classifier outperform the baseline keyword-based approach? • How does the parameter k affect the classifier? • Which kinds of metrics matter most in identifying bug-fixing commits? • Is the hybrid approach (the combination of LPU and SVM) more effective than a single-classifier approach? • Which combination of LPU options makes the classifier work best?
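For reference, a keyword-based baseline of the kind named in the first research question can be as small as the sketch below. The keyword list is hypothetical, since the slides do not enumerate the keywords the authors used, and prefix matching on whole words is a design choice made here so that "fixed" and "fixes" match while "prefix" and "debug" do not.

```python
import re

# Hypothetical keyword list; the slides do not enumerate the keywords used.
BUG_KEYWORDS = ("fix", "bug", "defect", "crash", "error")

def is_bug_fix_by_keyword(message):
    """Flag a commit message if any word starts with one of the keywords."""
    words = re.findall(r"[a-z]+", message.lower())
    return any(w.startswith(k) for w in words for k in BUG_KEYWORDS)
```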

  28. Result (so far): recall • Libgit2: 76.95% • openFrameworks: 96.67%
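Recall here is the share of true bug-fixing commits that the classifier finds, that is, TP / (TP + FN). A minimal sketch of the computation follows; the label encoding (1 for bug-fix, 0 otherwise) is an assumption, and any input passed to it would be toy data rather than the study's.

```python
def recall(y_true, y_pred):
    """Fraction of actual positives (label 1) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

Recall alone does not penalize false positives, so a classifier that labels every commit as a bug fix would score 100%.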

  29. Result (so far): key features [Figure: dot plot of 50 features for Libgit2, listed in the order X5, X6, X7, X22, X20, X21, X23, X31, X12, X50, ..., ending with X38; x-axis from 0.000 to 0.030]

  34. [Figure: dot plot labelled "LPU SVM" over 50 features, listed in the order X5, X6, X7, X22, X20, X21, X23, X31, X12, X50, ..., ending with X38; x-axis from 0.000 to 0.030]
