Neural Attribution for Semantic Bug-Localization in Student Programs
Rahul Gupta, Aditya Kanade, Shirish Shevade
Computer Science & Automation, Indian Institute of Science, Bangalore, India
NeurIPS 2019
technique
[Figure: a neural network takes a <program, test> pair as input, for programs P1, …, Pn, and outputs 0 for success or 1 for failure.]
[Sundararajan et al., 2017]
[Figure: AST for the code snippet int even=!(num%2); and its encoding as a 2D matrix.]
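The idea of encoding an AST as a 2D matrix can be illustrated with a minimal sketch. It assumes one plausible layout: each row holds a non-leaf node followed by its children, rows are padded to max_nodes columns, and the matrix is padded to max_subtrees rows. The node labels, the helper names (leaf, build_vocab, encode_ast), and the small sizes are hypothetical and are not taken from the authors' implementation.

```python
PAD = 0  # id reserved for padding cells


def leaf(label):
    """Hypothetical helper: a tree node with no children."""
    return (label, [])


def build_vocab(tree, vocab=None):
    """Assign an integer id (starting at 1) to each distinct node label."""
    vocab = {} if vocab is None else vocab
    label, children = tree
    vocab.setdefault(label, len(vocab) + 1)
    for child in children:
        build_vocab(child, vocab)
    return vocab


def encode_ast(tree, vocab, max_nodes, max_subtrees):
    """Encode a tree as a max_subtrees x max_nodes matrix of label ids.

    Assumed layout: one row per non-leaf node, holding the node's id
    followed by the ids of its children, padded with PAD.
    """
    rows = []
    stack = [tree]
    while stack:  # pre-order traversal
        label, children = stack.pop()
        if not children:
            continue
        row = [vocab[label]] + [vocab[child[0]] for child in children]
        if len(row) > max_nodes:
            raise ValueError("subtree has more than max_nodes nodes")
        rows.append(row + [PAD] * (max_nodes - len(row)))
        stack.extend(reversed(children))
    if len(rows) > max_subtrees:
        raise ValueError("tree has more than max_subtrees subtrees")
    rows += [[PAD] * max_nodes for _ in range(max_subtrees - len(rows))]
    return rows


# Hypothetical, simplified AST for: int even=!(num%2);
tree = ("Decl:even", [
    ("TypeDecl", [leaf("int")]),
    ("UnaryOp:!", [
        ("BinaryOp:%", [leaf("ID:num"), leaf("Constant:2")]),
    ]),
])

vocab = build_vocab(tree)
matrix = encode_ast(tree, vocab, max_nodes=4, max_subtrees=6)
for row in matrix:
    print(row)
```

Fixed-size padding of this kind is what lets trees of different shapes be stacked into a single batch.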
[Figure: network architecture. Program branch: encoded program AST, embedding layer 1, 1 x 1 convolutions, 1 x max_nodes convolutions, 3 x max_nodes convolutions, feature concatenation, program embedding. Test branch: test ID, embedding layer 2, test ID embedding. Output: feature concatenation, three-layered fully connected neural network, failure prediction.]
absence of the feature is required as a baseline for comparing outcomes.
signal
input embedding vectors for text-based networks
the input of interest and the baseline) to the individual input features
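The baseline-based attribution described in these bullets is the integrated gradients method of [Sundararajan et al., 2017]. A minimal sketch follows, assuming a toy differentiable model (a logistic unit with made-up weights) in place of the bug-prediction network; the names integrated_gradients, model, model_grad, and WEIGHTS are hypothetical. It approximates the path integral of the gradient from the baseline to the input with a midpoint Riemann sum.

```python
import math


def integrated_gradients(grad_fn, x, baseline, steps=200):
    """Approximate IG_i = (x_i - b_i) * integral over alpha in [0, 1] of
    dF/dx_i evaluated at b + alpha * (x - b), using a midpoint Riemann sum."""
    diffs = [xi - bi for xi, bi in zip(x, baseline)]
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * d for bi, d in zip(baseline, diffs)]
        for i, g in enumerate(grad_fn(point)):
            avg_grad[i] += g / steps
    return [d * g for d, g in zip(diffs, avg_grad)]


# Toy stand-in model (an assumption): F(x) = sigmoid(w . x)
WEIGHTS = [2.0, -1.0, 0.5]


def model(x):
    z = sum(w * xi for w, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))


def model_grad(x):
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


x = [1.0, 0.5, -1.0]
baseline = [0.0, 0.0, 0.0]  # "absence of the feature" for this toy input
attributions = integrated_gradients(model_grad, x, baseline)

# Completeness: attributions sum (approximately) to F(x) - F(baseline).
gap = abs(sum(attributions) - (model(x) - model(baseline)))
print(attributions, gap)
```

The completeness check at the end is the property that makes the baseline matter: the attributions distribute exactly the difference between the prediction at the input and the prediction at the baseline.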
[Figure: IG attributions, followed by max-pooling and mean-pooling.]
that pass all the tests
5% set aside for validation
max_subtrees: 249
and not a partial program completion
Validation: 96%
(42%), when reporting the top-10 suspicious lines
Evaluation metric   Localization queries   Bug-localization result
                                           Top-10          Top-5           Top-1
<P,t> pairs         4117                   3134 (76.12%)   2032 (49.36%)   561 (13.63%)
Lines               2071                   1518 (73.30%)   1020 (49.25%)   301 (14.53%)
Programs            1449                   1164 (80.33%)   833 (57.49%)    294 (20.29%)
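The Top-k columns can be read with the help of a small sketch. It assumes that a localization query counts as successful when at least one ground-truth buggy line appears among the k highest-scoring lines; that success criterion, the function names, and the example scores are illustrative assumptions, not the authors' evaluation code.

```python
def top_k_hit(scores, buggy_lines, k):
    """Return True if any ground-truth buggy line is among the k
    highest-scoring lines (ties broken by line number)."""
    ranked = sorted(scores, key=lambda line: (-scores[line], line))
    return any(line in buggy_lines for line in ranked[:k])


def top_k_accuracy(queries, k):
    """Fraction of queries whose top-k suspicious lines contain a bug."""
    hits = sum(top_k_hit(scores, buggy, k) for scores, buggy in queries)
    return hits / len(queries)


# Hypothetical queries: (line number -> suspiciousness score, buggy lines)
queries = [
    ({4: 0.91, 7: 0.35, 9: 0.62, 12: 0.10}, {9}),
    ({2: 0.80, 3: 0.15, 6: 0.40}, {2}),
]

top1 = top_k_accuracy(queries, 1)
top2 = top_k_accuracy(queries, 2)
print(top1, top2)
```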
Technique & configuration   Bug-localization result
                            Top-10          Top-5           Top-1
NBL                         1164 (80.33%)   833 (57.49%)    294 (20.29%)
Tarantula-1                 964 (66.53%)    456 (31.47%)    6 (0.41%)
Ochiai-1                    1130 (77.98%)   796 (54.93%)    227 (15.67%)
Tarantula-*                 1141 (78.74%)   791 (54.59%)    311 (21.46%)
Ochiai-*                    1151 (79.43%)   835 (57.63%)    385 (26.57%)
Diff-based                  623 (43.00%)    122 (8.42%)     0 (0.00%)
Tarantula [Jones et al., 2001], Ochiai [Abreu et al., 2006]
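Tarantula and Ochiai are spectrum-based techniques: they score each line from how many failing and passing tests execute it. The sketch below uses the standard suspiciousness formulas for both; the coverage counts are made up for illustration, and it does not model the "-1" and "-*" configurations from the table.

```python
import math


def tarantula(failed_cov, passed_cov, total_failed, total_passed):
    """Tarantula: (f/F) / (f/F + p/P), where f and p are the failing and
    passing tests that execute the line, F and P the test-suite totals."""
    fail_ratio = failed_cov / total_failed if total_failed else 0.0
    pass_ratio = passed_cov / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0


def ochiai(failed_cov, passed_cov, total_failed, total_passed):
    """Ochiai: f / sqrt(F * (f + p)). total_passed is unused and kept
    only so both scorers share one signature."""
    denom = math.sqrt(total_failed * (failed_cov + passed_cov))
    return failed_cov / denom if denom else 0.0


# Hypothetical coverage: line -> (failing tests, passing tests) executing it
coverage = {1: (2, 3), 3: (2, 1), 5: (0, 2)}
TOTAL_FAILED, TOTAL_PASSED = 2, 3


def rank(score_fn):
    """Lines ordered from most to least suspicious."""
    scores = {
        line: score_fn(f, p, TOTAL_FAILED, TOTAL_PASSED)
        for line, (f, p) in coverage.items()
    }
    return sorted(scores, key=lambda line: (-scores[line], line))


print(rank(tarantula))
print(rank(ochiai))
```

Both scorers need coverage from passing as well as failing runs of the same program, which is the information a spectrum is built from.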
allow efficient batch training for arbitrarily shaped trees
ground truth
localized a wide variety of semantic bugs, including wrong conditionals, assignments, output formatting, and memory allocation
https://bitbucket.org/iiscseal/NBL