COMP 762: Paper Presentation (Technical Details), Alexander Nicholson



SLIDE 1

Are Deep Neural Networks the Best Choice for Modeling Source Code? Authors: Vincent J. Hellendoorn, Premkumar Devanbu

COMP 762: Paper Presentation (Technical Details) Alexander Nicholson

SLIDE 2

Overview:

▪ Big question in the title
▪ Creates a robust baseline model (Important!)
▪ Optimizations for a specific, practical task

SLIDE 3

Scope
▪ Language-specific scoping?
▪ Static vs. Dynamic

SLIDE 4

Why online training/dynamism with (R)NNs is hard: e.g., 50,000 × 512 = 25,600,000 parameters between the first and second layers.

Image From: Towards Deep Learning Software Repositories, White et al.
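A quick sanity check of the slide's arithmetic (the 50,000 and 512 are the slide's illustrative sizes; reading them as vocabulary width and hidden units is my assumption):

```python
# Weight count for a fully connected layer between a 50,000-wide input
# (e.g., a one-hot vocabulary) and a 512-unit hidden layer: one weight
# per input/output pair, biases not counted.
vocab_size = 50_000
hidden_size = 512

num_weights = vocab_size * hidden_size
print(num_weights)  # 25600000
```

Updating that many parameters on the fly is why dynamic, per-project adaptation is cheap for count-based n-gram models but expensive for neural ones.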

SLIDE 5

Smoothing (Discounting/Correction and Interpolation)

Resources:
https://www.youtube.com/watch?v=FUS7XkhYBLo&list=PLBv09BD7ez_7Ke6U7yGBvfP4_Hau3ZGj2&index=5
https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf

Discounting/Correction: subtract or add a number to the counts.
Interpolation: “add information from known distributions”; weight the additional distributions using interpolation weights (λ).

SLIDE 6

Laplace/Lidstone Correction
Add and re-normalize

Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
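A minimal sketch of add-alpha smoothing on unigram counts (alpha = 1 gives Laplace; other alphas give Lidstone). The function name and toy counts are mine, not from the paper:

```python
def lidstone_prob(counts, word, vocab_size, alpha=1.0):
    """Add alpha to every count, then re-normalize over the whole
    vocabulary, so unseen words get a small nonzero probability."""
    total = sum(counts.values()) + alpha * vocab_size
    return (counts.get(word, 0) + alpha) / total

counts = {"if": 2, "return": 1, "x": 1}                  # toy token counts
p_seen = lidstone_prob(counts, "if", vocab_size=5)       # (2+1)/(4+5) = 1/3
p_unseen = lidstone_prob(counts, "else", vocab_size=5)   # (0+1)/(4+5) = 1/9
```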

SLIDE 7

Absolute Discounting
Subtract and re-normalize. The paper’s modification uses three values of the discount.

Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
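A sketch of absolute discounting for bigrams, interpolated with a unigram distribution. The single default discount d = 0.75 is a common textbook choice, not the paper's variant (which, per the slide, uses three discount values); the toy data is mine:

```python
def abs_discount_prob(bigram_counts, unigram_probs, context, word, d=0.75):
    """Subtract a fixed discount d from every seen bigram count and
    redistribute the freed mass to the lower-order (unigram) model."""
    context_total = sum(c for (h, _), c in bigram_counts.items() if h == context)
    seen_types = sum(1 for (h, _) in bigram_counts if h == context)
    if context_total == 0:  # unseen context: back off entirely
        return unigram_probs.get(word, 0.0)
    discounted = max(bigram_counts.get((context, word), 0) - d, 0) / context_total
    backoff_weight = d * seen_types / context_total  # mass freed by discounting
    return discounted + backoff_weight * unigram_probs.get(word, 0.0)

bigrams = {("a", "b"): 2, ("a", "c"): 1}
unigrams = {"b": 0.5, "c": 0.5}
p_b = abs_discount_prob(bigrams, unigrams, "a", "b")
p_c = abs_discount_prob(bigrams, unigrams, "a", "c")
```

The discounted term and the backoff term together still sum to 1 over the vocabulary, which is the "re-normalize" part.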

SLIDE 8

Kneser-Ney Smoothing
The paper’s modification uses three values of the discount.

Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
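A sketch of interpolated Kneser-Ney for bigrams with a single discount d (modified KN uses separate discounts for counts of 1, 2, and 3+). The key idea is the lower-order *continuation* probability: in how many distinct contexts a word appears, not how often it occurs overall. Names and toy data are mine:

```python
from collections import Counter

def make_kn_bigram(bigram_counts, d=0.75):
    continuation = Counter(w for (_, w) in bigram_counts)  # distinct left-contexts per word
    total_types = len(bigram_counts)                       # distinct bigram types

    def prob(context, word):
        context_total = sum(c for (h, _), c in bigram_counts.items() if h == context)
        followers = sum(1 for (h, _) in bigram_counts if h == context)
        p_cont = continuation[word] / total_types
        if context_total == 0:  # unseen context: use continuation prob alone
            return p_cont
        discounted = max(bigram_counts.get((context, word), 0) - d, 0) / context_total
        lam = d * followers / context_total  # mass freed by discounting
        return discounted + lam * p_cont

    return prob

p = make_kn_bigram({("a", "b"): 2, ("c", "b"): 1, ("a", "c"): 1})
```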

SLIDE 9

Jelinek-Mercer Smoothing
General interpolation: P = λ·P(X) + (1 − λ)·P(Z). In J-M, λ is a constant.
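The interpolation itself is one line; the defining property of J-M is that λ is a fixed constant rather than derived from the data (a minimal sketch, names mine):

```python
def jelinek_mercer(p_high, p_low, lam=0.5):
    """Mix a higher-order estimate with a lower-order one using a
    constant weight lam: P = lam * p_high + (1 - lam) * p_low."""
    return lam * p_high + (1.0 - lam) * p_low

p = jelinek_mercer(0.8, 0.2, lam=0.5)  # 0.5*0.8 + 0.5*0.2 = 0.5
```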

SLIDE 10

Witten-Bell Smoothing

Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
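A sketch of the Witten-Bell idea: unlike Jelinek-Mercer, the interpolation weight is derived from the data. The more distinct word types follow a context, the more mass goes to the lower-order estimate. The function name and toy data are mine:

```python
def witten_bell_prob(bigram_counts, context, word, p_low):
    """P(word|context) = (c(context, word) + T * p_low(word)) / (c(context) + T),
    where T is the number of distinct types seen after `context`."""
    c_total = sum(c for (h, _), c in bigram_counts.items() if h == context)
    types = sum(1 for (h, _) in bigram_counts if h == context)
    if c_total + types == 0:  # never-seen context: fall back entirely
        return p_low(word)
    return (bigram_counts.get((context, word), 0) + types * p_low(word)) / (c_total + types)

bigrams = {("a", "b"): 2, ("a", "c"): 1}
uniform = lambda w: 0.5  # toy lower-order model over {"b", "c"}
p_b = witten_bell_prob(bigrams, "a", "b", uniform)  # (2 + 2*0.5) / (3 + 2) = 0.6
p_c = witten_bell_prob(bigrams, "a", "c", uniform)  # (1 + 2*0.5) / (3 + 2) = 0.4
```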

SLIDE 11

Trie Data-structure

From Wikipedia: children of a node share a common prefix.
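A minimal count-storing trie over token sequences, the kind of structure that can hold n-gram counts (the class names and examples are mine; the slides only say each scope keeps its own trie):

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # token -> TrieNode
        self.count = 0      # sequences passing through this node

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, tokens):
        """Walk/create the path for `tokens`, bumping counts along it."""
        node = self.root
        for tok in tokens:
            node = node.children.setdefault(tok, TrieNode())
            node.count += 1

    def prefix_count(self, tokens):
        """How many inserted sequences start with `tokens`."""
        node = self.root
        for tok in tokens:
            if tok not in node.children:
                return 0
            node = node.children[tok]
        return node.count

t = Trie()
t.insert(["for", "i", "in"])  # n-grams sharing a prefix share a path
t.insert(["for", "i", "of"])
```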

SLIDE 12

Trie Data-structure
Each scope has its own trie.

SLIDE 13

Zipf’s Law

Image From: https://phys.org/news/2017-08-unzipping-zipf-law-solution-century-old.html
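A small numerical sketch of what Zipf's law predicts: frequency roughly inversely proportional to rank, f(r) ∝ 1/r (the exponent of exactly 1 is the classic idealization; real corpora only approximate it):

```python
def zipf_frequencies(total, max_rank):
    """Ideal Zipf frequencies for ranks 1..max_rank, normalized so
    they sum to `total`: the rank-1 item is 2x rank-2, 4x rank-4, etc."""
    harmonic = sum(1.0 / r for r in range(1, max_rank + 1))
    return [total / (r * harmonic) for r in range(1, max_rank + 1)]

freqs = zipf_frequencies(1000, 4)
```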

SLIDE 14

Memoization

  • Optimization technique commonly used in dynamic programming
  • Cache(-ish) to avoid multiple recalculations.

http://cs.mcgill.ca/~jcheung/teaching/fall-2017/comp550/index.html
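The standard toy illustration, using Python's built-in cache decorator:

```python
from functools import lru_cache

# Without memoization, naive fib(n) recomputes the same subproblems
# exponentially many times; with the cache, each n is computed once.
@lru_cache(maxsize=None)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(40))  # 102334155 -- instant with the cache
```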

SLIDE 15

Dependency Models (Trees)

https://en.wikibooks.org/wiki/LaTeX/Linguistics

SLIDE 16

Dropout

http://cs.mcgill.ca/~hvanho2/comp551/
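A bare-bones sketch of inverted dropout, the common training-time formulation (the slide's figure is not reproduced here; this is my illustration, not the paper's code):

```python
import random

def dropout(activations, p=0.5):
    """Zero each unit with probability p; scale survivors by 1/(1-p)
    so the expected activation is unchanged. At test time the layer
    is then used as-is, with no mask and no scaling."""
    keep = 1.0 - p
    return [a / keep if random.random() < keep else 0.0 for a in activations]

out = dropout([1.0, 2.0, 3.0], p=0.0)  # p=0 keeps everything unchanged
```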

SLIDE 17

Evaluation Terms

  • Mean Reciprocal Rank
    • Explained well in Sec. 4.1
  • Two-tailed t-test
    • Statistical significance test
    • Tests whether the target is higher OR lower than the reference.
  • Cohen’s D
    • Effect size
    • Used with the t-test
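Mean Reciprocal Rank in a few lines (the rank list is a made-up example; see Sec. 4.1 of the paper for how it is applied to code completion):

```python
def mean_reciprocal_rank(first_correct_ranks):
    """For each query, score 1/rank of the first correct suggestion
    (1-based), then average over queries. 1.0 means the correct
    answer was always ranked first."""
    return sum(1.0 / r for r in first_correct_ranks) / len(first_correct_ranks)

mrr = mean_reciprocal_rank([1, 2, 4])  # (1 + 0.5 + 0.25) / 3
```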
SLIDE 18

Thanks!