Exploring API Embedding for API Usages and Applications Yi Chang - - PowerPoint PPT Presentation



SLIDE 1

Exploring API Embedding for API Usages and Applications

Yi Chang

Trong Duc Nguyen, Anh Tuan Nguyen, Hung Dang Phan, and Tien N. Nguyen. 2017. Exploring API embedding for API usages and applications. In Proceedings of the 39th International Conference on Software Engineering (ICSE '17). IEEE Press, Piscataway, NJ, USA, 438-449. DOI: https://doi.org/10.1109/ICSE.2017.47

SLIDE 2

Stochastic Gradient Descent

Stochastic gradient descent minimizes an objective function that is written as a sum of differentiable functions.

source: https://en.wikipedia.org/wiki/Stochastic_gradient_descent
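The idea can be sketched with a toy least-squares objective (an illustration, not taken from the paper): each update uses the gradient of a single summand rather than the gradient of the full sum.

```python
import random

# Toy data: y = 2*x exactly, so the objective
# Q(w) = sum_i (w*x_i - y_i)^2 is a sum of differentiable per-example terms.
data = [(x, 2.0 * x) for x in range(1, 11)]

w = 0.0        # parameter to learn
lr = 0.001     # learning rate
random.seed(0)

for epoch in range(100):
    random.shuffle(data)               # stochastic: visit examples in random order
    for x, y in data:
        grad = 2.0 * (w * x - y) * x   # gradient of one summand only
        w -= lr * grad                 # update using that single-example gradient

print(round(w, 3))  # converges to 2.0
```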

SLIDE 3

AST (Abstract Syntax Tree)

Source: https://en.wikipedia.org/wiki/Abstract_syntax_tree

Tree representation of the abstract syntactic structure of source code. The authors traverse an AST to build API sequences according to the syntactic units related to APIs. Each node of the tree denotes a construct occurring in the source code. 'Abstract' here means that the tree does not represent every detail appearing in the real syntax; for example, grouping parentheses are already implicit in the tree structure.
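As a rough illustration of such a traversal (using Python's `ast` module, not the paper's Java tooling), one can walk the tree and collect call names as a flat API sequence:

```python
import ast

# A minimal sketch: traverse an AST and record the names of call
# nodes, in order, as a flat "API sequence".
source = """
f = open("data.txt")
text = f.read()
f.close()
"""

api_sequence = []
for node in ast.walk(ast.parse(source)):
    if isinstance(node, ast.Call):
        # A method call like f.read() has an Attribute func node;
        # a plain call like open(...) has a Name func node.
        if isinstance(node.func, ast.Attribute):
            api_sequence.append(node.func.attr)
        elif isinstance(node.func, ast.Name):
            api_sequence.append(node.func.id)

print(api_sequence)  # ['open', 'read', 'close']
```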

SLIDE 4

Word2Vec — CBOW model

Word embeddings capture multiple degrees of similarity: simple syntactic regularities can be recovered with simple algebraic operations, like vector('King') - vector('man') + vector('woman') ≈ vector('Queen'). The Continuous Bag-of-Words (CBOW) model predicts/generates a center word from the surrounding context:

  • 1. Generate a one-hot index vector (1×V) for each word in the input context of size 2n (the context window's size) and feed it into the model.
  • 2. Compute the hidden layer h: average the context vectors, then take the product of the average vector and the input-to-hidden weight matrix.
  • 3. Compute the inputs to each node in the output layer (the product of h and the hidden-to-output weight matrix).
  • 4. Pass the result through the soft-max function to obtain a probability distribution over the vocabulary.

Learn the weight matrices with back-propagation, using stochastic gradient descent on the loss function.

Source: http://mccormickml.com/assets/word2vec/Alex_Minnaar_Word2Vec_Tutorial_Part_II_The_Continuous_Bag-of-Words_Model.pdf
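The four steps above can be sketched with a tiny NumPy model (toy vocabulary and dimensions, nothing like Google's 300-feature setup, and a single training pair for illustration):

```python
import numpy as np

np.random.seed(0)
V, H = 5, 3                           # vocabulary size, hidden (embedding) size
W_in = np.random.randn(V, H) * 0.1    # input-to-hidden weights (the embeddings)
W_out = np.random.randn(H, V) * 0.1   # hidden-to-output weights

def one_hot(i):
    v = np.zeros(V)
    v[i] = 1.0
    return v

context, center = [0, 1, 3, 4], 2     # predict word 2 from its 2n = 4 neighbours
lr = 0.1

for _ in range(1000):
    x = np.mean([one_hot(i) for i in context], axis=0)  # step 1: average one-hot context
    h = x @ W_in                                        # step 2: hidden layer h
    u = h @ W_out                                       # step 3: output-layer inputs
    p = np.exp(u - u.max()); p /= p.sum()               # step 4: soft-max
    e = p - one_hot(center)                             # prediction error
    grad_h = e @ W_out.T
    W_out -= lr * np.outer(h, e)                        # back-propagate with SGD
    W_in  -= lr * np.outer(x, grad_h)

print(int(np.argmax(p)))  # 2: the center word now gets the highest probability
```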

SLIDE 5

Word2Vec — CBOW model (cont)

Characteristics of API embeddings:

  • 1. Nearby vectors represent APIs used in similar contexts
  • 2. Similar vector offsets reflect similar relations between APIs
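These two characteristics can be illustrated with hand-made vectors; the API names (FileReader.new, FileWriter.write, etc.) and the 3-d coordinates below are invented for the example, not learned embeddings:

```python
import numpy as np

# Invented 3-d "embeddings" showing the two properties: APIs with similar
# contexts lie near each other, and similar relations share a vector offset.
emb = {
    "FileReader.new":   np.array([1.0, 0.1, 0.0]),
    "FileReader.read":  np.array([1.0, 0.9, 0.0]),
    "FileWriter.new":   np.array([0.0, 0.1, 1.0]),
    "FileWriter.write": np.array([0.0, 0.9, 1.0]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Offset analogy: new -> read for a reader should map new -> write for a writer
query = emb["FileReader.read"] - emb["FileReader.new"] + emb["FileWriter.new"]
best = max(emb, key=lambda w: cosine(emb[w], query))
print(best)  # FileWriter.write
```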
SLIDE 6

Word2Vec — Skip-gram model

Given the center word, the model predicts/generates the surrounding words.

Source: http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/

Architecture: 300 features is what Google used in its published model trained on the Google News dataset.

The output layer is a soft-max regression classifier. With a window size of two, the neural network is trained by feeding it (center word, context word) pairs.
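A minimal sketch of how those (center word, context word) training pairs are generated with a window size of two:

```python
# For each center word, emit one (center, context) pair for every
# neighbour within the window, skipping the center word itself.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

sentence = "the quick brown fox jumps".split()
print(skipgram_pairs(sentence)[:4])
# [('the', 'quick'), ('the', 'brown'), ('quick', 'the'), ('quick', 'brown')]
```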

SLIDE 7

BLEU (Bilingual Evaluation Understudy)

Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (ACL '02). Association for Computational Linguistics, Stroudsburg, PA, USA, 311-318. DOI: https://doi.org/10.3115/1073083.1073135

Anh Tuan Nguyen, Tung Thanh Nguyen, and Tien N. Nguyen. 2013. Lexical statistical machine translation for language migration. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 2013). ACM, New York, NY, USA, 651-654. DOI: https://doi.org/10.1145/2491411.2494584

BLEU is a popular metric in Statistical Machine Translation (SMT) that measures translation quality by the accuracy of translating n-grams for various values of n. BLEU is applied here to evaluate lexical matching when migrating equivalent API usage sequences.

Usually, N is set to 4 and w_n = 1/N. The score is BLEU = BP · exp(sum over n of w_n · log p_n), where BP is the brevity penalty (BP = 1 if c > r, otherwise e^(1 − r/c)), c is the length of the candidate translation, and r is the reference corpus length. p_n measures the overlap between the bag of n-grams appearing in the candidate sentences and the bag appearing in the reference sentences.
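A compact sentence-level sketch of the formula (real BLEU is computed at corpus level, and the API-token example below is invented):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Bag of n-grams of a token sequence, with multiplicities."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, N=4):
    c, r = len(candidate), len(reference)
    log_p = 0.0
    for n in range(1, N + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        # Clipped overlap: each candidate n-gram counts at most as often
        # as it appears in the reference (the modified precision p_n).
        overlap = sum(min(count, ref[g]) for g, count in cand.items())
        total = max(1, sum(cand.values()))
        if overlap == 0:
            return 0.0           # an empty overlap zeroes the geometric mean
        log_p += math.log(overlap / total) / N   # uniform weights w_n = 1/N
    bp = 1.0 if c > r else math.exp(1.0 - r / c)  # brevity penalty
    return bp * math.exp(log_p)

cand = "new FileReader read close".split()
print(round(bleu(cand, cand), 2))  # 1.0 for identical sequences
```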

SLIDE 8

Thanks