SLIDE 1

Introduction to Teacher Forcing

MACHINE TRANSLATION IN PYTHON

Thushan Ganegedara

Data Scientist and Author

SLIDE 2

MACHINE TRANSLATION IN PYTHON

The previous machine translator model

The previous model:
- Encoder GRU: consumes English words; outputs a context vector
- Decoder GRU: consumes the context vector; outputs a sequence of GRU outputs
- Decoder prediction layer: consumes the sequence of GRU outputs; outputs prediction probabilities for French words
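For reference, here is a minimal Keras sketch of that architecture. It assumes, as in the earlier chapters, that the context vector is repeated at every decoder timestep with RepeatVector; the sizes (en_len, fr_len, the vocabularies, hsize) are illustrative placeholders, not the course's actual values.

import tensorflow.keras.layers as layers
from tensorflow.keras.models import Model

en_len, fr_len = 15, 15          # illustrative sequence lengths
en_vocab, fr_vocab = 150, 200    # illustrative vocabulary sizes
hsize = 48                       # illustrative hidden size

# Encoder: consumes onehot English words, outputs a context vector
en_inputs = layers.Input(shape=(en_len, en_vocab))
en_out, en_state = layers.GRU(hsize, return_state=True)(en_inputs)

# Decoder: consumes the context vector, repeated at every timestep
de_repeat = layers.RepeatVector(fr_len)(en_state)
de_out = layers.GRU(hsize, return_sequences=True)(de_repeat)

# Prediction layer: outputs prediction probabilities for French words
de_pred = layers.TimeDistributed(
    layers.Dense(fr_vocab, activation='softmax'))(de_out)

nmt = Model(inputs=en_inputs, outputs=de_pred)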

SLIDE 3

MACHINE TRANSLATION IN PYTHON

Analogy: Training without Teacher Forcing

SLIDE 4

MACHINE TRANSLATION IN PYTHON

Analogy: Training without Teacher Forcing

SLIDE 5

MACHINE TRANSLATION IN PYTHON

Analogy: Training without Teacher Forcing

SLIDE 6

MACHINE TRANSLATION IN PYTHON

Analogy: Training with Teacher Forcing

SLIDE 7

MACHINE TRANSLATION IN PYTHON

Analogy: Training with Teacher Forcing

SLIDE 8

MACHINE TRANSLATION IN PYTHON

Analogy: Training with Teacher Forcing

SLIDE 9

MACHINE TRANSLATION IN PYTHON

Analogy: Training with Teacher Forcing

SLIDE 10

MACHINE TRANSLATION IN PYTHON

The previous machine translator model

Comparison: the previous model vs. the teacher-forced model.

SLIDE 11

MACHINE TRANSLATION IN PYTHON

Implementing the model with Teacher Forcing

Encoder

en_inputs = layers.Input(shape=(en_len, en_vocab))
en_gru = layers.GRU(hsize, return_state=True)
en_out, en_state = en_gru(en_inputs)

Decoder GRU

de_inputs = layers.Input(shape=(fr_len-1, fr_vocab))
de_gru = layers.GRU(hsize, return_sequences=True)
de_out = de_gru(de_inputs, initial_state=en_state)

SLIDE 12

MACHINE TRANSLATION IN PYTHON

Inputs and outputs

- Encoder input - e.g. I, like, dogs
- Decoder input - e.g. J'aime, les
- Decoder output - e.g. les, chiens
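The decoder's targets are its inputs shifted one word ahead. A toy illustration of this shift in plain Python, assuming the French sentences are wrapped in sos/eos markers as in the preprocessing later in this chapter (the tokens here are made up):

fr_words = ['sos', "j'aime", 'les', 'chiens', 'eos']
decoder_inputs = fr_words[:-1]   # ['sos', "j'aime", 'les', 'chiens']
decoder_targets = fr_words[1:]   # ["j'aime", 'les', 'chiens', 'eos']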

SLIDE 13

MACHINE TRANSLATION IN PYTHON

Implementing the model with Teacher Forcing

Encoder

en_inputs = layers.Input(shape=(en_len, en_vocab))
en_gru = layers.GRU(hsize, return_state=True)
en_out, en_state = en_gru(en_inputs)

Decoder GRU

de_inputs = layers.Input(shape=(fr_len-1, fr_vocab))
de_gru = layers.GRU(hsize, return_sequences=True)
de_out = de_gru(de_inputs, initial_state=en_state)

Decoder Prediction

de_dense = layers.TimeDistributed(layers.Dense(fr_vocab, activation='softmax'))
de_pred = de_dense(de_out)

SLIDE 14

MACHINE TRANSLATION IN PYTHON

Compiling the model

nmt_tf = Model(inputs=[en_inputs, de_inputs], outputs=de_pred)
nmt_tf.compile(optimizer='adam', loss="categorical_crossentropy", metrics=["acc"])
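To confirm the two inputs and the prediction output are wired as intended, printing the model summary is a quick check:

nmt_tf.summary()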

SLIDE 15

MACHINE TRANSLATION IN PYTHON

Preprocessing data

Encoder Inputs - All English words (onehot encoded)

en_x = sents2seqs('source', en_text, onehot=True, reverse=True)

Decoder

de_xy = sents2seqs('target', fr_text, onehot=True)

Inputs - All French words except the last word (onehot encoded)

de_x = de_xy[:,:-1,:]

Outputs/Targets - All French words except the first word (onehot encoded)

de_y = de_xy[:,1:,:]
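A quick sanity check on the resulting array shapes (the actual sizes depend on your dataset; the comments just state the expected pattern):

print(en_x.shape)   # (num_sentences, en_len, en_vocab)
print(de_x.shape)   # (num_sentences, fr_len-1, fr_vocab)
print(de_y.shape)   # (num_sentences, fr_len-1, fr_vocab)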

SLIDE 16

Let's practice!

MACHINE TRANSLATION IN PYTHON

SLIDE 17

Training the model with Teacher Forcing

MACHINE TRANSLATION IN PYTHON

Thushan Ganegedara

Data Scientist and Author

SLIDE 18

MACHINE TRANSLATION IN PYTHON

Model training in detail

Model training requires:
- A loss function (e.g. categorical cross-entropy)
- An optimizer (e.g. Adam)

SLIDE 19

MACHINE TRANSLATION IN PYTHON

Model training in detail

To compute the loss, the following items are required:
- Probabilistic predictions generated from the inputs, shaped [batch_size, seq_len, vocab_size], e.g. [[0.11,...,0.81,0.04], [0.05,...,0.01,0.93], ..., [0.78,...,0.03,0.01]]
- Actual onehot encoded French targets, shaped [batch_size, seq_len, vocab_size], e.g. [[0, ..., 1, 0], [0, ..., 0, 1], ..., [0, ..., 1, 0]]

Cross-entropy measures the difference between the targets and the predicted words. The loss is passed to an optimizer, which changes the model parameters to minimize the loss.
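As a toy illustration of the cross-entropy computation for a single timestep (the vocabulary size and probabilities below are made up):

import numpy as np

pred = np.array([0.1, 0.2, 0.6, 0.1])  # softmax output over a 4-word vocabulary
target = np.array([0, 0, 1, 0])        # onehot encoded true word
loss = -np.sum(target * np.log(pred))  # categorical cross-entropy
print(loss)                            # -log(0.6), about 0.51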

SLIDE 20

MACHINE TRANSLATION IN PYTHON

Training the model with Teacher Forcing

n_epochs, bsize = 3, 250
for ei in range(n_epochs):
    for i in range(0, data_size, bsize):
        # Encoder inputs, decoder inputs and outputs
        en_x = sents2seqs('source', en_text[i:i+bsize], onehot=True, reverse=True)
        de_xy = sents2seqs('target', fr_text[i:i+bsize], onehot=True)
        # Separating decoder inputs and outputs
        de_x = de_xy[:,:-1,:]
        de_y = de_xy[:,1:,:]
        # Training and evaluating on a single batch
        nmt_tf.train_on_batch([en_x, de_x], de_y)
        res = nmt_tf.evaluate([en_x, de_x], de_y, batch_size=bsize, verbose=0)
        print("{} => Train Loss:{}, Train Acc: {}".format(ei+1, res[0], res[1]*100.0))

SLIDE 21

MACHINE TRANSLATION IN PYTHON

Array slicing in detail

de_x = de_xy[:,:-1,:]
de_y = de_xy[:,1:,:]
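A self-contained NumPy demo of what these two slices do (the array sizes are made up for illustration):

import numpy as np

# Toy batch: 2 sentences, 4 timesteps, vocabulary of 3 words
de_xy = np.arange(2 * 4 * 3).reshape(2, 4, 3)
de_x = de_xy[:, :-1, :]  # drop the last timestep  -> shape (2, 3, 3)
de_y = de_xy[:, 1:, :]   # drop the first timestep -> shape (2, 3, 3)
print(de_x.shape, de_y.shape)  # (2, 3, 3) (2, 3, 3)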

SLIDE 22

MACHINE TRANSLATION IN PYTHON

Creating training and validation data

train_size, valid_size = 800, 200
# Creating data indices
inds = np.arange(len(en_text))
np.random.shuffle(inds)
# Separating train and valid indices
train_inds = inds[:train_size]
valid_inds = inds[train_size:train_size+valid_size]
# Extracting train and valid data
tr_en = [en_text[ti] for ti in train_inds]
tr_fr = [fr_text[ti] for ti in train_inds]
v_en = [en_text[vi] for vi in valid_inds]
v_fr = [fr_text[vi] for vi in valid_inds]
print('Training (EN):\n', tr_en[:2], '\nTraining (FR):\n', tr_fr[:2])
print('\nValid (EN):\n', v_en[:2], '\nValid (FR):\n', v_fr[:2])

SLIDE 23

MACHINE TRANSLATION IN PYTHON

Training with validation

for ei in range(n_epochs):
    for i in range(0, train_size, bsize):
        en_x = sents2seqs('source', tr_en[i:i+bsize], onehot=True, reverse=True)
        de_xy = sents2seqs('target', tr_fr[i:i+bsize], onehot=True)
        de_x, de_y = de_xy[:,:-1,:], de_xy[:,1:,:]
        nmt_tf.train_on_batch([en_x, de_x], de_y)
    v_en_x = sents2seqs('source', v_en, onehot=True, reverse=True)
    v_de_xy = sents2seqs('target', v_fr, onehot=True)
    v_de_x, v_de_y = v_de_xy[:,:-1,:], v_de_xy[:,1:,:]
    res = nmt_tf.evaluate([v_en_x, v_de_x], v_de_y, batch_size=valid_size, verbose=0)
    print("Epoch {} => Loss:{}, Val Acc: {}".format(ei+1, res[0], res[1]*100.0))

Output:

Epoch 1 => Loss:4.784221172332764, Val Acc: 1.4999999664723873
Epoch 2 => Loss:4.716882228851318, Val Acc: 44.458332657814026
Epoch 3 => Loss:4.63267183303833, Val Acc: 47.333332896232605

SLIDE 24

Let's train!

MACHINE TRANSLATION IN PYTHON

SLIDE 25

Generating translations from the model

MACHINE TRANSLATION IN PYTHON

Thushan Ganegedara

Data Scientist and Author

SLIDE 26

MACHINE TRANSLATION IN PYTHON

Previous model vs new model

SLIDE 27

MACHINE TRANSLATION IN PYTHON

Trained model

SLIDE 28

MACHINE TRANSLATION IN PYTHON

Decoder of the inference model

Takes in:
- A onehot encoded word
- A state input (the state from the previous timestep)

Produces:
- A new state
- A prediction (i.e. a word)

Recursively feed the predicted word and the state back to the model as inputs.

SLIDE 29

MACHINE TRANSLATION IN PYTHON

Full inference model

Comparison: the inference model with the recursive decoder vs. the inference model from the previous chapter.

SLIDE 30

MACHINE TRANSLATION IN PYTHON

Value of sos and eos tokens

sos marks the beginning of a translation (i.e. a French sentence).

Feed in sos as the first word to the decoder and keep predicting.

eos marks the end of a translation.

Predictions stop when the word predicted by the model is eos. As a safety measure, cap the maximum length the model is allowed to predict.

SLIDE 31

MACHINE TRANSLATION IN PYTHON

Defining the generator encoder

Importing layers and Model

# Import Keras layers
import tensorflow.keras.layers as layers
from tensorflow.keras.models import Model

Defining model layers

en_inputs = layers.Input(shape=(en_len, en_vocab))
en_gru = layers.GRU(hsize, return_state=True)
en_out, en_state = en_gru(en_inputs)

Defining the Model object

encoder = Model(inputs=en_inputs, outputs=en_state)

SLIDE 32

MACHINE TRANSLATION IN PYTHON

Defining the generator decoder

Defining the decoder Input layers

de_inputs = layers.Input(shape=(1, fr_vocab))
de_state_in = layers.Input(shape=(hsize,))

Defining the decoder's interim layers

de_gru = layers.GRU(hsize, return_state=True)
de_out, de_state_out = de_gru(de_inputs, initial_state=de_state_in)
de_dense = layers.Dense(fr_vocab, activation='softmax')
de_pred = de_dense(de_out)

Defining the decoder Model

decoder = Model(inputs=[de_inputs, de_state_in], outputs=[de_pred, de_state_out])

SLIDE 33

MACHINE TRANSLATION IN PYTHON

Copying the weights

Get weights of the layer l1

w = l1.get_weights()

Set the weights of the layer l2 with w

l2.set_weights(w)

In our model, there are three layers with weights: the Encoder GRU, the Decoder GRU and the Decoder Dense layer. For the encoder GRU:

en_gru_w = tr_en_gru.get_weights()
en_gru.set_weights(en_gru_w)

This can also be written as:

en_gru.set_weights(tr_en_gru.get_weights())
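Applying the same pattern to the remaining two layers would look like the sketch below; tr_de_gru and tr_de_dense are hypothetical names for the trained decoder GRU and Dense layers, following the tr_ prefix used above.

de_gru.set_weights(tr_de_gru.get_weights())      # trained decoder GRU -> inference decoder GRU
de_dense.set_weights(tr_de_dense.get_weights())  # trained Dense -> inference Dense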

SLIDE 34

MACHINE TRANSLATION IN PYTHON

Generating translations

en_sent = ['the united states is sometimes chilly during december , but it is sometimes freezing in june .']

Converting the English sentence to a sequence

en_seq = sents2seqs('source', en_sent, onehot=True, reverse=True)

Getting the context vector

de_s_t = encoder.predict(en_seq)

Converting "sos" (initial word to the decoder) to a sequence

de_seq = word2onehot(fr_tok, 'sos', fr_vocab)

SLIDE 35

MACHINE TRANSLATION IN PYTHON

Generating translations

fr_sent = ''
for _ in range(fr_len):
    de_prob, de_s_t = decoder.predict([de_seq, de_s_t])
    de_w = probs2word(de_prob, fr_tok)
    de_seq = word2onehot(fr_tok, de_w, fr_vocab)
    if de_w == 'eos':
        break
    fr_sent += de_w + ' '
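probs2word and word2onehot are helpers provided by the course. A minimal sketch of what they might look like, assuming fr_tok is a Keras Tokenizer; this is an assumption for illustration, not the course's actual code:

import numpy as np

def word2onehot(tok, word, vocab_size):
    # Map the word to its integer id, then onehot encode it with
    # shape (1, 1, vocab_size): one sentence, one timestep
    wid = tok.word_index[word]
    onehot = np.zeros((1, 1, vocab_size))
    onehot[0, 0, wid] = 1.0
    return onehot

def probs2word(probs, tok):
    # Pick the most probable word id and map it back to its word string
    wid = int(np.argmax(probs))
    return tok.index_word[wid]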

SLIDE 36

Time to translate!

MACHINE TRANSLATION IN PYTHON

SLIDE 37

Using word embedding for machine translation

MACHINE TRANSLATION IN PYTHON

Thushan Ganegedara

Data Scientist and Author

SLIDE 38

MACHINE TRANSLATION IN PYTHON

Introduction to word embeddings

One hot encoded vectors

cat_vector = np.array([[1,0,0,0...,0]])
dog_vector = np.array([[0,1,0,0...,0]])
window_vector = np.array([[0,0,1,0...,0]])

Word vectors

cat_vector = np.array([[0.393,-0.263,0.086,0.011,-0.322,...,0.388]])
dog_vector = np.array([[0.399,-0.300,0.047,-0.059,-0.111,...,0.037]])
window_vector = np.array([[0.133,0.149,-0.307,0.090,-0.143,...,0.526]])

SLIDE 39

MACHINE TRANSLATION IN PYTHON

Similarity between word vectors

from sklearn.metrics.pairwise import cosine_similarity

cat_vector = np.array([[0.393,-0.263,0.086,0.011,-0.322,...,0.388]])
dog_vector = np.array([[0.399,-0.300,0.047,-0.059,-0.111,...,0.037]])
window_vector = np.array([[0.133,0.149,-0.307,0.090,-0.143,...,0.526]])

cosine_similarity(cat_vector, dog_vector)    # 0.601
cosine_similarity(cat_vector, window_vector) # 0.323
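Under the hood, cosine similarity is the dot product of the two vectors divided by the product of their norms. A self-contained sketch with short made-up vectors:

import numpy as np

def cosine(a, b):
    # Dot product divided by the product of the vector norms
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([0.393, -0.263, 0.086])
b = np.array([0.399, -0.300, 0.047])
print(cosine(a, b))  # close to 1.0 for vectors pointing in similar directions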

https://nlp.stanford.edu/projects/glove/


SLIDE 40

MACHINE TRANSLATION IN PYTHON

Implementing embeddings for the encoder

Without an embedding layer:

en_inputs = Input(shape=(en_len, en_vocab))

SLIDE 41

MACHINE TRANSLATION IN PYTHON

Implementing embeddings for the encoder

With an embedding layer:

en_inputs = Input(shape=(en_len,))

SLIDE 42

MACHINE TRANSLATION IN PYTHON

Implementing embeddings for the encoder

en_inputs = Input(shape=(en_len,))
en_emb = Embedding(en_vocab, 96, input_length=en_len)(en_inputs)
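With the Embedding layer in place, the encoder now takes integer word ids rather than onehot vectors; a toy batch might look like the following (the ids are made up):

import numpy as np

# Two sentences, each a sequence of en_len=4 word ids (0 = padding)
toy_batch = np.array([[4, 12, 7, 0],
                      [9,  3, 2, 0]])  # shape (2, 4)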

SLIDE 43

MACHINE TRANSLATION IN PYTHON

Implementing the encoder with embedding

en_inputs = Input(shape=(en_len,))
en_emb = Embedding(en_vocab, 96, input_length=en_len)(en_inputs)
en_out, en_state = GRU(hsize, return_state=True)(en_emb)

SLIDE 44

MACHINE TRANSLATION IN PYTHON

Implementing the decoder with embedding

de_inputs = Input(shape=(fr_len-1,))
de_emb = Embedding(fr_vocab, 96, input_length=fr_len-1)(de_inputs)
de_out, _ = GRU(hsize, return_sequences=True, return_state=True)(de_emb, initial_state=en_state)

SLIDE 45

MACHINE TRANSLATION IN PYTHON

Training the model

for ei in range(3):
    for i in range(0, train_size, bsize):
        en_x = sents2seqs('source', tr_en[i:i+bsize], onehot=False, reverse=True)
        de_xy = sents2seqs('target', tr_fr[i:i+bsize], onehot=False)
        de_x = de_xy[:,:-1]
        de_xy_oh = sents2seqs('target', tr_fr[i:i+bsize], onehot=True)
        de_y = de_xy_oh[:,1:,:]
        nmt_emb.train_on_batch([en_x, de_x], de_y)
        res = nmt_emb.evaluate([en_x, de_x], de_y, batch_size=bsize, verbose=0)
        print("{} => Loss:{}, Train Acc: {}".format(ei+1, res[0], res[1]*100.0))

SLIDE 46

Let's practice!

MACHINE TRANSLATION IN PYTHON

SLIDE 47

Wrap-up and the final showdown

MACHINE TRANSLATION IN PYTHON

Thushan Ganegedara

Data Scientist and Author

SLIDE 48

MACHINE TRANSLATION IN PYTHON

What you've done so far

Chapter 1:
- Introduction to the encoder-decoder architecture
- Understanding the GRU layer

Chapter 2:
- Implementing the encoder
- Implementing the decoder
- Implementing the decoder prediction layer

SLIDE 49

MACHINE TRANSLATION IN PYTHON

What you've done so far

Chapter 3:
- Preprocessing data
- Training the machine translation model
- Generating translations

Chapter 4:
- Introduction to teacher forcing
- Training a model with teacher forcing
- Generating translations
- Using word embeddings for machine translation

SLIDE 50

MACHINE TRANSLATION IN PYTHON

Machine translation models

Model 1:
- The encoder consumes English words (onehot encoded) and outputs a context vector
- The decoder consumes the context vector and outputs the translation

Model 2:
- The encoder consumes English words (onehot encoded) and outputs a context vector
- The decoder consumes a given word (onehot encoded) of the translation and predicts the next word

Model 3:
- Instead of onehot encoding, uses word vectors
- Word vectors capture the semantic relationships between words

SLIDE 51

MACHINE TRANSLATION IN PYTHON

Performance of different models

SLIDE 52

MACHINE TRANSLATION IN PYTHON

Latest developments and further reading

Evaluating machine translation models:
- BLEU score (Papineni et al., "BLEU: a Method for Automatic Evaluation of Machine Translation")

Word piece models:
- Enable the model to avoid out-of-vocabulary words (Sennrich et al., "Neural Machine Translation of Rare Words with Subword Units")

Transformer models (Vaswani et al., "Attention Is All You Need"):
- State-of-the-art performance on many NLP tasks, including machine translation
- Have an encoder-decoder architecture, but do not use sequential models
- The latest Google machine translator is a Transformer model

SLIDE 53

All the best!

MACHINE TRANSLATION IN PYTHON