Using Chapel for Natural Language Processing And Interaction (PowerPoint presentation)



SLIDE 1

Using Chapel for Natural Language Processing And Interaction

Brian Guarraci CTO @ Cricket Health

SLIDE 2

Motivation

  • Augment chatbot human-created rulesets with data
  • ChatScript provides a powerful rule engine, but hand-writing rules is unscalable and limited
  • Use Chapel as a power tool to create datasets that can be plugged into the ChatScript engine
  • Focus on two main types of custom datasets
  • Chord: use word2vec for language support
  • Chriple: use RDF triple stores for knowledge
SLIDE 3

Chord: Chapel + Word2Vec

  • Word embeddings are vectors computed with a Neural Network Language Model (NNLM)
  • Each word vector characterizes the associated word in relation to the training data and the other words in the vocabulary
  • Vectors have interesting and useful NLP features
  • King - Man + Woman = Queen
  • Tokyo - Japan + France = Paris
  • Replace human-derived rules for certain NLP tasks
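The analogy arithmetic above can be sketched in a few lines. This is an illustrative Python toy, not the Chord/Chapel code: the tiny 3-dimensional vectors are hand-built stand-ins for learned embeddings, which in real word2vec have hundreds of dimensions.

```python
import math

# Toy 3-d "embeddings" (illustrative only; real word2vec vectors are
# learned by the NNLM, not hand-crafted).
vecs = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def analogy(a, b, c):
    """Return the vocabulary word closest to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vecs[a], vecs[b], vecs[c])]
    # Exclude the query words themselves, as the word2vec demo does.
    candidates = (w for w in vecs if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vecs[w], target))

print(analogy("king", "man", "woman"))  # → queen
```

The nearest-neighbor search over cosine similarity is exactly what makes these vectors usable as a replacement for hand-written synonym and relation rules.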
SLIDE 4

Chord: Path to Distributed

  • First: Port Google’s single-locale classic word2vec and validate
  • Second: Port the classic model to a multi-locale model
  • Maintain single-locale performance in the multi-locale version
  • Preserve asynchronous SGD (race conditions by design)
  • Encapsulate globals to ensure locale-local-only access
  • Experiment with dmapped and other distributed-memory strategies to find a fast method for cross-machine data sharing

SLIDE 5

Chord: Path to Distributed

  • Distributed models require periodic model sharing across locales
  • A naïve dmapped approach is very slow: model-specific access patterns yield excessive cross-machine data transfers
  • Use a variant of Google’s Downpour SGD
  • Reserve some locales as “parameter locales” and the others as compute locales, which train on data shards
  • Each compute locale diverges with its training data and updates the parameter locales after each training iteration
  • Use AdaGrad to perform model updates on the parameter locales
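The Downpour-style loop can be sketched as follows. This is a minimal single-process Python illustration of the idea, not the Chord implementation: `ParameterLocale`, `apply_delta`, and the stand-in gradient are invented names, and the real system runs in Chapel with asynchronous communication between actual locales.

```python
import math

class ParameterLocale:
    """Sketch of the parameter-locale role: it owns the shared weights
    and applies gradient deltas from compute locales using AdaGrad's
    per-coordinate learning-rate scaling."""
    def __init__(self, dim, lr=0.1, eps=1e-8):
        self.w = [0.0] * dim    # shared model parameters
        self.g2 = [0.0] * dim   # AdaGrad: running sum of squared gradients
        self.lr = lr
        self.eps = eps

    def apply_delta(self, grad):
        # Scale each coordinate by 1/sqrt(accumulated squared gradient),
        # so frequently-updated coordinates get smaller steps.
        for i, g in enumerate(grad):
            self.g2[i] += g * g
            self.w[i] -= self.lr * g / (math.sqrt(self.g2[i]) + self.eps)

    def snapshot(self):
        return list(self.w)  # the w' pulled down by compute locales

# One compute locale's training iteration: pull w', compute a gradient
# on its data shard (faked here), push the delta back to the param side.
param = ParameterLocale(dim=2)
for _ in range(3):
    w_local = param.snapshot()
    fake_grad = [w_local[0] - 1.0, w_local[1] + 2.0]  # stand-in gradient
    param.apply_delta(fake_grad)
```

In Downpour fashion, several compute locales would run this loop concurrently against the same parameter locale, and their updates are allowed to interleave (the "race conditions by design" preserved from asynchronous SGD).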
SLIDE 6

Chord: Architecture

Locales are partitioned into param and compute roles. Diagram: locales 1…P act as parameter locales and locales P+1…N as compute locales. Each compute locale pulls the current weights w’ from the parameter locales, trains on one of K data shards, and pushes gradient updates Δw back.

SLIDE 7

Chord: Single vs Multi-Locale

Multi-Locale version > 3x faster with similar accuracy (eventually).

Charts: training speed (seconds, 0–1400) and model accuracy (percent correct, 0–90) over iterations 1–15, comparing the multi-locale and single-locale versions.

Multi-locale configuration:

  • 8 locales: single parameter locale with seven compute locales
  • Machine type: EC2 m4.2xlarge (8 vCPU, 32GB RAM)
SLIDE 8

Chriple: Chapel + Triple Store

  • Keep it simple to learn what’s useful
  • Naïve implementation inspired by TripleBit
  • Reasonably memory efficient
  • Predicate-based hash partitions on locales
  • CHASM (from Chearch) stack-based integer query language
  • Supports essential distributed query primitives (AND/OR)
  • Supports sub-graph extraction
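The predicate-based hash partitioning above can be sketched in a few lines. This is an illustrative Python toy under assumed names (`locale_for`, `insert`, `query`, and the 8-locale cluster size are inventions for the sketch); Chriple itself is Chapel code with real distributed storage.

```python
from collections import defaultdict

NUM_LOCALES = 8  # illustrative cluster size

def locale_for(predicate):
    # Predicate-based hash partition: every triple sharing a predicate
    # lands on the same locale, so a single-predicate query touches
    # exactly one locale.
    return predicate % NUM_LOCALES

# partitions[locale][predicate] -> list of (subject, object) pairs
partitions = [defaultdict(list) for _ in range(NUM_LOCALES)]

def insert(subject, predicate, obj):
    partitions[locale_for(predicate)][predicate].append((subject, obj))

def query(predicate):
    # Routed to a single partition; queries spanning predicates are
    # combined at the top level with the AND/OR primitives.
    return partitions[locale_for(predicate)][predicate]

insert(1, 42, 7)
insert(2, 42, 9)
insert(3, 50, 7)
print(query(42))  # [(1, 7), (2, 9)]
```

Hashing on the predicate keeps each predicate's data co-located, which is what lets a partition query run entirely locale-local before results are merged.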
SLIDE 9

Chriple: Architecture

Diagram: each locale holds a predicate hash partition. A predicate entry in the hash table carries two indexes, subject-object and object-subject; each 64-bit index entry packs a 32-bit subject ID and a 32-bit object ID.
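The 64-bit index entry can be illustrated with simple bit packing. The exact field order (subject in the high half for the subject-object index) is an assumption for this sketch, not confirmed Chriple layout:

```python
MASK32 = 0xFFFFFFFF

def pack_so(subject_id, object_id):
    # Subject-object index entry: assumed layout with the 32-bit subject
    # ID in the high half and the 32-bit object ID in the low half.
    # The object-subject index would swap the two roles.
    return ((subject_id & MASK32) << 32) | (object_id & MASK32)

def unpack_so(entry):
    return (entry >> 32) & MASK32, entry & MASK32

entry = pack_so(7, 9)
print(unpack_so(entry))  # (7, 9)
```

Packing both IDs into one machine word is what keeps the store near 16 bytes per triple: one 64-bit entry per index, two indexes per triple.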

SLIDE 10

Chriple: Distributed Queries

Diagram: a top-level query Qtop fans out partition queries Q1…QN to the N predicate partitions (locales). An in-memory partition holds the results from the partition queries.
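The top-level merge of partition results can be sketched with the two essential primitives. This is a Python illustration of the fan-in step only (the name `qtop` and the use of entity-ID sets are assumptions); the real CHASM queries are stack-based integer programs executed across locales in Chapel.

```python
def qtop(results, op):
    """Combine per-partition result sets: OR unions the partition
    results, AND intersects them."""
    sets = [set(r) for r in results]
    if not sets:
        return set()
    if op == "OR":
        return set().union(*sets)
    if op == "AND":
        return set.intersection(*sets)
    raise ValueError(f"unknown op: {op}")

r1 = [1, 2, 3]  # results from partition query Q1
r2 = [2, 3, 4]  # results from partition query Q2
print(sorted(qtop([r1, r2], "OR")))   # [1, 2, 3, 4]
print(sorted(qtop([r1, r2], "AND")))  # [2, 3]
```

Because each partition query runs locale-local, only these (typically small) result sets cross the network before the top-level AND/OR combine.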

SLIDE 11

Chriple: Current Results

  • Memory requirements
  • ~16 bytes per triple
  • 2B triples require ~64GB RAM across cluster
  • Performance (8 x EC2 m4.2xlarge [8 vCPU 32GB RAM])
  • 1.1M inserts / s (~137K / locale)
  • 40K reads / s [via parallel iterator] (~5K / locale)
SLIDE 12

AllegroGraph Benchmark

http://franz.com/agraph/allegrograph/agraph_benchmarks.lhtml

SLIDE 13

Conclusion

  • Work in progress
  • Many opportunities for optimization
  • Useful for generating data and experimentation
  • Code is available on GitHub
  • https://github.com/briangu/chord
  • https://github.com/briangu/chriple