Building a chatbot: NLP pipeline and dependency parsing


slide-1
SLIDE 1

Building a chatbot: NLP pipeline and dependency parsing

By: Andrei Şuiu

meetup.com/IASI-AI/ facebook.com/AI.in.Iasi/

slide-2
SLIDE 2

What Is a Chatbot?

Chat robots (chatbots) are computer programs, powered by rules and sometimes by artificial intelligence, that mimic conversation with people via a chat interface.

Applications:

  • Legal consultancy
  • HR services
  • Customer Services
  • Call centres
  • Banks
  • Restaurants
  • Travel Services & Hotels
  • Medical services
slide-3
SLIDE 3

History: first chatbot

ELIZA was created from 1964 to 1966 at the MIT Artificial Intelligence Laboratory by Joseph Weizenbaum.

slide-4
SLIDE 4


Applications: virtual lawyer

slide-5
SLIDE 5

Applications: virtual lawyer

DoNotPay is a chatbot, created by British entrepreneur Joshua Browder, that uses AI to provide free legal advice. It can assist with writing letters and filling out forms.

https://donotpay-search-master.herokuapp.com

By June of 2016, DoNotPay had successfully contested 160,000 parking tickets - a 64% success rate - and earlier this year, Browder added capabilities to assist asylum seekers in the US, UK and Canada. Now, the bot is able to assist with over 1,000 different legal issues in all 50 states and across the UK.

slide-6
SLIDE 6


Applications: Human Resources

slide-7
SLIDE 7


Applications: Human Resources

slide-8
SLIDE 8

Applications: Human Resources

Help new employees to learn & find:

  • Kitchen, coffee
  • Company main internal policies
  • Printer/xerox
  • Company structure
  • Main business processes
  • etc.
slide-9
SLIDE 9
Perception of chatbots

  • You can think of a bot as just another user
  • A bot can be invited to a group and post messages with the help of keywords
  • Bots can have many of the same qualities as their human counterparts:
      ○ names
      ○ profile photos
      ○ can be direct messaged or mentioned
      ○ can post messages or initiate conversation
      ○ upload files, etc.

slide-10
SLIDE 10


Perception of chatbots: responsibility

slide-11
SLIDE 11
Why use chatbots?

  • Human language is a natural way to give commands and ask questions
  • A single point of navigation that offers contextual and personalized information
  • Chatbots give you the opportunity to serve more clients with fewer human resources
  • Chatbots are often faster and more cost-effective than their human counterparts

slide-12
SLIDE 12


Messaging platforms are opening their APIs

slide-13
SLIDE 13
Applications

  • Legal consultancy
  • HR services
  • Customer Services (Emag)
      ○ Cross-selling/up-selling, help with purchase decisions
      ○ Handle objections personally, get customer feedback
      ○ Offer discount codes
      ○ Deliver shipping notifications, out-of-stock notifications
  • Call centres
  • Banks (Livia from BT)
  • Restaurants
  • Travel Services & Hotels (Uber chatbot)
  • Medical services

slide-14
SLIDE 14

Building a chatbot

The key for a bot to communicate efficiently with humans is its ability to understand their intentions, extract the relevant information from those intentions and, of course, take the relevant action on that information. One of the main concerns of NLP is extracting intentions and other relevant information from text.

slide-15
SLIDE 15

Intention identification

Below I propose a simple method for identifying some types of intentions. Generally, you get a Unicode string from the user input, whether it was typed at a keyboard or generated by a speech recognition engine from the audio stream of a phone line. We'll use a technique called semantic role labeling: an NLP task that consists of detecting the semantic arguments associated with the predicate (verb) of a sentence and classifying them into their specific roles. This is an important step towards making sense of the meaning of a sentence.

slide-16
SLIDE 16

Semantic Role Labeling

The predicates sold, bought and purchased represent an event. Semantic roles express the abstract role that the arguments of a predicate can take in the event:

  • Gates - the agent that sells
  • Google - the agent that buys
  • Microsoft stock - the object being transacted

Can we figure out that these sentences have the same meaning?

  • Gates sold Microsoft stock to Google.
  • Google bought Microsoft stock from Gates.
  • The Microsoft stock was sold to Google by Gates.
  • The Microsoft stock was purchased by Google from Gates.
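
To make the idea of roles concrete, here is a minimal Python sketch. It is not a real role labeler: the `FRAMES` table and the hand-reduced sentences are assumptions made up for illustration, and passive sentences are assumed to have been normalized to active voice already. It only shows that once each predicate's slots are mapped onto shared roles, the four sentences collapse to the same event.

```python
# Toy illustration of semantic roles. FRAMES is a hand-written assumption,
# not the output of a real role labeler: it maps each predicate's logical
# slots onto the shared roles of a "transfer" event.
FRAMES = {
    "sell":     {"subject": "seller", "object": "goods", "to": "buyer"},
    "buy":      {"subject": "buyer",  "object": "goods", "from": "seller"},
    "purchase": {"subject": "buyer",  "object": "goods", "from": "seller"},
}


def label_roles(lemma, slots):
    """Translate logical slots (subject, object, preposition) into roles."""
    frame = FRAMES[lemma]
    return {frame[slot]: filler for slot, filler in slots.items()}


# The four sentences, reduced by hand to predicate lemma + logical slots.
# Passive sentences are assumed to be already rewritten in active voice.
sentences = [
    ("sell",     {"subject": "Gates",  "object": "Microsoft stock", "to": "Google"}),
    ("buy",      {"subject": "Google", "object": "Microsoft stock", "from": "Gates"}),
    ("sell",     {"subject": "Gates",  "object": "Microsoft stock", "to": "Google"}),
    ("purchase", {"subject": "Google", "object": "Microsoft stock", "from": "Gates"}),
]

labeled = [label_roles(lemma, slots) for lemma, slots in sentences]
for roles in labeled:
    print(roles)
```

All four entries of `labeled` contain the same seller, buyer and goods, which is exactly the "same meaning" the question asks about. The rest of the pipeline exists to produce the lemma and the slots automatically.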
slide-17
SLIDE 17

NLP Pipeline

An example of an NLP pipeline for role labeling: raw text → sentence tokenization → tokenization → PoS-tagging → lemmatization → dependency parsing → role labeling

slide-18
SLIDE 18


NLP Pipeline

slide-19
SLIDE 19

Sentence tokenization

How would you split sentences in a text?

The periods in "Mr. Smith" and "Google Inc." do not mark sentence boundaries.

  • A period may denote an abbreviation, a decimal point, an ellipsis (...) or part of an email address, not the end of a sentence.
  • About 47% of the periods in the Wall Street Journal corpus denote abbreviations.

Sometimes sentences start with a non-capitalized word: "i is a good variable name." And some sentences are not separated by periods at all!

Sentence Boundary Disambiguation: you can use PTBTokenizer from Stanford CoreNLP for Java, or the Punkt Sentence Tokenizer from NLTK for Python.
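
The pitfalls above can be seen in a small rule-based splitter. This is only a sketch using the Python standard library, not how Punkt or CoreNLP work internally: the `ABBREVIATIONS` set is a hypothetical, hard-coded list, whereas real tools learn or ship far richer models.

```python
import re

# Hypothetical, deliberately tiny abbreviation list; real tokenizers such as
# Punkt learn abbreviations from a corpus instead of hard-coding them.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "inc.", "e.g.", "i.e."}


def split_sentences(text):
    """Split after ., ! or ? followed by whitespace, unless the last token
    before the break is a known abbreviation."""
    sentences, start = [], 0
    for match in re.finditer(r"[.!?]+\s+", text):
        candidate = text[start:match.end()].strip()
        last_token = candidate.split()[-1].lower()
        if last_token in ABBREVIATIONS:
            continue  # the period belongs to an abbreviation, keep scanning
        sentences.append(candidate)
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


text = ("Mr. Smith works at Google Inc. in Iasi. "
        "He paid 3.5 euros. i is a good variable name.")
for sentence in split_sentences(text):
    print(sentence)
```

The decimal point in "3.5" is safe because it is not followed by whitespace, and the lower-case "i" is handled because the rule never looks at capitalization. The sketch still fails when an abbreviation really does end a sentence ("He works at Google Inc. She does too."), which is why trained sentence boundary detectors are preferred in practice.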

slide-20
SLIDE 20


NLP Pipeline

slide-21
SLIDE 21

Word tokenization

How would you split words in a sentence?

We don't want to lose the negative particle (the "n't" in "don't"). And usually, punctuation marks are not part of the words!
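
A minimal regex-based sketch of both points, using only the Python standard library. The pattern below is an illustrative assumption, loosely following the Penn Treebank convention of splitting "don't" into "do" + "n't"; it is not the tokenizer of any particular library.

```python
import re

# One alternative per token type, tried left to right:
#   1. a word stem that is directly followed by the clitic n't
#   2. the n't clitic itself, kept as its own token
#   3. other clitics such as 's, 're, 'll
#   4. numbers, optionally with a decimal part
#   5. plain words
#   6. any other single non-space character (punctuation)
TOKEN_RE = re.compile(
    r"\w+(?=n't)|n't|'(?:s|re|ve|ll|d|m)\b|\d+(?:\.\d+)?|\w+|\S",
    re.IGNORECASE,
)


def tokenize(sentence):
    """Return the list of tokens found in one sentence."""
    return TOKEN_RE.findall(sentence)


print(tokenize("We don't want to lose it, do we?"))
print(tokenize("John's bill isn't 3.5 euros."))
```

The negative particle survives as the separate token "n't", and commas, periods and question marks become tokens of their own instead of sticking to the neighbouring word.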

slide-22
SLIDE 22


NLP Pipeline

slide-23
SLIDE 23

PoS Tagging

A part-of-speech (PoS) tagger is a piece of software that reads text in some language and assigns a part of speech, such as noun, verb or adjective, to each word (and other token). Computational applications generally use more fine-grained PoS tags, like 'proper-noun-plural' or 'verb-past-gerund'. Taggers usually use PoS abbreviations like:

  • NN - noun, singular
  • NNPS - proper noun, plural
  • VBZ - verb, 3rd person singular present
  • JJR - adjective, comparative
  • RBS - adverb, superlative

PoS taggers usually perform tokenization and lemmatization at the same time.

slide-24
SLIDE 24


NLP Pipeline

slide-25
SLIDE 25

Lemmatization & Stemming

The goal of both stemming and lemmatization is to reduce inflectional forms and sometimes derivationally related forms of a word to a common base form. For instance:

  • am, are, is → be
  • car, cars, car's, cars' → car

Stemming usually refers to a crude heuristic process that chops off the ends of words in the hope of achieving this goal correctly most of the time, and often includes the removal of derivational affixes. Lemmatization usually refers to doing things properly with the use of a vocabulary and morphological analysis of words, normally aiming to remove inflectional endings only and to return the base or dictionary form of a word, which is known as the lemma.
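
The contrast can be sketched in a few lines of Python. Both pieces are toy assumptions: the suffix list is far simpler than a real stemmer such as Porter's, and the `LEMMAS` dictionary stands in for a real vocabulary with morphological analysis.

```python
# Crude stemmer: strip the first matching suffix. The suffix list is an
# illustrative assumption, much simpler than the Porter stemmer.
SUFFIXES = ("ing", "ies", "es", "'s", "s'", "s")


def stem(word):
    """Chop off a known suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Lemmatizer: look the word up in a vocabulary. This tiny dictionary is a
# stand-in for a real morphological lexicon.
LEMMAS = {
    "am": "be", "are": "be", "is": "be",
    "cars": "car", "car's": "car", "cars'": "car",
    "ponies": "pony",
}


def lemmatize(word):
    """Return the dictionary form if known, otherwise the word unchanged."""
    return LEMMAS.get(word, word)


for word in ["cars", "car's", "ponies", "is", "are"]:
    print(f"{word:8} stem={stem(word):8} lemma={lemmatize(word)}")
```

The stemmer gets "cars" right but turns "ponies" into "pon" and cannot relate "is" or "are" to "be". The lookup handles all of them, at the cost of needing a vocabulary.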

slide-26
SLIDE 26


NLP Pipeline

slide-27
SLIDE 27

Dependency parsing

A dependency parse connects words according to their relationships. It produces a directed acyclic graph (typically a tree) in which the nodes are words, each edge links a parent (head) word to a word that depends on it, and edges are labeled with the relationship.

The slide shows an example of a graph generated by the Stanford CoreNLP parser.

slide-28
SLIDE 28

Dependency parsing

Another way to represent dependencies. Note the root relation.

The quick brown fox jumps over the lazy dog.

  • root(ROOT-0, jumps-5)
  • det(fox-4, The-1)
  • det(dog-9, the-7)
  • amod(fox-4, brown-3)
  • amod(dog-9, lazy-8)
  • nsubj(jumps-5, fox-4)
  • case(dog-9, over-6)
  • amod(fox-4, quick-2)
  • nmod(jumps-5, dog-9)

slide-29
SLIDE 29

Dependency parsing

Another way to represent dependencies. Note the root relation.

The quick brown fox jumps over the lazy dog.

  • root(ROOT-0, jumps-5)
  • det(fox-4, The-1) - determiner
  • det(dog-9, the-7) - determiner
  • amod(fox-4, brown-3) - adjectival modifier
  • amod(dog-9, lazy-8) - adjectival modifier
  • nsubj(jumps-5, fox-4) - nominal subject: the proto-agent of a clause
  • case(dog-9, over-6) - the case relation is used for any preposition in English
  • amod(fox-4, quick-2) - an adjectival modifier of a nominal is any adjective that serves to modify the meaning of the nominal
  • nmod(jumps-5, dog-9) - the nominal modifier relation is used for nominal modifiers of nouns or clausal predicates
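
The relation(head-index, dependent-index) notation is easy to load into a program. The following Python sketch uses only the standard library; the names `Dependency`, `parse_dependencies` and `dependents` are made up for illustration. It reads the nine relations listed for the fox sentence and answers simple questions about the tree.

```python
import re
from collections import namedtuple

Dependency = namedtuple(
    "Dependency", "relation head head_index dependent dependent_index"
)

# Matches one entry such as "nsubj(jumps-5, fox-4)".
LINE_RE = re.compile(r"([\w:]+)\((\S+)-(\d+), (\S+)-(\d+)\)")

PARSE = """
root(ROOT-0, jumps-5)
det(fox-4, The-1)
det(dog-9, the-7)
amod(fox-4, brown-3)
amod(dog-9, lazy-8)
nsubj(jumps-5, fox-4)
case(dog-9, over-6)
amod(fox-4, quick-2)
nmod(jumps-5, dog-9)
"""


def parse_dependencies(text):
    """Turn the textual notation into a list of Dependency tuples."""
    return [
        Dependency(rel, head, int(h_idx), dep, int(d_idx))
        for rel, head, h_idx, dep, d_idx in LINE_RE.findall(text)
    ]


def dependents(deps, head_index, relation=None):
    """Words attached to the token at head_index, in sentence order."""
    found = [
        d for d in deps
        if d.head_index == head_index and relation in (None, d.relation)
    ]
    return [d.dependent for d in sorted(found, key=lambda d: d.dependent_index)]


deps = parse_dependencies(PARSE)
root = next(d for d in deps if d.relation == "root")
print("predicate:", root.dependent)
print("subject:", dependents(deps, root.dependent_index, "nsubj"))
print("modifiers of fox:", dependents(deps, 4))
```

Starting from the root relation gives the main predicate ("jumps"), and following nsubj and nmod from it gives the participants that role labeling needs.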

slide-30
SLIDE 30

Word sense disambiguation

Consider the following sentences:

  • I suspect that he is the offender.
  • I suspect the truthfulness of his words.
  • I suspect Osama to be the terrorist.
  • Osama is the main suspect.

slide-31
SLIDE 31

Word sense disambiguation: using Ontologies

A hyponym is a word or phrase whose semantic field is included within that of another word, its hyperonym or hypernym. In simpler terms, a hyponym shares a type-of relationship with its hypernym. For verbs, this relation is also called troponymy.

WordNet is a large lexical database of English. Because it has hypernym/hyponym relationships among its synsets, it can be used as a lexical ontology.

An ontology is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse.

slide-32
SLIDE 32

Ontologies


slide-33
SLIDE 33

Ontologies


slide-34
SLIDE 34

Word sense disambiguation: Verb frames

In WordNet, word meanings are represented by synonym sets called synsets: lists of synonymous word forms that are interchangeable in some context. Examples: (suspect, surmise), (suspect, distrust, mistrust), (suspect, believe_to_be_guilty)

Each verb synset contains a list of generic sentence frames illustrating the types of simple sentences in which the verbs in the synset can be used. There are 35 frames in total. Some examples:

  • Something ----s (Ex: vegetate)
  • Somebody ----s (Ex: respire, sleep)
  • It is ----ing (Ex: rain, snow)
  • Somebody ----s VERB-ing (Ex: continue/proceed/keep, avoid)
  • Somebody ----s something (Ex: manipulate, wave, insufflate)
  • Something ----s Adjective/Noun (Ex: become/go/get)
  • Somebody ----s something to somebody (Ex: dedicate, delegate/depute)

slide-35
SLIDE 35

Word sense disambiguation

  • nsubj - nominal subject: the proto-agent of a clause.
  • ccomp - clausal complement of a verb: a dependent clause with an internal subject which functions like an object of the verb.
  • dobj - direct object: the entity that is acted upon by the subject.

slide-36
SLIDE 36

Word sense disambiguation: Verb frames

Synset: suspect, surmise
  Verb frames: Somebody ----s something; Somebody ----s that CLAUSE
  Hypernym synset: guess, venture, pretend, hazard
  Sentence: I suspect that he is the offender.

Synset: suspect, distrust, mistrust
  Verb frames: Somebody ----s somebody; Somebody ----s something
  Hypernym synset: disbelieve, discredit
  Sentence: I suspect the truthfulness of his words.

Synset: suspect, believe_to_be_guilty
  Verb frames: Somebody ----s somebody to INFINITIVE; Somebody ----s that CLAUSE
  Hypernym synset: think, opine, suppose, imagine, reckon, guess
  Sentence: I suspect Osama to be the terrorist.
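
One way to use these frames is to compare them with the dependency relations attached to the verb. The Python sketch below is only an illustration: `SYNSET_FRAMES` is copied from the frames listed for the three synsets, while `RELATIONS_TO_FRAMES` (which relations correspond to which frame, including the use of xcomp for the infinitive) is an assumption of this sketch and not something WordNet or a parser provides.

```python
# Frames per synset, as listed for the three senses of "suspect".
SYNSET_FRAMES = {
    "suspect, surmise": {
        "Somebody ----s something",
        "Somebody ----s that CLAUSE",
    },
    "suspect, distrust, mistrust": {
        "Somebody ----s somebody",
        "Somebody ----s something",
    },
    "suspect, believe_to_be_guilty": {
        "Somebody ----s somebody to INFINITIVE",
        "Somebody ----s that CLAUSE",
    },
}

# ASSUMPTION: a hand-written mapping from the set of relations found on the
# verb to the frames they are compatible with.
RELATIONS_TO_FRAMES = {
    frozenset({"nsubj", "ccomp"}): {"Somebody ----s that CLAUSE"},
    frozenset({"nsubj", "dobj"}): {
        "Somebody ----s something",
        "Somebody ----s somebody",
    },
    frozenset({"nsubj", "dobj", "xcomp"}): {
        "Somebody ----s somebody to INFINITIVE",
    },
}


def candidate_synsets(verb_relations):
    """Synsets with at least one frame compatible with the relations."""
    frames = RELATIONS_TO_FRAMES.get(frozenset(verb_relations), set())
    return sorted(
        name for name, synset_frames in SYNSET_FRAMES.items()
        if frames & synset_frames
    )


print(candidate_synsets({"nsubj", "dobj", "xcomp"}))  # infinitive construction
print(candidate_synsets({"nsubj", "ccomp"}))          # that-clause
print(candidate_synsets({"nsubj", "dobj"}))           # direct object
```

Only the infinitive construction is resolved to a single synset here. The other two patterns still leave two candidates, so frames alone narrow the choice but do not always settle it.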

slide-37
SLIDE 37


Word sense disambiguation: Verb frames

  • 1. suspect = suppose
  • 2. suspect = believe to be guilty
  • 3. suspect = disbelieve
  • 4. suspect = accused/defendant
slide-38
SLIDE 38


Handling diathesis

Passive voice vs Active voice: dobj ↔ nsubjpass, nsubj ↔ nmod:agent
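
The two correspondences can be applied as a simple relabeling step. In this Python sketch the triples are written by hand and are only assumed to resemble parser output for the Gates/Google sentences; dropping the auxpass relation is also an assumption of the sketch.

```python
# Relation renames from the correspondences above:
# nsubjpass -> dobj, nmod:agent -> nsubj.
PASSIVE_TO_ACTIVE = {"nsubjpass": "dobj", "nmod:agent": "nsubj"}

# ASSUMPTION: the passive auxiliary carries no role information, so drop it.
DROPPED = {"auxpass"}


def to_active(triples):
    """Rewrite (relation, head, dependent) triples of a passive clause
    as the triples of the equivalent active clause."""
    return {
        (PASSIVE_TO_ACTIVE.get(rel, rel), head, dep)
        for rel, head, dep in triples
        if rel not in DROPPED
    }


# Hand-written triples, assumed to resemble parser output for
# "The Microsoft stock was sold to Google by Gates."
passive = {
    ("nsubjpass", "sold", "stock"),
    ("auxpass", "sold", "was"),
    ("nmod:to", "sold", "Google"),
    ("nmod:agent", "sold", "Gates"),
}

# ... and for "Gates sold Microsoft stock to Google."
active = {
    ("nsubj", "sold", "Gates"),
    ("dobj", "sold", "stock"),
    ("nmod:to", "sold", "Google"),
}

print(to_active(passive) == active)
```

After this step a passive sentence and its active counterpart yield the same subject and object, so later stages such as frame matching only need to handle the active form.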

slide-39
SLIDE 39

Dependency parsing pitfalls: syntactical ambiguity

Ben sees John climbing the mountain with his telescope.

slide-40
SLIDE 40

Dependency parsing pitfalls: syntactical ambiguity

Ben sees John climbing the mountain with his telescope. His telescope was installed on the mountain last month.

slide-41
SLIDE 41

Dependency parsing pitfalls: syntactical ambiguity

Ben sees John climbing the mountain with his telescope. His telescope was installed on the mountain last month.

slide-42
SLIDE 42

Dependency parsing pitfalls: syntactical ambiguity

Ben sees John climbing the mountain with his telescope. The telescope is heavy and John gets tired.

slide-43
SLIDE 43

Dependency parsing pitfalls: syntactical ambiguity

Ben sees John climbing the mountain with his telescope. His telescope helps Ben see John from 10km away.

slide-44
SLIDE 44

Dependency parsing pitfalls: syntactical ambiguity

Ben sees John climbing the mountain with his telescope. His telescope helps Ben see John from 10km away. So whose telescope is it?

slide-45
SLIDE 45

Dependency parsing pitfalls: syntactical ambiguity

Mary sees John climbing the mountain with his telescope. His telescope helps Mary see John from 10km away. So whose telescope is it?

Anaphora/cataphora is the use of an expression whose interpretation depends upon another expression in context (its antecedent or postcedent). These are types of coreference. Take a look at coreference resolution techniques.

slide-46
SLIDE 46

Interested in learning more about chatbots, artificial intelligence or machine learning? Join the IAȘI AI meetups and workshops, the newest technical community in Iași, with more than 250 members.

meetup.com/IASI-AI/ facebook.com/AI.in.Iasi/
