SLIDE 1

Chatbots as active members of our society

Proseminar Data Mining
Luca Dombetzki
Fakultät für Informatik, Technische Universität München
Email: luca.dombetzki@tum.de

SLIDE 2

9th June, 2017 Luca Dombetzki, Proseminar Datamining, TU Munich 2

AGENDA

  • Introduction
  • Definition
  • Brief history of chatbots
  • Use cases
  • Main Problems
  • Mechanics
  • Detection
  • Example of a Seq2Seq Model using TF
  • Conclusion

SLIDE 3

Chatbots today: Microsoft Tay

SLIDE 4

Introduction

What can Chatbots do? How far have they come? What limits still constrain them while impacting our society?

Fig2

SLIDE 6

Definition - Chatbot

  • Alias: chatterbot
  • A computer program
  • Communicates through textual methods
  • Interacts with human beings
  • Aim 1: serve as a tool (known as a bot)
  • Aim 2: convincingly participate in human conversation

SLIDES 8 to 17

Brief history of chatbots

Timeline from 1950 to 2025, with milestones in the order they are introduced:

  • Turing Test
  • ELIZA (DOCTOR)
  • PARRY
  • Loebner Prize
  • A.L.I.C.E (AIML)
  • Cleverbot
  • Facebook
  • WhatsApp
  • Tay

Now:

  • companies interested in chatbots
  • many different chatbots on the market
  • several use cases
  • deep learning
  • malicious bots

SLIDE 19

Use Cases

  • Customer service
  • Information acquisition
  • Research
  • Malicious intent
SLIDE 20

Use Cases – Customer Service

Goals:

  • Customer closeness
  • Reliably understand the customer
  • Integrate seamlessly

=> human-like appearance not necessary

Implementation:

  • Pattern-based approach
  • Instant messaging platform APIs with extra features
  • Closed domain
SLIDE 21

Use Cases – Information Acquisition

Goals:

  • Simple implementation
  • Ease of use for the customer

=> human-like appearance not necessary

Implementation:

  • Pattern-based approach
  • Instant messaging platform APIs with extra features
  • Closed domain
SLIDE 22

Use Cases – Research

  • Natural Language Processing as main topic

=> Access to a lot of data to train, analyze and learn from

  • Opinion mining / sentiment analysis

=> Negobot, a chatbot trained to find pedophiles

SLIDE 23

Use Cases – Malicious Intent

  • Advertisement / spam
  • Phishing attacks

=> Disclosure of private information

  • Spreading of misinformation

=> manipulation of public opinion
=> the better click bots

SLIDE 25

Main Problems

  • Validation
  • Coherent personality (same answer to semantically equivalent questions)

  • Context
  • Intention and diversity

SLIDE 27

Mechanics

A complex chatbot, broken down into categories of interest:

  • Response
  • Intent
  • Context
  • Domain
SLIDE 28

Mechanics - Response

Retrieval based:

  • Database as a backend
  • Retrieval algorithm
  • No new text generated

+ Spelling mistakes preventable
+ Reliable
– Open domain impossible

Generative based:

  • Generates complete text
  • Recurrent Neural Networks (LSTM / GRU)

+ Open domain learnable (in theory)
– Unreliable
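
The retrieval-based idea can be illustrated with a minimal Python sketch: a stored reply is selected by word overlap (Jaccard similarity) and no new text is generated. The tiny `FAQ` database, the function names and the 0.2 threshold are assumptions made for this illustration; production systems typically rely on stronger retrieval algorithms.

```python
import re

# Hypothetical closed-domain "database": stored question -> canned reply.
FAQ = {
    "what are your opening hours": "We are open from 9am to 5pm.",
    "how can i reset my password": "Use the reset link on the login page.",
    "where is my order": "Please send us your order number.",
}

def tokens(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(message, threshold=0.2):
    """Return the stored reply whose question best overlaps the message.

    Similarity is the Jaccard index of the two word sets. The bot either
    answers from the database or gives up by returning None.
    """
    query = tokens(message)
    best_reply, best_score = None, 0.0
    for question, reply in FAQ.items():
        candidate = tokens(question)
        union = query | candidate
        score = len(query & candidate) / len(union) if union else 0.0
        if score > best_score:
            best_reply, best_score = reply, score
    return best_reply if best_score >= threshold else None
```

An off-topic message such as "Tell me a joke about cats" shares no words with any stored question, so `retrieve` returns `None`. This mirrors the "open domain impossible" limitation: the bot can only answer what its database already covers.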
SLIDE 29

Mechanics - Intent

Pattern approach (AIML):

  • Symbolic reduction
  • Divide and conquer
  • Synonyms
  • Spelling / grammar correction

+ Reliable, verifiable
– Manual

Classification approach:

  • E.g. Recurrent Neural Networks (LSTM / GRU) produce an "intent vector"

+ Fully automatic and scalable
– Intent vector not human readable => decoder required

SLIDE 30

Mechanics – Intent (Code)

<category>
  <pattern>DO YOU KNOW WHO * IS</pattern>
  <template><srai>WHO IS <star/></srai></template>
</category>
<category>
  <pattern>YES *</pattern>
  <template><srai>YES</srai> <sr/></template>
</category>
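
To make the behaviour of the two categories above more concrete, here is a rough Python emulation of symbolic reduction (`<srai>`) and divide and conquer (`<sr/>`). It is a sketch, not an AIML interpreter: the `RULES` table format, the `SRAI:` prefix and the two terminal categories (`WHO IS *`, `YES`) are assumptions added so that the reductions have something to resolve to.

```python
import re

# (pattern, template parts). "*" matches one or more words, "{star}" is the
# matched text, and a part starting with "SRAI:" is fed back into the matcher.
RULES = [
    ("DO YOU KNOW WHO * IS", ["SRAI:WHO IS {star}"]),
    ("YES *", ["SRAI:YES", "SRAI:{star}"]),  # <srai>YES</srai> <sr/>
    # Hypothetical terminal categories, not taken from the slide:
    ("WHO IS *", ["I do not know who {star} is."]),
    ("YES", ["Great."]),
]

FALLBACK = "I have no answer for that."

def normalize(text):
    """Uppercase the input and drop punctuation before matching."""
    return " ".join(re.sub(r"[^A-Z0-9 ]", " ", text.upper()).split())

def to_regex(pattern):
    """Turn an AIML-style pattern into a regular expression."""
    return " ".join("(.+)" if word == "*" else re.escape(word)
                    for word in pattern.split())

def respond(text, depth=0):
    """Answer text, reducing it to simpler patterns where a rule says so."""
    if depth > 10:  # guard against rules that keep reducing to themselves
        return FALLBACK
    sentence = normalize(text)
    for pattern, template in RULES:
        match = re.fullmatch(to_regex(pattern), sentence)
        if match is None:
            continue
        star = match.group(1) if match.groups() else ""
        parts = []
        for part in template:
            filled = part.replace("{star}", star)
            if filled.startswith("SRAI:"):
                parts.append(respond(filled[len("SRAI:"):], depth + 1))
            else:
                parts.append(filled)
        return " ".join(parts)
    return FALLBACK
```

With these rules, "Do you know who Alan Turing is?" is first reduced to "WHO IS ALAN TURING" and then answered by the terminal category, while "Yes, who is Ada Lovelace" is split into "YES" and the remainder, each answered separately.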

SLIDE 31

Mechanics - Context

Rule based (AIML):

  • State machine, variables
  • Conditionals

+ Human readable
– Human planning required

Machine learning based:

  • Context layer in RNN
  • Context vector together with input data

+ Artificial intelligence => human behaviour
– Unverifiable, unstable
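
The rule-based variant can be sketched in a few lines of Python: the context is an explicit state plus remembered variables, and conditionals decide the next step. The `OrderBot` class, its states and its wording are hypothetical and serve only to illustrate the idea.

```python
class OrderBot:
    """Tiny rule-based dialogue: context = explicit state + variables."""

    def __init__(self):
        self.state = "START"  # current node of the state machine
        self.slots = {}       # variables remembered across turns

    def reply(self, message):
        text = message.strip().lower()
        if self.state == "START":
            self.state = "ASK_SIZE"
            return "Which size would you like: small or large?"
        if self.state == "ASK_SIZE":
            if text in ("small", "large"):
                self.slots["size"] = text
                self.state = "CONFIRM"
                return f"One {text} pizza. Shall I place the order?"
            return "Please answer with small or large."
        if self.state == "CONFIRM":
            if text == "yes":
                self.state = "DONE"
                return f"Your {self.slots['size']} pizza is on its way."
            self.state = "ASK_SIZE"
            return "Okay, which size would you like instead?"
        return "The order is already complete."
```

The same input ("yes") produces different replies depending on the current state, which is what context means here. Every transition is readable and verifiable, but each one has to be planned by a human.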

SLIDE 32

Mechanics - Context

SLIDE 33

Mechanics - Domain

  • Closed domain

=> fewer possibilities => more fitting replies

  • Open domain

=> infinite possibilities + topic switches

SLIDE 34

Mechanics - Architectures

  • Chatbot API: a lot of ready-to-use features
  • Seq2Seq: two connected RNNs
  • Cleverbot: search on a database of human responses
  • A.L.I.C.E: AIML script

SLIDE 35

Detection

Passive Detection:

  • Message sizes
  • Inter message delay
  • Repetition
  • Evasiveness

Social Detection:

  • Followers-to-following ratio

  • Activity

Active Detection:

  • General questions
  • URL probes
  • Subcognitive probes
  • Rating games
  • Social/Emotional probes
  • Ambiguity probes / keyword targeting
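
As a rough illustration of passive detection, the following Python sketch derives three of the signals listed above (message sizes, inter-message delay, repetition) from a chat log. The log format, the function names and both thresholds are assumptions chosen for this example, not values from the talk, and the log is assumed to be non-empty.

```python
from statistics import mean, pstdev

def passive_features(messages):
    """Summarise a non-empty chat log of (timestamp_seconds, text) tuples."""
    times = [timestamp for timestamp, _ in messages]
    texts = [text.strip().lower() for _, text in messages]
    delays = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "mean_size": mean([len(text) for text in texts]),
        "mean_delay": mean(delays) if delays else 0.0,
        "delay_spread": pstdev(delays) if delays else 0.0,
        # Share of messages that repeat another message verbatim.
        "repetition": 1 - len(set(texts)) / len(texts),
    }

def looks_automated(features, max_spread=0.5, min_repetition=0.5):
    """Flag logs with very regular timing or heavy repetition.

    Both thresholds are hypothetical and would need tuning on real data.
    """
    return (features["delay_spread"] <= max_spread
            or features["repetition"] >= min_repetition)
```

A log that posts the same text every two seconds has a delay spread of zero and a high repetition share, so it is flagged. A human conversation with irregular pauses and varied messages is not.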

SLIDE 36

Example of a Seq2Seq Model using TF

1) Cornell Movie Corpus
2) Transform to data accepted by TensorFlow (https://github.com/b0noI/dialog_converter)
3) Train TF-translate model with this data

  • 20000 iterations: intent and diversity problem (underfit)
  • 45000 iterations: long sentences that make sense
  • 60000+ iterations: special answers taken exactly from the training data (overfit)
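
Step 2 can be pictured with a short Python sketch that turns corpus rows into (input, reply) training pairs. It does not reproduce what the linked dialog_converter tool does. It assumes the corpus layout in which fields are separated by ` +++$+++ `, with line rows of the form ID, character ID, movie ID, character name, text, and conversation rows ending in a Python-style list of line IDs. The function names are hypothetical.

```python
import ast

SEP = " +++$+++ "  # field separator assumed for the corpus files

def load_lines(raw_lines):
    """Map line IDs to utterance text (rows: id, user, movie, name, text)."""
    id_to_text = {}
    for row in raw_lines:
        fields = row.rstrip("\n").split(SEP)
        if len(fields) == 5:
            id_to_text[fields[0]] = fields[4]
    return id_to_text

def build_pairs(raw_conversations, id_to_text):
    """Pair every utterance with the one that follows it in a conversation."""
    pairs = []
    for row in raw_conversations:
        fields = row.rstrip("\n").split(SEP)
        line_ids = ast.literal_eval(fields[-1])  # e.g. "['L1', 'L2']"
        for first, second in zip(line_ids, line_ids[1:]):
            if first in id_to_text and second in id_to_text:
                pairs.append((id_to_text[first], id_to_text[second]))
    return pairs

# Made-up sample rows in the assumed layout (not real corpus content).
sample_lines = [
    "L1 +++$+++ u0 +++$+++ m0 +++$+++ ALICE +++$+++ Hello there.",
    "L2 +++$+++ u1 +++$+++ m0 +++$+++ BOB +++$+++ Hi, how are you?",
    "L3 +++$+++ u0 +++$+++ m0 +++$+++ ALICE +++$+++ Fine, thanks.",
]
sample_conversations = ["u0 +++$+++ u1 +++$+++ m0 +++$+++ ['L1', 'L2', 'L3']"]

pairs = build_pairs(sample_conversations, load_lines(sample_lines))
```

Each pair supplies one source sentence and one target sentence, which is the shape a translation-style Seq2Seq model expects.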

SLIDE 37

Conclusion

How well do chatbots work today?

"Most of the value of deep learning today is in narrow domains where you can get a lot of data. Here’s one example of something it cannot do: have a meaningful conversation. There are demos, and if you cherry-pick the conversation, it looks like it’s having a meaningful conversation, but if you actually try it yourself, it quickly goes off the rails."

Andrew Ng, chief scientist of Baidu

SLIDE 38

Sources

  • Fig1: http://static6.businessinsider.de/image/56f3d057dd08955e258b4762-1400-621/screen_shot_2016-03-24_at_11_12_04.jpg
  • Fig2: http://static6.businessinsider.com/image/5645ffe92491f948008b4e21-960/mavssn.png
  • Fig3: http://zonaguadalajara.com/wp-content/uploads/2013/06/PizzaHut.gif