Using Natural Language Processing and Machine Learning to Assist First-Level Customer Support for Contract Management
Master thesis, final presentation. Michael Legenc. Advisor: Daniel Braun. Munich, 08.01.2018.


SLIDE 1

Software Engineering betrieblicher Informationssysteme (sebis) Fakultät für Informatik Technische Universität München wwwmatthes.in.tum.de

Using Natural Language Processing and Machine Learning to Assist First-Level Customer Support for Contract Management

Master thesis – Final presentation
Michael Legenc
Advisor: Daniel Braun
Munich, 08.01.2018

SLIDE 2

Introduction

Master thesis – Final presentation – Michael Legenc

• Scalability issues of email customer support.
• Acceleration through automation and assistance features.
• Hurdle: free text.
• Approach: Machine Learning and Natural Language Processing.

SLIDE 3

Demo


API (Classification, Information Extraction) → JSON → Automation, Assistance
Example result: Cancellation, high priority, Contract team
Example action: Cancel contract
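
To make the demo flow concrete, here is a minimal sketch of what a JSON response from such an API could look like. The slides only show that the API returns JSON carrying the classification and extracted information; the field names (`category`, `priority`, `team`, `suggested_action`, `entities`) are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class EmailAnalysis:
    """Hypothetical response shape; field names are not taken from the thesis."""
    category: str
    priority: str
    team: str
    suggested_action: str
    entities: dict = field(default_factory=dict)


def to_json(analysis: EmailAnalysis) -> str:
    """Serialize the analysis so a support tool could consume it."""
    return json.dumps(asdict(analysis), ensure_ascii=False, indent=2)


# Values mirror the example shown on the demo slide.
example = EmailAnalysis(
    category="Cancellation",
    priority="high",
    team="Contract team",
    suggested_action="Cancel contract",
)
print(to_json(example))
```

A client could route the email to the named team and offer the suggested action as a one-click button, which matches the "Automation, Assistance" step on the slide.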

SLIDE 4

Demo


SLIDE 5

Classification: Machine Learning

Training set (old, labeled emails) → Preprocessing, Vectorization → Supervised machine learning
New email → Preprocessing, Vectorization → Predicted classification
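
The flow above (labeled emails in, preprocessing and vectorization, a trained model, a prediction for a new email) can be sketched in a few lines. This is a toy illustration, not the thesis implementation: the tokenizer, the bag-of-words vectors and the multi-class perceptron are stand-ins chosen for brevity, and the labels and emails are invented.

```python
import re
from collections import Counter, defaultdict


def preprocess(text: str) -> list[str]:
    """Lowercase and split into word tokens (a deliberately simple preprocessor)."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def vectorize(tokens: list[str]) -> Counter:
    """Bag-of-words term counts as a sparse vector."""
    return Counter(tokens)


def predict_from(weights: dict, features: Counter) -> str:
    """Return the label whose weight vector scores the features highest."""
    return max(
        weights,
        key=lambda label: sum(weights[label][t] * c for t, c in features.items()),
    )


def train(labeled_emails: list[tuple[str, str]], epochs: int = 10) -> dict:
    """Multi-class perceptron: one weight vector per label, updated on mistakes."""
    weights = {label: defaultdict(float) for _, label in labeled_emails}
    for _ in range(epochs):
        for text, gold in labeled_emails:
            features = vectorize(preprocess(text))
            guess = predict_from(weights, features)
            if guess != gold:
                for term, count in features.items():
                    weights[gold][term] += count
                    weights[guess][term] -= count
    return weights


def classify(weights: dict, email: str) -> str:
    """Run a new email through the same preprocessing and vectorization."""
    return predict_from(weights, vectorize(preprocess(email)))


# Invented training examples standing in for old, labeled emails.
training = [
    ("I want to cancel my contract", "cancellation"),
    ("Please cancel the electricity contract", "cancellation"),
    ("My invoice amount is wrong", "billing"),
    ("Question about the last invoice", "billing"),
]
weights = train(training)
print(classify(weights, "please cancel my gas contract"))
```

The important point is that a new email must pass through exactly the same preprocessing and vectorization as the training set before the model sees it.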


SLIDE 6

Training set creation


• Dataset:
  • 18 million unlabeled emails.
  • Filtering
  • Retrievable by an API.
  • Caching
• Command Line Tool: Demo.

SLIDE 7

Preprocessing, Vectorization


• Vectorization: tf-idf
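
As a reminder of what tf-idf computes, here is a small worked sketch. Several tf-idf variants exist (with and without smoothing or normalization) and the slide does not say which one was used; this example assumes the plain textbook form, tf = count / document length and idf = ln(N / df). The documents are invented.

```python
import math
from collections import Counter


def tfidf(documents: list[list[str]]) -> list[dict[str, float]]:
    """Compute tf-idf weights with tf = count / len(doc) and idf = ln(N / df)."""
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


docs = [
    ["cancel", "my", "contract"],
    ["my", "invoice", "is", "wrong"],
    ["cancel", "the", "order"],
]
vectors = tfidf(docs)
for term, weight in sorted(vectors[0].items()):
    print(f"{term}: {weight:.3f}")
```

In the first document, "contract" occurs in only one of the three documents and therefore receives a higher weight than "cancel" or "my", which each occur in two.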

SLIDE 8

Classification: Evaluation


• Final configuration:
  • Stochastic gradient descent
  • Tf-idf thresholds: max. 0.4, min. 0.001
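
The slide does not define the two thresholds. One plausible reading, assumed here, is that they are document-frequency cut-offs: terms that occur in more than 40 % of the emails or in fewer than 0.1 % are dropped before tf-idf weighting. The sketch below illustrates only that pruning step on invented documents; the function name and data are hypothetical.

```python
from collections import Counter


def prune_vocabulary(documents, max_df=0.4, min_df=0.001):
    """Keep terms whose document-frequency share lies within [min_df, max_df]."""
    n_docs = len(documents)
    # Count in how many documents each term appears at least once.
    df = Counter(term for doc in documents for term in set(doc))
    return {
        term for term, count in df.items()
        if min_df <= count / n_docs <= max_df
    }


docs = [
    ["please", "cancel", "the", "contract"],
    ["please", "check", "the", "invoice"],
    ["please", "send", "the", "order"],
    ["the", "meter", "reading"],
    ["the", "tariff"],
]
vocabulary = prune_vocabulary(docs)
print(sorted(vocabulary))
```

Here "the" (in every document) and "please" (in three of five) exceed the upper threshold and are removed, while the remaining terms are kept. With millions of emails, the lower threshold would additionally remove very rare tokens such as typos.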

SLIDE 9

Information extraction

• Implemented types:
  • Person
  • Date
  • Time
  • Postcode, City
  • Money
  • Vendor
  • Order, account and contract number


SLIDE 10

Information extraction

➢ Non-ML: regular expressions, keyword lists and rules.
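
A minimal sketch of the non-ML approach, assuming German-style formats (dd.mm.yyyy dates, five-digit postcodes, amounts followed by € or EUR). The patterns and the vendor keyword list are illustrative assumptions, not the rules used in the thesis, and the vendor names are invented.

```python
import re

# Assumed formats: German dates, 5-digit postcode plus city, euro amounts.
DATE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")
POSTCODE_CITY = re.compile(r"\b(\d{5})\s+([A-ZÄÖÜ][a-zäöüß]+)")
MONEY = re.compile(r"(\d+(?:[.,]\d{2})?)\s?(?:€|EUR)")
VENDORS = ["Example Energy", "Sample Power"]  # hypothetical keyword list


def extract(text: str) -> dict[str, list[str]]:
    """Apply each pattern or keyword list and collect the matches per type."""
    lowered = text.lower()
    return {
        "date": [m.group(0) for m in DATE.finditer(text)],
        "postcode_city": [
            f"{m.group(1)} {m.group(2)}" for m in POSTCODE_CITY.finditer(text)
        ],
        "money": [m.group(1) for m in MONEY.finditer(text)],
        "vendor": [v for v in VENDORS if v.lower() in lowered],
    }


email = (
    "I moved to 80331 München on 01.02.2018 and "
    "Example Energy still charged 49,90 EUR."
)
print(extract(email))
```

Rules like these are easy to inspect and adjust, but they only find what the patterns anticipate; a misspelled vendor name, for instance, would be missed.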


SLIDE 11

Information extraction – Machine Learning

• Not much support.
• Learns from the sentence context.
  ➢ Unknown or misspelled words are recognized by their context.
• Training set:
  • Needs massive input. Public data is not modifiable.
  • Creation supported by Command Line Tool and special file format.
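
The slide mentions a special file format without describing it. As a point of reference only: Stanford NER, the tool named on the evaluation slide, is commonly trained from a two-column, tab-separated file with one token per line and `O` as the background label. The sketch below produces lines in that shape from a tokenized sentence; whether the thesis format resembles it is not stated, and the helper name and example sentence are hypothetical.

```python
def to_token_label_lines(tokens, entities, background="O"):
    """Label each token with its entity type, or the background label."""
    labels = [background] * len(tokens)
    for start, end, label in entities:  # end index is exclusive
        for i in range(start, end):
            labels[i] = label
    return [f"{token}\t{label}" for token, label in zip(tokens, labels)]


tokens = ["Please", "cancel", "my", "contract", "on", "01.02.2018", "."]
entities = [(5, 6, "DATE")]  # token span 5..6 is a date
lines = to_token_label_lines(tokens, entities)
print("\n".join(lines))
```

Because every token keeps its neighbours in the file, a sequence model trained on such data can use the surrounding words as features, which is what lets it label unknown or misspelled words from context.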

SLIDE 12

Evaluation

• Training set used: created by the non-ML approach.
  • 1000 emails, 3361 entities.
• Test set: manually created.
Results shown for: Stanford NER, Non-ML (figures not preserved in this transcript).
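
The transcript does not preserve the reported results or name the metrics. Precision, recall and F1 over extracted entities are a common choice for this kind of comparison and are assumed in the sketch below; the gold and predicted entities are invented and do not reflect the thesis results.

```python
def precision_recall_f1(predicted: set, gold: set) -> tuple[float, float, float]:
    """Score a set of predicted (type, text) entities against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example: two of three predictions are correct, one gold entity is missed.
gold = {("DATE", "01.02.2018"), ("CITY", "München"), ("MONEY", "49,90")}
predicted = {("DATE", "01.02.2018"), ("CITY", "München"), ("PERSON", "Strom")}
precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Running the same function on the output of each extractor against the manually created test set gives directly comparable numbers for the ML and non-ML approaches.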


SLIDE 13

Future work

• Transparent training.
• Topic segmentation.
• Better understanding of coherence.
• Automation and assistance features.


SLIDE 14

Conclusion

• Email customer support benefits from automation and assistance:
  • Time savings, and thus cost savings.
  • Increased employee and customer satisfaction.
• Free text becomes accessible through ML and NLP.
• Selection of algorithms, parameters and preprocessors depends on the data set and the concrete application:
  • Interchangeable toolset approach.
  • Evaluation-based selection.
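
The idea of an interchangeable toolset with evaluation-based selection can be sketched as follows: every candidate exposes the same interface, each is scored on held-out data, and the best one is kept. The interface, the two toy candidates and the validation emails are assumptions for illustration, not components of the thesis toolset.

```python
from typing import Callable

Classifier = Callable[[str], str]  # hypothetical common interface


def accuracy(classifier: Classifier, labeled: list[tuple[str, str]]) -> float:
    """Share of emails for which the classifier returns the expected label."""
    correct = sum(1 for text, label in labeled if classifier(text) == label)
    return correct / len(labeled)


def select_best(
    candidates: dict[str, Classifier], validation: list[tuple[str, str]]
) -> tuple[str, float]:
    """Evaluate every candidate on the validation set and return the best one."""
    scores = {name: accuracy(clf, validation) for name, clf in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def keyword_rule(text: str) -> str:
    """Toy candidate: a single keyword decides the label."""
    return "cancellation" if "cancel" in text.lower() else "other"


def always_other(text: str) -> str:
    """Toy baseline candidate."""
    return "other"


validation = [
    ("Please cancel my contract", "cancellation"),
    ("Where is my invoice?", "other"),
    ("I want to cancel", "cancellation"),
]
best_name, best_score = select_best(
    {"keyword_rule": keyword_rule, "always_other": always_other}, validation
)
print(best_name, best_score)
```

Because candidates share one interface, swapping in a different algorithm, parameter set or preprocessor only means adding another entry to the dictionary.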


SLIDE 15

Thank you


SLIDE 19

Current customer support team organization at CHECK24's gas and electricity department

Incoming email: gas@check24.de or strom@check24.de

First-level support: tries to solve all kinds of problems.

Second-level support: long-time employees. Consulted only if the problem is too complex, which is very exceptional.


SLIDE 20

Current email preprocessing

Incoming email: gas@check24.de or strom@check24.de
• Spam detection (Microsoft Exchange Server)
• Customer detection (PHP server): maps mails to customers and to simple folders such as gas/electricity, (unseen) inbox/spam.
• Storage (MySQL server): long-term storage of all incoming and outgoing emails together with the determined folder and customer mappings.


SLIDE 21

Software used by the customer support


Webmail: Superficial mail inspection. Button (e.g. in the red box) leads to a more sophisticated solution.


SLIDE 22

Software used by the customer support


Proprietary software: View mail with further information and processing features. Button (e.g. in the red box) leads to a more sophisticated solution.


SLIDE 23

Contribution


• Toolset: Classification, Information Extraction
• Command Line Tool:
  • Training set creation
  • Inspection
  • Optimization
  • Evaluation
• API: Enables integration