Using Natural Language Processing and Machine Learning to Assist First-Level Customer Support for Contract Management
Master thesis, final presentation
Michael Legenc
Advisor: Daniel Braun
Munich, 08.01.2018
Software Engineering
Introduction
Master thesis – Final presentation – Michael Legenc
§ Scalability issues of email customer support.
§ Acceleration by automation and assistance features.
§ Hurdle: free text, addressed with machine learning and natural language processing.
Demo
[Diagram: an email ("Cancel contract") goes to the API (classification, information extraction), which returns JSON used for automation and assistance. Example labels shown: Cancellation, High priority, Contract team.]
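As a rough illustration of this flow, the sketch below parses a JSON answer of the kind such an API might return and derives a routing hint from it. The field names (`category`, `priority`, `team`, `entities`) and the `route` helper are assumptions made for this example; the slide itself only shows the labels "Cancellation", "High priority" and "Contract team".

```python
import json

# Hypothetical API answer for an email with the subject "Cancel contract".
# Only the label values come from the slide; the field names are assumed.
RESPONSE = """
{
  "category": "Cancellation",
  "priority": "High",
  "team": "Contract team",
  "entities": []
}
"""


def route(response_text: str) -> str:
    """Turn the (assumed) JSON answer into a one-line routing hint."""
    data = json.loads(response_text)
    return f"{data['category']} ({data['priority']} priority) -> {data['team']}"


print(route(RESPONSE))
```

A support tool could use such a hint to pre-sort the inbox (automation) or to suggest a next step to the agent (assistance).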
Classification: Machine Learning
[Diagram: Training set (old, labeled emails) → preprocessing, vectorization → supervised machine learning. New email → predicted classification.]
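The pipeline on this slide can be sketched in a few lines of plain Python. This is a toy illustration only: the four training emails and their labels are invented, the vectorization is a simple bag of words, and the learner is a multiclass perceptron chosen as a stand-in because it is short. It is not the configuration used in the thesis.

```python
import re
from collections import Counter, defaultdict


def preprocess(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def vectorize(text):
    """Bag-of-words vector as a token -> count mapping."""
    return Counter(preprocess(text))


def predict(weights, text):
    """Return the label whose weight vector scores the email highest."""
    features = vectorize(text)

    def score(label):
        return sum(weights[label].get(t, 0.0) * c for t, c in features.items())

    # Ties are broken in favor of the alphabetically first label.
    return max(sorted(weights), key=score)


def train(examples, epochs=10):
    """Train a multiclass perceptron on (text, label) pairs."""
    labels = sorted({label for _, label in examples})
    weights = {label: defaultdict(float) for label in labels}
    for _ in range(epochs):
        for text, gold in examples:
            guess = predict(weights, text)
            if guess != gold:
                # Move weights toward the correct label, away from the guess.
                for token, count in vectorize(text).items():
                    weights[gold][token] += count
                    weights[guess][token] -= count
    return weights


# Invented, labeled training emails (the "old, labeled emails" of the slide).
TRAINING_SET = [
    ("I want to cancel my contract", "Cancellation"),
    ("Please terminate my electricity contract", "Cancellation"),
    ("My invoice amount is wrong", "Billing"),
    ("I have a question about my last invoice", "Billing"),
]

model = train(TRAINING_SET)
print(predict(model, "Please cancel the contract immediately"))
```

The same three stages (preprocess, vectorize, classify) are applied to old emails during training and to each new email at prediction time.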
Training set creation
§ Dataset:
  § 18 million unlabeled emails.
  § Filtering
  § Retrievable by an API.
  § Caching
§ Command line tool: demo.
Preprocessing, Vectorization
§ Vectorization: Tf-idf
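To make the idea concrete, here is a minimal sketch of one common tf-idf variant (term frequency times the logarithm of the inverse document frequency). The function name `tfidf` and the three sample emails are assumptions for illustration; libraries often use smoothed or normalized variants, and the slide does not say which one was used.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {token: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents does each token occur?
    df = Counter(token for doc in documents for token in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            # tf = relative frequency in this document, idf = log(N / df)
            token: (count / len(doc)) * math.log(n_docs / df[token])
            for token, count in counts.items()
        })
    return vectors


# Invented sample emails, already tokenized.
docs = [
    "please cancel my contract".split(),
    "my invoice is wrong".split(),
    "please send my invoice".split(),
]
vectors = tfidf(docs)
print(vectors[0])
```

A word such as "my" that appears in every email receives weight 0, while a word that is specific to one email ("cancel") receives the highest weight in that email.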
Classification: Evaluation
§ Final configuration:
  § Stochastic gradient descent
  § Tf-idf thresholds: max. 0.4, min. 0.001
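One plausible reading of the thresholds is that they bound the share of documents a token may appear in, similar to the `max_df` and `min_df` parameters of scikit-learn's `TfidfVectorizer`. The slide does not state this, so the sketch below is an assumption; `prune_vocabulary` and the sample emails are hypothetical. It covers only the thresholds, not the stochastic gradient descent classifier.

```python
from collections import Counter


def prune_vocabulary(documents, max_df=0.4, min_df=0.001):
    """Keep tokens whose document-frequency share lies in [min_df, max_df]."""
    n_docs = len(documents)
    # Count each token once per document it occurs in.
    df = Counter(token for doc in documents for token in set(doc))
    return {t for t, n in df.items() if min_df <= n / n_docs <= max_df}


# Invented sample emails, already tokenized.
docs = [
    "please cancel my contract".split(),
    "my invoice is wrong".split(),
    "please send my invoice".split(),
    "my meter reading is attached".split(),
    "please change my address".split(),
    "when does my new tariff start".split(),
]
vocabulary = prune_vocabulary(docs)
print(sorted(vocabulary))
```

With only six documents the lower bound of 0.001 never removes anything; on a large corpus it would drop tokens that occur in fewer than one in a thousand emails, such as rare typos.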
Information extraction
§ Implemented types:
  § Person
  § Date
  § Time
  § Postcode, city
  § Money
  § Vendor
  § Order, account, and contract number
Ø Non-ML: regular expressions, keyword lists, and rules.
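A regex-based extractor of this kind might look like the sketch below. The patterns (German-style dates and amounts, five-digit postcodes followed by a capitalized city name) and the sample email are assumptions for illustration, not the patterns used in the thesis, and they are deliberately simplistic.

```python
import re

# Hypothetical patterns for a few of the entity types.
PATTERNS = {
    "date": re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{4}\b"),
    "time": re.compile(r"\b\d{1,2}:\d{2}\b"),
    "money": re.compile(r"\b\d+(?:[.,]\d{2})?\s?(?:€|EUR|Euro)"),
    "postcode_city": re.compile(r"\b\d{5} [A-ZÄÖÜ][a-zäöüß]+"),
}


def extract(text):
    """Return {entity type: [matched strings]} for all patterns that match."""
    found = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[name] = matches
    return found


# Invented sample email.
email = (
    "I would like to cancel my contract as of 31.01.2018. "
    "Please call me after 17:30. My address is 80331 Munich. "
    "The last bill of 49,90 € was too high."
)
print(extract(email))
```

Types such as person or vendor are harder to capture with patterns alone, which is where keyword lists and additional rules come in.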
Information extraction – Machine Learning
§ Not much support.
§ Learns from the sentence context.
  Ø Unknown or misspelled words are recognized by their context.
§ Training set:
  § Needs massive input. Public data is not modifiable.
  § Creation supported by the command line tool and a special file format.
Evaluation
§ Training set used: created by the non-ML approach.
  § 1000 emails, 3361 entities.
§ Test set: manually created.
[Results shown for Stanford NER and for the non-ML approach; figures not preserved.]
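The slide does not name the metrics behind the comparison. Assuming the usual choice for entity extraction (precision, recall and F1 over extracted entities), an evaluation against a manually created test set could be sketched as follows. The entities below are invented and are not results from the thesis.

```python
def evaluate(predicted, gold):
    """Compute precision, recall and F1 for two collections of entities."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented example: (entity type, text) pairs for a single email.
gold = {
    ("date", "31.01.2018"),
    ("money", "49,90 €"),
    ("person", "Max Mustermann"),
}
predicted = {
    ("date", "31.01.2018"),
    ("money", "49,90 €"),
    ("time", "17:30"),
}

precision, recall, f1 = evaluate(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Running the same function on the output of each extractor would give directly comparable numbers for the two approaches.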
Future work
§ Transparent training.
§ Topic segmentation.
§ Better understanding of context and relationships.
§ Automation and assistance features.
Conclusion
§ Email customer support benefits from automation and assistance:
  § Time and thus cost savings.
  § Increased employee and customer satisfaction.
§ Free text becomes accessible through ML and NLP.
§ Selection of algorithms, parameters and preprocessors depends on the data set and concrete application:
  § Interchangeable toolset approach.
  § Evaluation-based selection.
Thank you
Current customer support team organization at check24's gas and electricity department
Incoming email: gas@check24.de or strom@check24.de
First-level support: tries to solve all kinds of problems.
Second-level support: long-time employees. Consulted if and only if the problem is too complex, which is very exceptional.
Current email preprocessing
Incoming email: gas@check24.de or strom@check24.de
§ Spam detection: Microsoft Exchange Server.
§ Customer detection: PHP server. Maps emails to customers and to simple folders like gas/electricity, (unseen) inbox/spam.
§ Storage: MySQL server. Long-term storage of all incoming and outgoing emails with the determined folder and customer mappings.
Software used by the customer support
Webmail: Superficial mail inspection. Button (e.g. in the red box) leads to a more sophisticated solution.
Proprietary software: View mail with further information and processing features. Button (e.g. in the red box) leads to a more sophisticated solution.
Contribution