
Thomas Wood: NLP/data science consultant, past projects



  1. Thomas Wood NLP/data science consultant

  2. Past projects
     ● Boehringer Ingelheim - pharma
     ● CV-Library:
        ○ Predict industries/salaries from CV - word2vec + CNN
        ○ Predict search terms from CV - LSTM
     ● Forensic stylometry demo
        ○ Identifying author
     ● Chatbots
        ○ Intelligent home etc
        ○ Question answering about products
     ● Document clustering, classification, trend detection, sentiment analysis
     ● Cambridge Masters: anaphora resolution (e.g. "it's raining")

  3. Boehringer Ingelheim
     ● Before running a clinical trial, a pharma company writes a 200-page PDF called a protocol.
     ● I developed an ML model which extracts important data from the protocol: type of treatment, toxicity, number of subjects, etc.

  4. Boehringer Ingelheim (2)
     ● The company has factories all over the world. Most medicines go through multiple facilities and countries before going to market.
     ● When a manufacturing defect occurs, it is written up in free text in the local language by a factory worker, e.g. "temperature deviation of 5 degrees due to crack in vial probably occurring in transit"
     ● I ran unsupervised topic detection to identify the commonest problems in various categories of products from the unstructured text data.
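
The idea of surfacing the commonest problems per product category from free-text reports can be illustrated with a toy sketch. This is not the method used in the project (the slides do not say which topic model was run); it is a deliberately simpler stand-in that ranks terms by TF-IDF, treating each category as one document. The reports, category names, stopword list and the `top_terms` helper are all invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical deviation reports grouped by product category (invented examples).
REPORTS = {
    "vials": [
        "temperature deviation of 5 degrees due to crack in vial",
        "crack in vial probably occurring in transit",
    ],
    "tablets": [
        "tablet coating uneven after humidity deviation",
        "humidity deviation in coating room",
    ],
}

# Tiny illustrative stopword list; a real pipeline would use a fuller one per language.
STOPWORDS = {"of", "in", "to", "due", "after", "the", "a"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def top_terms(reports, k=3):
    """Return the k highest TF-IDF terms for each category."""
    # Term counts per category (each category is treated as a single document).
    counts = {
        cat: Counter(tok for text in texts for tok in tokenize(text))
        for cat, texts in reports.items()
    }
    n_categories = len(counts)
    # Document frequency: in how many categories does each term appear?
    df = Counter(term for c in counts.values() for term in c)
    result = {}
    for cat, c in counts.items():
        scores = {t: tf * math.log(n_categories / df[t]) for t, tf in c.items()}
        # Sort by score (descending), then alphabetically so ties are stable.
        result[cat] = sorted(scores, key=lambda t: (-scores[t], t))[:k]
    return result


if __name__ == "__main__":
    for category, terms in top_terms(REPORTS, k=2).items():
        print(category, terms)
```

Terms shared by every category (here "deviation") score zero, so what remains is the vocabulary that distinguishes one category's problems from another's. A real topic model would additionally group co-occurring terms into themes.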

  5. CV-Library
     This was 2.5 years ago, before ELMo/BERT
     ● When you upload a CV, it gets converted to TXT
     ● Goes through word2vec and is passed through a deep NN
     ● Then some fields which the candidate previously filled out get autofilled!
     ● Result: more engagement, fewer dropouts
     ● Deployed on GCP
     ● 7% increase in signups - £££
     Diagram labels: Upload CV, Recommends industry, ...
     ● Use TensorFlow NMT to recommend search term
        ○ Repurposed Vietnamese translator
     ● Trained on 12 million CVs
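
The flow on this slide (CV text, then word vectors, then a model that recommends an industry) can be sketched in a few lines. This is only a toy under stated assumptions: the three-dimensional "embeddings" and industry prototype vectors below are invented, and mean pooling with cosine similarity stands in for the word2vec + CNN model named in the slides.

```python
import math

# Hypothetical 3-dimensional word vectors; a trained word2vec model would supply
# learned vectors with far more dimensions.
EMBEDDINGS = {
    "python": [0.9, 0.1, 0.0],
    "software": [0.8, 0.2, 0.1],
    "nurse": [0.0, 0.9, 0.1],
    "patient": [0.1, 0.8, 0.2],
    "audit": [0.1, 0.1, 0.9],
    "ledger": [0.0, 0.2, 0.8],
}

# Hypothetical prototype vector per industry (a stand-in for a trained classifier).
INDUSTRY_PROTOTYPES = {
    "IT": [1.0, 0.0, 0.0],
    "Healthcare": [0.0, 1.0, 0.0],
    "Finance": [0.0, 0.0, 1.0],
}


def embed(text):
    """Mean-pool the vectors of known words; return None if no word is known."""
    vectors = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend_industry(cv_text):
    """Return the industry whose prototype is closest to the pooled CV vector."""
    vector = embed(cv_text)
    if vector is None:
        return None
    return max(INDUSTRY_PROTOTYPES, key=lambda ind: cosine(vector, INDUSTRY_PROTOTYPES[ind]))


if __name__ == "__main__":
    print(recommend_industry("Senior Python software developer"))
```

Mean pooling discards word order, which is one reason a convolutional or recurrent model over the word vectors is the more natural choice for real CVs.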

  6. Chatbots
     Artificial Solutions
     ● Worked building chatbots for mobile and web (Shell, AT&T, IKEA, Samsung, HTC, Rightmove)
     ● Integrated smart home with voice commands
        ○ "turn on the coffee machine every Tuesday when I open the downstairs front door"

  7. Forensic stylometry
     ● https://www.fastdatascience.com/author-prediction-demo
     ● Oxford University workshop on NLP every summer
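
For readers unfamiliar with stylometry, a minimal sketch of author identification follows. It is not the code behind the demo linked above: it assumes a very simple approach, comparing relative frequencies of a handful of function words with a Manhattan distance. The function-word list, the sample texts and the author labels are all invented for illustration.

```python
import re
from collections import Counter

# Small illustrative list of function words; real systems use hundreds of features.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "i", "was"]


def profile(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def distance(p, q):
    """Manhattan distance between two frequency profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


def predict_author(unknown_text, known_texts):
    """Return the known author whose profile is closest to the unknown text."""
    target = profile(unknown_text)
    return min(known_texts, key=lambda author: distance(profile(known_texts[author]), target))


# Invented writing samples for two hypothetical authors.
KNOWN = {
    "author_a": "It was the best of times and it was the worst of times in the city of the king.",
    "author_b": "I think that I ought to go, and I said that to a friend that I met.",
}

if __name__ == "__main__":
    print(predict_author("I said that I had to leave and that I would write to a friend.", KNOWN))
```

Function words are used because authors choose them largely unconsciously and they are mostly independent of topic, so their frequencies tend to reflect style rather than subject matter.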

  8. Document analysis, trend detection
     ● Developed NLP pipeline for English and German at Pattern Science AG, near Frankfurt
     ● Used for document classification
     ● Trend detection
     ● Emerging topics

  9. Masters Cambridge
     ● Unsupervised learning for identifying pleonastic pronouns
        ○ It seemed that things would never get any better
        ○ It surprised me to hear him say that
     ● Download available
