Machine learning-based detection of chemical risk

Context | Material | Methods | Results and Discussion | Conclusion

Machine learning-based detection of chemical risk
N Grabar (1), O Wandji Tchami (1), L Maxim (2)
(1) CNRS UMR8163 STL, Université Lille 3, France
(2) Institut des Sciences de la Communication, CNRS UPS3088, France
MIE, Istanbul, September 2014

Plan
1. Context
2. Material
3. Methods
4. Results and Discussion
5. Conclusion and Future work

Context
Chemical risk: arises when chemical substances are dangerous for human or animal health or for the environment
Bisphenol A, phthalates: endocrine disruptors
- at certain doses, they can interfere with the endocrine (hormone) system in mammals
- great number of severe disorders:
  - sexual development problems (feminization of males or masculinizing effects on females)
  - breast cancer, prostate cancer, thyroid and other cancers
  - brain development problems and deformations of the body
Controversial topics

Context
Dangerous substances
- authorization for marketing is required
- European Food Safety Authority (EFSA): analysis of a large body of literature to provide scientifically based arguments for decision-makers on the possibility and appropriateness of marketing products and goods
→ Proposal: an automatic approach for analyzing the literature in order to detect sentences related to chemical risk

Material
Processed corpus:
- literature on Bisphenol A-related experiments and results
- over 80,000 word occurrences
- typical statements from scientific and institutional literature used to support decisions on chemical risk management
Linguistic resources:
- negation: no, not, neither, lack, absent, missing
- uncertainty: possible, hypothetical, should, can, may, usually
- limitations: only, shortcoming, small, insufficient
- approximation: approximately, commonly, estimated
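A minimal sketch of how the linguistic resources above could be applied to a sentence. The marker lists are copied from the slide; the dictionary layout, the function name `semantic_tags`, and the exact-form matching (no lemmatization, so "shortcomings" would not match "shortcoming") are assumptions made for illustration, not the authors' implementation.

```python
import re

# Marker lists copied from the slide; the structure itself is illustrative.
LEXICON = {
    "negation": ["no", "not", "neither", "lack", "absent", "missing"],
    "uncertainty": ["possible", "hypothetical", "should", "can", "may", "usually"],
    "limitations": ["only", "shortcoming", "small", "insufficient"],
    "approximation": ["approximately", "commonly", "estimated"],
}

# Reverse index: marker word -> semantic tag.
WORD_TO_TAG = {word: tag for tag, words in LEXICON.items() for word in words}


def semantic_tags(sentence):
    """Return (token, semantic tag) pairs for every lexicon marker in the sentence."""
    # Hypothetical tokenization: lowercase alphabetic runs only.
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [(tok, WORD_TO_TAG[tok]) for tok in tokens if tok in WORD_TO_TAG]


print(semantic_tags("No conclusion can be drawn from this study"))
```

In a fuller pipeline the lookup would more plausibly run on lemmas produced by the tagger rather than on raw forms.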

Material
Classification: types of chemical risk factors and uncertainties
- causal relationship between the chemicals and the induced risk
- laboratory procedures, human factors, animals tested, significance of results, form of reporting, natural variability, control of confounders, exposure, dosage, assumptions, performance of the measurement and analytical method
Reference data:
- manual annotation by a specialist in chemical risk
- 425 segments are assigned to 55 classes of risk factors
- classes do not overlap
- the reference data are a subset of the whole corpus

Methods: supervised categorization
Pre-processing and annotation:
- pre-processing with the Ogmios platform
- tokenization, POS-tagging and lemmatization with the GENIA tagger
- annotation with the linguistic resources
Supervised categorization:
- categories to be recognized:
  - detect sentences concerned with chemical risk
  - detect to which classes of chemical risk the sentences belong
- datasets with equal numbers of positive and negative examples
Features used:
- forms: uncertain, risks
- lemmas: uncertain, risk
- lf (form/lemma): uncertain/uncertain, risks/risk
- tag (part of speech): adj, noun
- lft (form/lemma/tag): uncertain/uncertain/adj, risks/risk/noun
- stag (semantic tag): uncertainty, negation, limitations, approximation
- all: combination of all the features
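The feature sets listed above can be sketched as follows, assuming each token already carries its form, lemma, part-of-speech tag and optional semantic tag. The `Token` type, the `extract` function, the `name=value` prefixing used for `all`, and the `modal` tag in the last example are hypothetical choices for illustration; the slides do not say how the features were encoded.

```python
from typing import NamedTuple, Optional


class Token(NamedTuple):
    form: str
    lemma: str
    tag: str
    stag: Optional[str] = None  # semantic tag from the linguistic resources, if any


# One extractor per feature set named on the slide.
EXTRACTORS = {
    "forms": lambda t: t.form,
    "lemmas": lambda t: t.lemma,
    "lf": lambda t: f"{t.form}/{t.lemma}",
    "tag": lambda t: t.tag,
    "lft": lambda t: f"{t.form}/{t.lemma}/{t.tag}",
    "stag": lambda t: t.stag,
}


def extract(tokens, kind):
    """Return the list of feature values of the given kind for a tokenized sentence."""
    if kind == "all":
        # Prefix each value with its feature-set name so the sets do not collide.
        return [f"{name}={value}" for name in EXTRACTORS for value in extract(tokens, name)]
    values = (EXTRACTORS[kind](t) for t in tokens)
    return [v for v in values if v is not None]  # tokens without a semantic tag are skipped


sentence = [Token("uncertain", "uncertain", "adj"), Token("risks", "risk", "noun")]
print(extract(sentence, "lft"))
print(extract([Token("may", "may", "modal", "uncertainty")], "stag"))
```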

Methods: supervised categorization
Feature weighting:
- freq: raw frequency of features
- norm: normalization of the frequency by the document length
- tfidf: weighting of the frequency by term frequency × inverse document frequency
Baseline:
- assignment of sentences to the default category
- with a two-category test: 50% performance
Evaluation:
- three-fold cross-validation
- precision, recall, F-measure
- gain: real improvement of the performance $P$ by comparison with the baseline $BL$: $\text{gain} = \frac{P - BL}{1 - BL}$
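The three weighting schemes can be illustrated with a small sketch. The slides do not give the exact formulas, so the choices here are assumptions: a document is a list of feature strings, `norm` divides by the number of features in the document, and `tfidf` uses the plain `log(N / df)` variant of inverse document frequency.

```python
import math
from collections import Counter


def freq(doc):
    """Raw frequency of each feature in the document."""
    return Counter(doc)


def norm(doc):
    """Frequency divided by document length (number of features)."""
    return {feature: count / len(doc) for feature, count in Counter(doc).items()}


def tfidf(doc, corpus):
    """Term frequency times log(N / df); assumes doc is one of the corpus documents."""
    n_docs = len(corpus)
    weights = {}
    for feature, count in Counter(doc).items():
        df = sum(1 for other in corpus if feature in other)  # >= 1 because doc is in corpus
        weights[feature] = count * math.log(n_docs / df)
    return weights


# Toy corpus of two already-tokenized sentences (illustrative data only).
corpus = [["risk", "uncertain", "risk"], ["risk", "dose"]]
print(freq(corpus[0]))
print(norm(corpus[0]))
print(tfidf(corpus[0], corpus))
```

With this variant, a feature that occurs in every document (here `risk`) receives a tf-idf weight of zero, which is one reason smoothed variants are often preferred in practice.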

Results and Discussion
[Figure: bar chart of performance (y-axis, 0 to 1) by feature set (x-axis "descriptors": all, form, lemm, lf, lft, stag, tag) for the three weightings freq, norm and tfidf]
- little impact of features and their weighting
- semantic tags: positive effect

Results and Discussion
[Figure: three bar charts, one per class (Natural unexplained variability, Choice of uncertainty factors, Results reporting), each showing performance (0 to 1) by feature set (all, form, lemm, lf, lft, stag, tag) for the weightings freq, norm and tfidf]
Example sentences (in panel order):
- "...inter-individual differences occur in expression of the isoenzymes responsible for the detoxification of BPA"
- "The use of the standard uncertainty factor (UF) of 10 to take into account interspecies differences is therefore considered quite conservative"
- "From the study description, although not clearly stated, it can be inferred that the BPA dose level was 40 microg/kg b.w./day"

Results and Discussion
[Figure: three bar charts, one per class (Significance of results, Control of confounders, Assumptions), each showing performance (0 to 1) by feature set (all, form, lemm, lf, lft, stag, tag) for the weightings freq, norm and tfidf]
Example sentences (in panel order):
- "Based on the re-analysis the Panel considered that no conclusion can be drawn from this study on the effect of BPA on learning and memory behaviour due to large variability in the data"
- "In consideration of the shortcomings in the design of both studies, in particular the uncertainty regarding the lactational as well as in utero exposure of the offspring to BPA..."
- "For this reason it has been hypothesised that circulating UGTs may substantially contribute to detoxification of xenobiotics in the foetus"

Results and Discussion: Gain

Class                            Baseline  F-measure  Gain
Natural unexplained variability  0.50      0.95       0.90
Choice of uncertainty factors    0.50      0.93       0.86
Results reporting                0.50      0.82       0.64
Significance of the results      0.50      0.72       0.44
Control of confounders           0.50      0.60       0.20
Assumptions                      0.50      0.67       0.34

- no direct relation between the number of sentences and performance
- semantic annotation is exploited by the automatic categorization
- lexical specificity of some classes
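As a quick check, reading the gain as (P − BL) / (1 − BL), with P the F-measure and BL the 0.50 baseline, reproduces the Gain column for all six classes. The function name `gain` and the dictionary layout are illustrative.

```python
def gain(performance, baseline=0.5):
    """Share of the possible improvement over the baseline that is actually achieved."""
    return (performance - baseline) / (1 - baseline)


# F-measures copied from the table above.
f_measures = {
    "Natural unexplained variability": 0.95,
    "Choice of uncertainty factors": 0.93,
    "Results reporting": 0.82,
    "Significance of the results": 0.72,
    "Control of confounders": 0.60,
    "Assumptions": 0.67,
}

for name, f in f_measures.items():
    print(f"{name}: {gain(f):.2f}")
```

A gain of 0 means no improvement over the baseline and a gain of 1 means perfect performance, so the measure stays comparable across baselines other than 0.50.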

Conclusion and Future work
Extraction of sentences concerned with chemical risk at two levels:
- chemical risk
- classes of chemical risk
Some classes contain a small number of sentences: not processed individually → oversampling
Performance:
- 0.60-0.70 for classes that are difficult to detect
- 0.82-0.95 for classes that show lexical and semantic specificities
Future work:
- building the dedicated lexicon
- application of over-sampling algorithms
- use of other methods (topic modeling, information retrieval)
- larger corpus, other substances
- evaluation by experts working in environmental agencies
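The slides do not say which over-sampling algorithm is envisaged. As one simple possibility, random over-sampling duplicates examples of the smaller classes until every class matches the largest one. The function name, the `(sentence, label)` representation and the toy data below are assumptions for illustration.

```python
import random
from collections import Counter, defaultdict


def random_oversample(examples, seed=0):
    """Duplicate minority-class examples at random until all classes have equal size.

    examples: list of (sentence, label) pairs.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_label = defaultdict(list)
    for sentence, label in examples:
        by_label[label].append((sentence, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        # Draw the missing examples with replacement from the same class.
        balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced


# Toy data: three sentences of one class, one of another.
data = [("s1", "assumptions"), ("s2", "assumptions"), ("s3", "assumptions"), ("s4", "dosage")]
print(Counter(label for _, label in random_oversample(data)))
```

When combined with cross-validation, over-sampling should be applied to the training folds only, so that duplicated sentences do not leak into the test fold.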
