The interplay of form and meaning in complex medical terms: Evidence from clinical Dutch

  1. The interplay of form and meaning in complex medical terms Evidence from clinical Dutch Leonie Grön, Ann Bertels & Kris Heylen LAW-MWE-CxG-2018, 26 August 2018, Santa Fe

  2. Specialized terminologies are dominated by Multi-Word Expressions (cf. e.g. Daille, 1994; De Hertog & Heylen, 2012)

  6. property_of

  7. finding: site (abdomen, head); severity (mild, moderate)
      procedure: site (abdomen, head); device used (catheter)

  8. obesity + abdomen:
      abdominale obesitas ‘abdominal obesity’
      obesitas ter hoogte van abdomen ‘obesity at the abdomen’
      abdomen obees ‘abdomen obese’
      obesitas abdominaal ‘obesity abdominal’
      obesitas abdomen ‘obesity abdomen’

  9. BUT: specialized information is entrenched in linguistic structures; there is mutual attraction between syntactic & semantic structures; grammatical features can indicate conceptual relations (cf. Schulze & Römer, 2008; Faber & León-Araúz, 2016; ten Hacken, 2015)

  10. Is there a patterning of the surface form and of the conceptual features of complex medical terms?

  11. Annotation of medical terms with SNOMED codes
      ID 249533007: abdomen obees
      ID 249533007: abdominale obesitas
      ID 249533007: abd obesitas
      corpus of EHRs: 14,999 consultations; 500 patients
      annotation of: 4,426 consultations; 171 patients
      result: 274,082 entities; 15,025 unique terms; 7,687 concepts
      validation of term-concept associations

  12. Retrieval of MWEs
      findings: SNOMED term: obesity; Dutch variants: obesitas, adipositas; lexical stems: obes, adipo; Σ 59,731
      procedures: SNOMED term: diagnostic ultrasonography; Dutch variants: echografie, sonografie; lexical stems: echo, sono; Σ 63,559
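
The annotation step on slide 11 links several surface variants to one SNOMED concept ID. A minimal Python sketch of that grouping follows, using only the three example terms from the slide. The function name `group_variants` and the lowercasing step are assumptions made for illustration, not the authors' pipeline.

```python
from collections import defaultdict

# (SNOMED concept ID, Dutch term) pairs copied from the slide's example.
annotations = [
    ("249533007", "abdomen obees"),
    ("249533007", "abdominale obesitas"),
    ("249533007", "abd obesitas"),
]


def group_variants(pairs):
    """Map each concept ID to the sorted list of unique terms annotated with it.

    Hypothetical helper: terms are lowercased and stripped before grouping.
    """
    variants = defaultdict(set)
    for concept_id, term in pairs:
        variants[concept_id].add(term.lower().strip())
    return {cid: sorted(terms) for cid, terms in variants.items()}


concept_variants = group_variants(annotations)
print(concept_variants)
```

Counting the keys and values of such a mapping is one way to arrive at figures like "unique terms" and "concepts" for a corpus.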

  13. Included types of MWEs (positions -3 to +3 around the head noun): ochtend hypo ‘morning hypoglycemia’; matinale hypo ‘matinal hypoglycemia’; hypo met forse convulsie ‘hypoglycemia with strong seizure’

  14. Included types of MWEs: compounds, e.g. ochtend hypo ‘morning hypoglycemia’

  15. Included types of MWEs: pre-modified noun phrases, e.g. matinale hypo ‘matinal hypoglycemia’

  16. Included types of MWEs: post-modified noun phrases, e.g. hypo met forse convulsie ‘hypoglycemia with strong seizure’
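
The three phrase types on slides 13 to 16 can be told apart by where the modifiers sit relative to the head noun. The sketch below is a simplified illustration in Python, not the authors' procedure. The function `classify_mwe`, the decision rules, and the PoS tags assigned to the example tokens are all assumptions.

```python
def classify_mwe(tagged, head_index, window=3):
    """Classify an expression by the position of its modifiers.

    tagged: list of (token, PoS tag) pairs; head_index: position of the head noun.
    Assumed rules (illustrative only): material after the head means a
    post-modified NP; a noun directly before the head means a compound;
    any other preceding material means a pre-modified NP.
    """
    left = tagged[max(0, head_index - window):head_index]
    right = tagged[head_index + 1:head_index + 1 + window]
    if right:
        return "post-modified NP"
    if left and left[-1][1].startswith("NN"):
        return "compound"
    if left:
        return "pre-modified NP"
    return "single word"


# Examples from the slides; the tags are assumed for illustration.
print(classify_mwe([("ochtend", "NN"), ("hypo", "NN")], head_index=1))
print(classify_mwe([("matinale", "JJ"), ("hypo", "NN")], head_index=1))
print(classify_mwe(
    [("hypo", "NN"), ("met", "IN"), ("forse", "JJ"), ("convulsie", "NN")],
    head_index=0,
))
```

Expressions with modifiers on both sides are labelled post-modified here, which is a simplification of this sketch.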

  17. Annotation of MWEs at 2 levels: formal (Penn Tagset for biomedical text) and conceptual (SNOMED semantic classes & attributes) (de Castilho et al., 2016; Warner et al., 2012; SNOMED International, 2018)

  18. diabetische retinopathie ‘diabetic retinopathy’: formal JJ NN; conceptual CAUSE FINDING

  19. injectie insuline ‘injection [of] insulin’: formal NN NN; conceptual PROCEDURE SUBSTANCE

  20. Analysis at phrase level: Influence of the semantic type of the headword? ~ proportion of phrase types; ~ degree of lexicalization

  21. Distribution of phrase types [bar chart: share (0% to 100%) of compounds, pre-modified NPs and post-modified NPs for procedures and findings]

  22. Average number of unique expressions per concept across different phrase types
      procedures: compounds 2.57; pre-modified NPs 3.69; post-modified NPs 3.38
      findings: compounds 1.33; pre-modified NPs 3.63; post-modified NPs 2.83

  23. Analysis at token level: Patterning of concept combinations? ~ grammatical structures

  24. Associate expressions with overlapping tag sequences with grammatico-semantic patterns
      rx thorax ‘x-ray [of the] chest’, CT schedel ‘CT [of the] skull’: (NN, NN) (PROCEDURE, SITE)
      abdominale injectie ‘abdominal injection’: (JJ, NN) (SITE, PROCEDURE)
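
The measure on slide 22 can be read as: for each phrase type, count the distinct expressions observed per concept and average those counts over concepts. The Python sketch below follows that reading. The records are hypothetical placeholders (`c1`, `a`, and so on), not data from the study, and `mean_unique_expressions` is a name introduced here.

```python
from collections import defaultdict

# Hypothetical toy records: (concept ID, phrase type, expression).
records = [
    ("c1", "compound", "a"),
    ("c1", "compound", "b"),
    ("c1", "compound", "a"),  # repeated expression, counted once
    ("c2", "compound", "c"),
    ("c1", "pre-modified NP", "d"),
]


def mean_unique_expressions(rows):
    """Average number of distinct expressions per concept, per phrase type."""
    seen = defaultdict(lambda: defaultdict(set))
    for concept, phrase_type, expression in rows:
        seen[phrase_type][concept].add(expression)
    return {
        phrase_type: sum(len(exprs) for exprs in by_concept.values()) / len(by_concept)
        for phrase_type, by_concept in seen.items()
    }


averages = mean_unique_expressions(records)
print(averages)
```

On this reading, a lower average means fewer competing variants per concept for that phrase type.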

  25. Relative frequency = frequency of the grammatico-semantic pattern / absolute frequency of the concept combination: how dominant is a construction to express a combination of concepts?
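
The ratio on slide 25 can be sketched in Python as follows. The sketch assumes that a grammatico-semantic pattern is an ordered pair of PoS tags and semantic tags, and that a concept combination is the unordered set of semantic tags. That reading, the function name `relative_frequencies`, and the toy counts are assumptions, not figures from the study.

```python
from collections import Counter


def relative_frequencies(observations):
    """Share of each grammatico-semantic pattern within its concept combination.

    observations: iterable of (PoS tag sequence, semantic tag sequence) pairs.
    """
    pattern_counts = Counter()
    combination_counts = Counter()
    for pos_tags, sem_tags in observations:
        pattern_counts[(tuple(pos_tags), tuple(sem_tags))] += 1
        combination_counts[frozenset(sem_tags)] += 1
    return {
        pattern: count / combination_counts[frozenset(pattern[1])]
        for pattern, count in pattern_counts.items()
    }


# Hypothetical token counts; the tag sequences follow the slide's examples.
observations = [
    (("NN", "NN"), ("PROCEDURE", "SITE")),  # e.g. rx thorax
    (("NN", "NN"), ("PROCEDURE", "SITE")),  # e.g. CT schedel
    (("NN", "NN"), ("PROCEDURE", "SITE")),
    (("JJ", "NN"), ("SITE", "PROCEDURE")),  # e.g. abdominale injectie
]
rel = relative_frequencies(observations)
print(rel)
```

A value close to 1 would indicate that a single construction dominates the expression of that concept combination.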

  26. Top patterns for findings (combined with; PoS sequence; example; relative frequency)
      CAUSE; JJ, NN; alimentaire obesitas ‘alimentary obesity’; 0.90
      COURSE; RB, NN; vaak hypo ‘frequently hypoglycemia’; 0.35
      SEVERITY; JJ, NN; morbiede obesitas ‘morbid obesity’; 0.83

  28. Top patterns for procedures (combined with; PoS sequence; example; relative frequency)
      COMPONENT; NNS, NN; lipidenmeting ‘measurement of lipids’; 0.44
      COMPONENT, PROPERTY; JJ, NNS, NN; gunstig lipidenprofiel ‘good lipid profile’; 0.71
      COMPONENT, TIME; NN, NN, NNS; glycemiedagprofielen ‘glycemic day profiles’; 0.72

  30. conceptual composition ~ formal structure of medical MWEs: findings ~ pre-modified NPs; procedures ~ nominal compounds

  31. one reason: lexical gaps
      finding combined with SEVERITY: adjective extreem ‘extreme’; noun *extremiteit ‘extremity’; adj + noun: extreme obesitas ‘extreme obesity’
      procedure combined with SUBSTANCE: adjective –; noun insuline ‘insulin’; noun + noun: insulineinjectie ‘insulin injection’

  32. BUT: tendency is robust across concept combinations!
      combined with SITE: adjective renaal ‘renal’; noun nier ‘kidney’
      finding: adj + noun: renale insufficiëntie ‘renal insufficiency’
      procedure: noun + noun: nierecho ‘kidney echography’; echo nier ‘echography [of the] kidney’

  33. structural reductions ~ fixed concept combinations (full prepositional phrase > reduced prepositional phrase)
      procedure combined with COMPONENT: meting van de lipiden ‘measurement of lipids’ > meting lipiden ‘measurement lipids’
      procedure combined with SITE: rx van de thorax ‘x-ray of the thorax’ > rx thorax ‘x-ray thorax’
      finding combined with SITE: lipodistrofie ter hoogte van het abdomen ‘lipodystrophy at the abdomen’ > lipodistrofie abdomen ‘lipodystrophy abdomen’
      finding combined with CAUSE: nefropathie ten gevolge van diabetes ‘nephropathy due to diabetes’ > *nephropathie diabetes ‘nephropathy diabetes’

  34. complex medical terms: fixed concept constellations, habitual formal constructions; constructions take on communicative value in themselves

  35. benefit for clinical NLP: identification and segmentation of MWEs; semantic classification and relation extraction

  36. Thank you for your attention! Questions? Suggestions? leonie.gron@kuleuven.be

  37. References
      de Castilho, R. E., Mujdricza-Maydt, E., Yimam, S. M., Hartmann, S., Gurevych, I., Frank, A., & Biemann, C. (2016). A Web-based Tool for the Integrated Annotation of Semantic and Syntactic Structures. In Proceedings of the LT4DH workshop at COLING 2016 (pp. 76–84). Osaka.
      Daille, B. (1994). Study and Implementation of Combined Techniques for Automatic Extraction of Terminology. In The Balancing Act: Combining Symbolic and Statistical Approaches to Language. Workshop at the 32nd Annual Meeting of the Association for Computational Linguistics (pp. 29–36). Stroudsburg: Association for Computational Linguistics.
      De Hertog, D., & Heylen, K. (2012). The Prevalence of Multiword Term Candidates in a Legal Corpus. In G. Aguado de Cea (Ed.), Proceedings of the 10th Terminology and Knowledge Engineering Conference (TKE2012): New Frontiers in the Constructive Symbiosis of Terminology and Knowledge Engineering (pp. 283–290). Madrid: Universidad Politécnica de Madrid.

  38. References
      Faber, P., & León-Araúz, P. (2016). Specialized Knowledge Representation and the Parameterization of Context. Frontiers in Psychology, 7 (February). http://doi.org/10.3389/fpsyg.2016.00196
      Warner, C., Lanfranchi, A., O’Gorman, T., Howard, A., Gould, K., & Regan, M. (2012). Bracketing Biomedical Text: An Addendum to Penn Treebank II Guidelines. Retrieved May 14, 2018, from https://clear.colorado.edu/compsem/documents/treebank_guidelines.pdf
      Schulze, R., & Römer, U. (2008). Introduction. Patterns, Meaningful Units and Specialized Discourses. International Journal of Corpus Linguistics, 13 (3), 265–270. http://doi.org/10.1075/ijcl.13.3.01sch
      SNOMED International. (2018). SNOMED CT Editorial Guide. Retrieved May 14, 2018, from https://confluence.ihtsdotools.org/display/DOCEG/SNOMED+CT+Editorial+Guide

  39. References Icons from the Noun Project created by Ben Davis Cengis SARI Drishya Ken Murray Melvin https://thenounproject.com/
