Extractive Evidence Based Medicine Summarisation Based on Sentence-Specific Statistics

  1. Extractive Evidence Based Medicine Summarisation Based on Sentence-Specific Statistics
     Abeed Sarker¹, Diego Mollá¹, Cécile Paris²
     ¹ Centre for Language Technology, Macquarie University, Sydney
     ² CSIRO ICT Centre, Sydney
     CBMS 2012, Rome

  2. Contents
     - Background
       - Evidence Based Medicine
     - Method
       - Corpus
       - Generation of Statistics
     - Evaluation
     Navigation bar: Background | Method | Evaluation
     Footer: EBM Summarisation · Abeed Sarker, Diego Mollá, Cécile Paris · 2/28

  5. Evidence Based Medicine
     Source: http://laikaspoetnik.wordpress.com/2009/04/04/evidence-based-medicine-the-facebook-of-medicine/

  6. EBM and Natural Language Processing
     Source: http://hlwiki.slais.ubc.ca/index.php?title=Five_steps_of_EBM
     NLP tasks
     - Question analysis and classification
     - Information retrieval
     - Classification and re-ranking
     - Information extraction
     - Question answering
     - Summarisation

  8. General Approach
     In a Nutshell
     1. Gather statistics from the best 3-sentence extracts.
        - Exhaustive search to find these best extracts.
     2. Build three classifiers, one per sentence in the final extract.
        - Classifier 1 based on statistics from the best 1st sentence.
        - Classifier 2 based on statistics from the best 2nd sentence.
        - Classifier 3 based on statistics from the best 3rd sentence.
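
The exhaustive search in step 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `best_extract` and `unigram_f1` are hypothetical, and the unigram-overlap scorer is only a stand-in for whatever quality measure ranks the extracts (the deck evaluates with ROUGE-L).

```python
from itertools import combinations


def unigram_f1(candidate, target):
    """Stand-in scorer (an assumption): F1 over distinct lowercase tokens."""
    cand, ref = set(candidate.lower().split()), set(target.lower().split())
    overlap = len(cand & ref)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def best_extract(sentences, target, size=3, score=unigram_f1):
    """Try every `size`-sentence combination, in source order, and keep the best."""
    size = min(size, len(sentences))
    best_idx, best_score = (), -1.0
    for idx in combinations(range(len(sentences)), size):
        current = score(" ".join(sentences[i] for i in idx), target)
        if current > best_score:
            best_idx, best_score = idx, current
    return best_idx, best_score


# Toy abstract and target, invented for illustration only.
abstract = [
    "excision is effective",
    "the weather was mild",
    "recurrence was low after excision",
    "patients were followed for months",
    "funding came from grants",
]
target = "excision is effective and recurrence was low for patients followed for months"
indices, quality = best_extract(abstract, target)
```

For an abstract of n sentences this evaluates n-choose-3 candidates, which stays cheap at typical abstract lengths.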

  10. Journal of Family Practice’s “Clinical Inquiries”

  11. The XML Contents I
      <record id="7843">
        <url>http://www.jfponline.com/Pages.asp?AID=7843&amp;issue=September 2009&amp;UID=</url>
        <question>Which treatments work best for hemorrhoids?</question>
        <answer>
          <snip id="1">
            <sniptext>Excision is the most effective treatment for thrombosed external hemorrhoids.</sniptext>
            <sor type="B">retrospective studies</sor>
            <long id="1_1">
              <longtext>A retrospective study of 231 patients treated conservatively or surgically found that the 48.5% of patients treated surgically had a lower recurrence rate than the conservative group (number needed to treat [NNT]=2 for recurrence at mean follow-up of 7.6 months) and earlier resolution of symptoms (average 3.9 days compared with 24 days for conservative treatment).</longtext>
              <ref id="15486746" abstract="Abstracts/15486746.xml">Greenspon J, Williams SB, Young HA, et al. Thrombosed external hemorrhoids: outcome after conservative or surgical management. Dis Colon Rectum. 2004; 47: 1493-1498.</ref>
            </long>
            <long id="1_2">
              <longtext>A retrospective analysis of 340 patients who underwent outpatient excision of thrombosed external hemorrhoids under local anesthesia reported a low recurrence rate of 6.5% at a mean follow-up of 17.3 months.</longtext>

  12. The XML Contents II
              <ref id="12972967" abstract="Abstracts/12972967.xml">Jongen J, Bach S, Stubinger SH, et al. Excision of thrombosed external hemorrhoids under local anesthesia: a retrospective evaluation of 340 patients. Dis Colon Rectum. 2003; 46: 1226-1231.</ref>
            </long>
            <long id="1_3">
              <longtext>A prospective, randomized controlled trial (RCT) of 98 patients treated nonsurgically found improved pain relief with a combination of topical nifedipine 0.3% and lidocaine 1.5% compared with lidocaine alone. The NNT for complete pain relief at 7 days was 3.</longtext>
              <ref id="11289288" abstract="Abstracts/11289288.xml">Perrotti P, Antropoli C, Molino D, et al. Conservative treatment of acute thrombosed external hemorrhoids with topical nifedipine. Dis Colon Rectum. 2001; 44: 405-409.</ref>
            </long>
          </snip>
        </answer>
      </record>
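
A record of this shape could be read with Python's standard `xml.etree.ElementTree`, as in the sketch below. The function name `summarisation_pairs` is hypothetical, the sample is a cut-down copy of the record shown on the slides, and the underscore in the `long` id (`1_2`) is an assumption about what the slide extraction lost.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the slide's record; the "1_2" id form is an assumption.
SAMPLE = """<record id="7843">
  <question>Which treatments work best for hemorrhoids?</question>
  <answer>
    <snip id="1">
      <sniptext>Excision is the most effective treatment for thrombosed external hemorrhoids.</sniptext>
      <sor type="B">retrospective studies</sor>
      <long id="1_2">
        <longtext>A retrospective analysis of 340 patients who underwent outpatient excision of thrombosed external hemorrhoids under local anesthesia reported a low recurrence rate of 6.5% at a mean follow-up of 17.3 months.</longtext>
        <ref id="12972967" abstract="Abstracts/12972967.xml">Jongen J, Bach S, Stubinger SH, et al.</ref>
      </long>
    </snip>
  </answer>
</record>"""


def summarisation_pairs(record_xml):
    """Yield (question, abstract path, target summary) for each <ref> in a record."""
    record = ET.fromstring(record_xml)
    question = record.findtext("question", default="").strip()
    for long_el in record.iter("long"):
        target = long_el.findtext("longtext", default="").strip()
        for ref in long_el.iter("ref"):
            yield question, ref.get("abstract"), target


pairs = list(summarisation_pairs(SAMPLE))
```

Each tuple pairs the question with one referenced abstract and the human-written evidence text that serves as the target summary.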

  13. Corpus Statistics
      Size
      - 456 questions (“records”).
      - Over 1,100 distinct answers (“snips”).
      - 3,036 text explanations (“longs”).
      - 2,707 references.

  14. Summarisation Using This Corpus
      Input
      - Question.
      - Document abstract.
      Output
      - Extractive summary that answers the question.
      - Target summary is the annotated evidence text (“long”).
      - Evaluated using ROUGE-L.
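
ROUGE-L scores a candidate by the longest common subsequence (LCS) of words it shares with the reference. The sketch below is a simplified sentence-level variant, assuming whitespace tokenisation, lowercasing, and a balanced F-measure (`beta=1.0`); the official ROUGE toolkit applies its own preprocessing and settings, which the slides do not specify.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f(candidate, reference, beta=1.0):
    """LCS-based F-measure; beta weights recall against precision."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

Unlike n-gram overlap, the LCS rewards words that appear in the same order without requiring them to be adjacent.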

  16. The Statistics Gathered
      1. Source sentence position.
      2. Sentence length.
      3. Sentence similarity.
      4. Sentence type.

  17. 1. Source Sentence Position
      - Compute relative positions.
      - Create normalised frequency histograms $f_1, f_2, \ldots, f_{10}$.
      - Score all relative positions of bin $i$ with its bin frequency: $S_{pos}(i) = f_{bin(i)}$.
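
A ten-bin position histogram of this kind might be built as below. The binning convention (0-based sentence index, integer arithmetic) and the function names are assumptions, and the training positions are invented for illustration; presumably one such histogram is kept per summary sentence, matching the three classifiers.

```python
def position_bin(index, n_sentences, n_bins=10):
    """Map a 0-based sentence index to a relative-position bin (assumed convention)."""
    return min(index * n_bins // n_sentences, n_bins - 1)


def position_histogram(best_positions, n_bins=10):
    """Normalised bin frequencies from (index, n_sentences) pairs of best sentences."""
    counts = [0] * n_bins
    for index, n_sentences in best_positions:
        counts[position_bin(index, n_sentences, n_bins)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def position_score(index, n_sentences, hist):
    """S_pos: the frequency of the bin that the sentence falls into."""
    return hist[position_bin(index, n_sentences, len(hist))]


# Hypothetical training data: where the best sentence sat in five abstracts.
hist = position_histogram([(0, 10), (9, 10), (19, 20), (8, 10), (11, 12)])
```

With these toy positions most of the mass lands in the final bin, so late sentences receive the highest position score.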

  18. 2. Sentence Length
      Reward larger sentences and penalise shorter sentences.
      Normalised sentence length: $S_{len}(i) = \frac{l_s - l_{avg}}{l_d}$
      - $l_s$: sentence length
      - $l_{avg}$: average sentence length in the corpus
      - $l_d$: document length
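
As a small worked example of the formula, assuming all three lengths are counted in words (the slide does not state the unit) and using a hypothetical function name:

```python
def length_score(sentence_len, avg_len, doc_len):
    """S_len = (l_s - l_avg) / l_d, with lengths in words (an assumption)."""
    if doc_len <= 0:
        raise ValueError("document length must be positive")
    return (sentence_len - avg_len) / doc_len


# Invented numbers: a 200-word abstract in a corpus averaging 20 words per sentence.
longer = length_score(30, 20, 200)
shorter = length_score(10, 20, 200)
```

A sentence longer than the corpus average gets a positive score and a shorter one a negative score, scaled by the length of its own document.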

  19. 3. Sentence Similarity
      Sentence Similarity
      - Lowercase, stem, remove stop words.
      - Build vector of tf.idf with remaining words and UMLS semantic types.
      - $CosSim(X, Y) = \frac{X \cdot Y}{|X|\,|Y|}$
      Maximal Marginal Relevance (Carbonell & Goldstein, 1998)
      Reward sentences similar to the query and penalise those similar to other summary sentences.
      $MMR = \lambda \, CosSim(S_i, Q) - (1 - \lambda) \max_{S_j \in S} CosSim(S_i, S_j)$
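
The two formulas can be sketched over sparse vectors as follows. This is an illustration under simplifying assumptions: raw term counts stand in for tf.idf weights, stemming, stop-word removal and UMLS semantic types are omitted, and the function names are hypothetical.

```python
import math
from collections import Counter


def bag(text):
    """Raw term counts; a stand-in for the tf.idf and UMLS features."""
    return Counter(text.lower().split())


def cos_sim(x, y):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(weight * y.get(term, 0.0) for term, weight in x.items())
    norm = math.sqrt(sum(w * w for w in x.values())) * math.sqrt(sum(w * w for w in y.values()))
    return dot / norm if norm else 0.0


def mmr(candidate, query, selected, lam=0.1):
    """lam * relevance to the query minus (1 - lam) * worst-case redundancy."""
    redundancy = max((cos_sim(candidate, s) for s in selected), default=0.0)
    return lam * cos_sim(candidate, query) - (1 - lam) * redundancy


query = bag("hemorrhoids treatment")
sentence = bag("excision treatment hemorrhoids")
first_pick = mmr(sentence, query, selected=[])
repeat_pick = mmr(sentence, query, selected=[sentence])
```

With nothing selected yet the redundancy term vanishes and the score is just the scaled query similarity; once an identical sentence is in the summary the score turns strongly negative.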

  20. 4. PIBOSO (Kim et al. 2011) I
      1. Classify all sentences into PIBOSO types (a variant of PICO).
      2. Generate normalised frequency histograms of resulting PIBOSO types.

  21. 4. PIBOSO (Kim et al. 2011) II
      Position independent: $S_{PIPS}(i) = \frac{P_{best}}{P_{all}}$
      - $P_{best}$: proportion of this PIBOSO type among all best summary sentences.
      - $P_{all}$: proportion of this PIBOSO type among all sentences.
      Position dependent: $S_{PDPS}(i) = \frac{P_{pos}}{P_{best}}$
      - $P_{pos}$: proportion of this PIBOSO type among the best summary sentences at this position.
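
Both ratios can be computed from label counts, as in this sketch. The label lists are invented toy data, the function names are hypothetical, and returning 0.0 for a type never seen in the denominator is an assumption the slides do not address.

```python
from collections import Counter


def proportions(labels):
    """Normalised frequency of each PIBOSO type in a list of sentence labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def pips(label, p_best, p_all):
    """Position-independent score: P_best / P_all (0.0 if the type is unseen)."""
    return p_best.get(label, 0.0) / p_all[label] if p_all.get(label) else 0.0


def pdps(label, p_pos, p_best):
    """Position-dependent score: P_pos / P_best (0.0 if the type is unseen)."""
    return p_pos.get(label, 0.0) / p_best[label] if p_best.get(label) else 0.0


# Toy labels: all sentences, and the best sentences at summary positions 1 to 3.
all_labels = ["Background"] * 4 + ["Intervention"] * 2 + ["Outcome"] * 4
best_by_pos = {
    1: ["Background", "Intervention"],
    2: ["Outcome", "Outcome"],
    3: ["Outcome", "Outcome"],
}
p_all = proportions(all_labels)
p_best = proportions([lab for labs in best_by_pos.values() for lab in labs])
p_pos3 = proportions(best_by_pos[3])
```

In this toy data Outcome sentences are over-represented among the best sentences (PIPS above 1) and even more so at position 3 (PDPS above 1).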

  22. Classification
      Edmundsonian Formula
      $S_{S_i} = \alpha S_{rpos_i} + \beta S_{len_i} + \gamma S_{PIPS_i} + \delta S_{PDPS_i} + \epsilon S_{MMR_i}$
      - MMR is replaced with cosine similarity for the first sentence.
      - In case of ties, the sentence with the greatest length is chosen.
      - Parameters are fine-tuned through exhaustive search using the training set.
      $\alpha = 1.0$, $\beta = 0.8$, $\gamma = 0.1$, $\delta = 0.8$, $\epsilon = 0.1$, $\lambda = 0.1$.
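
The weighted combination and the tie-break can be sketched as below, using the weights reported on the slide. The `Candidate` container, the score keys, and the example scores are hypothetical; the individual statistics are assumed to have been computed beforehand.

```python
from dataclasses import dataclass

# Weights from the slide: alpha, beta, gamma, delta, epsilon.
WEIGHTS = {"rpos": 1.0, "len": 0.8, "pips": 0.1, "pdps": 0.8, "mmr": 0.1}


@dataclass
class Candidate:
    text: str
    scores: dict  # one precomputed value per key in WEIGHTS


def combined_score(scores, weights=WEIGHTS):
    """Weighted linear sum of the five sentence statistics."""
    return sum(weights[name] * scores[name] for name in weights)


def pick_sentence(candidates):
    """Highest combined score wins; ties go to the longer sentence (in words)."""
    return max(candidates, key=lambda c: (combined_score(c.scores), len(c.text.split())))


# Invented scores for illustration.
even = {"rpos": 0.5, "len": 0.5, "pips": 0.5, "pdps": 0.5, "mmr": 0.5}
short = Candidate("short sentence", dict(even))
longer = Candidate("a much longer candidate sentence", dict(even))
well_placed = Candidate("x", {**even, "rpos": 1.0})
```

Sorting on a (score, length) tuple keeps the tie-break in one place; for the first summary sentence the `mmr` entry would hold plain cosine similarity to the question.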
