An introduction to analysing a SNOMED CT coded dataset using a FHIR terminology server
Matt Cordell, Terminology Specialist
A quick introduction to SNOMED CT, FHIR & Ontoserver
SNOMED CT
- Much larger than most other code systems traditionally used in healthcare (ICD, ICPC etc.)
- Primary purpose is recording clinical information with the specificity clinicians require, and supporting interoperability
– Structure* supports secondary uses (analytics).
- Codes have no intrinsic meaning; they are simply identifiers, e.g. 278285008|Left hemiplegia| and 278284007|Right hemiplegia|
- Concepts in the terminology are linked by a range of relationships, forming an ontology.
- Expression Constraint Language (ECL) – language that supports sophisticated queries against the terminology.
FHIR
- Latest interoperability standard from HL7, built on modern RESTful practices. Sets of codes are represented as ValueSet resources.
Ontoserver
- Provides FHIR-based access to terminology, including ECL support
- Made available for use throughout Australia via the National Clinical Terminology Service (NCTS)
ECL in 90 seconds
- <396234004|Infective arthritis|
  All subtypes of Infective arthritis
- <64572001|Disease|: 116676008|Associated morphology| = 23583003|Inflammation|
  All diseases associated with inflammation
- <928000|Musculoskeletal disorder|: 246075003|Causative agent| = <<49872002|Virus|
  Musculoskeletal disorders with some viral involvement
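To run an expression like these against a FHIR terminology server, the ECL is appended to SNOMED CT's implicit ValueSet URL and passed as the `url` parameter of a `ValueSet/$expand` request. The sketch below only builds the encoded query string; it assumes the `http://snomed.info/sct?fhir_vs=ecl/` prefix used elsewhere in this deck, and the helper name `ecl_to_expand_query` is hypothetical.

```python
from urllib.parse import urlencode, parse_qs

# Implicit ValueSet URL prefix for ECL-defined SNOMED CT value sets.
SCT_ECL_PREFIX = "http://snomed.info/sct?fhir_vs=ecl/"

def ecl_to_expand_query(ecl: str) -> str:
    """Return the URL-encoded query string for a ValueSet/$expand call."""
    # urlencode percent-encodes characters such as <, |, : and = that
    # would otherwise be unsafe inside a query string.
    return urlencode({"url": SCT_ECL_PREFIX + ecl})

ecl = "<64572001|Disease|:116676008|Associated morphology|=23583003|Inflammation|"
query = ecl_to_expand_query(ecl)
print(query)

# Decoding the query string recovers the original expression unchanged.
decoded = parse_qs(query)["url"][0]
```

Encoding matters because ECL relies on characters such as `<`, `|` and `:`; HTTP libraries such as `requests` do the same encoding when the value is passed through `params`.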
What might a SNOMED CT dataset look like?
- Rows: 500,000
- Unique conditions: 24,647
- Unique medications: 10,128
Randomly generated synthetic dataset.

Index   Sex  DoB         PostCode  Condition  Medication
0       F    26/04/1998  B03       102930000  7086011000036102
1       F    24/01/1953  E00       49512000   1112071000168105
2       M    7/09/1943   E00       277627005  5604011000036100
3       M    1/01/1966   E00       3109008    3231000036108
4       F    14/02/1957  E00       723409007  6286011000036105
5       M    14/08/1961  E00       3272007    761951000168100
6       F    28/01/1986  C04       86225009   921045011000036104
7       F    15/06/1983  C04       163577001  NaN
8       F    23/05/1967  C04       191737008  927853011000036101
…       …    …           …         …          …
499998  M    16/01/1984  B09       443919007  36227011000036103
499999  M    28/03/1995  B09       723913009  5081011000036108
Basic outline of approach to SNOMED CT analytics
- Define aggregation categories using SNOMED CT Expression Constraint Language (ECL)
- Identify all the codes that match each category, using Ontoserver to perform ValueSet expansions.
- Store the results of each expansion in a Hash Set for fast lookup.
- Use the Sets to filter our dataset, and optionally create human readable labels.
- Use standard analytic approaches to report and visualise the data.
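The five steps above can be sketched end to end without a server or pandas. This is an illustration only: `expand` is a hypothetical stand-in for the terminology-server call, and the code memberships are copied from the labelled sample rows in this deck rather than from a live expansion.

```python
from collections import Counter

# Hypothetical stand-in for the terminology server: ECL -> matching codes.
# Memberships mirror the labelled sample rows, not a live expansion.
STUB_EXPANSIONS = {
    "<<74732009": {"49512000", "443919007", "723913009"},  # Mental illness
    "<<363346000": {"277627005"},                           # Cancer
}

def expand(ecl):
    """Step 2: return every code matching the ECL (stubbed here)."""
    return STUB_EXPANSIONS[ecl]

# Step 1: categories defined as (ECL, label) pairs.
categories = [("<<74732009", "Mental illness"), ("<<363346000", "Cancer")]

# A handful of rows from the sample dataset.
rows = [
    {"Sex": "F", "Condition": "102930000"},
    {"Sex": "F", "Condition": "49512000"},
    {"Sex": "M", "Condition": "277627005"},
    {"Sex": "M", "Condition": "443919007"},
    {"Sex": "M", "Condition": "723913009"},
]

# Step 3: keep each expansion in a set for constant-time membership tests.
category_sets = [(expand(ecl), label) for ecl, label in categories]

# Step 4: label each row; rows matching no category keep the default label.
for row in rows:
    row["Category"] = "Other Condition"
    for codes, label in category_sets:
        if row["Condition"] in codes:
            row["Category"] = label

# Step 5: aggregate, here a simple count per (category, sex).
counts = Counter((row["Category"], row["Sex"]) for row in rows)
for key, n in sorted(counts.items()):
    print(key, n)
```

Using sets keeps each membership test constant-time, which matters once the expansions hold thousands of codes and the dataset holds hundreds of thousands of rows.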
Populate Set with ECL
- Create a GET request with the ECL as a parameter.
- Parse the JSON response into a FHIR ValueSet.
- Iterate through the ValueSet and populate the hash set with just the codes.
- Return the hash set.
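    import requests  # for REST calls
    from fhir.resources.valueset import ValueSet

    def PopulateSetWithECL(ecl):
        """Expand an ECL expression on Ontoserver and return the matching codes as a set."""
        endpoint = "https://ontoserver.csiro.au/stu3-latest"
        expandAPI = "/ValueSet/$expand"
        sctValueSetUrl = "http://snomed.info/sct?fhir_vs=ecl/"
        urlParam = {"url": sctValueSetUrl + ecl}  # requests URL-encodes the ECL
        response = requests.get(endpoint + expandAPI, params=urlParam)
        response.raise_for_status()  # fail early on HTTP errors
        j = response.json()
        vs = ValueSet(j)  # parse into a FHIR ValueSet (constructor call depends on the fhir.resources version)
        _set = set()
        for e in vs.expansion.contains:
            _set.add(e.code)  # keep only the code, not the display term
        return _set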
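The codes can also be pulled straight from the parsed JSON, which avoids depending on a particular FHIR library version. The sketch below is an alternative, not the approach used in this deck: the function name `codes_from_expansion` is hypothetical, the sample response is hand-written, and a flat (non-nested) expansion is assumed.

```python
def codes_from_expansion(value_set: dict) -> set:
    """Collect the codes from a FHIR ValueSet expansion given as parsed JSON."""
    # A failed request typically returns an OperationOutcome, not a ValueSet.
    if value_set.get("resourceType") != "ValueSet":
        raise ValueError("expected a ValueSet resource")
    # 'contains' is optional and is absent when nothing matches.
    contains = value_set.get("expansion", {}).get("contains", [])
    return {entry["code"] for entry in contains if "code" in entry}

# Hand-written sample shaped like an expansion response.
sample = {
    "resourceType": "ValueSet",
    "expansion": {
        "total": 2,
        "contains": [
            {"system": "http://snomed.info/sct", "code": "278285008",
             "display": "Left hemiplegia"},
            {"system": "http://snomed.info/sct", "code": "278284007",
             "display": "Right hemiplegia"},
        ],
    },
}

codes = codes_from_expansion(sample)
empty = codes_from_expansion({"resourceType": "ValueSet", "expansion": {}})
print(sorted(codes))
```

FHIR returns codes as strings, so the dataset column being filtered needs to hold strings too; a column loaded as integers would not match any member of the set.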
Creating Health Condition Labels
- A list of tuples, each tuple consisting of an ECL definition and label
- Iterate through this list
- Create the hash set based on the ECL
- Create Boolean filter for concepts that match the Set
- Label accordingly in a new “Category” column.
    # codeSet is the pandas DataFrame holding the dataset
    healthCategories = [
        ('<<106028002', 'Musculoskeletal problems'),
        ('<<106048009', 'Respiratory problems'),
        ('<<195967001', 'Asthma'),
        ('<<363346000', 'Cancer'),
        ('<<13645005', 'COPD'),
        ('<<73211009', 'Diabetes mellitus'),
        ('<<106063007', 'Cardiovascular problems'),
        ('<<249578005', 'Kidney problems'),
        ('<<74732009', 'Mental illness'),
        ('<<40733004', 'Infectious disease'),
        ('<<414022008', 'Blood disease'),
    ]

    for category in healthCategories:
        categorySet = PopulateSetWithECL(category[0])     # codes matching the ECL
        matches = codeSet["Condition"].isin(categorySet)  # Boolean mask over the rows
        codeSet.loc[matches, "Category"] = category[1]    # later categories overwrite earlier ones

Index   Sex  Condition  Medication         Category
0       F    102930000  7086011000036102   Other Condition
1       F    49512000   1112071000168105   Mental illness
2       M    277627005  5604011000036100   Cancer
…       …    …          …                  …
499998  M    443919007  36227011000036103  Mental illness
499999  M    723913009  5081011000036108   Mental illness
codeSet.groupby(['Category','Sex']).size()
Category                  Sex  Count
Blood disease             F    7741
                          M    3295
Cancer                    F    1909
                          M    3298
Cardiovascular problems   F    13716
                          M    10481
Diabetes mellitus         F    18463
                          M    10362
Infectious disease        F    1435
                          M    368
Kidney problems           F    531
                          M    356
Mental illness            F    106980
                          M    104910
Musculoskeletal problems  F    1817
                          M    1400
Other Condition           F    107163
                          M    105340
Respiratory problems      F    230
                          M    205
Category Overlap
Overlap managed by:
- Ordering categories by priority, then either:
– letting later categories overwrite earlier ones; or
– only labelling rows that are still unlabelled
- Building disjointness into the ECL
<<106048009|Respiratory| MINUS (
    <<363346000|Cancer| OR
    <<106028002|Musculoskeletal| OR
    <<40733004|Infectious| )

The choice is use-case dependent, especially where double counting is a concern.
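The overlap strategies can be compared in a small sketch. The helper names `label_rows` and `minus_ecl` are hypothetical, and the single-letter codes are placeholders rather than real SNOMED CT identifiers; only the ECL identifiers passed to `minus_ecl` come from the expression above.

```python
def label_rows(conditions, category_sets, overwrite=True):
    """Label each condition code from an ordered list of (codes, label) pairs.

    overwrite=True  -> later categories replace earlier labels.
    overwrite=False -> a row keeps the first label it receives.
    """
    labels = [None] * len(conditions)
    for codes, label in category_sets:
        for i, code in enumerate(conditions):
            if code in codes and (overwrite or labels[i] is None):
                labels[i] = label
    return labels

def minus_ecl(base, *excluded):
    """Build an ECL expression for 'base' with the other categories removed."""
    return f"{base} MINUS ({' OR '.join(excluded)})"

# Placeholder codes: "b" belongs to both categories, "c" to neither.
conditions = ["a", "b", "c"]
category_sets = [({"a", "b"}, "Respiratory problems"), ({"b"}, "Asthma")]

last_wins = label_rows(conditions, category_sets, overwrite=True)
first_wins = label_rows(conditions, category_sets, overwrite=False)
disjoint = minus_ecl("<<106048009", "<<363346000", "<<106028002", "<<40733004")
print(last_wins, first_wins, disjoint, sep="\n")
```

With overwriting, the list order acts as a priority from least to most specific; with first-label-wins the order is reversed. Building the exclusion into the ECL makes each category disjoint regardless of processing order.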
Counting Opioids
- Again, iterate through the list as before, adding an “Opioid” label.

    opioids = [
        ('<34841011000036108', 'dihydrocodeine'),
        ('<21821011000036104', 'codeine'),
        ('<21705011000036108', 'pholcodine'),
        ('<21232011000036101', 'buprenorphine'),
        ('<21357011000036109', 'methadone'),
        ('<135971000036102', 'tapentadol'),
        ('<21258011000036102', 'fentanyl'),
        ('<21259011000036105', 'oxycodone'),
        # …
        ('<21252011000036100', 'morphine'),
        ('<21486011000036105', 'tramadol'),
        ('<21901011000036101', 'dextropropoxyphene'),
        ('<34839011000036106', 'pethidine'),
        ('<1247191000168104', 'sufentanil'),
    ]

    for opioid in opioids:
        OpioidSet = PopulateSetWithECL(opioid[0])        # products containing this opioid
        matches = codeSet["Medication"].isin(OpioidSet)  # Boolean mask over the rows
        codeSet.loc[matches, "Opioid"] = opioid[1]
Index   Sex  Medication         Opioid
65      M    7349011000036100   oxycodone
219     M    1070441000168107   codeine
648     F    1048081000168105   buprenorphine
...     ...  ...                ...
499738  F    34022011000036100  methadone
499802  M    785911000168101    fentanyl
499951  M    36062011000036104  dextropropoxyphene
Opioids
Using AMT’s “Concrete domain” in ECL
/* High dose: 200 mg or greater */
<30497011000036103|medicinal product|:
    { 30364011000036101|has Au BoSS| = 1817011000036100|aspirin|,
      700000111000036105|Strength| >= #200,
      177631000036102|has unit| = 700000801000036102|mg/each| },
    [1..1] 700000081000036101|has intended active ingredient| = ANY

Example match: 53798011000036101|Ecotrin 650 mg enteric tablet|

/* Low dose: less than 200 mg */
<30497011000036103|medicinal product|:
    { 30364011000036101|has Au BoSS| = 1817011000036100|aspirin|,
      700000111000036105|Strength| < #200,
      177631000036102|has unit| = 700000801000036102|mg/each| },
    [1..1] 700000081000036101|has intended active ingredient| = ANY

/* Combination aspirin products */
<21719011000036107|aspirin (MP)|:
    [2..*] 700000081000036101|has intended active ingredient| = ANY
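Because the high-dose and low-dose expressions differ only in the comparison on Strength, they can be generated from one template. The sketch below is a convenience for illustration; the function name `aspirin_strength_ecl` is hypothetical, and the identifiers and terms are reused from the expressions above.

```python
# Numeric comparison operators defined by ECL for concrete values.
NUMERIC_OPERATORS = {"=", "!=", "<", "<=", ">", ">="}

def aspirin_strength_ecl(operator: str, mg: int) -> str:
    """Build the single-ingredient aspirin ECL for a strength threshold in mg."""
    if operator not in NUMERIC_OPERATORS:
        raise ValueError(f"unsupported comparison operator: {operator!r}")
    return (
        "<30497011000036103|medicinal product|: "
        "{ 30364011000036101|has Au BoSS|=1817011000036100|aspirin|, "
        f"700000111000036105|Strength| {operator} #{mg}, "
        "177631000036102|has unit|=700000801000036102|mg/each| }, "
        "[1..1] 700000081000036101|has intended active ingredient|=ANY"
    )

high_dose = aspirin_strength_ecl(">=", 200)
low_dose = aspirin_strength_ecl("<", 200)
print(high_dose)
print(low_dose)
```

Each generated string can then be passed to the same expansion function used for the condition and opioid categories.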
“Concrete Domain” expansions
High Dose – 28 concepts
- Solprin 300 mg dispersible tablet
- Disprin Direct 300 mg chewable tablet
- Alka-Seltzer Lemon-Lime 324 mg effervescent tablet
Low Dose – 27 concepts
- Spren 100 mg tablet
- Cardasa 100 mg enteric tablet
- Aspirin Low Dose (Nyal) 100 mg enteric tablet
Combination Products – 54 concepts
- Clopidogrel/Aspirin 75/100 (AN) tablet
- Duoprel 75/100 tablet
- Action Cold and Flu effervescent tablet
Additional Resources
- SNOMED CT ECL Specification: snomed.org/ecl
- Shrimp Browser: ontoserver.csiro.au/shrimp
- Agency ECL examples: github.com/AuDigitalHealth/ecl-examples
- Supplementary Jupyter Notebook: bit.ly/SNOMED_HDA19
Contact us
- Help Centre: 1300 901 001
- Email: help@digitalhealth.gov.au
- Website: healthterminologies.gov.au
- Twitter: twitter.com/AuDigitalHealth