Statistical issues in designing a large-scale reliability exercise in ultrasonography of the joint synovium
Dr Richard Wakefield & Dr Liz Hensor
NIHR Leeds Musculoskeletal Biomedical Research Unit and Leeds Institute of Rheumatic and Musculoskeletal Medicine
01/12/2015

Outline
• What is inflammatory arthritis and why is it important
• Rheumatoid arthritis – synovitis as the target
• The role of US in detecting synovitis and the challenges of measurement
• Description of scoring methods
• The statistical challenges presented by the data
• The rationale for the planned reliability study (IACON)
• The selection of patients to be included
• The creation of the image bank
What is inflammatory arthritis (IA)?
• Arthritis characterized by signs of joint inflammation – stiffness, pain, warmth and swelling
• Common examples include rheumatoid arthritis, psoriatic arthritis and gout
• Each disease has its own target for inflammation, e.g. synovial membrane +/- tendons +/- ligaments

Why is it important?
• If unrecognized, IA leads to increased risk of structural damage (soft tissue and bone), poorer functional outcome and disability
• Good evidence that early aggressive therapy improves outcome, with there being a ‘window of opportunity’
• Concept of ‘Treat to Target’, where the aim is maximal suppression of disease
Rheumatoid disease
• Common cause of disability
• Chronic deforming arthritis + systemic features
• Polyarticular – multiple joints
• Autoimmune – antibodies
• Synovium – site of initiation
  – Membrane that lines joint spaces and tendon sheaths
• If left untreated leads to tendon and bone damage

Polyarticular disease; synovial disease
• Predominantly a disease of wrists and ‘small joints’ of fingers and toes – 85% present this way
• Also affects larger joints
Choy NEJM 2001
Polyarticular disease; synovial disease
[Figure panels: Normal joint; Early RA; Established RA. Choy NEJM 2001]

INFLAMMATION - SYNOVITIS; DAMAGE
[Figure labels: BONE EROSION; TENDON RUPTURE]
Limitations of clinical assessment
• Clinical examination (CE) insensitive and non-specific
• Inflammatory markers (ESR, CRP) do not always correlate with CE
• X-ray – insensitive for detecting mild bone and cartilage changes

Need for new methods of assessment
• MRI – often described as gold standard – tomographic but lacks feasibility, esp. for multiple assessments
• US – widely available, immediate decision making, multi-joint assessment at multiple time points
The ultrasound equipment
[Figure labels: Computer; Probe; Gel; 6-20 MHz]

The US images
• Gray scale (MCP image): qualitative, structural changes
• Doppler, usually PD (MCP image): functional assessment (vascularity)
Conventional scanning views
• Shoulder – posterior GHJ, axillary GHJ (2)
• Elbow – anterior, radio-humeral, posterior (3)
• Wrist – midline, medial and lateral (3)
• MCPJ – dorsal and volar (2)
• PIPJ – dorsal and volar (2)
• Knees – midline, medial and lateral (3)
• MTPJ – dorsal only (1)

Conventional scanning views
[Figure: different views taken per joint]
Scoring systems
• Joint level (per individual joint)
  – Binary (present/absent)
  – Semi-quantitative
    • Commonest 0-3 (OMERACT-EULAR) – for GS and PD (or combined); pragmatic
  – Quantitative
    • Pixel counting
    • Resistive index of vessels (best of 3) – score 0-1
      High RI (> 0.7) - normal
      Low RI (< 0.7) - inflammation
    • Contrast agents – rate of uptake

Scoring systems
• Patient level (multi-joint)
  – Joints chosen might depend on whether early (i.e. for diagnosis) or established disease (for monitoring)
  – Total scores for GS, PD, combined
  – Counts of joints
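For the resistive index listed under the quantitative joint-level methods, a minimal sketch may help. It uses the standard definition RI = (peak systolic velocity - end diastolic velocity) / peak systolic velocity, which is not stated on the slide, together with the slide's 0.7 cut-off. The function names are hypothetical, the handling of an RI of exactly 0.7 is an assumption (the slide does not cover it), and the "best of 3" rule is not implemented because the slide does not define it.

```python
RI_CUTOFF = 0.7  # cut-off quoted on the slide


def resistive_index(peak_systolic: float, end_diastolic: float) -> float:
    """Standard resistive index: (PSV - EDV) / PSV."""
    if peak_systolic <= 0:
        raise ValueError("peak systolic velocity must be positive")
    return (peak_systolic - end_diastolic) / peak_systolic


def suggests_inflammation(ri: float) -> bool:
    """Slide rule: low RI (< 0.7) suggests inflammation, high RI (> 0.7) is normal.

    An RI of exactly 0.7 is treated as not inflamed here; that is an assumption.
    """
    return ri < RI_CUTOFF


if __name__ == "__main__":
    ri = resistive_index(peak_systolic=20.0, end_diastolic=10.0)
    print(ri, suggests_inflammation(ri))
```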
OMERACT-EULAR
[Two figure slides; no text recovered]
Pixel counting
[Figure]

Resistive index
[Figure]
Albrecht K et al. Clin Exp Rheum 2007;25:630-38
Resistive index
[Figure]
Albrecht K et al. Clin Exp Rheum 2007;25:630-38

Challenges of US scoring
• Physical limitations of ultrasound
  – Unable to visualize whole joint (cf. MRI, which is tomographic)
  – Sensitivity of GS and Doppler differs between machines
    • Torp-Pedersen S et al. Arthritis Rheum 2015
Challenges of US scoring
• Standardization of exam
  – Environment
    • Ambient temperature (Ellegaard K et al. Rheumatol 2009)
    • Level of pre-scan physical activity (Ellegaard K et al. Rheum Int 2013)
    • Pre-scan use of medications, e.g. steroids/NSAIDs (Zayat A et al. ARD 2011)
  – Position of joint (Zayat A et al. Rheum 2012)
  – Pressure of probe (Joshua F et al. Australas Radiol 2005)
  – Position of probe (Vlad et al. BMC Musc Disorders 2011)

Knowing what is normal
• Small amounts of fluid and synovial hypertrophy are common in healthy controls
• Identifying which intra- and extra-articular vessels are normal
Methods for testing reliability
• Static
  – Pros: easy to acquire; test multiple times
  – Cons: only best images selected; does not reflect acquisition
• Video
  – Pros: captures whole joint; test multiple times
  – Cons: difficult to acquire in standardised way; video might be biased to reader, i.e. might concentrate on certain areas
• Real-time (patient)
  – Pros: real life: tests reading and acquisition
  – Cons: difficult to organise; less suitable for multiple observers
Statistical challenges
• How to deal with clustered data at the joint level
  – compartments within joints
  – joints within patients
• How to properly assess agreement in joints where inflammation is less prevalent

Statistical challenges
• How to summarise at the patient level
  – Two inter-related elements (GS and PD)
  – Ordinal scaling of total scores
  – Accounting for joint size
Clustered data
• How to combine GS/PD scores from different joint compartments into one score
  – Small joint, e.g. MCPJ – volar and dorsal
  – Large joint, e.g. knee – SPP, MJS and LJS
• Necessary to compare against CE
• Typically maximum score is used
  – Treatment is given at the joint level

Clustered data
• How to deal with clustering of joints within patients when assessing agreement at joint level
• Stratified Kappa is possible
  – Weighted by inverse of variance (Fleiss 2003)
  – Common correlation model (Donner & Klar 1996)
  – Weighting by stratum size (Barlow 1991)
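The two ideas on the clustered-data slides can be sketched in a few lines of Python: taking the maximum compartment score as the joint score, and pooling per-stratum kappas with stratum-size weights. This is an illustration only, not the planned IACON analysis. It uses unweighted Cohen's kappa for two readers, and the function names, the choice of what forms a stratum (for example, a patient), and the decision to skip strata where kappa is undefined are all assumptions.

```python
from collections import Counter


def joint_score(compartment_scores):
    """Joint-level score as the maximum over compartments (e.g. volar and dorsal)."""
    return max(compartment_scores)


def cohen_kappa(r1, r2):
    """Unweighted Cohen's kappa for two readers; None when it is undefined."""
    if len(r1) != len(r2) or not r1:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(r1)
    p_obs = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    p_exp = sum(c1[k] * c2[k] for k in c1) / (n * n)
    if p_exp == 1.0:
        # Both readers used a single identical category (e.g. no inflammation
        # anywhere in the stratum), so chance-corrected agreement is undefined.
        return None
    return (p_obs - p_exp) / (1 - p_exp)


def stratified_kappa(strata):
    """Pool per-stratum kappas, weighting each by its number of rated joints.

    strata maps a stratum label to a (reader1_scores, reader2_scores) pair.
    Strata with undefined kappa are skipped (an assumption of this sketch).
    """
    num = den = 0.0
    for r1, r2 in strata.values():
        k = cohen_kappa(r1, r2)
        if k is None:
            continue
        num += len(r1) * k
        den += len(r1)
    return num / den if den else None


if __name__ == "__main__":
    strata = {
        "patient_A": ([0, 0, 1, 1], [0, 0, 1, 1]),
        "patient_B": ([0, 1, 0, 1], [0, 1, 1, 0]),
        "patient_C": ([0, 0, 0, 0], [0, 0, 0, 0]),  # kappa undefined
    }
    print(stratified_kappa(strata))
```

The skipped stratum in the usage example shows why low prevalence matters: when no inflammation is scored in a stratum, that stratum contributes nothing to the pooled estimate.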
Low prevalence in some joints
• How to assess operator agreement in joints that are rarely affected
  – Agreement may vary by joint type
  – Prevalence of inflammation varies by joint type
  – Hard to measure agreement in less commonly affected joints; inflammation may be absent in sample
  – May require careful selection of individuals

Patient-level data
• Total GS / total PD (summated 0-3 scores)
• Counts of joints with GS present / PD present
• Combined GS and PD

Combined score (rows: GS grade; columns: PD grade)
GS \ PD   0   1   2   3
0         0
1         1   1   2   3
2         2   2   2   3
3         3   3   3   3
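The patient-level summaries and the combined GS/PD table can be written out as a short Python sketch. Every filled cell of the table equals the higher of the two grades, so the sketch uses `max`. The table shows no entries for GS 0 with PD above 0, and returning `None` for those cells is an assumption, as are the function names and the use of "grade above 0" to mean "present".

```python
VALID_GRADES = (0, 1, 2, 3)


def combined_score(gs, pd_grade):
    """Combined GS/PD grade following the slide's table (higher of the two)."""
    if gs not in VALID_GRADES or pd_grade not in VALID_GRADES:
        raise ValueError("grades must be integers 0-3")
    if gs == 0 and pd_grade > 0:
        # The table shows no entry for these cells; treated as undefined here.
        return None
    return max(gs, pd_grade)


def patient_summary(joints):
    """Summarise one patient from a list of (GS grade, PD grade) pairs, one per joint."""
    return {
        "total_gs": sum(gs for gs, _ in joints),
        "total_pd": sum(pd_grade for _, pd_grade in joints),
        # "Present" is taken as grade > 0, which is an assumption of this sketch.
        "gs_joint_count": sum(gs > 0 for gs, _ in joints),
        "pd_joint_count": sum(pd_grade > 0 for _, pd_grade in joints),
    }


if __name__ == "__main__":
    joints = [(1, 0), (2, 2), (0, 0)]
    print(patient_summary(joints))
    print([combined_score(gs, pd_grade) for gs, pd_grade in joints])
```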
Ordinal scaling
• Although described as semi-quantitative at joint level, scores cannot be considered interval-scaled
  – GS: absent; mild; moderate; marked hypertrophy
  – PD:
    • Grade 0 = no flow in the synovium (gray scale area)
    • Grade 1 = up to 3 single spot signals, or up to 2 confluent spots, or 1 confluent spot + up to 2 single spots
    • Grade 2 = vessel signals in less than half of the area of the synovium (< 50%)
    • Grade 3 = vessel signals in more than half of the area of the synovium (> 50%)

Ordinal scaling
• Ordinal scales not valid for longitudinal changes
• Limits usefulness of US scores as clinical trial outcomes
Ordinal scaling
[Figure]

Accounting for joint size
• Should joints be weighted in total scores and counts?
• Lansbury & Haut 1956
  – Used component bone ends of skeleton joints
  – Carefully covered cartilage areas with Al foil
  – Weighed several times
  – Converted to surface area
Accounting for joint size
[Figure]

Item response theory
• Rasch model (single parameter model)
  – Probabilistic form of Guttman scaling
• Model tests data for measurement axioms:
  – Unidimensionality (required for valid total score)
  – Invariance of item ordering
  – Appropriate category ordering
  – Absence of differential item functioning
  – Absence of residual correlation
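For reference, the single-parameter Rasch model in its simplest (dichotomous) form can be written as below. The notation is an assumption rather than taken from the slides: $\theta_n$ is the latent trait level of person $n$ (here, overall inflammatory burden) and $b_i$ is the difficulty of item $i$ (here, a joint). Graded 0-3 scores would need a polytomous extension of this form, which is where the category-ordering check applies.

```latex
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}
```

Because each item has only a location parameter, persons and items sit on one common scale, which is why unidimensionality is needed before a total score is treated as a valid summary.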
Item response category ordering
[Figure]

Item response theory
• Targeting of persons and items
• Reliability
  – Extent to which scale can reliably distinguish between people with different levels of the latent trait
• Sample size (n=200 ideally)
• Software: RUMM, WINSTEPS, Stata, SAS
Example of poorly targeted scale
[Figure]

Rationale for the Leeds study
• Small scale reliability studies common
  – Often added onto an existing study
  – Rarely powered
  – Inclusion criteria often at odds with requirements for reliability
• Potentially misleading & wasteful of resources