SLIDE 1 What We Learned By Using the Rigor Metric
Emily S. Patterson, PhD
(and numerous colleagues)
SLIDE 2
August 7, 1998: Bombings of US Embassies in Africa
224 killed, including 12 US personnel
Need for Analytic Rigor: Avoid Surprise
SLIDE 3 NASA Columbia Accident Investigation
“Another lack of rigor cited by the panel is the widespread use of PowerPoint presentations in lieu of actual engineering data and analyses.”
Need for Rigor Transparency: Be Calibrated
SLIDE 4 How to Increase Rigor: Broadening Checks (“up” arrows)
[Figure labels: Down - Collect; Conflict & Corroboration; Hypothesis Exploration]
Elm, W., Potter, S., Tittle, J., Woods, D.D., Grossman, J., Patterson, E.S. (2005). Finding decision support requirements for effective intelligence analysis tools. Proceedings of the Human Factors and Ergonomics Society 49th Annual Meeting.
SLIDE 5
Rigor Metric
SLIDE 6 What We Did with the Rigor Metric
- Study 1: Plan to move troops and supplies
– 12 three-person teams
– Undergraduate students (security & intelligence specialization)
- Study 2: Causes and impacts of Ariane 501 accident
– 2 novice, 8 expert intelligence analysts
– 1 coder; entire session
- Cases: Low-Moderate-High for 8 attributes
– 1992, 2008: Separatist movements in Georgia
– 2009: Lebanese elections/pro-Western shift potential
– 1999-2009: Chavez manipulation of democracy to retain power
– 2009: Uyghur separatist movement/regional stability
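The Low-Moderate-High scale over 8 attributes can be held as a simple profile per team, analyst, or case. The sketch below is illustrative only. Six of the attribute names appear in these slides; "Sensitivity Analysis" and "Information Synthesis" are assumed from the published metric and should be checked against the scoring guide. The helper names and the two example profiles are hypothetical, not study data.

```python
LEVELS = ("Low", "Moderate", "High")

ATTRIBUTES = (
    "Hypothesis Exploration",
    "Information Search",
    "Information Validation",
    "Stance Analysis",
    "Sensitivity Analysis",      # assumed name, not shown in these slides
    "Specialist Collaboration",
    "Information Synthesis",     # assumed name, not shown in these slides
    "Explanation Critiquing",
)


def validate_profile(profile):
    """Raise ValueError unless all eight attributes carry a valid level."""
    if set(profile) != set(ATTRIBUTES):
        raise ValueError("profile must score exactly the eight attributes")
    for attribute, level in profile.items():
        if level not in LEVELS:
            raise ValueError(f"{attribute}: unknown level {level!r}")


def attributes_never_high(profiles):
    """Return the attributes on which no profile reached 'High'."""
    for profile in profiles:
        validate_profile(profile)
    return [a for a in ATTRIBUTES if all(p[a] != "High" for p in profiles)]


# Hypothetical profiles, invented for illustration (not study data).
team_x = dict.fromkeys(ATTRIBUTES, "Moderate")
team_x["Explanation Critiquing"] = "High"
team_y = dict.fromkeys(ATTRIBUTES, "Low")
team_y["Information Search"] = "High"

never_high = attributes_never_high([team_x, team_y])
```

An attribute that never reaches High across a whole study is one way a task or participant fidelity limit could show up in the scores.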
SLIDE 7 Study 1: 3-Person Team; Logistics Planning Task
Constraints:
- Fastest (<2.5 hours)
- Cheapest (least fuel)
- Secure:
– Avoid railway (enemy agents)
– Avoid route C (most attacks)
Best answer:
- Trucks with supplies on route A
- Armored vehicles for troops on route B
SLIDE 8
Solution Scoresheet
SLIDE 9
Team 3 Rigor Measures
SLIDE 10
Explanation Critiquing: Example of High and Low
High (Team 3): Extensive error checking throughout task
Low (Team 10): Supportive throughout: “I think that’s probably the best way to go.”
Low (Team 12): Blanket agreement with dominant, intimidating leader
SLIDE 11
Task Fidelity: No Specialist Collaboration
SLIDE 12
Task/Participant Fidelity: No “High” Scores on 3 Attributes
SLIDE 13
Perfect Task Score: Team 3
SLIDE 14
Perfect Task Score: Team 10
SLIDE 15
Perfect Task Score: Team 11
SLIDE 16
Low Performance (32%): Team 6
SLIDE 17
Low Performance (40%): Team 7
SLIDE 18 Study 2: Individual Analysts; Ariane 501 Accident
2-hour session (avg = 55 min)
Causes, impacts of Ariane 501 accident
- Participants: 10 NASIC analysts (avg = 13 yrs)
- Tools:
Search/browse features of Pathfinder
“On topic” database (~2000 documents)
Verbal (video-taped)
Think aloud, semi-structured interviews
Process tracing, briefing accuracy (κ = 0.84)
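The agreement value reported for briefing accuracy is presumably an inter-coder statistic. As a minimal sketch, assuming it is Cohen's kappa between two coders who labelled the same statements, the calculation looks like this. The function name, the label set, and the example codes are hypothetical and are not study data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labelling the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(coder_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes, invented for illustration only (not study data).
coder_a = ["accurate", "accurate", "inaccurate",
           "accurate", "inaccurate", "accurate"]
coder_b = ["accurate", "accurate", "inaccurate",
           "inaccurate", "inaccurate", "accurate"]
kappa = cohens_kappa(coder_a, coder_b)
```

Kappa corrects raw percent agreement for the agreement two coders would reach by chance, which matters when one label dominates.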
SLIDE 19 Embedded Risks for Inaccurate Statements
- Repeated inaccurate information
- Missed update that changed assessment
- Inapplicable assumption
The monetary loss can be recovered by the insurance...
[Figure: Cluster Satellite Program: program cancelled; rebuild 1; rebuild all 4 lost satellites (no insurance)]
SLIDE 20
Task Fidelity: No Specialist Collaboration or Explanation Critiquing
SLIDE 21
2 Novices vs. 8 Experts
SLIDE 22
Rely on Weak vs. ‘High Profit’ Documents
SLIDE 23 The Bottom Line: What We Learned
Study 1
- Reliable for 3-person team over session (κ = 0.92)
- Discovers task and participant fidelity issues
Study 2
- (Likely) Detects novice-expert differences
- New insights from old study data
Cases
- All attributes worked (all cases)
- Low vs. High easy; Moderate more variable
- Jargon (knowledge shields)
- Context-dependent risks:
– Linguistic barriers (information search)
– Polarized issues (information validation)
– Limited access (specialist collaboration)
– Deliberate deception (stance analysis)
SLIDE 24
Next Step: Guidance for When to Invest ‘More’