Comments on Assessment Consortia Test Security Presentations
Steve Ferrara
Presented in J. Steedle (Organizer), Test Security for Common Core Consortia Assessments, a session at the National Conference on Student Assessment, June
Frameworks to guide my comments
- PDIR
- Threats to security
Comments on each paper
Concluding comments
Overview
Test Security for Common Core Assessment Consortia
Prevention, Detection, Investigation, Resolution (PDIR)
Progress on Prevention and Detection
Investigation
- Making progress on investigations (mostly from media reports)
- May be too much conflict of interest in local investigations
Resolution
- Much of the evidence is not accessible
- Evidence we do have is…
Framework for comprehensive test security systems
| | Before Test Administration | During Test Administration | After Test Administration |
|---|---|---|---|
| Examinees | Acquiring test content | Copying or supplying answers, impersonating | Divulging content, failing to report violations |
| Test Administrators | Divulging, teaching, inappropriate test prep | Providing answers, allowing inappropriate conditions, inadequate protection | Changing answers, tampering with answer files, failing to report |
| Other Testing Site Staff | Failing to train, failing to monitor security | Failing to monitor administrations | Failing to report, tampering with answers and files |
| Program Managers and Operations Vendors | Failing to publicize expectations, failing to train | Failing to observe administrations and protection | Failing to report, failing to account for materials |
Threats to test security
From Ferrara, 2016, Table 1; adapted from Fremer & Ferrara, 2013, Table 2.1
Some threats require human vigilance for prevention, detection, and investigation
Statistical plus non-statistical approaches; addresses several threats
Web crawling to detect sharing of content
- PARCC states follow internal breach procedures
- Is this adequate to protect current content and discourage future breaches via rigorous resolution procedures?
Answer changing detected via a points-gained approach
- Logically similar to wrong-to-right (WTR) answer-changing methods
- Now, an automated system to detect changes for constructed response items!
Jeff Steedle on PARCC forensics
Plagiarism detection for prose constructed responses via latent semantic analysis (LSA)
- Great to see PARCC exploring this approach
- Found very few flagged pairs; is LSA adequately sensitive and specific?
- Relatively high numbers of flags for the Narrative Writing Task in grades 5 (37) and 7 (127)
Hypothesis: retelling from a character's perspective. Plausible? Conditional on PCR score or theta?
If we think of plagiarism within schools, the number of pairings may be manageable
PARCC forensics (cont.)
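The LSA flagging idea can be sketched as follows. Everything here is an illustrative assumption, not PARCC's operational system: the toy responses, the two-dimensional latent space, and the 0.95 similarity cutoff are invented for demonstration.

```python
# Minimal latent semantic analysis (LSA) sketch for flagging suspiciously
# similar constructed responses. Toy data and thresholds only.
import numpy as np
from collections import Counter

responses = [
    "the hero tells the story from her own point of view",
    "the hero tells the story from her own point of view",  # near copy
    "photosynthesis converts sunlight into chemical energy",
    "plants use sunlight to make food through photosynthesis",
]

# Build a term-document count matrix
vocab = sorted({w for r in responses for w in r.split()})
X = np.array([[Counter(r.split())[w] for w in vocab] for r in responses], float)

# Reduce to a k-dimensional latent space via SVD (the "semantic" step)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
docs = U[:, :k] * s[:k]  # document coordinates in the latent space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Flag response pairs whose latent-space similarity exceeds a threshold
THRESHOLD = 0.95  # assumption; operational cutoffs need validation
flagged = [(i, j)
           for i in range(len(responses))
           for j in range(i + 1, len(responses))
           if cosine(docs[i], docs[j]) > THRESHOLD]
print(flagged)  # the verbatim pair (0, 1) is flagged
```

An operational system would also have to decide which pairs to compare at all (e.g., within schools, as suggested above), since pairwise comparison across a state is quadratic in the number of examinees.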
Aberrant response patterns
Modified caution index (MCI)
- Are the findings consistent with the literature?
Standardized log-likelihood person-fit index
- More sensitive in simulated data
- Specificity/false negatives? What is its performance conditional on theta or theta ranges?
PARCC forensics (cont.)
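The standardized log-likelihood person-fit index mentioned above (lz; Drasgow, Levine, & Williams, 1985) can be sketched under a Rasch model. The item difficulties, theta value, and response patterns below are invented for illustration.

```python
# Sketch of the standardized log-likelihood person-fit statistic (lz)
# under a Rasch model. Toy difficulties and response vectors only.
import math

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def lz(responses, theta, difficulties):
    """Standardized log-likelihood of a response pattern; large negative
    values indicate aberrant (misfitting) patterns."""
    l0 = exp_l = var_l = 0.0
    for u, b in zip(responses, difficulties):
        p = rasch_p(theta, b)
        q = 1.0 - p
        l0 += u * math.log(p) + (1 - u) * math.log(q)       # observed log-likelihood
        exp_l += p * math.log(p) + q * math.log(q)          # its expectation
        var_l += p * q * math.log(p / q) ** 2               # its variance
    return (l0 - exp_l) / math.sqrt(var_l)

difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
normal = [1, 1, 1, 0, 0]    # consistent with theta = 0
aberrant = [0, 0, 0, 1, 1]  # misses easy items, gets hard items right
print(lz(normal, 0.0, difficulties), lz(aberrant, 0.0, difficulties))
```

This toy version also shows why performance conditional on theta matters: both the expectation and variance terms depend on where the examinee sits relative to the item difficulties.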
Longitudinal performance modeling
- Detects unusual performance changes via cumulative logit regression
Clark et al. found it detects "test misconduct" with good power and conservative false positive flagging
- Two consecutive years: could find downward spikes as well as expected upward spikes
- Have to determine practical significance as well as statistical significance
Not yet used operationally
- Looking forward to hearing results from the 2015-2016 investigation of cumulative logit regression
PARCC forensics (cont.)
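A minimal sketch of the cumulative logit (proportional odds) idea: flag year-to-year performance-level transitions the model deems implausible. The cutpoints, slope, and flagging cutoff below are invented; an operational model would be fit to statewide longitudinal data.

```python
# Toy sketch of flagging unusual year-to-year performance changes with a
# cumulative logit (proportional odds) model. All parameters are invented.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Assumed fitted model: P(level <= k | prior) = sigmoid(alpha[k] - beta * prior)
ALPHAS = [-1.0, 1.0, 3.0]  # cutpoints for 4 ordered performance levels
BETA = 1.5                 # effect of prior-year (standardized) score

def level_probs(prior_score):
    """Probabilities of this year's 4 performance levels given last year's score."""
    cum = [sigmoid(a - BETA * prior_score) for a in ALPHAS] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, 4)]

def flag(prior_score, observed_level, cutoff=0.05):
    """Flag a transition whose modeled probability falls below the cutoff."""
    return level_probs(prior_score)[observed_level] < cutoff

print(flag(-1.5, 3))  # True: a jump from a low score to the top level is implausible
print(flag(0.0, 1))   # False: an ordinary transition
```

Note the symmetry mentioned above: because the model assigns a probability to every level, the same check flags implausible downward spikes as readily as upward ones.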
Will continue response change analyses: good
- Considering rules for flagging score increases: good
Plagiarism/copying on constructed responses is a tough detection problem
- Approach: states will request a focus on worrisome schools ("known unknowns")
- What about schools that are new to cheating? ("Unknown unknowns")
Add answer copying/plagiarism detection for short responses, where copying answers or dictating responses is easier
PARCC lessons learned
"Security goals should benefit students"
- Of course
- Test security systems also should serve our responsibilities to the public and to our state and federal sponsors: data integrity (Ferrara, 2012; USDE, 2013)
Open source secure browsers, and a roadmap to a common industry solution
- A great way to pursue our responsibilities to federal sponsors
Brandt Redd on Smarter Balanced views on test security
Security and CAT
- 20:1 ratio of available to presented items; suspend exposed items
- Would like to see that ratio within the standards × scale locations matrix ("redundant items")
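The 20:1 ratio and item-suspension policy can be sketched as a simple exposure check. The 5% cap (the inverse of the 20:1 ratio) and the counts below are illustrative assumptions, not Smarter Balanced's actual procedure.

```python
# Toy sketch of monitoring CAT item exposure. The cap mirrors the 20:1
# available-to-presented ratio noted above; the counts are invented.
EXPOSURE_CAP = 1 / 20  # suspend items seen by more than 5% of examinees

def items_to_suspend(times_seen, n_examinees, cap=EXPOSURE_CAP):
    """Return items whose observed exposure rate exceeds the cap."""
    return sorted(item for item, n in times_seen.items()
                  if n / n_examinees > cap)

seen = {"i01": 6, "i02": 3, "i03": 1}  # administrations per item
print(items_to_suspend(seen, n_examinees=100))  # ['i01']
```

Tracking the same rates within each standards × scale-location cell, as suggested above, would only require keeping a separate `times_seen` table per cell.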
Efforts to protect security and enable assistive technology are laudable
Test administration policies, training, monitoring, and statistical forensics are laudable
What about human vigilance to protect against and detect other security threats?
Smarter Balanced views (cont.)
Aberrant response patterns
Project W from R, L, S
- What were the conditions for the 3% significant differences? (e.g., theta = -2)
- Same question for the second illustration
- Rudner, Bracey, & Skaggs asked that question 20 years ago
Glad to hear Mark say the bigger risk may be with constructed response items; our field is not doing enough here yet
Mark Hansen on Smarter Balanced forensics for CAT
Glad to see this work going on in the consortia
- Next year: why not invite WIDA, ELPA21, NCSC, and DLM?
Would like to see a big effort on forensics for security threats to constructed response items
I'll summarize the "empty cells" in the threats × detection methods matrix
- A source for considering the next method studies
In closing
Steve Ferrara sferrara1951@gmail.com
Thanks!
Ferrara, S. (2012, February 28). Investigation and response: Experiences and reflections of a former state assessment director. Invited panel presentation and discussion in Testing Integrity Symposium, sponsored by the National Center for Education Statistics, Washington, DC.
Ferrara, S. (2016). A framework for policies and practices to improve test security systems: Prevention, detection, investigation, and resolution (PDIR). Manuscript in preparation.
Fremer, J. J., & Ferrara, S. (2013). Security in large scale, paper and pencil testing. In J. A. Wollack & J. J. Fremer (Eds.), Handbook of test security (pp. 17-37). New York: Routledge.
US Department of Education. (2013). Testing integrity symposium: Issues and recommendations for best practice. http://nces.ed.gov/pubs2013/2013454.pdf
References
| | Before Test Administration | During Test Administration | After Test Administration |
|---|---|---|---|
| Examinees | Acquiring test content | Copying or supplying answers, impersonating | Divulging content, failing to report violations |
| Test Administrators | Divulging, teaching, inappropriate test prep | Providing answers, allowing inappropriate conditions, inadequate protection | Changing answers, tampering with answer files, failing to report |
| Other Testing Site Staff | Failing to train, failing to monitor security | Failing to monitor administrations | Failing to report, tampering with answers and files |
| Program Managers and Operations Vendors | Failing to publicize expectations, failing to train | Failing to observe administrations and protection | Failing to report, failing to account for materials |
PARCC responses to test security threats
Detection methods mapped onto the threats matrix above: web crawling, performance modeling, aberrant responses, plagiarism, answer changing