SLIDE 2
SLIDE 3 Testing: Big Questions
- How do teachers construct tests?
- How are classroom tests like/unlike standardized tests?
- What can we learn from test results?
SLIDE 4
- 7.1 Instructional Objectives
- 7.2 Teacher-Developed Tests in the Classroom
- 7.3 Formative Evaluation
- 7.4 Classroom Grading Approaches
SLIDE 5
- 7.5 Criterion-Referenced Testing
- 7.6 Norm-Referenced Testing
- 7.7 Interpreting Norm-Referenced Test Scores
- 7.8 Validity
SLIDE 6
- 7.9 Reliability
- 7.10 Test Bias
- 7.11 Using Tests Appropriately
- 7.12 Summary
SLIDE 7
7.1 Instructional Objectives
SLIDE 8
SLIDE 9 Objectives: Checklist for learning
- More specific than goals
- What students should know or be able to do by end of lesson ➔ descriptive verbs!
- Taxonomies: hierarchies of increasing sophistication
- Bloom: cognitive, affective, psychomotor
SLIDE 10 Bloom’s taxonomies
- Cognitive most used
- 6 levels: remember, understand, apply, analyze, evaluate, create
- Objective: “Students will compare and contrast yurts and tipis in 3 key features.”
- Note task, level (analyze), criteria
➔ “Mastery learning” system
SLIDE 11
7.2 Teacher-Developed Tests in the Classroom
SLIDE 12
SLIDE 13 Classroom assessment
Backward planning as a “best practice”:
- 1. Write objective with taxonomy-level verb and criteria for mastery
- 2. Create assessment/test that fits objective
- 3. Plan learning activities that support and prepare students for mastery
SLIDE 14 Classroom tests
- Essay: for comprehension, analysis; needs criteria
- Multiple choice, matching: for recognition
- True/false, fill-in-the-blank: for recall
- Problem-solving: for application/analysis
➔ Consider pros/cons and which kinds of students benefit
SLIDE 15
SLIDE 16 Performance-based or authentic assessment 1
- Portfolio: collection of work showing progress
- Exhibition, e.g. posters
- Demonstration, e.g. slide shows, videos
➔ Scored with rubric-based assessment
SLIDE 17
Authentic assessment 2
Rubric with criteria for scoring (posted for all to see):

              10 points   5 points
  Sources     Over 5      Under 5
  Facts       Over 10     Under 10
  Format      Correct     Errors
  Graphics    Over 5      Under 5

➔ scoring sketch below
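The rubric above reduces to simple threshold arithmetic. A minimal Python sketch of how such a rubric could be scored; the function name and input format are illustrative, and the 10-vs-5-point rule per criterion is taken from the table:

    # Rubric scorer matching the table above: each criterion earns
    # 10 points if its threshold is met, otherwise 5 points.
    def score_rubric(sources, facts, format_correct, graphics):
        score = 10 if sources > 5 else 5        # Sources: over 5 vs. under 5
        score += 10 if facts > 10 else 5        # Facts: over 10 vs. under 10
        score += 10 if format_correct else 5    # Format: correct vs. errors
        score += 10 if graphics > 5 else 5      # Graphics: over 5 vs. under 5
        return score                            # total: 20-40 points

    print(score_rubric(sources=6, facts=12, format_correct=True, graphics=4))  # 35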
SLIDE 18
7.3 Formative Evaluation
SLIDE 19
SLIDE 20 Formative assessments 1
- Assess needs before instruction (aka “pretest”)
- Reveals prior knowledge of topic or skill
- Guides planning of instruction in skill or topic
SLIDE 21 Formative assessments 2
- Monitor progress during learning cycle
- Spot errors for re-teaching
- Give feedback and suggestions
- Check readiness for final (summative) assessment (aka “posttest”)
SLIDE 22
7.4 Classroom Grading Approaches
SLIDE 23
SLIDE 24 Assigning grades 1
When a student gets a grade for work, what does he or she think it means?
- This is what I am worth
- This is how I compare with classmates
- This is what the teacher thinks of me
- This is how well I learned
SLIDE 25 Assigning grades 2
- Letter grades: A, B, C, D, F
- Absolute: 10 points per letter
- Curve (relative): comparative scaling (force bell curve?)
- Descriptive (short or long)
- Performance rating (with rubric/criteria)
- Mastery: pass once criterion is met (number of attempts not important)
➔ grading sketch below
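A minimal sketch contrasting the two numeric approaches above: absolute scaling (10 points per letter) versus a curve (grading relative to the class distribution). The cutoffs, the z-score curving rule, and the sample scores are illustrative assumptions:

    import statistics

    def absolute_grade(pct):
        """Absolute scale: 10 points per letter (90+ = A, 80+ = B, ...)."""
        for cutoff, letter in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
            if pct >= cutoff:
                return letter
        return "F"

    def curved_grade(score, class_scores):
        """Relative (curved) scale: grade by distance from the class average."""
        z = (score - statistics.mean(class_scores)) / statistics.stdev(class_scores)
        for cutoff, letter in [(1.5, "A"), (0.5, "B"), (-0.5, "C"), (-1.5, "D")]:
            if z >= cutoff:
                return letter
        return "F"

    scores = [55, 62, 68, 71, 75, 78, 84, 90]
    print(absolute_grade(75), curved_grade(75, scores))  # C C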
SLIDE 26
7.5 Criterion-Referenced Testing
SLIDE 27
SLIDE 28 Criterion referencing 1
- Measures mastery of specific skills/objectives
- Good for topics that can be broken into small objectives
- Good for topics that have a hierarchy of skills (e.g. math)
- Must master skill A before you can understand and master skill B
SLIDE 29 Criterion referencing 2
- Set performance criteria to prove mastery for each skill (e.g. 80% correct answers)
- Untimed (no time constraints?) ➔ move to next level at own pace
➔ mastery-check sketch below
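A tiny sketch of the mastery rule above; the 80% criterion comes from the slide, while the skill hierarchy and function name are illustrative:

    MASTERY_CRITERION = 0.80  # e.g. 80% correct answers

    def next_skill(skills, results):
        """Return the first skill in the hierarchy not yet mastered.
        skills: ordered easiest-to-hardest; results: skill -> fraction correct."""
        for skill in skills:
            if results.get(skill, 0.0) < MASTERY_CRITERION:
                return skill   # re-teach/re-test before moving on
        return None            # all skills mastered

    print(next_skill(["add", "subtract", "multiply"],
                     {"add": 0.90, "subtract": 0.70}))  # subtract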
SLIDE 30
7.6 Norm-Referenced Testing
SLIDE 31
SLIDE 32 Norm referencing
- “Standardized”
- Comparative with other students
- Achievement tests (what has been learned, e.g. state/graduation test)
- Aptitude tests (predict future success, e.g. IQ, SAT, GRE)
SLIDE 33
7.7 Interpreting Norm-Referenced Test Scores
SLIDE 34
SLIDE 35 Analyzing test results (1)
- Gives a relative (comparative) score
- Norms based on large samples of test-takers
- Norming = fitted onto normal distribution (bell curve)
- Mean (average; skewed by extremes), median (middle #), and mode (most frequent) are the same
➔ quick check below
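A quick check of the claim that mean, median, and mode coincide in a symmetric (normal-shaped) distribution; the score list is illustrative:

    from statistics import mean, median, mode

    scores = [70, 75, 80, 80, 85, 90]                  # symmetric around 80
    print(mean(scores), median(scores), mode(scores))  # 80 80.0 80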
SLIDE 36 Analyzing test results (2)
Statistical descriptors
- Areas of distribution marked by standard deviations = deviations from average
- Example: IQ tests 100 = avg.; 34% of scores fall within 1 SD on either side of average
- Scores described by # of deviations +/- from average (z-scores)
- Stanines: #5 in center; 1-4 below, 6-9 above
➔ sketch below
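A minimal sketch of z-scores and stanines for an IQ-style scale. The mean of 100 is from the slide; the standard deviation of 15 is the usual IQ convention and an assumption here, as are the function names:

    def z_score(score, mean=100.0, sd=15.0):
        """Number of standard deviations above (+) or below (-) the average."""
        return (score - mean) / sd

    def stanine(z):
        """Stanines: nine half-SD-wide bands, with band 5 centered on the mean."""
        return max(1, min(9, round(z / 0.5) + 5))

    z = z_score(115)         # one SD above average
    print(z, stanine(z))     # 1.0 7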
SLIDE 37 Analyzing test results (3)
More statistical descriptors
- Percentiles = % of students performing same or below
- Example: 80th percentile = performs better than 80% of others
- Grade-level equivalents = score typical of a given grade and month
- Example: 3.4 = 3rd grade, 4th month
➔ percentile sketch below
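A sketch converting a score to a percentile by way of the normal curve, using the standard library's NormalDist; mean 100 / SD 15 are the same illustrative IQ-scale assumptions as above:

    from statistics import NormalDist

    def percentile(score, mean=100.0, sd=15.0):
        """Percent of the norming sample scoring at or below this score."""
        return round(NormalDist(mean, sd).cdf(score) * 100)

    print(percentile(113))   # ~81: at or above ~80% of the norm group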
SLIDE 38
7.8 Validity
SLIDE 39
SLIDE 40 How is a test valid?
- Validity = accuracy: test measures what it claims to measure
- Content: items match objectives and what was in curriculum
- Face: appropriate format
- Criterion-related: predicts performance
- Construct: matches other tests of same construct
SLIDE 41
7.9 Reliability
SLIDE 42
SLIDE 43 How is a test reliable?
- Reliability = consistency
- Test-retest (same test twice)
- Alternate/parallel forms (two versions)
- Split-half = odds vs. evens
- Kuder-Richardson = internal consistency from 1 test administration
➔ split-half sketch below
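A minimal sketch of the split-half method named above: correlate students' scores on the odd items with their scores on the even items, then apply the Spearman-Brown correction for full test length. The item matrix is illustrative; statistics.correlation requires Python 3.10+:

    from statistics import correlation   # Python 3.10+

    def split_half_reliability(item_matrix):
        """item_matrix: one row per student, one 0/1 entry per item."""
        odd = [sum(row[0::2]) for row in item_matrix]    # odd-item subtotals
        even = [sum(row[1::2]) for row in item_matrix]   # even-item subtotals
        r_half = correlation(odd, even)
        return 2 * r_half / (1 + r_half)   # Spearman-Brown correction

    answers = [                 # each row: one student's right/wrong pattern
        [1, 1, 1, 0, 1, 1],
        [1, 0, 1, 1, 0, 1],
        [0, 1, 0, 0, 1, 0],
        [1, 1, 1, 1, 1, 1],
        [0, 0, 1, 0, 0, 0],
    ]
    print(round(split_half_reliability(answers), 2))   # 0.93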
SLIDE 44 Reliability coefficients
- Perfect = 1.0, but .80 OK
- 0 = no correlation
- Negative value = as one factor goes up, other goes down
SLIDE 45
7.10 Test Bias
SLIDE 46
SLIDE 47 Can a test be biased?
- If content or format favors one SES, race, culture, gender, or learning style
- Shows up in form/content of test question or answer
- Partial solution: test in students’ native language
- Not bias: males vary more than females in achievement scores
SLIDE 48
7.11 Using Tests Appropriately
SLIDE 49
SLIDE 50 Testing: Use wisely
- Check validity and standard error of estimate (score +/-)
- Standard error of measurement (confidence interval) caused by degree of unreliability
- Consider how scores and results will be used
➔ SEM sketch below
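A small sketch of the standard error of measurement mentioned above: SEM = SD × √(1 − reliability), so a less reliable test yields a wider band around the observed score. Scale values are illustrative:

    import math

    def sem(sd, reliability):
        """Standard error of measurement: grows as reliability falls."""
        return sd * math.sqrt(1 - reliability)

    def score_band(score, sd, reliability):
        """Rough 68% confidence band: true score within +/- 1 SEM."""
        e = sem(sd, reliability)
        return (score - e, score + e)

    # IQ-style scale (SD 15) with reliability .90 -> SEM about 4.7,
    # so an observed 110 is best read as roughly 105-115.
    print(score_band(110, sd=15, reliability=0.90))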
SLIDE 51
7.12 Summary
SLIDE 52
SLIDE 53 Testing the test
- What are you trying to find out, and at what point in the learning cycle?
- Do you want to measure achievement or compare students?
- Does the test measure what it should, consistently and without bias to any learner?