Student Learning Data & My Evaluation for Instructional Personnel
Spring 2012
Presenters & Questions
- Boyd Karns
- Jason Wysong
- Brandon McKelvey
- Submit questions today
- Submit questions online
- Call/email us!
  – Boyd: 5-0198
  – Jason: 5-0212
- Talk with your administrator
Today’s Focus
- Requirements of Senate Bill 736
- How Florida will measure student learning
  – Concept
  – Example
- SCPS Business Rules for 2011-12
- Plan for 2012-13 and beyond
Disclaimers
- SCPS did not create the value-added model
- FL value-added is different from other places
- More flexibility for 2011-12 than 2012-13
- Every district using different rules for 2011-12
Annual Evaluation Ratings
- Highly Effective
- Effective
- Needs Improvement/Developing*
  – Category I (first 3 years): Developing
  – Category II (4+ years): Needs Improvement
- Unsatisfactory
Underlying Philosophy
Teachers are the single most important variable in a student’s academic growth. Teachers who effectively implement research- based practices will create student learning growth.
State Process
- A method for measuring student learning growth on FCAT must be established by the Florida DOE, with measures on other assessments to follow
- DOE Student Growth Implementation Committee
– Recommend a formula for measuring learning growth
– Teachers, administrators, and university professors were appointed to this group
– The DOE also contracted with AIR (American Institutes for Research) to provide technical assistance
Key Points
- Growth, not proficiency
- Learning growth, not learning gain
State Committee Recommendations
- The Student Growth Implementation Committee recommended a covariate adjustment model
  – Covariates are variables representing student characteristics that influence learning
  – This model yields a VALUE-ADDED score
- This model establishes a personal learning growth
expectation for all students in the state
- If a student meets or exceeds their growth expectation, that student will positively impact their teacher's value-added score
Variables
- A value-added model measures the impact of a teacher on student learning by accounting for other factors that may impact the learning process
- The Student Growth Implementation Committee chose to include in the model the following student characteristics that may influence a student's expected growth:
  – Number of subject-relevant courses
  – Two prior years of achievement scores
  – Students with Disabilities (SWD) status
  – English Language Learner (ELL) status
  – Gifted status
  – Student attendance
  – Mobility (number of transitions)
  – Difference from modal age in grade (retention)
  – Class size (number of students)
  – Homogeneity of student's prior FCAT scores
Variables not in the model
- Gender
- Race/Ethnicity
- SES
School Component
- In addition to student-level scores, the model also calculates a 'school component'
  – The 'school component' is actually a grade-level, subject component
  – For example, all 5th grade reading teachers at a school will have the same 'school component' score
- The school component is combined with the teacher calculation in the value-added model
  – The Student Growth Implementation Committee chose the school component because they believed there were school-level and classroom-level factors that influence student learning
  – The committee thought that teachers should not be held completely responsible for student performance because some responsibility is held by the school as a whole
School Component
- Elementary: 4 school components
- Middle: 6 school components
- High: 2 school components, maybe 3
- School components can vary significantly by
grade level, subject, and from year to year
- No direct link to school grades
- No way to game the system
Implications
- The value-added model starts by comparing
students to others around the state.
- The teacher’s initial score is the average of these
student-level comparisons across the state.
- The teacher’s score is adjusted based on the
average performance of other students in the same grade level at the school.
- This model is designed to control for school
effects (leadership, climate, etc.)
Finding a Value-Added Score
- There are two major components of the value-added score:
  – Teacher score (how effective is the teacher)
  – School score (how effective is the school)
- The difference between the school and teacher score is called the
‘teacher effect’
– This is the difference between a teacher’s effectiveness and the effectiveness of other teachers in the same grade-level and subject
- To find the value-added score, half of the school score must be added back to the teacher effect
  – This is because the Student Growth Implementation Committee chose to use only half of the school component score
Simple Example
- Teacher score: 20
- School score: 10
- Unique teacher effect: 10
- Add ½ of school score back in: 15
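The arithmetic in this simple example can be sketched in a few lines (a hypothetical illustration of the calculation on the slide, not the DOE's actual implementation):

```python
# Sketch of the value-added arithmetic from the simple example above.
# Inputs are the slide's numbers; this is not the DOE's production code.
def value_added(teacher_score, school_score):
    teacher_effect = teacher_score - school_score  # unique teacher effect
    return teacher_effect + 0.5 * school_score     # add half the school score back

print(value_added(20, 10))  # 15.0
```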
Standardizing & Aggregating Scores
- Since the average FCAT growth rate is different at
each grade level, scores must be STANDARDIZED so that teachers of different grade levels can be compared.
– This is done by dividing each teacher's score by the average amount of growth at that grade level
– This 'smooths' out differences in growth across grade levels
- This accounts for grade-level differences in FCAT.
Standardizing & Aggregating Scores
- Since most teachers have students in multiple grade levels, value-added scores must be AGGREGATED so that each teacher receives only one overall score.
  – This is done through weighted averaging, so that the proportion of students in each grade level is incorporated into the overall score
- Standardization & aggregation allow for
comparison of all teachers in the model regardless of subject, grade level, etc.
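Standardization and aggregation as described on these slides can be sketched as follows (a hedged illustration; function names and the sample numbers are invented for the example):

```python
# Hypothetical sketch of standardization and weighted-average aggregation.
def standardize(vas, avg_grade_growth):
    # Divide the teacher's score by the average growth at that grade level.
    return vas / avg_grade_growth

def aggregate(groups):
    # groups: list of (standardized_score, n_students) pairs.
    # Weighted average, so each grade level counts in proportion to its students.
    total = sum(n for _, n in groups)
    return sum(score * n for score, n in groups) / total

# A teacher with 5 students in one grade (score 1.0) and 10 in another (score 2.0):
print(aggregate([(1.0, 5), (2.0, 10)]))  # ≈ 1.67
```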
Standard Errors
- All statistical measures contain a degree of error
- Value-added scores have a ‘standard error’
– The standard errors are calculated by the DOE in conjunction with AIR
- The standard error is a value that represents the
amount of uncertainty that we have in a particular value
– For our purposes, a higher standard error would suggest that we have less confidence in the score
Why Does the Standard Error Matter?
- If we did not use the standard error in placing
teachers in categories, we would be ignoring an important piece of information about the data
- No data are perfect, but we have methods for determining how likely it is that the data are close to the 'real' value
  – Using data without this adjustment is not appropriate
  – Example: No one would sample two people in a presidential poll without mentioning that the 'margin of error' would be 99%
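The idea of widening a score into an interval can be sketched like this (hypothetical numbers; the SE multiples match the charts that follow):

```python
# Sketch: confidence intervals at different multiples of the standard error.
vas, se = 0.08, 0.06  # hypothetical value-added score and standard error

for z, label in [(0.5, "38%"), (1.0, "68%"), (2.0, "95%")]:
    lo, hi = vas - z * se, vas + z * se
    print(f"{label} CI: [{lo:+.2f}, {hi:+.2f}]")
```

Notice that the more standard errors we adjust for, the wider the interval becomes, and the less certain we can be that a teacher differs from the mean.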
[Chart: Teacher value-added scores at School X in 7th grade; vertical axis from -30 to +30 with the state mean marked; Teachers A, B, and C plotted.]
The dots are teacher value-added scores; the lines extending from them are confidence intervals at 0.5 standard errors (SE). At 0.5 SE, Teacher B is lower and Teacher C is higher than the state mean. 0.5 SE = 38% confidence interval.
[Chart: The same teacher value-added scores at School X in 7th grade, now with confidence intervals at 1 SE.]
At 1 SE, only Teacher C would be considered significantly higher or lower than the state mean. 1 SE = 68% confidence interval.
[Chart: The same teacher value-added scores at School X in 7th grade, now with confidence intervals at 2 SE.]
At 2 SE, none of the three teachers would be significantly higher or lower than the state mean. 2 SE = 95% confidence interval.
Standard Error Implications
- When you account for the standard error, you are able
to have more certainty concerning which evaluation category is most appropriate for a teacher
– A 95% confidence interval is built by adjusting for approximately 2 standard errors
- However, the more adjustment that is made for the standard error, the wider the range of possible scores for each teacher
  – This means that most teachers will fall around the mean in a single category
Standard Error & Policy
Use of standard error makes it more difficult to clearly differentiate teacher performance level…but this is exactly what 736 requires.
VAM Procedures for 2011-12
SB 736 in 2011-12
- SB 736 requires the State Board of Education to establish business rules and set cut points for personnel evaluations
- State Board will not set rules until 2012-13
- DOE required districts to establish their own
rules for 2011-12
SCPS Process
- Teacher Evaluation Committee
- Teacher focus groups
- Administrator Evaluation Committee
- Dr. Vogel & District Administrators
SCPS Decision # 1
- Use only 2011-12 student data
  – This is year 1
  – No historical data
  – Reduces learning growth from 50% to 40%
  – Remaining 60% is based on supervisor evaluation
SCPS Decision # 2
- Use 2 standard errors with all value-added scores
- Adjusting for 2 standard errors greatly increases
the range of scores that influence a teacher’s placement
- This suggestion is supported by educational
research and prior ‘best practices’ using value- added modeling
SCPS Decision # 3
- Discard value-added scores for teachers with 8 or fewer students
- Very high standard error associated with these
teachers
- Fairness
MY EVALUATION
Value-Added & FCAT
- Groups automatically included
  – Math: Grades 4-8
  – Reading: Grades 4-10
  – No writing, science, or retakes
- Results calculated by FDOE
Teachers of Other Subjects
- DOE intends to link teachers of other subject
areas to the performance of their own students on FCAT Reading and FCAT Math.
- Results calculated by DOE
- Districts have not seen models for this product
Personnel with no value-added data
- Evaluated using school-wide averages of FCAT
Math and Reading scores.
– Personnel without scores
– Personnel with 8 or fewer students
– Non-classroom instructional personnel
- Results provided by DOE; SCPS matches to
employees
Why use school averages?
- All personnel in SCPS are responsible for
instruction in literacy and numeracy.
- Linking all personnel to FCAT reading and
math averages creates collective responsibility and mutual accountability for the learning of all students.
What will my score look like?
- The final value-added score will be a decimal number that is either positive or negative.
  – The decimal shows the amount of growth made by students.
  – Example: +0.18 = students grew 18% more than an average year's growth (very good!)
- In addition to the final value-added score, there will be a standard
error that will also be a decimal number that is either positive or negative.
- The value-added score and standard error score will be used
together to determine the employee’s rating and score on the student learning growth portion of the evaluation.
SCPS Cut Points for 2011-12
- Any employee whose entire confidence interval is greater than or equal to +0.10 (10% or more of a year's growth above average) will receive a rating of HIGHLY EFFECTIVE and a corresponding evaluation score of 4.00.
- Any employee whose confidence interval is not entirely less than or equal to -0.05 nor entirely greater than or equal to +0.10 is sufficiently close to the mean value-added score to suggest some evidence of effective instruction and student learning. In this case, the employee will receive a rating of EFFECTIVE and a corresponding evaluation score of 3.00.
- Any employee whose entire confidence interval is less than or equal to -0.05 (5% or more of a year's growth below average) but not less than or equal to -0.10 will receive a rating of NEEDS IMPROVEMENT (Category II personnel) or DEVELOPING (Category I personnel) and a corresponding evaluation score of 2.00.
- Any employee whose entire confidence interval is less than or equal to -0.10 (10% or more of a year's growth below average) will receive a rating of UNSATISFACTORY and a corresponding evaluation score of 1.00.
Highly Effective
Any employee whose entire confidence interval is greater than or equal to +0.10 (10% or more of a year's growth above average) will receive a rating of HIGHLY EFFECTIVE and a corresponding evaluation score of 4.00.
Unsatisfactory
Any employee whose entire confidence interval is less than or equal to -0.10 (10% or more of a year’s growth below average) will receive a rating of UNSATISFACTORY and a corresponding evaluation score of 1.00.
Needs Improvement
Any employee whose entire confidence interval is less than or equal to -0.05 (5% or more of a year's growth below average) but not less than or equal to -0.10 will receive a rating of NEEDS IMPROVEMENT (Category II personnel) or DEVELOPING (Category I personnel) and a corresponding evaluation score of 2.00.
Effective
Any employee whose confidence interval is not entirely less than or equal to -0.05 nor entirely greater than or equal to +0.10 is sufficiently close to the mean value-added score to suggest some evidence of effective instruction and student learning. In this case, the employee will receive a rating of EFFECTIVE and a corresponding evaluation score of 3.00.
Cut Scores Summarized
-0.10 ---- -0.05 ---- 0 ---- +0.05 ---- +0.10
U (at or below -0.10) | NI/D (-0.10 to -0.05) | E (near the mean) | HE (at or above +0.10)
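The cut-point rules above can be sketched as a decision function (a hedged illustration assuming a confidence interval of the value-added score ± 2 standard errors, per SCPS Decision #2; the function name is invented):

```python
# Sketch of the SCPS 2011-12 cut-point rules, assuming a confidence
# interval of vas ± z standard errors (SCPS chose z = 2).
def growth_rating(vas, se, z=2.0):
    lo, hi = vas - z * se, vas + z * se
    if lo >= 0.10:            # entire interval at or above +0.10
        return "Highly Effective", 4.0
    if hi <= -0.10:           # entire interval at or below -0.10
        return "Unsatisfactory", 1.0
    if hi <= -0.05:           # entire interval at or below -0.05, but not -0.10
        return "Needs Improvement/Developing", 2.0
    return "Effective", 3.0   # interval straddles the middle of the range

print(growth_rating(0.18, 0.03))  # ('Highly Effective', 4.0)
```

Note the order of checks: the Unsatisfactory test must come before the Needs Improvement test, since any interval entirely at or below -0.10 is also entirely at or below -0.05.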
Ratings & Scores
- Highly Effective: score of 4.0
- Effective: score of 3.0
- Needs Improvement/Developing: score of 2.0
- Unsatisfactory: score of 1.0
Final Evaluation Rating
- Instructional Practice Score (60%)
  – Calculated from supervisor's annual evaluation
  – Scale of 1.0 to 4.0
  – Will be available by last day of post-planning
- Student Learning Growth Score (40%)
  – Determined by value-added score & standard error
  – Whole number: 1, 2, 3, or 4
Final Evaluation Rating
- Highly Effective: 3.50-4.00
- Effective: 2.50-3.49
- Needs Improvement: 1.50-2.49 (years 4+)
- Developing: 1.50-2.49 (years 1-3)
- Unsatisfactory: 1.00-1.49
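The 60/40 weighting and rating bands can be sketched as follows (a hypothetical illustration; the function names are invented):

```python
# Sketch of the 60/40 final-evaluation weighting and rating bands.
def final_score(instructional_practice, learning_growth):
    # instructional_practice: 1.0-4.0 scale; learning_growth: whole number 1-4.
    return 0.6 * instructional_practice + 0.4 * learning_growth

def final_rating(score, years_experience):
    if score >= 3.50:
        return "Highly Effective"
    if score >= 2.50:
        return "Effective"
    if score >= 1.50:
        # Same band, different label depending on experience category.
        return "Developing" if years_experience <= 3 else "Needs Improvement"
    return "Unsatisfactory"

s = final_score(3.4, 3)       # 0.6 * 3.4 + 0.4 * 3 ≈ 3.24
print(final_rating(s, 5))     # Effective
```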
When will my evaluation rating be available?
- Value-added scores based on FCAT
- Therefore, value-added scores are computed after FCAT scores are
released and validated.
- Once DOE releases value-added scores to districts, SCPS will need
time to analyze and validate data.
- Scores may be available as late as Fall, 2012
- Final evaluation rating will be released at same time as teacher’s
value-added score.
RULES FOR 2012-13
State Board of Education decision
- Current plan is for FDOE to make a recommendation by August 1 regarding 2012-13 rules & cut points
- Important for teachers to stay informed on the rule-making process
The Art of Teaching: Third Annual Educators Conference
Seminole State College, July 17-18, 2012
Transition
- 5 minute break
- Stay if you want to see math examples
- If not staying, turn in learning log & questions!
Example
- Computation for two teachers
- For each example, the same teacher will be shown at two different schools:
  – One school is high growth (positive score)
  – One school is low growth (negative score)
Teacher A—Step 1 Determine teacher score
Teacher A has 5 students:

Student      Predicted   Actual   Residual
Student A    1510        1570     +60
Student B    1475        1520     +45
Student C    1430        1400     -30
Student D    1550        1530     -20
Student E    1500        1600     +100

Total residual: 155
Number of students: 5
Average residual (teacher score): 31

This calculation demonstrates how the 'teacher score' is formed: it is the average of student residuals.
Teacher A—Step 2 Determine school score
- High growth school: score of +14 points
  – On average, students at this school and grade level perform 14 DSS points better than predicted.
- Low growth school: score of -14 points
  – On average, students at this school and grade level perform 14 DSS points worse than predicted.
Teacher A—Step 3 Compute Teacher Effect
- High growth school
  – Teacher score: 31
  – School score: +14
  – Effect = 31 - (+14) = 17
  On average, this teacher contributes 17 more points of growth than others. This is called the unique teacher effect.
- Low growth school
  – Teacher score: 31
  – School score: -14
  – Effect = 31 - (-14) = 45
  On average, this teacher contributes 45 more points of growth than others. This is called the unique teacher effect.
Teacher A—Step 4 Compute Raw Value-Added Score
- High growth school
  – Teacher effect = 17
  – VAS = teacher effect + ½ of school score
  – VAS = 17 + ½(14) = 24
- Low growth school
  – Teacher effect = 45
  – VAS = teacher effect + ½ of school score
  – VAS = 45 + ½(-14) = 38
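The four steps for Teacher A can be reproduced in code (a sketch using the hypothetical student data from the slides):

```python
# Teacher A's four-step calculation, using the slide's hypothetical data.
predicted = [1510, 1475, 1430, 1550, 1500]
actual    = [1570, 1520, 1400, 1530, 1600]

# Step 1: teacher score = average of student residuals (actual - predicted).
residuals = [a - p for p, a in zip(predicted, actual)]
teacher_score = sum(residuals) / len(residuals)   # 31.0

# Steps 2-4 for both the high growth (+14) and low growth (-14) school.
for school_score in (14, -14):
    effect = teacher_score - school_score         # unique teacher effect
    vas = effect + 0.5 * school_score             # raw value-added score
    print(f"school {school_score:+}: effect {effect}, VAS {vas}")
```

Run against the slide's numbers, this reproduces the effects of 17 and 45 and the raw value-added scores of 24 and 38.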
Teacher B—Step 1 Determine teacher score
Teacher B has 5 students:

Student      Predicted   Actual   Residual
Student A    1570        1510     -60
Student B    1520        1475     -45
Student C    1400        1430     +30
Student D    1530        1550     +20
Student E    1600        1550     -50

Total residual: -105
Number of students: 5
Average residual (teacher score): -21

This calculation demonstrates how the 'teacher score' is formed: it is the average of student residuals.
Teacher B—Step 2 Determine school score
- High growth school: score of +14 points
  – On average, students at this school and grade level perform 14 DSS points better than predicted.
- Low growth school: score of -14 points
  – On average, students at this school and grade level perform 14 DSS points worse than predicted.
Teacher B—Step 3 Compute Teacher Effect
- High growth school
  – Teacher score: -21
  – School score: +14
  – Effect = -21 - (+14) = -35
  On average, this teacher contributes 35 fewer points of growth than others. This is called the unique teacher effect.
- Low growth school
  – Teacher score: -21
  – School score: -14
  – Effect = -21 - (-14) = -7
  On average, this teacher contributes 7 fewer points of growth than others. This is called the unique teacher effect.
Teacher B—Step 4 Compute Raw Value-Added Score
- High growth school
  – Teacher effect = -35
  – VAS = teacher effect + ½ of school score
  – VAS = -35 + ½(14) = -28
- Low growth school
  – Teacher effect = -7
  – VAS = teacher effect + ½ of school score
  – VAS = -7 + ½(-14) = -14
Standardization Example: MS Reading Teacher
7th Grade Reading:
- VAS: 38
- Average growth, Reading 7: 100
- Standardized: 38/100 = 0.38
- Number of students: 5
- Weighted: 0.38 × 5 = 1.9

8th Grade Reading:
- VAS: 10
- Average growth, Reading 8: 15
- Standardized: 10/15 = 0.67
- Number of students: 10
- Weighted: 0.67 × 10 = 6.7
Aggregation Example: MS Reading Teacher
- 7th Grade = 1.9
- 8th Grade = 6.7
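The slides stop before combining the two weighted values. Under the weighted-averaging rule described earlier, the overall score would be computed like this (a sketch using the slide's rounded figures):

```python
# Weighted-average aggregation of the MS reading example above
# (using the slide's rounded standardized scores).
weighted = [(0.38, 5), (0.67, 10)]   # (standardized score, n students) per grade

total_n = sum(n for _, n in weighted)                 # 15 students
overall = sum(s * n for s, n in weighted) / total_n   # (1.9 + 6.7) / 15

print(round(overall, 2))  # 0.57
```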