VTQ Research: Getting Technical - PowerPoint PPT Presentation




slide-1
SLIDE 1

VTQ Research: Getting Technical

Paul E. Newton, AO Forum, London, 14 May 2019

slide-2
SLIDE 2

Technical research into GQ-like tests and exams be like…

■ Using Hierarchical Logistic Regression to Study DIF and DIF Variance in Multilevel Data
■ A Comparison of Strategies for Smoothing Parameter Selection for Mixed-Format Tests Under the Random Groups Design
■ Calculating Conditional Reliability for Dynamic Measurement Model Capacity Estimates

slide-3
SLIDE 3

Technical research into VTQ-like assessments be like…

Credit: Bill Applin

slide-4
SLIDE 4

The assumption [underpinning NVQs] has always been that assessment will be unproblematic because it simply involves comparing behaviour with the transparent ‘benchmark’ of the performance criteria. (Wolf, 1995, p.24)

But isn’t VTQ assessment inherently non-technical?

What I am proposing is that we should just forget reliability altogether, and concentrate on validity, which is ultimately all that matters. (Jessup, 1991, p.191)

slide-5
SLIDE 5

The assumption [underpinning NVQs] has always been that assessment will be unproblematic because it simply involves comparing behaviour with the transparent ‘benchmark’ of the performance criteria. Unfortunately, in practice this turns out not to be the case. (Wolf, 1995, p.24)

But isn’t VTQ-like assessment inherently non-technical?

■ … even NVQ-like Competence-Based Qualifications are far from unproblematic, and require a technical eye;
■ … and the more ‘GQ-like’ a Competence-Based Qualification becomes, the more of a technical eye it’s likely to require.

slide-6
SLIDE 6

Becoming ‘GQ-like’?

■ “Professor Wolf’s report is very clear that assessment methods for many vocational qualifications need to be strengthened […] Therefore, only those qualifications that provide evidence of substantial amount of external assessment, together with synoptic assessment […] will be counted in the tables.”
■ “So we will only include those qualifications that are graded – as opposed to a pass/fail – in the tables in the future. Qualifications may have a pass, merit, distinction structure or a more detailed scale.”

slide-7
SLIDE 7

https://www.gov.uk/government/publications/grading-vocational-and-technical-qualifications

slide-8
SLIDE 8

Level 1 NVQ Certificate in Hospitality Services

… comparing behaviour with the transparent ‘benchmark’ of the performance criteria?

slide-9
SLIDE 9

Level 2 & 3 Diploma in Skills for Business

Level 2 (unit 10)

LO1 Understand the reasons for change in business
AC1.1 State why it is important for a business to change
AC1.2 State the risks associated with a business changing too quickly
AC1.3 State the risks associated with a business changing too slowly

Level 3 (unit 10)

LO1 Understand change in business
AC1.1 Explain why it is important for a business to change
AC1.2 Analyse the positive and negative effects of change on a selected
AC1.3 Compare the risks of slow against rapid change within a business
AC1.4 Compare the benefits of slow against rapid change within a business

slide-10
SLIDE 10

Unit 10

L2 AC1.1
Pass: The candidate will state why it is important for a business to change
Merit: The candidate will state why it is important for a business to change, demonstrating critical understanding
Distinction: [No D for this AC]

L3 AC1.1
Pass: The candidate will explain why it is important for a business to change
Merit: The candidate will explain in detail why it is important for a business to change
Distinction: The candidate will give a sophisticated explanation of why it is important for a business to change

slide-11
SLIDE 11

All sorts of technical questions related to grading

■ Standardisation
■ Comparability
■ Grading and levelling
■ Weighting
■ Burden and backwash
■ Transparency

slide-12
SLIDE 12

We know how hard challenges like standardisation/comparability can be

■ Because we’ve done a lot of technical research into issues like these, for GQ-like qualifications.

slide-13
SLIDE 13

Grade inflation at A level?

[Chart: Cumulative percentage of A level students awarded grade E (or higher), for All Boards, Summer Awards, All Modes, by Syllabus Group. Vertical axis: 60 to 100. Horizontal axis: 1990 to 2009. Series: Biology, Latin, Physics, Sociology.]

slide-14
SLIDE 14

Grade inflation at A level?

[Chart: Cumulative % candidates awarded A at A level. JCQ (June) Data, All UK Candidates, All Subjects. Vertical axis: 20 to 30. Horizontal axis: 2002 to 2009.]

slide-15
SLIDE 15
slide-16
SLIDE 16

No longer any evidence of grade inflation at A level

[Chart: Cumulative % candidates awarded A at A level. JCQ (June) Data, All UK Candidates, All Subjects. Vertical axis: 20 to 30. Horizontal axis: 2002 to 2018.]

slide-17
SLIDE 17
slide-18
SLIDE 18

https://www.gov.uk/government/publications/strengthening-vocational-and-technical-qualifications

Ben Cuff, Nadir Zanini, Beth Black

slide-19
SLIDE 19

A Level BTEC (Subsidiary Diploma)

slide-20
SLIDE 20
slide-21
SLIDE 21
slide-22
SLIDE 22
slide-23
SLIDE 23

We know how hard challenges like standardisation/comparability can be

■ But is it particularly challenging to maintain a firm grip on standards within Competence-Based Qualifications, like ‘older style’ BTECs?

slide-24
SLIDE 24

Becoming ‘GQ-like’?

■ “Professor Wolf’s report is very clear that assessment methods for many vocational qualifications need to be strengthened […] This helps to ensure that vocational qualifications offer a comparable level of challenge to academic qualifications and are seen to do so. External assessment also provides an additional check that standards are consistent across centres.”

slide-25
SLIDE 25

We know what a good GQ test looks like

■ But what does a good VTQ test look like?

slide-26
SLIDE 26

What makes a good test?

1. Questions (and test overall) should:

a) be of an appropriate level of difficulty
b) differentiate between learners, but (only) on the basis of their proficiency

2. Test overall should:

a) deliver reliable results
b) embody standards that are comparable with comparable tests

slide-27
SLIDE 27

https://www.gov.uk/government/publications/assessment-functioning-of-external-assessments

Beth Black, Qingping He, Stephen Holmes, Caroline Morin

slide-28
SLIDE 28

VTQs on the ‘approved list’ for DfE’s 16-19 performance tables

2016 series

■ 49 tests (27 qualifications)
■ mainly L1 and L2
■ health & social care, carpentry, hospitality, digital media, applied science, mathematics

2017 series

■ 20 tests (15 qualifications)
■ L3 only (Applied Generals/Tech Levels)
■ applied science, business, digital media, engineering, health & social care, IT/computing, sport

slide-29
SLIDE 29

Item functioning statistics – computed separately for each question

■ Facility indices

□ how easy/hard is the question?

■ Discrimination indices

□ do candidates who tend to perform poorly/well on the question also tend to perform poorly/well on the test overall?
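
The two item statistics on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the research described: the slides do not say how the discrimination index was calculated, so the sketch assumes a corrected item-total correlation (marks on the question against marks on the rest of the test). The facility index is taken as the mean mark divided by the marks available. The marks table and all function names are hypothetical.

```python
from statistics import mean, pstdev


def facility_index(item_marks, max_mark):
    """Mean mark on the question as a proportion of the marks available.

    Values near 1 indicate an easy question, values near 0 a hard one.
    """
    return mean(item_marks) / max_mark


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable has no spread."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (sx * sy)


def discrimination_index(marks, item):
    """Correlation between marks on one question and marks on the rest of the test.

    Assumption: a corrected item-total correlation; other indices exist.
    """
    item_marks = [row[item] for row in marks]
    rest_marks = [sum(row) - row[item] for row in marks]
    return pearson(item_marks, rest_marks)


# Hypothetical data: 5 candidates (rows) x 3 questions (columns).
marks = [
    [1, 2, 0],
    [1, 3, 1],
    [0, 1, 0],
    [1, 4, 2],
    [0, 0, 1],
]
max_marks = [1, 4, 2]  # marks available for each question

for q, max_mark in enumerate(max_marks):
    column = [row[q] for row in marks]
    print(
        f"Q{q + 1}: facility={facility_index(column, max_mark):.2f}, "
        f"discrimination={discrimination_index(marks, q):.2f}"
    )
```

Correlating each question with the rest of the test, rather than with the total including that question, avoids inflating the index for questions that carry many marks.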

slide-30
SLIDE 30

Test functioning statistics – computed for the test overall

■ Mean of marks

□ how easy/hard is the test?

■ Standard deviation of marks

□ how widely spread are the marks (across the mark range)?

■ Reliability coefficient

□ an estimate of the degree to which results are likely to be replicable
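
The three test-level statistics on this slide can be sketched as follows. This is a hedged illustration rather than the analysis actually run: the slides do not name the reliability coefficient, so the sketch assumes Cronbach's alpha, and it uses the population standard deviation. The marks table and function names are hypothetical.

```python
from statistics import mean, pstdev, pvariance


def cronbach_alpha(marks):
    """Cronbach's alpha for a candidates x questions table of marks.

    Assumption: alpha stands in for the unnamed reliability coefficient.
    """
    n_items = len(marks[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two questions")
    item_variances = [
        pvariance([row[i] for row in marks]) for i in range(n_items)
    ]
    total_variance = pvariance([sum(row) for row in marks])
    if total_variance == 0:
        raise ValueError("all candidates have the same total mark")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


def summarise_test(marks):
    """Mean, standard deviation and alpha for the test overall."""
    totals = [sum(row) for row in marks]
    return {
        "mean": mean(totals),
        "sd": pstdev(totals),  # population SD; statistics.stdev gives the sample SD
        "alpha": cronbach_alpha(marks),
    }


# Hypothetical data: 5 candidates (rows) x 3 questions (columns).
marks = [
    [1, 2, 0],
    [1, 3, 1],
    [0, 1, 0],
    [1, 4, 2],
    [0, 0, 1],
]

print(summarise_test(marks))
```

Alpha is only one estimate of replicability. It depends on the number of questions and on the spread of candidate marks, so values from tests of different lengths or cohorts are not directly comparable.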

slide-31
SLIDE 31

We need to make certain assumptions

■ Each test is intended to provide an overall estimate of proficiency

□ it is not intended to certify the attainment of a specified set of AC for the unit

■ Each test is intended to differentiate between candidates

□ between gradations of proficiency (pass, merit, distinction)

■ All candidates were adequately prepared for their tests

□ they were enrolled on an appropriate course (at the right level)
□ they were taught the subject content appropriately
□ they were given appropriate experience of test-taking

slide-32
SLIDE 32

2016 tests – Facility indices

slide-33
SLIDE 33

2016 tests – Reliability coefficients

slide-34
SLIDE 34

We need to engage technically with VTQs

■ WHY? Because VTQ assessment is never unproblematic, so it always needs to be studied (scientifically)

□ validity (includes reliability, comparability, bias)

■ HOW? By comparing how VTQ assessment is supposed to work (in theory) with how it actually works (in practice)
■ AND? Statistics can be our friends!

□ just as long as we get to know them really well and treat them right

“nothing more than doing one’s damnedest with one’s mind, no holds barred” (Bridgman, 1955)

Credit: Wikimedia Commons