Working Papers in Education

Improving Performance Assessment Score Validation Practices: An Instructional Module on Generalizability Theory

Oksana Naumenko


In developing and validating measures of language abilities, researchers face the challenge of balancing the statistical and consequential qualities of language tests. As the use of performance assessments grows (test types that require examinees to demonstrate mastery of a specific complex skill by performing or producing something), so does the need for quantitative tools that can disentangle variability in language assessment scores due to language ability from variability due to irrelevant factors. A well-known but underutilized technique for this purpose in the validation of language tests is Generalizability Theory. Generalizability Theory (G-theory) extends Classical Test Theory (CTT) by providing a mechanism for examining the dependability of behavioral measurements (Cronbach, Gleser, Nanda, & Rajaratnam, 1972). One of the main advantages of using G-theory to establish evidence of measurement soundness is that, in this framework, the observed test score can be partitioned into components beyond the true score and random error. Examining the relative and absolute magnitudes of the variance components related to factors of the study design can uncover the sources of unreliability or imprecision in the data and makes it possible to estimate reliability under conditions not yet studied. The present paper serves as an instructional module for a set of analyses under the G-theory framework and showcases an analytic study that exemplifies the various inferences that can legitimately be made from G-study results.
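
The partitioning described above can be illustrated with a minimal sketch. It assumes the simplest case, a fully crossed persons-by-raters design with one score per cell, which is not necessarily the design analyzed in the paper. The scores are invented for illustration, and the function names `g_study` and `d_study` are hypothetical. Variance components are estimated from the expected mean squares of a two-way ANOVA without replication, then reused to project reliability for a different number of raters.

```python
def g_study(scores):
    """Estimate variance components for a crossed persons x raters design.

    scores[p][r] is the score that rater r gave person p (one score per cell).
    Returns estimates for persons, raters, and the residual
    (person-by-rater interaction confounded with random error).
    """
    n_p, n_r = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    person_means = [sum(row) / n_r for row in scores]
    rater_means = [sum(row[r] for row in scores) / n_p for r in range(n_r)]

    # Sums of squares for the two main effects and the residual.
    ss_p = n_r * sum((m - grand) ** 2 for m in person_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_total - ss_p - ss_r

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Solve the expected-mean-square equations; negative estimates are
    # truncated at zero, a common (but not the only) convention.
    return {
        "p": max((ms_p - ms_res) / n_r, 0.0),
        "r": max((ms_r - ms_res) / n_p, 0.0),
        "pr,e": ms_res,
    }


def d_study(vc, n_raters):
    """Project dependability for a design that averages over n_raters raters."""
    rel_error = vc["pr,e"] / n_raters              # error for relative decisions
    abs_error = (vc["r"] + vc["pr,e"]) / n_raters  # error for absolute decisions
    g_coef = vc["p"] / (vc["p"] + rel_error)       # generalizability coefficient
    phi = vc["p"] / (vc["p"] + abs_error)          # index of dependability
    return g_coef, phi


# Invented ratings: 5 examinees (rows) scored by 3 raters (columns).
ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 4, 3],
    [1, 2, 2],
]

components = g_study(ratings)
for n in (1, 3, 6):
    g_coef, phi = d_study(components, n)
    print(f"raters={n}  G={g_coef:.3f}  Phi={phi:.3f}")
```

The rater component enters only the absolute error term, so the index of dependability can never exceed the generalizability coefficient for the same design. Calling `d_study` with a rater count that was never observed is the sense in which reliability is estimated under conditions not yet studied.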
