
Multistage Adaptive Testing for a Large-Scale Classification Test: Design, Heuristic Assembly, and Comparison with Other Testing Modes


Yi Zheng, Yuki Nozawa, Xiaohong Gao, Hua-Hua Chang

September 2012

ACT Research Report Series 2012 (6)

For additional copies, write: ACT Research Report Series, P.O. Box 168, Iowa City, IA 52243-0168

© 2012 by ACT, Inc. All rights reserved.

Abstract

Multistage adaptive tests (MSTs) have gained increasing popularity in recent years. MST is a balanced compromise between linear test forms (i.e., paper-and-pencil testing and computer-based testing) and traditional item-level computer-adaptive testing (CAT), and it combines the advantages of both. On the one hand, MST is adaptive and therefore more efficient than linear tests. On the other hand, unlike CAT, it allows test developers to review test forms before administration, and it allows examinees to review and revise their answers. Despite these advantages, there is little literature on the details of heuristic automated assembly of MST or on MST in the context of classification tests. In this study, we designed an MST for a large-scale classification test and performed the automated test assembly using a heuristic method. We then used computer simulation to compare the performance of the MST with that of a linear test form and a CAT. The automated test assembly was successful. In comparing MST and CAT, we observed a trade-off between measurement accuracy and item bank usage. For classification purposes, however, MST provided classification accuracy as good as that of CAT, with more efficient item bank usage.

Acknowledgements

The authors thank Deborah Harris, Rongchun Zhu, and Chunyan Liu for their insightful comments on this study.
The first author also thanks Steven Nydick for his generous help as a co-intern.

Multistage adaptive tests (MSTs) have gained increasing popularity in recent years: the Certified Public Accountants (CPA) Examination successfully switched from the paper-and-pencil (P&P) mode to the MST mode in 2004 (Breithaupt & Hare, 2007; Luecht, Brumfield, & Breithaupt, 2006), and the Graduate Record Examination (GRE) replaced the P&P linear testing mode and the computerized-adaptive testing (CAT) mode with MST in August 2011 (see footnote 1). CAT has been applied for decades, while MST was recently promoted as an alternative, a “balanced compromise” (Hendrickson, 2007) between the linear testing modes (i.e., P&P and computer-based testing, CBT) and CAT.

As the MST framework has developed, it has taken several forms and names (Hendrickson, 2007), including two-stage testing (Adema, 1990; Kim & Plake, 1993), computerized mastery testing (CMT; Lewis & Sheehan, 1990), computer-adaptive sequential testing (CAST; Luecht, 2000; Luecht & Nungester, 1998), bundled multistage adaptive testing (BMAT; Luecht, 2003), and multiple form structures (MFS; Armstrong, Jones, Koppel, & Pashley, 2004). Recently, the names multistage testing, multistage adaptive testing, adaptive multistage testing, and computer adaptive multistage testing have been widely used in the literature (e.g., Armstrong & Roussos, 2005; Belov & Armstrong, 2008; Breithaupt & Hare, 2007; Chen, 2011; Hambleton & Xing, 2006; Jodoin, Zenisky, & Hambleton, 2006; Keng, 2008; Luecht, Brumfield, & Breithaupt, 2006; Luecht & Burgin, 2003; Patsula, 1999). This study will use the name multistage adaptive testing (MST). In addition to the name, researchers have used different sets of terminology to describe the framework of multistage adaptive testing.
This study will use the framework and terminology adopted in Luecht and Nungester’s (1998) paper.

Footnote 1: See http://www.ets.org/gre/revised_general/about/experience for more information.

In the MST framework, a test is divided into several stages. Having multiple stages gives the test a few opportunities to tailor itself to each examinee: for every stage after Stage 1, it selects the item set that best matches the examinee’s ability, based on his or her responses in the previous stages. This is similar to item selection in CAT. However, while CAT selects each item on the fly, MST preassembles all tests before administration. The basic structure of an assembled MST is termed a “panel.” A panel is comprised of several st