Performance of Three Conditional DIF Statistics in Detecting Differential Item Functioning on Simulated Tests

ACT Research Report Series 89-7

Judith A. Spray

October 1989

For additional copies write: ACT Research Report Series, P.O. Box 168, Iowa City, Iowa 52243

© 1989 by The American College Testing Program. All rights reserved.

ABSTRACT

Computer simulations were conducted to study the behavior of three conditional differential item functioning (DIF) statistics in the detection of true or asymptotic DIF. The statistics included the standardized difference in proportion-correct (STD), the Mantel-Haenszel common odds-ratio (MH) and the root mean weighted squared difference in proportion-correct (RMWSD). The simulated tests were based on actual administrations of the ACT Assessment to certain focal and base examinee populations. Sample sizes of examinees were varied while true DIF and test length remained fixed. Results of these simulations showed that the MH and STD statistics were preferred as DIF indicators for sample sizes greater than 250,

In the fall of 1988, several members of the American College Testing Program's Test Development Division conducted computer simulations to study the behavior of three conditional differential item functioning (DIF) statistics, in terms of DIF or item bias detection. The statistics selected for inclusion in this study were the standardized difference in proportion-correct (Dorans & Kulick, 1986), Mantel-Haenszel common odds-ratio (Holland & Thayer, 1986; Mantel & Haenszel, 1959), and the root mean weighted squared difference in proportion-correct (Dorans & Kulick, 1986).

Item bias statistics which condition on some examinee ability measure are thought to be better measures of DIF than those statistics that use the simple unconditional difference in proportion-correct values, sometimes referred to as impact.
The unconditional impact does not take into account underlying differences in ability distributions between populations or groups of interest. The conditional procedures, on the other hand, reflect proportion-correct differences only between examinees with comparable ability in each population or group. These DIF statistics have been used by other testing programs and services to detect or flag test items on tests where DIF might be problematic. The statistics were defined as follows.

The populations or groups of interest were referred to as the focal (F) group and the base (B) group. Then $s$ indexed each observed score category of a $k$-item test, or $s = 0, 1, \ldots, k$. Then

$N_{F_s}$ = the number of examinees in the F group at score $s$,

$N_{B_s}$ = the number of examinees in the B group at score $s$,

$N_s$ = the total number of examinees in F and B at score $s$,

$\omega_{F_s} = N_{F_s} / \sum_{s=0}^{k} N_{F_s}$, the relative frequency of F at $s$,

$\omega_{B_s} = N_{B_s} / \sum_{s=0}^{k} N_{B_s}$, the relative frequency of B at $s$, and

$\omega_s = N_s / \sum_{s=0}^{k} N_s$, the total relative frequency of F and B at $s$.

Also $R_{F_s}$ and $R_{B_s}$ were the numbers of examinees (i.e., absolute frequency) in F and B respectively at $s$ who answered the item correctly. The proportion-correct values for each group at $s$ were given by

$$P_{F_s} = R_{F_s} / N_{F_s}$$

and

$$P_{B_s} = R_{B_s} / N_{B_s}.$$

The STD Statistic

The standardized difference in proportion-correct was defined as

$$\mathrm{STD} = \sum_{s=0}^{k} \omega_{F_s} \left( P_{F_s} - P_{B_s} \right). \tag{1}$$

The signed difference, $(P_{F_s} - P_{B_s})$, was weighted by the relative frequency of F because $\omega_{F_s}$ provided the greatest weight to differences at those score levels most frequently observed in the focal group.
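The contrast between unconditional impact and the conditional STD statistic of Equation 1 can be illustrated with a minimal Python sketch. The function names (`impact`, `std_statistic`) and the per-score counts are hypothetical and chosen only for illustration; they are not taken from the report. The sketch also assumes that score levels with no examinees in either group are skipped, which the report does not specify.

```python
from typing import Sequence


def impact(n_f: Sequence[int], r_f: Sequence[int],
           n_b: Sequence[int], r_b: Sequence[int]) -> float:
    """Unconditional difference in proportion-correct (focal minus base)."""
    return sum(r_f) / sum(n_f) - sum(r_b) / sum(n_b)


def std_statistic(n_f: Sequence[int], r_f: Sequence[int],
                  n_b: Sequence[int], r_b: Sequence[int]) -> float:
    """Standardized difference in proportion-correct, as in Equation 1.

    n_f[s], n_b[s]: examinees in F and B at score level s.
    r_f[s], r_b[s]: examinees in F and B at s who answered the item correctly.
    """
    total_f = sum(n_f)
    std = 0.0
    for nf, rf, nb, rb in zip(n_f, r_f, n_b, r_b):
        # Assumption (not from the report): skip levels where either
        # group is empty, since a proportion is undefined there.
        if nf == 0 or nb == 0:
            continue
        w_f = nf / total_f              # relative frequency of F at s
        std += w_f * (rf / nf - rb / nb)
    return std


# Hypothetical counts for a 2-item test (score levels s = 0, 1, 2).
n_f = [50, 30, 20]       # focal group concentrated at low scores
n_b = [20, 30, 50]       # base group concentrated at high scores

# Case 1: equal proportion-correct at every score level (no DIF).
r_f_equal = [10, 15, 16]  # proportions 0.2, 0.5, 0.8
r_b = [4, 15, 40]         # proportions 0.2, 0.5, 0.8

# Case 2: focal proportion-correct lower by 0.1 at every level.
r_f_lower = [5, 12, 14]   # proportions 0.1, 0.4, 0.7

impact_equal = impact(n_f, r_f_equal, n_b, r_b)
std_equal = std_statistic(n_f, r_f_equal, n_b, r_b)
std_lower = std_statistic(n_f, r_f_lower, n_b, r_b)

print(impact_equal, std_equal, std_lower)
```

In the first case the two groups differ in overall proportion-correct only because their score distributions differ, so the impact is nonzero while STD is zero. In the second case the focal group does worse at every matched score level, and STD picks up that conditional difference.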
The MH Statistic

If $W_{F_s}$ and $W_{B_s}$ were the absolute frequencies of incorrect responses to this item in F and B, respectively at $s$, and $N_s$ was the total number of responses at $s$, then the Mantel-Haenszel common odds-ratio was

$$\mathrm{MH} = \frac{\sum_{s=0}^{k} R_{B_s} W_{F_s} / N_s}{\sum_{s=0}^{k} R_{F_s} W_{B_s} / N_s}.$$

If $Q_{F_s}$ and $Q_{B_s}$ were defined as $(1 - P_{F_s})$ and $(1 - P_{B_s})$ respectively, then this index also could be written as

$$\mathrm{MH} = \frac{\sum_{s=0}^{k} P_{B_s} Q_{F_s} \, \dfrac{N_{B_s} N_{F_s}}{N_s}}{\sum_{s=0}^{k} P_{F_s} Q_{B_s} \, \dfrac{N_{B_s} N_{F_s}}{N_s}}$$

or even as a function of several relative weights or densities,

$$\mathrm{MH} = \frac{\sum_{s=0}^{k} P_{B_s} Q_{F_s} \, \dfrac{\omega_{B_s} \omega_{F_s}}{\omega_s}}{\sum_{s=0}^{k} P_{F_s} Q_{B_s} \, \dfrac{\omega_{B_s} \omega_{F_s}}{\omega_s}}. \tag{2}$$

The RMWSD Statistic

And fina