Researcher: Sara Jalali
Sara Jalali, Gholamreza Kiani
There are two major theories of measurement in psychometrics: classical test theory (CTT) and item response theory (IRT). Despite its long and widespread use, CTT has a number of shortcomings that make it problematic for both practical and theoretical purposes. IRT attempts to overcome these shortcomings and to provide more dependable answers. One application of IRT is the assessment of differential item functioning (DIF), which tells the test developer whether a test item functions differently for different groups. Another important use of IRT is in computer adaptive testing (CAT). CAT is based on IRT, and the first step in preparing a CAT is building an item bank, which itself depends on IRT. Without IRT, item banking is not feasible and, consequently, neither is CAT. The present study first provided a thorough comparison of CTT and IRT from both theoretical and practical perspectives. For this part of the study, the scores of 3,000 test takers were used. IRT was then used to estimate DIF between the two gender groups and among three fields of study (mathematics, science, and humanities) in the specific English language section of the 2006 foreign language university entrance exam. Data from 15,486 participants were used for the gender DIF analysis and data from 3,924 participants for the field-of-study DIF analysis. Next, IRT was used to prepare an item bank from the specific English language section of the mock foreign language entrance exams for 2006 and 2007. This mock exam is administered by an institute affiliated with the National Organization of Educational Testing (NOET). A new, dedicated software package, FastTEST, was used to prepare the item bank. Finally, the item bank was used to prepare the CAT version of the English exam, which was the final goal of the dissertation.
The findings showed that CTT-based and IRT-based person statistics correlated highly across the three IRT models. Item difficulty and item discrimination indices from CTT also correlated highly with those from all three IRT models. The DIF analysis identified a number of DIF items in the exam, and these items were examined to find the source of the DIF. Finally, a suitable item bank and the CAT version of the English exam were prepared. The findings of the present study can be of great importance for the educational system. The researcher offers some suggestions on the use of IRT and an English CAT in Iran.
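To make the compared quantities concrete, the following is a minimal sketch of the standard CTT item statistics (difficulty as proportion correct, discrimination as the corrected item-total correlation) alongside a logistic item response function of the kind used in IRT. It is an illustration only, not the procedure used in the study: the abstract does not name its three IRT models, so the three-parameter logistic form here is an assumption, and the function names and toy response matrix are hypothetical.

```python
from math import exp, sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ctt_item_stats(responses):
    """Return (difficulty, discrimination) per item.

    responses: one row per test taker, each a list of 0/1 item scores.
    Difficulty is the proportion of correct answers; discrimination is
    the correlation between the item and the total score on the
    remaining items (corrected item-total correlation).
    """
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    stats = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        difficulty = sum(item) / len(item)
        rest = [t - s for t, s in zip(totals, item)]  # total without item j
        stats.append((difficulty, pearson(item, rest)))
    return stats


def irf_3pl(theta, a, b, c=0.0):
    """Probability of a correct response under a 3PL model (assumed form).

    theta: ability, a: discrimination, b: difficulty, c: guessing.
    With c = 0 this is the 2PL model; fixing a as well gives the 1PL.
    """
    return c + (1.0 - c) / (1.0 + exp(-a * (theta - b)))


# Hypothetical toy data: 6 test takers, 3 items.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
stats = ctt_item_stats(toy)
```

In a comparison like the one summarized above, the CTT difficulty and discrimination values would be correlated with the corresponding IRT parameter estimates (b and a), which in practice come from dedicated estimation software rather than from a closed-form calculation.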