TY - JOUR
T1 - Automated estimation of item difficulty for multiple-choice tests
T2 - An application of word embedding techniques
AU - Hsu, Fu Yuan
AU - Lee, Hahn Ming
AU - Chang, Tao Hsing
AU - Sung, Yao Ting
N1 - Publisher Copyright: © 2018
PY - 2018/11
Y1 - 2018/11
AB - Pretesting is the most commonly used method for estimating test item difficulty because it provides highly accurate results that can be applied to assessment development activities. However, pretesting is inefficient, and it can lead to item exposure. Hence, an increasing number of studies have invested considerable effort in researching the automated estimation of item difficulty. Language proficiency tests constitute the majority of researched test topics, while comparatively less research has focused on content subjects. This paper introduces a novel method for the automated estimation of item difficulty for social studies tests. In this study, we explore the difficulty of multiple-choice items, which consist of the following item elements: a question and alternative options. We use learning materials to construct a semantic space using word embedding techniques and project an item's texts into the semantic space to obtain corresponding vectors. Semantic features are obtained by calculating the cosine similarity between the vectors of item elements. Subsequently, these semantic features are sent to a classifier for training and testing. Based on the output of the classifier, an estimation model is created and item difficulty is estimated. Our findings suggest that the semantic similarity between a stem and the options has the strongest impact on item difficulty. Furthermore, the results indicate that the proposed estimation method outperforms pretesting, and therefore, we expect that the proposed approach will complement and partially replace pretesting in future.
KW - Cognitive processing model
KW - Item difficulty estimation
KW - Machine learning
KW - Multiple-choice item
KW - Semantic similarity
KW - Word embedding
UR - http://www.scopus.com/inward/record.url?scp=85049610755&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85049610755&partnerID=8YFLogxK
U2 - 10.1016/j.ipm.2018.06.007
DO - 10.1016/j.ipm.2018.06.007
M3 - Article
AN - SCOPUS:85049610755
SN - 0306-4573
VL - 54
SP - 969
EP - 984
JO - Information Processing and Management
JF - Information Processing and Management
IS - 6
ER -