The Linguistic Association of Korea E-Journal

Vol. 31, No. 4 (December 2023)

Application of a Bayesian Network-Based Cognitive Diagnostic Model to an English Reading Test

Choi, Sae il (최세일)

Pages: 83-111

Abstract

Choi, Sae il. (2023). Application of Bayesian network-based cognitive diagnostic modeling to small sample English reading comprehension test data. The Linguistic Association of Korea Journal, 31(4), 83-111. Cognitive diagnostic models (CDMs), a family of classification models developed to provide fine-grained diagnostic information for teaching and learning, have increasingly been used in language testing. However, most previous CDM studies in language testing have relied on large samples from professional testing agencies. This reliance makes it difficult for practitioners to apply the models in classroom assessment contexts, for which the models were originally developed. Recognizing this limitation, CDM researchers have recently begun to examine the conditions under which CDMs can work for classroom assessments, especially with small sample sizes. Bayesian networks (BNs) provide an efficient and intuitive framework for modeling complex systems of observable and latent variables, and they have been extensively employed in data science as well as in intelligent tutoring systems for modeling students' learning progress. The framework also has considerable potential for diagnostic modeling of student learning in classroom contexts. This study examined whether BNs can be applied to cognitive diagnostic modeling for classroom assessments. After constructing a set of small test data sets (N = 100, 150, 200) from a large real test, the study applied a BN-based CDM to the data sets and compared its item parameter recovery and attribute classification recovery with those of conventional CDMs. The results show that the BN-based CDM yielded uniformly better estimates than the conventional methods in all testing conditions. The study then discusses the implications for CDM applications in language testing.

Keywords

# diagnostic information # cognitive diagnostic models # item parameters # classification accuracy # Bayesian networks
