AI and Multimodality-Based Authentic-Innovative Assessment for Evaluating English Speaking Skills

Ibnu Subroto; Yanti Rosalinah; Cicih Nuraeni

doi:10.31294/wanastra.v18i1.11392

Authors

Ibnu Subroto, Universitas Bina Sarana Informatika
Yanti Rosalinah, Universitas Bina Sarana Informatika
Cicih Nuraeni, Universitas Bina Sarana Informatika

Keywords:

Authentic Assessment, Artificial Intelligence, Multimodality, Speaking Skills, EFL

Abstract

This study aimed to develop and validate an AI-Enhanced Multimodal Authentic Assessment model for evaluating EFL students' speaking skills, addressing the limitations of conventional, subjective assessment methods. Using a qualitative-dominant mixed-methods design, the research involved 35 university students from a Communication and Language study program. Data were collected through authentic video-based speaking tasks, AI-assisted linguistic analysis (using Google Speech-to-Text and a ChatGPT-based evaluator), detailed multimodal rubric assessments, and student perception questionnaires. Analysis followed three procedures: multimodal performance analysis using a validated rubric, comparative analysis of AI-generated linguistic metrics, and thematic analysis of questionnaire responses. Quantitative data from the AI metrics and the Likert-scale questionnaire items were summarized with descriptive statistics, while qualitative data were analyzed thematically. The findings showed that multimodal assessment effectively captured the verbal, prosodic, visual, and gestural aspects of performance, while AI was strongest at objectively analyzing micro-linguistic features such as pronunciation, speech rate, and vocabulary. Integrating human and AI evaluation produced a comprehensive hybrid model that gave richer, more informative feedback. Students also reported positive perceptions of the clarity and usefulness of the AI-generated feedback. The study concludes that this integrated assessment model is highly relevant to 21st-century pedagogy and improves the accuracy and quality of oral performance evaluation. The implications suggest that educators can adopt the framework to create more objective, efficient, and holistic speaking assessments, ultimately supporting better learning outcomes.
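As a rough illustration of the kind of micro-linguistic metrics and descriptive statistics mentioned in the abstract, the sketch below computes a speech rate, a simple vocabulary-variety index, and summary statistics for Likert responses. It is not the study's pipeline: the function names, the type-token ratio as a vocabulary measure, and the sample data are all assumptions made for illustration, and the transcript is assumed to come from a speech-to-text tool along with the recording duration.

```python
import statistics


def speech_rate_wpm(transcript: str, duration_seconds: float) -> float:
    """Words per minute for a transcript with a known recording duration."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    word_count = len(transcript.split())
    return word_count * 60.0 / duration_seconds


def type_token_ratio(transcript: str) -> float:
    """Distinct words divided by total words (a rough vocabulary-variety index)."""
    # Strip common punctuation and lowercase so "Learning." matches "learning".
    tokens = [w.strip(".,!?;:\"'()").lower() for w in transcript.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def describe_likert(responses: list[int]) -> dict[str, float]:
    """Mean, median, and sample standard deviation of Likert-scale responses."""
    return {
        "mean": statistics.mean(responses),
        "median": statistics.median(responses),
        "stdev": statistics.stdev(responses) if len(responses) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical illustration data, not taken from the study.
    sample_transcript = (
        "I think online learning is useful because learning is flexible."
    )
    print("Speech rate (wpm):", speech_rate_wpm(sample_transcript, 6.0))
    print("Type-token ratio:", type_token_ratio(sample_transcript))
    print("Likert summary:", describe_likert([4, 5, 4, 3, 5]))
```

Metrics like these cover only the countable, verbal side of a performance. Under the hybrid model the abstract describes, the prosodic, visual, and gestural aspects would still be scored by a human rater using the multimodal rubric.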
