AI and Multimodality-Based Authentic-Innovative Assessment for Evaluating English Speaking Skills
DOI:
https://doi.org/10.31294/wanastra.v18i1.11392
Keywords:
Authentic Assessment, Artificial Intelligence, Multimodality, Speaking Skills, EFL
Abstract
This study aimed to develop and validate an AI-Enhanced Multimodal Authentic Assessment model for evaluating EFL students' speaking skills, addressing the limitations of conventional, subjective methods. Employing a mixed-methods approach with a qualitative-dominant design, the research involved 35 university students from a Communication and Language study program. Data were collected through authentic video-based speaking tasks, AI-assisted linguistic analysis (using Google Speech-to-Text and a ChatGPT-based evaluator), detailed multimodal rubric assessments, and student perception questionnaires. Data analysis was conducted through three procedures: multimodal performance analysis using a validated rubric, comparative analysis of AI-generated linguistic metrics, and thematic analysis of questionnaire responses. Quantitative data from AI metrics and Likert-scale questionnaire items were analyzed using descriptive statistics, while qualitative data were analyzed thematically. The findings revealed that multimodal assessment effectively captured verbal, prosodic, visual, and gestural aspects of performance. Concurrently, AI excelled at objectively analyzing micro-linguistic features such as pronunciation, speech rate, and vocabulary. The integration of human and AI evaluation created a comprehensive hybrid model that provided richer, more informative feedback. Furthermore, students expressed positive perceptions regarding the clarity and usefulness of the AI-generated feedback. The study concludes that this integrated assessment model is highly relevant for 21st-century pedagogy and enhances the accuracy and quality of oral performance evaluation. The implications suggest that educators can adopt this framework to create more objective, efficient, and holistic speaking assessments, ultimately fostering better learning outcomes.
License
Copyright (c) 2026 Ibnu Subroto, Yanti Rosalinah, Cicih Nuraeni (Author)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
All articles published in Wanastra: Jurnal Bahasa dan Sastra are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
Under this license, authors retain copyright of their work and grant others the right to:
- Share: copy and redistribute the material in any medium or format
- Adapt: remix, transform, and build upon the material for any purpose, even commercially
Terms of Use:
Users are free to use the material under the following conditions:
- Attribution: Proper credit must be given to the original author(s) and the source, a link to the license must be provided, and any changes made must be indicated.
- No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
The full license details can be accessed at: https://creativecommons.org/licenses/by/4.0/
By submitting to Wanastra: Jurnal Bahasa dan Sastra, authors agree to these licensing terms.




