Abstract
Natural language-based assessment (NLA) is an approach to second language assessment that takes instructions originally written for human examiners, expressed as can-do descriptors, and tests whether large language models (LLMs) can interpret and apply them in ways comparable to human assessment. In this work, we explore the use of such descriptors with an open-source LLM, Qwen 2.5 72B, to assess responses from the publicly available S&I Corpus in a zero-shot setting. Our results show that this approach, which relies solely on textual information, achieves competitive performance: while it does not outperform state-of-the-art speech LLMs fine-tuned for the task, it surpasses a BERT-based model trained specifically for this purpose. NLA proves particularly effective in mismatched task settings, is generalisable to other data types and languages, and offers greater interpretability, as it is grounded in clearly explainable, widely applicable language descriptors.
Supplementary materials
Title
Natural Language-based Assessment of L2 Oral Proficiency using LLMs - Paper
Description
Paper published in the Proceedings of the 10th Workshop on Speech and Language Technology in Education (SLaTE)
Supplementary weblinks
Title
Link to the Proceedings of the 10th Workshop on Speech and Language Technology in Education (SLaTE)