Abstract
Estimating word complexity is essential for many computer-assisted language learning technologies. We introduce semantic error prediction (SEP), a novel task that assesses the production complexity of content words. In SEP, a system must predict which word tokens are replacements of tokens from the original text. We apply LLMs to this task and establish its practical relevance for predicting the vocabulary scores of learner essays, providing a finer-grained assessment of learner skills.