Abstract
Our work empirically assesses the novel hypothesis that theories of language acquisition can inform more effective curriculum learning strategies for cognitively-inspired Small-Scale Language Models (SSLMs). SSLMs are Transformer-based language models trained on corpora that approximate the volume and nature of input a first-language learner can expect to receive during language acquisition. Curriculum Learning (CL) emerged in the first BabyLM Shared Task as a promising method for improving the cognitive plausibility of SSLMs: it gradually introduces more complex linguistic phenomena later in training, in a manner similar to human language acquisition. However, CL strategies have not yet led to considerable improvements over non-curriculum models. This is contrary to the predictions of linguistic theory, which suggests that children naturally focus on input that is neither too simple nor too difficult but at the right level of challenge for learning. This acquisition behaviour, known as the “Goldilocks Effect”, is a form of self-selecting curriculum learning that appears to occur naturally in first language (L1) acquisition. We compare the success of three curricula (Growing, Inwards & MMM) that precisely replicate the predictions of contrasting contemporary Chomskyan acquisition theories, using them to specify fine-grained curriculum learning strategies over the volume of Child-Directed Speech (CDS) a learner would expect to receive by age six (6;0), for five languages. Overall, curriculum learning strategies that precisely replicate language acquisition theories, formulated through careful analysis of child developmental sequences, can lead to better-performing, data-efficient architectures.
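As a rough illustration of what an objective-driven curriculum ordering can look like in practice, the sketch below sorts training utterances by a difficulty score so that simpler input is seen first, in the spirit of a "Growing" curriculum. The mean-word-frequency proxy, the `difficulty` function, and the toy corpus are illustrative assumptions only; the paper's Growing, Inwards, and MMM curricula are specified by the acquisition theories themselves, not by this heuristic.

```python
# Minimal sketch of a difficulty-ordered ("Growing"-style) curriculum,
# assuming a per-utterance difficulty score. Here, mean word frequency
# stands in for the paper's linguistically-motivated objectives.
from collections import Counter
from typing import List


def difficulty(utterance: List[str], freq: Counter, total: int) -> float:
    """Score an utterance: lower = more frequent (simpler) vocabulary."""
    if not utterance:
        return 0.0
    # Negate mean relative frequency so frequent words yield low scores.
    return -sum(freq[w] / total for w in utterance) / len(utterance)


def growing_curriculum(corpus: List[List[str]]) -> List[List[str]]:
    """Order a corpus from simplest to most complex utterances.

    This frequency-based ordering is a stand-in; the real curricula
    (Growing, Inwards, MMM) follow theory-derived developmental stages.
    """
    freq = Counter(w for utt in corpus for w in utt)
    total = sum(freq.values())
    return sorted(corpus, key=lambda u: difficulty(u, freq, total))


if __name__ == "__main__":
    # Toy Child-Directed-Speech-like corpus (hypothetical example data).
    cds = [
        ["the", "cat", "sat", "on", "the", "mat"],
        ["the", "dog"],
        ["quantum", "chromodynamics"],
    ]
    for utt in growing_curriculum(cds):
        print(" ".join(utt))
```

Under this proxy, high-frequency utterances surface earliest in training; an "Inwards"-style curriculum would instead order input by a different theory-derived criterion, swapped in via the sort key.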
Supplementary weblinks

Code
Code for Cognitively-Plausible Small-Scale Language Models trained using developmentally-plausible corpora of Child-Directed Speech, and a series of universal and language-specific objective curricula.

Trained Models & Training Datasets
Models and datasets developed in this paper are available to use on HuggingFace.

Cambridge Small Language Models
Resources for Cambridge University Small Language Models.