The Missing Half of Language Learning in Current Developmental Language Models: Exogenous and Endogenous Linguistic Input

  • Nan Zhao
  • Xufeng Duan
  • Zhenguang G. Cai*
  • *Corresponding author for this work

Research output: Contribution to journal › Journal article › peer-review

Abstract

Developmental language models (DLMs) aim to replicate the efficiency of child language acquisition but often focus solely on the estimation of exogenous linguistic input. We argue that a child’s linguistic growth is also critically shaped by endogenous processes, including (1) co-opting language in non-linguistic perception and cognition, (2) engaging in private and inner speech, and (3) benefiting from neural replay of linguistic information during sleep. These endogenous processes amplify and refine exogenous linguistic input in ways that current DLMs do not replicate. To align DLMs with child language acquisition, we propose redefining “linguistic exposure” to encompass both exogenous and endogenous linguistic input. By integrating label feedback, self-generated speech, and sleep-like consolidation, researchers can narrow the gap between artificial and human learning. Collaborations across machine learning, psychology, and linguistics will be essential to ground models in empirical data on child behavior and build DLMs that truly reflect the marvel of language acquisition.
Original language: English
Pages (from-to): 1543-1549
Number of pages: 7
Journal: Open Mind
Volume: 9
Publication status: Published - 17 Sept 2025

User-Defined Keywords

  • developmental language models
  • inner speech
  • label feedback
  • language learning
  • private speech
  • sleep consolidation
