A systematic framework for generating novel experimental hypotheses from language models
Kanishka Misra, Najoung Kim · Aug 9, 2024 · Citations: 0
How to use this page
Low trust
Use this as background context only. Do not make protocol decisions from this page alone.
Best use
Background context only
What to verify
Read the full paper before copying any benchmark, metric, or protocol choices.
Evidence quality
Low
Derived from extracted protocol signals and abstract evidence.
Abstract
Neural language models (LMs) have been shown to capture complex linguistic patterns, yet their utility in understanding human language and, more broadly, human cognition remains debated. While existing work in this area often evaluates human-machine alignment, few studies attempt to translate findings from this enterprise into novel insights about humans. To this end, we propose a systematic framework for hypothesis generation that uses LMs to simulate outcomes of experiments that do not yet exist in the literature. We instantiate this framework in the context of a specific research question in child language development: dative verb acquisition and cross-structural generalization. Through this instantiation, we derive novel, untested hypotheses: the alignment between argument ordering and discourse prominence features of exposure contexts modulates how children generalize new verbs to unobserved structures. We also design a set of experiments that can test these hypotheses in the lab with children. This work contributes both a domain-general framework for systematic hypothesis generation via simulated learners and domain-specific, lab-testable hypotheses for child language acquisition research.