
Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice

Authors: Jian-Qiao Zhu, Haijiang Yan, Thomas L. Griffiths
TLDR:
This paper examines the potential of using large language models (LLMs) as cognitive models, focusing on human decision-making in risky and intertemporal choices. The study finds that pretraining an LLM on an ecologically valid arithmetic dataset allows it to capture human decision-making better than classical behavioral models commonly used in psychological experiments. The paper also discusses the challenges of using LLMs as cognitive models, including limited disclosure of and access to training data and open questions about model architecture and training objectives, and it calls for further research to bridge the data gap between LLMs and human learners.

The paper explores the potential of using large language models (LLMs) as cognitive models, particularly for predicting human decision-making in risky and intertemporal choices, and proposes an approach that addresses key challenges in treating LLMs as cognitive models by controlling what the models are trained on.


Abstract

The observed similarities in the behavior of humans and Large Language Models (LLMs) have prompted researchers to consider the potential of using LLMs as models of human cognition. However, several significant challenges must be addressed before LLMs can be legitimately regarded as cognitive models. For instance, LLMs are trained on far more data than humans typically encounter, and may have been directly trained on human data in specific cognitive tasks or aligned with human preferences. Consequently, the origins of these behavioral similarities are not well understood. In this paper, we propose a novel way to enhance the utility of LLMs as cognitive models. This approach involves (i) leveraging computationally equivalent tasks that both an LLM and a rational agent need to master for solving a cognitive problem and (ii) examining the specific task distributions required for an LLM to exhibit human-like behaviors. We apply this approach to decision-making -- specifically risky and intertemporal choice -- where the key computationally equivalent task is the arithmetic of expected value calculations. We show that an LLM pretrained on an ecologically valid arithmetic dataset, which we call Arithmetic-GPT, predicts human behavior better than many traditional cognitive models. Pretraining LLMs on ecologically valid arithmetic datasets is sufficient to produce a strong correspondence between these models and human decision-making. Our results also suggest that LLMs used as cognitive models should be carefully investigated via ablation studies of the pretraining data.
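
For concreteness, the "computationally equivalent task" referred to in the abstract amounts to standard value computations. The following is a minimal sketch using the usual textbook forms; the paper's exact parameterization may differ:

```latex
% Risky choice: expected value of a gamble with outcomes x_i and probabilities p_i
\mathrm{EV} = \sum_{i} p_i \, x_i

% Intertemporal choice: present value of a reward x delayed by time t
\mathrm{PV} = \gamma^{t} x \quad \text{(exponential discounting)}
\qquad \text{or} \qquad
\mathrm{PV} = \frac{x}{1 + k t} \quad \text{(hyperbolic discounting)}
```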

Method

The paper introduces Arithmetic-GPT, a small language model pretrained to perform arithmetic, and uses it to investigate human risky and intertemporal decision-making. The researchers developed a data-generation algorithm to create synthetic arithmetic datasets, giving them complete control over the model's training data. After pretraining the model on these ecologically valid arithmetic expressions, they extracted its internal activations (embeddings) for each choice problem and used logistic regression with cross-validation to evaluate how well those representations predict human choices. Together, these steps allowed the researchers to assess the capabilities and limitations of the language model in capturing human decision-making.
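
To make the data-generation step concrete, here is a minimal Python sketch that produces synthetic expected-value arithmetic strings. The sampling distributions (Beta for probabilities, log-uniform for magnitudes) and the string format are illustrative assumptions, not the paper's exact algorithm:

```python
import random

def sample_probability():
    # Placeholder distribution: skew toward small probabilities, loosely
    # mimicking how often different probabilities appear in natural text.
    return round(random.betavariate(0.3, 1.0), 2)

def sample_value():
    # Placeholder distribution: log-uniform magnitudes between 1 and 1000,
    # so small numbers dominate, as in naturally occurring number frequencies.
    return round(10 ** random.uniform(0, 3), 2)

def make_expected_value_example():
    # One synthetic training string: an expected-value computation written
    # as plain arithmetic, e.g. "0.12*480.5+0.88*3.2=60.48".
    p = sample_probability()
    x1, x2 = sample_value(), sample_value()
    ev = round(p * x1 + (1 - p) * x2, 2)
    return f"{p}*{x1}+{round(1 - p, 2)}*{x2}={ev}"

if __name__ == "__main__":
    for line in (make_expected_value_example() for _ in range(5)):
        print(line)
```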

Main Finding

The main finding is that a small language model pretrained on arithmetic tasks can predict human decision-making better than existing psychological models and better than large-scale language models trained on far more extensive datasets. The approach can also be extended to other cognitive tasks that rely heavily on language interfaces, and can be combined with other types of foundation models to study human perception.
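
The evaluation behind this finding, as described in the Method, is a cross-validated logistic regression over model embeddings. The sketch below uses random stand-ins for the embeddings and human choices; the real pipeline would extract activations from the pretrained Arithmetic-GPT for each choice problem:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Stand-ins for illustration: in the actual pipeline, X would hold the
# pretrained model's embeddings of each choice problem and y the human choices.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 64))   # placeholder embeddings
y = rng.integers(0, 2, size=200)     # placeholder binary choices

# Cross-validated logistic regression readout: how well do the model's
# internal representations linearly predict human decisions?
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"Mean cross-validated accuracy: {scores.mean():.3f}")
```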

Conclusion

The paper concludes that large language models (LLMs) have the potential to serve as cognitive models of human decision-making. It proposes a novel approach to enhancing the utility of LLMs as cognitive models: leveraging computationally equivalent tasks and examining the specific task distributions required for LLMs to exhibit human-like behavior. The results show that a small LLM (Arithmetic-GPT) pretrained on ecologically valid arithmetic datasets predicts human behavior better than many traditional cognitive models, particularly for risky and intertemporal choices. The authors argue that LLMs used as cognitive models should be carefully investigated through ablation studies of their pretraining data.

Keywords

Large language models, cognitive models, arithmetic tasks, human decision-making, risky choices, intertemporal choices, synthetic datasets, training data, value alignment, Bayesian models of cognition, meta-learning, pre-training, attention mask, position embedding, ecological distributions, logistic regression, embeddings, ablation studies, data disclosure, limitations, future research.
