Effect of task condition on working memory span

The role of working memory in human reinforcement learning of non-Markovian tasks

Working memory refers to our ability to temporarily store and manipulate information. It is an important cognitive system, necessary for many everyday tasks such as thinking, planning, problem solving, reading and communicating with others. These activities require us to selectively maintain some information in mind while replacing old, irrelevant information with new, relevant information.

Different types of learning have been shown to involve working memory, including rule-based category learning (Ashby and Spiering, 2004; DeCaro et al., 2008), model-based reinforcement learning (Otto et al., 2013a,b) and model-free reinforcement learning (Collins and Frank, 2012; Collins et al., 2014). However, the contributions of working memory in these studies mainly consist of retaining task instructions, storing freshly learned action-outcome associations, maintaining hypotheses to test in subsequent trials, or supporting trial-to-trial adaptation after negative outcomes.

Working memory-based reinforcement learning models under the gating framework (O’Reilly and Frank, 2006; Todd et al., 2009) assume that working memory supports reinforcement learning in non-Markovian environments (i.e. environments in which states can depend on past information) by keeping track of previous events and allowing them to form part of the learning context. Although this framework is compelling from a theoretical perspective, it is not clear whether people actually rely on working memory, rather than on another memory system such as episodic memory, when given a non-Markovian task (Zilli and Hasselmo, 2008).

The aim of the present work is to investigate the potential additional role of working memory in keeping track of useful cues from the past to serve action selection in non-Markovian settings. To that end, we designed an experiment in which we measured participants’ available working memory capacity while they performed a simple non-Markovian task based on the 12-AX (Frank et al., 2001), and compared it with their estimated working memory capacity in a similar Markovian task. A smaller available working memory capacity in the non-Markovian condition would indicate that people commit part of their working memory resources to solving non-Markovian tasks, as hypothesised by working memory-based reinforcement learning models. We also examined whether participants’ learning success depended on the way they allocated their working memory resources over the course of the learning tasks.
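To make the non-Markovian structure concrete, the sketch below encodes the response rule of the standard 12-AX task as it is usually described following Frank et al. (2001): an X is a target only if it directly follows an A and the most recent digit was 1, and a Y is a target only if it directly follows a B and the most recent digit was 2. This is an illustrative assumption, not the simplified variant used in the experiment, whose exact rule is not given here. The function name and the "R"/"L" response labels are hypothetical.

```python
def twelve_ax_targets(sequence):
    """Return the correct response for each symbol of a 12-AX sequence.

    'R' marks a target, 'L' a non-target (labels chosen for illustration).
    Rule assumed here (standard 12-AX): X is a target after A when the
    last digit was 1; Y is a target after B when the last digit was 2.
    """
    context = None   # most recent digit seen ('1' or '2'); must be remembered
    previous = None  # symbol shown immediately before the current one
    responses = []
    for symbol in sequence:
        if symbol in ("1", "2"):
            context = symbol  # a new digit replaces the outer context
        is_target = (
            (context == "1" and previous == "A" and symbol == "X")
            or (context == "2" and previous == "B" and symbol == "Y")
        )
        responses.append("R" if is_target else "L")
        previous = symbol
    return responses


# The same local pair "A then X" requires different responses depending
# on a digit seen earlier, so the current symbol alone is not sufficient.
after_one = twelve_ax_targets(list("1AX"))
after_two = twelve_ax_targets(list("2AX"))
```

In this sketch the correct response to the final X differs between the two sequences, even though the current and preceding symbols are identical. Responding correctly therefore requires carrying the earlier digit forward, which is the kind of demand the gating models attribute to working memory.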

References

Ashby, F. G. and Spiering, B. J. (2004). The neurobiology of category learning. Behavioral and Cognitive Neuroscience Reviews, 3(2):101-113.

Collins, A. G., Brown, J. K., Gold, J. M., Waltz, J. A., and Frank, M. J. (2014). Working memory contributions to reinforcement learning impairments in schizophrenia. The Journal of Neuroscience, 34(41):13747-13756.

Collins, A. G. and Frank, M. J. (2012). How much of reinforcement learning is working memory, not reinforcement learning? A behavioral, computational, and neurogenetic analysis. European Journal of Neuroscience, 35(7):1024-1035.

DeCaro, M. S., Thomas, R. D., and Beilock, S. L. (2008). Individual differences in category learning: Sometimes less working memory capacity is better than more. Cognition, 107(1):284-294.

Frank, M. J., Loughry, B., and O’Reilly, R. C. (2001). Interactions between the frontal cortex and basal ganglia in working memory: A computational model. Cognitive, Affective, & Behavioral Neuroscience, 1(2):137-160.

Otto, A. R., Gershman, S. J., Markman, A. B., and Daw, N. D. (2013a). The curse of planning: Dissecting multiple reinforcement-learning systems by taxing the central executive. Psychological Science, 24(5):751-761.

Otto, A. R., Raio, C. M., Chiang, A., Phelps, E. A., and Daw, N. D. (2013b). Working-memory capacity protects model-based learning from stress. Proceedings of the National Academy of Sciences, 110(52):20941-20946.

O’Reilly, R. C. and Frank, M. J. (2006). Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia. Neural Computation, 18(2):283-328.

Todd, M. T., Niv, Y., and Cohen, J. D. (2009). Learning to use working memory in partially observable environments through dopaminergic reinforcement. In Advances in Neural Information Processing Systems, pages 1689-1696.

Zilli, E. A. and Hasselmo, M. E. (2008). Modeling the role of working memory and episodic memory in behavioral tasks. Hippocampus, 18(2):193-209.

Adnane Ez-zizi
Senior Lecturer in Artificial Intelligence

My research interests include reinforcement learning, educational data mining and AI, natural language processing, and computational modelling of human behaviour.