This post was originally published on The Neurocritic Blog on March 27, 2016.
The word dopamine has become a shorthand for positive reinforcement, whether it’s from fantasy baseball or a TV show.

But did you know that a subset of dopamine (DA) neurons originating in the ventral tegmental area (VTA) of the midbrain respond to obnoxious stimuli (like footshocks) and regulate aversive learning?

Sometimes the press coverage of a snappy dopamine paper can be positive and (mostly) accurate, as was the case with a recent paper on risk aversion in rats (Zalocusky et al., 2016).
This study showed that rats who like to “gamble” on getting a larger sucrose reward have a weaker neural response after “losing.” In this case, losing means choosing the risky lever, which dispenses a low amount of sucrose 75% of the time (but a high amount 25%), and getting a tiny reward. The gambling rats will continue to choose the risky lever after losing. Other rats are risk-averse, and will choose the “safe” lever with a constant reward after losing.

This paper was a technical tour de force with 14 multi-panel figures.¹ For starters, cells in the nucleus accumbens (a VTA target) expressing the D2 receptor (NAc D2R+ cells) were modified to express a calcium indicator that allowed the imaging of neural activity (via fiber photometry). Activity in NAc D2R+ cells was greater after loss, and during the decision phase of post-loss trials. And these two types of signals were dissociable.² Then optogenetic methods were used to activate NAc D2R+ cells on post-loss trials in the risky rats. This manipulation caused them to choose the safer option.
Now, there’s a boatload of data on the role of dopamine in reinforcement learning and computational models of reward prediction error (Schultz et al., 1997) and discussion about potential weaknesses in the DA and RPE model. So while a very impressive addition to the growing pantheon of laser-controlled rodents, the results of Zalocusky et al. (2016) aren’t massively surprising.
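For readers who haven’t seen a reward prediction error written down, here is a minimal sketch of the idea: the error is the reward received minus the reward expected, and the expectation is nudged toward the outcome on each trial. This is a textbook Rescorla-Wagner-style simplification, not the temporal-difference model of Schultz et al. (1997) and not an analysis from any paper discussed here. The function names, the learning rate, and the reward values are all illustrative assumptions.

```python
def rpe_update(value, reward, learning_rate=0.1):
    """Return (prediction_error, new_value) for a single trial.

    value: current reward expectation (hypothetical units)
    reward: reward actually received on this trial
    learning_rate: assumed step size between 0 and 1
    """
    prediction_error = reward - value  # RPE: outcome minus expectation
    new_value = value + learning_rate * prediction_error
    return prediction_error, new_value


def run_trials(rewards, learning_rate=0.1, initial_value=0.0):
    """Apply rpe_update over a sequence of rewards.

    Returns the list of per-trial prediction errors and the final expectation.
    """
    value = initial_value
    errors = []
    for reward in rewards:
        delta, value = rpe_update(value, reward, learning_rate)
        errors.append(delta)
    return errors, value


if __name__ == "__main__":
    # Three rewarded trials followed by an omitted reward (made-up values).
    errors, final_value = run_trials([1.0, 1.0, 1.0, 0.0], learning_rate=0.5)
    print(errors, final_value)
```

With repeated identical rewards the error shrinks toward zero as the expectation catches up, and an unexpectedly small reward produces a negative error. That sign flip after a worse-than-expected outcome is the part that matters for the “losing” trials described above.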
More surprising are two recent papers in the highly sought-after population of humans implanted with electrodes for seizure monitoring or treatment of Parkinson’s disease. I’ll leave you with quotes from these papers as food for thought.
1. Stenner et al. (2015). No unified reward prediction error in local field potentials from the human nucleus accumbens: evidence from epilepsy patients.
Signals after outcome onset were correlated with RPE regressors in all subjects. However, further analysis revealed that these signals were better explained as outcome valence rather than RPE signals, with gamble gains and losses differing in the power of beta oscillations and in evoked response amplitudes. Taken together, our results do not support the idea that postsynaptic potentials in the Nacc represent a RPE that unifies outcome magnitude and prior value expectation.
The next one is extremely impressive for combining deep brain stimulation with fast-scan cyclic voltammetry, a method that tracks dopamine fluctuations in the human brain!
2. Kishida et al. (2016). Subsecond dopamine fluctuations in human striatum encode superposed error signals about actual and counterfactual reward.
Dopamine fluctuations in the striatum fail to encode RPEs, as anticipated by a large body of work in model organisms. Instead, subsecond dopamine fluctuations encode an integration of RPEs with counterfactual prediction errors, the latter defined by how much better or worse the experienced outcome could have been. How dopamine fluctuations combine the actual and counterfactual is unknown. One possibility is that this process is the normal behavior of reward processing dopamine neurons, which previously had not been tested by experiments in animal models. Alternatively, this superposition of error terms may result from an additional yet-to-be-identified subclass of dopamine neurons.
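To make the two error terms in that quote concrete, here is a toy sketch. The quote itself says that how the actual and counterfactual terms combine is unknown, so the combination rule below (a weighted subtraction) is purely an assumption for illustration, as are the sign convention, the function names, and the numbers. It is not the model fitted by Kishida et al. (2016).

```python
def reward_prediction_error(outcome, expected):
    """Actual error: what happened minus what was expected."""
    return outcome - expected


def counterfactual_error(outcome, best_alternative):
    """Counterfactual error: how much better the outcome could have been.

    Sign convention is an assumption: positive means a missed gain.
    """
    return best_alternative - outcome


def superposed_signal(outcome, expected, best_alternative, cpe_weight=1.0):
    """Hypothetical combination of the two error terms.

    ASSUMPTION: a linear combination in which a missed gain pulls the
    signal down. cpe_weight=0 recovers a plain RPE.
    """
    rpe = reward_prediction_error(outcome, expected)
    cpe = counterfactual_error(outcome, best_alternative)
    return rpe - cpe_weight * cpe


if __name__ == "__main__":
    # Made-up trial: expected 0.5, received 1.0, could have received 2.0.
    print(superposed_signal(1.0, 0.5, 2.0))                  # combined signal
    print(superposed_signal(1.0, 0.5, 2.0, cpe_weight=0.0))  # plain RPE
```

The point of the toy is only that a better-than-expected outcome can still produce a small or negative combined signal when a much better outcome was available, which is one way a pure RPE account could fail to fit the data.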
As Addictive As Cupcakes – Mind Hacks (“If I read the phrase ‘as addictive as cocaine’ one more time I’m going to hit the bottle.”)
Dopamine Neurons: Reward, Aversion, or Both? – Scicurious
Back to Basics 4: Dopamine! – Scicurious (in fact, anything by Scicurious on dopamine)
Why Dopamine Makes People More Impulsive – Sofia Deleniv at Knowing Neurons
2-Minute Neuroscience: Reward System – video by Neuroscientifically Challenged
¹ For example:
Because decision-period activity predicted risk-preferences and increased before safe choices, we sought to enhance the D2R+ neural signal by optogenetically activating these cells during the decision period. An unanticipated obstacle (D2SP-driven expression of channelrhodopsin-2 eYFP fusion protein (D2SP-ChR2(H134R)-eYFP) leading to protein aggregates in rat NAc neurons) was overcome by adding an endoplasmic reticulum (ER) export motif and trafficking signal29 (producing enhanced channelrhodopsin (eChR2); Methods), resulting in improved expression (Extended Data Fig. 7). In acute slice recordings, NAc cells expressing D2SP-eChR2(H134R)-eYFP tracked 20-Hz optical stimulation with action potentials (Fig. 4c).
² The human Reproducibility Project: Psychology brigade might be interested to see Pearson’s r² = 0.86 in n = 6 rats.
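To put a rough number on that worry, here is a back-of-the-envelope sketch using the standard Fisher z-transform to get an approximate 95% confidence interval for a correlation. It assumes the correlation is positive, that the six data points are independent, and that the data are roughly bivariate normal; none of that is checked against the paper, and the function name is my own.

```python
import math


def pearson_r_confidence_interval(r, n, z_crit=1.96):
    """Approximate CI for Pearson's r via the Fisher z-transform.

    r: sample correlation (strictly between -1 and 1)
    n: number of independent observations (must exceed 3)
    z_crit: normal critical value (1.96 gives roughly 95% coverage)
    """
    if n <= 3:
        raise ValueError("Fisher z interval needs n > 3")
    z = math.atanh(r)                   # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)         # approximate standard error of z
    low = math.tanh(z - z_crit * se)    # back-transform the lower bound
    high = math.tanh(z + z_crit * se)   # back-transform the upper bound
    return low, high


if __name__ == "__main__":
    r = math.sqrt(0.86)  # ASSUMPTION: positive sign for the reported r squared
    low, high = pearson_r_confidence_interval(r, n=6)
    print(f"r = {r:.3f}, approx 95% CI = ({low:.2f}, {high:.2f})")
    print(f"implied r^2 range = ({low**2:.2f}, {high**2:.2f})")
```

Under these assumptions the interval on r is wide (roughly 0.5 to 0.99), which is the usual story for a large correlation estimated from a handful of animals.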
Kishida KT, Saez I, Lohrenz T, Witcher MR, Laxton AW, Tatter SB, White JP, Ellis TL, Phillips PE, Montague PR. (2016). Subsecond dopamine fluctuations in human striatum encode superposed error signals about actual and counterfactual reward. Proc Natl Acad Sci 113(1):200-5.
Schultz W, Dayan P, Montague PR. (1997). A neural substrate of prediction and reward. Science 275:1593–1599.
Stenner MP, Rutledge RB, Zaehle T, Schmitt FC, Kopitzki K, Kowski AB, Voges J, Heinze HJ, Dolan RJ. (2015). No unified reward prediction error in local field potentials from the human nucleus accumbens: evidence from epilepsy patients. J Neurophysiol. 114(2):781-92.
Zalocusky K, Ramakrishnan C, Lerner T, Davidson T, Knutson B, Deisseroth K. (2016). Nucleus accumbens D2R cells signal prior outcomes and control risky decision-making. Nature. DOI: 10.1038/nature17400