The Bayesian Brain

Comments on A Free Energy Principle for the Brain

In this 2006 paper Karl Friston, of University College London, puts forward his unified theory of brain function. This theory is a further development of the Bayesian Brain school of neurology and behaviour, which posits that many brain operations follow the logic of inference and may be treated mathematically using the methods of Bayesian probability.

We might tend to assume that humans are the most highly rational organisms, but this may be misleading. Any of us who have played games with our pet dogs may well be impressed with their superior ability in many sensory/perceptual/reaction type games such as 'keep the ball from the master'. We may not think of this ability as rational but, as Friston shows in this paper, all adaptive systems, including a dog at play, must gather information about the external world, infer models of events and causes in the outside world based on this information, and take action based on these models. The degree of rationality involved during the 'infer' step will determine the degree of success in interacting with the external world. By this standard our dog is highly rational.

This should come as no surprise. There is, after all, an external world that exists with very little notice of personal, individual concerns. All organisms depend for their survival on performing in accord with this external reality. It is little wonder that mechanisms efficiently (rationally) adapting them to their environments have been built in by evolution from the ground up.

As Daniel Dennett noted:

Getting it right, not making mistakes, has been of paramount importance to every living thing on this planet for more than three billion years, and so these organisms have evolved thousands of different ways of finding out about the world they live in, discriminating friends from foes, meals from mates, and ignoring the rest for the most part.[i] 

The theory of the brain on which Friston's paper builds attempts to explain the brain's underlying rationality as essential to an adaptive system. In view of this theory we might be surprised by the notion that our non-conscious brain functions, including sensation, perception, memory, learning and many types of behaviour, may be the most highly rational of our brain functions. The conscious states we dwell in may be powerful but may, due to their recent evolutionary appearance, be poorly tuned to the world in which we live and thus only partially rational.

The paper's introduction leads off with a very interesting statement:

Our capacity to construct conceptual and mathematical models is central to scientific explanations of the world around us. Neuroscience is unique because it entails models of this model making procedure itself. There is something quite remarkable about the fact that our inferences about the world, both perceptual and scientific, can be applied to the very process of making those inferences: Many people now regard the brain as an inference machine that conforms to the same principles that govern the interrogation of scientific data.

This observation may suggest that there is an optimal method of knowing about and adapting to the reality in which we live: rational inference from data. Rational inference from data was discovered by natural selection hundreds of millions of years ago, when nervous systems were first being formed, and more recently it has been rediscovered by cultural evolution with its building of science. Further, it may suggest that science is evolving to be to culture what brain functions are to the individual organism: its primary adaptive mechanism.

Friston begins the body of his paper by defining the general characteristics of an adaptive system as one that can react to changes in its environment by changing its interactions with the environment to optimize the results for itself.

Let's say we see something out of the corner of our eye that might be important. Because we cannot see the object very well we may have trouble assigning good probabilities to the numerous imaginable possibilities. As adaptive systems we would turn our head and/or eyeballs and bring the thing into focus. We would change the way we interact with our environment by bringing a potentially important aspect of it into focus. Now we are better informed and in a better position to optimize our response.
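The glance-then-focus example can be put in numbers. What follows is only a rough sketch: the three hypotheses, the prior and both sets of likelihoods are invented for illustration and come from neither Friston's paper nor any experiment, and the two looks are assumed to be independent pieces of evidence. It simply applies Bayes' rule twice, once for the blurry peripheral glimpse and once for the focused look.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a discrete set of hypotheses.

    prior[h] is P(h); likelihood[h] is P(observation | h).
    """
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalised.values())  # P(observation)
    return {h: p / evidence for h, p in unnormalised.items()}


# Hypothetical beliefs before anything has been seen.
prior = {"cat": 0.5, "shadow": 0.3, "car": 0.2}

# A blurry peripheral glimpse is almost equally likely under every
# hypothesis, so it barely discriminates between them (made-up values).
peripheral_glimpse = {"cat": 0.40, "shadow": 0.35, "car": 0.30}

# A focused look is far more diagnostic (made-up values).
focused_look = {"cat": 0.90, "shadow": 0.05, "car": 0.05}

after_glimpse = posterior(prior, peripheral_glimpse)
after_focus = posterior(after_glimpse, focused_look)

for name, belief in [("glimpse", after_glimpse), ("focus", after_focus)]:
    print(name, {h: round(p, 3) for h, p in belief.items()})
```

The glimpse leaves the probabilities close to the prior, while the focused look concentrates them on a single hypothesis. That is the sense in which turning the head puts us in a better position to respond.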

Most of us would view his definition as a fair one. Certainly any system that operated according to the definition should be considered 'adaptive'.

He examines an adaptive system in terms of three features or variables: the system itself, the effect of the environment on the system and the effect of the system on the environment. Friston then expands on his definition and claims that it is equivalent to one stating that the system attempts to change the effect of the environment on the system, either by changing itself or by changing its effects on the environment. He then makes a further shift and claims that a third definition is also equivalent: an adaptive system is one that minimizes unlikely or surprising exchanges with the environment. In other words, the best strategy for an adaptive system is always the one where, whatever the environment's effect on the system, that effect was expected. We might take this to describe a system whose internal models are 'in tune' with external reality.

It does not seem to me that the logic of the third definition is equivalent to the logic of the first. The third definition talks only of making an internal model (expectation) as close as possible to what actually happens. What happened to optimizing for our benefit, and what happened to the effect we have on the environment? I assume Friston takes it for granted that those issues have been dealt with by other processes, and that the expectation we have and the actual outcome have both already been optimized, the actual outcome by the internal changes we have made to our system or by changes we have made to our effect on the environment. Yes, in this scheme we can and do manipulate how the environment affects us so that we can optimize what actually happens. Given this assumption I think his third definition holds.

He is able to express the logic inherent in this third definition as a mathematical function using the form of Bayesian probability if he introduces one more variable to denote the unknown environmental forces that cause the effect the environment has on the system. Once he has done that he moves quickly to deduce a physical characteristic of adaptive systems that must be minimized in order for his third definition to hold: free energy. Unfortunately that formula contains the new variable, the unknown environmental causes, and this looks deadly: how could the adaptive system operate in this manner if it must have, as a precondition, the answer to the very puzzle it is trying to figure out (adapt to)?

Amazingly, with a little mathematical sleight of hand, Friston escapes this problem. He shows that while the system is unable to calculate the exact free energy function, it is able to compute a bound on that function. That bound can be computed without any reference to the new variable (unknown causes in the external world). After some further mathematical contortions, a link in the form of a probability density connects states of the adaptive system to the unknown environmental causes. In other words, he shows that a necessary characteristic of an adaptive system is the possession of a model of the outside world. In Friston's words:

The free-energy formulation in Eq. 3 has a fundamental implication: systems that minimise the surprise of their interactions with the environment by adaptive sampling can only do so by optimising a bound, which is a function of the system's states. Formulating that bound in terms of Jensen's inequality requires that function to be a probability density, which links the system's states to the hidden causes of its sensory input. In other words, the system is compelled to represent the causes of its sensorium. This means adaptive systems, at some level, represent the state and causal architecture of the environment in which they are immersed. Conversely, this means that causal regularities in the environment are transcribed into the system's configuration.
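The bound described in this passage can be checked numerically on a toy problem. The sketch below uses the standard variational form, in which free energy equals surprise plus a non-negative divergence between the system's density over causes and the true posterior. The discrete model, its numbers and the function names are illustrative assumptions and are not taken from Friston's paper.

```python
import math


def surprise(joint):
    """-ln p(y): needs a sum over every hidden cause."""
    return -math.log(sum(joint))


def free_energy(q, joint):
    """F(q) = sum_c q(c) * [ln q(c) - ln p(y, c)].

    q is the system's own probability density over hidden causes;
    joint[c] is p(y, c) under the system's model of the world.
    """
    return sum(
        qc * (math.log(qc) - math.log(jc))
        for qc, jc in zip(q, joint)
        if qc > 0
    )


# Hypothetical model with three hidden causes and one fixed observation y.
prior = [0.5, 0.3, 0.2]          # p(c)
likelihood = [0.9, 0.05, 0.05]   # p(y | c)
joint = [p * l for p, l in zip(prior, likelihood)]

evidence = sum(joint)
exact_posterior = [j / evidence for j in joint]
rough_guess = [1 / 3, 1 / 3, 1 / 3]

print("surprise               :", surprise(joint))
print("free energy, rough q   :", free_energy(rough_guess, joint))
print("free energy, posterior :", free_energy(exact_posterior, joint))
```

With the rough guess the free energy sits above the surprise. As the density over causes approaches the true posterior the gap closes. Lowering the bound therefore pushes the system's states toward an accurate representation of the hidden causes, which is the point the quotation makes.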

Perhaps the coolest thing about this theory is that the internal model that must be developed on the basis of mathematical inference from sensory data is exactly analogous to science. The theory posits that the brain unconsciously smooths the system's challenges with the environment by being a good scientist and building accurate models or theories concerning the operation of its environment through making rational inferences from sensory data. When we turn our head and/or eyeballs to better focus on that thing we saw in our peripheral vision, our brain has unconsciously taken action to gather data with which to decide how best to weigh the probabilities of the various competing hypotheses contained in its model.

Thus Friston has arrived at a powerful, mathematically tractable model of brain function. It utilizes inferential logic (Bayesian probability), and with his Free Energy principle he is able to compute and predict a wide range of measurable neurological results.

This model of brain function can explain a wide range of anatomical and physiological aspects of brain systems; for example, the hierarchical deployment of cortical areas, recurrent architectures using forward and backward connections and functional asymmetries in these connections (Angelucci et al., 2002a; Friston, 2003). In terms of synaptic physiology, it predicts associative plasticity and, for dynamic models, spike-timing-dependent plasticity. In terms of electrophysiology it accounts for classical and extra-classical receptive field effects and long-latency or endogenous components of evoked cortical responses (Rao and Ballard, 1998; Friston, 2005). It predicts the attenuation of responses encoding prediction error with perceptual learning and explains many phenomena like repetition suppression, mismatch negativity and the P300 in electroencephalography. In psychophysical terms, it accounts for the behavioural correlates of these physiological phenomena, e.g., priming, and global precedence (see Friston, 2005 for an overview). It is fairly easy to show that both perceptual inference and learning rest on a minimisation of free energy (Friston, 2003) or suppression of prediction error (Rao and Ballard, 1998).

Wow, it explains everything! Seriously, this theory does seem to explain a great deal of the data that has been gathered on brain function and has energized and excited many of the researchers in the field.


[i] Dennett, D. (1995). Darwin's Dangerous Idea. Touchstone Publishing, New York.