Understanding the Mathematics of the Brain

Insights to help actuaries build better models

Bryon Robidoux

The mathematics of the brain and its implications for the social sciences (such as economics, finance and, especially, insurance and actuarial science) have always intrigued me. Buckle up as I take you for a ride through fuzzy mathematics, information theory and neuroscience. Once I put the puzzle together, I will show why it has a profound impact and finally settles the debate between classical and behavioral economists. This article will also demonstrate how actuarial models need to be changed to better incorporate these insights.

Fuzzy Mathematics

Fuzzy mathematics is the mathematics of vagueness. The core of fuzzy mathematics is the idea that objects have a property, to some degree, because they are not sharply determined.1 An example of this is where a cloud begins and the sky around it ends. When looking from the ground, the answer seems so obvious; but when you get up close, the hard boundary seems to disappear. In language, a vague adjective is a word like “short,” “old” or “young.” I know a newborn is young and someone over 100 is old, but at what point does one go from being young to old? Is it at age 25, 35, 40, 65 or even older? The exact moment this change takes place is not well defined.

Classic sets, called crisp sets, are a special case of fuzzy sets.2 With crisp sets, an element either belongs to the set or it does not, whereas a fuzzy set can have partial membership. The membership function determines the degree to which the element belongs to the set. It can take any value in the interval [0, 1], where 0 means no membership and 1 means full membership. The interesting outcome is that fuzzy sets extend logic past the familiar binary (e.g., true and false) to ternary (e.g., white, gray and black) or beyond.
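To make the membership function concrete, here is a minimal sketch for the vague adjective "young." The breakpoints (full membership up to age 25, no membership from age 65, and a crisp cutoff at 40) are illustrative assumptions, not values taken from the cited literature.

```python
def young_membership(age: float, full_until: float = 25.0, zero_from: float = 65.0) -> float:
    """Degree (0 to 1) to which `age` belongs to the fuzzy set 'young'.

    The breakpoints are hypothetical and chosen only for illustration.
    """
    if age <= full_until:
        return 1.0
    if age >= zero_from:
        return 0.0
    # Membership falls linearly between the two breakpoints.
    return (zero_from - age) / (zero_from - full_until)


def crisp_young(age: float, cutoff: float = 40.0) -> float:
    """Crisp-set special case: membership is only ever 0 or 1."""
    return 1.0 if age < cutoff else 0.0


for age in (0, 25, 35, 45, 65, 100):
    print(age, round(young_membership(age), 2), crisp_young(age))
```

The crisp version jumps from 1 to 0 at a single age, while the fuzzy version lets a 45-year-old be "young" to degree 0.5.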

There are several notions of membership related to fuzzy sets.

  • Gradualness is the concept that many categories (in natural language) are a matter of degree, including truth.
  • Epistemic uncertainty (EU) is the concept of representing partial or incomplete information by sets.
  • Vagueness is the idea that the extension of natural language predicates lacks clear truth conditions.3

This article will focus on EU. It leads to possibility theory and modal logic, which depart from the probabilistic tradition. The membership function for EU represents a possibility distribution π.4
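A minimal sketch of how a possibility distribution π can be used, following the standard definitions from possibility theory: the possibility of an event is the largest π value among its outcomes, and its necessity is one minus the possibility of the complement. The outcome labels and π values below are hypothetical.

```python
def possibility(pi: dict[str, float], event: set[str]) -> float:
    """Possibility of an event: the largest pi value among its outcomes."""
    return max((pi[x] for x in event), default=0.0)


def necessity(pi: dict[str, float], event: set[str]) -> float:
    """Necessity of an event: 1 minus the possibility of its complement."""
    complement = set(pi) - event
    return 1.0 - possibility(pi, complement)


# Hypothetical possibility distribution for a person whose age is unknown.
# Unlike probabilities, the values need not sum to 1; the largest is 1.
age_group = {"young": 1.0, "middle-aged": 0.7, "old": 0.2}

print(possibility(age_group, {"old"}))                 # how possible is 'old'?
print(necessity(age_group, {"young", "middle-aged"}))  # how certain is 'not old'?
```

The gap between the possibility and the necessity of an event is one way to express how much EU remains about it.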

The person who best explained EU was Donald Rumsfeld. During the days leading up to the U.S. war with Iraq, he rationalized the lack of evidence linking Iraq to the supply of weapons of mass destruction to terrorist groups by stating that there are known knowns, known unknowns, unknown unknowns and unknown knowns. The known knowns are information you know with epistemic certainty. The known unknowns are information you know for certain but you do not cognize. The unknown unknowns are information that is completely epistemically uncertain and that you do not even fathom. The unknown knowns are the information you thought you knew but later learned you did not fully understand. Figure 1 shows the percentage of EU in each category.

Figure 1: Percentage of Epistemic Uncertainty With a Given Level of Knowledge

            Known              Unknown
Known       0                  > 0 and < 100
Unknown     >= 0 and < 100     100

EU has real-world physical consequences, but to understand why, we need to take a little journey through information theory.

Information Theory

Information theory is a subfield of mathematics concerned with transmitting data across a noisy channel—for example, how much data can be transmitted through a telephone line at one time. Information theory defines the term information entropy, which is the measure of the average uncertainty of information. When our uncertainty is reduced, we gain information, so information and entropy are two sides of the same coin.5
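As a small illustration of entropy as average uncertainty, the sketch below computes Shannon entropy in bits for a few discrete distributions. The example probabilities are assumptions chosen only to show the effect.

```python
import math


def shannon_entropy(probs: list[float]) -> float:
    """Average uncertainty, in bits, of a discrete probability distribution."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with zero probability contribute nothing to the sum.
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


print(shannon_entropy([0.5, 0.5]))  # fair coin: the most uncertain two-outcome case
print(shannon_entropy([0.9, 0.1]))  # biased coin: less uncertainty
print(shannon_entropy([1.0, 0.0]))  # certain outcome: no uncertainty
```

Moving from the fair coin to the certain outcome reduces entropy from one bit to zero, which is the sense in which reducing uncertainty is the same as gaining information.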

Through the Landauer limit, there is a direct connection between the entropy of thermodynamics and the entropy of information. No matter how efficient any physical device is (e.g., a computer or our brains), it can acquire one bit of information only if it expends at least kT ln 2 ≈ 0.693 kT joules of energy,6 where k is the Boltzmann constant (1.38 × 10^-23 joules per Kelvin) and T is the temperature in Kelvin. The Landauer limit increases with temperature because the information communicated must overcome random noise and fluctuations.7
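As a worked instance of this bound, assume a body temperature of roughly 310 Kelvin (the value used in Figure 2). The minimum energy per bit is then approximately:

```latex
E_{\min} = k\,T \ln 2
\approx 0.693 \times \left(1.38 \times 10^{-23}\ \mathrm{J/K}\right) \times 310\ \mathrm{K}
\approx 2.96 \times 10^{-21}\ \mathrm{J\ per\ bit}
```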

There are two main sources of uncertainty that cause noise:

  1. Information entropy due to a random set of outcomes
  2. EU due to incomplete information

The common thread is information, which means they both have consequences due to thermodynamics. Let’s now move to how the brain processes information and uses energy.


Suzana Herculano-Houzel's TED Talk, “What Is So Special About the Human Brain?”, does a good job of explaining how our brains consume energy. She talks about how our brains are larger than they should be relative to the size of our bodies. Even though our brains are 2 percent of our body mass, they use 25 percent of our body’s energy, about 516 kCal/day. The computation power of the brain is a simple linear function of the number of neurons, not its size. Humans have more neurons than any other animal on the planet, so they have more processing power. We have, on average, 86 billion neurons in our brains. Sixteen billion are in the cerebral cortex, which is the part of the brain responsible for our higher-level thinking.

If other primates are larger than we are, then why do they have smaller brains? There is a tradeoff between the amount of energy required to power all of the neurons in the brain versus the amount of energy required to fuel the body. These two competing forces are constrained by the number of calories an animal can reasonably consume each day. Other primates eat raw vegetables, which take a lot of energy to break down in the digestive tract. Humans are the only animals on Earth that cook their food. Cooking softens the food and uses energy outside of the body to partially predigest it, allowing us to consume far more calories in a day to power our brains.8 If we ate like other primates, without cooking, our brains would be a huge liability!

With this knowledge of information theory and the amount of energy used by the brain, it is possible to calculate the theoretical maximum number of gigabytes of information the brain can process per day (see Figures 2, 3 and 4). This does not account for the frictions in the brain due to heat lost by thermodynamics in the chemical reactions, but even if efficiency is 50 percent, the computational power is quite astounding. It is amazing to see that a vast majority of the neurons in the brain are dedicated to activities outside of higher-level thinking. I call this the operating system.

Figure 2: Thermodynamics of the Brain Minimum Cost of Information

Variables                        Values
Constant (ln 2)                  0.693
Boltzmann constant (J/K)         1.38 × 10^-23
Body temperature in Kelvin       310
Joules/bit                       2.96465 × 10^-21

Figure 3: Brain Energy Consumption

Variables                                      Values
Total neurons                                  86,000,000,000
Cerebral cortex                                16,000,000,000
Operating system                               70,000,000,000
kCal/billion neurons/day                       6
Joules/kCal                                    4,184
kCal total brain/day                           516
kCal cerebral cortex/day                       96
Joules/day total brain                         2,158,944
Joules/day cerebral cortex                     401,664
Portion used for signal processing             80%
Joules/day for computation total brain         1,727,155
Joules/day for computation cerebral cortex     321,331

Figure 4: Theoretical Brain Maximum Computation Limits

Variables                              Values
Bits/day calculated for our brains     582,582,385,667,940,000,000,000,000
Bytes/day                              72,822,798,208,492,500,000,000,000
Gigabytes/day total brain              72,822,798,208,492,500
Gigabytes/day cerebral cortex          13,548,427,573,673,000
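The arithmetic behind Figures 2, 3 and 4 can be reproduced with a short sketch. The constants come from the figures; the function and variable names are my own, and a gigabyte is taken to be 10^9 bytes, which is what the figures imply.

```python
LN2_ROUNDED = 0.693                    # rounded ln(2), as in Figure 2
BOLTZMANN_J_PER_K = 1.38e-23           # Boltzmann constant, as in Figure 2
BODY_TEMP_K = 310                      # body temperature in Kelvin
KCAL_PER_BILLION_NEURONS_PER_DAY = 6   # Figure 3
JOULES_PER_KCAL = 4184                 # Figure 3
SIGNAL_PROCESSING_SHARE = 0.80         # Figure 3


def joules_per_bit() -> float:
    """Landauer minimum energy per bit at body temperature (Figure 2)."""
    return LN2_ROUNDED * BOLTZMANN_J_PER_K * BODY_TEMP_K


def max_gigabytes_per_day(neurons_in_billions: float) -> float:
    """Theoretical maximum gigabytes/day for a given neuron count (Figure 4)."""
    kcal_per_day = neurons_in_billions * KCAL_PER_BILLION_NEURONS_PER_DAY
    joules_for_computation = kcal_per_day * JOULES_PER_KCAL * SIGNAL_PROCESSING_SHARE
    bits_per_day = joules_for_computation / joules_per_bit()
    return bits_per_day / 8 / 1e9  # bits -> bytes -> gigabytes


print(f"Joules/bit:              {joules_per_bit():.5e}")
print(f"GB/day, total brain:     {max_gigabytes_per_day(86):.4e}")  # about 7.28e16
print(f"GB/day, cerebral cortex: {max_gigabytes_per_day(16):.4e}")  # about 1.35e16
```

Halving the result gives the 50 percent efficiency case, which still leaves tens of quadrillions of gigabytes per day.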

The energy use by the brain is quite stable.9 This is due to what is perceived as System 1 and System 2 thinking. System 1 is fast passive thinking, which is cheaper energy-wise. System 2 thinking is slow methodical thinking, which requires a lot more energy.10 It is hard for a person to stay engaged in System 2 thinking for any length of time. The brain wants to naturally put you back in System 1, which helps to minimize and stabilize energy needs.

Composition of the Brain

In his book, A Thousand Brains,11 Jeff Hawkins has several insights about low-level brain mechanics.

  • Tasks we associate with intelligence, which on the surface appear to be different, are manifestations of the same cortical algorithm.
  • The brain’s ubiquitous algorithm is a prediction of the world around us.
  • Most predictions happen inside the neurons.
  • The brain is a distributed calculation.

Within the book, Hawkins explains the basic structure of the brain: “The neocortex [a sub portion of the cerebral cortex] is a sheet of tissues about the size of a large napkin. It is divided into dozens of regions that do different tasks. Each region is divided into thousands of columns. Each column is composed of several hundred hair-like mini columns, which consist of a little over 100 [neuron] cells each … The neurons have a thousand to tens of thousands of synapses spaced along the branches of the dendrites. [The dendrites receive signals from other cells. They are the hub of connectivity for each cell. The synapses are the connections to other neurons.]”12

The aggregation of the mini columns and columns is used to represent partial or whole objects. The 1,000 brains concept derives from the fact that the brain does not have one representation of an object, but it potentially has 1,000 or more representations of an object. The representations are not for redundancy, but for slightly different points of view. When I say the words “little red Corvette,” do you think of the car, the song or the Hot Wheels?

Putting It All Together

Here is where it gets exciting! EU is the inability to totally infer that an item belongs to a set due to a lack of information. We know that, due to thermodynamics, to gain more information requires more energy. Hence there is a tradeoff between crispness of information and energy consumption—certainty comes at a cost. Given the brain must minimize energy consumption13 while trying to maximize computational output, the brain must be doing fuzzy mathematics. By dealing with less than complete information, it requires less data to be pumped through our sensory system and less energy for computation, while also speeding up the computation process. (To learn more about how at least one brain process maps back to fuzzy mathematics, read “Modeling Human Thinking About Similarities by Neuromatrices in the Perspective of Fuzzy Logic.”)

In more relatable terms, our brains are doing at a micro level what we are doing at a macro level. Imagine you have $100,000 to buy assets. Before you buy your portfolio, you are going to do a lot of research. There is a tradeoff between the EU that is eliminated versus the time and energy spent to get more information. There comes a point when you must lay a stake in the ground and say this is good enough—otherwise you suffer from analysis paralysis. This means we should change the adage from “trust your gut” to “trust your fuzz!”

Let us look at it another way. As we learn, the synapses thicken to make the connections between the neurons stronger. When a synapse is thicker, the electrical current between two neurons flows more easily.14 The thickness of a synapse encodes the possibility that the neuron belongs to the same set as the connected neuron. The easier the flow of electricity, the greater the possibility of belonging. This implies the neuron is an element of a fuzzy set and the synapse plays the part of the membership function. The fuzzy sets are the different representations of objects by the columns and mini columns, each of which Hawkins describes as one of the 1,000 brains.

Furthermore, our senses receive tons of fuzzy information from the outside world. The brain uses parallel processing to make predictions based on the possibility of the data belonging to different fuzzy sets. This process boils down to a defuzzification algorithm for making crisp decisions.15 I hypothesize that if the brain used only crisp binary logic, then the thickness of the synapses would all be uniform, with no connection and connection being 0 and 1, respectively.
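To show what a defuzzification step can look like, here is a minimal sketch of the centroid method, one common technique in fuzzy logic. It is offered only as an illustration: nothing here establishes which method, if any, the brain uses, and the candidate values and membership degrees are hypothetical.

```python
def centroid_defuzzify(values: list[float], memberships: list[float]) -> float:
    """Collapse a discrete fuzzy set into one crisp value.

    The result is the membership-weighted average of the candidate values.
    """
    if len(values) != len(memberships):
        raise ValueError("values and memberships must have the same length")
    total = sum(memberships)
    if total == 0:
        raise ValueError("at least one membership degree must be positive")
    return sum(v * m for v, m in zip(values, memberships)) / total


# Hypothetical example: candidate percentages of a portfolio to put in equities,
# each with a degree of membership in the fuzzy set "sensible allocation".
candidates = [0.0, 25.0, 50.0, 75.0, 100.0]
degrees = [0.0, 0.2, 1.0, 0.6, 0.1]

print(round(centroid_defuzzify(candidates, degrees), 2))
```

The output is a single number that can be acted on, even though every input was a matter of degree.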

Financial and Economic Implications

The brain being a fuzzy calculator has massive implications for economics and finance. Economics and finance are built on the premise of crisp sets and binary logic. Let us tackle the most elusive of all economic theories: the efficient market hypothesis (EMH), which behavioral economists have been disputing for years. (You can read more about the relationship between fuzzy mathematics and behavioral finance in the paper “Fuzzy Logic and Behavioral Finance: A Connection.”)

Market efficiency refers to the degree to which market prices reflect all available relevant information. If markets are efficient, then all information is already incorporated into prices, so there is no way to “beat” the market—there are no undervalued or overvalued securities available.

  • The weak form of EMH states all past information is priced into securities.
  • The semi-strong form of EMH implies neither fundamental analysis nor technical analysis can provide an advantage. It also suggests new information is instantly priced into securities.
  • The strong form of EMH says all information, both public and private, is priced into stocks. Therefore, no investor can gain advantage over the market.16

To show the EMH is in trouble, at least two conditions must be met:

  1. Noise traders (who are not the rational, knowledgeable traders or investors commonly assumed in finance theory17) must be systematic. There must be a herd mentality.
  2. Noise traders need to survive a significant period of time and make substantial profits under some conditions.18

Given everything we have learned to this point, do you spot the flaw with EMH? I have just shown that, by the laws of physics, receiving and processing information requires energy. EMH has a huge hole in it because it does not consider the thermodynamics required to consume, store and process the information. Even large institutional investors who use supercomputers for algorithmic trading cannot use all information instantaneously because of information entropy, and they cannot process it all because of thermodynamics. It is even less plausible for individual investors or fund managers making trades. EMH assumes information is free, yet it is anything but.

The problem boils down to what the definition of “all relevant” is. What do all relevant prices and information really mean? Is it all prices at the end of the month, day, hour, second, millisecond or nanosecond? Does the past price data need to start at the beginning of the company, the U.S. stock market, the beginning of the United States or the beginning of human bartering? The meaning of “all relevant” is an extremely fuzzy concept once it is poked at a bit.

“All relevant” is postulated to mean whatever is required to make the information a crisp set (i.e., rational) with no EU (i.e., knowledgeable). But given EU’s existence, can any set of market data truly be crisp? To have full epistemic certainty of something could require an infinite amount of information, and therefore energy. Hence, no one is the rational, epistemically certain trader commonly assumed in finance theory, which puts EMH in serious trouble. Everyone is a noise trader, to some degree, due to EU governed by the laws of thermodynamics. One person’s useless information is another person’s golden ticket! It is hard to find fund managers who dependably beat the market because it is difficult to consistently overcome uncertainty. Entropy is always increasing.

The price of an asset does not result from an efficient market in which infinite wisdom converges on one price; rather, it results from an inefficient consensus based on less-than-perfect information. This causes the rate of change in prices to move more than the rate of change in information. The less information that is available, the more price volatility. The concepts of undervalued or overvalued only make sense relative to the consensus. It is now obvious why examples such as the 2021 GameStop fiasco are easy to find.

This further brings the definition of value into question from the definition of EMH. The law of one price, given market efficiency, states that two identical assets that have a costless exchange should have the same price. But with EU and randomness, can you ever be certain that two assets are identical? Even without uncertainty, identical is a fuzzy word because even a minute perceived nonphysical difference can have a radical impact on the price of a good. This means there must be a continuum of prices within a fuzzy range based on individual perceptions of value. This really complicates calculations for actuaries because we live and breathe taking the present value of cash flows, which is reliant on the law of one price.19
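One way to picture a fuzzy range of values, rather than a single price, is to let the discount rate be a triangular fuzzy number and take alpha-cuts of the resulting present value. This is a minimal sketch under stated assumptions: the triangular shape, the rates and the cash flows are all hypothetical, and it is not a method proposed in the sources cited here.

```python
def present_value(cash_flows: list[float], rate: float) -> float:
    """Present value of end-of-year cash flows at a single crisp rate."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))


def pv_alpha_cut(cash_flows: list[float], low: float, mode: float, high: float,
                 alpha: float) -> tuple[float, float]:
    """Present value interval at membership level `alpha` (0 to 1).

    The discount rate is a triangular fuzzy number (low, mode, high).
    Assumes non-negative cash flows, so present value falls as the rate rises.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if any(cf < 0 for cf in cash_flows):
        raise ValueError("this sketch assumes non-negative cash flows")
    rate_low = low + alpha * (mode - low)
    rate_high = high - alpha * (high - mode)
    # The highest rate gives the lowest value, and vice versa.
    return present_value(cash_flows, rate_high), present_value(cash_flows, rate_low)


flows = [100.0, 100.0, 100.0]  # hypothetical annual cash flows
for alpha in (0.0, 0.5, 1.0):
    lo, hi = pv_alpha_cut(flows, low=0.02, mode=0.04, high=0.07, alpha=alpha)
    print(alpha, round(lo, 2), round(hi, 2))
```

At alpha = 1 the interval collapses to the familiar single present value at the modal rate; at lower alpha levels it widens into a range of plausible values.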


Even though this article was a crazy, wild ride, it led to three important insights:

  1. The key driver of human behavior and the market is ultimately thermodynamics.
  2. The world is fuzzy, and therefore your brain is a fuzzy calculator that does soft computing.
  3. In the game of life, uncertainty is everywhere due to randomness and imperfect information.

To my satisfaction, I settled the debate between a traditional economist modeling Econs and a behavioral economist modeling Humans. The behavioral economist was the ultimate victor. But the behavioralist brought the wrong mathematics to the fight because they tried to append the crisp models of traditional economics. The behavioralist should have brought thermodynamics and fuzzy mathematics to the brawl because they are the ultimate mechanics of the brain.

The consequence of our brains using fuzzy logic is that Humans’ System 1 behavior is not based on the Econs’ binary logic, but rather on ternary or higher logic. Ternary or higher logic looks strange when viewed through the lens of binary logic. It is equivalent to trying to understand the motion of the planets by assuming Earth is at the center of the solar system. Humans are rational, but they are limited by strict energy constraints and therefore must deal with EU, which does not exist for Econs in their frictionless, crisp world.

You might ask, why does the world seem so crisp? It is because our brains build models of the world. As actuaries, we know that models are just an approximation of the actual world.

I believe we see the world as crisp for two reasons. First, when our brain needs to take an action, the information ultimately goes through a defuzzification process to make a crisp decision. Second, as our brains learn and become more familiar with a topic, they give more weight to the possibility. The higher possibility is perceived as a familiar outcome that appears crisp. To read more about fuzzy mathematics in insurance, read the Society of Actuaries’ (SOA’s) Risk Assessment Applications of Fuzzy Logic.

Bryon Robidoux, FSA, CERA, is actuary for The Standard. He is also a contributing editor for The Actuary.

Statements of fact and opinions expressed herein are those of the individual authors and are not necessarily those of the Society of Actuaries or the respective authors’ employers.

Copyright © 2021 by the Society of Actuaries, Chicago, Illinois.