Thinking Deeper

Big data and the psychology of behavior and cognition Regina M. Tuma

In 2008, Chris Anderson made a bold proclamation about the end of theory. His article in Wired1 magazine signaled the end of our existing relationship with science and knowledge: theory, conjecture, hypothesis testing and sampling. Instead, infinite petabytes in puffy data clouds offer not a caricature of reality but unmediated access to reality itself. Powerful correlations and massive datasets too complex for human imagination make cause and effect irrelevant. What matters is what people do—not the meaning of what they do, nor the meaning attached to self-other behavior.

The application of big data has grown, leaving little in personal and professional life untouched, including employment decisions, personal creditworthiness and insurance rates.2,3,4 Society now relies on predictive policing and sentencing; education is increasingly data-driven; and municipalities deploy “smart city” initiatives based on data. Micro-marketing tactics bleed into electoral politics.5,6 And all of this is backed by the logic of lucrative data markets and an algorithmic decision industry that, as of 2016, was expected to grow by more than $2 trillion over the following 10 years.7

What Anderson announced to the world was the coming age of big data,8 not just as one method among many, but as the method and measure of all knowledge. As a psychologist, this sounded familiar as it took me to 1913 and John Watson’s bold manifesto for psychology, “Psychology as the Behaviorist Views It.”9 Psychology, according to Watson, would no longer be concerned with fuzzy psychological states that are not measurable. The data points for psychology would be behavior—what people do—publicly observable and countable. Psychology would now be a science concerned with the prediction and control of behavior.

Psychology’s version of big data and prediction came early, bringing this modern psychology into the fold of an artistically creative and increasingly urban modern society.10 Like big data, behaviorism had a practical focus on real world issues. By the time behaviorism evolved into B.F. Skinner’s Walden Two,11 there was a clear utopian vision attached to it, much like the utopian tones attached to digital technology and the creation of a better world.12,13 Like data scientists, behaviorists conceptualized the mind as a black box.14 Behaviorism was also riding the technology boom of its day, in a society that saw the development of the airplane, Ford’s automobile and Edison’s grid while sharing great faith in inventors and industrialists.15 Skinner was known for wanting to create a “technology of the mind” and was referred to as a “machine psychologist,” having created the air crib and the teaching machine as technologies of positive reinforcement.16,17,18

The study of cognition represents different approaches that emphasize the role of cognitive factors in shaping how we process information and make sense of self, others and events.

The Transition Away From Stimulus-Response

The behaviorist model was built on the contingency between stimulus and response. The old “mentalist” psychology had been based on a more or less scientific study of consciousness that assumed access to the contents of the mind. This mentalism included cognitive processes like memory, perception and motivational factors. It also extended to social psychological constructs like representations, attitudes, attributions and notions of self. These organismic variables mediate input and output. For example, Gestalt psychologists (the main challengers to behavioral theory) were very successful in demonstrating the work of cognition in perception: simple physical stimuli (light energy striking receptor cells in the retina) are transformed into organized, complex perceptions.19 For behaviorists, science could not be based on hypothesized, unobservable cognitive constructs; it could only be based on observable facts (behavior) through the contingency of an observable stimulus and an observable response. As psychologist John Mills explains, “The only real data were those that could be observed.”20 Again, this could easily be a statement about big data, where the behavior of interest becomes a series of countable digital traces in the form of online clicks and searches.

What is important, however, is the trajectory of what happened between Watson’s declaration and Skinner’s unrepentant behaviorism: the resurgence of intervening cognitive variables, mental maps and latent learning. This necessitated a movement away from the stimulus-response (S-R) model to a stimulus-organism-response (S-O-R) model known as neo-behaviorism. Neo-behaviorists like Hull were responding to the Gestalt challenge on perception. Hull tried to operationalize cognitive variables through quantification and abstract language: drives, fear, incentives and frustration.21 Meanwhile, Tolman’s behaviorism pointed to the role of learning without contingency on rewards and to cognitive variables in the form of cognitive maps.22 Stimulus selectivity (devaluation of, or preference for, a particular stimulus) undermined the neutrality of the input variable, pointing to the role of motivation in determining S-R relationships.23

Only Skinner remained unrepentant with his brand of radical behaviorism, and that is because his behaviorism accounted for the experiential quality of mental life even as he redefined these experiences as physical events. For Skinner, the experience of consciousness was not metaphysical but an objective (though internal and private) bodily event. According to Mills,24 Skinner seamlessly bridged the divide between cognitive states and behavior, though eventually he, too, would be challenged for the inadequacy of his theory of language and the hidden ghost-like specter of intentionality lurking within his approach.25,26

It is important to recognize that the study of cognition represents different approaches that emphasize the role of cognitive factors in shaping how we process information and make sense of self, others and events. New technology made it possible for psychologists to more directly measure what Watson and others had deemed “fuzzy mental states.” Computers provided a model of the human mind as an information processor. Not all cognitive research represents a departure from the behavioral approach, nor are all cognitive approaches extensions of it. Experimental approaches to cognition, in contrast to more qualitative approaches, adhere to methodological behaviorism and are co-extensions of neo-behavioral models that preserve a mediating role for cognitive variables.27 The difference, however, is that cognitive approaches explicate the cognitive malleability of the stimulus as it is filtered, processed and cognitively redefined. Behavior could be due to this psychological redefinition of the stimulus.28 An early example can be seen in Bruner’s study,29,30 which showed that poor children overestimated the size of coins relative to rich children. Objectively, the input (stimulus) is the same, but when filtered through affect, needs and socioeconomic status, there is a different outcome.

Unconscious cognitive biases can shape and influence the harvesting, selection and management of databases.

Big Data Meets Cognition

There are two ways in which cognition becomes relevant for big data. The first refers to unconscious cognitive processes that shape inferences and judgments. Kahneman and Tversky’s31 influential work “On the Psychology of Prediction” explores psychological shortcuts, or mini-algorithms, that lead to less-than-optimal results in gathering information and making judgments.32 This work is part of a broader literature on social inference that includes everything from attribution theory and the ways memory shapes raw data in human reasoning to less-than-optimal information-gathering strategies. This automaticity of thought has been extended to unconscious biases that are similarly employed, with detrimental effects, in social inference.33
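One of the judgment errors documented in that literature is base-rate neglect: intuitive predictions track how representative a case looks and ignore prior probabilities. A minimal sketch of the arithmetic, with hypothetical numbers chosen purely for illustration:

```python
# Base-rate neglect: people judge by resemblance and ignore priors.
# All probabilities below are invented for this illustration.
base_rate = 0.03    # P(engineer) in the population sampled
hit_rate = 0.90     # P(profile "sounds like an engineer" | engineer)
false_alarm = 0.30  # P(profile "sounds like an engineer" | not engineer)

# Correct Bayesian posterior P(engineer | profile)
posterior = (hit_rate * base_rate) / (
    hit_rate * base_rate + false_alarm * (1 - base_rate)
)
print(f"{posterior:.2f}")  # prints 0.08
```

The representativeness shortcut reports something near the 0.90 hit rate; the posterior that respects the base rate is closer to 0.08. The same arithmetic applies when an algorithm's "hits" are interpreted without regard to how rare the target condition is.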

There is emerging literature documenting the differential effects and impacts of big data algorithms that recreate biases in society: Google searches and the association of African-American names with ads for criminal search records,34 proprietary algorithms and algorithmic redlining,35 predictive policing and increased surveillance of minority communities,36 and bias in facial recognition algorithms.37 While there is industry hope that “objectivity” can be written into the code, the industry must remain open to understanding how unconscious cognitive biases can shape and influence the harvesting, selection and management of databases, and how these cognitive variables translate into human efforts to create algorithms. In other words, the data industry must be open to making the transition from thinking fast to thinking slow.38
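How encoded human bias propagates into an algorithm can be shown with a toy example (the data below are entirely invented for this sketch): if past decisions approved equally qualified applicants from two groups at different rates, a model that simply learns the historical approval rates reproduces the disparity.

```python
# Toy illustration of bias reproduction. Each record is
# (group, qualified, approved). Groups A and B are equally qualified,
# but past human decisions approved qualified B applicants less often.
history = (
    [("A", 1, 1)] * 80 + [("A", 0, 0)] * 20 +  # qualified A: all approved
    [("B", 1, 1)] * 48 + [("B", 1, 0)] * 32 +  # qualified B: 60% approved
    [("B", 0, 0)] * 20
)

def learned_rate(group):
    """Approval rate the 'model' memorizes for qualified applicants."""
    outcomes = [a for g, q, a in history if g == group and q == 1]
    return sum(outcomes) / len(outcomes)

print(learned_rate("A"))  # prints 1.0
print(learned_rate("B"))  # prints 0.6
```

Nothing in the code is "prejudiced"; the disparity lives entirely in the training data, which is why auditing the harvesting and selection of data matters as much as auditing the code.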

Not all cognitive variables are unconscious and automatic. There is established research on metacognitive factors, which include executive control, reflection and awareness.39 And there is now higher-order awareness in society about the role and impact of technology. This includes concerns over privacy and algorithmic overreach, as well as concerns over the role of technology firms in private life and the reshaping of public culture.40

As big data spreads, it enters into relational dynamics with individuals and communities on the receiving end of data-driven decisions and policy outcomes.

Moments of awareness have consequences.41,42 Until recently, it was difficult for individuals to understand the mediating role of algorithms in their lives and interactions. However, articles like “If You Are Not Paranoid, You Are Crazy”43 in The Atlantic bring home the reality that our gadgets are indeed watching and listening. Recent U.S. House and Senate hearings on the role of social media giants in the 2016 election also serve to alert the public to the extent to which social media companies watch, track and target users.44 Becoming aware of increased surveillance can prompt reactance and subversion,45 especially when it is perceived as a loss of freedom or privacy. This can lead to attempts to subvert the algorithm by creating subaltern modes of exchange on social media that avoid algorithmic detection: sub-tweeting, screen capture and hate-linking through images instead of links.46 Knowledge of widespread surveillance tempers willingness to express political opinions.47 Pervasive “dataveillance” threatens to turn lives into extended data collection labs, prompting the demand characteristics familiar from the social psychology of the experiment, such as good-subject behavior and social desirability effects.48,49 All of this can affect the integrity of any data collected.

It is important to emphasize that as big data spreads, it enters into relational dynamics with individuals and communities on the receiving end of data-driven decisions and policy outcomes. Yet to be explored are social cultural factors like stereotype threat50 that can be triggered as a result of increased surveillance in minority communities—or even in terms of the over-testing of minority children through data-driven education policies. When triggered, stereotype threat becomes a self-fulfilling prophecy in terms of the behavior elicited.


The Way Forward

Just as it is important not to caricature behaviorism, it is important to not dismiss big data methods. Big data is an important technology and method of knowledge creation. It is important to bring large numbers into the fold of social science concepts.51 At the same time, there is an inherent duality to big data. It can be imposed top-down on communities, or it can be used from the ground up—big numbers in the hands of communities can confer visibility where none or little existed.52 Organizations and civil action groups are increasingly becoming aware of both the value and limitations inherent in big data. For example, the group Data for Black Lives recognizes the power of data to “empower communities of color,” while at the same time critiquing the use of data as an “instrument of oppression, reinforcing inequality and perpetuating injustice.” As the group states, “Today, discrimination is a high-tech enterprise.”53

The Actuarial Connection

John Watson not only introduced behaviors as data points, he led the development of a science of measurement referred to as “Mensuration” to quantify experimental results. For this reason, John Watson is regarded by some as the first data scientist.

The lesson for big data, as it was for behaviorism, is that in the chain of human events there is ample room for complexity between stimulus and response, and between the input and output of data. The complexity of cognition will also extend to social dynamics as relational spaces that shape the relationship between big data, cognition and communities. It will be important to welcome big data into this future.

Regina M. Tuma, Ph.D., teaches psychology and big data at Fielding Graduate University, Media Psychology Ph.D. Program.

Copyright © 2019 by the Society of Actuaries, Chicago, Illinois.