Visual Perception Theory In Psychology

Saul McLeod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul McLeod, PhD, is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.


What is Visual Perception?

To receive information from the environment, we are equipped with sense organs, e.g., the eye, ear, and nose.  Each sense organ is part of a sensory system that receives sensory inputs and transmits sensory information to the brain.

A particular problem for psychologists is explaining how the physical energy received by sense organs forms the basis of perceptual experience. Sensory inputs are somehow converted into perceptions of desks and computers, flowers and buildings, cars and planes, into sights, sounds, smells, tastes, and touch experiences.

A major theoretical issue on which psychologists are divided is the extent to which perception relies directly on the information present in the environment.  Some argue that perceptual processes are not direct but depend on the perceiver’s expectations and previous knowledge as well as the information available in the stimulus itself.


This controversy is discussed with respect to Gibson (1966), who has proposed a direct theory of perception which is a “bottom-up” theory, and Gregory (1970), who has proposed a constructivist (indirect) theory of perception which is a “top-down” theory.

Psychologists distinguish between two types of processes in perception: bottom-up processing and top-down processing.

Bottom-up processing is also known as data-driven processing because perception begins with the stimulus itself. Processing is carried out in one direction from the retina to the visual cortex, with each successive stage in the visual pathway carrying out an ever more complex analysis of the input.

Top-down processing refers to the use of contextual information in pattern recognition. For example, understanding difficult handwriting is easier when reading complete sentences than reading single and isolated words. This is because the meaning of the surrounding words provides a context to aid understanding.

Gregory (1970) and Top-Down Processing Theory


Psychologist Richard Gregory (1970) argued that perception is a constructive process that relies on top-down processing.

Stimulus information from our environment is frequently ambiguous, so to interpret it we require higher cognitive information, either from past experiences or stored knowledge, in order to make inferences about what we perceive. Helmholtz called this the ‘likelihood principle’.

For Gregory, perception is a hypothesis which is based on prior knowledge. In this way, we are actively constructing our perception of reality based on our environment and stored information.

  • A lot of information reaches the eye, but much is lost by the time it reaches the brain (Gregory estimates about 90% is lost).
  • Therefore, the brain has to guess what a person sees based on past experiences. We actively construct our perception of reality.
  • Richard Gregory proposed that perception involves a lot of hypothesis testing to make sense of the information presented to the sense organs.
  • Our perceptions of the world are hypotheses based on past experiences and stored information.
  • Sensory receptors receive information from the environment, which is then combined with previously stored information about the world which we have built up as a result of experience.
  • The formation of incorrect hypotheses will lead to errors of perception (e.g., visual illusions like the Necker cube).

Supporting Evidence

In demonstrations such as Gregory’s hollow mask, there seems to be an overwhelming need to reconstruct the face as a normal, convex one, similar to Helmholtz’s description of “unconscious inference”: an assumption based on past experience.

Perceptions can be ambiguous

[Figure: the Necker cube]

The Necker cube is a good example of this. When you stare at the crosses on the cube, the orientation can suddenly change or “flip.”

It becomes unstable, and a single physical pattern can produce two perceptions.

Gregory argued that this object appears to flip between orientations because the brain develops two equally plausible hypotheses and is unable to decide between them.

When the perception changes even though there is no change in the sensory input, the change of appearance cannot be due to bottom-up processing. It must be driven top-down by the prevailing perceptual hypothesis of what is near and what is far.

Perception allows behavior to be generally appropriate to non-sensed object characteristics.

Critical Evaluation of Gregory’s Theory

1. The Nature of Perceptual Hypotheses

If perceptions make use of hypothesis testing, the question can be asked, “what kind of hypotheses are they?” Scientists modify a hypothesis according to the support they find for it, so are we, as perceivers, also able to modify our hypotheses? In some cases, it would seem the answer is yes.  For example, look at the figure below:

[Figure: an ambiguous arrangement of black shapes containing a hidden face]

This probably looks like a random arrangement of black shapes. In fact, there is a hidden face in there; can you see it? The face is looking straight ahead and is in the top half of the picture in the center.  Now can you see it?  The figure is strongly lit from the side and has long hair and a beard.

Once the face is discovered, very rapid perceptual learning takes place and the ambiguous picture now obviously contains a face each time we look at it. We have learned to perceive the stimulus in a different way.

Although in some cases, as in the ambiguous face picture, there is a direct relationship between modifying hypotheses and perception, in other cases, this is not so evident.  For example, illusions persist even when we have full knowledge of them (e.g., the inverted face, Gregory 1974).

One would expect that the knowledge we have learned (from, say, touching the face and confirming that it is not “normal”) would modify our hypotheses in an adaptive manner. The current hypothesis testing theories cannot explain this lack of a relationship between learning and perception.

2. Perceptual Development

A perplexing question for the constructivists who propose perception is essentially top-down in nature is “how can the neonate ever perceive?”  If we all have to construct our own worlds based on past experiences, why are our perceptions so similar, even across cultures?  Relying on individual constructs for making sense of the world makes perception a very individual and chancy process.

The constructivist approach stresses the role of knowledge in perception and therefore is against the nativist approach to perceptual development.

However, a substantial body of evidence has been accrued favoring the nativist approach. For example, Newborn infants show shape constancy (Slater & Morison, 1985); they prefer their mother’s voice to other voices (De Casper & Fifer, 1980); and it has been established that they prefer normal features to scrambled features as early as 5 minutes after birth.

3. Sensory Evidence

Perhaps the major criticism of the constructivists is that they have underestimated the richness of sensory evidence available to perceivers in the real world (as opposed to the laboratory, where much of the constructivists’ evidence has come from).

Constructivists like Gregory frequently use the example of size constancy to support their explanations. That is, we correctly perceive the size of an object even though the retinal image of an object shrinks as the object recedes. They propose that sensory evidence from other sources must be available for us to be able to do this.
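The geometry behind this shrinking retinal image can be sketched numerically. The following is an illustrative calculation (not from the original article, and with made-up figures) showing how an object's angular size falls with distance even though its physical size is unchanged:

```python
import math

def visual_angle_deg(object_height_m, distance_m):
    """Angular size of an object, in degrees, from simple trigonometry."""
    return math.degrees(2 * math.atan(object_height_m / (2 * distance_m)))

# A 1.8 m person viewed at increasing distances: the retinal (angular)
# size shrinks roughly in inverse proportion to distance, yet we still
# perceive the person as 1.8 m tall -- that is size constancy.
for d in (2, 4, 8):
    print(d, round(visual_angle_deg(1.8, d), 1))
```

The constructivist claim is that the visual system must supplement this shrinking image with other information to recover constant physical size.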

However, in the real world, retinal images are rarely seen in isolation (as is possible in the laboratory). There is a rich array of sensory information, including other objects, background, the distant horizon, and movement. This rich source of sensory information is important to the second approach to explaining perception that we will examine, namely the direct approach to perception as proposed by Gibson.

Gibson argues strongly against the idea that perception involves top-down processing and criticizes Gregory’s discussion of visual illusions on the grounds that they are artificial examples and not images found in our normal visual environments.

This is crucial because Gregory accepts that misperceptions are the exception rather than the norm. Illusions may be interesting phenomena, but they might not be that informative about the debate.

Gibson (1966) and Bottom-Up Processing

Gibson’s bottom-up theory suggests that perception involves innate mechanisms forged by evolution and that no learning is required. This suggests that perception is necessary for survival – without perception, we would live in a very dangerous environment.

Our ancestors would have needed perception to escape from harmful predators, suggesting perception is evolutionary.

James Gibson (1966) argues that perception is direct and not subject to hypothesis testing, as Gregory proposed. There is enough information in our environment to make sense of the world in a direct way.

His theory is sometimes known as the ‘Ecological Theory’ because of the claim that perception can be explained solely in terms of the environment.

For Gibson, sensation is perception: what you see is what you get. There is no need for processing (interpretation), as the information we receive about size, shape, distance, etc., is sufficiently detailed for us to interact directly with the environment.

Gibson (1972) argued that perception is a bottom-up process, which means that sensory information is analyzed in one direction: from simple analysis of raw sensory data to the ever-increasing complexity of analysis through the visual system.


Features of Gibson’s Theory

The Optic Array

Perception involves ‘picking up’ the rich information provided by the optic array in a direct way with little/no processing involved.

Because of movement and the different intensities of light shining in different directions, the optic array is an ever-changing source of sensory information. Therefore, if you move, the structure of the optic array changes.

According to Gibson, we have the mechanisms to interpret this unstable sensory input, meaning we experience a stable and meaningful view of the world.

Changes in the flow of the optic array contain important information about what type of movement is taking place. The flow of the optic array will either move from or towards a particular point.

If the flow appears to be coming from the point, it means you are moving towards it. If the optic array is moving towards the point, you are moving away from it.
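This flow pattern can be illustrated with a toy pinhole-projection model. The function names and numbers below are our own illustrative choices, not Gibson's:

```python
def project(point, f=1.0):
    """Pinhole projection of a 3-D point (x, y, z) onto the image plane."""
    x, y, z = point
    return (f * x / z, f * y / z)

def flow_after_step(point, step=1.0):
    """Image displacement of a point as the observer moves `step` forward.

    Moving forward shrinks the point's depth z, so its projected image
    slides outward from the centre of the view.
    """
    x, y, z = point
    before = project((x, y, z))
    after = project((x, y, z - step))
    return (after[0] - before[0], after[1] - before[1])

# A point up and to the right of the direction of travel: its image moves
# further up and right, i.e. the optic array "flows from" the point you
# are heading towards (the focus of expansion).
print(flow_after_step((2.0, 1.0, 10.0)))
```

Reversing the observer's direction reverses the sign of the displacements, which is the contracting flow Gibson associates with moving away from a point.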

Invariant Features

The optic array contains invariant information that remains constant as the observer moves. Invariants are aspects of the environment that do not change, and they supply us with crucial information.

Two good examples of invariants are texture and linear perspective.


Another invariant is the horizon-ratio relation: the ratio of an object’s extent above the horizon to its extent below it is constant for objects of the same size standing on the same ground.
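A rough numerical sketch of why this ratio is invariant, assuming a simple pinhole projection and a level ground plane (the figures are illustrative, not from the source):

```python
def horizon_ratio(object_height, eye_height, distance, f=1.0):
    """Ratio of an object's projected extent above vs below the horizon.

    The horizon line sits at the observer's eye height, so an object of
    height H standing on the same ground projects f*(H - e)/d above the
    horizon and f*e/d below it (pinhole projection, focal length f).
    """
    above = f * (object_height - eye_height) / distance
    below = f * eye_height / distance
    return above / below

# The distance d cancels out: the ratio stays the same as the object recedes.
print(horizon_ratio(4.0, 1.6, 10))   # same value...
print(horizon_ratio(4.0, 1.6, 100))  # ...at ten times the distance
```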

OPTIC ARRAY : The pattern of light that reaches the eye from the environment.

RELATIVE BRIGHTNESS : Objects with brighter, clearer images are perceived as closer.

TEXTURE GRADIENT : The grain of texture gets smaller as a surface recedes, giving the impression of surfaces receding into the distance.

RELATIVE SIZE : When an object moves further away from the eye, its image gets smaller. Objects with smaller images are seen as more distant.

SUPERIMPOSITION : If the image of one object blocks the image of another, the blocking object is seen as closer.

HEIGHT IN THE VISUAL FIELD : Objects further away are generally higher in the visual field.
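Several of these cues fall out of simple projective geometry. The sketch below is an illustrative toy model (not from the source) demonstrating relative size and height in the visual field for an object standing on the ground:

```python
def image_size_and_base(object_height, eye_height, distance, f=1.0):
    """Projected size of a ground-standing object, and where its base lands.

    Pinhole projection: image size is f*H/d (relative size), and the base
    of the object projects f*e/d below the horizon line (taken as 0), so
    farther objects appear both smaller and higher in the visual field.
    """
    size = f * object_height / distance
    base = -f * eye_height / distance
    return size, base

near = image_size_and_base(1.8, 1.6, 5)
far = image_size_and_base(1.8, 1.6, 50)
# The farther object has a smaller image and its base sits closer to the horizon.
print(far[0] < near[0] and far[1] > near[1])
```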

Evaluation of Gibson’s (1966) Direct Theory of Perception

Gibson’s theory is a highly ecologically valid theory as it puts perception back into the real world.

His theory has many practical applications, e.g., training pilots, runway markings, and road markings.

It’s an excellent explanation for perception when viewing conditions are clear. Gibson’s theory also highlights the richness of information in an optic array and provides an account of perception in animals, babies, and humans.

His theory is reductionist as it seeks to explain perception solely in terms of the environment. There is strong evidence to show that the brain and long-term memory can influence perception. In this case, it could be said that Gregory’s theory is far more plausible.

Gibson’s theory also only supports one side of the nature-nurture debate, that being the nature side. Again, Gregory’s theory is far more plausible as it suggests that what we see with our eyes is not enough, and we use knowledge already stored in our brains, supporting both sides of the debate.

Visual Illusions

Gibson’s emphasis on DIRECT perception provides an explanation for the (generally) fast and accurate perception of the environment. However, his theory cannot explain why perceptions are sometimes inaccurate, e.g., in illusions.

He claimed that the illusions used in experimental work constituted extremely artificial perceptual situations unlikely to be encountered in the real world. However, this dismissal cannot realistically be applied to all illusions.

For example, Gibson’s theory cannot account for perceptual errors like the general tendency for people to overestimate vertical extents relative to horizontal ones.

Neither can Gibson’s theory explain naturally occurring illusions. For example, if you stare for some time at a waterfall and then transfer your gaze to a stationary object, the object appears to move in the opposite direction (a motion aftereffect).

Bottom-up or Top-down Processing?

Neither direct nor constructivist theories of perception seem capable of explaining all perceptions all of the time.

Gibson’s theory appears to be based on perceivers operating under ideal viewing conditions, where stimulus information is plentiful and is available for a suitable length of time. Constructivist theories, like Gregory’s, have typically involved viewing under less-than-ideal conditions.

Research by Tulving et al. manipulated both the clarity of the stimulus input (through exposure duration) and the amount of perceptual context in a word identification task. As stimulus clarity and the amount of context increased, so did the likelihood of correct identification.

However, as the exposure duration increased, the impact of context was reduced, suggesting that when stimulus information is rich, the need to use other sources of information is reduced.

One theory that explains how top-down and bottom-up processes may be seen as interacting with each other to produce the best interpretation of the stimulus was proposed by Neisser (1976) – known as the “Perceptual Cycle.”

DeCasper, A. J., & Fifer, W. P. (1980). Of human bonding: Newborns prefer their mothers’ voices. Science, 208(4448), 1174-1176.

Gibson, J. J. (1966). The Senses Considered as Perceptual Systems. Boston: Houghton Mifflin.

Gibson, J. J. (1972). A theory of direct visual perception. In J. Royce & W. Rozenboom (Eds.), The Psychology of Knowing. New York: Gordon & Breach.

Gregory, R. (1970). The Intelligent Eye. London: Weidenfeld and Nicolson.

Gregory, R. (1974). Concepts and Mechanisms of Perception. London: Duckworth.

Necker, L. (1832). Observations on some remarkable optical phenomena seen in Switzerland; and on an optical phenomenon which occurs on viewing a figure of a crystal or geometrical solid. The London and Edinburgh Philosophical Magazine and Journal of Science, 1(5), 329-337.

Slater, A., Morison, V., Somers, M., Mattock, A., Brown, E., & Taylor, D. (1990). Newborn and older infants’ perception of partly occluded objects. Infant Behavior and Development, 13(1), 33-49.

Further Information

Trichromatic Theory of Color Vision

Held and Hein (1963) Movement-Produced Stimulation in the Development of Visually Guided Behavior

What do visual illusions teach us?


  • Review article
  • Open access
  • Published: 11 July 2018

Decision making with visualizations: a cognitive framework across disciplines

  • Lace M. Padilla,
  • Sarah H. Creem-Regehr,
  • Mary Hegarty &
  • Jeanine K. Stefanucci

Cognitive Research: Principles and Implications, volume 3, Article number: 29 (2018)


A Correction to this article was published on 02 September 2018


Visualizations—visual representations of information, depicted in graphics—are studied by researchers in numerous ways, ranging from the study of the basic principles of creating visualizations, to the cognitive processes underlying their use, as well as how visualizations communicate complex information (such as in medical risk or spatial patterns). However, findings from different domains are rarely shared across domains though there may be domain-general principles underlying visualizations and their use. The limited cross-domain communication may be due to a lack of a unifying cognitive framework. This review aims to address this gap by proposing an integrative model that is grounded in models of visualization comprehension and a dual-process account of decision making. We review empirical studies of decision making with static two-dimensional visualizations motivated by a wide range of research goals and find significant direct and indirect support for a dual-process account of decision making with visualizations. Consistent with a dual-process model, the first type of visualization decision mechanism produces fast, easy, and computationally light decisions with visualizations. The second facilitates slower, more contemplative, and effortful decisions with visualizations. We illustrate the utility of a dual-process account of decision making with visualizations using four cross-domain findings that may constitute universal visualization principles. Further, we offer guidance for future research, including novel areas of exploration and practical recommendations for visualization designers based on cognitive theory and empirical findings.

Significance

People use visualizations to make large-scale decisions, such as whether to evacuate a town before a hurricane strike, and more personal decisions, such as which medical treatment to undergo. Given their widespread use and social impact, researchers in many domains, including cognitive psychology, information visualization, and medical decision making, study how we make decisions with visualizations. Even though researchers continue to develop a wealth of knowledge on decision making with visualizations, there are obstacles for scientists interested in integrating findings from other domains—including the lack of a cognitive model that accurately describes decision making with visualizations. Research that does not capitalize on all relevant findings progresses slower, lacks generalizability, and may miss novel solutions and insights. Considering the importance and impact of decisions made with visualizations, it is critical that researchers have the resources to utilize cross-domain findings on this topic. This review provides a cognitive model of decision making with visualizations that can be used to synthesize multiple approaches to visualization research. Further, it offers practical recommendations for visualization designers based on the reviewed studies while deepening our understanding of the cognitive processes involved when making decisions with visualizations.

Introduction

Every day we make numerous decisions with the aid of visualizations, including selecting a driving route, deciding whether to undergo a medical treatment, and comparing figures in a research paper. Visualizations are external visual representations that are systematically related to the information that they represent (Bertin, 1983; Stenning & Oberlander, 1995). The information represented might be about objects, events, or more abstract information (Hegarty, 2011). The scope of the previously mentioned examples illustrates the diversity of disciplines that have a vested interest in the influence of visualizations on decision making. While the term decision has a range of meanings in everyday language, here decision making is defined as a choice between two or more competing courses of action (Balleine, 2007).

We argue that for visualizations to be most effective, researchers need to integrate decision-making frameworks into visualization cognition research. Reviews of decision making with visual-spatial uncertainty also agree there has been a general lack of emphasis on mental processes within the visualization decision-making literature (Kinkeldey, MacEachren, Riveiro, & Schiewe, 2017; Kinkeldey, MacEachren, & Schiewe, 2014). The framework that has dominated applied decision-making research for the last 30 years is a dual-process account of decision making. Dual-process theories propose that we have two types of decision processes: one for automatic, easy decisions (Type 1); and another for more contemplative decisions (Type 2) (Kahneman & Frederick, 2002; Stanovich, 1999). Even though many research areas involving higher-level cognition have made significant efforts to incorporate dual-process theories (Evans, 2008), visualization research has yet to directly test the application of current decision-making frameworks or develop an effective cognitive model for decision making with visualizations. The goal of this work is to integrate a dual-process account of decision making with established cognitive frameworks of visualization comprehension.

In this paper, we present an overview of current decision-making theories and existing visualization cognition frameworks, followed by a proposal for an integrated model of decision making with visualizations, and a selective review of visualization decision-making studies to determine if there is cross-domain support for a dual-process account of decision making with visualizations. As a preview, we will illustrate Type 1 and 2 processing in decision making with visualizations using four cross-domain findings that we observed in the literature review. Our focus here is on demonstrating how dual-processing can be a useful framework for examining visualization decision-making research. We selected the cross-domain findings as relevant demonstrations of Type 1 and 2 processing that were shared across the studies reviewed, but they do not represent all possible examples of dual-processing in visualization decision-making research. The review documents each of the cross-domain findings, in turn, using examples from studies in multiple domains. These cross-domain findings differ in their reliance on Type 1 and Type 2 processing. We conclude with recommendations for future work and implications for visualization designers.

Decision-making frameworks

Decision-making researchers have pursued two dominant research paths to study how humans make decisions under risk. The first assumes that humans make rational decisions, which are based on weighted and ordered probability functions and can be mathematically modeled (e.g. Kunz, 2004; Von Neumann, 1953). The second proposes that people often make intuitive decisions using heuristics (Gigerenzer, Todd, & ABC Research Group, 2000; Kahneman & Tversky, 1982). While there is fervent disagreement on the efficacy of heuristics and whether human behavior is rational (Vranas, 2000), there is more consensus that we can make both intuitive and strategic decisions (Epstein, Pacini, Denes-Raj, & Heier, 1996; Evans, 2008; Evans & Stanovich, 2013; cf. Keren & Schul, 2009). The capacity to make intuitive and strategic decisions is described by a dual-process account of decision making, which suggests that humans make fast, easy, and computationally light decisions (known as Type 1 processing) by default, but can also make slow, contemplative, and effortful decisions by employing Type 2 processing (Kahneman, 2011). Various versions of dual-processing theory exist, with the key distinctions being in the attributes associated with each type of process (for a more detailed review of dual-process theories, see Evans & Stanovich, 2013). For example, older dual-systems accounts of decision making suggest that each process is associated with specific cognitive or neurological systems. In contrast, dual-process (sometimes termed dual-type) theories propose that the processes are distinct but do not necessarily occur in separate cognitive or neurological systems (hence the use of process over system) (Evans & Stanovich, 2013).

Many applied domains have adapted a dual-processing model to explain task- and domain-specific decisions, with varying degrees of success (Evans, 2008). For example, when a physician is deciding if a patient should be assigned to a coronary care unit or a regular nursing bed, the doctor can use a heuristic or utilize heart disease predictive instruments to make the decision (Marewski & Gigerenzer, 2012). In the case of the heuristic, the doctor would employ a few simple rules (diagrammed in Fig. 1) that would guide her decision, such as considering the patient’s chief complaint being chest pain. Another approach is to apply deliberate mental effort to make a more time-consuming and effortful decision, which could include using heart disease predictive instruments (Marewski & Gigerenzer, 2012). In a review of how applied domains in higher-level cognition have implemented a dual-processing model for domain-specific decisions, Evans (2008) argues that prior work has conflicting accounts of Type 1 and 2 processing. Some studies suggest that the two types work in parallel while others reveal conflicts between the types (Sloman, 2002). In the physician example proposed by Marewski and Gigerenzer (2012), the two types are not mutually exclusive, as doctors can utilize Type 2 to make a more thoughtful decision that is also influenced by some rules of thumb or Type 1. In sum, Evans (2008) argues that due to the inconsistency of classifying Type 1 and 2, the distinction between only two types is likely an oversimplification. Evans (2008) suggests that the literature only consistently supports the identification of processes that require a capacity-limited working memory resource versus those that do not. Evans and Stanovich (2013) updated their definition based on new behavioral and neuroscience evidence, stating, “the defining characteristic of Type 1 processes is their autonomy. They do not require ‘controlled attention,’ which is another way of saying that they make minimal demands on working memory resources” (p. 236). There is also debate on how to define the term working memory (Cowan, 2017). In line with prior work on decision making with visualizations (Patterson et al., 2014), we adopt the definition that working memory consists of multiple components that maintain a limited amount of information (their capacity) for a finite period (Cowan, 2017). Contemporary theories of working memory also stress the ability to engage attention in a controlled manner to suppress automatic responses and maintain the most task-relevant information with limited capacity (Engle, Kane, & Tuholski, 1999; Kane, Bleckley, Conway, & Engle, 2001; Shipstead, Harrison, & Engle, 2015).

Fig. 1

Coronary care unit decision tree, which illustrates a sequence of rules that a doctor could use to guide treatment decisions. Redrawn from “Heuristic decision making in medicine” by J. Marewski and G. Gigerenzer, 2012, Dialogues in Clinical Neuroscience, 14(1), 77. ST-segment change refers to whether a certain anomaly appears in the patient’s electrocardiogram. NTG nitroglycerin, MI myocardial infarction, T T-waves with peaking or inversion
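The sequence of rules in such a tree can be sketched as a short function. This is an illustrative reconstruction of the kind of fast-and-frugal tree the figure describes, with cue names of our own choosing; it is not clinical guidance:

```python
def coronary_care_heuristic(st_segment_change, chest_pain_chief_complaint, any_other_cue):
    """A fast-and-frugal decision tree of the kind shown in Fig. 1.

    Each cue is checked in a fixed order, and any answer can trigger an
    immediate decision -- a computationally light, Type 1 process.
    Cue names and ordering are illustrative, not clinical guidance.
    """
    if st_segment_change:
        return "coronary care unit"
    if not chest_pain_chief_complaint:
        return "regular nursing bed"
    if any_other_cue:  # e.g. NTG use, prior MI, T-wave anomalies
        return "coronary care unit"
    return "regular nursing bed"

print(coronary_care_heuristic(False, True, False))  # -> regular nursing bed
```

The contrast with a full predictive instrument, which would weigh and combine all cues, illustrates the Type 1 versus Type 2 distinction discussed above.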

Identifying processes that require significant working memory provides a definition of Type 2 processing with observable neural correlates. Therefore, in line with Evans and Stanovich (2013), in the remainder of this manuscript, we will use significant working memory capacity demands and significant need for cognitive control, as defined above, as the criterion for Type 2 processing. In the context of visualization decision making, processes that require significant working memory are those that depend on the deliberate application of working memory to function. Type 1 processing occurs outside of users’ conscious awareness and may utilize small amounts of working memory but does not rely on conscious processing in working memory to drive the process. It should be noted that Type 1 and 2 processing are not mutually exclusive and many real-world decisions likely incorporate all processes. This review will attempt to identify tasks in visualization decision making that require significant working memory and capacity (Type 2 processing) and those that rely more heavily on Type 1 processing, as a first step to combining decision theory with visualization cognition.

Visualization cognition

Visualization cognition is a subset of visuospatial reasoning, which involves deriving meaning from external representations of visual information that maintain consistent spatial relations (Tversky, 2005). Broadly, two distinct approaches delineate visualization cognition models (Shah, Freedman, & Vekiri, 2005). The first approach refers to perceptually focused frameworks, which attempt to specify the processes involved in perceiving visual information in displays and make predictions about the speed and efficiency of acquiring information from a visualization (e.g. Hollands & Spence, 1992; Lohse, 1993; Meyer, 2000; Simkin & Hastie, 1987). The second approach considers the influence of prior knowledge as well as perception. For example, Cognitive Fit Theory (Vessey, 1991) suggests that the user compares a learned graphic convention (mental schema) to the visual depiction. Visualizations that do not match the mental schema require cognitive transformations to make the visualization and mental representation align. For example, Fig. 2 illustrates a fictional relationship between the population growth of Species X and a predator species. At first glance, it may appear that when the predator species was introduced, the population of Species X dropped. However, after careful observation, you may notice that the higher population values are located lower on the Y-axis, which does not match our mental schema for graphs. With some effort, you can mentally reorder the values on the Y-axis to match your mental schema, and then you may notice that the introduction of the predator species actually correlates with growth in the population of Species X. When the viewer is forced to mentally transform the visualization to match their mental schema, processing steps are increased, which may increase errors, time to complete a task, and demand on working memory (Vessey, 1991).

Fig. 2

Fictional relationship between the population growth of Species X and a predator species, where the Y-axis ordering does not match standard graphic conventions. Notice that the Y-axis is reverse ordered. This figure was inspired by a controversial graphic produced by Christine Chan of Reuters, which showed the relationship between Florida’s “Stand Your Ground” law and firearm murders with the Y-axis reverse ordered (Lallanilla, 2014)

Pinker (1990) proposed a cognitive model (see Fig. 3), which provides an integrative structure that denotes the distinction between top-down and bottom-up encoding mechanisms in understanding data graphs. Researchers have generalized this model to propose theories of comprehension, learning, and memory with visual information (Hegarty, 2011; Kriz & Hegarty, 2007; Shah & Freedman, 2011). The Pinker (1990) model suggests that from the visual array, defined as the unprocessed neuronal firing in response to visualizations, bottom-up encoding mechanisms are utilized to construct a visual description, which is the mental encoding of the visual stimulus. Following encoding, viewers mentally search long-term memory for knowledge relevant for interpreting the visualization. This knowledge is proposed to be in the form of a graph schema.

figure 3

Adapted figure from the Pinker ( 1990 ) model of visualization comprehension, which illustrates each process

Then viewers use a match process, where the graph schema that is the most similar to the visual array is retrieved. When a matching graph schema is found, the schema becomes instantiated. The visualization conventions associated with the graph schema can then help the viewer interpret the visualization (message assembly process). For example, Fig. 3 illustrates comprehension of a bar chart using the Pinker (1990) model. In this example, the matched graph schema for a bar graph specifies that the dependent variable is on the Y-axis and the independent variable is on the X-axis; the instantiated graph schema incorporates the visual description and this additional information. The conceptual message is the resulting mental representation of the visualization, which includes all supplemental information from long-term memory and any mental transformations the viewer may perform on the visualization. Viewers may need to transform their mental representation of the visualization based on their task or conceptual question. In this example, the viewer’s task is to find the average of A and B. To do this, the viewer must interpolate information in the bar chart and update the conceptual message with this additional information. The conceptual question can guide the construction of the mental representation through interrogation, which is the process of seeking out information that is necessary to answer the conceptual question. Top-down encoding mechanisms can influence each of the processes.
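Purely as an illustrative sketch (not Pinker’s actual mechanism), the match process can be thought of as scoring each stored graph schema against the features extracted in the visual description and retrieving the best scorer. The feature names and similarity measure below are invented for the example.

```python
# Illustrative sketch of the "match" step: retrieve the stored graph schema
# whose expected features best overlap the encoded visual description.
# Feature vocabularies here are invented, not taken from Pinker (1990).

def jaccard(a, b):
    """Overlap between two feature sets (intersection over union)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

graph_schemas = {
    "bar chart":    {"rectangular_marks", "common_baseline", "categorical_x"},
    "line graph":   {"connected_points", "continuous_x", "single_path"},
    "scatter plot": {"point_marks", "continuous_x", "continuous_y"},
}

visual_description = {"rectangular_marks", "common_baseline",
                      "categorical_x", "y_axis_labels"}

# Match process: the most similar schema is retrieved and instantiated.
best = max(graph_schemas,
           key=lambda s: jaccard(graph_schemas[s], visual_description))
assert best == "bar chart"
```

Once retrieved, the schema’s conventions (e.g. dependent variable on the Y-axis) would be added to the representation during instantiation and message assembly.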

The influences of top-down processes are also emphasized in a previous attempt by Patterson et al. (2014) to extend visualization cognition theories to decision making. The Patterson et al. (2014) model illustrates how top-down cognitive processing influences encoding, pattern recognition, and working memory, but not decision making or the response. Patterson et al. (2014) use the multicomponent definition of working memory proposed by Baddeley and Hitch (1974) and summarized by Cowan (2017) as a “multicomponent system that holds information temporarily and mediates its use in ongoing mental activities” (p. 1160). In this conception of working memory, a central executive controls the functions of working memory. The central executive can, among other functions, control attention and hold information in a visuo-spatial temporary store, where information can be maintained temporarily for decision making without being stored in long-term memory (Baddeley & Hitch, 1974).

While incorporating working memory into a visualization decision-making model is valuable, the Patterson et al. (2014) model leaves some open questions about the relationships between components and processes. For example, their model lacks a pathway for working memory to influence decisions based on top-down processing, which is inconsistent with well-established research in decision science (e.g. Gigerenzer & Todd, 1999; Kahneman & Tversky, 1982). Additionally, the normal processing pathway depicted in the Patterson model is an oversimplification of the interaction between top-down and bottom-up processing that is documented in a large body of literature (e.g. Engel, Fries, & Singer, 2001; Mechelli, Price, Friston, & Ishai, 2004).

A proposed integrated model of decision making with visualizations

Our proposed model (Fig. 4) introduces a dual-process account of decision making (Evans & Stanovich, 2013; Gigerenzer & Gaissmaier, 2011; Kahneman, 2011) into the Pinker (1990) model of visualization comprehension. A primary addition of our model is the inclusion of working memory, which is utilized to answer the conceptual question and could have a subsequent impact on each stage of the decision-making process, except bottom-up attention. The final stage of our model includes a decision-making process that derives from the conceptual message and informs behavior. In line with a dual-process account (Evans & Stanovich, 2013; Gigerenzer & Gaissmaier, 2011; Kahneman, 2011), the decision step can be completed either with Type 1 processing, which uses only minimal working memory (Evans & Stanovich, 2013), or with Type 2 processing, which recruits significant working memory. Also following Evans and Stanovich (2013), we argue that people can make a decision with a visualization while using minimal amounts of working memory; we classify this as Type 1 thinking. Lohse (1997) found that when participants made judgments about budget allocation using profit charts, individuals with less working memory capacity performed as well as those with more working memory capacity when they made decisions about only three regions (the easier task). However, when participants made judgments about nine regions (the harder task), individuals with more working memory capacity outperformed those with less. These results reveal that individual differences in working memory capacity influence performance only on complex decision-making tasks (Lohse, 1997). Figure 5 (top) illustrates one way that a viewer could make a Type 1 decision about whether the average value of bars A and B is closer to 2 or 2.2.
Figure 5 (top) illustrates how a viewer might make a fast and computationally light decision, in which she decides that the middle point between the two bars is closer to the salient tick mark of 2 on the Y-axis and answers 2 (which is incorrect). In contrast, Fig. 5 (bottom) shows a second possible method of solving the same problem by utilizing significant working memory (Type 2 processing). In this example, the viewer has recently learned a strategy for addressing similar problems, uses working memory to guide a top-down attentional search of the visual array, and identifies the values of A and B. Next, she instantiates a different graph schema than in the prior example by utilizing working memory and completes an effortful mental computation of (2.4 + 1.9)/2. Ultimately, the application of working memory leads to a different and more effortful decision than in Fig. 5 (top). This example illustrates how significant amounts of working memory can be used at early stages of the decision-making process, producing downstream effects and more considered responses. In the following sections, we provide a selective review of work on decision making with visualizations that demonstrates direct and indirect evidence for our proposed model.
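The two strategies in this example reduce to a small piece of arithmetic. The bar values (2.4 and 1.9) and the candidate answers (2 vs. 2.2) come from the example above; the code is only a caricature of the two processing types, not a claim about how the computation is implemented mentally.

```python
# Worked version of the Fig. 5 example: Type 1 vs. Type 2 answers to
# "is the average of bars A and B closer to 2 or 2.2?"
A, B = 2.4, 1.9
candidates = (2.0, 2.2)

# Type 1: anchor on the salient tick mark near the bars' midpoint --
# fast, minimal working memory, and in this case wrong.
type1_answer = 2.0

# Type 2: effortful mental computation of the actual average.
average = (A + B) / 2  # parentheses matter: 2.15, not 2.4 + (1.9 / 2) = 3.35
type2_answer = min(candidates, key=lambda c: abs(average - c))

assert abs(average - 2.15) < 1e-9
assert type2_answer == 2.2        # the more considered, correct response
assert type1_answer != type2_answer
```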

figure 4

Model of visualization decision making, which emphasizes the influence of working memory. Long-term memory can influence all components and processes in the model either via pre-attentive processes or by conscious application of knowledge

figure 5

Examples of a fast Type 1 (top) and slow Type 2 (bottom) decision outlined in our proposed model of decision making with visualizations. In these examples, the viewer’s task is to decide if the average value of bars A and B is closer to 2 or 2.2. The thick dotted line denotes significant working memory use and the thin dotted line negligible working memory use

Empirical studies of visualization decision making

Review method

To determine if there is cross-domain empirical support for a dual-process account of decision making with visualizations, we selectively reviewed studies of complex decision making with computer-generated two-dimensional (2D) static visualizations. To illustrate the application of a dual-process account of decision making to visualization research, this review highlights representative studies from diverse application areas. Many of these studies were conducted by interdisciplinary groups and, as such, cannot be accurately classified into a single discipline. However, to help the reader evaluate the cross-domain nature of these findings, Table 1 includes the application area for the specific tasks used in each study.

In reviewing this work, we observed four key cross-domain findings that support a dual-process account of decision making (see Table 2). The first two support the inclusion of Type 1 processing, which is illustrated by the direct path for bottom-up attention to guide decision making with minimal application of working memory (see Fig. 5, top). The first finding is that visualizations direct viewers’ bottom-up attention, which can both help and hinder decision making (see “Bottom-up attention”). The second finding is that visual-spatial biases comprise a unique category of bias that is a direct result of the visual encoding technique (see “Visual-spatial biases”). The third finding supports the inclusion of Type 2 processing in our proposed model and suggests that visualizations vary in cognitive fit between the visual description, graph schema, and conceptual question. If the fit is poor (i.e. there is a mismatch between the visualization and a decision-making component), working memory is used to perform corrective mental transformations (see “Cognitive fit”). The final cross-domain finding proposes that knowledge-driven processes may interact with the effects of the visual encoding technique (see “Knowledge-driven processing”) and could be a function of either Type 1 or Type 2 processes. Each of these findings is detailed at length in the relevant section. The four cross-domain findings do not represent an exhaustive list of all findings that pertain to visualization cognition; rather, they were selected as illustrative examples of Type 1 and Type 2 processing that include significant contributions from multiple domains. Further, some of the studies could fit into multiple sections and were placed in a particular section as illustrative examples.

Bottom-up attention

The first cross-domain finding that characterizes Type 1 processing in visualization decision making is that visualizations direct participants’ bottom-up attention to specific visual features, which can be either beneficial or detrimental to decision making. Bottom-up attention consists of involuntary shifts in focus to salient features of a visualization and does not utilize working memory (Connor, Egeth, & Yantis, 2004); it is therefore a Type 1 process. The research reviewed in this section illustrates that bottom-up attention has a profound influence on decision making with visualizations. A summary of the visual features that studies have used to attract bottom-up attention can be found in Table 3.

Numerous studies show that salient information in a visualization draws viewers’ attention (Fabrikant, Hespanha, & Hegarty, 2010; Hegarty, Canham, & Fabrikant, 2010; Hegarty, Friedman, Boone, & Barrett, 2016; Padilla, Ruginski, & Creem-Regehr, 2017; Schirillo & Stone, 2005; Stone et al., 2003; Stone, Yates, & Parker, 1997). The most common methods for demonstrating that visualizations focus viewers’ attention are showing that viewers miss non-salient but task-relevant information (Schirillo & Stone, 2005; Stone et al., 1997; Stone et al., 2003), that viewers are biased by salient information (Hegarty et al., 2016; Padilla, Ruginski et al., 2017), or that viewers spend more time looking at salient information in a visualization (Fabrikant et al., 2010; Hegarty et al., 2010). For example, Stone et al. (1997) demonstrated that when viewers are asked how much they would pay for an improved product using the visualizations in Fig. 6, they focus on the number of icons while missing the base rate of 5,000,000. If a viewer simply totals the icons, the standard product appears to be twice as dangerous as the improved product; but because the base rate is large, the actual difference between the two products is negligibly small (0.0000003; Stone et al., 1997). In one experiment, participants were willing to pay $125 more for improved tires when viewing the visualizations in Fig. 6 than when viewing a purely textual representation of the information. The authors demonstrated the same effect for improved toothpaste, with participants paying $0.95 more when viewing a visual depiction than when viewing text. The authors term this heuristic of focusing on salient information and ignoring other data the foreground effect (Stone et al., 1997; see also Schirillo & Stone, 2005; Stone et al., 2003).
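The foreground effect amounts to attending to a salient ratio while ignoring a non-salient absolute difference. The icon counts below are hypothetical (chosen only to produce the “twice as dangerous” appearance, not the actual Stone et al. values); the base rate of 5,000,000 drivers is from the text.

```python
# Hypothetical icon counts illustrating the foreground effect.
base_rate = 5_000_000   # drivers (from the text; shown only as text in Fig. 6)
standard_icons = 30     # invented count for the standard product
improved_icons = 15     # invented count for the improved product

# What the icons make salient: a 2:1 ratio in risk.
relative_risk = standard_icons / improved_icons

# What the base rate implies: a tiny absolute difference in risk.
absolute_difference = (standard_icons - improved_icons) / base_rate

assert relative_risk == 2.0           # looks twice as dangerous
assert absolute_difference < 1e-5     # negligible in absolute terms
```

A viewer who totals only the foregrounded icons sees the first number; the normatively relevant quantity is the second.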

figure 6

Icon arrays used to illustrate the risk of standard or improved tires. Participants were tasked with deciding how much they would pay for the improved tires. Note the base rate of 5 M drivers was represented in text. Redrawn from “Effects of numerical and graphical displays on professed risk-taking behavior” by E. R. Stone, J. F. Yates, & A. M. Parker. 1997, Journal of Experimental Psychology: Applied , 3 (4), 243

A more direct test of visualizations guiding bottom-up attention is to examine whether salient information biases viewers’ judgments. One method involves identifying salient features using a behaviorally validated saliency model, which predicts the locations that will attract viewers’ bottom-up attention (Harel, 2015; Itti, Koch, & Niebur, 1998; Rosenholtz & Jin, 2005). In one study, researchers compared participants’ judgments with different hurricane forecast visualizations and then, using the Itti et al. (1998) saliency algorithm, found that the differences in what was salient in the two visualizations correlated with participants’ performance (Padilla, Ruginski et al., 2017). Specifically, they suggested that the salient borders of the Cone of Uncertainty (see Fig. 7, left), which is used by the National Hurricane Center to display hurricane track forecasts, lead some people to incorrectly believe that the hurricane is growing in physical size, a misunderstanding of the probability distribution of hurricane paths that the cone is intended to represent (Padilla, Ruginski et al., 2017; see also Ruginski et al., 2016). Further, they found that when the same data were represented as individual hurricane paths, such that there was no salient boundary (see Fig. 7, right), viewers intuited the probability of hurricane paths more effectively than with the Cone of Uncertainty. However, an individual hurricane path biased viewers’ judgments if it intersected a point of interest. For example, in Fig. 7 (right), participants accurately judged that locations closer to the densely populated lines (highest likelihood of storm path) would receive more damage. This correct judgment changed when a location farther from the center of the storm was intersected by a path but the closer location was not (see locations a and b in Fig. 7, right).
With both visualizations, the researchers found that viewers were negatively biased by the salient features for some tasks (Padilla, Ruginski et al., 2017; Ruginski et al., 2016).
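The role of a saliency model can be illustrated with a toy stand-in: score each location by how much it differs from its immediate surroundings. Real models such as Itti et al. (1998) combine many feature channels and spatial scales; this crude center-surround contrast is only meant to show the idea that salient locations can be predicted from the image alone.

```python
import numpy as np

def toy_saliency(img, pad=1):
    """Crude center-surround contrast: |pixel - local neighborhood mean|.
    A toy proxy for a saliency model, not the Itti et al. algorithm."""
    img = np.asarray(img, dtype=float)
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            window = padded[i:i + 2 * pad + 1, j:j + 2 * pad + 1]
            out[i, j] = abs(img[i, j] - window.mean())
    return out

# A uniform field with one bright cell: the odd-one-out pops out,
# i.e. the model predicts bottom-up attention will land there.
field = np.zeros((5, 5))
field[2, 2] = 1.0
sal = toy_saliency(field)
assert sal.argmax() == 2 * 5 + 2   # flat index of the bright cell
```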

figure 7

An example of the Cone of Uncertainty ( left ) and the same data represented as hurricane paths ( right ). Participants were tasked with evaluating the level of damage that would incur to offshore oil rigs at specific locations, based on the hurricane forecast visualization. Redrawn from “Effects of ensemble and summary displays on interpretations of geospatial uncertainty data” by L. M. Padilla, I. Ruginski, and S. H. Creem-Regehr. 2017, Cognitive Research: Principles and Implications , 2 (1), 40

That is not to say that saliency only negatively impacts decisions. When incorporated into visualization design, saliency can guide bottom-up attention to task-relevant information, thereby improving performance (e.g. Fabrikant et al., 2010; Fagerlin, Wang, & Ubel, 2005; Hegarty et al., 2010; Schirillo & Stone, 2005; Stone et al., 2003; Waters, Weinstein, Colditz, & Emmons, 2007). One compelling example, using both eye-tracking measures and a saliency algorithm, demonstrated that salient features of weather maps directed viewers’ attention to different variables that were visualized on the maps (Hegarty et al., 2010; see also Fabrikant et al., 2010). Interestingly, when the researchers manipulated the relative salience of temperature versus pressure (see Fig. 8), the salient features captured viewers’ overt attention (as measured by eye fixations) but did not influence performance until participants were trained on how to effectively interpret the features. Once viewers were trained, their judgments were facilitated when the relevant features were more salient (Hegarty et al., 2010). This is an instructive example of how saliency may direct viewers’ bottom-up attention but may not influence their performance until viewers have the relevant top-down knowledge to capitalize on the affordances of the visualization.

figure 8

Eye-tracking data from Hegarty et al. ( 2010 ). Participants viewed an arrow located in Utah (obscured by eye-tracking data in the figure) and made judgments about whether the arrow correctly identified the wind direction. The black isobars were the task-relevant information. Notice that after instructions, viewers with the pressure-salient visualizations focused on the isobars surrounding Utah, rather than on the legend or in other regions. The panels correspond to the conditions in the original study

In sum, the reviewed studies suggest that bottom-up attention has a profound influence on decision making with visualizations. This is noteworthy because bottom-up attention is a Type 1 process. At a minimum, the work suggests that Type 1 processing influences the first stages of decision making with visualizations. Further, the studies cited in this section provide support for the inclusion of bottom-up attention in our proposed model.

Visual-spatial biases

A second cross-domain finding that relates to Type 1 processing is that visualizations can give rise to visual-spatial biases that can be either beneficial or detrimental to decision making. We propose the new concept of visual-spatial biases, defining the term as a bias that elicits heuristics as a direct result of the visual encoding technique. Visual-spatial biases likely originate as a Type 1 process, as we suspect they are connected to bottom-up attention and, if detrimental to decision making, must be actively suppressed by top-down knowledge and cognitive control mechanisms (see Table 4 for a summary of the biases documented in this section). Visual-spatial biases can also improve decision-making performance. As Card, Mackinlay, and Shneiderman (1999) point out, we can use vision to think, meaning that visualizations can capitalize on visual perception so that a visualization is interpreted without effort when the visual biases it elicits are consistent with the correct interpretation.

Tversky (2011) presents a taxonomy of visual-spatial communications that are intrinsically related to thought, which are likely the bases for visual-spatial biases (see also Fabrikant & Skupin, 2005). One of the most commonly documented visual-spatial biases that we observed across domains is a containment conceptualization of boundary representations in visualizations. Tversky (2011) makes the analogy, “Framing a picture is a way of saying that what is inside the picture has a different status from what is outside the picture” (p. 522). Similarly, Fabrikant and Skupin (2005) describe how, “They [boundaries] help partition an information space into zones of relative semantic homogeneity” (p. 673). However, in visualization design, it is common to take continuous data and visually represent them with boundaries (i.e. summary statistics, error bars, isocontours, or regions of interest; Padilla et al., 2015; Padilla, Quinan, Meyer, & Creem-Regehr, 2017). Binning continuous data is a reasonable approach, particularly when intended to make the data simpler for viewers to understand (Padilla, Quinan, et al., 2017). However, it may have the unintended consequence of creating artificial boundaries that can bias users, leading them to respond as if data within a boundary are more similar than data across boundaries. For example, McKenzie, Hegarty, Barrett, and Goodchild (2016) showed that participants were more likely to use a containment heuristic to make decisions about Google Maps’ blue-dot visualization when the positional uncertainty data were visualized as a bounded circle (Fig. 9, right) than as a Gaussian fade (Fig. 9, left) (see also Newman & Scholl, 2012; Ruginski et al., 2016). Recent work by Grounds, Joslyn, and Otsuka (2017) found that viewers demonstrate a “deterministic construal error,” the belief that visualizations of temperature uncertainty represent a deterministic forecast.
However, the deterministic construal error was not observed with textual representations of the same data (see also Joslyn & LeClerc, 2013).

figure 9

Example stimuli from McKenzie et al. ( 2016 ) showing circular semi-transparent overlays used by Google Maps to indicate the uncertainty of the users’ location. Participants compared two versions of these visualizations and determined which represented the most accurate positional location. Redrawn from “Assessing the effectiveness of different visualizations for judgments of positional uncertainty” by G. McKenzie, M. Hegarty, T. Barrett, and M. Goodchild. 2016, International Journal of Geographical Information Science , 30 (2), 221–239

Additionally, some visual-spatial biases follow the same principles as more well-known decision-making biases revealed by researchers in behavioral economics and decision science. In fact, some decision-making biases, such as anchoring, the tendency to use the first data point to make relative judgments, seem to have visual correlates (Belia, Fidler, Williams, & Cumming, 2005). For example, Belia et al. (2005) asked experts with experience in statistics to align two means (representing “Group 1” and “Group 2”) with error bars so that they represented data ranges that were just significantly different (see Fig. 10 for an example of the stimuli). They found that when the starting position of Group 2 was around 800 ms, participants placed Group 2 higher than when its starting position was around 300 ms. This work demonstrates that participants used the starting mean of Group 2 as an anchor, or point of reference, even though the starting position was arbitrary. Other work finds that visualizations can be used to reduce some decision-making biases, including anecdotal evidence bias (Fagerlin et al., 2005), side-effect aversion (Waters et al., 2007; Waters, Weinstein, Colditz, & Emmons, 2006), and risk aversion (Schirillo & Stone, 2005).
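Under a normal approximation, the normatively correct answer in the Belia et al. (2005) task has a closed form that does not depend on where Group 2 happened to start, which is exactly what the observed anchoring violates. The function name and all numbers below are ours, for illustration only.

```python
import math

def just_significant_offset(se1, se2, z=1.96):
    """Offset between two independent group means at which they become
    'just significantly different' (two-sided p ~ .05, normal approx.).
    Illustrative sketch; names and values are not from Belia et al."""
    return z * math.sqrt(se1**2 + se2**2)

group1_mean, se = 500.0, 20.0            # invented example values (ms)
offset = just_significant_offset(se, se)
group2_mean = group1_mean + offset       # correct placement of Group 2

# The correct placement is a fixed offset from Group 1, regardless of
# whether Group 2's slider started near 300 ms or 800 ms.
assert abs(offset - 55.44) < 0.01
```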

figure 10

Example display and instructions from Belia et al. ( 2005 ). Redrawn from “Researchers misunderstand confidence intervals and standard error bars” by S. Belia, F. Fidler, J. Williams, and G. Cumming. 2005, Psychological Methods, 10 (4), 390. Copyright 2005 by “American Psychological Association”

Additionally, the mere presence of a visualization may inherently bias viewers. For example, viewers judge scientific articles containing high-quality neuroimaging figures as displaying greater scientific reasoning than the same articles with a bar chart or no figure at all (McCabe & Castel, 2008). People tend to unconsciously believe that high-quality scientific images reflect high-quality science, as illustrated by work from Keehner, Mayberry, and Fischer (2011) showing that viewers rate articles with three-dimensional brain images as more scientific than those with 2D images, schematic drawings, or diagrams (see Fig. 11). Unintuitively, however, high-quality complex images can be detrimental to performance compared to simpler visualizations (Hegarty, Smallman, & Stull, 2012; St. John, Cowen, Smallman, & Oonk, 2001; Wilkening & Fabrikant, 2011). Hegarty et al. (2012) demonstrated that novice users prefer realistically depicted maps (see Fig. 12), even though these maps increased the time taken to complete the task and focused participants’ attention on irrelevant information (Ancker, Senathirajah, Kukafka, & Starren, 2006; Brügger, Fabrikant, & Çöltekin, 2017; St. John et al., 2001; Wainer, Hambleton, & Meara, 1999; Wilkening & Fabrikant, 2011). Interestingly, professional meteorologists demonstrated the same biases as novice viewers (Hegarty et al., 2012; see also Nadav-Greenberg, Joslyn, & Taing, 2008).

figure 11

Image showing participants’ ratings of three-dimensionality and scientific credibility for a given neuroimaging visualization, originally published in grayscale (Keehner et al., 2011 )

figure 12

Example stimuli from Hegarty et al. ( 2012 ) showing maps with varying levels of realism. Both novice viewers and meteorologists were tasked with selecting a visualization to use and performing a geospatial task. The panels correspond to the conditions in the original study

We argue that visual-spatial biases reflect a Type 1 process, occurring automatically with minimal working memory. Work by Sanchez and Wiley (2006) provides direct evidence for this assertion, using eye-tracking data to demonstrate that individuals with less working memory capacity attend to irrelevant images in a scientific article more than those with greater working memory capacity. The authors argue that we are naturally drawn to images (particularly high-quality depictions) and that significant working memory capacity is required to shift focus away from images that are task-irrelevant. The ease with which visualizations captivate our focus and direct our bottom-up attention to specific features likely increases the impact of these biases, which may be why some visual-spatial biases are notoriously difficult to override with working memory (see Belia et al., 2005; Boone, Gunalp, & Hegarty, in press; Joslyn & LeClerc, 2013; Newman & Scholl, 2012). We speculate that some visual-spatial biases are intertwined with bottom-up attention, occurring early in the decision-making process and influencing downstream processes (see our model in Fig. 4 for reference), making them particularly persistent.

Cognitive fit

We also observe a cross-domain finding involving Type 2 processing, which suggests that if there is a mismatch between the visualization and a decision-making component, working memory is used to perform corrective mental transformations. Cognitive fit is a term used to describe the correspondence between the visualization and the conceptual question or task (see our model for reference; for an overview of cognitive fit, see Vessey, Zhang, & Galletta, 2006). Those interested in examining cognitive fit generally attempt to identify and reduce mismatches between the visualization and one of the decision-making components (see Table 5 for a breakdown of the decision-making components that the reviewed studies evaluated). When there is a mismatch produced by the default Type 1 processing, it is argued that significant working memory (Type 2 processing) is required to resolve the discrepancy via mental transformations (Vessey et al., 2006). As working memory is capacity limited, the magnitude of mental transformation, or amount of working memory required, is one predictor of reaction times and errors.

Direct evidence for this claim comes from work demonstrating that cognitive fit differentially influenced the performance of individuals with more and less working memory capacity (Zhu & Watts, 2010). The task was to identify which two nodes in a social media network diagram should be removed to disconnect the maximal number of nodes. As predicted by cognitive fit theory, when the visualization did not facilitate the task (Fig. 13, left), participants with less working memory capacity were slower than those with more working memory capacity. However, when the visualization aligned with the task (Fig. 13, right), there was no difference in performance. This work suggests that when there is misalignment between the visualization and a decision-making process, people with more working memory capacity have the resources to resolve the conflict, while those with fewer resources show performance degradations. Other work found only a modest relationship between working memory capacity and correct interpretations of high- and low-temperature forecast visualizations (Grounds et al., 2017), which suggests that, for some visualizations, viewers utilize little working memory.
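Computationally, the Zhu and Watts (2010) task is a small graph problem: for every pair of nodes, remove them and measure how fragmented the network becomes. The toy graph below is invented (not the authors’ stimuli); the sketch is a brute-force version of the judgment participants had to make perceptually.

```python
from itertools import combinations

# Invented toy network: two clusters joined through nodes "c" and "d".
edges = {("a", "c"), ("b", "c"), ("c", "d"), ("d", "e"), ("d", "f")}
nodes = {n for e in edges for n in e}

def largest_component(nodes, edges):
    """Size of the biggest connected component, via flood fill."""
    remaining, best = set(nodes), 0
    while remaining:
        frontier, comp = {remaining.pop()}, set()
        while frontier:
            n = frontier.pop()
            comp.add(n)
            for u, v in edges:
                if u == n and v not in comp:
                    frontier.add(v)
                if v == n and u not in comp:
                    frontier.add(u)
        remaining -= comp
        best = max(best, len(comp))
    return best

def fragmentation(pair):
    """Nodes cut off from the main component after removing the pair."""
    kept = nodes - set(pair)
    kept_edges = {(u, v) for u, v in edges if u in kept and v in kept}
    return len(kept) - largest_component(kept, kept_edges)

# Brute force over all pairs: the bridging nodes fragment the network most.
best_pair = max(combinations(sorted(nodes), 2), key=fragmentation)
assert set(best_pair) == {"c", "d"}
```

A layout that visually groups the clusters makes the answer nearly perceptual; a poor layout forces the viewer to simulate something like this search in working memory.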

figure 13

Examples of social media network diagrams from Zhu and Watts ( 2010 ). The authors argue that the figure on the right is more aligned with the task of identifying the most interconnected nodes than the figure on the left

As illustrated in our model, working memory can be recruited to aid all stages of the decision-making process except bottom-up attention. Work that examines cognitive fit theory provides indirect evidence that working memory is required to resolve conflicts between schema matching and a decision-making component. For example, one way that a mismatch between a viewer’s mental schema and a visualization can arise is when the viewer uses a schema that is not optimal for the task. Tversky, Corter, Yu, Mason, and Nickerson (2012) primed participants to use different schemas by describing the connections in Fig. 14 in terms of either transfer speed or security levels. Participants then decided on the most efficient or secure route for information to travel between computer nodes, using a visualization that encoded data with the thickness of connections, containment, or physical distance (see Fig. 14). Tversky et al. (2012) found that when the links were described based on their information transfer speed, the thickness and distance visualizations were the most effective, suggesting that the speed mental schema was most closely matched to the thickness and distance visualizations, whereas it required mental transformations to align with the containment visualization. Similarly, the thickness and containment visualizations outperformed the distance visualization when the nodes were described as belonging to specific systems with different security levels. This work and others (Feeney, Hola, Liversedge, Findlay, & Metcalf, 2000; Gattis & Holyoak, 1996; Joslyn & LeClerc, 2013; Smelcer & Carmel, 1997) provide indirect evidence that gratuitous realignment between mental schema and visualization is error-prone, and that visualization designers should work to reduce the number of transformations required in the decision-making process.

figure 14

Example of stimuli from Tversky et al. ( 2012 ) showing three types of encoding techniques for connections between nodes (thickness, containment, and distance). Participants were asked to select routes between nodes with different descriptions of the visualizations. Redrawn from “Representing category and continuum: Visualizing thought” by B. Tversky, J. Corter, L. Yu, D. Mason, and J. Nickerson. In Diagrams 2012 (p. 27), P. Cox, P. Rodgers, and B. Plimmer (Eds.), 2012, Berlin Heidelberg: Springer-Verlag

Researchers from multiple domains have also documented cases of misalignment between the task, or conceptual question, and the visualization. For example, Vessey and Galletta (1991) found that participants completed a finance-based task faster when the visualization they chose (graph or table, see Fig. 15) matched the task (spatial or textual). For the spatial task, participants decided which month had the greatest difference between deposits and withdrawals. The textual or symbolic tasks involved reporting specific deposit and withdrawal amounts for various months. The authors argued that when there is a mismatch between the task and visualization, the additional transformation accounts for the increased time taken to complete the task (Vessey & Galletta, 1991; see also Dennis & Carte, 1998; Huang et al., 2006), which likely takes place in the inference process of our proposed model.

figure 15

Examples of stimuli from Vessey and Galletta ( 1991 ) depicting deposit and withdrawal amounts over the course of a year with a graph (a) and table (b). Participants completed either a spatial or textual task with the chart or table. Redrawn from “Cognitive fit: An empirical study of information acquisition” by I. Vessey, and D. Galletta. 1991, Information Systems Research, 2 (1), 72–73. Copyright 1991 by “INFORMS”

The aforementioned studies provide direct (Zhu & Watts, 2010 ) and indirect (Dennis & Carte, 1998 ; Feeney et al., 2000 ; Gattis & Holyoak, 1996 ; Huang et al., 2006 ; Joslyn & LeClerc, 2013 ; Smelcer & Carmel, 1997 ; Tversky et al., 2012 ; Vessey & Galletta, 1991 ) evidence that Type 2 processing recruits working memory to resolve misalignments between decision-making processes and the visualization that arise from default Type 1 processing. These examples of Type 2 processing using working memory to perform effortful mental computations are consistent with the assertions of Evans and Stanovich ( 2013 ) that Type 2 processes enact goal-directed complex processing. However, it is not clear from the reviewed work how exactly the visualization and decision-making components are matched. Newman and Scholl ( 2012 ) propose that we match the schema and visualization based on the similarities between their salient visual features, although this proposal has not been tested. Further, work that assesses cognitive fit in terms of the visualization and task only examines alignment between broad categories (i.e., spatial or semantic). Beyond these broad classifications, it is not clear how to predict whether a task and visualization are aligned. In sum, there is no sufficient cross-disciplinary theory of how mental schemas and tasks are matched to visualizations. However, it is apparent from the reviewed work that Type 2 processes (requiring working memory) can be recruited during the schema-matching and inference processes.

Either Type 1 and/or Type 2

Knowledge-driven processing.

In a review of map-reading cognition, Lobben ( 2004 ) states, “…research should focus not only on the needs of the map reader but also on their map-reading skills and abilities” (p. 271). In line with this statement, the final cross-domain finding is that the effects of knowledge can interact with the affordances or biases inherent in the visualization method. Knowledge may be held temporarily in working memory (Type 2), held in long-term memory but effortfully used (Type 2), or held in long-term memory but automatically applied (Type 1). As a result, knowledge-driven processing can involve either Type 1 or Type 2 processes.

Both short- and long-term knowledge can influence visualization affordances and biases. However, it is difficult to distinguish whether Type 2 processing is using significant working memory capacity to temporarily hold knowledge or whether participants have stored the relevant knowledge in long-term memory and processing is more automatic. Complicating the issue, knowledge stored in long-term memory can influence decision making with visualizations using both Type 1 and 2 processing. For example, if you try to remember the Pythagorean theorem, which you may have learned in middle or high school, you may recall that a² + b² = c², where c represents the length of the hypotenuse and a and b represent the lengths of the other two sides of a right triangle. Unless you use geometry regularly, you likely had to effortfully search long-term memory for the equation, which is a Type 2 process and requires significant working memory capacity. In contrast, if you are asked to recall your childhood phone number, the number might automatically come to mind with minimal working memory required (Type 1 processing).

In this section, we highlight cases where knowledge either influenced decision making with visualizations or was present but did not influence decisions (see Table  6 for the type of knowledge examined in each study). These studies are organized based on how much time the viewers had to incorporate the knowledge (i.e. short-term instructions versus long-term individual differences in abilities and expertise), which may be indicative of where the knowledge is stored. However, many factors other than time influence the transfer of knowledge from working memory to long-term memory. Therefore, each of the studies cited in this section could involve Type 1 processing, Type 2 processing, or both.

One example of participants using short-term knowledge to override a familiarity bias comes from work by Bailey, Carswell, Grant, and Basham ( 2007 ) (see also Shen, Carswell, Santhanam, & Bailey, 2012 ). In a complex geospatial task in which participants made judgments about terrorism threats, participants were more likely to select familiar map-like visualizations rather than ones that would be optimal for the task (see Fig.  16 ) (Bailey et al., 2007 ). Using the same task and visualizations, Shen et al. ( 2012 ) showed that users were more likely to choose an efficacious visualization when given training concerning the importance of cognitive fit and effective visualization techniques. In this case, viewers were able to use knowledge-driven processing to improve their performance. However, Joslyn and LeClerc ( 2013 ) found that when participants viewed temperature uncertainty, visualized as error bars around a mean temperature prediction, they incorrectly believed that the error bars represented high and low temperatures. Surprisingly, participants maintained this belief despite a key that detailed the correct way to interpret each temperature forecast (see also Boone et al., in press ). The authors speculated that the error bars might have matched viewers’ mental schema for high- and low-temperature forecasts (stored in long-term memory) and that viewers incorrectly utilized the high-/low-temperature schema rather than incorporating new information from the key. Additionally, the authors propose that because the error bars were visually represented as discrete values, viewers may have had difficulty reimagining the error bars as points on a distribution, which they term a deterministic construal error (Joslyn & LeClerc, 2013 ). Deterministic construal visual-spatial biases may also be one of the sources of misunderstanding of the Cone of Uncertainty (Padilla, Ruginski et al., 2017 ; Ruginski et al., 2016 ).
A notable difference between these studies and the work of Shen et al. ( 2012 ) is that Shen et al. ( 2012 ) used instructions to correct a familiarity bias, which is a cognitive bias originally documented in the decision-making literature that is not based on the visual elements in the display. In contrast, the biases in Joslyn and LeClerc ( 2013 ) were visual-spatial biases. This provides further evidence that visual-spatial biases may be a unique category of biases that warrant dedicated exploration, as they are harder to influence with knowledge-driven processing.

figure 16

Example of the different view orientations examined by Bailey et al. ( 2007 ). Participants selected one of these visualizations and then used their selection to make judgments including identifying safe passageways, determining appropriate locations for firefighters, and identifying suspicious locations based on the height of buildings. The panels correspond to the conditions in the original study

Regarding longer-term knowledge, there is substantial evidence that individual differences in knowledge impact decision making with visualizations. For example, numerous studies document the benefit of visualizations for individuals with less health literacy, graph literacy, and numeracy (Galesic & Garcia-Retamero, 2011 ; Galesic, Garcia-Retamero, & Gigerenzer, 2009 ; Keller, Siegrist, & Visschers, 2009 ; Okan, Galesic, & Garcia-Retamero, 2015 ; Okan, Garcia-Retamero, Cokely, & Maldonado, 2012 ; Okan, Garcia-Retamero, Galesic, & Cokely, 2012 ; Reyna, Nelson, Han, & Dieckmann, 2009 ; Rodríguez et al., 2013 ). Visual depictions of health data are particularly useful because health data often take the form of probabilities, which are unintuitive. Visualizations inherently illustrate probabilities (e.g. 10%) as natural frequencies (e.g. 10 out of 100), which are more intuitive (Hoffrage & Gigerenzer, 1998 ). Further, by depicting natural frequencies visually (see example in Fig.  17 ), viewers can make perceptual comparisons rather than mathematical calculations. This dual benefit is likely the reason visualizations produce facilitation for individuals with less health literacy, graph literacy, and numeracy.

figure 17

Example of stimuli used by Galesic et al. ( 2009 ) in a study demonstrating that natural frequency visualizations can help individuals overcome low numeracy. Participants completed three medical scenario tasks using similar visualizations as depicted here, in which they were asked about the effects of aspirin on risk of stroke or heart attack and about a hypothetical new drug. Redrawn from “Using icon arrays to communicate medical risks: overcoming low numeracy” by M. Galesic, R. Garcia-Retamero, and G. Gigerenzer. 2009, Health Psychology, 28 (2), 210
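The probability-to-frequency reframing that icon arrays rely on can be sketched in a few lines of code. This is an illustrative toy, not code from the cited studies; the function names and the text-based rendering are our own:

```python
def to_natural_frequency(probability, reference_class=100):
    """Convert a probability (e.g., 0.10) into the natural-frequency
    statement that icon arrays depict (e.g., "10 out of 100")."""
    affected = round(probability * reference_class)
    return affected, reference_class


def icon_array(probability, columns=10, reference_class=100):
    """Render a text icon array: 'X' marks affected, '.' unaffected."""
    affected, total = to_natural_frequency(probability, reference_class)
    icons = ['X'] * affected + ['.'] * (total - affected)
    rows = [' '.join(icons[i:i + columns]) for i in range(0, total, columns)]
    return '\n'.join(rows)


# A 10% risk reads as "10 out of 100": 10 filled icons in a 10 x 10 grid.
affected, total = to_natural_frequency(0.10)
print(f'{affected} out of {total}')
print(icon_array(0.10))
```

Because the array expresses 10% as 10 highlighted icons out of 100, viewers can compare risks perceptually rather than by computing with probabilities, which is the facilitation the reviewed studies report.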

These studies are good examples of how designers can create visualizations that capitalize on Type 1 processing to help viewers accurately make decisions with complex data even when they lack relevant knowledge. Based on the reviewed work, we speculate that well-designed visualizations that utilize Type 1 processing to intuitively illustrate task-relevant relationships in the data may be particularly beneficial for individuals with less numeracy and graph literacy, even for simple tasks. However, poorly designed visualizations that require superfluous mental transformations may be detrimental to the same individuals. Further, individual differences in expertise, such as graph literacy, which have received more attention in healthcare communication (Galesic & Garcia-Retamero, 2011 ; Nayak et al., 2016 ; Okan et al., 2015 ; Okan, Garcia-Retamero, Cokely, & Maldonado, 2012 ; Okan, Garcia-Retamero, Galesic, & Cokely, 2012 ; Rodríguez et al., 2013 ), may play a large role in how viewers complete even simple tasks in other domains such as map-reading (Kinkeldey et al., 2017 ).

Less consistent are findings on how more experienced users incorporate knowledge acquired over longer periods of time to make decisions with visualizations. Some research finds that students’ decision-making and spatial abilities improved during a semester-long course on Geographic Information Science (GIS) (Lee & Bednarz, 2009 ). Other work finds that experts perform the same as novices (Riveiro, 2016 ), that experts can exhibit visual-spatial biases (St. John et al., 2001 ), and that experts perform more poorly than expected in their domain of visual expertise (Belia et al., 2005 ). This inconsistency may be due in part to the difficulty of identifying when and if more experienced viewers are automatically applying their knowledge or employing working memory. For example, it is unclear if the students in the GIS course documented by Lee and Bednarz ( 2009 ) developed automatic responses (Type 1) or if they learned the information and used working memory capacity to apply their training (Type 2).

Cheong et al. ( 2016 ) offer one way to gauge how performance may change when one is forced to use Type 1 processing, but then allowed to use Type 2 processing. In a wildfire task using multiple depictions of uncertainty (see Fig.  18 ), Cheong et al. ( 2016 ) found that the type of uncertainty visualization mattered when participants had to make fast Type 1 decisions (5 s) about evacuating from a wildfire. But when given sufficient time to make Type 2 decisions (30 s), participants were not influenced by the visualization technique (see also Wilkening & Fabrikant, 2011 ).

figure 18

Example of multiple uncertainty visualization techniques for wildfire risk by Cheong et al. ( 2016 ). Participants were presented with a house location (indicated by an X), and asked if they would stay or leave based on one of the wildfire hazard communication techniques shown here. The panels correspond to the conditions in the original study

Interesting future work could limit experts’ time to complete a task (forcing Type 1 processing) and then determine if their judgments change when given more time to complete the task (allowing for Type 2 processing). To test this possibility further, a dual-task paradigm could be used in which experts’ working memory capacity is depleted by a difficult secondary task that also requires working memory capacity. Examples of secondary tasks in a dual-task paradigm include span tasks, in which participants remember or follow patterns of information while completing the primary task and then report the remembered or relevant information from the pattern (for a full description of the theoretical bases for a dual-task paradigm see Pashler, 1994 ). To our knowledge, only one study has used a dual-task paradigm to evaluate the cognitive load of a visualization decision-making task (Bandlow et al., 2011 ). However, a growing body of research in other domains, such as wayfinding and spatial cognition, demonstrates the utility of dual-task paradigms for understanding the types of working memory that users employ for a task (Caffò, Picucci, Di Masi, & Bosco, 2011 ; Meilinger, Knauff, & Bülthoff, 2008 ; Ratliff & Newcombe, 2005 ; Trueswell & Papafragou, 2010 ).

Span tasks can be spatial or verbal; examples include remembering the orientations of arrows (taxing visual-spatial memory; Shah & Miyake, 1996 ) or counting backward by threes (taxing verbal processing and short-term memory; Castro, Strayer, Matzke, & Heathcote, 2018 ). One should expect more interference if the primary and secondary tasks recruit the same processes (i.e. a visual-spatial primary task paired with a visual-spatial memory span task). An example of such an experimental design is illustrated in Fig.  19 . In the dual-task trial illustrated in Fig.  19 , if participants’ responses are as fast and accurate as in the baseline trial, then participants are likely not using significant amounts of working memory capacity for that task. If the task does require significant working memory capacity, then the inclusion of the secondary task should increase the time taken to complete the primary task and potentially produce errors in both the secondary and primary tasks. In visualization decision-making research, this is an open area of exploration for researchers and designers who are interested in understanding how working memory capacity and a dual-process account of decision making apply to their visualizations and application domains.

figure 19

A diagram of a dual-tasking experiment is shown using the same task as in Fig. 5 . Responses resulting from Type 1 and 2 processing are illustrated. The dual-task trial illustrates how to place additional load on working memory capacity by having the participant perform a demanding secondary task. The impact of the secondary task is illustrated for both time and accuracy. Long-term memory can influence all components and processes in the model either via pre-attentive processes or by conscious application of knowledge
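The interference logic behind this design can be sketched as a minimal analysis. The function name and the response-time values below are hypothetical, chosen only to illustrate the comparison:

```python
import statistics


def interference_effect(baseline_rts, dual_task_rts):
    """Mean slowdown attributable to the secondary task: a large value
    suggests the primary task competes for working memory (Type 2);
    a near-zero value is consistent with automatic Type 1 processing."""
    return statistics.mean(dual_task_rts) - statistics.mean(baseline_rts)


# Hypothetical response times (in seconds) for one visualization task,
# measured alone (baseline) and paired with a spatial span task.
baseline = [1.8, 2.1, 1.9, 2.0]
with_span_task = [2.9, 3.2, 3.0, 3.1]

slowdown = interference_effect(baseline, with_span_task)
print(round(slowdown, 2))
```

In a real study, accuracy on both tasks would be analyzed alongside response times, and competing visualization techniques would be compared by their relative interference effects.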

In sum, this section documents cases where knowledge-driven processing does and does not influence decision making with visualizations. Notably, we describe numerous studies where well-designed visualizations (capitalizing on Type 1 processing) focus viewers’ attention on task-relevant relationships in the data, which improves decision accuracy for individuals with less developed health literacy, graph literacy, and numeracy. However, the current work does not test how knowledge-driven processing maps onto the dual-process model of decision making. Knowledge may be held temporarily in working memory (Type 2), held in long-term memory but effortfully utilized (Type 2), or held in long-term memory but automatically applied (Type 1). More work is needed to understand if a dual-process account of decision making accurately describes the influence of knowledge-driven processing on decision making with visualizations. Finally, we detailed an example of a dual-task paradigm as one way to evaluate whether viewers are employing Type 1 processing.

Review summary

Throughout this review, we have provided significant direct and indirect evidence that a dual-process account of decision making effectively describes prior findings from numerous domains interested in visualization decision making. The reviewed work provides support for specific processes in our proposed model, including the influences of working memory, bottom-up attention, schema matching, inference processes, and decision making. Further, we identified key commonalities in the reviewed work relating to Type 1 and Type 2 processing, which we added to our proposed visualization decision-making model. The first is that, via Type 1 processing, visualizations direct participants’ bottom-up attention to specific information, which can be either beneficial or detrimental for decision making (Fabrikant et al., 2010 ; Fagerlin et al., 2005 ; Hegarty et al., 2010 ; Hegarty et al., 2016 ; Padilla, Ruginski et al., 2017 ; Ruginski et al., 2016 ; Schirillo & Stone, 2005 ; Stone et al., 1997 ; Stone et al., 2003 ; Waters et al., 2007 ). Consistent with assertions from cognitive science and scientific visualization (Munzner, 2014 ), we propose that visualization designers should identify the critical information needed for a task and use a visual encoding technique that directs participants’ attention to this information. We encourage visualization designers who are interested in determining which elements in their visualizations will likely attract viewers’ bottom-up attention to consult the Itti et al. ( 1998 ) saliency model, which has been validated with eye-tracking measures (for an implementation of this model along with Matlab code see Padilla, Ruginski et al., 2017 ). If deliberate effort is not made to capitalize on Type 1 processing by focusing the viewer’s attention on task-relevant information, then the viewer will likely focus on distractors via Type 1 processing, resulting in poor decision outcomes.
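As a rough intuition for what a saliency model computes, the toy sketch below scores each pixel by its local intensity contrast. This is a drastic simplification of the Itti et al. ( 1998 ) model, which combines multi-scale color, intensity, and orientation channels; the function name and example grid here are our own illustration:

```python
def local_contrast_saliency(image):
    """Score each pixel by the absolute difference between its intensity
    and the mean intensity of its immediate neighbors. A toy stand-in
    for a saliency model: it captures only single-scale local contrast."""
    h, w = len(image), len(image[0])
    saliency = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            neighbors = [image[ny][nx]
                         for ny in range(max(0, y - 1), min(h, y + 2))
                         for nx in range(max(0, x - 1), min(w, x + 2))
                         if (ny, nx) != (y, x)]
            saliency[y][x] = abs(image[y][x] - sum(neighbors) / len(neighbors))
    return saliency


# A uniform field with a single bright pixel: the bright pixel receives
# the highest saliency score, i.e., it should capture bottom-up attention.
img = [[0.1] * 5 for _ in range(5)]
img[2][2] = 1.0
sal = local_contrast_saliency(img)
print(sal[2][2] == max(max(row) for row in sal))
```

The design point mirrors the recommendation above: if the task-relevant data are not the highest-contrast elements, a model like this predicts that viewers’ bottom-up attention will go elsewhere.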

A second cross-domain finding is the introduction of a new concept, visual-spatial biases , which can be both beneficial and detrimental to decision making. We define a visual-spatial bias as a bias that elicits heuristics and is a direct result of the visual encoding technique. We provide numerous examples of visual-spatial biases across domains. The novel utility of identifying visual-spatial biases is that they potentially arise early in the decision-making process, during bottom-up attention, thus influencing the entire downstream process, whereas standard heuristics do not exclusively occur at the first stage of decision making. This may account for the fact that visual-spatial biases have proven difficult to overcome (Belia et al., 2005 ; Grounds et al., 2017 ; Joslyn & LeClerc, 2013 ; Liu et al., 2016 ; McKenzie et al., 2016 ; Newman & Scholl, 2012 ; Padilla, Ruginski et al., 2017 ; Ruginski et al., 2016 ). Work by Tversky ( 2011 ) presents a taxonomy of visual-spatial communications that are intrinsically related to thought, which are likely the bases for visual-spatial biases.

We have also revealed cross-domain findings involving Type 2 processing, which suggest that if there is a mismatch between the visualization and a decision-making component, working memory is used to perform corrective mental transformations. In scenarios where the visualization is aligned with the mental schema and task, performance is fast and accurate (Joslyn & LeClerc, 2013 ). The types of mismatches observed in the reviewed literature are likely both domain-specific and domain-general. For example, situations where viewers employ the correct graph schema for the visualization, but the graph schema does not align with the task, are likely domain-specific (Dennis & Carte, 1998 ; Frownfelter-Lohrke, 1998 ; Gattis & Holyoak, 1996 ; Huang et al., 2006 ; Joslyn & LeClerc, 2013 ; Smelcer & Carmel, 1997 ; Tversky et al., 2012 ). However, other work demonstrates cases where viewers employ a graph schema that does not match the visualization, which is likely domain-general (e.g. Feeney et al., 2000 ; Gattis & Holyoak, 1996 ; Tversky et al., 2012 ). In these cases, viewers could accidentally use the wrong graph schema because it appears to match the visualization, or they might not have learned a relevant schema. The likelihood of viewers making attribution errors because they do not know the corresponding schema increases when the visualization is less common, such as with uncertainty visualizations. When there is a mismatch, additional working memory is required, resulting in increased time taken to complete the task and, in some cases, errors (e.g. Joslyn & LeClerc, 2013 ; McKenzie et al., 2016 ; Padilla, Ruginski et al., 2017 ). Based on these findings, we recommend that visualization designers aim to create visualizations that most closely align with a viewer’s mental schema and task.
However, additional empirical research is required to understand the nature of the alignment processes, including the exact method we use to mentally select a schema and the classifications of tasks that match visualizations.

The final cross-domain finding is that knowledge-driven processes can interact with or override the effects of visualization methods. We find that both short-term (Dennis & Carte, 1998 ; Feeney et al., 2000 ; Gattis & Holyoak, 1996 ; Joslyn & LeClerc, 2013 ; Smelcer & Carmel, 1997 ; Tversky et al., 2012 ) and long-term knowledge acquisition (Shen et al., 2012 ) can influence decision making with visualizations. However, there are also examples of knowledge having little influence on decisions, even when prior knowledge could be used to improve performance (Galesic et al., 2009 ; Galesic & Garcia-Retamero, 2011 ; Keller et al., 2009 ; Lee & Bednarz, 2009 ; Okan et al., 2015 ; Okan, Garcia-Retamero, Cokely, & Maldonado, 2012 ; Okan, Garcia-Retamero, Galesic, & Cokely, 2012 ; Reyna et al., 2009 ; Rodríguez et al., 2013 ). We point out that prior knowledge seems to have more of an effect on non-visual-spatial biases, such as a familiarity bias (Belia et al., 2005 ; Joslyn & LeClerc, 2013 ; Riveiro, 2016 ; St. John et al., 2001 ), which suggests that visual-spatial biases may be closely related to bottom-up attention. Further, it is unclear from the reviewed work when knowledge switches from effortful application via working memory to automatic application. We argue that Type 1 and 2 processing have unique advantages and disadvantages for visualization decision making. Therefore, it is valuable to understand which process users are applying for specific tasks in order to make visualizations that elicit optimal performance. In the case of experts and long-term knowledge, we propose that one interesting way to test if users are utilizing significant working memory capacity is to employ a dual-task paradigm (illustrated in Fig.  19 ). A dual-task paradigm can be used to evaluate the amount of working memory required and to compare the relative working memory demands of competing visualization techniques.

We have also proposed a variety of practical recommendations for visualization designers based on the empirical findings and our cognitive framework. Below is a summary list of our recommendations along with relevant section references:

Identify the critical information needed for a task and use a visual encoding technique that directs participants’ attention to this information (“ Bottom-up attention ” section);

To determine which elements in a visualization will likely attract viewers’ bottom-up attention, try employing a saliency algorithm (see Padilla, Quinan, et al., 2017 ) (see “ Bottom-up attention ”);

Aim to create visualizations that most closely align with a viewer’s mental schema and task demands (see “ Visual-Spatial Biases ”);

Work to reduce the number of transformations required in the decision-making process (see " Cognitive fit ");

To understand if a viewer is using Type 1 or 2 processing, employ a dual-task paradigm (see Fig.  19 );

Consider evaluating the impact of individual differences such as graph literacy and numeracy on visualization decision making.

Conclusions

We use visual information to inform many important decisions. To develop visualizations that account for real-life decision making, we must understand how and why we come to conclusions with visual information. We propose a dual-process cognitive framework expanding on visualization comprehension theory that is supported by empirical studies to describe the process of decision making with visualizations. We offer practical recommendations for visualization designers that take into account human decision-making processes. Finally, we propose a new avenue of research focused on the influence of visual-spatial biases on decision making.

Change history

02 September 2018

The original article (Padilla et al., 2018) contained a formatting error in Table 2; this has now been corrected with the appropriate boxes marked clearly.

Dual-process theory will be described in greater detail in the next section.

It should be noted that in some cases the activation of Type 2 processing should improve decision accuracy. More research is needed that examines cases where Type 2 could improve decision performance with visualizations.

Ancker, J. S., Senathirajah, Y., Kukafka, R., & Starren, J. B. (2006). Design features of graphs in health risk communication: A systematic review. Journal of the American Medical Informatics Association , 13 (6), 608–618.


Baddeley, A. D., & Hitch, G. (1974). Working memory. Psychology of Learning and Motivation , 8 , 47–89.

Bailey, K., Carswell, C. M., Grant, R., & Basham, L. (2007). Geospatial perspective-taking: how well do decision makers choose their views? ​In  Proceedings of the Human Factors and Ergonomics Society Annual Meeting  (Vol. 51, No. 18, pp. 1246-1248). Los Angeles: SAGE Publications.

Balleine, B. W. (2007). The neural basis of choice and decision making. Journal of Neuroscience , 27 (31), 8159–8160.

Bandlow, A., Matzen, L. E., Cole, K. S., Dornburg, C. C., Geiseler, C. J., Greenfield, J. A., … Stevens-Adams, S. M. (2011). Evaluating Information Visualizations with Working Memory Metrics. In HCI International 2011–Posters’ Extended Abstracts , (pp. 265–269).


Belia, S., Fidler, F., Williams, J., & Cumming, G. (2005). Researchers misunderstand confidence intervals and standard error bars. Psychological Methods , 10 (4), 389.

Bertin, J. (1983). Semiology of graphics: Diagrams, networks, maps . ​Madison: University of Wisconsin Press.

Boone, A., Gunalp, P., & Hegarty, M. (in press). Explicit versus Actionable Knowledge: The Influence of Explaining Graphical Conventions on Interpretation of Hurricane Forecast Visualizations. Journal of Experimental Psychology: Applied .

Brügger, A., Fabrikant, S. I., & Çöltekin, A. (2017). An empirical evaluation of three elevation change symbolization methods along routes in bicycle maps. Cartography and Geographic Information Science , 44 (5), 436–451.

Caffò, A. O., Picucci, L., Di Masi, M. N., & Bosco, A. (2011). Working memory components and virtual reorientation: A dual-task study. In Working memory: capacity, developments and improvement techniques , (pp. 249–266). Hauppage: Nova Science Publishers.


Card, S. K., Mackinlay, J. D., & Shneiderman, B. (1999). Readings in information visualization: using vision to think .  San Francisco: Morgan Kaufmann Publishers Inc.

Castro, S. C., Strayer, D. L., Matzke, D., & Heathcote, A. (2018). Cognitive Workload Measurement and Modeling Under Divided Attention. Journal of Experimental Psychology: General .

Cheong, L., Bleisch, S., Kealy, A., Tolhurst, K., Wilkening, T., & Duckham, M. (2016). Evaluating the impact of visualization of wildfire hazard upon decision-making under uncertainty. International Journal of Geographical Information Science , 30 (7), 1377–1404.

Connor, C. E., Egeth, H. E., & Yantis, S. (2004). Visual attention: Bottom-up versus top-down. Current Biology , 14 (19), R850–R852.

Cowan, N. (2017). The many faces of working memory and short-term storage. Psychonomic Bulletin & Review , 24 (4), 1158–1170.

Dennis, A. R., & Carte, T. A. (1998). Using geographical information systems for decision making: Extending cognitive fit theory to map-based presentations. Information Systems Research , 9 (2), 194–203.

Engel, A. K., Fries, P., & Singer, W. (2001). Dynamic predictions: Oscillations and synchrony in top–down processing. Nature Reviews Neuroscience , 2 (10), 704–716.

Engle, R. W., Kane, M. J., & Tuholski, S. W. (1999). Individual differences in working memory capacity and what they tell us about controlled attention, general fluid intelligence, and functions of the prefrontal cortex. ​ In A. Miyake & P. Shah (Eds.),  Models of working memory: Mechanisms of active maintenance and executive control  (pp. 102-134). New York: Cambridge University Press.

Epstein, S., Pacini, R., Denes-Raj, V., & Heier, H. (1996). Individual differences in intuitive–experiential and analytical–rational thinking styles. Journal of Personality and Social Psychology , 71 (2), 390.

Evans, J. S. B. (2008). Dual-processing accounts of reasoning, judgment, and social cognition. Annual Review of Psychology , 59 , 255–278.

Evans, J. S. B., & Stanovich, K. E. (2013). Dual-process theories of higher cognition: Advancing the debate. Perspectives on Psychological Science , 8 (3), 223–241.

Fabrikant, S. I., Hespanha, S. R., & Hegarty, M. (2010). Cognitively inspired and perceptually salient graphic displays for efficient spatial inference making. Annals of the Association of American Geographers , 100 (1), 13–29.

Fabrikant, S. I., & Skupin, A. (2005). Cognitively plausible information visualization. In Exploring geovisualization , (pp. 667–690). Oxford: Elsevier.

Fagerlin, A., Wang, C., & Ubel, P. A. (2005). Reducing the influence of anecdotal reasoning on people’s health care decisions: Is a picture worth a thousand statistics? Medical Decision Making , 25 (4), 398–405.

Feeney, A., Hola, A. K. W., Liversedge, S. P., Findlay, J. M., & Metcalf, R. (2000). How people extract information from graphs: Evidence from a sentence-graph verification paradigm. ​In  International Conference on Theory and Application of Diagrams  (pp. 149-161). Berlin, Heidelberg: Springer.

Frownfelter-Lohrke, C. (1998). The effects of differing information presentations of general purpose financial statements on users’ decisions. Journal of Information Systems , 12 (2), 99–107.

Galesic, M., & Garcia-Retamero, R. (2011). Graph literacy: A cross-cultural comparison. Medical Decision Making , 31 (3), 444–457.

Galesic, M., Garcia-Retamero, R., & Gigerenzer, G. (2009). Using icon arrays to communicate medical risks: Overcoming low numeracy. Health Psychology , 28 (2), 210.

Garcia-Retamero, R., & Galesic, M. (2009). Trust in healthcare. In Kattan (Ed.), Encyclopedia of medical decision making , (pp. 1153–1155). Thousand Oaks: SAGE Publications.

Gattis, M., & Holyoak, K. J. (1996). Mapping conceptual to spatial relations in visual reasoning. Journal of Experimental Psychology: Learning, Memory, and Cognition , 22 (1), 231.


Gigerenzer, G., & Gaissmaier, W. (2011). Heuristic decision making. Annual Review of Psychology , 62 , 451–482.

Gigerenzer, G., Todd, P. M., & ABC Research Group (2000). Simple Heuristics That Make Us Smart . ​Oxford: Oxford University Press.

Grounds, M. A., Joslyn, S., & Otsuka, K. (2017). Probabilistic interval forecasts: An individual differences approach to understanding forecast communication. Advances in Meteorology , 2017,  1-18.

Harel, J. (2015, July 24, 2012). A Saliency Implementation in MATLAB. Retrieved from http://www.vision.caltech.edu/~harel/share/gbvs.php

Hegarty, M. (2011). The cognitive science of visual-spatial displays: Implications for design. Topics in Cognitive Science , 3 (3), 446–474.

Hegarty, M., Canham, M. S., & Fabrikant, S. I. (2010). Thinking about the weather: How display salience and knowledge affect performance in a graphic inference task. Journal of Experimental Psychology: Learning, Memory, and Cognition , 36 (1), 37.

Hegarty, M., Friedman, A., Boone, A. P., & Barrett, T. J. (2016). Where are you? The effect of uncertainty and its visual representation on location judgments in GPS-like displays. Journal of Experimental Psychology: Applied , 22 (4), 381.

Hegarty, M., Smallman, H. S., & Stull, A. T. (2012). Choosing and using geospatial displays: Effects of design on performance and metacognition. Journal of Experimental Psychology: Applied , 18 (1), 1.

Hoffrage, U., & Gigerenzer, G. (1998). Using natural frequencies to improve diagnostic inferences. Academic Medicine , 73 (5), 538–540.

Hollands, J. G., & Spence, I. (1992). Judgments of change and proportion in graphical perception. Human Factors: The Journal of the Human Factors and Ergonomics Society , 34 (3), 313–334.

Huang, Z., Chen, H., Guo, F., Xu, J. J., Wu, S., & Chen, W.-H. (2006). Expertise visualization: An implementation and study based on cognitive fit theory. Decision Support Systems , 42 (3), 1539–1557.

Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence , 20 (11), 1254–1259.

Joslyn, S., & LeClerc, J. (2013). Decisions with uncertainty: The glass half full. Current Directions in Psychological Science , 22 (4), 308–315.

Kahneman, D. (2011). Thinking, fast and slow . (Vol. 1). New York: Farrar, Straus and Giroux.

Kahneman, D., & Frederick, S. (2002). Representativeness revisited: Attribute substitution in intuitive judgment. In Heuristics and biases: The psychology of intuitive judgment , (p. 49).

Kahneman, D., & Tversky, A. (1982). Judgment under Uncertainty: Heuristics and Biases , (1st ed., ). Cambridge; NY: Cambridge University Press.

Book   Google Scholar  

Kane, M. J., Bleckley, M. K., Conway, A. R. A., & Engle, R. W. (2001). A controlled-attention view of working-memory capacity. Journal of Experimental Psychology: General , 130 (2), 169.

Keehner, M., Mayberry, L., & Fischer, M. H. (2011). Different clues from different views: The role of image format in public perceptions of neuroimaging results. Psychonomic Bulletin & Review , 18 (2), 422–428.

Keller, C., Siegrist, M., & Visschers, V. (2009). Effect of risk ladder format on risk perception in high-and low-numerate individuals. Risk Analysis , 29 (9), 1255–1264.

Keren, G., & Schul, Y. (2009). Two is not always better than one: A critical evaluation of two-system theories. Perspectives on Psychological Science , 4 (6), 533–550.

Kinkeldey, C., MacEachren, A. M., Riveiro, M., & Schiewe, J. (2017). Evaluating the effect of visually represented geodata uncertainty on decision-making: Systematic review, lessons learned, and recommendations. Cartography and Geographic Information Science , 44 (1), 1–21. https://doi.org/10.1080/15230406.2015.1089792 .

Kinkeldey, C., MacEachren, A. M., & Schiewe, J. (2014). How to assess visual communication of uncertainty? A systematic review of geospatial uncertainty visualisation user studies. The Cartographic Journal , 51 (4), 372–386.

Kriz, S., & Hegarty, M. (2007). Top-down and bottom-up influences on learning from animations. International Journal of Human-Computer Studies , 65 (11), 911–930.

Kunz, V. (2004). Rational choice . Frankfurt: Campus Verlag.

Lallanilla, M. (2014, April 24, 2014 10:15 am). Misleading Gun-Death Chart Draws Fire. https://www.livescience.com/45083-misleading-gun-death-chart.html

Lee, J., & Bednarz, R. (2009). Effect of GIS learning on spatial thinking. Journal of Geography in Higher Education , 33 (2), 183–198.

Liu, L., Boone, A., Ruginski, I., Padilla, L., Hegarty, M., Creem-Regehr, S. H., … House, D. H. (2016). Uncertainty Visualization by Representative Sampling from Prediction Ensembles.  IEEE transactions on visualization and computer graphics, 23 (9), 2165-2178.

Lobben, A. K. (2004). Tasks, strategies, and cognitive processes associated with navigational map reading: A review perspective. The Professional Geographer , 56 (2), 270–281.

Lohse, G. L. (1993). A cognitive model for understanding graphical perception. Human Computer Interaction , 8 (4), 353–388.

Lohse, G. L. (1997). The role of working memory on graphical information processing. Behaviour & Information Technology , 16 (6), 297–308.

Marewski, J. N., & Gigerenzer, G. (2012). Heuristic decision making in medicine. Dialogues in Clinical Neuroscience , 14 (1), 77–89.

PubMed   PubMed Central   Google Scholar  

McCabe, D. P., & Castel, A. D. (2008). Seeing is believing: The effect of brain images on judgments of scientific reasoning. Cognition , 107 (1), 343–352.

McKenzie, G., Hegarty, M., Barrett, T., & Goodchild, M. (2016). Assessing the effectiveness of different visualizations for judgments of positional uncertainty. International Journal of Geographical Information Science , 30 (2), 221–239.

Mechelli, A., Price, C. J., Friston, K. J., & Ishai, A. (2004). Where bottom-up meets top-down: Neuronal interactions during perception and imagery. Cerebral Cortex , 14 (11), 1256–1265.

Meilinger, T., Knauff, M., & Bülthoff, H. H. (2008). Working memory in wayfinding—A dual task experiment in a virtual city. Cognitive Science , 32 (4), 755–770.

Meyer, J. (2000). Performance with tables and graphs: Effects of training and a visual search model. Ergonomics , 43 (11), 1840–1865.

Munzner, T. (2014). Visualization analysis and design . Boca Raton, FL: CRC Press.

Nadav-Greenberg, L., Joslyn, S. L., & Taing, M. U. (2008). The effect of uncertainty visualizations on decision making in weather forecasting. Journal of Cognitive Engineering and Decision Making , 2 (1), 24–47.

Nayak, J. G., Hartzler, A. L., Macleod, L. C., Izard, J. P., Dalkin, B. M., & Gore, J. L. (2016). Relevance of graph literacy in the development of patient-centered communication tools. Patient Education and Counseling , 99 (3), 448–454.

Newman, G. E., & Scholl, B. J. (2012). Bar graphs depicting averages are perceptually misinterpreted: The within-the-bar bias. Psychonomic Bulletin & Review , 19 (4), 601–607. https://doi.org/10.3758/s13423-012-0247-5 .

Okan, Y., Galesic, M., & Garcia-Retamero, R. (2015). How people with low and high graph literacy process health graphs: Evidence from eye-tracking. Journal of Behavioral Decision Making .

Okan, Y., Garcia-Retamero, R., Cokely, E. T., & Maldonado, A. (2012). Individual differences in graph literacy: Overcoming denominator neglect in risk comprehension. Journal of Behavioral Decision Making , 25 (4), 390–401.

Okan, Y., Garcia-Retamero, R., Galesic, M., & Cokely, E. T. (2012). When higher bars are not larger quantities: On individual differences in the use of spatial information in graph comprehension. Spatial Cognition and Computation , 12 (2–3), 195–218.

Padilla, L., Hansen, G., Ruginski, I. T., Kramer, H. S., Thompson, W. B., & Creem-Regehr, S. H. (2015). The influence of different graphical displays on nonexpert decision making under uncertainty. Journal of Experimental Psychology: Applied , 21 (1), 37.

Padilla, L., Quinan, P. S., Meyer, M., & Creem-Regehr, S. H. (2017). Evaluating the impact of binning 2d scalar fields. IEEE Transactions on Visualization and Computer Graphics , 23 (1), 431–440.

Padilla, L., Ruginski, I. T., & Creem-Regehr, S. H. (2017). Effects of ensemble and summary displays on interpretations of geospatial uncertainty data. Cognitive Research: Principles and Implications , 2 (1), 40.

Pashler, H. (1994). Dual-task interference in simple tasks: Data and theory. Psychological Bulletin , 116 (2), 220.

Patterson, R. E., Blaha, L. M., Grinstein, G. G., Liggett, K. K., Kaveney, D. E., Sheldon, K. C., … Moore, J. A. (2014). A human cognition framework for information visualization. Computers & Graphics , 42 , 42–58.

Pinker, S. (1990). A theory of graph comprehension. In Artificial intelligence and the future of testing , (pp. 73–126).

Ratliff, K. R., & Newcombe, N. S. (2005). Human spatial reorientation using dual task paradigms . Paper presented at the Proceedings of the Annual Cognitive Science Society.

Reyna, V. F., Nelson, W. L., Han, P. K., & Dieckmann, N. F. (2009). How numeracy influences risk comprehension and medical decision making. Psychological Bulletin , 135 (6), 943.

Riveiro, M. (2016). Visually supported reasoning under uncertain conditions: Effects of domain expertise on air traffic risk assessment. Spatial Cognition and Computation , 16 (2), 133–153.

Rodríguez, V., Andrade, A. D., García-Retamero, R., Anam, R., Rodríguez, R., Lisigurski, M., … Ruiz, J. G. (2013). Health literacy, numeracy, and graphical literacy among veterans in primary care and their effect on shared decision making and trust in physicians. Journal of Health Communication , 18 (sup1), 273–289.

Rosenholtz, R., & Jin, Z. (2005). A computational form of the statistical saliency model for visual search. Journal of Vision , 5 (8), 777–777.

Ruginski, I. T., Boone, A. P., Padilla, L., Liu, L., Heydari, N., Kramer, H. S., … Creem-Regehr, S. H. (2016). Non-expert interpretations of hurricane forecast uncertainty visualizations. Spatial Cognition and Computation , 16 (2), 154–172.

Sanchez, C. A., & Wiley, J. (2006). An examination of the seductive details effect in terms of working memory capacity. Memory & Cognition , 34 (2), 344–355.

Schirillo, J. A., & Stone, E. R. (2005). The greater ability of graphical versus numerical displays to increase risk avoidance involves a common mechanism. Risk Analysis , 25 (3), 555–566.

Shah, P., & Freedman, E. G. (2011). Bar and line graph comprehension: An interaction of top-down and bottom-up processes. Topics in Cognitive Science , 3 (3), 560–578.

Shah, P., Freedman, E. G., & Vekiri, I. (2005). The Comprehension of Quantitative Information in Graphical Displays . In P. Shah (Ed.) & A. Miyake, The Cambridge Handbook of Visuospatial Thinking (pp. 426-476). New York: Cambridge University Press.

Shah, P., & Miyake, A. (1996). The separability of working memory resources for spatial thinking and language processing: An individual differences approach. Journal of Experimental Psychology: General , 125 (1), 4.

Shen, M., Carswell, M., Santhanam, R., & Bailey, K. (2012). Emergency management information systems: Could decision makers be supported in choosing display formats? Decision Support Systems , 52 (2), 318–330.

Shipstead, Z., Harrison, T. L., & Engle, R. W. (2015). Working memory capacity and the scope and control of attention. Attention, Perception, & Psychophysics , 77 (6), 1863–1880.

Simkin, D., & Hastie, R. (1987). An information-processing analysis of graph perception. Journal of the American Statistical Association , 82 (398), 454–465.

Sloman, S. A. (2002). Two systems of reasoning. ​ In T. Gilovich, D. Griffin, & D. Kahneman (Eds.),  Heuristics and biases : The psychology of intuitive judgment (pp. 379-396). New York: Cambridge University Press.

Smelcer, J. B., & Carmel, E. (1997). The effectiveness of different representations for managerial problem solving: Comparing tables and maps. Decision Sciences , 28 (2), 391.

St. John, M., Cowen, M. B., Smallman, H. S., & Oonk, H. M. (2001). The use of 2D and 3D displays for shape-understanding versus relative-position tasks. Human Factors , 43 (1), 79–98.

Stanovich, K. E. (1999). Who is rational? Studies of individual differences in reasoning . New York City: Psychology Press.

Stenning, K., & Oberlander, J. (1995). A cognitive theory of graphical and linguistic reasoning: Logic and implementation. Cognitive Science , 19 (1), 97–140.

Stone, E. R., Sieck, W. R., Bull, B. E., Yates, J. F., Parks, S. C., & Rush, C. J. (2003). Foreground: Background salience: Explaining the effects of graphical displays on risk avoidance. Organizational Behavior and Human Decision Processes , 90 (1), 19–36.

Stone, E. R., Yates, J. F., & Parker, A. M. (1997). Effects of numerical and graphical displays on professed risk-taking behavior. Journal of Experimental Psychology: Applied , 3 (4), 243.

Trueswell, J. C., & Papafragou, A. (2010). Perceiving and remembering events cross-linguistically: Evidence from dual-task paradigms. Journal of Memory and Language , 63 (1), 64–82.

Tversky, B. (2005). Visuospatial reasoning. In K. Holyoak and R. G. Morrison (eds.), The Cambridge Handbook of Thinking and Reasoning , (pp. 209-240). Cambridge: Cambridge University Press.

Tversky, B. (2011). Visualizing thought. Topics in Cognitive Science , 3 (3), 499–535.

Tversky, B., Corter, J. E., Yu, L., Mason, D. L., & Nickerson, J. V. (2012). Representing Category and Continuum: Visualizing Thought . Paper presented at the International Conference on Theory and Application of Diagrams, Berlin, Heidelberg.

Vessey, I., & Galletta, D. (1991). Cognitive fit: An empirical study of information acquisition. Information Systems Research , 2 (1), 63–84.

Vessey, I., Zhang, P., & Galletta, D. (2006). The theory of cognitive fit. In Human-computer interaction and management information systems: Foundations , (pp. 141–183).

Von Neumann, J. (1953). Morgenstern, 0.(1944) theory of games and economic behavior . Princeton, NJ: Princeton UP.

Vranas, P. B. M. (2000). Gigerenzer's normative critique of Kahneman and Tversky. Cognition , 76 (3), 179–193.

Wainer, H., Hambleton, R. K., & Meara, K. (1999). Alternative displays for communicating NAEP results: A redesign and validity study. Journal of Educational Measurement , 36 (4), 301–335.

Waters, E. A., Weinstein, N. D., Colditz, G. A., & Emmons, K. (2006). Formats for improving risk communication in medical tradeoff decisions. Journal of Health Communication , 11 (2), 167–182.

Waters, E. A., Weinstein, N. D., Colditz, G. A., & Emmons, K. M. (2007). Reducing aversion to side effects in preventive medical treatment decisions. Journal of Experimental Psychology: Applied , 13 (1), 11.

Wilkening, J., & Fabrikant, S. I. (2011). How do decision time and realism affect map-based decision making? Paper presented at the International Conference on Spatial Information Theory.

Zhu, B., & Watts, S. A. (2010). Visualization of network concepts: The impact of working memory capacity differences. Information Systems Research , 21 (2), 327–344.



Visual Representation

What is Visual Representation?

Visual Representation refers to the principles by which markings on a surface are made and interpreted. Designers use representations like typography and illustrations to communicate information, emotions and concepts. Color, imagery, typography and layout are crucial in this communication.

Alan Blackwell, cognitive scientist and professor, gives a brief introduction to visual representation:


We can see visual representation throughout human history, from cave drawings to data visualization:

Art uses visual representation to express emotions and abstract ideas.

Financial forecasting graphs condense data and research into a more straightforward format.

Icons on user interfaces (UI) represent different actions users can take.

The color of a notification indicates its nature and meaning.

A painting of an abstract night sky over a village, with a tree in the foreground.

Van Gogh's "The Starry Night" uses visuals to evoke deep emotions, representing an abstract, dreamy night sky. It exemplifies how art can communicate complex feelings and ideas.

© Public domain

Importance of Visual Representation in Design

Designers use visual representations for internal and external communication throughout the design process. For example:

Storyboards are illustrations that outline users’ actions and where they perform them.

Sitemaps are diagrams that show the hierarchy and navigation structure of a website.

Wireframes are sketches that bring together elements of a user interface's structure.

Usability reports use graphs and charts to communicate data gathered from usability testing.

User interfaces visually represent information contained in applications and computerized devices.

A sample usability report that shows a few statistics, a bell curve and a donut chart.

This usability report is straightforward to understand. Yet, the data behind the visualizations could come from thousands of answered surveys.

© Interaction Design Foundation, CC BY-SA 4.0

Visual representation simplifies complex ideas and data and makes them easy to understand. Without these visual aids, designers would struggle to communicate their ideas, findings and products. For example, it is easier to create a mockup of an e-commerce website interface than to describe it with words.

A side-by-side comparison of a simple mockup, and a very verbose description of the same mockup. A developer understands the simple one, and is confused by the verbose one.

Visual representation simplifies the communication of designs. Without mockups, it would be difficult for developers to reproduce designs using words alone.

Types of Visual Representation

Below are some of the most common forms of visual representation designers use.

Text and Typography

Text represents language and ideas through written characters and symbols. Readers visually perceive and interpret these characters. Typography turns text into a visual form, influencing its perception and interpretation.

We have developed the conventions of typography over centuries, for example, in documents, newspapers and magazines. These conventions include:

Text arranged on a grid brings clarity and structure. Gridded text makes complex information easier to navigate and understand. Tables, columns and other formats help organize content logically and enhance readability.

Contrasting text sizes create a visual hierarchy and draw attention to critical areas. For example, headings use larger text while body copy uses smaller text. This contrast helps readers distinguish between primary and secondary information.

Adequate spacing and paragraphing improve the readability and appearance of the text. These conventions prevent the content from appearing cluttered. Spacing and paragraphing make it easier for the eye to follow and for the brain to process the information.

Balanced image-to-text ratios create engaging layouts. Images break the monotony of text, provide visual relief and illustrate or emphasize points made in the text. A well-planned ratio ensures neither text nor images overwhelm each other. Effective ratios make designs more effective and appealing.

Designers use these conventions because people are familiar with them and better understand text presented in this manner.

A table of names and numbers indicating the funerals of victims of the plague in London in 1665.

This table of funerals from the plague in London in 1665 uses typographic conventions still used today. For example, the author arranged the information in a table and used contrasting text styling to highlight information in the header.
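One common way designers derive the contrasting text sizes described above is a modular type scale, where each heading level multiplies a base size by a constant ratio. Here is a minimal sketch; the base size and ratio are illustrative assumptions, not values from this article:

```python
# Sketch of a modular type scale: each level multiplies a base size by a
# constant ratio. BASE_SIZE_PX and RATIO are illustrative assumptions.
BASE_SIZE_PX = 16   # body copy
RATIO = 1.25        # a "major third" scale

def type_scale(level: int) -> float:
    """Font size in px for a hierarchy level (0 = body copy)."""
    return round(BASE_SIZE_PX * RATIO ** level, 1)

# Headings get progressively larger, producing the contrast that
# signals primary vs. secondary information.
sizes = {"body": type_scale(0), "h3": type_scale(1),
         "h2": type_scale(2), "h1": type_scale(3)}
```

Because every size derives from one ratio, the resulting hierarchy looks deliberate rather than arbitrary, which is exactly the role contrast plays in the conventions above.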

Illustrations and Drawings

Designers use illustrations and drawings independently or alongside text. An example of illustration used to communicate information is the assembly instructions created by furniture retailer IKEA. If IKEA used text instead of illustrations in their instructions, people would find it harder to assemble the furniture.

A diagram showing how to assemble a chest of drawers from furniture retailer IKEA.

IKEA assembly instructions use illustrations to inform customers how to build their furniture. The only text used is numeric to denote step and part numbers. IKEA communicates this information visually to: 1. Enable simple communication, 2. Ensure their instructions are easy to follow, regardless of the customer’s language.

© IKEA, Fair use

Illustrations and drawings can often convey the core message of a visual representation more effectively than a photograph. They focus on the core message, while a photograph might distract the viewer with additional details (such as who the person is or where they are from).

For example, in IKEA’s case, photographing a person building a piece of furniture might be complicated. Further, photographs may not be easy to understand in a black-and-white print, leading to higher printing costs. To be useful, the pictures would also need to be larger and would occupy more space on a printed manual, further adding to the costs.

But imagine a girl winking—this is something we can easily photograph. 

Ivan Sutherland, creator of the first graphical user interface, used his computer program Sketchpad to draw a winking girl. While not realistic, Sutherland's representation effectively portrays a winking girl. The drawing's abstract, generic elements contrast with the distinct winking eye. The graphical conventions of lines and shapes represent the eyes and mouth. The simplicity of the drawing does not draw attention away from the winking.

A simple illustration of a winking girl next to a photograph of a winking girl.

A photo might distract from the focused message compared to Sutherland's representation. In the photo, the other aspects of the image (i.e., the particular person) distract the viewer from this message.

© Ivan Sutherland, CC BY-SA 3.0 and Amina Filkins, Pexels License

Information and Data Visualization

Designers and other stakeholders use data and information visualization across many industries.

Data visualization uses charts and graphs to show raw data in a graphic form. Information visualization goes further, including more context and complex data sets. Information visualization often uses interactive elements to share a deeper understanding.

For example, most computerized devices have a battery level indicator. This is a type of data visualization. Information visualization takes this further by allowing you to click on the battery indicator for further insights. These insights may include the apps that use the most battery and the last time you charged your device.

A simple battery level icon next to a screenshot of a battery information dashboard.

macOS displays a battery icon in the menu bar that visualizes your device’s battery level. This is an example of data visualization. Meanwhile, macOS’s settings tell you battery level over time, screen-on-usage and when you last charged your device. These insights are actionable; users may notice their battery drains at a specific time. This is an example of information visualization.

© Low Battery by Jemis Mali, CC BY-NC-ND 4.0, and Apple, Fair use
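The battery example can be sketched in code. The text rendering and the per-app usage figures below are invented for illustration; the point is only the contrast between showing one value at a glance and adding actionable context:

```python
def battery_icon(level: float, width: int = 10) -> str:
    """Data visualization: reduce one value to a glanceable bar."""
    filled = round(level * width)
    return "[" + "#" * filled + "-" * (width - filled) + f"] {level:.0%}"

def battery_insights(level: float, usage_by_app: dict) -> list:
    """Information visualization: the same value plus context that
    supports action, e.g. which apps drain the battery most."""
    top = sorted(usage_by_app, key=usage_by_app.get, reverse=True)[:2]
    return [battery_icon(level)] + [
        f"{app}: {usage_by_app[app]:.0%} of drain" for app in top
    ]

print(battery_icon(0.7))  # [#######---] 70%
```

Calling `battery_insights(0.5, {"Maps": 0.4, "Mail": 0.1, "Music": 0.3})` would return the bar plus the two heaviest consumers, mirroring how macOS layers insights on top of the simple icon.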

Information visualization is not exclusive to numeric data. It encompasses representations like diagrams and maps. For example, Google Maps collates various types of data and information into one interface:

Data Representation: Google Maps transforms complex geographical data into an easily understandable and navigable visual map.

Interactivity: Users can interactively customize views that show traffic, satellite imagery and more in real-time.

Layered Information: Google Maps layers multiple data types (e.g., traffic, weather) over geographical maps for comprehensive visualization.

User-Centered Design: The interface is intuitive and user-friendly, with symbols and colors for straightforward data interpretation.

A screenshot of Google Maps showing the Design Museum in London, UK. On the left is a profile of the location, on the right is the map.

The volume of data contained in one screenshot of Google Maps is massive. However, this information is presented clearly to the user. Google Maps highlights different terrains with colors and local places and businesses with icons and colors. The panel on the left lists the selected location’s profile, which includes an image, rating and contact information.

© Google, Fair use

Symbolic Correspondence

Symbolic correspondence uses widely recognized symbols and signs to convey specific meanings. These familiar visual cues enable immediate understanding and remove the need for textual explanation.

For instance, a magnifying glass icon in UI design signifies the search function. Similarly, in environmental design, symbols for restrooms, parking and amenities guide visitors effectively.

A screenshot of the homepage Interaction Design Foundation website. Across the top is a menu bar. Beneath the menu bar is a header image with a call to action.

The Interaction Design Foundation (IxDF) website uses the universal magnifying glass symbol to signify the search function. Similarly, the play icon draws attention to a link to watch a video.

How Designers Create Visual Representations

Visual Language

Designers use elements like color, shape and texture to create a communicative visual experience, guided by these 8 principles:

Size – Larger elements tend to capture users' attention readily.

Color – Users are typically drawn to bright colors over muted shades.

Contrast – Colors with stark contrasts catch the eye more effectively.

Alignment – Unaligned elements are more noticeable than aligned ones.

Repetition – Similar styles repeated imply a relationship in content.

Proximity – Elements placed near each other appear to be connected.

Whitespace – Elements surrounded by ample space attract the eye.

Texture and Style – Users often notice richer textures before flat designs.


The 8 visual design principles.
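As an aside not from this article, the contrast principle can even be quantified: the W3C's WCAG guidelines define a contrast ratio between two colors based on their relative luminance. A rough sketch of that formula:

```python
def _linear(c: float) -> float:
    # sRGB channel to linear light, per the WCAG 2.x luminance definition
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(rgb) -> float:
    """Relative luminance of an (r, g, b) color with 0-255 channels."""
    r, g, b = (_linear(v / 255) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg) -> float:
    """WCAG contrast ratio, from 1:1 (identical) to 21:1 (black on white)."""
    lighter, darker = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # 21.0
```

A designer can use such a check to confirm that text and background colors differ starkly enough to catch the eye and remain readable.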

In web design , visual hierarchy uses color and repetition to direct the user's attention. Color choice is crucial as it creates contrast between different elements. Repetition helps to organize the design—it uses recurring elements to establish consistency and familiarity.

In this video, Alan Dix, Professor and Expert in Human-Computer Interaction, explains how visual alignment affects how we read and absorb information:

Correspondence Techniques

Designers use correspondence techniques to align visual elements with their conceptual meanings. These techniques include color coding, spatial arrangement and specific imagery. In information visualization, different colors can represent various data sets. This correspondence aids users in quickly identifying trends and relationships.

Two pie charts showing user satisfaction. One visualizes data 1 day after release, and the other 1 month after release. The colors are consistent between both charts, but the segment sizes are different.

Color coding enables stakeholders to easily see the relationships and trends between the two pie charts.

In user interface design, correspondence techniques link elements with meaning. One example is color-coding notifications to indicate their nature: for instance, red for warnings and green for confirmations. These techniques are informative and intuitive, and they enhance the user experience.
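The color-coding described above can be sketched as a simple lookup table. This is an illustrative example only, not any specific product's implementation; the type names and hex values are assumptions:

```python
# Illustrative sketch: color-coding notifications by type so that each
# message's color corresponds to its meaning. The type names and hex
# values are assumptions, not any specific product's palette.

NOTIFICATION_COLORS = {
    "error": "#D32F2F",    # red: warnings and failures
    "success": "#388E3C",  # green: confirmations
    "info": "#1976D2",     # blue: neutral information
}

def notification_color(kind: str) -> str:
    """Return the display color for a notification type, falling back
    to the neutral 'info' color for unknown types."""
    return NOTIFICATION_COLORS.get(kind, NOTIFICATION_COLORS["info"])

print(notification_color("error"))    # "#D32F2F" (red for a warning)
print(notification_color("unknown"))  # "#1976D2" (neutral fallback)
```

Keeping the mapping in one place helps enforce the consistency that makes color correspondence legible to users.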

A screenshot of an Interaction Design Foundation course page. It features information about the course and a video. Beneath this is a pop-up asking the user if they want to drop this course.

The IxDF website uses blue for call-to-actions (CTAs) and red for warnings. These colors inform the user of the nature of the action of buttons and other interactive elements.

Perception and Interpretation

If visual language is how designers create representations, then visual perception and interpretation are how users receive those representations. Consider a painting—the viewer’s eyes take in colors, shapes and lines, and the brain perceives these visual elements as a painting.

In this video, Alan Dix explains how the interplay of sensation, perception and culture is crucial to understanding visual experiences in design:

Copyright holder: Michael Murphy. Appearance time: 07:19–07:37. Link: https://www.youtube.com/watch?v=C67JuZnBBDc

Visual perception principles are essential for creating compelling, engaging visual representations. For example, Gestalt principles explain how we perceive visual information. These rules describe how we group similar items, spot patterns and simplify complex images. Designers apply Gestalt principles to arrange content on websites and other interfaces. This application creates visually appealing and easily understood designs.

In this video, design expert and teacher Mia Cinelli discusses the significance of Gestalt principles in visual design. She introduces fundamental principles, like figure/ground relationships, similarity and proximity.

Interpretation

Everyone's experiences, culture and physical abilities dictate how they interpret visual representations. For this reason, designers carefully consider how users interpret their visual representations. They employ user research and testing to ensure their designs are attractive and functional.

A painting of a woman sitting and looking straight at the viewer. Her expression is difficult to read.

Leonardo da Vinci's “Mona Lisa” is one of the most famous paintings in the world. The piece is renowned for its subject's enigmatic expression. Some interpret her smile as content and serene, while others see it as sad or mischievous. Not everyone interprets this visual representation in the same way.

Color is an excellent example of how interpretation can vary from one person to another. Take the color red:

In Chinese culture, red symbolizes luck, while in some parts of Africa, it can mean death or illness.

Personal experience may give a user a positive or negative association with red.

People with protanopia or deuteranopia (forms of red-green color blindness) have difficulty distinguishing red from green.

In this video, Joann and Arielle Eckstut, leading color consultants and authors, explain how many factors influence how we perceive and interpret color:

Learn More about Visual Representation

Read Alan Blackwell’s chapter on visual representation from The Encyclopedia of Human-Computer Interaction.

Learn about the F-Shaped Pattern For Reading Web Content from Jakob Nielsen.

Read Smashing Magazine’s article, Visual Design Language: The Building Blocks Of Design .

Take the IxDF’s course, Perception and Memory in HCI and UX .

Questions related to Visual Representation

Some highly cited research on visual representation and related topics includes:

Roland, P. E., & Gulyás, B. (1994). Visual imagery and visual representation. Trends in Neurosciences, 17(7), 281-287. Roland and Gulyás' study explores how the brain generates visual imagery. They examine whether imagining objects and scenes engages the same brain regions as seeing them. Their research shows that imagery recruits specific areas that differ from those used in perception. This work is essential for understanding how our brain processes vision.

Lurie, N. H., & Mason, C. H. (2007). Visual Representation: Implications for Decision Making. Journal of Marketing, 71(1), 160-177.

This article looks at how visualization tools help in understanding complicated marketing data. It discusses how these tools affect decision-making in marketing. The article gives a detailed method to assess the impact of visuals on the analysis and synthesis of vast quantities of marketing data. It explores the benefits and possible biases visuals can bring to marketing choices. These factors make the article an essential resource for researchers and marketing experts. The article suggests using visual tools and detailed analysis together for the best results.

Lohse, G. L., Biolsi, K., Walker, N., & Rueter, H. H. (1994, December). A classification of visual representations. Communications of the ACM, 37(12), 36+.

This publication looks at how visuals help communicate and make information easier to understand. It divides these visuals into six types: graphs, tables, maps, diagrams, networks and icons. The article also looks at different ways these visuals share information effectively.

If you’d like to cite content from the IxDF website, click the ‘cite this article’ button near the top of your screen.

Some recommended books on visual representation and related topics include:

Chaplin, E. (1994). Sociology and Visual Representation (1st ed.) . Routledge.

Chaplin's book describes how visual art analysis has changed from ancient times to today. It shows how photography, post-modernism and feminism have changed how we see art. The book combines words and images in its analysis and looks into real-life social sciences studies.

Mitchell, W. J. T. (1994). Picture Theory. The University of Chicago Press.

Mitchell's book explores the important role and meaning of pictures in the late twentieth century. It discusses the change from focusing on language to focusing on images in cultural studies. The book deeply examines the interaction between images and text in different cultural forms like literature, art and media. This detailed study of how we see and read visual representations has become an essential reference for scholars and professionals.

Koffka, K. (1935). Principles of Gestalt Psychology. Harcourt, Brace & World.

"Principles of Gestalt Psychology" by Koffka, released in 1935, is a critical book in its field. It's known as a foundational work in Gestalt psychology, laying out the basic ideas of the theory and how they apply to how we see and think. Koffka's thorough study of Gestalt psychology's principles has profoundly influenced how we understand human perception. This book has been a significant reference in later research and writings.

A visual representation, like an infographic or chart, uses visual elements to show information or data. These types of visuals make complicated information easier to understand and more user-friendly.

Designers harness visual representations in design and communication. Infographics and charts, for instance, distill data for easier audience comprehension and retention.

For an introduction to designing basic information visualizations, take our course, Information Visualization .

Text is a crucial design and communication element, transforming language visually. Designers use font style, size, color and layout to convey emotions and messages effectively.

Designers utilize text for both literal communication and aesthetic enhancement. Their typography choices significantly impact design aesthetics, user experience and readability.

Designers should always consider text's visual impact in their designs. This consideration includes font choice, placement, color and interaction with other design elements.

In this video, design expert and teacher Mia Cinelli teaches how Gestalt principles apply to typography:

Designers use visual elements in projects to convey information, ideas and messages, combining images, colors, shapes and typography for impactful designs.

In UI/UX design, visual representation is vital. Icons, buttons and colors provide contrast for intuitive, user-friendly website and app interfaces.

Graphic design leverages visual representation to create attention-grabbing marketing materials. Careful color, imagery and layout choices create an emotional connection.

Product design relies on visual representation for prototyping and idea presentation. Designers and stakeholders use visual representations to envision functional, aesthetically pleasing products.

It is often claimed that our brains process visuals 60,000 times faster than text. Although that exact figure is unsubstantiated, visual information is processed remarkably quickly, which highlights the crucial role of visual representation in design.

Our course, Visual Design: The Ultimate Guide , teaches you how to use visual design elements and principles in your work effectively.

Visual representation, crucial in UX, facilitates interaction, comprehension and emotion. It combines elements like images and typography for better interfaces.

Effective visuals guide users, highlight features and improve navigation. Icons and color schemes communicate functions and set interaction tones.

UX design research shows visual elements significantly impact emotions. A frequently cited (though hard-to-verify) statistic holds that 90% of the information transmitted to the brain is visual.

To create functional, accessible visuals, designers use color contrast and consistent iconography. These elements improve readability and inclusivity.
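Color contrast for accessibility can be checked quantitatively. The sketch below implements the WCAG 2.x relative-luminance and contrast-ratio formulas; the function names are our own:

```python
# Sketch of the WCAG 2.x contrast-ratio calculation, which quantifies
# the readability of a foreground/background color pair. Function names
# are illustrative; the math follows the WCAG relative-luminance formula.

def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear-light value."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb: tuple) -> float:
    r, g, b = (_linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: tuple, bg: tuple) -> float:
    """Contrast ratio between two colors, from 1:1 (identical) to 21:1."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

# Black on white gives the maximum possible contrast, 21:1;
# WCAG AA requires at least 4.5:1 for normal-size text.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # 21.0
```

A check like this can be automated in a design system to flag text/background pairs that fall below the accessibility threshold.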

An excellent example of visual representation in UX is Apple's iOS interface. iOS combines a clean, minimalist design with intuitive navigation. As a result, the operating system is both visually appealing and user-friendly.

Michal Malewicz, Creative Director and CEO at Hype4, explains why visual skills are important in design:

Learn more about UI design from Michal in our Master Class, Beyond Interfaces: The UI Design Skills You Need to Know .

The fundamental principles of effective visual representation are:

Clarity: Convey messages clearly, avoiding clutter.

Simplicity: Embrace simple designs for ease and recall.

Emphasis: Highlight key elements distinctively.

Balance: Ensure design stability and structure.

Alignment: Enhance coherence through alignment.

Contrast: Use contrast for dynamic, distinct designs.

Repetition: Repeat elements to unify and guide designs.

Designers practice these principles in their projects. They also analyze successful designs and seek feedback to improve their skills.

Read our topic description of Gestalt principles to learn more about creating effective visual designs. The Gestalt principles explain how humans group elements, recognize patterns and simplify object perception.

Color theory is vital in design, helping designers craft visually appealing and compelling works. Designers understand color interactions, psychological impacts and symbolism. These elements help designers enhance communication and guide attention.

Designers use complementary, analogous and triadic colors for contrast, harmony and balance. Understanding color temperature also plays a crucial role in design perception.

Color symbolism is crucial, as different colors can represent specific emotions and messages. For instance, blue can symbolize trust and calmness, while red can indicate energy and urgency.

Cultural variations significantly influence color perception and symbolism. Designers consider these differences to ensure their designs resonate with diverse audiences.

For actionable insights, designers should:

Experiment with color schemes for effective messaging. 

Assess colors' psychological impact on the audience. 

Use color contrast to highlight critical elements. 

Ensure color choices are accessible to all.
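The complementary and triadic schemes mentioned above can be derived mechanically by rotating a color's hue around the color wheel. A minimal sketch using Python's standard colorsys module (the function name is our own illustration):

```python
# Sketch of deriving complementary and triadic colors by rotating a
# color's hue around the color wheel: 180 degrees for the complement,
# +/-120 degrees for a triad. Uses the standard colorsys module.
import colorsys

def rotate_hue(rgb: tuple, degrees: float) -> tuple:
    """Rotate an RGB color's hue by the given number of degrees."""
    h, l, s = colorsys.rgb_to_hls(*(c / 255 for c in rgb))
    h = (h + degrees / 360) % 1.0
    return tuple(round(c * 255) for c in colorsys.hls_to_rgb(h, l, s))

base = (255, 0, 0)                                      # pure red
complement = rotate_hue(base, 180)                      # cyan, opposite on the wheel
triad = [rotate_hue(base, 120), rotate_hue(base, 240)]  # green and blue

print(complement)  # (0, 255, 255)
print(triad)       # [(0, 255, 0), (0, 0, 255)]
```

Rotating hue while keeping lightness and saturation fixed is one simple way to generate harmonious palettes programmatically; real palette tools typically add perceptual adjustments on top.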

In this video, Joann and Arielle Eckstut, leading color consultants and authors, give their six tips for choosing color:

Learn more about color from Joann and Arielle in our Master Class, How To Use Color Theory To Enhance Your Designs .

Typography and font choice are crucial in design, impacting readability and mood. Designers utilize them for effective communication and expression.

Users' perception of information varies with font type. Serif fonts can imply formality, while sans-serifs can give a more modern look.

Typography choices by designers influence readability and user experience. Well-spaced, distinct fonts enhance readability, whereas decorative fonts may hinder it.

Designers use typography to evoke emotions and set a design's tone. Choices in font size, style and color affect the emotional impact and message clarity.

Designers use typography to direct attention, create hierarchy and establish rhythm. These benefits help with brand recognition and consistency across mediums.

Read our article to learn how web fonts are critical to the online user experience .

Designers create a balance between simplicity and complexity in their work. They focus on the main messages and highlight important parts. Designers use the principles of visual hierarchy, like size, color and spacing. They also use empty space to make their designs clear and understandable.

The Gestalt law of Prägnanz suggests people naturally simplify complex images. This principle aids in making even intricate information accessible and engaging.

Through iteration and feedback, designers refine visuals. They remove extraneous elements and highlight vital information. Testing with the target audience ensures the design resonates and is comprehensible.

Michal Malewicz explains how to master hierarchy in UI design using the Gestalt rule of proximity:


Literature on Visual Representation

Here’s the entire UX literature on Visual Representation by the Interaction Design Foundation, collated in one place:

Learn more about Visual Representation

Take a deep dive into Visual Representation with our course Perception and Memory in HCI and UX .

How does all of this fit with interaction design and user experience? The simple answer is that most of our understanding of human experience comes from our own experiences and just being ourselves. That might extend to people like us, but it gives us no real grasp of the whole range of human experience and abilities. By considering more closely how humans perceive and interact with our world, we can gain real insights into what designs will work for a broader audience: those younger or older than us, more or less capable, more or less skilled and so on.

“You can design for all the people some of the time, and some of the people all the time, but you cannot design for all the people all the time.” – William Hudson (with apologies to Abraham Lincoln)

While “design for all of the people all of the time” is an impossible goal, understanding how the human machine operates is essential to getting ever closer. And of course, building solutions for people with a wide range of abilities, including those with accessibility issues, involves knowing how and why some human faculties fail. As our course tutor, Professor Alan Dix, points out, this is not only a moral duty but, in most countries, also a legal obligation.

Portfolio Project

In the “ Build Your Portfolio: Perception and Memory Project ”, you’ll find a series of practical exercises that will give you first-hand experience in applying what we’ll cover. If you want to complete these optional exercises, you’ll create a series of case studies for your portfolio which you can show your future employer or freelance customers.

This in-depth, video-based course is created with the amazing Alan Dix , the co-author of the internationally best-selling textbook  Human-Computer Interaction and a superstar in the field of Human-Computer Interaction . Alan is currently a professor and Director of the Computational Foundry at Swansea University.


All open-source articles on Visual Representation

Data visualization for human perception.

The Key Elements & Principles of Visual Design

Guidelines for Good Visual Information Representations

Philosophy of Interaction

Information visualization – an introduction to multivariate analysis.

Aesthetic Computing

How to represent linear data visually for information visualization.




ORIGINAL RESEARCH article

Differences between spatial and visual mental representations.


  • SFB/TR 8 Spatial Cognition, Universität Bremen, Bremen, Germany

This article investigates the relationship between visual mental representations and spatial mental representations in human visuo-spatial processing. By comparing two common theories of visuo-spatial processing – mental model theory and the theory of mental imagery – we identified two open questions: (1) which representations are modality-specific, and (2) what is the role of the two representations in reasoning. Two experiments examining eye movements and preferences for under-specified problems were conducted to investigate these questions. We found that significant spontaneous eye movements along the processed spatial relations occurred only when a visual mental representation was employed, but not with a spatial mental representation. Furthermore, the preferences for the answers of the under-specified problems differed between the two mental representations. The results challenge assumptions made by mental model theory and the theory of mental imagery.

1. Introduction

Our everyday behavior relies on our ability to process visual and spatial information. Describing the route to work, taking another person’s perspective, or imagining a familiar face or object all depend on our capability to process and reason with visual and spatial information.

Two main theoretical frameworks of visual and spatial knowledge processing have been proposed in cognitive science: mental model theory ( Johnson-Laird, 1989 , 1998 ; Tversky, 1993 ) and mental imagery ( Finke, 1989 ; Kosslyn, 1994 ; Kosslyn et al., 2006 ). Furthermore, there is also the conception of verbal or propositional mental representations ( Rips, 1994 ; Pylyshyn, 2002 ) that employ a sort of logical inference to reason about visual and/or spatial information. However, considerable evidence indicates that analogical mental representations, i.e., mental models or mental images, can better predict and explain the empirical data, in particular for spatial reasoning (e.g., Byrne and Johnson-Laird, 1989 ; Kosslyn, 1994 ; Johnson-Laird, 2001 ).

In line with behavioral and neuroscientific evidence (e.g., Ungerleider and Mishkin, 1982 ; Levine et al., 1985 ; Newcombe et al., 1987 ; Farah et al., 1988 ; Courtney et al., 1996 ; Smith and Jonides, 1997 ; Mellet et al., 2000 ; Knauff and Johnson-Laird, 2002 ; Klauer and Zhao, 2004 ), mental model theory and the theory of mental imagery both propose a distinction between spatial and visual mental representations. The theory of mental imagery proposes spatial mental images and visual mental images; mental model theory proposes (spatial) mental models and visual mental images. Research based on the theories has, however, mostly focused on one of the two representations: the investigation of the properties of visual mental images in the theory of mental imagery and the investigation of reasoning with (spatial) mental models in mental model theory. Consequently, the relationship and interaction between the two types of mental representations is left largely unspecified in both theories. Although initial attempts exist (e.g., Schultheis and Barkowsky, 2011 ) to explain how visual and spatial mental representations interact and relate to each other, empirical data on the issue is largely missing. Accordingly, the primary aim of this article is to examine the differences and the relationship between visual and spatial mental representations. To achieve this, we first review how mental model theory on the one hand and the theory of mental imagery on the other hand understand spatial and visual mental representations as well as how they interpret the relationship between them. Even though there is much theoretical and empirical work on both theories, the literature lacks a systematic comparison of the theories. In the following, we present such a comparison. From this comparison, it will become clear that the theories actually propose very similar conceptions of spatial and visual mental representations but that their foci of investigation are mostly on different aspects and include phenomena not investigated within the respective other theory. We examined these different aspects and used them in our experiments to gain new insights into the open issues of the relationship between visual and spatial mental representations. The results can be applied to complement gaps in the two theories.

2. Theories

2.1. Mental Model Theory

Mental model theory ( Johnson-Laird, 1998 ) postulates that there are three representational levels involved in human thinking: propositional representations, mental models, and mental images. The relationships between these three levels are hierarchical in the sense that their construction depends on each other. The following example helps to illustrate this point. Three-term series problems ( Johnson-Laird, 1972 ) are common experimental tasks in the study of mental models. They contain two premises and one conclusion that has to be validated or inferred based on the premises. Let the two premises be “A is left of B” and “B is right of C” and let the to-be-drawn conclusion be the relationship between A and C. According to mental model theory the premises are first encoded propositionally. From these propositional premises a mental model of the described configuration is constructed. As it is an essential property of mental models that “the structural relations between the parts of the model are analogous to the structural relations in the world” ( Johnson-Laird, 1998 , p. 447), one valid mental model of our example can be depicted in the following way:

C A B

We note that a mental model is a special case of the situation defined by the premises, because it only represents one valid situation with respect to the premises. For our example, another mental model that satisfies the premises is:

A C B

Just as the situation represented by a mental model is a special case of what is described in the premises, mental model theory posits that a mental image is a special case of a given mental model. The mental image that is constructed from a mental model is one specific instance out of many valid instances described by the model, because the image has to specify, for example, the distance between the entities. The underlying mental model is, in contrast, invariant with respect to the distances. Summarizing the hierarchical structure of mental model theory, we note that a mental image is one out of many projections of the visualizable aspects of a mental model, and a mental model is one out of many analogically structured configurations that are valid given the propositionally represented premises. This suggests a clear hierarchy in which it is necessary to have the more general representations in order to construct a more specific one.

Mental models are described to be analogically structured, amodal, and abstract, e.g., they can represent abstract, non-visualizable relations such as “smarter than.” In contrast, mental images can only represent “visualizable” information and are modality-specific to visual perception (e.g., Johnson-Laird, 1998 ; Knauff and Johnson-Laird, 2002 ). It has been suggested that the analogical nature of mental models might be generally spatial ( Knauff et al., 2003 ), i.e., even reasoning with abstract relations like “worse than” or “better than” is handled by a spatio-analogical mental model. This view is supported by the association of mental model reasoning with activation in the parietal lobe (e.g., Goel and Dolan, 2001 ; Knauff et al., 2003 ), which is associated with several processes of spatial cognition (for an overview, see Sack, 2009 ). It was found that the use of “visual” relations, e.g., “dirtier than,” in relational reasoning tasks led to activation in the early visual cortex in contrast to tasks with other (abstract) relations, e.g., “worse than” ( Knauff et al., 2003 ). The study also found that “visual” relations led to longer reaction times and it was concluded that tasks using such “visual” relations induce the employment of visual mental images during the mental-model-based reasoning process.

Most of the literature on mental model theory focuses on how mental models explain reasoning. Johnson-Laird and Byrne (1991) state that reasoning according to the mental model theory consists of three stages: (1) the construction of one mental model (construction phase), (2) the inspection of the mental model (inspection phase), and (3) the variation of the mental model (variation phase). Slightly simplified, the reasoning process works as follows. A first mental model is constructed based on the given premises. This model represents one situation that is valid given the premises. This situation is inspected and can yield a possible conclusion. This conclusion is then verified to be valid in all other possible mental models that can be derived from the premises. If a conclusion is not contradicted in the other valid mental models, the conclusion is confirmed. There is much empirical support for this three-stage process in human reasoning (e.g., Johnson-Laird, 2001 ). One interesting phenomenon in reasoning with mental models is the occurrence of preferred mental models when there are multiple valid conclusions. An example of such multiple valid conclusions is the two configurations “CAB” and “ACB” of the above example. It can be observed that there are reliable within-subject and between-subject preferences for which model is constructed first out of several valid mental models. This first-constructed mental model is termed a preferred mental model. As a consequence, if there are several valid conclusions that can be inferred, there is a preference for one conclusion which corresponds to the preferred mental model. Preferred mental models have been investigated in different domains, but in particular in the domain of spatial reasoning (e.g., Rauh et al., 2005 ; Jahn et al., 2007 ; Schultheis and Barkowsky, 2013 ).
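The set of valid mental models for the example premises can be made concrete with a small enumeration sketch. This is our own illustration; the theory itself does not specify an algorithm:

```python
# Sketch (our own illustration) of enumerating every mental model, i.e.,
# every left-to-right arrangement of A, B, C, that satisfies the example
# premises "A is left of B" and "B is right of C".
from itertools import permutations

def satisfies(order: str) -> bool:
    # "A is left of B", and "C is left of B" (since B is right of C)
    return order.index("A") < order.index("B") and order.index("C") < order.index("B")

valid_models = ["".join(p) for p in permutations("ABC") if satisfies("".join(p))]
print(valid_models)  # ['ACB', 'CAB']
```

The two arrangements found are exactly the configurations “ACB” and “CAB” discussed above; a preferred mental model corresponds to whichever of these a reasoner constructs first.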

2.2. Theory of Mental Imagery

The theory of mental imagery (Kosslyn, 1994; Kosslyn et al., 2006) makes a distinction between spatial mental images and visual mental images. These two mental representations differ in the content they represent and are distinct in their anatomical localization. But they are both assumed to have a (at least partially) spatio-analogical structure. Furthermore, there is also a propositional representation referred to as associative memory, which contains propositional descriptions of the structure of an object or a scene. This information can be used to construct spatial and visual mental images. For the latter, however, one needs to further retrieve encoded shape information from another source, i.e., the object-properties-processing subsystem, which can be thought of as a sort of non-analogical visual memory store located in the temporal lobe.

Spatial mental images (sometimes referred to as object maps) are located in the spatial-properties-processing subsystem in the framework of Kosslyn (1994). They contain information about the location, size, and orientation of entities. The spatial-properties-processing subsystem is (at least partially) placed in the parietal lobe. Given that areas of the parietal lobe are topographically organized (Sereno et al., 2001), it is assumed that spatial mental images are also at least partially spatio-analogical (Kosslyn et al., 2006).

Visual mental images are constructed and processed in a structure called the visual buffer. The visual buffer consists of the topographically organized areas of the visual cortex. Visual mental images are thus assumed to be spatio-analogical or “depictive,” i.e., the metrics of what is represented, e.g., a shape, are reflected in the metrics of the representation. Visual mental images represent shape information, as well as, for example, color and depth.

A difference between spatial and visual mental images is that spatial mental images contain more information, in the sense that the current visual mental image in the visual buffer only contains a “visualized” part of what is represented in the spatial mental image (Kosslyn et al., 2006, p. 138). A visual mental image is a specification of a part of a spatial mental image.

Four types of functions are proposed for visual and spatial mental images: generation, inspection, maintenance, and manipulation. The generation of a mental image can either be just the retrieval of a spatial configuration of entities as a spatial mental image, if no visual information is necessary for a given task, or it can furthermore include the retrieval of shape information to generate a visual mental image in the visual buffer. Note that the visual buffer does not need to be employed for spatial mental images. Kosslyn et al. (2006) state that the processing of spatial and visual mental images occurs in parallel, i.e., the image of a shape is generated while a spatial image is generated. They furthermore state that this parallel processing might not always be useful, as the proper construction of a shape requires information about its spatial properties, i.e., location, size, and orientation, which are provided by a respective spatial mental image (Kosslyn et al., 2006, p. 143). For the generation of multi-part visual mental images, a corresponding spatial mental image is necessary to guide the placement of shapes in the visual buffer by specifying their location, orientation, and size. The inspection process can make previously implicit information in a visual or spatial image explicit, i.e., new information is inferred. Visual mental images are inspected by shifting an attention window over the visual buffer. Through this inspection, visual information, e.g., properties of a shape, as well as spatial information, e.g., spatial relations, can be inferred. It is also possible that new information is inferred from only a spatial mental image. However, no detailed information on the inspection of, or inference in, spatial mental images is provided by the theory. The function of image maintenance is used to re-construct parts of mental images as the information fades over time.
The function of image manipulation allows the imagination of transformations of mental images. The theory posits that such manipulations are realized by altering the object map, i.e., the spatial mental image, underlying the visual mental image. One would, for example, change the location or size of an entity in the spatial mental image to alter the visual mental image that contains the shape information of that entity.

One interesting phenomenon of mental imagery is the observation of spontaneous eye movements during different visual mental imagery tasks. Brandt and Stark (1997) had participants imagine a previously memorized grid pattern and found that the eye movements during imagination reflected the content of the original stimuli. Spontaneous eye movements that reflect the processed spatial relations during mental imagery have since been found, for example, during imagination of natural scenes (Holsanova et al., 1998), during imagination of detailed paintings and detailed descriptions of scenes while facing a white board as well as in total darkness (Johansson et al., 2006), during reasoning with “visual” syllogisms, e.g., “a jar of pickles is below a box of tea bags” (Demarais and Cohen, 1998), and while listening to verbal descriptions of spatial scenes, e.g., “at the bottom there is a doorman in blue” (Spivey and Geng, 2001). Johansson et al. (2012) report a series of experiments in which participants were selectively prevented from moving their eyes during mental imagery. They found that the suppression of eye movements affects both the quantity and the quality of mental imagery. Their results strongly indicate a functional role of eye movements during mental imagery.

2.3. Open Questions

The previous two sections are summarized in Table 1, which provides a comparative overview of the two theories. The comparison reveals a great overlap in the assumptions made and the structures and processes proposed by the two theories. Many aspects of the two theories are rather similar, perhaps more similar than one would have expected. In particular, they provide very similar descriptions of a spatial and a visual mental representation with respect to information content, localization, and the hierarchical structure between the two representations. There are, however, some diverging predictions with respect to the modality of these representations and their role in reasoning. In the following, we discuss these differences and identify two main questions that arise from the comparison of the two theories.


Table 1. Comparison of mental model theory and the theory of mental imagery.

The theory of mental imagery states that spatial mental images are processed in a component called the spatial-properties-processing subsystem. This subsystem is explicitly linked to the dorsal processing stream, which processes spatial information during visual perception (Kosslyn et al., 2006, p. 138). Processing of spatial mental images thus uses (at least partly) the same processes as the processing of spatial information in visual perception. Mental models, on the other hand, are commonly assumed to be amodal or multi-modal (e.g., Johnson-Laird and Byrne, 1991). Accordingly, mental models are assumed to be used to also reason about abstract, non-spatial information, e.g., “A is better than B” (Knauff et al., 2003), whereas spatial mental images are assumed to process only spatial information. It has, however, been assumed that abstract information, e.g., “better than,” can be translated into spatial information in mental models (Knauff et al., 2003). This opinion is seemingly shared by Kosslyn (1994), who states that information like “A is smarter than B” can be represented by dots on a line in a spatial mental image, which would then correspond to a mental model in the sense of Johnson-Laird (Kosslyn, 1994, p. 324). The question that remains is whether the spatial representation, described as a mental model or a spatial mental image, is actually amodal/multi-modal (as claimed by mental model theory) or linked to the modality of visual perception (as seemingly proposed by the theory of mental imagery). Results pointing either way would help refine the theories.

Another open issue is the theories’ seemingly different predictions on the role of the spatial mental representation in reasoning. Unfortunately, both theories remain vague regarding the details of how spatial and visual representations interact during reasoning. In mental model theory it is often explicitly stated that it is mental models, and not mental images, that underlie human reasoning (Knauff and Johnson-Laird, 2002; Knauff et al., 2003). The automatic generation of mental images through “visual” relations, e.g., “the dog is dirtier than the cat,” is even considered to impede the reasoning process that happens on the level of mental models (Knauff and Johnson-Laird, 2002). Of course, mental images can be important for reasoning if certain visual information is necessary, but it is not described how such visual information would be interpreted, nor how it would be transferred into the mental model for further reasoning. In the theory of mental imagery, it is made clear that visual mental images play a major role in reasoning: “[I]magery plays a critical role in many types of reasoning” (Kosslyn, 1994, p. 404). In contrast to mental model theory, visual mental images are assumed to be much more than just the provider of visual information for spatial mental images, in general and particularly in reasoning (Kosslyn, 1994; Kosslyn et al., 2006). The inspection of visual mental images constructed in the visual buffer can lead to new insights and is thus directly involved in the reasoning process. According to Kosslyn et al. (2006), a visual mental image is generated using an underlying spatial mental image. However, the concrete role of the spatial mental image in the reasoning process is never elaborated in a way that would suggest that it is of specific importance to reasoning, or even that it might be the actual reasoning component (as proposed in mental model theory).

In summary, we identified two main open issues regarding the differences between spatial mental representations and visual mental representations: (1) whether the spatial mental representation is amodal/multi-modal or whether it is directly linked to visual perception like the visual mental representation; (2) to what extent the two mental representations are involved in reasoning, i.e., whether the spatial mental representation is the primary reasoning component or not.

3. Experiments

The comparison of the two theories furthermore showed that there are phenomena which have mostly been investigated only within the framework of one of the two theories. Preferences in under-specified problems have so far only been investigated within the framework of mental model theory, while eye movements have been a focus of investigation almost exclusively in connection with mental images. In the presented experiments, we investigated to what extent these two phenomena transfer to the respective other type of mental representation. That is, we checked for spontaneous eye movements during reasoning with a spatial mental representation, i.e., a (spatial) mental model, and we checked for possible preferences when employing a visual mental representation, i.e., a visual mental image. In the following, we describe how the investigation of these phenomena informs us about the open questions stated in Section 2.3.

The tasks used in the experiments are three-term series relational reasoning problems about orientation knowledge. The two experiments differed only in their instructions, which were formulated so that they induced the employment of a spatial mental representation in the first experiment and a visual mental representation in the second experiment.

We expected to confirm the findings from the literature that systematic eye movements occur during the second experiment (employing a visual mental representation) and that there are significant preferences in the answers of the participants in the first experiment (employing a spatial mental representation). The apparent functional role of eye movements during visual mental imagery provides strong evidence that visual mental representations are linked to processes of visual perception. These spontaneous eye movements reflect the spatial relations of the processed information. Both mental model theory and the theory of mental imagery assume spatial relations to be represented by a spatial mental representation, which supports the construction of a visual mental representation by providing the required spatial information. We tested whether such eye movements along the processed spatial relations would occur during employment of only a spatial mental representation, i.e., without the representation of visual content. The investigation of eye movements in this context can inform us about the question of the modality of spatial mental representations: if systematic eye movements occur during reasoning with spatial mental representations, this would be a strong indication that mental models are not amodal but are, in fact, linked to attentional processes of visual perception. A lack of systematic eye movements during reasoning with spatial mental representations, on the other hand, would support the assumption of mental model theory that mental models are amodal. More specifically, this would indicate that the processes of spatial mental representations do not employ the overt attentional processes of visual perception, as is the case for visual mental representations.

Preferred mental models are preferences for certain answers to under-specified reasoning problems that have been found for reasoning with mental models. These preferences are assumed to emerge because participants first construct one, perhaps the most parsimonious, mental model out of several valid models (e.g., Rauh et al., 2005). Visual mental images are also assumed to “depict” just one situation at a time; in fact, it is hard to imagine how a “depictive” representation could represent more than one situation simultaneously. There are three possible outcomes for our investigation of such preferences in reasoning with visual mental representations: (1) we find no significant preferred answers, (2) we find different preferences for the two mental representations, or (3) we find the same preferences in reasoning with both mental representations. Finding no significant preferences in the answers when a visual mental representation is employed would strongly indicate that the assumption that visual mental representations build upon corresponding spatial mental representations is incorrect. Furthermore, this would indicate that the construction of visual mental representations can be subject to very strong individual differences. Such a finding seems unlikely and would not be predicted by either of the two theories. Should we find the same preferences in both experiments, i.e., for reasoning with both a spatial and a visual mental representation, the assumption of a hierarchical relationship between the two mental representations would be supported. This would strongly suggest that the spatial configuration of a visual mental representation is indeed taken from an underlying spatial mental representation. Should we find different preferences for the two mental representations, refinements of both mental model theory and the theory of mental imagery would be required to explain this disparity.
In particular, such a finding would challenge the two theories to elaborate on their assumption that the construction of visual mental representations depends on an underlying spatial mental representation. Additionally, the strong claim made by mental model theory that reasoning is realized by spatial mental representations and not visual mental representations would, without additional hypotheses, be contradicted by this result.

In the following, the materials and methods employed in both conducted experiments are described.

3.1. Materials and Apparatus

The tasks used in the experiments are under-specified three-term series problems about orientation knowledge, specifically cardinal directions. We chose these problems because problems of this type, i.e., three-term series relational spatial reasoning, have been used in several studies of mental model theory before (e.g., Byrne and Johnson-Laird, 1989; Knauff et al., 2003; Schultheis et al., in revision). We use an eight-sector model of cardinal directions, i.e., the eight directions are north, north-east, east, south-east, south, south-west, west, and north-west. The problems are of the following form:

Premise 1: A is [direction 1] of B, e.g., A is north of B

Premise 2: B is [direction 2] of C, e.g., B is east of C

Conclusion: As seen from A, where is C?

The premises provide two spatial relations between three entities and the third spatial relation has to be inferred. In general, these problems are under-specified, i.e., there can be more than one correct conclusion given the premises. We used two classes of these problems, which we term 45° problems and 90° problems. These problems can be visualized as triangles with one of the three edges missing. This missing edge corresponds to the to-be-inferred spatial relation. We used all possible combinations in which the two given edges form either a 45° or a 90° angle. Figure 1 depicts an overview of all these problems.
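To make the structure of these problems concrete, the premises can be read as displacement vectors between the entities. The following sketch is our own illustration (not part of the original study); it composes the two premises under an equal-distance assumption, i.e., it computes the answer that corresponds to an equal-distance model:

```python
import math

# Unit displacement vectors for the eight-sector model (x = east, y = north).
DIRS = {
    "north": (0, 1), "north-east": (1, 1), "east": (1, 0),
    "south-east": (1, -1), "south": (0, -1), "south-west": (-1, -1),
    "west": (-1, 0), "north-west": (-1, 1),
}
# Sector names ordered by angle, starting at east (0°), counterclockwise.
SECTORS = ["east", "north-east", "north", "north-west",
           "west", "south-west", "south", "south-east"]

def infer_direction(a_rel_b: str, b_rel_c: str) -> str:
    """Given 'A is [a_rel_b] of B' and 'B is [b_rel_c] of C', infer the
    direction of C as seen from A, assuming equal distances between
    the entities (i.e., an equal-distance model)."""
    ax, ay = DIRS[a_rel_b]                           # A = B + a_rel_b
    cx, cy = -DIRS[b_rel_c][0], -DIRS[b_rel_c][1]    # C = B - b_rel_c
    vx, vy = cx - ax, cy - ay                        # vector from A to C
    angle = math.degrees(math.atan2(vy, vx)) % 360
    return SECTORS[round(angle / 45) % 8]            # snap to nearest sector
```

For the example problem above (“A is north of B; B is east of C”), this sketch yields “south-west,” i.e., as seen from A, C lies to the south-west.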


Figure 1. The 16 different types of problems used in the experiments. The upper eight are 45° problems and the lower eight are 90° problems.

We can identify all possible correct solutions for the two problem sets. The 90° problems have three possible solutions and the 45° problems have four possible solutions. The different configurations leading to these solutions are depicted in Figure 2. To distinguish the different solutions, we classify the underlying mental representations based on a visualization of the solution as triangles. In this context we use the term “model” to describe the underlying mental configuration, whether it is a spatial mental representation or a visual mental representation. Models with very different distances for the given spatial relations are termed distorted models (DM); models with roughly equal distances for the given relations are termed equal-distance models (EDM). The remaining valid solution for the 45° problems, the third solution in Figure 2, always leads to one of the four main cardinal directions being inferred and is therefore termed the cardinal model (CM).


Figure 2. Possible valid models for a 45° problem are depicted as 1, 2, 3, and 4. Possible valid models for a 90° problem are depicted as 5, 6, and 7. Models 1, 4, 5, and 7 are termed distorted models (DM) because the distances between the entities differ considerably. Models 2 and 6 have equal distances and are termed equal-distance models (EDM). Model 3 is termed the cardinal model (CM) because the to-be-inferred relation corresponds to one of the main cardinal directions, i.e., north, east, south, or west.

There are 16 different possible problems. We used each of them twice, with different letters, resulting in a total of 32 problems. The 16 different problems consist of eight 45° and eight 90° problems, as depicted in Figure 1.

Participants wore a head-mounted SensoMotoric Instruments (SMI) iView X HED eye tracker with a 200 Hz sampling rate to record their eye movements. To prevent expectancy effects, participants were told that the experiment investigated the size of their pupils. A post-experimental questionnaire verified that participants were not aware of the eye tracking.

3.2. Procedure

3.2.1. Instructions

The two experiments used slightly different instructions, so that each conformed with the usual instructions of studies on mental models and of studies on visual mental images, respectively. At the same time, the minimal change between the experiments helped to keep the tasks as similar as possible and to minimize any differences besides the induced mental representation.

The instructions of the first experiment did not contain any suggestions to use visualization or visual information, but simply asked participants to infer the missing relation as fast and as accurately as possible. In line with previous experimental studies, we assume the employment of mental models, i.e., a spatial mental representation, based on the fact that no visual information is required, given, or asked for in the task (e.g., Johnson-Laird, 2001; Knauff and Johnson-Laird, 2002; Jahn et al., 2007).

The instructions of the second experiment differed only slightly from those of the first one. The participants were told that the letters represent cities that are to be imagined as little red squares with the respective letter next to them, all placed on a map. This slight variation made the instructions conform with those of several other visual mental imagery studies, i.e., using phrases such as “imagine […]” or “try to mentally see […]” (e.g., Kosslyn et al., 1983; Chambers and Reisberg, 1985; Borst et al., 2006).

In both experiments participants were asked to work as accurately and as fast as possible.

3.2.2. Setup

Participants were seated facing a blank white wall at a distance of approximately 1 m. Their hands were placed on their legs under a table, holding a computer mouse in one hand and a small ball in the other. This was to prevent participants from using their fingers as an aid to solve the tasks. The eye tracker was mounted on the participant’s head and calibrated. All initial instructions of the experiment were projected on the white wall.

3.2.3. Learning phase

The experiment started with a learning phase to familiarize the participants with the cardinal directions. The learning phase consisted of acoustically presented statements and an answer screen with a question. Each statement was of the form “K is [direction] of U.” After 4 s the answer screen appeared, which depicted the reference entity U surrounded by the numbers 1 to 8 in a counterclockwise circular order together with the question “As seen from U, where is K?” The eight numbers represented the eight cardinal directions (1 = north, 2 = north-west, 3 = west, … 8 = north-east). Participants answered by naming the respective number. In case of an incorrect answer, the correct answer was projected on the wall. The learning phase ended as soon as each of the eight cardinal directions was recognized correctly twice in a row.

3.2.4. Problem trials

Participants were presented with a total of 48 trials. Of these, the first four were pre-trials intended to familiarize the participants with the form and procedure of the problems. Of the remaining 44 trials, 12 were designed as filler trials. These filler trials differed in the order in which the entities were presented: AB, AC, BC, e.g., “A is north of B; A is west of C; B is? of C,” in contrast to the order of the remaining 32 problem trials: AB, BC, CA, e.g., “A is north of B; B is east of C; C is? of A.” The filler trials served a double purpose. First, they were meant to prevent memory effects due to the identical order of all problem trials. Second, filler trials were employed to identify those time intervals in which participants show eye movements along the given directions. We elaborate on this method in Section 3.3. After the presentation of the four pre-trials, the remaining 44 trials were presented in randomized order.

3.2.5. Presentation

All premises and questions were presented acoustically. There was no projection on the white wall during the premises; after the conclusion phase an answer screen was projected onto the wall. Participants used the mouse to trigger the acoustic presentation of the first premise in each trial. As soon as they understood the statement, they clicked again for the presentation of the second premise. Similarly, they triggered the acoustic presentation of the question after having understood the second premise. Only after they had found an answer did participants click the mouse again, making the answer screen appear. The answer screen was the same as the one used in the learning phase. Participants gave their answer verbally by naming the number associated with the resulting direction. Participants continued to the next trial by clicking the mouse again.

The participants took between 35 and 50 min to complete the experiment.

3.3. Processing of the Eye Tracking Data

We processed the eye tracking data to identify whether eye movements occurred along the spatial relations given in each trial. We employed the same method for both experiments.

The raw eye tracking data collected by the iView X software was first converted using the IDF Event Detector to generate a list of fixations made by the participant. Saccades were calculated automatically from the sequence and coordinates of the participant’s fixations. Using the starting and ending coordinates of each saccade, we classified them into one of eight categories corresponding to the eight cardinal directions used in the trials. All possible angles of a saccade, interpreted as a vector in a Cartesian plane, were uniformly mapped to the set of cardinal directions. Each direction corresponds to a range of angles on the degree circle with each direction taking up (360°/8) = 45°. For example, north corresponded to all angles in the range of 0° ± (45°/2) = 0° ± 22.5° = [337.5°; 22.5°]. Note that the eye movements classified in this way are relative eye movements, i.e., the absolute coordinates do not matter. This is reasonable considering that participants moved their head during trials and that arbitrary eye movements occurred in between. Given this classification, we were able to investigate a possible coupling between the given direction and observed eye movements during a trial. If eye movements are linked to the processing of spatial relations, we expected eye movements to occur not only along the given direction, but also along the opposite one. Assuming a mental representation of, for example, A being north of B, it is plausible to not only expect attention shifts from A to B but also from B to A during inspection as well as construction of the representation. Thus, we always compared the absolute number of observed saccades to the sum of saccades made along the given and the opposite direction. For the first premise, we used the given direction, e.g., for the premise A is north of B we looked for saccades along the north-south axis. 
For the second premise, we used the direction given in the first premise as well as the new direction given in the second premise, e.g., for B is west of C, we looked for north-south (from premise 1) and for east-west. For the conclusion phase, we used the direction (and its opposite) that was given as the answer by the participant. We applied a binomial test with a probability of 1/4 to test, for each participant, whether the proportion of saccades along the two expected directions was above chance for the first premise and the conclusion. For the second premise, we applied a binomial test with a probability of 1/2 to test whether the proportion of saccades along the four expected directions (two directions from each relation of the two premises) was above chance. For each phase we then applied a binomial test with a probability of 0.05 to check whether the number of participants showing significant eye movements was significantly above chance. The probability of 0.05 corresponds to how often a false positive of the previous binomial test is expected.
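As a rough illustration of this classification step, the following Python sketch (our own illustration; the function names and the stdlib-only binomial tail are assumptions, not the authors' actual analysis code) maps a saccade vector onto one of the eight 45° sectors and tests whether saccades along a given axis exceed the chance level of 1/4:

```python
from math import atan2, comb, degrees

COMPASS = ["north", "north-east", "east", "south-east",
           "south", "south-west", "west", "north-west"]

def classify_saccade(x0, y0, x1, y1):
    """Map a saccade (start/end fixation coordinates) onto one of the
    eight cardinal directions; each sector covers 45°, with north
    covering 0° ± 22.5°. Screen y grows downward, hence y0 - y1."""
    angle = degrees(atan2(x1 - x0, y0 - y1)) % 360  # 0° = north, clockwise
    return COMPASS[round(angle / 45) % 8]

def binom_tail(k, n, p):
    """One-sided binomial test: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def axis_above_chance(n_on_axis, n_total, alpha=0.05):
    """Are saccades along the given direction and its opposite
    (2 of 8 sectors, chance p = 1/4) over-represented?"""
    return binom_tail(n_on_axis, n_total, 0.25) < alpha
```

Because only relative eye movements matter, the classification depends solely on the saccade vector, not on absolute screen coordinates, mirroring the procedure described above.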

No prior information was available on when, during the processing of the premises or the conclusion, eye movements were to be expected. It is likely that participants spent some time understanding and verbally processing the presented premise or question before they started constructing the mental representation. Similarly, participants required some time to prepare the action of clicking the mouse to trigger the next step after they finished processing the respective premise or question. We therefore used the data obtained during the first premises of the filler trials to gather information on when exactly participants started showing eye movements and whether we could find a temporal pattern. We only looked at eye movements during the first premise, because the filler trials are identical to the problem trials for the first premise. The difference in the order of the presented letters only became evident with the second premise. Therefore, we assumed the same behavior in the first premises of both the problem and the filler trials. We looked at the time interval between the first mention of the direction in the first premise and the time participants clicked to initiate the second premise. This interval was divided into ten equally long time slots. For each of these ten slots we summed up the eye movements of all participants for each experiment. We checked whether eye movements along the expected directions, i.e., those given in the respective premise (and its opposite), were significantly above chance in each of these intervals. We applied a binomial test using a probability of 1/4 for each of the four pairs of cardinal directions, e.g., north/south compared to east/west, north-east/south-west, and north-west/south-east. We applied this method independently for both experiments and used the identified time slots for the eye movement analysis of the problem trials.
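The slotting of the interval can be sketched as follows (a hypothetical helper of our own, assuming timestamps in a common unit; the per-slot counts would then be tested against chance as described above):

```python
def time_slot(t, t_start, t_end, n_slots=10):
    """Assign timestamp t (t_start <= t < t_end) to one of n_slots
    equally long slots covering the interval."""
    frac = (t - t_start) / (t_end - t_start)
    return min(int(frac * n_slots), n_slots - 1)

def slot_counts(saccade_times, t_start, t_end, n_slots=10):
    """Count saccades per time slot, pooled over participants."""
    counts = [0] * n_slots
    for t in saccade_times:
        counts[time_slot(t, t_start, t_end, n_slots)] += 1
    return counts
```

Dividing per trial rather than using fixed absolute windows accommodates the self-paced presentation, since the interval length varies from trial to trial.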

3.4. Ethics Statement

The study was conducted within the Collaborative Research Center Spatial Cognition SFB/TR 8 funded by the German Research Foundation (DFG). The DFG’s board of ethics passed the research proposal that underlies the present study. DFG-funded projects do not require additional approval by other ethics committees. The studies are in full agreement with the ethical guidelines of the German Psychological Society (DGPs). Written informed consent was acquired from all participants.

4.1. Experiment 1: Spatial Mental Representation

4.1.1. Participants

Thirty undergraduate students of the University of Bremen, 12 male and 18 female, volunteered to take part in the experiment for monetary compensation.

Out of the 30 participants, one aborted the experiment and four were discarded due to an error rate of more than 30% incorrectly answered trials. The remaining 25 participants comprised 11 males and 14 females. The 0.05 level of significance was used for all statistical tests in both experiments.

4.1.2. Preferences

For the analysis of the preferences we discarded those trials for which the participants gave no or incorrect answers (12% of all trials). We compared the answers of all participants for all remaining trials to identify possible preferences. We differentiated between 90° and 45° problems and assumed that the given answers indicate the employment of the corresponding model. If no preferences existed, one would expect to observe distorted models and equal-distance models in 66% and 33% of all 90° problem trials, respectively. Likewise, distorted models, equal-distance models, and cardinal models should occur in 50%, 25%, and 25% of all 45° problem trials, respectively. To check for the existence of preferences, we compared the observed model percentages to these hypothetical ones. Figure 3 shows the resulting preferences for both problem types. There is a clear preference for the equal-distance model in the 90° problems. The answer corresponding to this model was given in 88.34% of all trials (t(24) = 17.233; p < 0.001). The distorted models were employed significantly less than expected by chance, at 11.66% (t(24) = −17.233; p < 0.001). We found a significant preference for the equal-distance model in the 45° problems at 62.88% (t(24) = 5.352; p < 0.001), whereas the cardinal model, at 23.46%, did not differ significantly from the expected value (t(24) = −0.215; p > 0.8). The distorted models were used significantly less than expected by chance, at 13.66% (t(24) = −9.995; p < 0.001).
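The comparison against the hypothetical chance percentages amounts to a one-sample t-test over per-participant model proportions. A minimal stdlib-only sketch (our illustration, not the authors' analysis script; with 25 participants the test has 24 degrees of freedom, matching the reported t(24) values):

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(proportions, chance):
    """t statistic for testing whether per-participant model
    proportions differ from the chance level (df = n - 1)."""
    n = len(proportions)
    return (mean(proportions) - chance) / (stdev(proportions) / sqrt(n))
```

For the 90° problems, for example, each participant's proportion of equal-distance-model answers would be tested against the chance level of 1/3.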

Figure 3. Preferences in the first experiment. The vertical axis represents the frequency of the given answer. Top: 90° problems; bottom: 45° problems. Error bars show the standard error of the mean. EDM, equal-distance model; CM, cardinal model; DM, distorted models.
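The preference test described above compares, per participant, the frequency of each answer model against its chance level with a one-sample t-test. A minimal sketch of that comparison, using invented per-participant frequencies rather than the original data (scipy assumed):

```python
import numpy as np
from scipy import stats

# Hypothetical per-participant frequencies with which the equal-distance
# model (EDM) answer was chosen for the 45-degree problems. These numbers
# are invented for illustration only, not the reported data.
edm_freq = np.array([0.70, 0.55, 0.62, 0.68, 0.58, 0.66, 0.61, 0.64])

# Chance level for the EDM in 45-degree problems: 25% (one of four
# hypothetical answer proportions: 50% distorted, 25% EDM, 25% cardinal).
chance = 0.25

# One-sample t-test of observed frequencies against the chance level,
# mirroring the preference analysis described above.
t, p = stats.ttest_1samp(edm_freq, popmean=chance)
print(f"t({len(edm_freq) - 1}) = {t:.3f}, p = {p:.4f}")
```

A positive t with a small p, as here, would indicate a preference above chance; the same test with a negative t detects models used significantly less often than expected.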

4.1.3. Eye movements

Table 2 shows the time slots identified by analyzing the eye movements during the filler trials. We used the last six of the ten time slots for our analysis of the eye movements during the actual problem trials. We decided to use all six slots, although two of them did not show significant eye movements in the filler trials, because it is plausible that processing was not interrupted in between but ran continuously once participants had understood the premise. Table 3 shows that the number of participants showing eye movements along the given directions is significant neither for the first nor for the second premise (all p > 0.35), but is significant during the conclusion phase (p < 0.05).

Table 2. Analysis of eye-tracking data from the first premise of all filler trials.

Table 3. The number of participants showing significant eye movements along the given directions.

The left parts of Figures 4 and 5 show diagrams of the recorded eye movements during all first premises of the form A is west of B and A is north-west of B, respectively. It is evident that the percentages of saccades along the given direction and the opposing direction are not above chance level, i.e., 12.5%, for either type of premise.

Figure 4. Distribution of eye movements during first premises of the form “A is west of B.” Amplitude represents the percentage of saccades mapped onto the respective cardinal direction.

Figure 5. Distribution of eye movements during first premises of the form “A is north-west of B.” Amplitude represents the percentage of saccades mapped onto the respective cardinal direction.
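The analysis underlying these figures maps each saccade onto one of eight cardinal and intercardinal directions, which is why the chance level for any single direction is 1/8 = 12.5%. A minimal sketch of such a mapping (an illustrative reconstruction, not the authors' analysis code):

```python
import math

# The eight directions, counter-clockwise from east, each covering a
# 45-degree sector centered on it.
DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]

def cardinal_direction(dx, dy):
    """Map a saccade vector (dx, dy; y pointing up) onto the nearest of
    the 8 cardinal/intercardinal directions."""
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    index = int(((angle + 22.5) % 360.0) // 45.0)
    return DIRECTIONS[index]

# Example saccade vectors in screen coordinates.
saccades = [(10, 0), (-5, -5), (0, 8)]
print([cardinal_direction(dx, dy) for dx, dy in saccades])  # ['E', 'SW', 'N']
```

Counting, per participant, the fraction of saccades mapped onto the direction given in the premise and comparing it to 12.5% yields the amplitudes shown in the figures.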

4.2. Experiment 2: Visual Mental Representation

4.2.1. Participants

Thirty-one undergraduate students of the University of Bremen, 15 male and 16 female, participated in the study for monetary compensation.

Eight of the 31 participants were excluded because they answered more than 30% of the trials incorrectly. The remaining 23 participants comprised 12 males and 11 females.

4.2.2. Preferences

Preferences were analyzed in the same way as in Experiment 1. For this analysis, we discarded those trials in which participants gave no answer or an incorrect one (9% of all trials). Figure 6 shows the preferences for both problem types. For the 90° problems, the equal-distance model was used in 93.2% of all trials, a significant preference (t(22) = 29.350; p < 0.001); consequently, the distorted models were employed significantly below chance at 6.8% (t(22) = −29.350; p < 0.001). For the 45° problems, we found a significant preference for the equal-distance model at 46.32% (t(22) = 2.512; p < 0.05) as well as for the cardinal model at 47.9% (t(22) = 2.683; p < 0.05). The distorted models, at 5.78%, were used significantly less often than expected (t(22) = −25.360; p < 0.001).

Figure 6. Preferences in the second experiment. The vertical axis represents the frequency of the given answer. Top: 90° problems; bottom: 45° problems. Error bars show the standard error of the mean. EDM, equal-distance model; CM, cardinal model; DM, distorted models.

4.2.3. Eye movements

Table 2 shows the time slots during which participants showed significant eye movements during the filler trials. Based on this, we used the last seven of the ten time slots for the eye movement analysis of the problem trials. We decided to use all seven slots, although one of them did not contain significant eye movements, because we assumed, just as in the first experiment, that processing was not paused in between. In contrast to the first experiment, we found a significant number of participants showing significant eye movements during all three phases (Prem. 1: p < 0.001; Prem. 2: p < 0.05; Concl.: p < 0.01), as shown in Table 3.

The right parts of Figures 4 and 5 show diagrams of the recorded eye movements during all first premises of the form A is west of B and A is north-west of B, respectively. The figures show that saccades along the given direction as well as the opposing direction occur above chance frequency (i.e., 12.5%) for both types of premises.

4.2.4. Comparison of eye-movers to non-eye-movers

Based on the literature, we expected to find spontaneous eye movements corresponding to the processed spatial relations when a visual mental representation is employed. In line with this assumption, a majority of participants exhibited systematic eye movements. We compared the participants who showed a significant amount of eye movements along the given directions in any of the phases (first premise, second premise, or conclusion) with those who did not show significant eye movements in any of the phases. Given this definition, 13 of the 23 participants qualified as eye-movers; the 10 remaining participants will be referred to as non-eye-movers.

There were no significant differences between eye-movers and non-eye-movers in error rate, reaction times, or sex (all p > 0.19). The two groups did, however, show different preferences for the 45° problems, as shown in Figure 7. The eye-movers showed a significant preference for the cardinal direction model at 54.84% (t(12) = 2.7884; p < 0.05); the equal-distance model was not significantly preferred at 37.85% (t(12) = 1.2527; p > 0.23), and the distorted models were employed significantly less than expected at 7.31% (t(12) = −16.1961; p < 0.001). The non-eye-movers showed no significant preference for the cardinal direction model at 38.88% (t(9) = 0.994; p > 0.34), but did show one for the equal-distance model at 57.32% (t(9) = 2.2926; p < 0.05); the distorted models were significantly below expectation at 3.8% (t(9) = −22.3421; p < 0.001).

Figure 7. Preferences of the 45° problems in the second experiment. The vertical axis represents the frequency of the given answer. Top: non-eye-movers; bottom: eye-movers. Error bars show the standard error of the mean. EDM, equal-distance model; CM, cardinal model; DM, distorted models.

4.3. Comparison of the Experiments

In the first experiment, it is only during the conclusion phase that the number of participants showing systematic eye movements becomes significant. This finding seems unexpected given that the number of eye-movers in the second experiment is highest during the first premise, whereas the number of eye-movers in the first experiment is not significant for either premise. Furthermore, the analysis of the eye movements should be most accurate for the first premise: participants are aware of only one spatial relation at that time, and all saccades along the other directions can be assumed to have no relation to the mental representation being constructed. In contrast, during the second premise or the conclusion, all three spatial relations are (at least implicitly) available to the participant and could also result in eye movements, which would, however, not all be counted as "correct" eye movements, because we only checked for the spatial relations of the two premises during the second premise and only for the relation given as the answer during the conclusion. Thus, the chance of finding significant eye movements should be lower for the conclusion phase than for the first premise. It can accordingly be argued that eye movements during the conclusion phase did not necessarily result from the internal processing of spatial relations; rather, some participants may have moved their gaze in anticipation of the answer screen. The answer screen was projected on the wall just after participants clicked to indicate that they had found an answer. A saccade from the middle of the visual field toward the appropriate number on the answer screen, i.e., the number representing the given answer, would have been mapped onto the cardinal direction corresponding to that answer. Thus, there is reason to doubt that the significant number of eye-movers found for the conclusion phase in the first experiment is a result of the employed mental representation.

Given the lack of spontaneous eye movements along the processed relations for the non-eye-movers of the second experiment, we conclude that these participants did not employ a visual mental representation. This conclusion is based on the literature (see Section 3.2), which shows that the employment of visual mental representations is related to the occurrence of such spontaneous eye movements and, furthermore, that these eye movements have a functional role in the employment of visual mental representations (Johansson et al., 2012). The non-eye-movers may have used a spatial mental representation like the participants of the first experiment; this conclusion does not, however, follow from the observation or the literature. We therefore remain agnostic regarding the mental representation of the non-eye-movers of the second experiment.

4.3.1. Comparing reasoning with visual and spatial mental representations

As the two experiments consisted of the same task with only slightly different instructions, we compared participants across the experiments.¹ Reaction times outside a 2.5 * SD range around the mean reaction time of the corresponding phase (first premise, second premise, or conclusion) were excluded from the analysis (3%).
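The outlier criterion above can be sketched as follows (a minimal illustration; the reaction times are invented, not the recorded data):

```python
import numpy as np

# Exclude reaction times outside mean +/- 2.5 * SD of the corresponding
# phase, as described above. The RTs below are invented for illustration.
rts = np.array([1.2, 1.4, 1.3, 1.5, 1.1, 1.35,
                1.25, 1.45, 1.3, 1.2, 1.4, 9.0])  # seconds

mean, sd = rts.mean(), rts.std(ddof=1)  # sample standard deviation
mask = np.abs(rts - mean) <= 2.5 * sd   # True for trials that are kept
filtered = rts[mask]
print(f"excluded {int((~mask).sum())} of {rts.size} trials")  # excluded 1 of 12 trials
```

Note that with very small samples a single outlier inflates the SD enough that it can never exceed the 2.5 SD bound, so a criterion like this is only meaningful with a reasonable number of trials per phase.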

In order to compare the employment of visual mental representations with that of spatial mental representations, we defined two groups: the visual group and the spatial group. The spatial group comprises all participants of the first experiment; the eye-movers of the second experiment constitute the visual group. That is, the spatial group contains those participants who employed a spatial mental representation and the visual group those who employed a visual mental representation. There were no significant differences in error rate, reaction times, or sex (all p > 0.35) between the visual and the spatial group. However, the preferences of the two groups differed, as indicated by a significant interaction between group (spatial or visual) and type of model (cardinal or equal-distance), F(1,36) = 5.644, p < 0.05. Figure 8 shows the preferences of the visual and the spatial group. The spatial group showed a preference for the equal-distance model (EDM) but not for the cardinal model (CM). In contrast, the visual group showed a preference for the cardinal model (CM) but not for the equal-distance model (EDM). Table 4 shows an overview of the preferences for the different groups and experiments. Interestingly, the non-eye-movers of the second experiment showed the same preferences as the participants of the first experiment, i.e., a significant preference for the equal-distance model (EDM) and no significant preference for the cardinal model (CM). This may be taken to indicate that the non-eye-movers employed a spatial mental representation despite the fact that the instructions were formulated to induce a visual mental representation.

Figure 8. Preferences of the 45° problems for the spatial group (top) and the visual group (bottom). The vertical axis represents the frequency of the given answer. Error bars show the standard error of the mean. EDM, equal-distance model; CM, cardinal model; DM, distorted models.

Table 4. Comparison of preferences for the 45° problems between the different groups. S+, frequency significantly above chance; S−, frequency significantly below chance; NS, frequency does not differ significantly from chance; CM, cardinal model; EDM, equal-distance model; DM, distorted models.

5. Discussion

The two experiments yielded two main results. First, the reasoning task elicited no significant systematic eye movements when a spatial mental representation was employed, i.e., in the spatial group. In contrast, we found significant systematic eye movements for a majority of the participants in the second experiment, i.e., the visual group, which employed a visual mental representation. Second, there are significant preferences in the answers to the under-specified problems in both the visual and the spatial group. The preferences did, however, differ between the employed mental representations.

These results relate to the two main open issues about the relationship between spatial and visual mental representation (identified in Section 3.3): (1) whether spatial mental representations are modality-specific, and (2) whether human visuo-spatial reasoning is realized on the level of spatial mental representations.

Regarding the first issue, we observed systematic eye movements in the second experiment but not in the first. The eye movements observed in the second experiment, i.e., the one in which the employment of a visual mental representation was induced, corroborate several studies reporting spontaneous eye movements during visual mental imagery. The fact that we did not find these eye movements for essentially the same reasoning task in the first experiment, i.e., the one in which the employment of a spatial mental representation was induced, suggests that other (attentional) processes are employed when reasoning with spatial mental representations. Since eye movements have been found to play a functional role in processing visual mental representations (Johansson et al., 2012) and are therefore not epiphenomenal, we can conclude that reasoning with visual mental representations draws on overt attentional processes of visual perception, whereas reasoning with spatial mental representations does not. This finding lends support to the assumption of mental model theory that spatial mental representations are amodal or multi-modal.

Regarding the second issue – whether reasoning is realized on the level of spatial mental representations – our results show different preferences depending on the employed mental representation. For the 45° problems, the visual group showed a significant preference for the cardinal model (CM) but not for the equal-distance model (EDM), whereas the spatial group showed a significant preference for the equal-distance model (EDM) but not for the cardinal model (CM). Mental model theory assumes that the hierarchical relationship between visual and spatial mental representations is such that reasoning happens on the level of the spatial mental representation (Knauff and Johnson-Laird, 2002; Knauff et al., 2003), specifically when visual information is irrelevant to the task at hand (as is the case in the presented experiments). This assumption seems to contradict the presented results. The fact that we observed different preferences for the two mental representations for essentially the same reasoning task challenges the claim that reasoning is based on spatial mental representations. This similarly affects the theory of mental imagery, which also states that visual mental representations require underlying spatial mental representations: in order to construct, inspect, and reason with a visual representation, spatial information is necessary to, for example, "know" the location, size, and spatial relations of the shapes that make up a visual mental representation. The results of the experiments are thus hard to reconcile with both the mental model theory and the theory of mental imagery. The assumed hierarchical relationship between spatial and visual mental representations has to be extended with additional explanations of how spatial information is transformed or processed differently in a visual mental representation. In the following, we interpret the results on the preferences with respect to this assumption of the two theories.

The preferred answer given by participants in the spatial group was such that the spatial configuration of the problem has equal distances between the entities. In contrast, the preferred answer of participants in the visual group was such that the spatial configuration contains distances of different lengths. This is especially puzzling given the assumption that those spatial relations are supposed to be provided to the visual mental representation by the spatial mental representation. Sticking with the assumption that the spatial information is provided by the underlying spatial representation, one can think of two general explanations: (1) the spatial relations are somehow altered in the context of a visual representation, or (2) the spatial relations are the same but are processed differently in the two mental representations. Regarding the first option, spatial relations might become more specified when represented in a visual mental representation (Schultheis et al., 2007); that is, additional properties such as distance are specified. On the level of the spatial mental representation, distance might only be represented with generic default values. This would fit with the preferred mental model of the spatial group, in which all distances are equal. This option is furthermore supported by an assumption of mental model theory: "[w]hen people understand spatial descriptions, they imagine symmetrical arrays in which adjacent objects have roughly equal distances between them […]" (Johnson-Laird and Byrne, 1991, p. 94). Regarding the second option, the way spatial relations are processed could differ between the two mental representations. This explanation would fit well with the fact that we found spontaneous eye movements aligned with the currently processed spatial relations for the visual group, but no such eye movements for the spatial group. An implementation of the second option is proposed by a new model of visuo-spatial mental imagery in which the processing of spatial relations is affected by additional visual information and realized by attention shifts such as eye movements (Sima, 2011; Sima and Freksa, 2012).

6. Conclusion

Our experiments provided two new insights into the so-far little-investigated relationship between visual and spatial mental representations: (1) visual and spatial mental representations differ in their employment of overt attentional processes of visual perception; (2) there are preferences when employing visual mental representations just as for spatial mental representations, but the preferences can differ for the same reasoning task. These findings are hard to reconcile with current theories on visuo-spatial processing and challenge some of their assumptions. Future work is necessary to shed more light on the exact relationship between visual and spatial mental representations. This will have to include the refinement of existing theoretical frameworks on the one hand and further empirical research on the other. Regarding the theories, we have additionally presented a systematic comparison of mental model theory and the theory of mental imagery. This comparison showed that these two theories, which are often investigated separately, likely concern the same visual and spatial mental representations; it might serve as the basis of a new unified theory combining the results achieved within both mental model theory and the theory of mental imagery. Regarding future empirical work, the presented experiments show one way of comparing visual and spatial mental representations while keeping the experimental task essentially the same.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

The presented work was done in the project R1-[ImageSpace] of the Transregional Collaborative Research Center SFB/TR 8 Spatial Cognition. Funding by the German Research Foundation (DFG) is gratefully acknowledged. We thank Sven Bertel for fruitful discussions with respect to experimental design and Maren Lindner for her role in designing and conducting the experiments. We thank Thomas Lachmann, David Peebles, and Jelica Nejasmic for their comments and suggestions which helped improve this article.

  • ¹ There is evidence that the (a priori) differences between the two groups of the two experiments were not more substantial or qualitatively different than if the groups had resulted from random assignment within a single experiment. First, the participants of both experiments were recruited from the same population of students at the University of Bremen. The setup of the two experiments was identical apart from the variation in instructions; this includes, specifically, the equipment, the room, the experimenter, and the materials. The experiments were conducted within two consecutive semesters. Second, the two groups did not differ significantly with respect to sex (χ²(1) = 1.42, p > 0.2) or field of study (χ²(2) = 1.18, p > 0.5). The two groups also did not differ significantly in age (t(46) = −1.084; p > 0.28, two-tailed) or performance in the paper-folding test (t(46) = −0.455; p > 0.65, two-tailed). Third, applying the method described in Masson (2011) to the participants' age and performance in the paper-folding test provided positive evidence for the null hypothesis that the two groups did not differ (p(H0|D) = 0.79 and p(H0|D) = 0.86, respectively).
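The posterior probabilities p(H0|D) in this footnote follow Masson's (2011) BIC approximation for null-hypothesis evidence. A sketch of that computation (a generic reconstruction of the method, not the authors' analysis script; equal prior odds for H0 and H1 assumed) reproduces the reported value for the age comparison:

```python
import math

def posterior_h0(t, df, n):
    """Masson's (2011) BIC approximation of p(H0|D) for a t-test,
    assuming equal prior odds for H0 and H1."""
    r2 = t ** 2 / (t ** 2 + df)                      # variance explained under H1
    delta_bic = n * math.log(1 - r2) + math.log(n)   # BIC(H1) - BIC(H0)
    bf01 = math.exp(delta_bic / 2)                   # Bayes factor favoring H0
    return bf01 / (1 + bf01)

# Age comparison from the footnote: t(46) = -1.084, n = 48 participants.
print(round(posterior_h0(-1.084, df=46, n=48), 2))  # 0.79
```

Values above roughly 0.75 are conventionally read as positive evidence for the null hypothesis, which is how the footnote interprets 0.79 and 0.86.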

Borst, G., Kosslyn, S., and Denis, M. (2006). Different cognitive processes in two image-scanning paradigms. Mem. Cognit. 34, 475–490.

Brandt, S. A., and Stark, L. W. (1997). Spontaneous eye movements during visual imagery reflect the content of the visual scene. J. Cogn. Neurosci. 9, 27–38.

Byrne, R. M. J., and Johnson-Laird, P. (1989). Spatial reasoning. J. Mem. Lang. 28, 564–575.

Chambers, D., and Reisberg, D. (1985). Can mental images be ambiguous? J. Exp. Psychol. Hum. Percept. Perform. 11, 317–328.

Courtney, S. M., Ungerleider, L. G., Keil, K., and Haxby, J. V. (1996). Object and spatial visual working memory activate separate neural systems in human cortex. Cereb. Cortex 6, 39–49.

Demarais, A. M., and Cohen, B. H. (1998). Evidence for image-scanning eye movements during transitive inference. Biol. Psychol. 49, 229–247.

Farah, M. J., Levine, D. N., and Calvanio, R. (1988). A case study of mental imagery deficit. Brain Cogn. 8, 147–164.

Finke, R. A. (1989). Principles of Mental Imagery . Cambridge, MA: MIT Press.

Goel, V., and Dolan, R. J. (2001). Functional neuroanatomy of three-term relational reasoning. Neuropsychologia 39, 901–909.

Holsanova, J., Hedberg, B., and Nilsson, N. (1998). “Visual and verbal focus patterns when describing pictures,” in Current Oculomotor Research: Physiological and Psychological Aspects , eds W. Becker, H. Deubel, and T. Mergner (New York: Plenum) 303–304.

Jahn, G., Knauff, M., and Johnson-Laird, P. N. (2007). Preferred mental models in reasoning about spatial relations. Mem. Cognit. 35, 2075–2087.

Johansson, R., Holsanova, J., Dewhurst, R., and Holmqvist, K. (2012). Eye movements during scene recollection have a functional role, but they are not reinstatements of those produced during encoding. J. Exp. Psychol. Hum. Percept. Perform. 38, 1289–1314.

Johansson, R., Holsanova, J., and Holmqvist, K. (2006). Pictures and spoken descriptions elicit similar eye movements during mental imagery, both in light and in complete darkness. Cogn. Sci. 30, 1053–1079.

Johnson-Laird, P. N. (1972). The three-term series problem. Cognition 1, 57–82.

Johnson-Laird, P. N. (1989). “Mental models,” in Foundations of Cognitive Science , ed. M. I. Posner (Cambridge, MA: MIT Press), 469–499.

Johnson-Laird, P. N. (1998). “Imagery, visualization, and thinking,” in Perception and Cognition at Century’s End , ed. J. Hochberg (San Diego: Academic Press), 441–467.

Johnson-Laird, P. N. (2001). Mental models and deduction. Trends Cogn. Sci. (Regul. Ed.) 5, 434–442.

Johnson-Laird, P. N., and Byrne, R. M. J. (1991). Deduction . Hove: Erlbaum.

Klauer, K., and Zhao, Z. (2004). Double dissociations in visual and spatial short-term memory [Review]. J. Exp. Psychol. Gen. 133, 355–381.

Knauff, M., Fangmeier, T., Ruff, C. C., and Johnson-Laird, P. N. (2003). Reasoning, models, and images: behavioral measures and cortical activity. J. Cogn. Neurosci. 15, 559–573.

Knauff, M., and Johnson-Laird, P. (2002). Visual imagery can impede reasoning. Mem. Cognit. 30, 363–371.

Kosslyn, S. (1973). Scanning visual images – some structural implications. Percept. Psychophys. 14, 90–94.

Kosslyn, S. M. (1980). Image and Mind . Cambridge, MA: Harvard University Press.

Kosslyn, S. M. (1994). Image and Brain: The Resolution of the Imagery Debate . Cambridge, MA: The MIT Press.

Kosslyn, S. M., Reiser, B. J., Farah, M. J., and Fliegel, S. L. (1983). Generating visual images: units and relations. J. Exp. Psychol. Gen. 112, 278–303.

Kosslyn, S. M., and Thompson, W. L. (2003). When is early visual cortex activated during visual mental imagery? Psychol. Bull. 129, 723–746.

Kosslyn, S. M., Thompson, W. L., and Ganis, G. (2006). The Case for Mental Imagery . New York: Oxford University Press.

Levine, D. N., Warach, J., and Farah, M. (1985). Two visual systems in mental imagery: dissociation of “what” and “where” in imagery disorders due to bilateral posterior cerebral lesions. Neurology 35, 1010–1018.

Masson, M. E. J. (2011). A tutorial on a practical Bayesian alternative to null-hypothesis significance testing. Behav. Res. Methods 43, 679–690.

Mellet, E., Tzourio-Mazoyer, N., Bricogne, S., Mazoyer, B., Kosslyn, S. M., and Denis, M. (2000). Functional anatomy of high-resolution visual mental imagery. J. Cogn. Neurosci. 12, 98–109.

Newcombe, F., Ratcliff, G., and Damasio, H. (1987). Dissociable visual and spatial impairments following right posterior cerebral lesions: clinical, neuropsychological and anatomical evidence. Neuropsychologia 25, 149–161.

Pylyshyn, Z. W. (2002). Mental imagery: in search of a theory. Behav. Brain Sci. 25, 157–238.

Rauh, R., Hagen, C., Knauff, M., Kuss, T., Schlieder, C., and Strube, G. (2005). Preferred and alternative mental models in spatial reasoning. Spat. Cogn. Comput. 5, 239–269.

Rips, L. J. (1994). The Psychology of Proof: Deductive Reasoning in Human Thinking . Cambridge, MA: MIT Press.

Sack, A. T. (2009). Parietal cortex and spatial cognition. Behav. Brain Res. 202, 153–161.

Schultheis, H., and Barkowsky, T. (2011). Casimir: an architecture for mental spatial knowledge processing. Top. Cogn. Sci. 3, 778–795.

Schultheis, H., and Barkowsky, T. (2013). Variable stability of preferences in spatial reasoning. Cogn. Process. doi:10.1007/s10339-013-0554-4

Schultheis, H., Bertel, S., Barkowsky, T., and Seifert, I. (2007). “The spatial and the visual in mental spatial reasoning: an ill-posed distinction,” in Spatial Cognition V – Reasoning, Action, Interaction , eds T. Barkowsky, M. Knauff, G. Ligozat, and D. R. Montello (Berlin: Springer Verlag), 191–209.

Sereno, M. I., Pitzalis, S., and Martinez, A. (2001). Mapping of contralateral space in retinotopic coordinates by a parietal cortical area in humans. Science 294, 1350–1354.

Sima, J. F. (2011). “The nature of mental images – an integrative computational theory,” in Proceedings of the 33rd Annual Conference of the Cognitive Science Society , eds L. Carlson, C. Hoelscher, and T. Shipley (Austin, TX: Cognitive Science Society), 2878–2883.

Sima, J. F., and Freksa, C. (2012). Towards computational cognitive modeling of mental imagery. Künstliche Intell. 26, 1–7.

Smith, E. E., and Jonides, J. (1997). Working memory: a view from neuroimaging. Cogn. Psychol. 33, 5–42.

Spivey, M. J., and Geng, J. J. (2001). Oculomotor mechanisms activated by imagery and memory: eye movements to absent objects. Psychol. Res. 65, 235–241.

Tversky, B. (1993). “Cognitive maps, cognitive collages, and spatial mental models,” in Spatial Information Theory: A Theoretical Basis for GIS – Proceedings of COSIT’93 , eds A. U. Frank and I. Campari (Berlin: Springer), 14–24.

Ungerleider, L., and Mishkin, M. (1982). “Two cortical systems,” in Analysis of Visual Behavior , eds D. J. Ingle, M. A. Goodale, and R. J. W. Mansfield (Cambridge: MIT Press), 549–586.

Keywords: mental representation, mental imagery, mental models, preferred mental models, visual mental representation, spatial mental representation, eye tracking

Citation: Sima JF, Schultheis H and Barkowsky T (2013) Differences between spatial and visual mental representations. Front. Psychol. 4 :240. doi: 10.3389/fpsyg.2013.00240

Parts of this research have been presented at the Spatial Cognition conference 2010.

Received: 07 December 2012; Accepted: 12 April 2013; Published online: 08 May 2013.

Copyright: © 2013 Sima, Schultheis and Barkowsky. This is an open-access article distributed under the terms of the Creative Commons Attribution License , which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.

*Correspondence: Jan Frederik Sima, Department of Informatics, Cognitive Systems, Universität Bremen, Enrique-Schmidt-Str. 5, 28359 Bremen, Germany. e-mail: sima@sfbtr8.uni-bremen.de

Visual Representation

  • Reference work entry
  • pp 3405–3410

  • Yannis Ioannidis

Graphical representation

The concept of “representation” captures the signs that stand in for and take the place of something else [ 5 ]. Visual representation, in particular, refers to the special case when these signs are visual (as opposed to textual, mathematical, etc.). On the other hand, there is no limit on what may be (visually) represented, which may range from abstract concepts to concrete objects in the real world or data items.

In addition to the above, however, the term “representation” is often overloaded and used to imply the actual process of connecting the two worlds of the original items and of their representatives. Typically, the context determines quite clearly which of the two meanings is intended in each case, hence, the term is used for both without further explanation.

Underneath any visual representation lies a mapping between the set of items that are being represented and the set of visual elements that are used to represent them, i.e., to...


Recommended Reading

Card, S.K., Mackinlay, J.D., and Shneiderman, B. Information visualization. In Readings in Information Visualization: Using Vision to Think, 1999, pp. 1–34.

Card, S.K., Mackinlay, J.D., and Shneiderman, B. Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann, Los Altos, CA, 1999.

Foley, J.D., van Dam, A., Feiner, S.K., and Hughes, J.F. Computer Graphics: Principles and Practice. Addison-Wesley, Reading, MA, 1990.

Haber, E.M., Ioannidis, Y., and Livny, M. Foundations of visual metaphors for schema display. J. Intell. Inf. Syst., 3(3/4):263–298, 1994.

Mitchell, W. Representation. In Critical Terms for Literary Study, Lentricchia, F. and McLaughlin, T. (eds.), 2nd edn. University of Chicago Press, Chicago, IL, 1995.

Tufte, E.R. The Visual Display of Quantitative Information. Graphics Press, Cheshire, CT, 1983.

Tufte, E.R. Envisioning Information. Graphics Press, Cheshire, CT, 1990.

Author: Yannis Ioannidis, University of Athens, Athens, Greece

Editors: Ling Liu (College of Computing, Georgia Institute of Technology, Atlanta, GA, USA) and M. Tamer Özsu (David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada)

© 2009 Springer Science+Business Media, LLC

Cite this entry: Ioannidis, Y. (2009). Visual Representation. In: Liu, L., Özsu, M.T. (eds.) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_449

Print ISBN 978-0-387-35544-3 | Online ISBN 978-0-387-39940-9



Perception

5 On the Function of Visual Representation

Published: April 1996

The advent of the computer age enabled significant developments in the study of visual representation, particularly the emergence of computational theories able to make sense of the large volumes of data collected. This background leads into a discussion of the “Literalist View,” which explains perception as the product of logical computations by which the brain reconciles visual stimuli with existing mental “retinotopic structures,” assumed to be truthful representations of the world. The chapter then cites several works, namely those of Churchland and Grimes and the Nina experiments, that expose loopholes in this theory. An alternative, non-Literalist theory is then offered, the “Functional View,” which provides a different account of how the brain interprets visual stimuli. Specifically, it posits evidence of selective visual representation, dependent on the importance of a visual stimulus to the particular individual.



Visual Imagery

Visual Imagery is the mental representation or recreation of something that is not physically present. It involves the mind’s ‘eye’ forming images, enabling us to ‘see’ a concept, idea, or physical object even when it is not before our eyes. This cognitive process can significantly impact our thought processes, memory recall, and even physiological responses.

Types of Visual Imagery

Visual Imagery is not restricted to a single form. It takes several distinct forms, each with its own benefits and uses.

Static Imagery

Static Imagery involves the mental visualization of still images. This can include anything from remembering a picture you saw in a book to envisioning the face of a loved one or recalling a scene from a movie.

Dynamic Imagery

Dynamic Imagery, on the other hand, involves the creation of moving images in the mind. This could include imagining a horse galloping across a field, visualizing the flow of a river, or picturing a car driving down a street.

Interactive Imagery

Interactive Imagery is a step beyond static and dynamic imagery, wherein you visualize yourself interacting with the visualized scene or object. Athletes often use this type of imagery, imagining themselves performing their sport to prepare mentally for the actual event.

Examples of Visual Imagery

Visual imagery is often used in literature to create vivid mental pictures that immerse readers in the story. Consider the following excerpt from “The Great Gatsby” by F. Scott Fitzgerald:

“In his blue gardens men and girls came and went like moths among the whisperings and the champagne and the stars.”

In this sentence, Fitzgerald uses visual imagery to describe Gatsby’s lavish parties. The readers can picture the men and girls moving around the blue gardens, sipping champagne under the starlit sky, much like moths fluttering around.

Everyday Life

Everyday life presents numerous instances of visual imagery. For example, when planning a trip, you might visualize the sights you’ll see, the hotel room you’ll stay in, or the food you’ll eat. Similarly, if you’re cooking a new recipe, you might imagine each step, visualizing how to chop the vegetables, stir the ingredients in the pan, or present the dish on a plate.

Visual artists use imagery as a crucial component of their work. For instance, painters visualize the final product before even touching the brush to the canvas. This mental image guides their hand movements and choice of colors as they bring their vision to life.

In sports, athletes often employ visual imagery to improve their performance. A basketball player might visualize making a successful free throw, picturing the trajectory of the ball, its arc towards the hoop, and the satisfying swish of the net. This mental practice can boost their confidence and improve their actual performance in the game.

In psychology, guided imagery therapy involves therapists directing patients to imagine a particular scene or scenario. For example, a person dealing with stress might be guided to picture a peaceful beach, feeling the warmth of the sun on their skin, hearing the gentle waves lapping at the shore, and smelling the salty sea air. Such visual imagery can promote relaxation and stress relief.

Visual imagery is an innate capability that not only enables us to revisit the past and anticipate the future but also empowers us to conceptualize, learn, and even heal. Understanding its various forms and applications can provide us with an essential tool for enhancing various aspects of our lives.

  • Open access
  • Published: 19 July 2015

The role of visual representations in scientific practices: from conceptual understanding and knowledge generation to ‘seeing’ how science works

  • Maria Evagorou 1 ,
  • Sibel Erduran 2 &
  • Terhi Mäntylä 3  

International Journal of STEM Education, volume 2, Article number: 11 (2015)


The use of visual representations (i.e., photographs, diagrams, models) has long been part of science, and their use makes it possible for scientists to interact with and represent complex phenomena not observable in other ways. Despite a wealth of research in science education on visual representations, the emphasis of such research has mainly been on conceptual understanding when using visual representations and less on visual representations as epistemic objects. In this paper, we argue that by positioning visual representations as epistemic objects of scientific practices, science education can bring a renewed focus on how visualization contributes to knowledge formation in science from the learners’ perspective.

This is a theoretical paper, and in order to argue about the role of visualization, we first present a case study, that of the discovery of the structure of DNA, which highlights the epistemic components of visual information in science. The second case study focuses on Faraday’s use of the lines of magnetic force. Faraday is known for his exploratory, creative, and yet systematic way of experimenting, and the visual reasoning leading to theoretical development was an inherent part of that experimentation. Third, we trace a contemporary account from science focusing on experimental practices and how the reproducibility of experimental procedures can be reinforced through video data.

Conclusions

Our conclusions suggest that in teaching science, the emphasis in visualization should shift from cognitive understanding—using the products of science to understand the content—to engaging in the processes of visualization. Furthermore, we suggest that it is essential to design curriculum materials and learning environments that create a social and epistemic context, invite students to engage in the practice of visualization as evidence, reasoning, experimental procedure, or a means of communication, and prompt them to reflect on these practices. Implications for teacher education include the need for teacher professional development programs to problematize the use of visual representations as epistemic objects that are part of scientific practices.

During the last decades, research and reform documents in science education across the world have been calling for an emphasis not only on the content but also on the processes of science (Bybee 2014 ; Eurydice 2012 ; Duschl and Bybee 2014 ; Osborne 2014 ; Schwartz et al. 2012 ), in order to make science accessible to the students and enable them to understand the epistemic foundation of science. Scientific practices, part of the process of science, are the cognitive and discursive activities that are targeted in science education to develop epistemic understanding and appreciation of the nature of science (Duschl et al. 2008 ) and have been the emphasis of recent reform documents in science education across the world (Achieve 2013 ; Eurydice 2012 ). With the term scientific practices, we refer to the processes that take place during scientific discoveries and include among others: asking questions, developing and using models, engaging in arguments, and constructing and communicating explanations (National Research Council 2012 ). The emphasis on scientific practices aims to move the teaching of science from knowledge to the understanding of the processes and the epistemic aspects of science. Additionally, by placing an emphasis on engaging students in scientific practices, we aim to help students acquire scientific knowledge in meaningful contexts that resemble the reality of scientific discoveries.

Despite a wealth of research in science education on visual representations, the emphasis of such research has mainly been on the conceptual understanding when using visual representations and less on visual representations as epistemic objects. In this paper, we argue that by positioning visual representations as epistemic objects, science education can bring a renewed focus on how visualization contributes to knowledge formation in science from the learners’ perspective. Specifically, the use of visual representations (i.e., photographs, diagrams, tables, charts) has been part of science and over the years has evolved with the new technologies (i.e., from drawings to advanced digital images and three dimensional models). Visualization makes it possible for scientists to interact with complex phenomena (Richards 2003 ), and they might convey important evidence not observable in other ways. Visual representations as a tool to support cognitive understanding in science have been studied extensively (i.e., Gilbert 2010 ; Wu and Shah 2004 ). Studies in science education have explored the use of images in science textbooks (i.e., Dimopoulos et al. 2003 ; Bungum 2008 ), students’ representations or models when doing science (i.e., Gilbert et al. 2008 ; Dori et al. 2003 ; Lehrer and Schauble 2012 ; Schwarz et al. 2009 ), and students’ images of science and scientists (i.e., Chambers 1983 ). Therefore, studies in the field of science education have been using the term visualization as “the formation of an internal representation from an external representation” (Gilbert et al. 2008 , p. 4) or as a tool for conceptual understanding for students.

In this paper, we do not refer to visualization as mental image, model, or presentation only (Gilbert et al. 2008; Philips et al. 2010) but instead focus on visual representations or visualization as epistemic objects. Specifically, we refer to visualization as a process for knowledge production and growth in science. In this respect, modeling is an aspect of visualization, but what we are focusing on with visualization is not the use of the model as a tool for cognitive understanding (Gilbert 2010; Wu and Shah 2004) but the process of modeling as a scientific practice, which includes the construction and use of models, the use of other representations, communication within groups through the visual representation, and an appreciation of the difficulties that scientists face in this process. Therefore, the purpose of this paper is to present, through the history of science, how visualization can be considered not only as a cognitive tool in science education but also as an epistemic object that can potentially support students to understand aspects of the nature of science.

Scientific practices and science education

According to the Next Generation Science Standards (Achieve 2013), scientific practices refer to: asking questions and defining problems; developing and using models; planning and carrying out investigations; analyzing and interpreting data; using mathematical and computational thinking; constructing explanations and designing solutions; engaging in argument from evidence; and obtaining, evaluating, and communicating information. A significant aspect of scientific practices is that science learning is more than just about learning facts, concepts, theories, and laws. A fuller appreciation of science necessitates understanding the science relative to its epistemological grounding and the processes that are involved in the production of knowledge (Hogan and Maglienti 2001; Wickman 2004).

The Next Generation Science Standards are, among other changes, shifting away from science inquiry and towards the inclusion of scientific practices (Duschl and Bybee 2014; Osborne 2014). By comparing the abilities to do scientific inquiry (National Research Council 2000) with the set of scientific practices, it is evident that the latter is about engaging in the processes of doing science and thereby experiencing science in a more authentic way. Engaging in scientific practices, according to Osborne (2014), “presents a more authentic picture of the endeavor that is science” (p. 183) and also helps students to develop a deeper understanding of the epistemic aspects of science. Furthermore, as Bybee (2014) argues, by engaging students in scientific practices, we involve them in an understanding of the nature of science and of the nature of scientific knowledge.

Science as a practice, and scientific practices as a term, emerged from the work of the philosopher of science Kuhn (Osborne 2014) and refer to the processes in which scientists engage during knowledge production and communication. Subsequent work by historians, philosophers, and sociologists of science (Latour 2011; Longino 2002; Nersessian 2008) revealed the scientific practices in which scientists engage, which include, among others, theory development and specific ways of talking, modeling, and communicating the outcomes of science.

Visualization as an epistemic object

Schematic, pictorial symbols in the design of scientific instruments and analysis of the perceptual and functional information that is being stored in those images have been areas of investigation in philosophy of scientific experimentation (Gooding et al. 1993 ). The nature of visual perception, the relationship between thought and vision, and the role of reproducibility as a norm for experimental research form a central aspect of this domain of research in philosophy of science. For instance, Rothbart ( 1997 ) has argued that visualizations are commonplace in the theoretical sciences even if every scientific theory may not be defined by visualized models.

Visual representations (i.e., photographs, diagrams, tables, charts, models) have been used in science over the years to enable scientists to interact with complex phenomena (Richards 2003) and might convey important evidence not observable in other ways (Barber et al. 2006). Some authors (e.g., Ruivenkamp and Rip 2010) have argued that visualization is a core activity of some scientific communities of practice (e.g., nanotechnology), while others (e.g., Lynch and Edgerton 1988) have differentiated the role of particular visualization techniques (e.g., of digital image processing in astronomy). Visualization in science includes the complex process through which scientists develop or produce imagery, schemes, and graphical representations, and therefore, what is of importance in this process is not only the result but also the methodology employed by the scientists, namely, how this result was produced. Visual representations in science may refer to objects that are believed to have some kind of material or physical existence but equally might refer to purely mental, conceptual, and abstract constructs (Pauwels 2006). More specifically, visual representations can be found for: (a) phenomena that are not observable with the eye (i.e., microscopic or macroscopic); (b) phenomena that do not exist as visual representations but can be translated as such (i.e., sound); and (c) experimental settings that provide visual data representations (i.e., graphs presenting the velocity of moving objects). Additionally, since science is not only about replicating reality but also about making it more understandable to people (either to the public or other scientists), visual representations are not only about reproducing nature but also about: (a) helping to solve a problem, (b) filling gaps in our knowledge, and (c) facilitating knowledge building or transfer (Lynch 2006).

Using or developing visual representations in scientific practice can range from a straightforward to a complicated situation. More specifically, scientists can observe a phenomenon (i.e., mitosis) and represent it visually using a picture or diagram, which is quite straightforward. But they can also use a variety of complicated techniques (i.e., crystallography in the case of DNA studies) that are either available or need to be developed or refined in order to acquire the visual information that can be used in the process of theory development (i.e., Latour and Woolgar 1979). Furthermore, some visual representations need decoding, and the scientists need to learn how to read these images (i.e., radiologists); therefore, using visual representations in the process of science requires learning a new language that is specific to the medium/methods used (i.e., understanding an X-ray picture is different from understanding an MRI scan) and then communicating that language to other scientists and the public.

Visual representations serve many intents and purposes in scientific practices: for example, to make a diagnosis; to compare, describe, and preserve for future study; to verify and explore new territory; to generate new data (Pauwels 2006); or to present new methodologies. According to Latour and Woolgar (1979) and Knorr Cetina (1999), visual representations can be used as primary data (i.e., an image from a microscope), to help in concept development (i.e., the models of DNA used by Watson and Crick), to uncover relationships, and to make the abstract more concrete (graphs of sound waves). Therefore, visual representations and visual practices, in all forms, are an important aspect of the scientific practices of developing, clarifying, and transmitting scientific knowledge (Pauwels 2006).

Methods and results: merging visualization and scientific practices in science

In this paper, we present three case studies that embody the working practices of scientists in an effort to present visualization as a scientific practice and to argue that visualization is a complex process that can include, among others, modeling and the use of representations but is not limited to them. The first case study explores the role of visualization in the construction of knowledge about the structure of DNA, using visuals as evidence. The second case study focuses on Faraday’s use of the lines of magnetic force and the visual reasoning leading to the theoretical development that was an inherent part of the experimentation. The third case study focuses on the current practices of scientists in the context of a peer-reviewed journal called the Journal of Visualized Experiments, where the methodology is communicated through videotaped procedures. The three case studies represent the research interests of the three authors of this paper and were chosen to present how visualization as a practice can be involved in all stages of doing science, from hypothesizing and evaluating evidence (case study 1) to experimenting and reasoning (case study 2) to communicating the findings and methodology with the research community (case study 3), representing in this way the three functions of visualization as presented by Lynch (2006). Furthermore, the last case study showcases how the development of visualization technologies has contributed to the communication of findings and methodologies in science, presenting in that way an aspect of current scientific practices. In all three cases, our approach is guided by the observation that visual information is, at the least, an integral part of scientific practices, and is often particularly central to them.

Case study 1: using visual representations as evidence in the discovery of DNA

The focus of the first case study is the discovery of the structure of DNA. DNA was first isolated in 1869 by Friedrich Miescher, and by the late 1940s, it was known to contain phosphate, sugar, and four nitrogen-containing chemical bases. However, no one had figured out the structure of DNA until Watson and Crick presented their model of DNA in 1953. Beyond the social aspects of the discovery of DNA, another important aspect was the role of visual evidence that led to knowledge development in the area. More specifically, by studying the personal accounts of Watson (1968) and Crick (1988) about the discovery of the structure of DNA, the following main ideas regarding the role of visual representations in the production of knowledge can be identified: (a) the use of visual representations was an important part of knowledge growth and was often dependent upon the discovery of new technologies (i.e., better microscopes or better techniques in crystallography that would provide better visual representations as evidence of the helical structure of DNA); and (b) models (three-dimensional) were used as a way to represent the visual images (X-ray images) and connect them to the evidence provided by other sources to see whether the theory could be supported. Therefore, the model of DNA was built on the combination of visual evidence and experimental data.

An example showcasing the importance of visual representations in the process of knowledge production in this case is provided by Watson, in his book The Double Helix (1968):

…since the middle of the summer Rosy [Rosalind Franklin] had had evidence for a new three-dimensional form of DNA. It occurred when the DNA molecules were surrounded by a large amount of water. When I asked what the pattern was like, Maurice went into the adjacent room to pick up a print of the new form they called the “B” structure. The instant I saw the picture, my mouth fell open and my pulse began to race. The pattern was unbelievably simpler than those previously obtained (A form). Moreover, the black cross of reflections which dominated the picture could arise only from a helical structure. With the A form the argument for the helix was never straightforward, and considerable ambiguity existed as to exactly which type of helical symmetry was present. With the B form however, mere inspection of its X-ray picture gave several of the vital helical parameters. (p. 167-169)

As suggested by Watson’s personal account of the discovery of the DNA, the photo taken by Rosalind Franklin (Fig.  1 ) convinced him that the DNA molecule must consist of two chains arranged in a paired helix, which resembles a spiral staircase or ladder, and on March 7, 1953, Watson and Crick finished and presented their model of the structure of DNA (Watson and Berry 2004 ; Watson 1968 ) which was based on the visual information provided by the X-ray image and their knowledge of chemistry.

X-ray crystallography of DNA

In analyzing the visualization practice in this case study, we observe the following instances that highlight how the visual information played a role:

Asking questions and defining problems: The real world in the model of science can at some points only be observed through visual representations, i.e., if we use DNA as an example, the structure of DNA was only observable through the crystallography images produced by Rosalind Franklin in the laboratory; there was no other way to observe the structure of DNA, and therefore this aspect of the real world.

Analyzing and interpreting data: The images that resulted from crystallography as well as their interpretations served as the data for the scientists studying the structure of DNA.

Experimenting: The data in the form of visual information were used to predict the possible structure of the DNA.

Modeling: Based on the prediction, an actual three-dimensional model was prepared by Watson and Crick. The first model did not fit with the real world (refuted by Rosalind Franklin and her research group from King’s College) and Watson and Crick had to go through the same process again to find better visual evidence (better crystallography images) and create an improved visual model.

Example excerpts from Watson’s personal account provide further evidence for how visualization practices were applied in the context of the discovery of DNA (Table  1 ).

In summary, by examining the history of the discovery of DNA, we showcased how visual data are used as scientific evidence in science, identifying in that way an aspect of the nature of science that is still underexplored in the history of science and that has been ignored in the teaching of science. Visual representations are used in many ways: as images, as models, as evidence to support or rebut a model, and as interpretations of reality.

Case study 2: applying visual reasoning in knowledge production, the example of the lines of magnetic force

The focus of this case study is on Faraday’s use of the lines of magnetic force. Faraday is known for his exploratory, creative, and yet systematic way of experimenting, and the visual reasoning leading to theoretical development was an inherent part of this experimentation (Gooding 2006). Faraday’s articles and notebooks do not include mathematical formulations; instead, they include images and illustrations, from experimental devices and setups to the recapping of his theoretical ideas (Nersessian 2008). According to Gooding (2006), “Faraday’s visual method was designed not to copy apparent features of the world, but to analyse and replicate them” (2006, p. 46).

The lines of force played a central role in Faraday’s research on electricity and magnetism and in the development of his “field theory” (Faraday 1852a; Nersessian 1984). Before Faraday, the experiments with iron filings around magnets were known, and the term “magnetic curves” was used both for the iron filing patterns and for the geometrical constructs derived from the mathematical theory of magnetism (Gooding et al. 1993). However, Faraday used the lines of force for explaining his experimental observations and in constructing the theory of forces in magnetism and electricity. Examples of Faraday’s different illustrations of lines of magnetic force are given in Fig.  2 . Faraday gave the following experiment-based definition for the lines of magnetic force:

a Iron filing pattern in case of bar magnet drawn by Faraday (Faraday 1852b , Plate IX, p. 158, Fig. 1), b Faraday’s drawing of lines of magnetic force in case of cylinder magnet, where the experimental procedure, knife blade showing the direction of lines, is combined into drawing (Faraday, 1855, vol. 1, plate 1)

A line of magnetic force may be defined as that line which is described by a very small magnetic needle, when it is so moved in either direction correspondent to its length, that the needle is constantly a tangent to the line of motion; or it is that line along which, if a transverse wire be moved in either direction, there is no tendency to the formation of any current in the wire, whilst if moved in any other direction there is such a tendency; or it is that line which coincides with the direction of the magnecrystallic axis of a crystal of bismuth, which is carried in either direction along it. The direction of these lines about and amongst magnets and electric currents, is easily represented and understood, in a general manner, by the ordinary use of iron filings. (Faraday 1852a , p. 25 (3071))

The definition describes the connection between the experiments and the visual representation of the results. Initially, the lines of force were just geometric representations, but later, Faraday treated them as physical objects (Nersessian 1984 ; Pocovi and Finlay 2002 ):

I have sometimes used the term lines of force so vaguely, as to leave the reader doubtful whether I intended it as a merely representative idea of the forces, or as the description of the path along which the power was continuously exerted. … wherever the expression line of force is taken simply to represent the disposition of forces, it shall have the fullness of that meaning; but that wherever it may seem to represent the idea of the physical mode of transmission of the force, it expresses in that respect the opinion to which I incline at present. The opinion may be erroneous, and yet all that relates or refers to the disposition of the force will remain the same. (Faraday, 1852a , p. 55-56 (3075))

He also felt that the lines of force had greater explanatory power than the dominant theory of action-at-a-distance:

Now it appears to me that these lines may be employed with great advantage to represent nature, condition, direction and comparative amount of the magnetic forces; and that in many cases they have, to the physical reasoner at least, a superiority over that method which represents the forces as concentrated in centres of action… (Faraday, 1852a , p. 26 (3074))

To give some insight into Faraday’s visual reasoning as an epistemic practice, the following examples of Faraday’s studies of the lines of magnetic force (Faraday 1852a, 1852b) are presented:

(a) Asking questions and defining problems: The iron filing patterns formed the empirical basis for the visual model: 2D visualization of lines of magnetic force as presented in Fig.  2 . According to Faraday, these iron filing patterns were suitable for illustrating the direction and form of the magnetic lines of force (emphasis added):

It must be well understood that these forms give no indication by their appearance of the relative strength of the magnetic force at different places, inasmuch as the appearance of the lines depends greatly upon the quantity of filings and the amount of tapping; but the direction and forms of these lines are well given, and these indicate, in a considerable degree, the direction in which the forces increase and diminish . (Faraday 1852b , p.158 (3237))

Despite being static and two dimensional on paper, the lines of magnetic force were dynamical (Nersessian 1992 , 2008 ) and three dimensional for Faraday (see Fig.  2 b). For instance, Faraday described the lines of force “expanding”, “bending,” and “being cut” (Nersessian 1992 ). In Fig.  2 b, Faraday has summarized his experiment (bar magnet and knife blade) and its results (lines of force) in one picture.

(b) Analyzing and interpreting data: The model was so powerful for Faraday that he ended up thinking of the lines as physical objects (e.g., Nersessian 1984), i.e., making interpretations of the way forces act. He performed many experiments to show the physical existence of the lines of force, but he did not succeed (Nersessian 1984). The following quote illuminates Faraday’s use of the lines of force in different situations:

The study of these lines has, at different times, been greatly influential in leading me to various results, which I think prove their utility as well as fertility. Thus, the law of magneto-electric induction; the earth’s inductive action; the relation of magnetism and light; diamagnetic action and its law, and magnetocrystallic action, are the cases of this kind… (Faraday 1852a , p. 55 (3174))

(c) Experimenting: Faraday performed many exploratory experiments; in the case of the lines of magnetic force, he used, e.g., iron filings, magnetic needles, or current-carrying wires (see the quote above). The magnetic field is not directly observable, and the representation of the lines of force was a visual model, which includes the direction, form, and magnitude of the field.
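Faraday's operational definition quoted earlier, a very small needle moved so that it stays constantly tangent to the line of motion, lends itself to a simple numerical sketch. The code below is a minimal illustration under our own assumptions (an idealized 2D point dipole stands in for the bar magnet, and all names are ours, not Faraday's): it traces a line of force by repeatedly stepping a short distance along the local field direction.

```python
import math

def dipole_field(mx, my, x, y):
    """Field of an idealized 2D point dipole with moment (mx, my) at the
    origin, evaluated at (x, y). Constant factors are dropped because only
    the direction of the line of force matters here."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    m_dot_rhat = (mx * x + my * y) / r
    bx = (3 * m_dot_rhat * x / r - mx) / (r2 * r)
    by = (3 * m_dot_rhat * y / r - my) / (r2 * r)
    return bx, by

def trace_line_of_force(x, y, step=0.01, n=500):
    """Mimic Faraday's needle: at each point, step a small distance along
    the local field direction, so the traced path stays tangent to it."""
    path = [(x, y)]
    for _ in range(n):
        bx, by = dipole_field(0.0, 1.0, x, y)  # dipole moment along +y
        norm = math.hypot(bx, by)
        x, y = x + step * bx / norm, y + step * by / norm
        path.append((x, y))
    return path
```

Plotting several such paths from different starting points reproduces, qualitatively, the curved patterns that the iron filings make visible.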

(d) Modeling: There is no denying that the lines of magnetic force are visual by nature. Faraday’s views of the lines of force developed gradually over the years, and he applied and developed them in different contexts such as electromagnetic, electrostatic, and magnetic induction (Nersessian 1984). An example of Faraday’s explanation of the effect of the wire b’s position on the experiment is given in Fig.  3 . In Fig.  3 , a few magnetic lines of force are drawn, and in the quote below, Faraday explains the effect using these magnetic lines of force (emphasis added):

Picture of an experiment with different arrangements of wires ( a , b’ , b” ), magnet, and galvanometer. Note the lines of force drawn around the magnet. (Faraday 1852a , p. 34)

It will be evident by inspection of Fig. 3 , that, however the wires are carried away, the general result will, according to the assumed principles of action, be the same; for if a be the axial wire, and b’, b”, b”’ the equatorial wire, represented in three different positions, whatever magnetic lines of force pass across the latter wire in one position, will also pass it in the other, or in any other position which can be given to it. The distance of the wire at the place of intersection with the lines of force, has been shown, by the experiments (3093.), to be unimportant. (Faraday 1852a , p. 34 (3099))

In summary, by examining the history of Faraday’s use of lines of force, we showed how visual imagery and reasoning played an important part in Faraday’s construction and representation of his “field theory”. As Gooding has stated, “many of Faraday’s sketches are far more than depictions of observation, they are tools for reasoning with and about phenomena” (2006, p. 59).

Case study 3: visualizing scientific methods, the case of a journal

The focus of the third case study is the Journal of Visualized Experiments (JoVE), a peer-reviewed publication indexed in PubMed. The journal is devoted to the publication of biological, medical, chemical, and physical research in a video format. The journal describes its history as follows:

JoVE was established as a new tool in life science publication and communication, with participation of scientists from leading research institutions. JoVE takes advantage of video technology to capture and transmit the multiple facets and intricacies of life science research. Visualization greatly facilitates the understanding and efficient reproduction of both basic and complex experimental techniques, thereby addressing two of the biggest challenges faced by today's life science research community: i) low transparency and poor reproducibility of biological experiments and ii) time and labor-intensive nature of learning new experimental techniques. ( http://www.jove.com/ )

By examining the journal content, we generate a set of categories that can be considered indicators of significance in terms of the epistemic practices of science that have relevance for science education. For example, the quote above illustrates how scientists view some norms of scientific practice, including the norms of “transparency” and “reproducibility” of experimental methods and results, and how the visual format of the journal facilitates the implementation of these norms. “Reproducibility” can be considered an epistemic criterion that sits at the heart of what counts as an experimental procedure in science:

Investigating what should be reproducible and by whom leads to different types of experimental reproducibility, which can be observed to play different roles in experimental practice. A successful application of the strategy of reproducing an experiment is an achievement that may depend on certain idiosyncratic aspects of a local situation. Yet a purely local experiment that cannot be carried out by other experimenters and in other experimental contexts will, in the end, be unproductive in science. (Sarkar and Pfeifer 2006 , p.270)

We now turn to an article on “Elevated Plus Maze for Mice” that is available for free on the journal website ( http://www.jove.com/video/1088/elevated-plus-maze-for-mice ). The purpose of this experiment was to investigate anxiety levels in mice through behavioral analysis. The journal article consists of a 9-min video accompanied by text. The video illustrates the handling of the mice in a soundproof location with dim light, worksheets with characteristics of the mice, computer software, apparatus, resources, setting up the computer software, and the video recording of mouse behavior on the computer. The authors describe the apparatus that is used in the experiment and state how procedural differences exist between research groups that lead to difficulties in the interpretation of results:

The apparatus consists of open arms and closed arms, crossed in the middle perpendicularly to each other, and a center area. Mice are given access to all of the arms and are allowed to move freely between them. The number of entries into the open arms and the time spent in the open arms are used as indices of open space-induced anxiety in mice. Unfortunately, the procedural differences that exist between laboratories make it difficult to duplicate and compare results among laboratories.

The authors’ emphasis on the particularity of procedural context echoes in the observations of some philosophers of science:

It is not just the knowledge of experimental objects and phenomena but also their actual existence and occurrence that prove to be dependent on specific, productive interventions by the experimenters (Sarkar and Pfeifer 2006, pp. 270-271)

The inclusion of a video of the experimental procedure specifies what the apparatus looks like (Fig.  4 ) and how the behavior of the mice is captured through video recording that feeds into a computer (Fig.  5 ). Computer software then captures different variables such as the distance traveled, the number of entries, and the time spent on each arm of the apparatus. Here, there is visual information at different levels of representation, ranging from reconfiguration of raw video data to representations that analyze the data around the variables in question (Fig.  6 ). This practice of layered visual representation is not particular to the biological sciences. For instance, it is commonplace in nanotechnological practices:

Fig. 4 Visual illustration of the apparatus

Fig. 5 Video processing of the experimental set-up

Fig. 6 Computer software for video input and variable recording

In the visualization processes, instruments are needed that can register the nanoscale and provide raw data, which needs to be transformed into images. Some Imaging Techniques have software incorporated already where this transformation automatically takes place, providing raw images. Raw data must be translated through the use of Graphic Software and software is also used for the further manipulation of images to highlight what is of interest to capture the (inferred) phenomena -- and to capture the reader. There are two levels of choice: Scientists have to choose which imaging technique and embedded software to use for the job at hand, and they will then have to follow the structure of the software. Within such software, there are explicit choices for the scientists, e.g. about colour coding, and ways of sharpening images. (Ruivenkamp and Rip 2010 , pp.14–15)

In the text that accompanies the video, the authors highlight the role of visualization in their experiment:

Visualization of the protocol will promote better understanding of the details of the entire experimental procedure, allowing for standardization of the protocols used in different laboratories and comparisons of the behavioral phenotypes of various strains of mutant mice assessed using this test.

The software that takes the video data and transforms it into various representations allows the researchers to collect data on mouse behavior more reliably. For instance, the distance traveled across the arms of the apparatus or the time spent on each arm would otherwise have been difficult to observe and record precisely. A further aspect to note is how the visualization of the experiment facilitates the control of bias. The authors illustrate how olfactory bias between experimental procedures carried out on mice in sequence is avoided by cleaning the equipment.
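To make concrete the kind of transformation such tracking software performs, the sketch below derives the variables described above (distance traveled, entries into each arm, time per arm) from a sequence of (x, y) mouse positions. This is an illustrative reconstruction, not the actual JoVE software: the arm-classification rule, maze dimensions, and sample track are all hypothetical.

```python
import math

def classify_arm(x, y):
    """Classify a position on a plus maze centered at (0, 0).

    Hypothetical layout: open arms lie along the x-axis, closed arms
    along the y-axis, with a 10x10 center square."""
    if abs(x) <= 5 and abs(y) <= 5:
        return "center"
    return "open" if abs(x) > abs(y) else "closed"

def summarize_track(points, dt=0.5):
    """Compute distance traveled, time spent per region, and arm entries.

    points: list of (x, y) positions sampled every dt seconds."""
    distance = 0.0
    time_in = {"open": 0.0, "closed": 0.0, "center": 0.0}
    entries = {"open": 0, "closed": 0}
    prev_region = None
    for i, (x, y) in enumerate(points):
        if i > 0:
            px, py = points[i - 1]
            distance += math.hypot(x - px, y - py)  # path length between samples
        region = classify_arm(x, y)
        time_in[region] += dt
        # Count an entry each time the mouse moves into an arm from elsewhere.
        if region in entries and region != prev_region:
            entries[region] += 1
        prev_region = region
    return distance, time_in, entries

# Hypothetical track: out along an open arm, back, then into a closed arm.
track = [(0, 0), (10, 0), (20, 0), (10, 0), (0, 0), (0, 10)]
dist, times, entries = summarize_track(track)
```

The point of the sketch is the shift in representational level it enacts: raw positional data become quantified behavioral variables, which is precisely the move from reconfigured video data to analyzed representations discussed above.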

Our discussion highlights the role of visualization in science, particularly with respect to presenting visualization as part of scientific practices. We have used case studies from the history of science, highlighting scientists’ accounts of how visualization played a role in the discovery of the structure of DNA and of the magnetic field, and a contemporary illustration of a science journal’s practice of incorporating visualization as a way to communicate new findings and methodologies. Our aim in drawing on these case studies was to underline the need to align science education with scientific practices, particularly in terms of how visual representations, static or dynamic, can engage students in the processes of science rather than serve only as tools for cognitive development in science. Our approach was guided by the notion of “knowledge-as-practice” advanced by Knorr Cetina ( 1999 ), who studied scientists and characterized their knowledge as practice, a characterization that shifts focus away from ideas inside scientists’ minds to practices that are cultural and deeply contextualized within fields of science. She suggests that people working together can be examined as epistemic cultures whose collective knowledge exists as practice.

It is important to stress, however, that visual representations are not used in isolation but are supported by other types of evidence and by other theories (e.g., in order to understand the helical form of DNA, knowledge of chemistry was needed). More importantly, this finding can also have implications for teaching science as argument (e.g., Erduran and Jimenez-Aleixandre 2008 ), since the verbal evidence used in the science classroom to sustain an argument can be supported by visual evidence (a model, representation, image, graph, etc.). For example, for a group of students discussing the outcomes of an introduced species in an ecosystem, pictures of the species and of the ecosystem over time, together with videos showing the changes in the ecosystem and the special characteristics of the different species, could serve as visual evidence to help the students support their arguments (Evagorou et al. 2012 ). Therefore, an important implication for the teaching of science is the use of visual representations as evidence in the science curriculum as part of knowledge production. Even though studies in science education have focused on the use of models and modeling as a way to support students in learning science (Dori et al. 2003 ; Lehrer and Schauble 2012 ; Mendonça and Justi 2013 ; Papaevripidou et al. 2007 ) or on the use of images (e.g., Korfiatis et al. 2003 ), with the term “using visuals as evidence” we refer to the collection of all forms of visuals and the processes involved.

Another aspect identified through the case studies is that of visual reasoning, an integral part of Faraday’s investigations. Both verbalization and visualization were part of the process of generating new knowledge (Gooding 2006 ). Even today, most textbooks use lines of force (or simply field lines) as a geometrical representation of the field, with the number of field lines connected to the quantity of flux. Often, textbooks use the same kind of visual imagery as scientists do. However, when using images, only certain aspects or features of the phenomena or data are captured or highlighted, often in tacit ways. In textbooks especially, the process of producing the image is not presented; only the product, the image itself, remains. This can easily lead to the idea that images (photos, graphs, visual models) are mere representations of knowledge and, in the worst case, to misinterpreted representations of knowledge, as the results of Pocovi and Finlay ( 2002 ) on electric field lines show. To avoid this, teachers should be able to explain how images are produced (what features of the phenomena or data an image captures, on what grounds those features are chosen, and what features are omitted); in this way, the role of visualization in knowledge production can be made “visible” to students by engaging them in the process of visualization.

The implications of these norms for science teaching and learning are numerous. Classroom contexts can model the generation, sharing, and evaluation of evidence and experimental procedures carried out by students, thereby not only promoting some contemporary cultural norms of scientific practice but also enabling the learning of the criteria, standards, and heuristics that scientists use in making decisions about scientific methods. As the three case studies demonstrate (two from the history of science and one from current scientific practice), visual representations are part of the process of knowledge growth and communication in science. Additionally, visual information, especially with the use of technology, is part of students’ everyday lives. Therefore, we suggest making use of students’ knowledge and technological skills (e.g., producing their own videos showing their experimental method, or identifying or providing appropriate visual evidence for a given topic) in order to teach them aspects of the nature of science that are often neglected both in the history of science and in the design of curricula. Specifically, what we suggest in this paper is that students should actively engage in visualization processes in order to appreciate the diverse nature of doing science and engage in authentic scientific practices.

However, as a word of caution, we need to distinguish the products and processes involved in visualization practices in science:

If one considers scientific representations and the ways in which they can foster or thwart our understanding, it is clear that a mere object approach, which would devote all attention to the representation as a free-standing product of scientific labor, is inadequate. What is needed is a process approach: each visual representation should be linked with its context of production (Pauwels 2006 , p.21).

The aforementioned suggests that the emphasis in visualization should shift from cognitive understanding—using the products of science to understand the content—to engaging in the processes of visualization. Therefore, an implication for the teaching of science includes designing curriculum materials and learning environments that create a social and epistemic context and invite students to engage in the practice of visualization as evidence, reasoning, experimental procedure, or a means of communication (as presented in the three case studies) and reflect on these practices (Ryu et al. 2015 ).

Finally, a question that arises from including visualization, and scientific practices more broadly, in science education is whether teachers themselves are prepared to include them as part of their teaching (Bybee 2014 ). Teacher preparation programs and teacher education have been critiqued, studied, and rethought since the time they emerged (Cochran-Smith 2004 ). Despite this long history, the debate about initial teacher training and its content persists in our community and in policy circles (Cochran-Smith 2004 ; Conway et al. 2009 ). In recent decades, the debate has shifted from a behavioral view of learning and teaching toward a focus on learning, attending not only to teachers’ knowledge, skills, and beliefs but also to how, and whether, pupils learn (Cochran-Smith 2004 ). The Science Education in Europe report recommended that “Good quality teachers, with up-to-date knowledge and skills, are the foundation of any system of formal science education” (Osborne and Dillon 2008 , p.9).

However, questions such as what the emphasis of pre-service and in-service science teacher training should be, especially with the new emphasis on scientific practices, remain unanswered. As Bybee ( 2014 ) argues, starting from the new emphasis on scientific practices in the NGSS, we should consider teacher preparation programs “that would provide undergraduates opportunities to learn the science content and practices in contexts that would be aligned with their future work as teachers” (p.218). Therefore, engaging pre- and in-service teachers in visualization as a scientific practice should be one of the purposes of teacher preparation programs.

Achieve. (2013). The next generation science standards (pp. 1–3). Retrieved from http://www.nextgenscience.org/ .

Barber, J, Pearson, D, & Cervetti, G. (2006). Seeds of science/roots of reading . California: The Regents of the University of California.

Bungum, B. (2008). Images of physics: an explorative study of the changing character of visual images in Norwegian physics textbooks. NorDiNa, 4 (2), 132–141.

Bybee, RW. (2014). NGSS and the next generation of science teachers. Journal of Science Teacher Education, 25 (2), 211–221. doi: 10.1007/s10972-014-9381-4 .

Chambers, D. (1983). Stereotypic images of the scientist: the draw-a-scientist test. Science Education, 67 (2), 255–265.

Cochran-Smith, M. (2004). The problem of teacher education. Journal of Teacher Education, 55 (4), 295–299. doi: 10.1177/0022487104268057 .

Conway, PF, Murphy, R, & Rath, A. (2009). Learning to teach and its implications for the continuum of teacher education: a nine-country cross-national study .

Crick, F. (1988). What mad pursuit: a personal view of scientific discovery . New York: Basic Books.

Dimopoulos, K, Koulaidis, V, & Sklaveniti, S. (2003). Towards an analysis of visual images in school science textbooks and press articles about science and technology. Research in Science Education, 33 , 189–216.

Dori, YJ, Tal, RT, & Tsaushu, M. (2003). Teaching biotechnology through case studies—can we improve higher order thinking skills of nonscience majors? Science Education, 87 (6), 767–793. doi: 10.1002/sce.10081 .

Duschl, RA, & Bybee, RW. (2014). Planning and carrying out investigations: an entry to learning and to teacher professional development around NGSS science and engineering practices. International Journal of STEM Education, 1 (1), 12. doi: 10.1186/s40594-014-0012-6 .

Duschl, R., Schweingruber, H. A., & Shouse, A. (2008). Taking science to school . Washington DC: National Academies Press.

Erduran, S, & Jimenez-Aleixandre, MP (Eds.). (2008). Argumentation in science education: perspectives from classroom-based research . Dordrecht: Springer.

Eurydice. (2012). Developing key competencies at school in Europe: challenges and opportunities for policy – 2011/12 (pp. 1–72).

Evagorou, M, Jimenez-Aleixandre, MP, & Osborne, J. (2012). “Should we kill the grey squirrels?” A study exploring students’ justifications and decision-making. International Journal of Science Education, 34 (3), 401–428. doi: 10.1080/09500693.2011.619211 .

Faraday, M. (1852a). Experimental researches in electricity. – Twenty-eighth series. Philosophical Transactions of the Royal Society of London, 142 , 25–56.

Faraday, M. (1852b). Experimental researches in electricity. – Twenty-ninth series. Philosophical Transactions of the Royal Society of London, 142 , 137–159.

Gilbert, JK. (2010). The role of visual representations in the learning and teaching of science: an introduction (pp. 1–19).

Gilbert, J., Reiner, M. & Nakhleh, M. (2008). Visualization: theory and practice in science education . Dordrecht, The Netherlands: Springer.

Gooding, D. (2006). From phenomenology to field theory: Faraday’s visual reasoning. Perspectives on Science, 14 (1), 40–65.

Gooding, D, Pinch, T, & Schaffer, S (Eds.). (1993). The uses of experiment: studies in the natural sciences . Cambridge: Cambridge University Press.

Hogan, K, & Maglienti, M. (2001). Comparing the epistemological underpinnings of students’ and scientists’ reasoning about conclusions. Journal of Research in Science Teaching, 38 (6), 663–687.

Knorr Cetina, K. (1999). Epistemic cultures: how the sciences make knowledge . Cambridge: Harvard University Press.

Korfiatis, KJ, Stamou, AG, & Paraskevopoulos, S. (2003). Images of nature in Greek primary school textbooks. Science Education, 88 (1), 72–89. doi: 10.1002/sce.10133 .

Latour, B. (2011). Visualisation and cognition: drawing things together (pp. 1–32).

Latour, B, & Woolgar, S. (1979). Laboratory life: the construction of scientific facts . Princeton: Princeton University Press.

Lehrer, R, & Schauble, L. (2012). Seeding evolutionary thinking by engaging children in modeling its foundations. Science Education, 96 (4), 701–724. doi: 10.1002/sce.20475 .

Longino, H. E. (2002). The fate of knowledge . Princeton: Princeton University Press.

Lynch, M. (2006). The production of scientific images: vision and re-vision in the history, philosophy, and sociology of science. In L Pauwels (Ed.), Visual cultures of science: rethinking representational practices in knowledge building and science communication (pp. 26–40). Lebanon, NH: Darthmouth College Press.

Lynch, M, & Edgerton, SY. (1988). Aesthetics and digital image processing: representational craft in contemporary astronomy. In G Fyfe & J Law (Eds.), Picturing power: visual depictions and social relations (pp. 184–220). London: Routledge.

Mendonça, PCC, & Justi, R. (2013). An instrument for analyzing arguments produced in modeling-based chemistry lessons. Journal of Research in Science Teaching, 51 (2), 192–218. doi: 10.1002/tea.21133 .

National Research Council (2000). Inquiry and the national science education standards . Washington DC: National Academies Press.

National Research Council (2012). A framework for K-12 science education . Washington DC: National Academies Press.

Nersessian, NJ. (1984). Faraday to Einstein: constructing meaning in scientific theories . Dordrecht: Martinus Nijhoff Publishers.

Nersessian, NJ. (1992). How do scientists think? Capturing the dynamics of conceptual change in science. In RN Giere (Ed.), Cognitive Models of Science (pp. 3–45). Minneapolis: University of Minnesota Press.

Nersessian, NJ. (2008). Creating scientific concepts . Cambridge: The MIT Press.

Osborne, J. (2014). Teaching scientific practices: meeting the challenge of change. Journal of Science Teacher Education, 25 (2), 177–196. doi: 10.1007/s10972-014-9384-1 .

Osborne, J. & Dillon, J. (2008). Science education in Europe: critical reflections . London: Nuffield Foundation.

Papaevripidou, M, Constantinou, CP, & Zacharia, ZC. (2007). Modeling complex marine ecosystems: an investigation of two teaching approaches with fifth graders. Journal of Computer Assisted Learning, 23 (2), 145–157. doi: 10.1111/j.1365-2729.2006.00217.x .

Pauwels, L. (2006). A theoretical framework for assessing visual representational practices in knowledge building and science communications. In L Pauwels (Ed.), Visual cultures of science: rethinking representational practices in knowledge building and science communication (pp. 1–25). Lebanon, NH: Darthmouth College Press.

Philips, L., Norris, S. & McNab, J. (2010). Visualization in mathematics, reading and science education . Dordrecht, The Netherlands: Springer.

Pocovi, MC, & Finlay, F. (2002). Lines of force: Faraday’s and students’ views. Science & Education, 11 , 459–474.

Richards, A. (2003). Argument and authority in the visual representations of science. Technical Communication Quarterly, 12 (2), 183–206. doi: 10.1207/s15427625tcq1202_3 .

Rothbart, D. (1997). Explaining the growth of scientific knowledge: metaphors, models and meaning . Lewiston, NY: Mellen Press.

Ruivenkamp, M, & Rip, A. (2010). Visualizing the invisible nanoscale study: visualization practices in nanotechnology community of practice. Science Studies, 23 (1), 3–36.

Ryu, S, Han, Y, & Paik, S-H. (2015). Understanding co-development of conceptual and epistemic understanding through modeling practices with mobile internet. Journal of Science Education and Technology, 24 (2-3), 330–355. doi: 10.1007/s10956-014-9545-1 .

Sarkar, S, & Pfeifer, J (Eds.). (2006). The philosophy of science: an encyclopedia (Vol. 1, A–M). New York: Taylor & Francis.

Schwartz, RS, Lederman, NG, & Abd-el-Khalick, F. (2012). A series of misrepresentations: a response to Allchin’s whole approach to assessing nature of science understandings. Science Education, 96 (4), 685–692. doi: 10.1002/sce.21013 .

Schwarz, CV, Reiser, BJ, Davis, EA, Kenyon, L, Achér, A, Fortus, D, et al. (2009). Developing a learning progression for scientific modeling: making scientific modeling accessible and meaningful for learners. Journal of Research in Science Teaching, 46 (6), 632–654. doi: 10.1002/tea.20311 .

Watson, J. (1968). The Double Helix: a personal account of the discovery of the structure of DNA . New York: Scribner.

Watson, J, & Berry, A. (2004). DNA: the secret of life . New York: Alfred A. Knopf.

Wickman, PO. (2004). The practical epistemologies of the classroom: a study of laboratory work. Science Education, 88 , 325–344.

Wu, HK, & Shah, P. (2004). Exploring visuospatial thinking in chemistry learning. Science Education, 88 (3), 465–492. doi: 10.1002/sce.10126 .

Acknowledgements

The authors would like to acknowledge all reviewers for their valuable comments that have helped us improve the manuscript.

Author information

Authors and affiliations.

University of Nicosia, 46, Makedonitissa Avenue, Egkomi, 1700, Nicosia, Cyprus

Maria Evagorou

University of Limerick, Limerick, Ireland

Sibel Erduran

University of Tampere, Tampere, Finland

Terhi Mäntylä

Corresponding author

Correspondence to Maria Evagorou .

Additional information

Competing interests.

The authors declare that they have no competing interests.

Authors’ contributions

ME carried out the introductory literature review, the analysis of the first case study, and drafted the manuscript. SE carried out the analysis of the third case study and contributed towards the “Conclusions” section of the manuscript. TM carried out the second case study. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( https://creativecommons.org/licenses/by/4.0 ), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Evagorou, M., Erduran, S. & Mäntylä, T. The role of visual representations in scientific practices: from conceptual understanding and knowledge generation to ‘seeing’ how science works. IJ STEM Ed 2 , 11 (2015). https://doi.org/10.1186/s40594-015-0024-x

Received : 29 September 2014

Accepted : 16 May 2015

Published : 19 July 2015

DOI : https://doi.org/10.1186/s40594-015-0024-x

  • Visual representations
  • Epistemic practices
  • Science learning


Representational Theories of Consciousness

The idea of representation has been central in discussions of intentionality for many years. But only more recently has it begun playing a wider role in the philosophy of mind, particularly in theories of consciousness. Indeed, there are now multiple representational theories of consciousness, corresponding to different uses of the term “conscious,” each attempting to explain the corresponding phenomenon in terms of representation. More cautiously, each theory attempts to explain its target phenomenon in terms of intentionality , and assumes that intentionality is representation.

An intentional state represents an object, real or unreal (say, Mage or Pegasus), and typically represents a whole state of affairs, one which may or may not actually obtain (say, that Mage wins the Kentucky Derby again in 2024). Like public, social cases of representation such as writing or mapmaking, intentional states such as beliefs have truth-value; they entail or imply other beliefs; they are (it seems) composed of concepts and depend for their truth on a match between their internal structures and the way the world is; and so it is natural to regard their aboutness as a matter of mental referring or designation. Sellars (1956, 1967) and Fodor (1975) argue that intentional states are states of a subject that have semantical properties, and the existent-or-nonexistent states of affairs that are their objects are just representational contents.

So much is familiar and not very controversial. But problems of consciousness are generally felt to be less tractable than matters of intentionality. The aim of a representationalist theory of consciousness is to extend the treatment of intentionality to that of consciousness, showing that if intentionality is well understood in representational terms, then so can be the phenomena of consciousness in whichever sense of that fraught term.

The notions of consciousness most commonly addressed by philosophers are the following: (1) “Conscious states” in the very specific sense of: states whose subjects are directly, internally aware of them. (2) Introspection and one’s privileged access to the internal character of one’s experience itself. (3) Being in a sensory state that has a distinctive qualitative property, such as the color one experiences in having a visual experience, or the timbre of a heard sound. (4) The phenomenal matter of “what it’s like” for the subject to be in a particular mental state, especially what it is like for that subject to experience a particular qualitative property as in (3). Block (1995) and others have used “phenomenal consciousness” for sense (4), without distinguishing it from sense (3). (A further terminological complication is that some theorists, such as Dretske (1995) and Tye (1995), have used the expression “what it’s like” to mean the qualitative property itself, rather than the present higher-order property of that property. This entry will mainly use “qualitative” to allude to sensory qualities of the sort that figure in (3), and “phenomenal” as applying to “what it’s like.”)

For each of the four foregoing notions of consciousness, some philosophers have claimed that that type of consciousness is entirely or largely explicable as a kind of representing. This entry will deal mainly with representational theories of consciousness in senses (3) and (4). The leading representational approaches to (1) and (2) are “higher-order representation” theories, which divide into “inner sense” or “higher-order perception” views, “acquaintance” accounts, and “higher-order thought” theories. For discussion of those, see the entry on higher-order theories of consciousness .

1. Qualitative Character as Representation


Qualitative properties and phenomenal features of mental states are each often called “qualia” (singular, “quale”). In recent philosophy of mind that term has been used in a number of confusingly different ways. There is a specific, fairly strict sense that comes to us from C.I. Lewis (1929) by way of Goodman (1951) (though there is plenty of room for exegetical disagreement about Lewis’ own usage). A quale in this sense is a qualitative property inhering in a sensory state: the color of an after-image, or that of a more ordinary patch in one’s visual field; the pitch or volume or timbre of a subjectively heard sound; the smell of an odor; a particular taste; the perceived texture of an object encountered by touch. (The term “inhering in” in the preceding sentence is deliberately vague, and neutral on as many metaphysical issues as possible. In particular, qualia may be properties of the experiences in which they inhere, or they may be related to those experiences in some other way.) For reasons that will become clear, we may call this sense of “qualia” the “first-order” sense. Notice that it differs from the broader and vaguer sense defined in the entry on qualia (“the introspectively accessible, phenomenal aspects of our mental lives”), and from the much more heavily laden sense of Block (1990, 1995, 1996), according to which “qualia” are by stipulative definition neither functional nor intentional properties. To avoid further confusion, let us speak of sensory qualities .

A sensory quality can be thought of as the distinctive property of an apparent phenomenal individual. An “apparent phenomenal individual” is anything of the sort that Bertrand Russell would have taken to be a “sense-datum,” such as (again) a colored region of one’s visual field, or a heard sound or an experienced smell. But it is important to see that qualities of this kind do not presuppose the existence of sense-data or other exotica. Sensory fields are pervaded by such qualities both in everyday veridical experience and in less usual cases. In our first-order sense of the “q”-word, the latter point is the merest common sense, and to deny it would be to take a very radical position. Of course philosophers will immediately debate the nature of these commonsensical qualities and further claims about them, but it is generally agreed that they are introspectible, apparently monadic or nonrelational, and describable in ordinary English words such as “green,” “loud,” and “sweet” (though it may be questioned whether those words have just the same senses as when they are applied to physical objects and events).

Sensory qualities pose a serious problem for materialist theories of the mind. For where, ontologically speaking, are they located? Suppose Bertie is experiencing a green after-image as a result of seeing a red flash bulb go off; the greenness of the after-image is the quality. Actual Russellian sense-data are immaterial individuals; so the materialist cannot admit that the greenness of the after-image is a property of an actual sense-datum. Nor is it plausible to suggest that the greenness is exemplified by anything physical in the brain (if there is some green physical thing in your brain, you are probably in big trouble). To sharpen the problem:

  1. Bertie is experiencing a green thing.
  2. Suppose that there is no physical green thing outside Bertie’s head. But
  3. There is no physical green thing inside Bertie’s head either.
  4. If it is physical, the green thing is either outside Bertie’s head or inside it. Thus,
  5. The green thing is not physical. [1,2,3,4] Thus,
  6. Bertie’s experience contains a nonphysical thing. [1,5] Thus,
  7. Bertie’s experience is not, or not entirely, physical. [6]

This is a valid deductive argument against materialism, and its premises are hard to deny.
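The validity claim can be checked mechanically. Here is a minimal sketch in Lean 4 of the argument's propositional skeleton, showing that premises 2–4 entail the intermediate conclusion 5; the proposition names are placeholders for the English claims above, and the formalization deliberately ignores the intensionality of "experiencing" that the representationalist will exploit below.

```lean
-- Placeholders for the claims in the numbered argument:
-- Physical:     the green thing Bertie experiences is physical
-- GreenOutside: there is a physical green thing outside Bertie's head
-- GreenInside:  there is a physical green thing inside Bertie's head
variable (Physical GreenOutside GreenInside : Prop)

-- Premises 2, 3, and 4 entail 5 (the green thing is not physical):
-- from Physical we would get GreenOutside ∨ GreenInside, and both
-- disjuncts are ruled out.
example (p2 : ¬GreenOutside) (p3 : ¬GreenInside)
    (p4 : Physical → GreenOutside ∨ GreenInside) : ¬Physical :=
  fun h => (p4 h).elim p2 p3
```

Since the logic is unimpeachable, a materialist response must reject a premise, and the representationalist's move below is in effect a reinterpretation of premise 1's existential commitment.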

The modern representational theory of sensory qualities originates with Hall (1961), Anscombe (1965) and Hintikka (1969); early adherents include Kraut (1982), Lewis (1983), Lycan (1987, 1996), Harman (1990), Shoemaker (1994), Tye (1994, 1995, 2003a), Dretske (1995), Clark (2000), Byrne (2001), Crane (2001, 2003), and Thau (2002). The representational theory is usually (though not always) an attempt to resolve the foregoing dilemma compatibly with materialism. According to the theory, sensory qualities are actually intentional contents, represented properties of represented objects. Suppose Ludwig is seeing a real tomato in good light, and naturally it looks red to him; there is a corresponding red patch in his visual field. He is visually representing the actual redness of the tomato, and the redness of the “patch” is just the redness of the tomato itself. But suppose George Edward is hallucinating a similar tomato, and there is a tomato-shaped red patch in his visual field just as there is in Ludwig’s. George Edward too is representing the redness of an external, physical tomato. It is just that in his case the tomato is not real; it and its redness are nonactual intentional contents. But the redness is still the redness of the illusory tomato. (Note that the representation going on here is good old first-order representation of environmental features, not higher-order as in the “higher-order representation” theories of awareness.)

What about Bertie’s green after-image? On the representationalist (sometimes “intentionalist”) analysis, for Bertie to experience the green after-image is for Bertie to be visually representing a green blob located at such-and-such a spot in the room. Since in reality there is no green blob in the room with Bertie, his visual experience is unveridical; after-images are illusions. The sensory quality, the greenness of the blob, is (like the blob itself) a nonactual intentional content. Of course, in cases of veridical perception, the color and the colored object are not merely intentional contents, because they actually exist, but they are still intentional objects, representata.

And that is how the representationalist resolves our dilemma. As P1 has it, there is a green thing that Bertie is experiencing, but it is not an actual thing. That “there is” is the same lenient non-actualist “there is” that occurs in “There is something that Bertie believes in but that doesn’t exist” and in “There is at least one mythical god that the Greeks worshipped but that no one worships anymore.” (In defending his sense-data, Russell mistook a nonactual material thing for an actual immaterial thing.) Thus, P5, understood as delivering an actual green entity, does not follow.

A slightly surprising but harmless consequence of the representational view as formulated here is that sensory qualities (“qualia” in our strict first-order sense) are not themselves properties of the experiences that present them: Sensory qualities are represented properties of represented objects, and so they are only intentionally present in experiences. The relevant properties of the experiences are, representing this quality or that. Of course, one could shift the meaning slightly and speak of “qualia” as properties of experiences, identifying them with representational features, such as the feature of representing this strict-sense sensory quality or that; nothing much hangs on this terminological choice. (As before, “what it’s like” properties are something else again.)

Most representationalists agree that the perceptual representation of color and other sensible properties is “nonconceptual” in some sense—at least in that the qualitative representations need not be easily translatable into the subject’s natural language. Of course, some psychosemantics would be needed to explain what it is in virtue of which a brain item represents greenness in particular. Dretske (1995) offers one, as does Tye (1995); both accounts are teleologized versions of “indicator” semantics.

2. Varieties of Representationalism

The mere representation of redness does not suffice for something’s looking red to a subject, much less for a sensory quality of red and still less for a phenomenal “something it is like” to experience that quality. One could say the word “red” aloud, or semaphore it from a cliff, or send it in Morse code, or write the French word “rouge” on a blackboard, or point to a color chip. The representation must be specifically a visual representation, produced by either a normal human visual system or by something functionally like one. Similar points would be made for nonvisual qualities, such as subjective bitterness, which would require alluding to the gustatory system.

Thus, the representational theory of sensory qualities cannot be purely representational, but must appeal to some further factor, to distinguish visual representations from other sorts of representations of redness. Dretske (1995) cites only the fact that visual representation is sensory and what he calls “systemic.” Tye (1995) requires that the representation be nonconceptual and “poised,” though he also argues that visual representations of color would differ from other sorts of representations in being accompanied by further representational differences. Lycan (1996) appeals to functional role.

Thus we may distinguish different grades of representationalism about sensory qualities.

Pure representationalism would be the view that representation alone suffices for a sensory quality. But no one holds that view, for the reason just given: representation alone is cheap and ubiquitous. (Lloyd (1991) and Thau (2002) perhaps come close; Thau suggests that representing a certain special sort of content does suffice for a sensory quality.)

Strong representationalism (defended by Dretske, Tye and Lycan) is the view that representation of a certain kind suffices for a sensory quality, where the kind can be specified in functionalist or other familiar materialist terms, without recourse to properties of any ontologically “new” sort. (A mixed representational-functional view is what Block (1996) calls “quasi-representationism.”) We might further contrast (a) theories that appeal to functional considerations only to separate sensory qualities from other represented properties with (b) theories that use functional considerations more ambitiously, to distinguish qualitatively different experiences that have the same intentional content.

Weak representationalism says only that qualitative states necessarily have representational content, which admission is compatible with sensory qualities also necessarily involving features that are ontologically “new” (Block (1990, 1996), Chalmers (1996)). Weak representationalism has been fairly uncontroversial (though it would have been denied by Russell, who showed no sign of thinking that his sense-data represented anything, and by behaviorists and Wittgensteinians who are hostile to the whole idea of mental representation). At the very least, one who rejects it must try to explain why we distinguish between veridical and unveridical experiences; but more recently new opponents such as Campbell (2002), Travis (2004), Noë (2005), Brewer (2006), and Fish (2009) have attempted just that.

Throughout the rest of this entry, unless otherwise noted, “representationalism” shall mean the strong representationalist view. The mixed sort of account that Block calls “quasi-representationism” is a version of strong representationalism, since it does rule out qualitative features that are both nonintentional and nonfunctional.

Also, we shall consider strong representationalism as applying to all sensory states, including bodily sensations as well as visual and other perceptions. Weak representationalism is somewhat controversial for pains, itches and other sensations, since it is not obvious that such sensations represent anything at all. Accordingly, strong representationalism will be all the less defensible for them.

There are further issues that divide strong representationalists, generating different versions of the view. One is the question of whether sensory qualities themselves, in our very specific sense, exhaust all of what has usually been thought of as a sensory state’s overall phenomenal character. Dretske and Tye maintain that they do; Lycan (1998) and others argue that they do not. This matter will be discussed below, in the context of what is called the “transparency” thesis.

A more important division is that of “narrow” vs. “wide.” In the literature on propositional attitudes beginning with Putnam (1975), the representational content of an attitude is generally thought to be “wide” in that it does not supervene on the contents of the subject’s head; on this view, two molecularly indistinguishable people could have different belief or desire contents, determined in part by objects in their respective environments. Since according to the representational theory, sensory qualities themselves are real or unreal environmental properties, the theory suggests that the qualities too are wide, and molecularly identical subjects could experience different qualities. Dretske (1996) and Lycan (1996, 2001) have explicitly defended this “phenomenal externalism,” though in our present terminology it should be called “qualitative” externalism. Some other representationalists reject this idea and believe sensory qualities to be narrow, necessarily shared by molecular duplicates. Shoemaker (1994), Horgan (2000), Kriegel (2002b), Levine (2003) and Chalmers (2004) defend narrow representationalism. (Rey (1998) calls his view “a narrow representationalist account of qualitative experience,” but it is not an account of sensory qualities in the present sense; if anything Rey favors the elimination of those qualities.) For some arguments on each side, see Section 4.6.

Within wide representationalism or within narrow, there may be disagreement about what kinds of properties are represented. In the previous section, it was assumed that the putative representata are environmental features such as the colors of physical objects. But others have been suggested (Byrne (2001), Levine (2003)): e.g., perceptual experience might instead represent sense-data, or nonexistent colorish properties that physical objects do not really have. Shoemaker (1994) defends the view that a color experience represents a dispositional property, viz., the disposition to cause an experience of just that type. (Kriegel (2002b) and Levine (2003) defend versions of Shoemaker’s view in order to keep sensory qualities narrow and to handle various inversion cases.) On one interpretation at least, Thau (2002) posits a special sort of quasi-color property, distinct from but related to actual colors.

Notice that even on the straightforward view that the representata are the ostensible colors of physical objects, the representational theory does not presuppose color realism. It is true that we have been using color words such as “green” to mean public properties of physical objects, and one could not (without circularity) explicate phenomenal greenness in terms of represented real-world public color and then turn around and construe the latter real physical greenness as a mere disposition to produce sensations of phenomenal greenness, or in any other way that presupposed phenomenal greenness. But one may hold an error theory of physical color, taking the colors of physical objects to be ultimately illusory, and yet maintain that physical color concepts are explanatorily and/or conceptually prior to phenomenal ones.

(There can be more general issues, in other sense modalities, of identifying the relevant worldly representata. E.g., Gray (2003) argues that sensations of heat represent neither heat, nor temperature, nor conductivity, nor energy.)

Chalmers (2004) calls attention to the distinction between Russellian contents and Fregean contents. The former can be a singular proposition or a configuration of objects and their properties. Though the proposition may be believed (etc.) under a mode of presentation, the mode of presentation is not part of the content itself. By contrast, a Fregean content includes the mode of presentation, and does not include individual objects themselves. Representationalists have most often thought in Russellian terms about perceptual contents, but Chalmers argues that the content of a perceptual experience is Fregean. Because it neglects the objects themselves, the Fregean option would lend itself to a narrow representationalist account, if such is wanted; also, it helps to accommodate inversion examples (Section 4.4).

As Crane (2003) and Chalmers (2004) have pointed out, representationalism need not be reductive. One might agree with the strong representationalist that sensory qualities are identical with intentional contents, but also contend that the latter intentional content properties cannot be characterized without reference to sensory qualities, so despite the identity there cannot be reduction without circularity. Maintaining in this vein that “qualia” require a special phenomenal manner of representation and holding that that manner cannot be reduced to the functional, Chalmers defends a nonreductive representationalism. Representationalists who sympathize with the view (of, e.g., Searle (1990) and Siewert (1998)) that intentionality requires consciousness would also be motivated to remain nonreductive. Levine (2003) argues that Shoemaker’s (1994) view is nonreductive, on the grounds that it explicates the qualitative character of an experience in terms of representing a property that is in turn characterized in terms of experiences having that qualitative character (Levine does not consider the apparent circularity vicious). But many other representationalists are motivated by materialism and by the desire to reduce sensory qualities to intentionality, holding that intentionality is the more materialistically tractable of the two.

3. Arguments in Favor of the Representational Theory of Sensory Qualities

There are at least four direct arguments in favor of the representational theory.

Many representationalists hold that the theory not only preserves materialism while accommodating sensory qualities, but is the only very promising way of doing so. For the only viable alternative resolution of our Bertie dilemma seems to be belief in actual Russellian sense-data or at least in immaterial properties. The anti-materialist may not mind sense-data ontologically, but s/he will also inherit the nasty epistemological problems that Russell never succeeded in overcoming: If sensory experience presents us with sense-data and nothing but, the sense-data wall us off from whatever may be the rest of reality, and we are left with a justificatory gap between our beliefs about sense-data and our beliefs about the external world.

More likely, an opponent will hold the line at property dualism, as do Jackson (1982) and Chalmers (1996). That is quite bad enough for the materialist, but of course one who holds no brief for materialism in the first place will not be convinced by the present argument.

There are still nonrepresentationalist alternatives. For example, a materialist might suggest a type-identity of Bertie’s phenomenal greenness with something neurophysiological, but it is not plausible to think that a smoothly and monadically green patch in one’s visual field just is a neural state or event in one’s brain. At best, the type-identity theorist would have to do away with the important claim that greenness itself, rather than some surrogate property, figures in Bertie’s experience; the suggestion would be an error theory, and would have to explain away the intuition that, whatever the ultimate ontology, Bertie really is experiencing an instance of greenness.

Two further alternate materialist treatments of sensory qualities are an “adverbial” theory of the sort recommended by Chisholm (1957) and Sellars (1967), and outright eliminativism.

According to the adverbial theory, Bertie’s experience involves no thing, either actual or nonactual, that is green. Rather, Bertie senses greenly, greenly-sensing being just a type of visual sensing. Our main question, “Where, ontologically, is the green thing?,” thus has a false presupposition and there is no problem. Adverbialism dominated materialists’ thinking for so long that the latter question was hardly ever raised. But (as was not often noticed), adverbialism is a semantical thesis about the logical forms of sensation statements, and as such it has been severely and tellingly criticized, e.g., by Jackson (1977), Butchvarov (1980) and Lycan (1987).

Eliminativism about sensory qualities is suggested if not championed by Dennett (1991) and by Rey (1983, 1998). But if Bertie or anyone else says, “I am visually experiencing greenness,” it is hard either to call that person a liar or to explain how s/he could be subject to so massive a delusion. (Levine (2001) discusses eliminativism at more length.)

Dretske (1996) maintains that there is nothing intrinsic to the brain that constitutes the difference between a red quality and a green one. Unless there are Russellian sense-data or at least immaterial properties, what distinguishes the two qualities must be relational, and the only obvious candidate is, representing red or green. But as before, if one has no objection to sense-data or immaterial properties, one will be unmoved. The neurophysiological type-identity theorist would protest here too, though the same rejoinders apply. A less commissive objection is that, contra Dretske, there are candidate relations besides that of representing: some wide functional relation, perhaps, or a typical-cause relation (where neither of these is itself taken to constitute representing).

We distinguish between veridical and nonveridical visual experiences. How so? It is fairly uncontentious that Bertie’s experience is as of a green blob and has greenness as an intentional object, and that what the experience reports is false. That is hard to dispute. If one instead accepts Russellian sense-data, and thinks of the after-image itself as an actually and independently existing individual—indeed one of the world’s basic building blocks—one then need not also think of it as representational. But one will then have to give an oblique account of the notion of veridicality. If one joins Campbell (2002) et al. in rejecting perceptual representation entirely, one will still have to reconstruct veridicality in some ad hoc way.

The representationalist further argues that the experience’s veridicality condition, i.e., there being a green blob where there seems to Bertie to be one, seems to exhaust not only its representational content but its qualitative content. Once the greenness has already been accounted for, what qualitative content is left?

Since weak representationalism does not entail strong, opponents may offer serious nonrhetorical answers to the argument’s concluding rhetorical question. For example, Block (1996) maintains that Bertie could introspect a certain qualitative property in addition to the greenness of the after-image. And we shall definitely encounter a further kind of content in Section 4.5 below, that may or may not be the same sort of property Block has in mind.

Harman (1990) offers the transparency argument: We normally “see right through” perceptual states to external objects and do not even notice that we are in perceptual states; the properties we are aware of in perception are attributed to the objects perceived. “Look at a tree and try to turn your attention to intrinsic features of your visual experience. I predict you will find that the only features there to turn your attention to will be features of the presented tree, including relational features of the tree ‘from here’” (p. 39). Tye (1995) and Crane (2003) extend this argument to bodily sensations such as pain.

The transparency argument can be extended also to the purely hallucinatory case. Suppose you are looking at a real, richly red tomato in good light. Suppose also that you then hallucinate a second, identical tomato to the right of the real one. (You may be aware that the second tomato is not real.) Phenomenally, the relevant two sectors of your visual field are just the same; the appearances are just the same in structure. The redness involved in the second-tomato appearance is exactly the same property as is involved in the first. But if we agree that the redness perceived in the real tomato is just the redness of the tomato itself, then the redness perceived in the hallucinated tomato—the red quality involved in the second-tomato appearance—is just the redness of the hallucinated tomato itself.

The appeal to transparency makes it immensely plausible that visual experience represents external objects and their apparent properties. But as noted above, that weak representationalist thesis is not terribly controversial. What the transparency argument as it stands does not show, but only claims, is that experience has no other properties that pose problems for materialism. The argument needs to be filled out, and typically is filled out by a further appeal to introspection. The obvious additional premises are: (i) If a perceptual state has relevant mental properties in addition to its representational properties, they are introspectible. But (ii) not even the most determined introspection ever reveals any such additional properties.

(ii) is the transparency thesis proper. (Kind (2003) calls it “strong transparency” and usefully distinguishes it from weaker claims, such as that we are very hard put to introspect additional properties or that we only rarely or abnormally do.) Transparency is vigorously defended by Tye (1995, 2002) and by Crane (2003). Dretske (2003) endorses a radical version of it: that we cannot introspect anything about a perceptual experience, if “introspect” has its usual meaning of internally attending to the experience.

Objections to the transparency thesis typically take the form of counterexamples, mental features of our experiences that can be introspected but allegedly are nonrepresentational. Harman (1990) and Block (1996) speak of “mental paint,” alluding to introspectible intrinsic features of a perceptual representation in virtue of which it represents what it does. Harman precisely denies the existence of mental paint, but Block holds that, in particular, he can introspect the nonintentional, nonfunctional items that he calls “qualia” in a sense quite different from that of sensory qualities. Loar (2003) grants that vision is normally transparent, but argues that we can therapeutically adopt what he calls the “perspective of oblique reflection,” perform a certain imaginative exercise, and thereby come to detect “qualia” in something like Block’s heavily laden sense of that term.

Block further mentions bodily sensations and moods whose representational contents are minimal but which are vividly introspectible. Lycan (1998) argues on similar grounds that the qualities inhering in a sensory experience are only part of that experience’s “overall feel” or phenomenal character in sense (4). Kriegel (2017) adds that, in particular, sensory experiences have affective components that are part of the mode or manner of representing rather than of the representatum.

Turning back to perception, Block notes that if one’s vision is blurry, one can introspect the blurriness as well as the visual representata. (More on blurriness in Section 4.5.2 below.) The point may also be made that the vividness of a perception, say of color, can be introspected over and above the sensed content, but Bourget (2017b) offers to explain vividness in terms of representata.

A more straightforward objection to transparency is that in perceptual experience, we can introspect the relevant sense modality in addition to the content, i.e., whether the representatum is sensed visually, aurally, olfactorily, or whatever. It may be claimed, as by Lycan (1996), that those differences are functional only. However, Tye (2002) has maintained that they can be captured in terms of representational contents. Moreover, Bourget (2017a) argues in some detail that only by individuating the sense modalities in terms of contents can we give the best account of “multimodal” or multi-sensory experiences, ones which unify distinct sub-experiences occurring in different modalities, such as those of drinking a cup of coffee or encountering a barking dog. (Related arguments had been made by Tye (2007) and Speaks (2015).)

Finally, it would seem that for any sensory quality, one can introspect the higher-order property of what it is like to experience that quality (cf. notion (4) listed in the fourth paragraph of this entry). Indeed, doing that seems to be one of introspection’s standard tasks.

These apparent counterexamples take a lot of overcoming. Tye (2003a) addresses some of them, arguing in each case that what appears to be a nonrepresentational difference between two experiences is actually a difference in representata.

As representationalism has been defined here, it does not require the transparency thesis. Representationalism itself is a claim only about sensory qualities, while transparency is about features of experience more generally. Even if transparency fails and there are introspectible nonrepresentational features of experiences, those features are presumably not sensory qualities. (Though some of the foregoing examples have also been used against representationalism; see Section 4.) Of course, if representationalism should be construed as applying to features of experience more broadly, then the existence of some such features may be troublesome for the view so construed; but they may be acceptable to the materialist, e.g., because they are functional.

Byrne (2001) and Thau (2002) appeal to the notion of the way the world seems to a subject. Very briefly, as Byrne puts it: “if the way the world seems to him hasn’t changed, then it can’t be that the phenomenal character of his experience has changed” (p. 207), where by “phenomenal character” in the context Byrne means sensory qualities. Suppose a subject has two consecutive experiences that differ in qualitative character. If she is “competent” in the sense of having no cognitive shortcomings (in particular, her memory is in good working order) and is slightly idealized in one or two other ways, she will notice the change in qualitative character. If so, Byrne argues, the way things seem to her when she has the second experience must differ from the way they seemed to her while she was having the first. For suppose that consecutive experiences are the same in content. Then the world seems exactly the same to the subject during both. She “has no basis for” noticing a change in qualitative character either, and by the previous premise it follows that there was no change in qualitative character (p. 211). The argument generalizes in each of several natural ways, and Byrne concludes that experiences cannot differ in qualitative character without differing in representational content.

If this sounds too close to being another simple appeal to transparency—and/or to beg the question against mental paint—Byrne hastens to add that his argument does not require transparency and is compatible with the existence of mental paint (pp. 212–13). So far as a subject is aware of mental paint, her experience is “partly reflexive” and represents its own paint. Therefore, a difference in paint would be another difference in representatum, not a qualitative difference unaccompanied by a content difference.

From the orthodox representationalist point of view, that may seem a dangerous concession. Byrne does hew to the representationalist’s line of supervenience (no qualitative difference without an intentional difference), but if his argument does not rule out mental paint, an anti-representationalist may construct inversion cases such as that of Block’s (1990) “Inverted Earth” (see Section 4.4 below), and argue that the paint is a nonfunctional intrinsic mental feature of the experience given in introspection, which is close enough to a “quale” in Block’s special sense, even if the feature does happen to be reflexively represented by the experience itself.

An anti-representationalist might also complain that Byrne has equivocated on “seems.” Block (1996) argued that “looks” (as in “That thing looks red to me”) is ambiguous, as between an intentional or representational sense and a separable phenomenal sense, and he believes inversion cases show that the two sorts of looking can come apart. No doubt he would hold the same of “seems.” Whether or not one is persuaded of that claim, Byrne’s argument presupposes that it is false. While doing that does not strictly beg the question, the argument does help itself to an assumption that is unlikely to be granted by the anti-representationalist.

Pautz (2007, 2010) appeals to hallucinatory experience. Suppose you hallucinate (simultaneously) a red ellipse, an orange circle, and a green square, without ever previously having encountered any of those colors or shapes. That experience directly gives you the capacity to form beliefs about the external world, e.g., that there is a red ellipse, that red is more like orange than like green, and that ellipses are more like circles than like squares. This “grounding property” of the experience motivates a “relational” view of it, according to which having the experience puts the subject in a relation to “items involving properties which, if they are properties of anything at all, are properties of extended objects” (2007, p. 524). Pautz offers two further arguments for relationality, based respectively on the “matching property” and the “characterization property.”

Given relationality, representationalism still has three rivals: sense-data, Peacocke’s (2008) “sensationalist” theory, and Alston’s (1999) “theory of appearing.” But each of the rivals succumbs to objections; so representationalism is true of hallucinatory experience. Now, why not extend representationalism to experience across the board? At this point there is just one still unrefuted opponent: the “positive” disjunctivist who maintains that veridical experience differs radically in kind from hallucination. Pautz argues that that view is not worth the complications it enforces.

It remains to show that an experience’s characteristic qualitative properties can be identified with its representational properties. At this point Pautz appeals to the assumption that for any qualitative difference between two visual experiences, the difference has a spatial component, either in distinguishing two subregions of one’s visual field or in attending to one rather than another. In no case, Pautz maintains, is there going to be an experiential difference without a representational difference.

Since Pautz has proceeded by objecting to various competing views, we must hear what their respective proponents will say in rebuttal.

The discussion in this section and in the next focuses on the nature of sensory qualities themselves. According to the representationalist, the qualities are not mental; the corresponding mental property of a sensory state is that of representing the relevant quality. Of course, it is sensory states and experiences themselves that interest philosophers of mind, and some critics of representationalism will protest that merely representing a quality cannot be all there is to having the qualitative character that needs explaining; we shall return to that complaint in section 4.2 below.

As yet we have said nothing about what it is to be aware of a quality.

If a sensory quality is an intentional object of a mental state, then presumably the state’s owner is aware of the quality in whatever way a person is aware of the intentional contents of her/his mental states generally, including those of nonsensory propositional attitudes. The general issue is problematic and is much discussed in the “self-knowledge” literature. There are various options: higher-order representation; self-representation; attentional modulation; “acquaintance” of some sort more intimate than any of the foregoing; or the automatic replication of a first-order state’s content in any other state directed upon the first. In any case, however, the problems of awareness of content are already with us, and do not afflict the representational theory of sensory qualities in particular.

4. Objections to the Representational Theory

The (strong) representational theory entails the obvious supervenience claim: that there can be no difference in sensory qualities without a representational difference. Objections to the theory have most often come in the form of counterexamples to that thesis. But we shall begin with four more general complaints.

Some philosophers are squeamish about the representationalist’s commitment to nonactual objects in cases of hallucination or perceptual illusion. For example, Loar (2003) imagines comparing the experience of seeing a lemon and a subjectively indistinguishable case of hallucinating an exactly similar lemon. “A way of putting this is representational: the two experiences present the real lemon and a merely intentional object as exactly similar, and that is what makes the experiences indistinguishable…. At the same time, one has a good sense of reality, and so wants to hold that the merely intentional lemon is nothing at all, and so not something that can resemble something else” (p. 84). A similar sentiment is sympathetically attributed to Fred Dretske by Levine (2003, p. 59n). The representational theory is sometimes assimilated to Alexius Meinong’s fanciful view that along with the many things that actually exist, there are plenty of other things that are like the things that exist except for happening to lack the property of existing. (Thus, Mage exists but does not have wings; Pegasus lacked existence but had wings.)

But it is important to see that the metaphysics of nonexistence is everyone’s problem, not peculiarly that of the representationalist (or of one’s current opponent on whatever issue in the philosophy of mind). There are things that do not exist, such as a hallucinated pink rat or a hallucinated lemon. However troublesome it is for fundamental ontology, that fact does not entail Meinong’s exegesis of it, or David Lewis’ concretist interpretation, or any other particular metaphysical account of it. The representational theory of sensory qualities is neutral on such underlying issues; it says only that when you hallucinate a lemon, the yellowness you experience is that of the lemon. Of course, neither the lemon nor its color actually exists, but as before, there are plenty of things that do not exist. (And one should question whether, as Loar maintains, nonactual things and people cannot resemble actual ones.)

Sturgeon (2000), Kriegel (2002a) and Chalmers (2004) argue that representation cannot suffice for a sensory quality (or for there being something it is like for the subject to do the representing, but that is not our topic for now), because representation can occur unconsciously. This appears to refute pure representationalism, since according to that view representation of the relevant sort of property does necessarily constitute a sensory quality. The point is not highly significant, since as before, pure representationalism is an unoccupied position. The real question is whether the concern behind this objection carries over to “quasi-” or other strong representationalism.

Sturgeon does seem to hold the stronger view that not even representation of whatever special sort is appealed to by the quasi-representationalist can suffice for a sensory quality, because any such representation can occur unconsciously. Since the quasi-representationalist maintains precisely that a sensory quality is simply a representatum of the relevant sort, this would be an outright refutation.

This objection rests on the crucial assumption that sensory qualities can occur only consciously. That assumption shares the usual multiple ambiguity of “conscious.” The interpretation on which the objection’s premise is most obviously true is that of sense (1) or sense (2) above: Representation can occur without its subject’s being aware of it, and/or without the subject’s introspecting it. But if we then understand the tacit assumption in the same way, it would be independently rejected by most representationalists, who already hold that a sensory quality can occur without being noticed by its host. (Consider the driver driving “on autopilot” who obviously saw the red light, and saw its redness in particular, but who was daydreaming and quite unaware of the redness, or even of applying the brake.) If, rather, sense (3) is intended, the assumption would be fine, because tautologous (a quality cannot occur without a quality occurring); but the objection’s premise would be, in effect, a flat and question-begging denial of strong representationalism, saying that the relevant representation can occur without a sensory quality occurring.

What of sense (4)? Here the objection gets a slightly better foothold. The premise is true; representation can occur without there being something that representing is like for the subject. And there is at least a sense of the phrase “what it’s like” in which the tacit assumption is true also: Recall that some theorists have used that phrase simply to mean a sensory quality (in sense (3)); so again the assumption would be tautologous. But the present concern is for sense (4), and at this point the objection breaks down. For so far as has been shown, a (first-order) quality can occur without there being anything it is like for the subject to experience that quality on that occasion; the subject may be entirely unaware of it.

In virtue of what, then, does an experience contain, or have inhering within it, a sensory quality? The representationalist’s answer is, in virtue of representing that quality in a distinctive way. What is distinctive about that mode of representation is (a) the functional considerations needed to specify the relevant sense modality, and (b) assuming “experience” implies awareness of the sensory quality, whatever is called for by one’s account of awareness-of.

Terry Horgan and co-authors, beginning with Horgan (2000) and Horgan and Tienson (2002), inaugurated the “Phenomenal Intentionality Research Program.” (The view was anticipated by Siewert (1998), and see, e.g., the essays collected in Kriegel (2013), as well as Mendelovici (2018).) It was inspired in part by an idea of Loar’s (1987, 2003), but its approach is different and its claims are much more ambitious. Its proponents defend an internal, narrow type of intentionality that (Horgan and Tienson say) is not only determined by phenomenology but is constituted by it (pp. 520, 524); indeed, they contend (p. 529) that their internal intentionality is “the fundamental kind of intentionality: the narrow, phenomenal kind that is a prerequisite for wide content and wide truth conditions.” And by “phenomenal,” they seem to mean “what it’s like” properties in the higher-order sense.

This would imply that “what it’s like” properties are conceptually or at least metaphysically prior to intentionality. Put together with representationalism about sensory qualities, it would follow that “what it’s like” properties are prior to those, which is quite contrary to the spirit of (though not flatly incompatible with) representationalism, and certainly it poses a general threat of circularity. Moreover, some proponents concede that it is anti-naturalistic and may require a departure from materialism; but Mendelovici and Bourget (2014) argue that naturalistic considerations actually favor phenomenal intentionality.

An obvious objection is that many perceptual states and other clearly intentional occurrent states are not conscious in sense (1) above; their subjects are entirely unaware of them, and they have no phenomenology at all. Kriegel (2011) replies, contending that the intentionality of such states is not “intrinsic” but “interpretivist,” merely charitable interpretation à la Donald Davidson and D.C. Dennett. Mendelovici (2018, Ch. 8) reviews several further strategies and proposes a combination of those.

A further objection is that since phenomenal intentionality is narrow, the view cannot account for the wide intentionality of everyday mental states. Proponents reply either that wide “contents” are not really contents (e.g., Horgan and Tienson, Bourget (2010)), or that they are derived in a deflationary way from the narrow phenomenal contents (e.g., Siewert, Kriegel, Farkas (2008)).

The “phenomenal intentionality” view will have to be considered on its merits, and after we have seen what account it affords of the sensory qualities. (However, it blocks any direct argument for representationalism based on transparency. Indeed, Kriegel (2007, p. 321) appeals to transparency in support of phenomenal intentionality.) For further discussion, see the entry on phenomenal intentionality.

Pautz (2017) points out that there are restrictions on sensory appearances that have no parallel for other intentional states such as beliefs. For example, a surface cannot appear both round and square, or both pure red and pure yellow at the same place at the same time; nor can anything seem round or red but not extended. Pautz contends that such impossibilities are metaphysical in strength. In contrast, there is nothing metaphysically or even nomologically impossible, but only irrational, about contradictory or otherwise anomalous beliefs or desires. How is a representationalist to explain the existence of such laws?

Speaks (2017) responds by questioning whether the laws are metaphysically necessary. Perhaps they arise simply from how we are built. We might have had sense modalities that worked differently, in such a way as to allow representation of the anomalous states of affairs.

Further, even if the impossibilities are metaphysical, they may arise from the metaphysics of functional roles of sensory states rather than from the sense modalities’ representational capacities.

Now we come to the cases in which, allegedly, either two experiences differ in their sensory qualities without differing in intentional content or they differ entirely in their intentional content but share sensory qualities. Let us start with the degenerate case in which (supposedly) there is no intentional content to begin with.

4.5.1 Nonintentional mental states

If every sensory quality is a represented property, then phenomenal character in sense (3) is exhausted by the intentionality of the relevant experience. Since vision is pretty plainly representational, it is no surprise that representationalists have talked mainly about color qualities. But many other mental states that have phenomenal character are, arguably, not intentional and do not represent anything: bodily sensations, and especially moods. Rey (1998): “Many have noted that states like that of elation, depression, anxiety, pleasure, orgasm seem to be just overall states of oneself, and not features of presented objects” (p. 441, italics original). For that matter, it is hardly obvious that every specifically perceptual experience represents—smell, for example, does not clearly do so.

The representationalist has several options here. First, s/he could restrict the thesis to perceptual experiences, or to states that are admitted to be intentional. But that would be ad hoc, and would leave the phenomenal character of the excluded mental states entirely unexplained.

Second, representationalists such as Lycan (1996, 2001) and Tye (1995, 2003b) have, in some detail, defended Brentano’s thesis that in fact every mental state is intentional, including bodily sensations and moods. It is easy enough to argue that pains and tickles and even orgasms have some representational features (see Tye (1995) and Lycan (1996)). For example, a pain is felt as being in a certain part of one’s body, as if that part is disordered in a certain way; that is why pains are described as “burning,” “stabbing,” “throbbing” and the like. Though it is harder to maintain that a mood has intentional content, it is plausible to say that a state of elation, for example, represents one’s surroundings as being beautiful and exciting, and free-floating anxiety represents that something bad is about to happen. However, this does not answer the previous objection that even when it is admitted that a pain, a tickle, or a general depression does represent something, that representational content does not loom very large in the overall phenomenal character of the mental state in question.

Third, if transparency is rejected, other introspectible features of an experience can be allowed to count as part of its phenomenal character. Lycan (1998) argues that for some mental states, sensory qualities do not exhaust their “overall feel.” Consider pains. Armstrong (1968) and Pitcher (1970) argued convincingly that pains are representational and have intentional objects, real or unreal as usual, which objects are unsalutary conditions of body parts; pain is a kind of proprioception. But those intentional objects are not all I can introspect about a pain. I can also introspect its awfulness, its urgency. We should distinguish the pain’s sensory quality, its specifically sensory core (say, the throbbing character of a headache) from the pain’s affective and conative aspect that constitutes its awfulness. Those are not normally felt as distinct, but two different neurological subsystems are responsible for the overall experience, and they can come apart. The quality is what remains under morphine; what morphine blocks is the affective aspect—the desire that the pain stop, the distraction, the urge to pain-behave. It is then open to the materialist to treat the affective components as functional rather than representational, and that is not ad hoc.

Fourth, recall the distinction between senses (3) and (4) of the tricky term “phenomenal character.” As always, the representational theory addresses only sense (3). But sense (4), that of “what it’s like” to entertain a given sensory quality, can be generalized: It is not only qualities that have the higher-order what-it’s-like property; arguably propositional attitudes and other states that do not involve qualities in sense (3), such as occurrent thoughts, have it too (Siewert (1998), Pitt (2004)). So the representationalist can accuse the present objection of confusing (4) with (3). It remains to be argued how plausible that accusation would be.

4.5.2 Same intentional contents, different sensory qualities

Peacocke (1983) gave three examples of this kind, Block (1995, 1996) a few others; for discussion of those, see Lycan (1996). Tye (2003a) provides an extensive catalogue of further cases, and rebuts them on the representationalist’s behalf. In each case, the representationalist tries to show that there are after all intentional differences underlying the qualitative differences in question.

Trees. In Peacocke’s leading example, your experience represents two (actual) trees, at different distances from you but as being of the same physical height and other dimensions; “[y]et there is also some sense in which the nearest tree occupies more of your visual field than the more distant tree” (p. 12). That sense is a qualitative sense, and Peacocke maintains that the qualitative difference is unmatched by any representational difference.

Tye (1995) and others have rejoined that there are after all identifiable representational differences constituting the qualitative differences in the trees example. Tye points to the fact that one of the trees subtends a larger visual angle from the subject’s point of view, and he argues that this fact is itself (nonconceptually) represented by the visual experience. Lycan contends that perceptual representation is layered, and vision in particular represents physical objects such as trees by representing items called “shapes,” most of which are nonactual; in the trees case differing shapes are represented. Much more promisingly, Schellenberg (2008) appeals to “situation-dependent” properties of external objects, by perceiving which we also perceive the high-level properties of the same objects. Byrne (2001) merely observes that one of the trees is represented as being farther from the subject than the other. Bourget (2015) even less commissively points out that your experience represents fewer small features of the farther tree. For further discussion, see Millar (2010).

Peacocke’s second and third examples concern, respectively, binocular vision and the Necker reversible-cube illusion. On the former, see Tye (1992) and Lycan (1996). The latter has given rise to a distinctive literature.

Aspect-perception and attention. The Necker cube is one of a growing family of alleged counterexamples involving aspect-perception or selective attention. Others include ambiguous pictures such as the famous duck-rabbit; arrays of dots or geometric figures which can be “grouped” by vision in alternate ways; or other displays which can be attended to in multiple ways. In each case, a single and unchanging figure that seems to be univocally represented by vision nonetheless gives rise to differing visual experiences.

Representationalists of course respond by trying to specify distinct properties as characteristic representata in the differing experiences. For example, a “duck” experience of the duck-rabbit will represent the property of being a bill without representing that of being an ear; the “rabbit” experience will do the opposite. One way of grouping dots will mobilize the concept of a row but not that of a column, etc. Macpherson (2006) offers a rich survey of such examples and rebuts representationalist replies both existing and anticipated.

One of the most interesting recent examples (not discussed by Macpherson) is offered by Nickel (2007):

Figure 1: a 3×3 grid of nine blank squares.

Figure 2: the same grid with the squares numbered:

 1   2   3
 4   5   6
 7   8   9

Nickel says we can see an arbitrarily chosen set of constituent squares “as prominent.” For example, in Figure 1 we can see the squares corresponding to 1, 3, 5, 7, and 9 as prominent, or alternately see 2, 4, 6, and 8 as prominent, without changing where we look and, it seems, while representing just the same figure and its elements all the while. In particular, we need not change the focus of our vision, but leave it on the center of Figure 1, yet have different experiences.

The representationalist has several options here. First, focusing on Nickel’s phrase “see as prominent,” s/he could claim that a distance illusion is created, so that the “farther away” relation is represented; or, noting that the preposition “as” seems already to be representational language, s/he could appropriate Nickel’s own term “prominent” as designating a property and just leave it unexplicated. Second, the representationalist could insist that the figure is pictorial, and then invoke some version of figure-ground, or assimilate the case to seeing-as of some other sort (assuming s/he had already provided a representational account of seeing-as more generally). Third, s/he might reject Nickel’s assumption that the whole figure is actually seen at one time, writing off the contrary impression as what Noë (2004) calls “presence as absence.” And there are other possibilities, though each is bound to be contentious.

Block (2010) offers cases in which shifts of attention seem to change sensory qualities (Carrasco (2006)). “The effect of attention is experienced in terms of appearance of contrast, speed, size, color saturation, etc. Attended things look bigger, faster, more saturated, and higher in contrast” (p. 44). Realism about contrast, speed and the rest being assumed, it would seem clear that if an attended thing looks (e.g.) bigger than its actual size, that is just a false or inaccurate representation. But Block takes pains to forestall that inference.

Blurry vision. We must revisit that case (which actually was introduced as a problem for representationalism by Boghossian and Velleman (1989)), because it requires a wrinkle in the representationalist strategy. The normal representationalist move would be to say that the visual experience represents the relevant part of the world as being blurry, but here we want to concede that there is a phenomenal difference between seeing an object as being blurry and blurrily seeing a nonblurry object. Tye (2003a) points out that that difference can be characterized informationally: In the former case, as when looking at a blurry painting, vision represents the blurred edges as such, and just where they lie. But in the latter, vision provides less information, and fails to represent the sharp edges. Tye distinguishes similarly between nonveridically seeing a sharp object as blurry, which experience incorrectly represents the boundaries as fuzzy, and seeing the same object blurrily, which does not represent them, except to place them within broad limits. Allen (2013) contends, to the contrary, that a blurry visual experience represents objects as having multiple boundaries. Bourget (2015) argues that, whatever the positive details, a blurry experience loses some of the content that would be represented by a sharper experience of the exact same scene.

4.5.3 Inversions

Inversion examples in the tradition of Locke’s “inverted spectrum” form a special category of alleged counterexamples to representationalism. Some fit the foregoing model (same intentional contents, different sensory qualities), some do not. Lockean inversion was that of color qualities with respect to behavioral dispositions, which is regarded as possible by everyone except behaviorists and Wittgensteinians. To find an inversion counterexample to the representational theory, the objector would have to posit qualities inverted with respect to all representational contents, or, in the case of “mixed” or “quasi-” representationalism, qualities inverted with respect to all representational contents and all the relevant functional etc. properties. (It is important to see that the latter inversion hypothesis is much more ambitious and should be much more controversial than the original Lockean idea.)

Shoemaker (1991) contends that this strong sort of inversion is possible, i.e., that sensory qualities could invert with respect to representational contents. But his only argument seems to be that such an inversion is imaginable, or conceivable in a thin sense. Since the representationalist’s claim is precisely that sensory qualities just are representational contents of a certain kind, but not that this is analytically or conceptually true, Shoemaker has given her/him no reason to think that the inversion is really, metaphysically, possible. (Also, it is too easy to think of color looks inverting with respect to mere representation; cf. the opening paragraph of Section 2. One has to try to imagine their inverting with respect to visual representation of the appropriate type.) Yet there are further inversion scenarios, supported by argument, that the representationalist must take seriously.

Fish-heads. Building on an example of Byrne’s (2001), Levine (2003) supposes that there are creatures whose eyes are on opposite sides of their heads and whose heads are fixed, so that they never look at an object with both eyes. Now, imagine one such creature whose eyes’ lenses are color-inverted with respect to each other. (It is not that one lens has been inverted; the creatures are born thus mismatched.) It seems that identically colored objects simultaneously presented will look, say, green to one eye but red to the other. Yet the same worldly color property (i.e., a reflectance property of whatever sort) is being represented by each eye. Now, every eye is normal within the population, so neither can easily be described as misrepresenting the colors of red objects. Each eye just sees the colors differently, and so the difference is not exhausted by the common representatum.

The first point to be made on the representationalist’s behalf is that, as Levine goes on to admit (p. 71), the eyes seem to be representing the world differently; “space appears differently filled on the two sides of the head.” Also, if the fish-head were able to turn and look at the same object first with one eye and then with the other and back again, the object would successively appear to it to be different colors. So we do not here have a case of phenomenal difference without representational difference.

But there is still a puzzle. If the two eyes are representing different properties and neither is misrepresenting, and only the one surface reflectance property is involved, what are the two distinct representata? Several options are available. (i) One could try to find a basis for saying that one of the eyes is (after all) misrepresenting, though it is hard to imagine what basis that might be. (ii) As Levine points out, one could fall in with the view of Shoemaker (1994) mentioned in Section 2, that the eyes are representing distinct dispositions even though the dispositions are realized by the same physical properties. (iii) If the eyes are mutually color-inverted, then they differ functionally. A psychosemantics such as Dretske’s (1986) that makes essential reference to function might therefore distinguish representata here. (iv) To the extent that each creature’s two eyes differ functionally from each other, the creature has two different and nonequivalent visual systems. Perhaps, then, we cannot say that either eye represents its red object as red, or as green; the same reflectance property is one color for one of the visual systems and a different color for the other, as it might be between two different species of organism, and we do not know what those colors are. That the realizing reflectance property is the same in each case does not establish sameness of representatum, because that property may be a common disjunct of each of two distinct disjunctive properties that are respectively colors for the two types of visual system.

Inverted Earth. Block (1990) appeals to an “Inverted Earth,” a planet exactly like Earth except that its real physical colors are (somehow) inverted with respect to ours. The Inverted Earthlings’ speech sounds just like English, but their intentional contents in regard to color are inverted relative to ours: When they say “red,” they mean green (if it is green Inverted objects that correspond to red Earthly objects under the inversion in question), and green things look green to them even though they call those things “red.” Now, an Earthling victim is chosen by the customary mad scientists, knocked out, fitted with color-inverting lenses, transported to Inverted Earth, and repainted to match that planet’s human skin and hair coloring. Block contends that after some length of time—a few days or a few millennia—the victim’s word meanings and propositional-attitude contents and all other intentional contents will shift to match the Inverted Earthlings’ contents, but, intuitively, the victim’s color qualities will remain the same. Thus, sensory qualities are not intentional contents.

A natural representationalist reply is to insist that if the intentional contents would change, so too would the qualitative contents. Block’s nearly explicit argument for denying this is that “qualia” (he fails to distinguish sensory qualities, sense (3), from their higher-order “what it’s like” properties, sense (4)) are narrow, while the intentional contents shift under environmental pressure precisely because they are wide. If sensory qualities are indeed narrow, and all the intentional contents are wide and would shift, then Block’s argument succeeds. (Stalnaker (1996) gives a version of Block’s argument that does not depend on the assumption that the qualities are narrow; Lycan (1996) rebuts it.)

Three replies are available, then: (i) To insist that the visual intentional contents would not shift. Word meanings would shift, but it does not follow that visual contents ever would. (ii) To hold that although all the ordinary intentional contents would shift, there is a special class of narrow though still representational contents underlying the wide contents; sensory qualities can be identified with the special narrow contents. (iii) To deny that qualitative content is narrow and argue that it is wide, i.e., that two molecularly indistinguishable people could indeed experience different qualities. This last is the position that Dretske (1996) has labelled “phenomenal externalism,” though (again) in our terminology that would have been “qualitative” externalism.

Reply (i) has not been much pursued. (ii) has, a bit, by Tye (1994) and especially Rey (1998). Rey argues vigorously that “qualia” are narrow, and then offers a narrow representational theory. (But as previously mentioned, it turns out that Rey’s theory is not a theory of sensory qualities; see Section 4.5.) Note that Fregean as opposed to Russellian representationalism is well suited to (ii); even if the Russellian contents shift, the Fregean contents need not. Chalmers (2004) advocates such a view. (Papineau (2014) offers a fourth alternative: to say that although sensory states do represent worldly properties such as color and shape, the sensory qualities themselves are simply not representata and do not shift when the environment is inverted; rather, they are just narrow properties of subjects. He tries to explain away our feeling that sensory qualities are presented to the mind as worldly.)

Reply (iii) has been defended by Dretske (1995, 1996), Tye (1995) and Lycan (1996, 2001). A number of people (even Tye himself (1998)) have since called the original contrary assumption that sensory qualities are narrow a “deep / powerful / compelling” intuition, but it proves to be highly disputable. Here are two arguments, though not very strong arguments, for the claim that the qualities are wide.

First, if the representational theory is correct, then sensory qualities are determined by whatever determines a psychological state’s intentional content; in particular, the color properties represented are taken to be physical properties instanced in the subject’s environment. What determines a psychological state’s intentional content is given by a psychosemantics, in Fodor’s (1987) sense. But every known plausible psychosemantics makes intentional contents wide. Of course, the representational theory is just what is in question; but if one grants that it is independently plausible or at least defensible, the further step to externalism is not a giant step.

Second, suppose sensory qualities are narrow. Then Block’s Inverted Earth argument is plausible, and it would show that either the qualities are narrow functional properties or they are properties of a very weird kind whose existence is suggested by nothing else we know (see Ch. 6 of Lycan (1996)). But sensory qualities are not functional properties, at least not narrow ones: recall the Bertie dilemma. Also, they are ostensibly monadic properties, while functional properties are all relational; and see further Block’s anti-functionalist arguments in Block (1978). So, either sensory qualities are wide or weirdness is multiplied beyond necessity. Of course, that dichotomy will be resisted by anyone who offers a narrow representationalist theory as in (ii) above.

The Scrambler. Biggs (2009) constructs a complicated example in which a human-like species has evolved in such a way that the sensory qualities inhering in its members’ perceptual states are entirely disconnected both functionally and causal-historically from their worldly environments. Biggs argues that those states simply lack representational content, and he anticipates and rebuts the most likely replies.

Although until the mid-1990s the assumption that sensory qualities are narrow had been tacit and undefended, opponents of wide representationalism have since defended the assumption with vigor. Here are (only) some of their arguments, with sample replies. (For a fuller discussion, see Lycan (2001).)

Introspection. Block’s Earthling suddenly transported to Inverted Earth or some other relevant sort of Twin Earth would notice nothing introspectively, despite a change in representational content; so the sensory quality must remain unchanged and so is narrow.

Reply: The same goes for propositional attitudes, i.e., the transported Earthling would notice nothing introspectively. Yet the attitude contents are still wide. Wideness does not entail introspective change under transportation.

Narrow content. In the propositional-attitude literature, the corresponding transportation argument has been taken as the basis of an argument for “narrow content,” viz., for something that is intentional content within the meaning of the act but is narrow rather than, as usual, wide. The self-knowledge problem aforementioned, and the problem of “wide causation” (Fodor (1987), Kim (1995)), have also been used to motivate narrow content. And, indeed, any general argument for narrow content will presumably apply to sensory representation as well as to propositional attitudes. If there is narrow content at all, and sensory content is representational, then probably sensory states have narrow content too. Thus, sensory qualities can and should be taken to be the narrow contents of such states.

Replies: First, this begs the question against the claim that the qualities are wide. Even if there are indeed narrow contents impacted within sensory states, independent argument is needed for the identification of sensory qualities with those contents rather than with wide contents. Second and more strongly, narrow sensory contents still would not correspond to sensory qualities in our sense. So far as has been shown, the redness of a patch in my visual field is still a wide property, even if some other, narrow property underlies it in the same way that (mysterious, ineffable) narrow contents are supposed to underlie beliefs and desires.

Modelling a shift of qualities. If perceptual contents are wide and the environment is subject to change, we should expect a shift, even if the perceptual contents would not shift as readily as attitude contents would. Perhaps they would eventually shift after several centuries on Inverted Earth, if a subject could stay alive that long. But how would a distinctive quality even imaginably undergo such a shift? For example, suppose that a quality is supposed to shift from blue to yellow. A shift from blue to yellow might reasonably be supposed to be a smooth and gradual shift along the spectrum that passes through green. But it is hardly plausible that one would experience such a shift, or a period of unmistakable greenness in particular.

Reply: We have no plausible model for a shift of everyday attitude content either. How would a type of belief state smoothly go from being about blue to being about yellow? Presumably not by being about green in between. So our presumed quality shift is no worse off than the attitudinal shift in this regard; if the present argument works for the former case, it also works for the latter, contrary to hypothesis.

To this it may be rejoined that attitude contents are more tractable, in that they may yield to some view of aboutness according to which reference can divide for a time between contents such as blue and yellow. (E.g., Field’s (1973) theory of “partial reference.”) It is harder to imagine “divided” phenomenology.

Modes of presentation (Rey (1998); Chalmers (2004) defends a similar view). There is no such thing as representation without a mode of presentation. If a sensory quality is a representatum, then it is represented under a mode of presentation, and modes of presentation may be narrow even when the representational content itself is wide. Indeed, many philosophers of mind take modes of presentation to be internal causal or functional roles played by the representations in question. Surely they are strong candidates for qualitative content. Are they not narrow qualitative features?

Reply: Remember, the sensory qualities themselves are properties like subjective greenness and redness, which according to the representational theory are representata. The modes or guises under which greenness and redness are represented in vision are something else again.

But it can plausibly be argued that such modes and guises are qualitative or phenomenal properties of some sort, perhaps higher-order properties. See the next section.

Memory (Block (1996)). “[Y]ou remember the color of the sky on your birthday last year, the year before that, ten years before that, and so on, and your long-term memory gives you good reason to think that the phenomenal character of the experience has not changed…. Of course, memory can go wrong, but why should we suppose that it must go wrong here?” (pp. 43–44, italics and boldface original). The idea is that memory acts as a check on the sensory qualities, and can be used to support the claim that the qualities have remained unchanged despite the wholesale shift in representational contents.

Reply: Memory contents are wide, and so by Block’s own reasoning they will themselves undergo the representational shift to the Inverted-Earth complementary color. Thus, your post-shift memories of good old Earth are false. When you say or think to yourself, “Yes, the sky looks as blue as it did thirty years ago,” you are not expressing the same memory content as you would have when you had just arrived on Inverted Earth. You are now remembering or seeming to remember that the sky looked yellow, since for you “blue” now means yellow. And that memory is false, since on the long-ago occasion the sky looked blue to you, not yellow; memory is not after all a reliable check on the sensory qualities. (Lycan (1996) takes this line; Tye (1998) expands it in more detail.)

Structural mismatch. Following Hardin (1988) and others, Pautz (2014, 2019) argues that the structural properties of a sensory field, paradigmatically resemblance relations, match the neural substrates of the relevant experiences rather than the chemical or physical properties of the representata. For example, the sensory quality blue resembles purple more strongly than it resembles green, but the reflectances underlying worldly objects are the other way around: the blue reflectance type resembles the green reflectance type more than it does the purple. Even more dramatic mismatches obtain in the case of smell. Therefore, it seems that a sensory quality is a narrow property rather than the wide worldly one predicted by standard externalist psychosemantics.

Reply: First, that a sense modality represents two worldly properties as being similar to degree n does not entail that they are thus similar; there may be illusion regarding resemblance. Second, there will be room for debate regarding exactly what chemical or physical properties do constitute the representata, and regarding what psychosemantics connects those properties to the sensory experience.

Hardly anyone will accept all of the foregoing replies. But no one should now find it incontestable either that sensory qualities are narrow or that they are wide. The matter is likely to remain controversial for some time to come.

5. What It’s Like

Some philosophers (e.g., Dretske (1995), Tye (1995)) use this troublesome expression simply to mean a sensory quality, and this is one of the two meanings it has had in recent philosophy of mind. But in the fourth paragraph of this entry, the phrase was introduced in the context, “‘what it’s like’ for the subject to be in a particular mental state, especially what it is like for that subject to experience a particular qualitative property,” which indicates that there is another sense (4) in which (when the mental state does involve a sensory quality) the “what it’s like” is something over and above the quality itself. In fact, since this second “what it’s like” is itself a property of the quality, it cannot very well be identical with the quality. It is the property of what it is like to experience that quality; alternately, the relevant introspectible property of the experience itself. Let us now just speak of “what it’s like” (WIL) properties, meaning just this higher-order phenomenal sort.

Block (1995), like many other writers, fails to distinguish WIL properties from sensory qualities. But Carruthers (2000) elaborates nicely on the distinction: A quality in the first-order sense presents itself as part of the world, not as part of one’s mind. It is, e.g., the apparent color of an apparently physical object (or, if you are a Russellian, the color of a sense-datum that you happen to have encountered as an object of consciousness). A sensory quality is what the world is or seems like. But what it is like to experience that color is what your first-order perceptual state is like, intrinsically mental and experienced as such.

Here are two further reasons for maintaining such a distinct sense of the phrase. First, a sensory quality can be described in one’s public natural language, while what it is like to experience the quality seems to be ineffable. Suppose Ludwig asks Bertie, “How, exactly, does the after-image look to you as regards color?” Bertie replies, “I told you, it looks green.” “Yes,” says Ludwig, “but can you tell me what it’s like to experience that ‘green’ look?” “Well, the image looks the same color as that,” says Bertie, pointing to George Edward’s cloth coat. “No, I mean, can you tell me what it’s like intrinsically, not comparatively?” “Um,….” —In one way, Bertie can describe the phenomenal color, paradigmatically as “green.” But when asked what it is like to experience that green, he goes mute. So there is a difference between (a) “what it’s like” in the bare sense of the quality, the phenomenal color that can be described using ordinary color words, and (b) “what it’s like” to experience that phenomenal color, the WIL property, which cannot easily be described in public natural language at all.

Second, Armstrong (1968), Nelkin (1989), Rosenthal (1991), and Lycan (1996) have argued that sensory qualities can fail to be conscious in the earlier sense of awareness; a quality can occur without its being even slightly noticed by its subject. But in such a case, there is a good sense in which it would not be like anything for the subject to experience that quality. (Of course, in the first, Dretske-Tye sense there would be something it was like, since the quality itself is that. But in another sense, if the subject is entirely unaware of the quality, it is odd even to speak of the subject as “experiencing” it, much less of there being something it is like for the subject to experience it.) So even in the case in which one is aware of one’s quality, the second type of “what it’s like,” the WIL property, requires awareness and so is something distinct from the quality itself.

It is the second sense of “what it’s like” that figures in anti-materialist arguments from subjects’ “knowing what it’s like,” primarily Nagel’s (1974) “Bat” argument and Jackson’s (1982) “Knowledge” argument, Chalmers’ (1996, 2003) Conceivability argument, and Levine’s (1983, 2001) Explanatory Gap arguments. To begin with the Knowledge argument: Jackson’s character Mary, a brilliant color scientist trapped in an entirely black-and-white laboratory, nonetheless becomes omniscient as regards the physics and chemistry of color, the neurophysiology of color vision, and every other public, objective fact conceivably relevant to human color experience. Yet when she is finally released from her captivity and ventures into the outside world, she sees colors for the first time, and learns something: namely, she learns what it is like to see red and the other colors. Thus she seems to have learned a new fact, one that by hypothesis is not a public, objective fact. It is an intrinsically perspectival fact. This is what threatens materialism, since according to that doctrine, every fact about every human mind is ultimately a public, objective fact.

Upon her release, Mary has done two things: She has at last hosted a red sensory quality, and she has learned what it is like to experience a red quality. In experiencing it she has experienced a “what it’s like” in the first of our two senses. But the fact she has learned has the ineffability characteristic of our second sense of “what it’s like”; were Mary to try to pass on her new knowledge of a WIL property to a still color-deprived colleague, she would not be able to express it in English.

We have already surveyed the representational theory of sensory qualities. But there are also representational theories of “what it’s like” in the second sense (4). A common reply to the arguments of Nagel and Jackson (Horgan (1984), Van Gulick (1985), Churchland (1985), Tye (1986), Lycan (1987, 1990, 1996, 2003), Loar (1990), Rey (1991), Leeds (1993)) is to note that a knowledge difference does not entail a difference in fact known, for one can know a fact under one representation or mode of presentation but fail to know one and the same fact under a different mode of presentation. Someone might know that water is splashing but not know that H 2 O molecules are moving, and vice versa; someone might know that person X is underpaid without knowing that she herself is underpaid, even if she herself is in fact person X. Thus, from Mary’s before-and-after knowledge difference, Jackson is not entitled to infer the existence of a new, weird fact, but at most that of a new way of representing. Mary has not learned a new fact, but has only acquired a new, introspective or first-person way of representing one that she already knew in its neurophysiological guise.

(As noted above, the posited introspective modes of presentation for sensory qualities in the first-order sense are strong candidates for the title of “qualia” in a distinct, higher-order sense of the term, and they may well be narrow rather than wide. This is what Rey (1998) seems to be talking about.)

This attractive response to Nagel and Jackson—call it the “perspectivalist” response—requires that the first-order qualitative state itself be represented (else how could it be newly known under Mary’s new mode of presentation?). And that hypothesis in turn encourages a representational theory of higher-order conscious awareness and introspection. However, representational theories of awareness face powerful objections, so the perspectivalist must either buy into such a theory despite its liabilities or find some other way of explicating the idea of an introspective or first-person perspective without appealing to higher-order representation. The latter option does not seem promising. And a further question raised by the perspectivalist response concerns the nature of the alleged first-person representation itself.

It has become popular, especially among materialists, to speak of “phenomenal concepts,” and to suppose that Mary has acquired one which she can now apply to her first-order qualitative state; it is in that way that she is able to represent the old fact in a new way. Phenomenal concepts figure also in responses to the Conceivability and Explanatory Gap arguments.

The Conceivability argument (Chalmers 1996, 2003) has it that “zombies” are conceivable—physical duplicates of ordinary human beings that share all the human physical and functional states but lack phenomenal consciousness in sense (4); there is nothing it is like to be a zombie. The argument then moves from bare conceivability to genuine metaphysical possibility, which would refute materialism. According to the Explanatory Gap argument (Levine 1983, 2001), no amount of physical, functional or other objective information could explain why a given sensory state feels to its subject the way it does, and the best explanation of this in turn is that the feel is an extra fact that does not supervene on the physical.

Lormand (2004) offers a very detailed linguistic analysis of the formula “There is something it is like for [creature] c to have [mental state] M,” and on its basis defends the claim that instances of that formula as well as more specific attributions of WIL properties can in fact be conceptually deduced at least from “nonphenomenal” facts about subjects.

What the Knowledge, Conceivability and Explanatory Gap arguments have in common is that they move from an alleged epistemic gap to a would-be materialism-refuting metaphysical one. Though some materialists balk at once and refuse to admit the epistemic gap, more grant the epistemic gap and resist the move to the metaphysical one. The epistemic gap, on this view, is created by the “conceptual isolation” of phenomenal concepts from all others, and it is conceptual only rather than metaphysical. Stoljar (2005) calls this the “phenomenal concept strategy”. There are a number of distinct positive accounts of phenomenal concepts and how they work; such concepts are: “recognitional” (Loar (1990), Carruthers (2000), Tye (2003c)); proprietary lexemes of an internal monitoring system (Lycan (1996)); indexical (Perry (2001), O’Dea (2002), Schellenberg (2013)); demonstrative (Levin (2007), Stalnaker (2008), Schroer (2010)); “quotational” or “constitutive” (Papineau (2002), Balog (2012)); “unimodal” (Dove and Elpidorou (2016)). Some of those accounts are minimal, aspiring only to block the key inferences in the anti-materialist arguments aforementioned. Others, particularly the constitutive account, are more detailed and offer to explain more specific features of WIL properties. For example, Papineau points out that the constitutive account explains the odd persistent attractiveness of some of the obviously fallacious antimaterialist arguments. He and Balog each argue that the account, according to which a phenomenal concept token is at least partly constituted by the very mental state-token that is its referent, explains the special directness of the reference: no feature of the state is appealed to (and a fortiori no neural, functional, causal etc. feature); Balog adds that since the referent is literally contained and present in the concept token, “there will always be something it is like” to do the tokening (p. 7).

The phenomenal concept strategy is criticized by Raffman (1995), Stoljar (2005), Prinz (2007), Chalmers (2007), Ball (2009), Tye (2009), Demircioglu (2013), and Shea (2014). For further works and references see Alter and Walter (2007), Sundström (2011) and Elpidorou (2015). Chalmers offers a “Master Argument” meant to refute any version of the strategy: it is a dilemma based on whether it is conceivable that the complete fundamental physical truth holds yet we possess no phenomenal concepts (having whichever features). The argument is criticized by Papineau (2007), Carruthers and Veillet (2007), and Balog (2012).

It is possible simply to deny the existence of WIL properties, as do Dennett (1991) and Dretske (1995); see also Humphrey (1992, 2011), Hall (2007), Pereboom (2011) and Tartaglia (2013). To do that is of course not to defend a representational theory or any other theory of them. But it would be good to explain away the majority belief in such properties, and some theorists do that in representational terms, arguing that other, real properties are misrepresented in introspection as WIL properties; Frankish (2016) calls this strategy “illusionism.” An obvious instance of such a misrepresentation would be to mistake a sensory quality for a WIL property; since the conflation of the two is already rife even among sophisticated philosophers, WIL deniers may suggest that what was introspected was only a sensory quality. (That is one way of understanding Dretske’s position, bar his resistance to the very notion of introspection as in Dretske (2003).) And as before, the phrase “what it’s like” has nontendentiously been used as referring to a sensory quality rather than to a property of a whole experience. Several authors point out that to reject WIL properties is not to grant Chalmers’ (1996) claim that for a zombie lacking WIL properties, “all is dark inside” (pp. 95–6).

Rey (1992) suggests that introspection mistakes the lack of detail it delivers for the accurate representation of a simple ineffable property. Alternately or in addition (1995), having detected stable and identifiable complexes of involuntary responses to states of ourselves and to living creatures who look and behave like us, for example the commonsense causal, representational, conative and affective syndrome we lump together using the word ‘pain’, we project a simple quality onto the others and into ourselves. Rey and Pereboom each compare the projection of WIL properties into the mind by introspection to vision’s projection of simple homogeneous color properties onto environmental objects.

Illusionism is criticized by Strawson (2006), Prinz (2016), Balog (2016), Nida-Rümelin (2016), Schwitzgebel (2016), and Chalmers (2018). Though highly sympathetic to illusionism, Kammerer (2018) argues that no previously existing account can explain the strength of the illusion. For general discussion, see the essays collected in Frankish (2017), and for further defense, see Shabasson (2022).

  • Allen, K., 2013. ‘Blur’, Philosophical Studies , 162: 257–73.
  • Alston, W., 1999. ‘Back to the Theory of Appearing’, in J. Tomberlin (ed.), Epistemology: Philosophical Perspectives , Vol. 13, Malden, MA: Wiley-Blackwell.
  • Alter, T., and S. Walter (eds.), 2007. Phenomenal Concepts and Phenomenal Knowledge , Oxford: Oxford University Press.
  • Anscombe, G.E.M., 1965. ‘The Intentionality of Sensation: A Grammatical Feature’, in R.J. Butler (ed.), Analytical Philosophy: Second Series , Oxford: Basil Blackwell.
  • Armstrong, D.M., 1968. A Materialist Theory of the Mind , London: Routledge and Kegan Paul.
  • –––, 1981. ‘What is Consciousness?’, in The Nature of Mind , Ithaca, NY: Cornell University Press.
  • Ball, D., 2009. ‘There Are No Phenomenal Concepts’, Mind , 118: 935–62.
  • Balog, K., 2012. ‘Acquaintance and the Mind-Body Problem’, in C.S. Hill and S. Gozzano (eds.), New Perspectives on Type Identity: The Mental and the Physical , Cambridge: Cambridge University Press.
  • –––, 2016. ‘Illusionism’s Discontent’, Journal of Consciousness Studies , 23: 40–51; reprinted in Frankish 2017.
  • Biggs, S., 2009. ‘The Scrambler: An Argument Against Representationalism’, Canadian Journal of Philosophy , 39: 215–36.
  • Block, N.J., 1990. ‘Inverted Earth’, in Tomberlin 1990.
  • –––, 1995. ‘On a Confusion about a Function of Consciousness’, Behavioral and Brain Sciences , 18: 227–47.
  • –––, 1996. ‘Mental Paint and Mental Latex’, in Villanueva 1996.
  • –––, 2010. ‘Attention and Mental Paint’, in E. Sosa and E. Villanueva (eds.), Philosophical Issues, 20: Philosophy of Mind , Malden, MA: Wiley-Blackwell.
  • Boghossian, P., and J. D. Velleman, 1989. ‘Color as a Secondary Quality’, Mind , 98: 81–103.
  • Bourget, D., 2010. ‘Consciousness is Underived Intentionality’, Noûs , 44: 32–58.
  • –––, 2015. ‘Representationalism, Perceptual Distortion and the Limits of Phenomenal Concepts’, Canadian Journal of Philosophy , 45: 16–36.
  • –––, 2017a. ‘Representationalism and Sensory Modalities: An Argument for Intermodal Representationalism’, American Philosophical Quarterly , 54: 251–68.
  • –––, 2017b. ‘Why Are Some Phenomenal Experiences “Vivid” and Others “Faint”? Representationalism, Imagery, and Cognitive Phenomenology’, Australasian Journal of Philosophy , 95: 673–87.
  • Brewer, B., 2006. ‘Perception and Content’, European Journal of Philosophy , 14: 165–81.
  • Burge, T., 1988. ‘Individualism and Self-Knowledge’, Journal of Philosophy , 85: 649–53.
  • Byrne, A., 2001. ‘Intentionalism Defended’, Philosophical Review , 110: 199–239.
  • Butchvarov, P., 1980. ‘Adverbial Theories of Consciousness’, in P. French, T.E. Uehling and H. Wettstein (eds.), Midwest Studies in Philosophy, Vol. V: Studies in Epistemology , Minneapolis: University of Minnesota Press.
  • Campbell, J., 2002. Reference and Consciousness , Oxford: Oxford University Press.
  • Carrasco, M., 2006. ‘Covert Attention Increases Contrast Sensitivity: Psychophysical, Neurophysiological and Neuroimaging Studies’, in S. Martinez-Conde, et al. (eds.), Progress in Brain Research, Vol. 154: Visual Perception, Part 1 , Amsterdam: Elsevier.
  • Carruthers, P., 2000. Phenomenal Consciousness , Cambridge: Cambridge University Press.
  • Carruthers, P., and B. Veillet, 2007. ‘The Phenomenal Concept Strategy’, Journal of Consciousness Studies , 14: 212–36.
  • Chalmers, D.J., 1996. The Conscious Mind , Oxford: Oxford University Press.
  • –––, 2003. ‘Consciousness and Its Place in Nature’, in S.P. Stich and F. Warfield (eds.), The Blackwell Guide to the Philosophy of Mind , Oxford: Blackwell.
  • –––, 2004. ‘The Representational Character of Experience’, in B. Leiter (ed.), The Future for Philosophy , Oxford: Oxford University Press.
  • –––, 2007. ‘Phenomenal Concepts and the Explanatory Gap’, in Alter & Walter 2007.
  • –––, 2018. ‘The Meta-Problem of Consciousness’, Journal of Consciousness Studies , 25: 6–61.
  • Chisholm, R., 1957. Perceiving , Ithaca, NY: Cornell University Press.
  • Churchland, P.M., 1985. ‘Reduction, Qualia, and the Direct Introspection of Brain States’, Journal of Philosophy , 82: 8–28.
  • Clark, A., 2000. A Theory of Sentience , Oxford: Oxford University Press.
  • Crane, T., 2001. Elements of Mind , Oxford: Oxford University Press.
  • –––, 2003. ‘The Intentional Structure of Consciousness’, in Smith & Jokic 2003.
  • Davidson, D., 1987. ‘Knowing One’s Own Mind’, Proceedings and Addresses of the American Philosophical Association , Vol. 60, No. 3.
  • Davies, M., and G. Humphreys (eds.), 1993. Consciousness , Oxford: Basil Blackwell.
  • Demircioglu, E., 2013. ‘Physicalism and Phenomenal Concepts’, Philosophical Studies , 165: 257–77.
  • Dennett, D.C., 1991. Consciousness Explained , Boston: Little, Brown and Co.
  • Dove, G., and A. Elpidorou, 2016. ‘Embodied Conceivability: How to Keep the Phenomenal Concept Strategy Grounded’, Mind and Language , 31: 580–611.
  • Dretske, F., 1986. ‘Misrepresentation’, in R.J. Bogdan (ed.), Belief , Oxford: Oxford University Press.
  • –––, 1993. ‘Conscious Experience’, Mind , 102: 263–83.
  • –––, 1995. Naturalizing the Mind , Cambridge, MA: Bradford Books / MIT Press.
  • –––, 1996. ‘Phenomenal Externalism’, in Villanueva 1996.
  • –––, 2003. ‘How Do You Know You Are Not a Zombie?’, in B. Gertler (ed.), Privileged Access and First Person Authority , Aldershot: Ashgate Publishing Limited.
  • Elpidorou, A., 2015. ‘Phenomenal Concepts’, in D. Pritchard (ed.), Oxford Bibliographies in Philosophy , New York: Oxford University Press.
  • Farkas, K., 2008. ‘Phenomenal Intentionality without Compromise’, Monist , 91: 273–93.
  • Fish, W., 2009. Perception, Hallucination, and Illusion , Oxford: Oxford University Press.
  • Fodor, J.A., 1975. The Language of Thought , New York: Crowell.
  • –––, 1987. Psychosemantics , Cambridge, MA: Bradford Books/MIT Press.
  • Frankish, K., 2016. ‘Illusionism as a Theory of Consciousness’, Journal of Consciousness Studies , 23: 11–39; reprinted in Frankish 2017.
  • ––– (ed.), 2017. Illusionism as a Theory of Consciousness , Exeter, UK: Imprint Academic.
  • Gennaro, R., 1995. Consciousness and Self-Consciousness , Philadelphia: John Benjamins.
  • Goodman, N., 1951. The Structure of Appearance , Cambridge, MA: Harvard University Press.
  • Gray, R., 2003. ‘Tye’s Representationalism: Feeling the Heat?’, Philosophical Studies , 115: 245–56.
  • Green, E.J., 2016. ‘Representationalism and Perceptual Organization’, Philosophical Topics , 44: 121–48.
  • Hall, E.W., 1961. Our Knowledge of Fact and Value , Chapel Hill: University of North Carolina Press.
  • Hall, R.J., 2007. ‘Phenomenal Properties as Dummy Properties’, Philosophical Studies , 135: 199–223.
  • Hardin, C.L., 1988. Color for Philosophers , Indianapolis: Hackett.
  • Harman, G., 1990. ‘The Intrinsic Quality of Experience’, in Tomberlin 1990; reprinted in Lycan & Prinz 2008.
  • Hawthorne, J. (ed.), 2007. Philosophy of Mind ( Philosophical Perspectives : Volume 21), Malden, MA: Wiley-Blackwell.
  • Heil, J., 1988. ‘Privileged Access’, Mind , 97: 238–51.
  • Hintikka, K.J.J., 1969. ‘On the Logic of Perception’, in N.S. Care and R.H. Grimm (eds.), Perception and Personal Identity , Cleveland, OH: Case Western Reserve University Press.
  • Horgan, T., 1984. ‘Jackson on Physical Information and Qualia’, Philosophical Quarterly , 34: 147–52.
  • –––, 2000. ‘Narrow Content and the Phenomenology of Intentionality’, Presidential Address to the Society for Philosophy and Psychology, New York City (June, 2000).
  • –––, and J. Tienson, 2002. ‘The Intentionality of Phenomenology and the Phenomenology of Intentionality’, in D. Chalmers (ed.), Philosophy of Mind: Classical and Contemporary Readings , Oxford: Oxford University Press.
  • Humphrey, N., 1992. A History of the Mind: Evolution and the Birth of Consciousness , New York: Simon & Schuster.
  • –––, 2011. Soul Dust: The Magic of Consciousness , Princeton: Princeton University Press.
  • Jackson, F., 1977. Perception , Cambridge: Cambridge University Press.
  • –––, 1982. ‘Epiphenomenal Qualia’, Philosophical Quarterly , 32: 127–36; reprinted in Lycan & Prinz 2008.
  • Kammerer, F., 2018. ‘Can You Believe It? Illusionism and the Illusion Meta-Problem’, Philosophical Psychology , 31: 44–67.
  • Kim, J., 1995. ‘Mental Causation: What, Me Worry?’, in E. Villanueva (ed.), Philosophical Issues, 6: Content , Atascadero, CA: Ridgeview Publishing.
  • Kind, A., 2003. ‘What’s So Transparent about Transparency?’, Philosophical Studies , 115: 225–44.
  • Kraut, R., 1982. ‘Sensory States and Sensory Objects’, Noûs , 16: 277–95.
  • Kriegel, U., 2002a. ‘PANIC Theory and the Prospects for a Representational Theory of Phenomenal Consciousness’, Philosophical Psychology , 15: 55–64.
  • –––, 2002b. ‘Phenomenal Content’, Erkenntnis , 57: 175–98.
  • –––, 2007. ‘Intentional Inexistence and Phenomenal Intentionality’, in Hawthorne 2007.
  • –––, 2011. The Sources of Intentionality , Oxford: Oxford University Press.
  • ––– (ed.), 2013. Phenomenal Intentionality . Oxford: Oxford University Press.
  • –––, 2017. ‘Reductive Representationalism and Emotional Phenomenology’, Midwest Studies in Philosophy , 41: 41–59.
  • Leeds, S., 1993. ‘Qualia, Awareness, Sellars’, Noûs , 27: 303–30.
  • Levine, J., 1983. ‘Materialism and Qualia: the Explanatory Gap’, Pacific Philosophical Quarterly , 64: 354–61.
  • –––, 2001. Purple Haze , Oxford: Oxford University Press.
  • –––, 2003. ‘Experience and Representation’, in Smith & Jokic 2003.
  • –––, 2007. ‘What is a Phenomenal Concept?’, in Alter & Walter 2007.
  • Lewis, C.I., 1929. Mind and the World Order , New York: C. Scribners Sons.
  • Lewis, D., 1983. ‘Individuation by Acquaintance and by Stipulation’, Philosophical Review , 92: 3–32.
  • Lloyd, D., 1991. ‘Leaping to Conclusions: Connectionism, Consciousness, and the Computational Mind’, in T. Horgan and J. Tienson (eds.), Connectionism and the Philosophy of Mind , Dordrecht: Kluwer.
  • Loar, B., 1987. ‘Subjective Intentionality’, Philosophical Topics , 15: 89–124.
  • –––, 1990. ‘Phenomenal States’, in Tomberlin 1990.
  • –––, 2003. ‘Transparent Experience and the Availability of Qualia’, in Smith & Jokic 2003.
  • Lormand, E., 2004. ‘The Explanatory Stopgap’, Philosophical Review , 113: 303–57.
  • Lycan, W.G., 1987. Consciousness , Cambridge, MA: Bradford Books / MIT Press.
  • –––, 1990. ‘What is the Subjectivity of the Mental?’, in Tomberlin 1990.
  • –––, 1996. Consciousness and Experience , Cambridge, MA: Bradford Books / MIT Press.
  • –––, 1998. ‘In Defense of the Representational Theory of Qualia’ (Replies to Neander, Rey and Tye), in Tomberlin 1998.
  • –––, 2001. ‘The Case for Phenomenal Externalism’, in J.E. Tomberlin (ed.), Metaphysics (Philosophical Perspectives, Vol. 15), Atascadero: Ridgeview Publishing.
  • –––, 2003. ‘Perspectivalism and the Knowledge Argument’, in Smith & Jokic 2003.
  • –––, 2019. ‘Block and the Representation Theory of Sensory Qualities’, in A. Pautz and D. Stoljar (eds.), Blockheads! Essays on Ned Block’s Philosophy of Mind and Consciousness , Cambridge, MA: MIT Press.
  • Lycan, W.G., and J. Prinz (eds.), 2008. Mind and Cognition , Third Edition. Oxford: Basil Blackwell.
  • Macpherson, F., 2006. ‘Ambiguous Figures and the Content of Experience’, Noûs , 40: 82–117.
  • Mendelovici, A., 2018. The Phenomenal Basis of Intentionality , Oxford: Oxford University Press.
  • Mendelovici, A., and D. Bourget, 2014. ‘Naturalizing Intentionality: Tracking Theories Versus Phenomenal Intentionality Theories’, Philosophy Compass , 9: 325–37.
  • Metzinger, T. (ed.), 1995. Conscious Experience , Tucson: University of Arizona Press.
  • Millar, B., 2010. ‘Peacocke’s Trees’, Synthese , 174: 455–61.
  • Nagel, T., 1974. ‘What Is It Like to Be a Bat?’, Philosophical Review , 83: 435–50.
  • Nanay, B., 2010. ‘Attention and Perceptual Content’, Analysis , 70: 263–70.
  • Neander, K., 1998. ‘The Division of Phenomenal Labor: A Problem for Representational Theories of Consciousness’, in Tomberlin 1998.
  • Nelkin, N., 1989. ‘Unconscious Sensations’, Philosophical Psychology , 2: 129–41.
  • Nickel, B., 2007. ‘Against Intentionalism’, Philosophical Studies , 136: 279–304.
  • Nida-Rümelin, M., 2016. ‘The Illusion of Illusionism’, Journal of Consciousness Studies , 23: 160–71; reprinted in Frankish 2017.
  • Noë, A., 2004. Action in Perception , Cambridge, MA: MIT Press.
  • –––, 2005. ‘Real Presence’, Philosophical Topics , 33: 235–64.
  • O’Dea, J., 2002. ‘The Indexical Nature of Sensory Concepts’, Philosophical Papers , 31: 169–81.
  • Papineau, D., 2002. Thinking about Consciousness , Oxford: Oxford University Press.
  • –––, 2007. ‘Phenomenal and Perceptual Concepts’, in Alter & Walter 2007.
  • –––, 2014. ‘Sensory Experience and Representational Properties’, Proceedings of the Aristotelian Society , 114: 1–33.
  • Pautz, A., 2007. ‘Intentionalism and Perceptual Presence’, in Hawthorne 2007.
  • –––, 2010. ‘Why Explain Visual Experience in Terms of Content?’, in B. Nanay (ed.), Perceiving the World , Oxford: Oxford University Press.
  • –––, 2014. ‘The Real Trouble for Phenomenal Externalists: New Evidence for a Brain-Based Theory of Sensory Consciousness’, in R. Brown (ed.), Consciousness Inside and Out , Dordrecht: Springer.
  • –––, 2017. ‘Experiences are Representations: An Empirical Argument’, in B. Nanay (ed.), Current Debates in Philosophy of Perception , New York: Routledge.
  • –––, 2019. ‘How Does Color Experience Represent the World?’, in D. Brown and F. Macpherson (eds.), Routledge Handbook of the Philosophy of Color , London: Routledge.
  • Peacocke, C., 1983. Sense and Content , Oxford: Oxford University Press.
  • –––, 2008. ‘Sensational Properties: Theses to Accept and Theses to Reject’, Revue Internationale de Philosophie , 62: 7–24.
  • Pereboom, D., 2011. Consciousness and the Prospects of Physicalism , New York: Oxford University Press.
  • Perry, J., 2001. Knowledge, Possibility, and Consciousness , Cambridge, MA: MIT Press.
  • Pitcher, G., 1970. ‘Pain Perception’, Philosophical Review , 79: 368–93.
  • Pitt, D., 2004. ‘The Phenomenology of Cognition; or What is It Like to Think that P? ’, Philosophy and Phenomenological Research , 69: 1–36.
  • Prinz, J., 2007. ‘Mental Pointing’, Journal of Consciousness Studies , 14: 184–211.
  • –––, 2016. ‘Against Illusionism’, Journal of Consciousness Studies , 23: 186–96; reprinted in Frankish 2017.
  • Putnam, H., 1975. ‘The Meaning of “Meaning”’, in K. Gunderson (ed.), Language, Mind and Knowledge (Minnesota Studies in the Philosophy of Science: Volume VII), Minneapolis: University of Minnesota Press.
  • Raffman, D., 1995. ‘On the Persistence of Phenomenology’, in Metzinger 1995.
  • Rey, G., 1983. ‘A Reason for Doubting the Existence of Consciousness’, in Davidson, R., G.E. Schwartz and D. Shapiro (eds.), Consciousness and Self-Regulation (Volume 3), New York: Plenum Press.
  • –––, 1991. ‘Sensations in a Language of Thought’, in Villanueva 1991.
  • –––, 1992. ‘Sensational Sentences Switched,’ Philosophical Studies 68: 289–319.
  • –––, 1995. ‘Towards a Projectivist Account of Conscious Experience’, in Metzinger 1995.
  • –––, 1998. ‘A Narrow Representationalist Account of Qualitative Experience’, in Tomberlin 1998.
  • Rosenthal, D.M., 1991. ‘The Independence of Consciousness and Sensory Quality’, in Villanueva 1991.
  • –––, 1993. ‘Thinking that One Thinks’, in Davies & Humphreys 1993.
  • Schellenberg, S., 2008. ‘The Situation-Dependency of Perception’, Journal of Philosophy , 105: 55–85.
  • –––, 2013. ‘Externalism and the Gappy Content of Hallucination’, in F. Macpherson and D. Platchias (eds.), Hallucination , Cambridge, MA: MIT Press.
  • Schroer, R., 2010. ‘Where’s the Beef? Phenomenal Concepts as Both Demonstrative and Substantial,’ Australasian Journal of Philosophy , 88: 505–22.
  • Schwitzgebel, E., 2016. ‘Phenomenal Consciousness, Defined and Defended as Innocently as I Can Manage’, Journal of Consciousness Studies , 23: 224–35; reprinted in Frankish 2017.
  • Searle, J.R., 1990. ‘Consciousness, Explanatory Inversion and Cognitive Science’, Behavioral and Brain Sciences 13: 585–642.
  • Sellars, W., 1956. ‘Empiricism and the Philosophy of Mind’, in H. Feigl and M. Scriven (eds.), Minnesota Studies in the Philosophy of Science (Volume I), Minneapolis: University of Minnesota Press.
  • –––, 1967. Science and Metaphysics , London: Routledge and Kegan Paul.
  • Shabasson, D., 2022. ‘Illusionism about Phenomenal Consciousness: Explaining the Illusion’, Review of Philosophy and Psychology , 13: 427–53.
  • Shea, N., 2014. ‘Using Phenomenal Concepts to Explain Away the Intuition of Contingency’, Philosophical Psychology , 27: 553–70.
  • Shoemaker, S., 1991. ‘Qualia and Consciousness’, Mind , 100: 507–24.
  • –––, 1994. ‘Phenomenal Character’, Noûs , 28: 21–38.
  • Siewert, C., 1998. The Significance of Consciousness , Princeton: Princeton University Press.
  • Smith, Q., and A. Jokic (eds.), 2003. Consciousness: New Philosophical Perspectives , Oxford: Oxford University Press.
  • Speaks, J., 2010. ‘Attention and Intentionalism’, Philosophical Quarterly , 60: 325–42.
  • –––, 2015. The Phenomenal and the Representational , Oxford: Oxford University Press.
  • –––, 2017. ‘Reply to Critics’, Philosophy and Phenomenological Research , 95: 492–506.
  • Stalnaker, R., 1996. ‘On a Defense of the Hegemony of Representation’, in Villanueva 1996.
  • –––, 2008. Our Knowledge of the Internal World , Oxford: Oxford University Press.
  • Stoljar, D., 2005. ‘Physicalism and Phenomenal Concepts’, Mind and Language , 20: 469–94.
  • Strawson, G., 2006. ‘Realistic Monism: Why Physicalism Entails Panpsychism’, in A. Freeman (ed.), Consciousness and its Place in Nature; Does Physicalism Entail Panpsychism? , Exeter, UK: Imprint Academic.
  • Sturgeon, S., 2000. Matters of Mind , London: Routledge.
  • Sundström, P., 2011. ‘Phenomenal Concepts’, Philosophy Compass , 6: 267–81.
  • Tartaglia, J., 2013. ‘Conceptualizing Physical Consciousness’, Philosophical Psychology 26: 817–38.
  • Thau, M., 2002. Consciousness and Cognition , Oxford: Oxford University Press.
  • Tomberlin, J.E. (ed.), 1990. Action Theory and Philosophy of Mind (Philosophical Perspectives, Vol. 4) , Atascadero, CA: Ridgeview Publishing.
  • ––– (ed.), 1998. Language, Mind, and Ontology (Philosophical Perspectives: Vol. 12), Atascadero, CA: Ridgeview Publishing.
  • Travis, C., 2004. ‘The Silence of the Senses’, Mind , 113: 57–94.
  • Tye, M., 1986. ‘The Subjectivity of Experience’, Mind , 95: 1–17.
  • –––, 1992. ‘Visual Qualia and Visual Content’, in T. Crane (ed.), The Contents of Experience , Cambridge: Cambridge University Press.
  • –––, 1994. ‘Qualia, Content, and the Inverted Spectrum’, Noûs , 28: 159–83.
  • –––, 1995. Ten Problems of Consciousness , Cambridge, MA: Bradford Books/MIT Press.
  • –––, 1998. ‘Inverted Earth, Swampman, and Representationism’, in Tomberlin 1998.
  • –––, 2002. ‘Representationalism and the Transparency of Experience’, Noûs , 36: 137–51.
  • –––, 2003a. ‘Blurry Images, Double Vision, and Other Oddities: New Problems for Representationalism?’, in Smith & Jokic 2003.
  • –––, 2003b. Consciousness and Persons , Cambridge, MA: Bradford Books / MIT Press.
  • –––, 2003c. ‘A Theory of Phenomenal Concepts’, in A. O’Hear (ed.), Minds and Persons , Cambridge: Cambridge University Press.
  • –––, 2007. ‘The Problem of Common Sensibles’, Erkenntnis , 66: 287–303.
  • –––, 2009. Consciousness Revisited: Materialism without Phenomenal Concepts , Cambridge, MA: MIT Press.
  • Van Gulick, R., 1985. ‘Physicalism and the Subjectivity of the Mental’, Philosophical Topics , 13: 51–70.
  • Villanueva, E. (ed.), 1991. Philosophical Issues, 1: Consciousness , Atascadero, CA: Ridgeview Publishing.
  • ––– (ed.), 1996. Philosophical Issues, 7: Perception , Atascadero, CA: Ridgeview Publishing.
  • White, S., 1987. ‘What Is It Like to Be a Homunculus?’, Pacific Philosophical Quarterly , 68: 148–74.

Cognitive Research: Principles and Implications

Creating visual explanations improves learning

Eliza Bobek

1 University of Massachusetts Lowell, Lowell, MA USA

Barbara Tversky

2 Stanford University, Columbia University Teachers College, New York, NY USA

Abstract

Many topics in science are notoriously difficult for students to learn. Mechanisms and processes outside student experience present particular challenges. While instruction typically involves visualizations, students usually explain in words. Because visual explanations can show parts and processes of complex systems directly, creating them should have benefits beyond creating verbal explanations. We compared learning from creating visual or verbal explanations for two STEM domains, a mechanical system (bicycle pump) and a chemical system (bonding). Both kinds of explanations were analyzed for content, and learning was assessed by a post-test. For the mechanical system, creating a visual explanation increased understanding, particularly for participants of low spatial ability. For the chemical system, creating both visual and verbal explanations improved learning without new teaching. Creating a visual explanation was superior and benefitted participants of both high and low spatial ability. Visual explanations often included crucial yet invisible features. The greater effectiveness of visual explanations appears attributable to the checks they provide for completeness and coherence as well as to their roles as platforms for inference. The benefits should generalize to other domains like the social sciences, history, and archeology where important information can be visualized. Together, the findings provide support for the use of learner-generated visual explanations as a powerful learning tool.

Electronic supplementary material

The online version of this article (doi:10.1186/s41235-016-0031-6) contains supplementary material, which is available to authorized users.

Significance

Uncovering cognitive principles for effective teaching and learning is a central application of cognitive psychology. Here we show: (1) creating explanations of STEM phenomena improves learning without additional teaching; and (2) creating visual explanations is superior to creating verbal ones. There are several notable differences between visual and verbal explanations; visual explanations map thought more directly than words and provide checks for completeness and coherence as well as a platform for inference, notably from structure to process. Extensions of the technique to other domains should be possible. Creating visual explanations is likely to enhance students’ spatial thinking skills, skills that are increasingly needed in the contemporary and future world.

Dynamic systems such as those in science and engineering, but also in history, politics, and other domains, are notoriously difficult to learn (e.g. Chi, DeLeeuw, Chiu, & Lavancher, 1994 ; Hmelo-Silver & Pfeffer, 2004 ; Johnstone, 1991 ; Perkins & Grotzer, 2005 ). Mechanisms, processes, and behavior of complex systems present particular challenges. Learners must master not only the individual components of the system or process (structure) but also the interactions and mechanisms (function), which may be complex and frequently invisible. If the phenomena are macroscopic, sub-microscopic, or abstract, there is an additional level of difficulty. Although the teaching of STEM phenomena typically relies on visualizations, such as pictures, graphs, and diagrams, learning is typically revealed in words, both spoken and written. Visualizations have many advantages over verbal explanations for teaching; can creating visual explanations promote learning?

Learning from visual representations in STEM

Given the inherent challenges in teaching and learning complex or invisible processes in science, educators have developed ways of representing these processes to enable and enhance student understanding. External visual representations, including diagrams, photographs, illustrations, flow charts, and graphs, are often used in science to both illustrate and explain concepts (e.g., Hegarty, Carpenter, & Just, 1990 ; Mayer, 1989 ). Visualizations can directly represent many structural and behavioral properties. They also help to draw inferences (Larkin & Simon, 1987 ), find routes in maps (Levine, 1982 ), spot trends in graphs (Kessell & Tversky, 2011 ; Zacks & Tversky, 1999 ), imagine traffic flow or seasonal changes in light from architectural sketches (e.g. Tversky & Suwa, 2009 ), and determine the consequences of movements of gears and pulleys in mechanical systems (e.g. Hegarty & Just, 1993 ; Hegarty, Kriz, & Cate, 2003 ). The use of visual elements such as arrows is another benefit to learning with visualizations. Arrows are widely produced and comprehended as representing a range of kinds of forces as well as changes over time (e.g. Heiser & Tversky, 2002 ; Tversky, Heiser, MacKenzie, Lozano, & Morrison, 2007 ). Visualizations are thus readily able to depict the parts and configurations of systems; presenting the same content via language may be more difficult. Although words can describe spatial properties, because the correspondences of meaning to language are purely symbolic, comprehension and construction of mental representations from descriptions is far more effortful and error prone (e.g. Glenberg & Langston, 1992 ; Hegarty & Just, 1993 ; Larkin & Simon, 1987 ; Mayer, 1989 ). Given the differences in how visual and verbal information is processed, how learners draw inferences and construct understanding in these two modes warrants further investigation.

Benefits of generating explanations

Learner-generated explanations of scientific phenomena may be an important learning strategy to consider beyond the utility of learning from a provided external visualization. Explanations convey information about concepts or processes with the goal of making clear and comprehensible an idea or set of ideas. Explanations may involve a variety of elements, such as the use of examples and analogies (Roscoe & Chi, 2007 ). When explaining something new, learners may have to think carefully about the relationships between elements in the process and prioritize the multitude of information available to them. Generating explanations may require learners to reorganize their mental models by allowing them to make and refine connections between and among elements and concepts. Explaining may also help learners metacognitively address their own knowledge gaps and misconceptions.

Many studies have shown that learning is enhanced when students are actively engaged in creative, generative activities (e.g. Chi, 2009 ; Hall, Bailey, & Tillman, 1997 ). Generative activities have been shown to benefit comprehension of domains involving invisible components, including electric circuits (Johnson & Mayer, 2010 ) and the chemistry of detergents (Schwamborn, Mayer, Thillmann, Leopold, & Leutner, 2010 ). Wittrock’s ( 1990 ) generative theory stresses the importance of learners actively constructing and developing relationships. Generative activities require learners to select information and choose how to integrate and represent the information in a unified way. When learners make connections between pieces of information, knowledge, and experience, by generating headings, summaries, pictures, and analogies, deeper understanding develops.

The information learners draw upon to construct their explanations is likely important. For example, Ainsworth and Loizou ( 2003 ) found that asking participants to self-explain with a diagram resulted in greater learning than self-explaining from text. How might learners explain with physical mechanisms or materials with multi-modal information?

Generating visual explanations

Learner-generated visualizations have been explored in several domains. Gobert and Clement ( 1999 ) investigated the effectiveness of student-generated diagrams versus student-generated summaries on understanding plate tectonics after reading an expository text. Students who generated diagrams scored significantly higher on a post-test measuring spatial and causal/dynamic content, even though the diagrams contained less domain-related information. Hall et al. ( 1997 ) showed that learners who generated their own illustrations from text performed equally as well as learners provided with text and illustrations. Both groups outperformed learners only provided with text. In a study concerning the law of conservation of energy, participants who generated drawings scored higher on a post-test than participants who wrote their own narrative of the process (Edens & Potter, 2003 ). In addition, the quality and number of concept units present in the drawing/science log correlated with performance on the post-test. Van Meter ( 2001 ) found that drawing while reading a text about Newton’s Laws was more effective than answering prompts in writing.

One aspect to explore is whether visual and verbal productions contain different types of information. Learning advantages for the generation of visualizations could be attributed to learners’ translating across modalities, from a verbal format into a visual format. Translating verbal information from the text into a visual explanation may promote deeper processing of the material and more complete and comprehensive mental models (Craik & Lockhart, 1972 ). Ainsworth and Iacovides ( 2005 ) addressed this issue by asking two groups of learners to self-explain while learning about the circulatory system of the human body. Learners given diagrams were asked to self-explain in writing and learners given text were asked to explain using a diagram. The results showed no overall differences in learning outcomes, however the learners provided text included significantly more information in their diagrams than the other group. Aleven and Koedinger ( 2002 ) argue that explanations are most helpful if they can integrate visual and verbal information. Translating across modalities may serve this purpose, although translating is not necessarily an easy task (Ainsworth, Bibby, & Wood, 2002 ).

It is important to remember that not all studies have found advantages to generating explanations. Wilkin ( 1997 ) found that, when students were presented with text about physical motion and instructed to draw a diagram, directions to self-explain using the diagram hindered understanding. She argues that the diagrams encouraged learners to connect familiar but unrelated knowledge. In particular, “low benefit learners” in her study inappropriately used spatial adjacency and location to connect parts of diagrams, instead of the particular properties of those parts. Wilkin argues that these learners are novices and that experts may not make the same mistake since they have the skills to analyze features of a diagram according to their relevant properties. She also argues that the benefits of self-explaining are highest when the learning activity is constrained so that learners are limited in their possible interpretations. Other studies that have not found a learning advantage from generating drawings have in common an absence of support for the learner (Alesandrini, 1981 ; Leutner, Leopold, & Sumfleth, 2009 ). Another mediating factor may be the learner’s spatial ability.

The role of spatial ability

Spatial thinking involves objects, their size, location, shape, their relation to one another, and how and where they move through space. How then, might learners with different levels of spatial ability gain structural and functional understanding in science and how might this ability affect the utility of learner-generated visual explanations? Several lines of research have sought to explore the role of spatial ability in learning science. Kozhevnikov, Hegarty, and Mayer ( 2002 ) found that low spatial ability participants interpreted graphs as pictures, whereas high spatial ability participants were able to construct more schematic images and manipulate them spatially. Hegarty and Just ( 1993 ) found that the ability to mentally animate mechanical systems correlated with spatial ability, but not verbal ability. In their study, low spatial ability participants made more errors in movement verification tasks. Leutner et al. ( 2009 ) found no effect of spatial ability on the effectiveness of drawing compared to mentally imagining text content. Mayer and Sims ( 1994 ) found that spatial ability played a role in participants’ ability to integrate visual and verbal information presented in an animation. The authors argue that their results can be interpreted within the context of dual-coding theory. They suggest that low spatial ability participants must devote large amounts of cognitive effort into building a visual representation of the system. High spatial ability participants, on the other hand, are more able to allocate sufficient cognitive resources to building referential connections between visual and verbal information.

Benefits of testing

Although not presented that way, creating an explanation could be regarded as a form of testing. Considerable research has documented positive effects of testing on learning. Presumably taking a test requires retrieving and sometimes integrating the learned material and those processes can augment learning without additional teaching or study (e.g. Roediger & Karpicke, 2006 ; Roediger, Putnam, & Smith, 2011 ; Wheeler & Roediger, 1992 ). Hausmann and Vanlehn ( 2007 ) addressed the possibility that generating explanations is beneficial because learners merely spend more time with the content material than learners who are not required to generate an explanation. In their study, they compared the effects of using instructions to self-explain with instructions to merely paraphrase physics (electrodynamics) material. Attending to provided explanations by paraphrasing was not as effective as generating explanations as evidenced by retention scores on an exam 29 days after the experiment and transfer scores within and across domains. Their study concludes, “the important variable for learning was the process of producing an explanation” (p. 423). Thus, we expect benefits from creating either kind of explanation but for the reasons outlined previously, we expect larger benefits from creating visual explanations.

Present experiments

This study set out to answer a number of related questions about the role of learner-generated explanations in learning and understanding of invisible processes. (1) Do students learn more when they generate visual or verbal explanations? We anticipate that learning will be greater with the creation of visual explanations, as they encourage completeness and the integration of structure and function. (2) Does the inclusion of structural and functional information correlate with learning as measured by a post-test? We predict that including greater counts of information, particularly invisible and functional information, will positively correlate with higher post-test scores. (3) Does spatial ability predict the inclusion of structural and functional information in explanations, and does spatial ability predict post-test scores? We predict that high spatial ability participants will include more information in their explanations, and will score higher on post-tests.

Experiment 1

The first experiment examines the effects of creating visual or verbal explanations on the comprehension of a bicycle tire pump’s operation in participants with low and high spatial ability. Although the pump itself is not invisible, the components crucial to its function, notably the inlet and outlet valves, and the movement of air, are located inside the pump. It was predicted that visual explanations would include more information than verbal explanations, particularly structural information, since their construction encourages completeness and the production of a whole mechanical system. It was also predicted that functional information would be biased towards a verbal format, since much of the function of the pump is hidden and difficult to express in pictures. Finally, it was predicted that high spatial ability participants would be able to produce more complete explanations and would thus also demonstrate better performance on the post-test. Explanations were coded for structural and functional content, essential features, invisible features, arrows, and multiple steps.

Participants

Participants were 127 (59 female) seventh and eighth grade students, aged 12–14 years, enrolled in an independent school in New York City. The school’s student body is 70% white, 30% other ethnicities. Approximately 25% of the student body receives financial aid. The sample consisted of three class sections of seventh grade students and three class sections of eighth grade students. Both seventh and eighth grade classes were integrated science (earth, life, and physical sciences) and students were not grouped according to ability in any section. Written parental consent was obtained by means of signed informed consent forms. Each participant was randomly assigned to one of two conditions within each class: 64 participants in the visual condition explained the bicycle pump’s function by drawing, and 63 participants in the verbal condition explained the pump’s function by writing.

The materials consisted of a 12-inch Spalding bicycle pump, a blank 8.5 × 11 in. sheet of paper, and a post-test (Additional file 1 ). The pump’s chamber and hose were made of clear plastic; the handle and piston were black plastic. The parts of the pump (e.g. inlet valve, piston) were labeled.

Spatial ability was assessed using the Vandenberg and Kuse ( 1978 ) mental rotation test (MRT). The MRT is a 20-item test in which two-dimensional drawings of three-dimensional objects are compared. Each item consists of one “target” drawing and four drawings that are to be compared to the target. Two of the four drawings are rotated versions of the target drawing and the other two are not. The task is to identify the two rotated versions of the target. A score was determined by assigning one point to each question if both of the correct rotated versions were chosen. The maximum score was 20 points.
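The MRT scoring rule described above — one point per item only when both correct rotated alternatives are chosen — can be sketched as a small function. This is an illustrative sketch, not the authors' code; the tuple-based response format is a hypothetical representation.

```python
def score_mrt(responses, answer_key):
    """Score a Vandenberg & Kuse-style mental rotation test.

    Each element of `responses` and `answer_key` is a pair of the two
    alternatives chosen/correct for one item (hypothetical format).
    One point is awarded only if BOTH correct rotated versions were chosen,
    so the maximum score equals the number of items (20 in the MRT).
    """
    score = 0
    for chosen, correct in zip(responses, answer_key):
        if set(chosen) == set(correct):
            score += 1
    return score
```

For example, a participant who selects both correct alternatives on one item but only one of the two on another earns a single point.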

The post-test consisted of 16 true/false questions printed on a single sheet of paper measuring 8.5 × 11 in. Half of the questions related to the structure of the pump and the other half related to its function. The questions were adapted from Heiser and Tversky ( 2002 ) in order to be clear and comprehensible for this age group.

The experiment was conducted over the course of two non-consecutive days during the normal school day and during regularly scheduled class time. On the first day, participants completed the MRT as a whole-class activity. After completing an untimed practice test, they were given 3 min for each of the two parts of the MRT. On the second day, occurring between two and four days after completing the MRT, participants were individually asked to study an actual bicycle tire pump and were then asked to generate explanations of its function. The participants were tested individually in a quiet room away from the rest of the class. In addition to the pump, each participant was given one instruction sheet and one blank sheet of paper for their explanations. The post-test was given upon completion of the explanation. The instruction sheet was read aloud to participants and they were instructed to read along. The first set of instructions was as follows: “A bicycle pump is a mechanical device that pumps air into bicycle tires. First, take this bicycle pump and try to understand how it works. Spend as much time as you need to understand the pump.” The next set of instructions differed for participants in each condition. The instructions for the visual condition were as follows: “Then, we would like you to draw your own diagram or set of diagrams that explain how the bike pump works. Draw your explanation so that someone else who has not seen the pump could understand the bike pump from your explanation. Don’t worry about the artistic quality of the diagrams; in fact, if something is hard for you to draw, you can explain what you would draw. What’s important is that the explanation should be primarily visual, in a diagram or diagrams.” The instructions for the verbal condition were as follows: “Then, we would like you to write an explanation of how the bike pump works. 
Write your explanation so that someone else who has not seen the pump could understand the bike pump from your explanation.” All participants then received these instructions: “You may not use the pump while you create your explanations. Please return it to me when you are ready to begin your explanation. When you are finished with the explanation, you will hand in your explanation to me and I will then give you 16 true/false questions about the bike pump. You will not be able to look at your explanation while you complete the questions.” Study and test were untimed. All students finished within the 45-min class period.

Spatial ability

The mean score on the MRT was 10.56, with a median of 11. Boys scored significantly higher (M = 13.5, SD = 4.4) than girls (M = 8.8, SD = 4.5), F(1, 126) = 19.07, p  < 0.01, a typical finding (Voyer, Voyer, & Bryden, 1995 ). Participants were split into high or low spatial ability by the median. Low and high spatial ability participants were equally distributed in the visual and verbal groups.
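The median split used above can be sketched as follows. This is an illustrative sketch only; the paper does not specify how scores exactly at the median were assigned, so the tie-handling rule here (at-or-below counts as low) is an assumption.

```python
from statistics import median

def median_split(scores):
    """Split participants into low/high spatial ability at the sample median.

    Tie handling is an assumption: scores at or below the median are
    classified as 'low'; the paper does not state how ties were resolved.
    """
    m = median(scores)
    return ["low" if s <= m else "high" for s in scores]
```

With the reported median of 11, a participant scoring 11 would fall in the low group under this rule.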

Learning outcomes

It was predicted that high spatial ability participants would be better able to mentally animate the bicycle pump system and therefore score higher on the post-test and that post-test scores would be higher for those who created visual explanations. Table  1 shows the scores on the post-test by condition and spatial ability. A two-way factorial ANOVA revealed a marginally significant main effect of spatial ability, F(1, 124) = 3.680, p  = 0.06, with high spatial ability participants scoring higher on the post-test. There was also a significant interaction between spatial ability and explanation type, F(1, 124) = 4.094, p  < 0.05, see Fig.  1 . Creating a visual explanation of the bicycle pump selectively helped low spatial participants.

Post-test scores, by explanation type and spatial ability

                        Visual          Verbal          Total
Spatial ability       Mean    SD      Mean    SD      Mean    SD
Low                   11.45   1.93    9.75    2.31    10.60   2.27
High                  11.20   1.47    11.60   1.80    11.42   1.65
Total                 11.3    1.71    10.74   2.23
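The 2 × 2 factorial ANOVA reported here can be computed by hand from cell data. The sketch below, a minimal illustration in pure Python for a balanced design (the study's cell sizes were only approximately equal, so this is a simplification), returns the F ratios for the two main effects and the interaction.

```python
def two_way_anova(cells):
    """Two-way factorial ANOVA for a BALANCED design (equal n per cell).

    `cells[(a, b)]` is the list of scores for one cell of the A x B design.
    Returns F ratios for factor A, factor B, and the A x B interaction.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))          # per-cell n (assumed equal)
    all_scores = [x for v in cells.values() for x in v]
    grand = sum(all_scores) / len(all_scores)

    def mean(xs):
        return sum(xs) / len(xs)

    # Sums of squares: main effects from marginal means, interaction as the
    # between-cells variation left over after the main effects.
    ss_a = n * len(b_levels) * sum(
        (mean([x for (a, b), v in cells.items() if a == al for x in v]) - grand) ** 2
        for al in a_levels)
    ss_b = n * len(a_levels) * sum(
        (mean([x for (a, b), v in cells.items() if b == bl for x in v]) - grand) ** 2
        for bl in b_levels)
    ss_cells = n * sum((mean(v) - grand) ** 2 for v in cells.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_within = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)

    df_a = len(a_levels) - 1
    df_b = len(b_levels) - 1
    df_ab = df_a * df_b
    df_within = len(all_scores) - len(cells)
    ms_within = ss_within / df_within
    return {"F_A": (ss_a / df_a) / ms_within,
            "F_B": (ss_b / df_b) / ms_within,
            "F_AB": (ss_ab / df_ab) / ms_within}
```

A large interaction F with small main-effect Fs would correspond to the pattern in Fig. 1, where the visual condition helps only the low spatial ability group.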

Fig. 1. Scores on the post-test by condition and spatial ability

Coding explanations

Explanations (see Fig.  2 ) were coded for structural and functional content, essential features, invisible features, arrows, and multiple steps. A subset of the explanations (20%) was coded by the first author and another researcher using the same coding system as a guide. The agreement between scores was above 90% for all measures. Disagreements were resolved through discussion. The first author then scored the remaining explanations.
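The inter-rater reliability check described above (two coders scoring a 20% subset, with agreement above 90% on all measures) corresponds to simple percent agreement, which can be sketched as follows. This is an illustrative sketch; the paper does not report the exact computation used.

```python
def percent_agreement(rater1, rater2):
    """Percent agreement between two coders who scored the same items.

    `rater1` and `rater2` are parallel lists of codes, one entry per item.
    Returns agreement as a percentage (0-100).
    """
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("raters must score the same non-empty item set")
    matches = sum(a == b for a, b in zip(rater1, rater2))
    return 100.0 * matches / len(rater1)
```

Percent agreement is the simplest reliability index; it does not correct for chance agreement the way Cohen's kappa does.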

Fig. 2. Examples of visual and verbal explanations of the bicycle pump

Coding for structure and function

A maximum score of 12 points was awarded for the inclusion and labeling of six structural components: chamber, piston, inlet valve, outlet valve, handle, and hose. For the visual explanations, 1 point was given for a component drawn correctly and 1 additional point if the component was labeled correctly. For verbal explanations, sentences were divided into propositions, the smallest unit of meaning in a sentence. Descriptions of structural location, e.g., "at the end of the piston is the inlet valve," or of features of the components, e.g., the shape of a part, counted as structural components. Information was coded as functional if it depicted (typically with an arrow) or described the function/movement of an individual part, or the way multiple parts interact. No explanation contained more than ten functional units.

Visual explanations contained significantly more structural components (M = 6.05, SD = 2.76) than verbal explanations (M = 4.27, SD = 1.54), F(1, 126) = 20.53, p < 0.05. The number of functional components did not differ between visual and verbal explanations, as displayed in Figs. 3 and 4. Many visual explanations (67%) contained verbal components, so the structural and functional information in explanations was also coded as depictive or descriptive. Structural and functional information were equally likely to be expressed in words or pictures in visual explanations. It was predicted that explanations created by high spatial participants would include more functional information. However, there were no significant differences between low spatial (M = 5.15, SD = 2.21) and high spatial (M = 4.62, SD = 2.16) participants in the number of structural units, or between low spatial (M = 3.83, SD = 2.51) and high spatial (M = 4.10, SD = 2.13) participants in the number of functional units.

Fig. 3. Average number of structural and functional components in visual and verbal explanations

Fig. 4. Visual and verbal explanations of chemical bonding

Coding of essential features

To further establish a relationship between the explanations generated and outcomes on the post-test, explanations were also coded for the inclusion of information essential to the pump's function according to a 4-point scale (adapted from Hall et al., 1997). One point was given if both the inlet and the outlet valve were clearly present in the drawing or described in writing, 1 point if the piston inserted into the chamber was shown or described to be airtight, and 1 point for each of the two valves shown or described to be opening/closing in the correct direction.
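
The 4-point essential-features scale can likewise be expressed as a checklist score. The feature names below are hypothetical labels invented here for the four criteria in the text, not terms from the paper.

```python
def essential_features_score(expl):
    """Score an explanation on the 4-point essential-features scale.

    Adapted from the scheme described in the text (after Hall et al., 1997).
    `expl` is a dict of booleans keyed by hypothetical feature names.
    """
    features = [
        "both_valves_present",     # inlet and outlet valve shown/described
        "piston_airtight",         # piston-chamber seal shown/described
        "inlet_valve_direction",   # inlet valve opens/closes correctly
        "outlet_valve_direction",  # outlet valve opens/closes correctly
    ]
    return sum(int(bool(expl.get(f, False))) for f in features)

# Hypothetical coding: valves present and an airtight seal, nothing else.
print(essential_features_score({"both_valves_present": True,
                                "piston_airtight": True}))  # → 2
```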

Visual explanations contained significantly more essential information (M = 1.78, SD = 1.0) than verbal explanations (M = 1.20, SD = 1.21), F(1, 126) = 7.63, p < 0.05. Inclusion of essential features correlated positively with post-test scores, r = 0.197, p < 0.05.

Coding arrows and multiple steps

For the visual explanations, three uses of arrows were coded and tallied: labeling a part or action, showing motion, or indicating sequence. Analysis of visual explanations revealed that 87% contained arrows. No significant differences were found between low and high spatial participants' use of arrows to label, and no significant correlations were found between the use of arrows and learning outcomes measured on the post-test.

The explanations were coded for the number of discrete steps used to explain the process of using the bike pump. The number of steps used by participants ranged from one to six. Participants whose explanations, whether verbal or visual, contained multiple steps scored significantly higher (M = 0.76, SD = 0.18) on the post-test than participants whose explanations consisted of a single step (M = 0.67, SD = 0.19), F(1, 126) = 5.02, p  < 0.05.

Coding invisible features

The bicycle tire pump, like many mechanical devices, contains several structural features that are hidden or invisible and must be inferred from the function of the pump. For the bicycle pump the invisible features are the inlet and outlet valves and the three phases of movement of air: entering the pump, moving through the pump, and exiting the pump. Each feature received 1 point, for a total of 5 possible points.

The mean score for the inclusion of invisible features was 3.26, SD = 1.25. The data were analyzed using linear regression and revealed that the total score for invisible parts significantly predicted scores on the post-test, F(1, 118) = 3.80, p  = 0.05.

In the first experiment, students learned the workings of a bicycle pump from interacting with an actual pump and creating a visual or verbal explanation of its function. Understanding the functionality of a bike pump depends on the actions and consequences of parts that are not visible. Overall, the results provide support for the use of learner-generated visual explanations in developing understanding of a new scientific system. The results show that low spatial ability participants were able to learn as successfully as high spatial ability participants when they first generated an explanation in a visual format.

Visual explanations may have led to greater understanding for a number of reasons. As discussed previously, visual explanations encourage completeness. They force learners to decide on the size, shape, and location of parts/objects. Understanding the "hidden" function of the invisible parts is key to understanding the function of the entire system and requires an understanding of how both the visible and invisible parts interact. The visual format may have been able to elicit components and concepts that are invisible and difficult to integrate into the formation of a mental model. The results show that including more of the essential features and showing multiple steps correlated with superior test performance. Understanding the bicycle pump requires understanding how all of these components are connected through movement, force, and function. Many (67%) of the visual explanations also contained written components. Arguably, some types of information may be difficult to depict visually, and verbal language has many possibilities that allow for specificity. The inclusion of text as a complement to visual explanations may be key to the success of learner-generated explanations and the development of understanding.

A limitation of this experiment is that participants were not provided with detailed instructions for completing their explanations. In addition, this experiment does not fully clarify the role of spatial ability, since high spatial participants in the visual and verbal groups demonstrated equivalent knowledge of the pump on the post-test. One possibility is that the interaction with the bicycle pump prior to generating explanations was a sufficient learning experience for the high spatial participants. Other researchers (e.g. Flick, 1993 ) have shown that hands-on interactive experiences can be effective learning situations. High spatial ability participants may be better able to imagine the movement and function of a system (e.g. Hegarty, 1992 ).

Experiment 1 examined learning a mechanical system with invisible (hidden) parts. Participants were introduced to the system by being able to interact with an actual bicycle pump. While we did not assess participants’ prior knowledge of the pump with a pre-test, participants were randomly assigned to each condition. The findings have promising implications for teaching. Creating visual explanations should be an effective way to improve performance, especially in low spatial students. Instructors can guide the creation of visual explanations toward the features that augment learning. For example, students can be encouraged to show every step and action and to focus on the essential parts, even if invisible. The coding system shows that visual explanations can be objectively evaluated to provide feedback on students’ understanding. The utility of visual explanations may differ for scientific phenomena that are more abstract, or contain elements that are invisible due to their scale. Experiment 2 addresses this possibility by examining a sub-microscopic area of science: chemical bonding.

Experiment 2

In this experiment, we examine visual and verbal explanations in an area of chemistry: ionic and covalent bonding. Chemistry is often regarded as a difficult subject; one of the essential or inherent features of chemistry which presents difficulty is the interplay between the macroscopic, sub-microscopic, and representational levels (e.g. Bradley & Brand, 1985 ; Johnstone, 1991 ; Taber, 1997 ). In chemical bonding, invisible components engage in complex processes whose scale makes them impossible to observe. Chemists routinely use visual representations to investigate relationships and move between the observable, physical level and the invisible particulate level (Kozma, Chin, Russell, & Marx, 2002 ). Generating explanations in a visual format may be a particularly useful learning tool for this domain.

For this topic, we expect that creating a visual rather than verbal explanation will aid students of both high and low spatial abilities. Visual explanations demand completeness; they were predicted to include more information than verbal explanations, particularly structural information. The inclusion of functional information should lead to better performance on the post-test since understanding how and why atoms bond is crucial to understanding the process. Participants with high spatial ability may be better able to explain function since the sub-microscopic nature of bonding requires mentally imagining invisible particles and how they interact. This experiment also asks whether creating an explanation per se can increase learning in the absence of additional teaching by administering two post-tests of knowledge, one immediately following instruction but before creating an explanation and one after creating an explanation. The scores on this immediate post-test were used to confirm that the visual and verbal groups were equivalent prior to the generation of explanations. Explanations were coded for structural and functional information, arrows, specific examples, and multiple representations. Do the acts of selecting, integrating, and explaining knowledge serve learning even in the absence of further study or teaching?

Participants were 126 (58 female) eighth grade students, aged 13–14 years, with written parental consent and enrolled in the same independent school described in Experiment 1. None of the students previously participated in Experiment 1. As in Experiment 1, randomization occurred within-class, with participants assigned to either the visual or verbal explanation condition.

The materials consisted of the MRT (same as Experiment 1), a video lesson on chemical bonding, two versions of the instructions, the immediate post-test, the delayed post-test, and a blank page for the explanations. All paper materials were typed on 8.5 × 11 in. sheets of paper. Both immediate and delayed post-tests consisted of seven multiple-choice items and three free-response items. The video lesson on chemical bonding consisted of a video that was 13 min 22 s. The video began with a brief review of atoms and their structure and introduced the idea that atoms combine to form molecules. Next, the lesson showed that location in the periodic table reveals the behavior and reactivity of atoms, in particular the gain, loss, or sharing of electrons. Examples of atoms, their valence shell structure, stability, charges, transfer and sharing of electrons, and the formation of ionic, covalent, and polar covalent bonds were discussed. The example of NaCl (table salt) was used to illustrate ionic bonding and the examples of O 2 and H 2 O (water) were used to illustrate covalent bonding. Information was presented verbally, accompanied by drawings, written notes of keywords and terms, and a color-coded periodic table.

On the first of three non-consecutive school days, participants completed the MRT as a whole-class activity. On the second day (occurring between two and three days after completing the MRT), participants viewed the recorded lesson on chemical bonding. They were instructed to pay close attention to the material but were not allowed to take notes. Immediately following the video, participants had 20 min to complete the immediate post-test; all finished within this time frame. On the third day (occurring on the next school day after viewing the video and completing the immediate post-test), the participants were randomly assigned to either the visual or verbal explanation condition. The typed instructions were given to participants along with a blank 8.5 × 11 in. sheet of paper for their explanations. The instructions differed for each condition. For the visual condition, the instructions were as follows: “You have just finished learning about chemical bonding. On the next piece of paper, draw an explanation of how atoms bond and how ionic and covalent bonds differ. Draw your explanation so that another student your age who has never studied this topic will be able to understand it. Be as clear and complete as possible, and remember to use pictures/diagrams only. After you complete your explanation, you will be asked to answer a series of questions about bonding.”

For the verbal condition the instructions were: “You have just finished learning about chemical bonding. On the next piece of paper, write an explanation of how atoms bond and how ionic and covalent bonds differ. Write your explanation so that another student your age who has never studied this topic will be able to understand it. Be as clear and complete as possible. After you complete your explanation, you will be asked to answer a series of questions about bonding.”

Participants were instructed to read the instructions carefully before beginning the task. The participants completed their explanations as a whole-class activity. Participants were given unlimited time to complete their explanations. Upon completion of their explanations, participants were asked to complete the ten-question delayed post-test (comparable to but different from the first) and were given a maximum of 20 min to do so. All participants completed their explanations as well as the post-test during the 45-min class period.

The mean score on the MRT was 10.39, with a median of 11. Boys (M = 12.5, SD = 4.8) scored significantly higher than girls (M = 8.0, SD = 4.0), F(1, 125) = 24.49, p  < 0.01. Participants were split into low and high spatial ability based on the median.

The maximum score for both the immediate and delayed post-test was 10 points. A repeated measures ANOVA showed that the difference between the immediate post-test scores (M = 4.63, SD = 0.469) and delayed post-test scores (M = 7.04, SD = 0.299) was statistically significant, F(1, 125) = 18.501, p < 0.05. Without any further instruction, scores increased following the generation of a visual or verbal explanation. Both groups improved significantly: those who created visual explanations (M = 8.22, SD = 0.208), F(1, 125) = 51.24, p < 0.01, Cohen's d = 1.27, as well as those who created verbal explanations (M = 6.31, SD = 0.273), F(1, 125) = 15.796, p < 0.05, Cohen's d = 0.71. As seen in Fig. 5, participants who generated visual explanations (M = 0.822, SD = 0.208) scored considerably higher on the delayed post-test than participants who generated verbal explanations (M = 0.631, SD = 0.273), F(1, 125) = 19.707, p < 0.01, Cohen's d = 0.88. In addition, high spatial participants (M = 0.824, SD = 0.273) scored significantly higher than low spatial participants (M = 0.636, SD = 0.207), F(1, 125) = 19.94, p < 0.01, Cohen's d = 0.87. The interaction between explanation type and spatial ability was not significant.
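
The effect sizes reported here are Cohen's d; a pooled-standard-deviation version of the formula is sketched below. Which variant the authors used is not stated, and the group sizes and values in the example are assumptions for illustration, not the study's figures.

```python
from math import sqrt

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d using a pooled standard deviation.

    One common variant; the paper does not state which formula it used.
    """
    pooled = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2)
                  / (n1 + n2 - 2))
    return (m1 - m2) / pooled

# Hypothetical groups with equal SDs: the pooled SD equals that SD, so d
# is simply the mean difference expressed in SD units.
print(round(cohens_d(0.822, 0.25, 64, 0.631, 0.25, 64), 3))  # → 0.764
```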

Fig. 5. Scores on the post-tests by explanation type and spatial ability

Explanations were coded for structural and functional content, arrows, specific examples, and multiple representations. A subset of the explanations (20%) was coded by both the first author and a middle school science teacher with expertise in chemistry. Both scorers used the same coding system as a guide. Agreement between scores was above 90% for all measures. The first author then scored the remainder of the explanations. As evident from Fig. 4, the visual explanations were individual inventions; they neither resembled each other nor those used in teaching. Most contained language, especially labels and symbolic notation such as NaCl.

Structure, function, and modality

Visual and verbal explanations were coded for depicting or describing structural and functional components. The structural components included the following: the correct number of valence electrons, the correct charges of atoms, the bonds between non-metals for covalent molecules and between a metal and non-metal for ionic molecules, the crystalline structure of ionic molecules, and that covalent bonds were individual molecules. The functional components included the following: transfer of electrons in ionic bonds, sharing of electrons in covalent bonds, attraction between ions of opposite charge, bonding resulting in atoms with neutral charge and stable electron shell configurations, and outcome of bonding shows molecules with overall neutral charge. The presence of each component was awarded 1 point; the maximum possible points was 5 for structural and 5 for functional information. The modality, visual or verbal, of each component was also coded; if the information was given in both formats, both were coded.

As displayed in Fig.  6 , visual explanations contained a significantly greater number of structural components (M = 2.81, SD = 1.56) than verbal explanations (M = 1.30, SD = 1.54), F(1, 125) = 13.69, p  < 0.05. There were no differences between verbal and visual explanations in the number of functional components. Structural information was more likely to be depicted (M = 3.38, SD = 1.49) than described (M = 0.429, SD = 1.03), F(1, 62) = 21.49, p  < 0.05, but functional information was equally likely to be depicted (M = 1.86, SD = 1.10) or described (M = 1.71, SD = 1.87).

Fig. 6

Functional information expressed verbally in the visual explanations significantly predicted scores on the post-test, F(1, 62) = 21.603, p < 0.01, while functional information in verbal explanations did not. The inclusion of structural information did not significantly predict test scores. As seen in Fig. 7, explanations created by high spatial participants contained significantly more functional components, F(1, 125) = 7.13, p < 0.05, but there were no differences between ability groups in the amount of structural information in either visual or verbal explanations.

Fig. 7. Average number of structural and functional components created by low and high spatial ability learners

Ninety-two percent of visual explanations contained arrows. Arrows were used to indicate motion as well as to label. The use of arrows was positively correlated with scores on the post-test, r = 0.293, p  < 0.05. There were no significant differences in the use of arrows between low and high spatial participants.

Specific examples

Explanations were coded for the use of specific examples, such as NaCl to illustrate ionic bonding and CO2 and O2 to illustrate covalent bonding. High spatial participants (M = 1.6, SD = 0.69) used specific examples in their verbal and visual explanations more often than low spatial participants (M = 1.07, SD = 0.79), a marginally significant effect, F(1, 125) = 3.65, p = 0.06. Visual and verbal explanations did not differ in the presence of specific examples. The inclusion of a specific example was positively correlated with delayed test scores, r = 0.555, p < 0.05.

Use of multiple representations

Many of the explanations (65%) contained multiple representations of bonding. For example, ionic bonding and its properties can be represented at the level of individual atoms or at the level of many atoms bonded together in a crystalline compound. The representations coded were: symbolic (e.g. NaCl), atomic (showing the structure of the atoms), and macroscopic (visible). Participants who created visual explanations generated significantly more representations (M = 1.79, SD = 1.20) than those who created verbal explanations (M = 1.33, SD = 0.48), F(1, 125) = 6.03, p < 0.05. However, the use of multiple representations did not significantly correlate with delayed post-test scores.

Metaphoric explanations

Although there were too few examples to be included in the statistical analyses, some participants in the visual group created explanations that used metaphors and/or analogies to illustrate the differences between the types of bonding. Figure  4 shows examples of metaphoric explanations. In one example, two stick figures are used to show “transfer” and “sharing” of an object between people. In another, two sharks are used to represent sodium and chlorine, and the transfer of fish instead of electrons.

In the second experiment, students were introduced to chemical bonding, a more abstract and complex set of phenomena than the bicycle pump used in the first experiment. Students were tested immediately after instruction. The following day, half the students created visual explanations and half created verbal explanations. Following creation of the explanations, students were tested again, with different questions. Performance was considerably higher as a consequence of creating either explanation despite the absence of new teaching. Generating an explanation in this way could be regarded as a test of learning. Seen this way, the results echo and amplify previous research showing the advantages of testing over study (e.g. Roediger et al., 2011 ; Roediger & Karpicke, 2006 ; Wheeler & Roediger, 1992 ). Specifically, creating an explanation requires selecting the crucial information, integrating it temporally and causally, and expressing it clearly, processes that seem to augment learning and understanding without additional teaching. Importantly, creating a visual explanation gave an extra boost to learning outcomes over and above the gains provided by creating a verbal explanation. This is most likely due to the directness of mapping complex systems to a visual-spatial format, a format that can also provide a natural check for completeness and coherence as well as a platform for inference. In the case of this more abstract and complex material, generating a visual explanation benefited both low spatial and high spatial participants even if it did not bring low spatial participants up to the level of high spatial participants as for the bicycle pump.

Participants high in spatial ability not only scored better, they also generated better explanations, including more of the information that predicted learning: more functional information, particularly in their visual explanations, and more specific examples.

As in Experiment 1, qualities of the explanations predicted learning outcomes. Including more arrows, typically used to indicate function, predicted delayed test scores as did articulating more functional information in words in visual explanations. Including more specific examples in both types of explanation also improved learning outcomes. These are all indications of deeper understanding of the processes, primarily expressed in the visual explanations. As before, these findings provide ways that educators can guide students to craft better visual explanations and augment learning.

General discussion

Two experiments examined how learner-generated explanations, particularly visual explanations, can be used to increase understanding in scientific domains, notably those that contain “invisible” components. It was proposed that visual explanations would be more effective than verbal explanations because they encourage completeness and coherence, are more explicit, and are typically multimodal. These two experiments differ meaningfully from previous studies in that the information selected for drawing was not taken from a written text, but from a physical object (bicycle pump) and a class lesson with multiple representations (chemical bonding).

The results show that creating an explanation of a STEM phenomenon benefits learning, even when the explanations are created after learning and in the absence of new instruction. These gains in performance in the absence of teaching bear similarities to recent research showing gains in learning from testing in the absence of new instruction (e.g. Roediger et al., 2011 ; Roediger & Karpicke, 2006 ; Wheeler & Roediger, 1992 ). Many researchers have argued that the retrieval of information required during testing strengthens or enhances the retrieval process itself. Formulating explanations may be an especially effective form of testing for post-instruction learning. Creating an explanation of a complex system requires the retrieval of critical information and then the integration of that information into a coherent and plausible account. Other factors, such as the timing of the creation of the explanations, and whether feedback is provided to students, should help clarify the benefits of generating explanations and how they may be seen as a form of testing. There may even be additional benefits to learners, including increasing their engagement and motivation in school, and increasing their communication and reasoning skills (Ainsworth, Prain, & Tytler, 2011 ). Formulating a visual explanation draws upon students’ creativity and imagination as they actively create their own product.

As in previous research, students with high spatial ability both produced better explanations and performed better on tests of learning (e.g. Uttal et al., 2013). The visual explanations of high spatial students contained more information and more of the information that predicts learning outcomes. For the workings of a bicycle pump, creating a visual as opposed to verbal explanation had little impact on students of high spatial ability but brought students of lower spatial ability up to the level of students with high spatial abilities. For the more difficult set of concepts, chemical bonding, creating a visual explanation led to much larger gains than creating a verbal one for students both high and low in spatial ability. It is likely a mistake to assume that low and high spatial learners will remain that way; there is evidence that spatial ability develops with experience (Baenninger & Newcombe, 1989). It is possible that low spatial learners need more support in constructing explanations that require imagining the movement and manipulation of objects in space. Students learned the function of the bike pump by examining an actual pump and learned bonding through a video presentation. Future work to investigate methods of presenting material to students may also help to clarify the utility of generating explanations.

Creating visual explanations had greater benefits than those accruing from creating verbal ones. Surely some of the effectiveness of visual explanations is because they represent and communicate more directly than language. Elements of a complex system can be depicted and arrayed spatially to reflect actual or metaphoric spatial configurations of the system parts. They also allow, indeed, encourage, the use of well-honed spatial inferences to substitute for and support abstract inferences (e.g. Larkin & Simon, 1987 ; Tversky, 2011 ). As noted, visual explanations provide checks for completeness and coherence, that is, verification that all the necessary elements of the system are represented and that they work together properly to produce the outcomes of the processes. Visual explanations also provide a concrete reference for making and checking inferences about the behavior, causality, and function of the system. Thus, creating a visual explanation facilitates the selection and integration of information underlying learning even more than creating a verbal explanation.

Creating visual explanations appears to be an underused method of supporting and evaluating students’ understanding of dynamic processes. Two obstacles to using visual explanations in classrooms seem to be developing guidelines for creating visual explanations and developing objective scoring systems for evaluating them. The present findings give insights into both. Creating a complete and coherent visual explanation entails selecting the essential components and linking them by behavior, process, or causality. This structure and organization is familiar from recipes or construction sets: first the ingredients or parts, then the sequence of actions. It is also the ingredients of theater or stories: the players and their actions. In fact, the creation of visual explanations can be practiced on these more familiar cases and then applied to new ones in other domains. Deconstructing and reconstructing knowledge and information in these ways has more generality than visual explanations: these techniques of analysis serve thought and provide skills and tools that underlie creative thought. Next, we have shown that objective scoring systems can be devised, beginning with separating the information into structure and function, then further decomposing the structure into the central parts or actors and the function into the qualities of the sequence of actions and their consequences. Assessing students’ prior knowledge and misconceptions can also easily be accomplished by having students create explanations at different times in a unit of study. Teachers can see how their students’ ideas change and if students can apply their understanding by analyzing visual explanations as a culminating activity.

Creating visual explanations of a range of phenomena should be an effective way to augment students' spatial thinking skills, thereby increasing the effectiveness of these explanations as spatial ability increases. The proverbial reading, writing, and arithmetic are routinely regarded as the basic curriculum of school learning and teaching. Spatial skills are not typically taught in schools, but should be: these skills can be learned and are essential to functioning in the contemporary and future world (see Uttal et al., 2013). In our lives, both daily and professional, we need to understand the maps, charts, diagrams, and graphs that appear in the media and public places, in our apps and appliances, in forms we complete, and in equipment we operate. In particular, spatial thinking underlies the skills needed for professional and amateur understanding in STEM fields, and knowledge and understanding of STEM concepts are increasingly required in fields not traditionally regarded as STEM, notably the largest employers, business and service.

This research has shown that creating visual explanations has clear benefits to students, both specific and potentially general. There are also benefits to teachers, specifically, revealing misunderstandings and gaps in knowledge. Visualizations could be used by teachers as a formative assessment tool to guide further instructional activities and scoring rubrics could allow for the identification of specific misconceptions. The bottom line is clear. Creating a visual explanation is an excellent way to learn and master complex systems.

Additional file

Post-tests. (DOC 44 kb)

Acknowledgments

The authors are indebted to the Varieties of Understanding Project at Fordham University and The John Templeton Foundation and to the following National Science Foundation grants for facilitating the research and/or preparing the manuscript: National Science Foundation NSF CHS-1513841, HHC 0905417, IIS-0725223, IIS-0855995, and REC 0440103. We are grateful to James E. Corter for his helpful suggestions and to Felice Frankel for her inspiration. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the funders. Please address correspondence to Barbara Tversky at the Columbia Teachers College, 525 W. 120th St., New York, NY 10025, USA. Email: [email protected].

Authors’ contributions

This research was part of EB’s doctoral dissertation under the advisement of BT. Both authors contributed to the design, analysis, and drafting of the manuscript. Both authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

  • Ainsworth SE, Bibby PA, Wood DJ. Examining the effects of different multiple representational systems in learning primary mathematics. Journal of the Learning Sciences. 2002;11(1):25–62. doi: 10.1207/S15327809JLS1101_2.
  • Ainsworth, S. E., & Iacovides, I. (2005). Learning by constructing self-explanation diagrams. Paper presented at the 11th Biennial Conference of the European Association for Research on Learning and Instruction, Nicosia, Cyprus.
  • Ainsworth SE, Loizou AT. The effects of self-explaining when learning with text or diagrams. Cognitive Science. 2003;27(4):669–681. doi: 10.1207/s15516709cog2704_5.
  • Ainsworth S, Prain V, Tytler R. Drawing to learn in science. Science. 2011;333:1096–1097. doi: 10.1126/science.1204153.
  • Alesandrini KL. Pictorial-verbal and analytic-holistic learning strategies in science learning. Journal of Educational Psychology. 1981;73:358–368. doi: 10.1037/0022-0663.73.3.358.
  • Aleven, V. & Koedinger, K. R. (2002). An effective metacognitive strategy: learning by doing and explaining with a computer-based cognitive tutor. Cognitive Science, 26, 147–179.
  • Baenninger M, Newcombe N. The role of experience in spatial test performance: A meta-analysis. Sex Roles. 1989;20(5–6):327–344. doi: 10.1007/BF00287729.
  • Bradley JD, Brand M. Stamping out misconceptions. Journal of Chemical Education. 1985;62(4):318. doi: 10.1021/ed062p318.
  • Chi MT. Active-Constructive-Interactive: A conceptual framework for differentiating learning activities. Topics in Cognitive Science. 2009;1:73–105. doi: 10.1111/j.1756-8765.2008.01005.x.
  • Chi MTH, DeLeeuw N, Chiu M, LaVancher C. Eliciting self-explanations improves understanding. Cognitive Science. 1994;18:439–477.
  • Craik F, Lockhart R. Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior. 1972;11:671–684. doi: 10.1016/S0022-5371(72)80001-X.
  • Edens KM, Potter E. Using descriptive drawings as a conceptual change strategy in elementary science. School Science and Mathematics. 2003;103(3):135–144. doi: 10.1111/j.1949-8594.2003.tb18230.x.
  • Flick LB. The meanings of hands-on science. Journal of Science Teacher Education. 1993;4:1–8. doi: 10.1007/BF02628851.
  • Glenberg AM, Langston WE. Comprehension of illustrated text: Pictures help to build mental models. Journal of Memory and Language. 1992;31:129–151. doi: 10.1016/0749-596X(92)90008-L.
  • Gobert JD, Clement JJ. Effects of student-generated diagrams versus student-generated summaries on conceptual understanding of causal and dynamic knowledge in plate tectonics. Journal of Research in Science Teaching. 1999;36:39–53. doi: 10.1002/(SICI)1098-2736(199901)36:1<39::AID-TEA4>3.0.CO;2-I.
  • Hall VC, Bailey J, Tillman C. Can student-generated illustrations be worth ten thousand words? Journal of Educational Psychology. 1997;89(4):677–681. doi: 10.1037/0022-0663.89.4.677.
  • Hausmann RGM, Vanlehn K. Explaining self-explaining: A contrast between content and generation. In: Luckin R, Koedinger KR, Greer J, editors. Artificial intelligence in education: Building technology rich learning contexts that work. Amsterdam: Ios Press; 2007. pp. 417–424.
  • Hegarty M. Mental animation: Inferring motion from static displays of mechanical systems. Journal of Experimental Psychology: Learning, Memory & Cognition. 1992;18:1084–1102.
  • Hegarty M, Carpenter PA, Just MA. Diagrams in the comprehension of scientific text. In: Barr R, Kamil MS, Mosenthal P, Pearson PD, editors. Handbook of reading research. New York: Longman; 1990. pp. 641–669.
  • Hegarty M, Just MA. Constructing mental models of machines from text and diagrams. Journal of Memory and Language. 1993;32:717–742. doi: 10.1006/jmla.1993.1036.
  • Hegarty M, Kriz S, Cate C. The roles of mental animations and external animations in understanding mechanical systems. Cognition & Instruction. 2003;21(4):325–360. doi: 10.1207/s1532690xci2104_1.
  • Heiser J, Tversky B. Diagrams and descriptions in acquiring complex systems. Proceedings of the Cognitive Science Society. Hillsdale: Erlbaum; 2002.
  • Hmelo-Silver C, Pfeffer MG. Comparing expert and novice understanding of a complex system from the perspective of structures, behaviors, and functions. Cognitive Science. 2004;28:127–138. doi: 10.1207/s15516709cog2801_7.
  • Johnson CI, Mayer RE. Applying the self-explanation principle to multimedia learning in a computer-based game-like environment. Computers in Human Behavior. 2010;26:1246–1252. doi: 10.1016/j.chb.2010.03.025.
  • Johnstone AH. Why is science difficult to learn? Things are seldom what they seem. Journal of Chemical Education. 1991;61(10):847–849. doi: 10.1021/ed061p847.
  • Kessell AM, Tversky B. Visualizing space, time, and agents: Production, performance, and preference. Cognitive Processing. 2011;12:43–52. doi: 10.1007/s10339-010-0379-3.
  • Kozhevnikov M, Hegarty M, Mayer R. Revising the Visualizer–Verbalizer Dimension: Evidence for two types of visualizers. Cognition & Instruction. 2002;20:37–77. doi: 10.1207/S1532690XCI2001_3.
  • Kozma R, Chin E, Russell J, Marx N. The roles of representations and tools in the chemistry laboratory and their implication for chemistry learning. Journal of the Learning Sciences. 2002;9(2):105–143. doi: 10.1207/s15327809jls0902_1.
  • Larkin J, Simon H. Why a diagram is (sometimes) worth ten thousand words. Cognitive Science. 1987;11:65–100. doi: 10.1111/j.1551-6708.1987.tb00863.x.
  • Leutner D, Leopold C, Sumfleth E. Cognitive load and science text comprehension: Effects of drawing and mentally imagining text content. Computers in Human Behavior. 2009;25:284–289. doi: 10.1016/j.chb.2008.12.010.
  • Levine M. You-are-here maps: Psychological considerations. Environment and Behavior. 1982;14:221–237. doi: 10.1177/0013916584142006.
  • Mayer RE. Systematic thinking fostered by illustrations in scientific text. Journal of Educational Psychology. 1989;81:240–246. doi: 10.1037/0022-0663.81.2.240.
  • Mayer RE, Sims VK. For whom is a picture worth a thousand words? Extensions of a dual-coding theory of multimedia learning. Journal of Educational Psychology. 1994;86(3):389–401. doi: 10.1037/0022-0663.86.3.389.
  • Perkins DN, Grotzer TA. Dimensions of causal understanding: The role of complex causal models in students’ understanding of science. Studies in Science Education. 2005;41:117–166. doi: 10.1080/03057260508560216.
  • Roediger HL, Karpicke JD. Test-enhanced learning: Taking memory tests improves long-term retention. Psychological Science. 2006;17:249–255. doi: 10.1111/j.1467-9280.2006.01693.x.
  • Roediger HL, Putnam AL, Smith MA. Ten benefits of testing and their applications to educational practice. In: Ross BH, editor. The psychology of learning and motivation. New York: Elsevier; 2011. pp. 1–36.
  • Roscoe RD, Chi MTH. Understanding tutor learning: Knowledge-building and knowledge-telling in peer tutors’ explanations and questions. Review of Educational Research. 2007;77:534–574. doi: 10.3102/0034654307309920.
  • Schwamborn A, Mayer RE, Thillmann H, Leopold C, Leutner D. Drawing as a generative activity and drawing as a prognostic activity. Journal of Educational Psychology. 2010;102:872–879. doi: 10.1037/a0019640.
  • Taber KS. Student understanding of ionic bonding: Molecular versus electrostatic framework? School Science Review. 1997;78(285):85–95.
  • Tversky B. Visualizing thought. Topics in Cognitive Science. 2011;3:499–535. doi: 10.1111/j.1756-8765.2010.01113.x.
  • Tversky B, Heiser J, MacKenzie R, Lozano S, Morrison JB. Enriching animations. In: Lowe R, Schnotz W, editors. Learning with animation: Research implications for design. New York: Cambridge University Press; 2007. pp. 263–285.
  • Tversky B, Suwa M. Thinking with sketches. In: Markman AB, Wood KL, editors. Tools for innovation. Oxford: Oxford University Press; 2009. pp. 75–84.
  • Uttal DH, Meadow NG, Tipton E, Hand LL, Alden AR, Warren C, et al. The malleability of spatial skills: A meta-analysis of training studies. Psychological Bulletin. 2013;139:352–402. doi: 10.1037/a0028446.
  • Van Meter P. Drawing construction as a strategy for learning from text. Journal of Educational Psychology. 2001;93(1):129–140. doi: 10.1037/0022-0663.93.1.129.
  • Vandenberg SG, Kuse AR. Mental rotations: A group test of three-dimensional spatial visualization. Perceptual and Motor Skills. 1978;47:599–604. doi: 10.2466/pms.1978.47.2.599.
  • Voyer D, Voyer S, Bryden MP. Magnitude of sex differences in spatial abilities: A meta-analysis and consideration of critical variables. Psychological Bulletin. 1995;117:250–270. doi: 10.1037/0033-2909.117.2.250.
  • Wheeler MA, Roediger HL. Disparate effects of repeated testing: Reconciling Ballard’s (1913) and Bartlett’s (1932) results. Psychological Science. 1992;3:240–245. doi: 10.1111/j.1467-9280.1992.tb00036.x.
  • Wilkin J. Learning from explanations: Diagrams can “inhibit” the self-explanation effect. In: Anderson M, editor. Reasoning with diagrammatic representations II. Menlo Park: AAAI Press; 1997.
  • Wittrock MC. Generative processes of comprehension. Educational Psychologist. 1990;24:345–376. doi: 10.1207/s15326985ep2404_2.
  • Zacks J, Tversky B. Bars and lines: A study of graphic communication. Memory and Cognition. 1999;27:1073–1079. doi: 10.3758/BF03201236.

  • Review Article
  • Published: 22 December 2023

Development of visual object recognition

  • Vladislav Ayzenberg (ORCID: orcid.org/0000-0003-2739-3935) 1,2 &
  • Marlene Behrmann 2,3

Nature Reviews Psychology, volume 3, pages 73–90 (2024)

  • Cognitive neuroscience
  • Object vision

Object recognition is the process by which humans organize the visual world into meaningful perceptual units. In this Review, we examine the developmental origins and maturation of object recognition by synthesizing research from developmental psychology, cognitive neuroscience and computational modelling. We describe the extent to which infants demonstrate early traces of adult visual competencies within their first year. The rapid development of these competencies is supported by infant-specific biological and experiential constraints, including blurry vision and ‘self-curation’ of object viewpoints that best support learning. We also discuss how the neural mechanisms that support object-recognition abilities in infancy seem to differ from those in adulthood, with less engagement of the ventral visual pathway. We conclude that children’s specific developmental niche shapes early object-recognition abilities and their neural underpinnings.



Slater, A., Morison, V. & Rose, D. New‐born infants’ perception of similarities and differences between two‐and three‐dimensional stimuli. Br. J. Dev. Psychol. 2 , 287–294 (1984).

Jandó, G. et al. Early-onset binocularity in preterm infants reveals experience-dependent visual development in humans. Proc. Natl Acad. Sci. USA 109 , 11049–11052 (2012).

Fox, R., Aslin, R. N., Shea, S. L. & Dumais, S. T. Stereopsis in human infants. Science 207 , 323–324 (1980).

Hirshkowitz, A. & Wilcox, T. Infants’ ability to extract three-dimensional shape from coherent motion. Infant. Behav. Dev. 36 , 863–872 (2013).

Kellman, P. J. & Short, K. R. Development of three-dimensional form perception. J. Exp. Psychol. Hum. Percept. Perform. 13 , 545 (1987).

Kellman, P. J. Perception of three-dimensional form by human infants. Percept. Psychophys. 36 , 353–358 (1984).

Shuwairi, S. M., Albert, M. K. & Johnson, S. P. Discrimination of possible and impossible objects in infancy. Psychol. Sci. 18 , 303–307 (2007).

Tsuruhara, A., Sawada, T., Kanazawa, S., Yamaguchi, M. K. & Yonas, A. Infant’s ability to form a common representation of an object’s shape from different pictorial depth cues: a transfer-across-cues study. Infant Behav. Dev. 32 , 468–475 (2009).

Mash, C., Arterberry, M. E. & Bornstein, M. H. Mechanisms of visual object tecognition in infancy: five‐month‐olds generalize beyond the interpolation of familiar views. Infancy 12 , 31–43 (2007).

Ruff, H. A. Infant recognition of the invariant form of objects. Child. Dev. 49 , 293–306 (1978).

Kraebel, K. S. & Gerhardstein, P. C. Three-month-old infants’ object recognition across changes in viewpoint using an operant learning procedure. Infant Behav. Dev. 29 , 11–23 (2006).

Georgieva, S., Peeters, R., Kolster, H., Todd, J. T. & Orban, G. A. The processing of three-dimensional shape from disparity in the human brain. J. Neurosci. 29 , 727–742 (2009).

Georgieva, S. S., Todd, J. T., Peeters, R. & Orban, G. A. The extraction of 3D shape from texture and shading in the human brain. Cereb. Cortex 18 , 2416–2438 (2008).

Orban, G. A. The extraction of 3D shape in the visual system of human and nonhuman primates. Annu. Rev. Neurosci. 34 , 361–388 (2011).

Yamane, Y., Carlson, E. T., Bowman, K. C., Wang, Z. & Connor, C. E. A neural code for three-dimensional object shape in macaque inferotemporal cortex. Nat. Neurosci. 11 , 1352–1360 (2008).

Murphy, A. P., Leopold, D. A., Humphreys, G. W. & Welchman, A. E. Lesions to right posterior parietal cortex impair visual depth perception from disparity but not motion cues. Phil. Trans. R. Soc. B 371 , 20150263 (2016).

Tarr, M. J. & Bülthoff, H. H. Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein. J. Exp. Psychol. Hum. Percept. Perform. 21 , 1494–1505 (1995).

Wood, J. N. Newborn chickens generate invariant object representations at the onset of visual object experience. Proc. Natl Acad. Sci. USA 110 , 14000–14005 (2013).

Wood, J. N. & Wood, S. M. W. One-shot learning of view-invariant object representations in newborn chicks. Cognition 199 , 104192 (2020).

Kellman, P. J. & Shipley, T. F. A theory of visual interpolation in object perception. Cogn. Psychol. 23 , 141–221 (1991).

Wood, J. N. & Wood, S. M. The development of invariant object recognition requires visual experience with temporally smooth objects. Cogn. Sci. 42 , 1391–1406 (2018).

Ye, J. et al. Resilience of temporal processing to early and extended visual deprivation. Vis. Res. 186 , 80–86 (2021).

Ben-Ami, S. et al. Human (but not animal) motion can be recognized at first sight — after treatment for congenital blindness. Neuropsychologia 174 , 108307 (2022).

Bourne, J. A. & Rosa, M. G. Hierarchical development of the primate visual cortex, as revealed by neurofilament immunoreactivity: early maturation of the middle temporal area (MT). Cereb. Cortex 16 , 405–414 (2006).

Ciesielski, K. T. et al. Maturational changes in human dorsal and ventral visual networks. Cereb. Cortex 29 , 5131–5149 (2019).

Distler, C., Bachevalier, J., Kennedy, C., Mishkin, M. & Ungerleider, L. Functional development of the corticocortical pathway for motion analysis in the macaque monkey: a 14 C-2-deoxyglucose study. Cereb. Cortex 6 , 184–195 (1996).

Biagi, L., Tosetti, M., Crespi, S. A. & Morrone, M. C. Development of BOLD response to motion in human infants. J. Neurosci. 43 , 3825–3837 (2023).

Rosch, E. in Cognition and Categorization (eds Rosch, E. & Lloyd, B.) 27–48 (Erlbaum, 1978).

Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M. & Boyes-Braem, P. in Cognitive Psychology: Key Readings (eds Balota, D. A. & Marsh, E. J.) 448–471 (Psychology Press, 2004).

Mareschal, D. & Quinn, P. C. Categorization in infancy. Trends Cogn. Sci. 5 , 443–450 (2001).

Turati, C., Simion, F. & Zanon, L. Newborns’ perceptual categorization for closed and open geometric forms. Infancy 4 , 309–325 (2003).

Quinn, P. C., Slater, A. M., Brown, E. & Hayes, R. A. Developmental change in form categorization in early infancy. Br. J. Dev. Psychol. 19 , 207–218 (2001).

Bomba, P. C. & Siqueland, E. R. The nature and structure of infant form categories. J. Exp. Child. Psychol. 35 , 294–328 (1983).

Quinn, P. C., Eimas, P. D. & Rosenkrantz, S. L. Evidence for representations of perceptually similar natural categories by 3-month-old and 4-month-old infants. Perception 22 , 463–475 (1993).

Quinn, P. C. & Johnson, M. H. Global-before-basic object categorization in connectionist networks and 2-month-old infants. Infancy 1 , 31–46 (2000).

Mareschal, D., French, R. M. & Quinn, P. C. A connectionist account of asymmetric category learning in early infancy. Dev. Psychol. 36 , 635–645 (2000).

Oakes, L. M. & Spalding, T. L. The role of exemplar distribution in infants’ differentiation of categories. Infant. Behav. Dev. 20 , 457–475 (1997).

Quinn, P. C. The categorical representation of visual pattern information by young infants. Cognition 27 , 145–179 (1987).

Sorscher, B., Ganguli, S. & Sompolinsky, H. Neural representational geometry underlies few-shot concept learning. Proc. Natl Acad. Sci. USA 119 , e2200800119 (2022).

Feldman, J. The structure of perceptual categories. J. Math. Psychol. 41 , 145–170 (1997).

Article   MathSciNet   CAS   PubMed   Google Scholar  

Feldman, J. & Singh, M. Bayesian estimation of the shape skeleton. Proc. Natl Acad. Sci. USA 103 , 18014–18019 (2006).

Article   MathSciNet   CAS   PubMed   PubMed Central   ADS   Google Scholar  

Landau, B., Smith, L. & Jones, S. Object perception and object naming in early development. Trends Cogn. Sci. 2 , 19–24 (1998).

Smith, L. B., Jones, S. S. & Landau, B. Naming in young children: a dumb attentional mechanism? Cognition 60 , 143–171 (1996).

Smith, L. B. Learning to recognize objects. Psychol. Sci. 14 , 244–250 (2003).

Clerkin, E. M., Hart, E., Rehg, J. M., Yu, C. & Smith, L. B. Real-world visual statistics and infants’ first-learned object names. Phil. Trans. R. Soc. B 372 , 20160055 (2017).

Jayaraman, S., Fausey, C. M. & Smith, L. B. The faces in infant-perspective scenes change over the first year of life. PLoS One 10 , e0123780 (2015).

Jayaraman, S. & Smith, L. B. Faces in early visual environments are persistent not just frequent. Vis. Res. 157 , 213–221 (2019).

Tartaglini, A. R., Vong, W. K. & Lake, B. M. A developmentally-inspired examination of shape versus texture bias in machines. Preprint at arXiv https://arxiv.org/abs/1811.12231 (2022).

Huber, L. S., Geirhos, R. & Wichmann, F. A. A four-year-old can outperform ResNet-50: out-of-distribution robustness may not require large-scale experience. In 3rd Worksh. on Shared Visual Representations in Human and Machine Intelligence (SVRHM) (NeurIPS, 2021).

Smith, L. B., Jayaraman, S., Clerkin, E. & Yu, C. The developing infant creates a curriculum for statistical learning. Trends Cogn. Sci. 22 , 325–336 (2018).

Slone, L. K., Smith, L. B. & Yu, C. Self-generated variability in object images predicts vocabulary growth. Dev. Sci. 22 , e12816 (2019).

James, K. H., Jones, S. S., Smith, L. B. & Swain, S. N. Young children’s self-generated object views and object recognition. J. Cogn. Dev. 15 , 393–401 (2014).

Perez, J. & Feigenson, L. Violations of expectation trigger infants to search for explanations. Cognition 218 , 104942 (2022).

Stahl, A. E. & Feigenson, L. Observing the unexpected enhances infants’ learning and exploration. Science 348 , 91–94 (2015).

Lee, D., Gujarathi, P. & Wood, J. N. Controlled-rearing studies of newborn chicks and deep neural networks. In 3rd Worksh. on Shared Visual Representations in Human and Machine Intelligence (SVRHM) (NeurIPS, 2021).

Orhan, E. A., Gupta, P. V. & Lake, B. M. Self-supervised learning through the eyes of a child. In Advances in Neural Information Processing Systems 116 (NeurIPS, 2021).

Zhuang, C. et al. Unsupervised neural network models of the ventral visual stream. Proc. Natl Acad. Sci. USA 118 , e2014196118 (2021).

Bambach, S., Crandall, D. J., Smith, L. B. & Yu, C. in Joint Int. Conf. on Development and Learning and Epigenetic Robotics (ICDL-EpiRob) 290–295 (IEEE, 2017).

Pak, D., Lee, D., Wood, S. M. & Wood, J. N. A newborn embodied Turing test for view-invariant object recognition. Preprint at arXiv https://arxiv.org/abs/2306.05582 (2023).

Rajalingham, R. & DiCarlo, J. J. Reversible inactivation of different millimeter-scale regions of primate IT results in different patterns of core object recognition deficits. Neuron 102 , 493–505. e495 (2019).

Dehaene, S. et al. How learning to read changes the cortical networks for vision and language. Science 330 , 1359–1364 (2010).

Saygin, Z. M. et al. Connectivity precedes function in the development of the visual word form area. Nat. Neurosci. 19 , 1250–1255 (2016).

Arcaro, M. J., Schade, P. F., Vincent, J. L., Ponce, C. R. & Livingstone, M. S. Seeing faces is necessary for face-domain formation. Nat. Neurosci. 20 , 1404–1412 (2017).

Gauthier, I., Tarr, M. J., Anderson, A. W., Skudlarski, P. & Gore, J. C. Activation of the middle fusiform ‘face area’ increases with expertise in recognizing novel objects. Nat. Neurosci. 2 , 568–573 (1999).

Srihasam, K., Vincent, J. L. & Livingstone, M. S. Novel domain formation reveals proto-architecture in inferotemporal cortex. Nat. Neurosci. 17 , 1776–1783 (2014).

Kosakowski, H. L. et al. Selective responses to faces, scenes, and bodies in the ventral visual pathway of infants. Curr. Biol. 32 , 265–274.e5 (2021).

Deen, B. et al. Organization of high-level visual cortex in human infants. Nat. Commun. 8 , 13995 (2017).

Powell, L. J., Deen, B. & Saxe, R. Using individual functional channels of interest to study cortical development with fNIRS. Dev. Sci. 21 , e12595 (2018).

Yan, X. et al. When do visual category representations emerge in infants’ brains? Preprint at bioRxiv https://doi.org/10.1101/2023.05.11.539934 (2023).

Germine, L. T., Duchaine, B. & Nakayama, K. Where cognitive development and aging meet: face learning ability peaks after age 30. Cognition 118 , 201–210 (2011).

Cohen, M. A. et al. Representational similarity precedes category selectivity in the developing ventral visual pathway. NeuroImage 197 , 565–574 (2019).

Xie, S. et al. Visual category representations in the infant brain. Curr. Biol. 32 , 5422–5432.e5426 (2022).

Kiat, J. E. et al. Linking patterns of infant eye movements to a neural network model of the ventral stream using representational similarity analysis. Dev. Sci. 25 , e13155 (2021).

Yamins, D. L. et al. Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc. Natl Acad. Sci. USA 111 , 8619–8624 (2014).

Gao, W. et al. Temporal and spatial evolution of brain network topology during the first two years of life. PLoS One 6 , e25278 (2011).

Gogtay, N. et al. Dynamic mapping of human cortical development during childhood through early adulthood. Proc. Natl Acad. Sci. USA 101 , 8174–8179 (2004).

Keunen, K., Counsell, S. J. & Benders, M. J. N. L. The emergence of functional architecture during early brain development. NeuroImage 160 , 2–14 (2017).

Natu, V. S. et al. Infants’ cortex undergoes microstructural growth coupled with myelination during development. Commun. Biol. 4 , 1191 (2021).

Spriet, C., Abassi, E., Hochmann, J.-R. & Papeo, L. Visual object categorization in infancy. Proc. Natl Acad. Sci. USA 119 , e2105866119 (2022).

Ayzenberg, V., Granovetter, M. C., Robert, S., Patterson, C. & Behrmann, M. Differential functional reorganization of ventral and dorsal visual pathways following childhood hemispherectomy. Dev. Cogn. Neurosci. 64 , 101323 (2023).

Kamps, F. S., Hendrix, C. L., Brennan, P. A. & Dilks, D. D. Connectivity at the origins of domain specificity in the cortical face and place networks. Proc. Natl Acad. Sci. USA 117 , 6163–6169 (2020).

Hasson, U., Levy, I., Behrmann, M., Hendler, T. & Malach, R. Eccentricity bias as an organizing principle for human high-order object areas. Neuron 34 , 479–490 (2002).

Yetter, M. et al. Curvilinear features are important for animate/inanimate categorization in macaques. J. Vis. 21 , 3 (2021).

Yue, X., Robert, S. & Ungerleider, L. G. Curvature processing in human visual cortical areas. NeuroImage 222 , 117295 (2020).

Ponce, C. R., Hartmann, T. S. & Livingstone, M. S. End-stopping predicts curvature tuning along the ventral stream. J. Neurosci. 37 , 648–659 (2017).

Cassia, V. M., Valenza, E., Simion, F. & Leo, I. Congruency as a nonspecific perceptual property contributing to newborns’ face preference. Child. Dev. 79 , 807–820 (2008).

Cassia, V. M., Turati, C. & Simion, F. Can a nonspecific bias toward top-heavy patterns explain newborns’ face preference? Psychol. Sci. 15 , 379–383 (2004).

Turati, C., Simion, F., Milani, I. & Umiltà, C. Newborns’ preference for faces: what is crucial? Dev. Psychol. 38 , 875–882 (2002).

Johnson, M. H. Subcortical face processing. Nat. Rev. Neurosci. 6 , 766–774 (2005).

Hafed, Z. M. & Chen, C.-Y. Sharper, stronger, faster upper visual field representation in primate superior colliculus. Curr. Biol. 26 , 1647–1658 (2016).

Versace, E., Damini, S. & Stancher, G. Early preference for face-like stimuli in solitary species as revealed by tortoise hatchlings. Proc. Natl Acad. Sci. USA 117 , 24047–24049 (2020).

Johnson, M. H., Dziurawiec, S., Ellis, H. & Morton, J. Newborns’ preferential tracking of face-like stimuli and its subsequent decline. Cognition 40 , 1–19 (1991).

Reid, V. M. et al. The human fetus preferentially engages with face-like visual stimuli. Curr. Biol. 27 , 1825–1828.e1823 (2017).

Simion, F., Valenza, E., Umilta, C. & Barba, B. D. Preferential orienting to faces in newborns: a temporal–nasal asymmetry. J. Exp. Psychol. Hum. Percept. Perform. 24 , 1399 (1998).

Arcaro, M. J. & Livingstone, M. S. On the relationship between maps and domains in inferotemporal cortex. Nat. Rev. Neurosci. 22 , 573–583 (2021).

Gomez, J., Barnett, M. & Grill-Spector, K. Extensive childhood experience with Pokémon suggests eccentricity drives organization of visual cortex. Nat. Hum. Behav. 3 , 611–624 (2019).

Xu, S., Zhang, Y., Zhen, Z. & Liu, J. The face module emerges from domain-general visual experience: a deprivation study on deep convolution neural network. Front. Comput. Neurosci. 15 , 626259 (2020).

Baek, S., Song, M., Jang, J., Kim, G. & Paik, S.-B. Face detection in untrained deep neural networks. Nat. Commun. 12 , 7328 (2021).

Hannagan, T., Agrawal, A., Cohen, L. & Dehaene, S. Emergence of a compositional neural code for written words: recycling of a convolutional neural network for reading. Proc. Natl Acad. Sci. USA 118 , e2104779118 (2021).

Nordt, M. et al. Cortical recycling in high-level visual cortex during childhood development. Nat. Hum. Behav. 5 , 1686–1697 (2021).

Dehaene, S. & Cohen, L. Cultural recycling of cortical maps. Neuron 56 , 384–398 (2007).

Behrmann, M. & Plaut, D. C. Hemispheric organization for visual object recognition: a theoretical account and empirical evidence. Perception 49 , 373–404 (2020).

Bakhtiari, S., Mineault, P., Lillicrap, T., Pack, C. & Richards, B. The functional specialization of visual cortex emerges from training parallel pathways with self-supervised predictive learning. Adv. Neural Inf. Process. Syst. 34 , 25164–25178 (2021).

Zhu, M. & Gupta, S. To prune, or not to prune: exploring the efficacy of pruning for model compression. Preprint at arXiv https://arxiv.org/abs/1710.01878 (2017).

Lu, H. & Erlikhman, G. Enhancement of representational sparsity in deep neural networks can improve generalization. J. Vis. 19 , 209b (2019).

Yuan, L., Xiang, V., Crandall, D. & Smith, L. Learning the generative principles of a symbol system from limited examples. Cognition 200 , 104243 (2020).

Smith, L. B. & Slone, L. K. A developmental approach to machine learning? Front. Psychol. 8 , 2124 (2017).

Blauch, N. M., Behrmann, M. & Plaut, D. C. Computational insights into human perceptual expertise for familiar and unfamiliar face recognition. Cognition 208 , 104341 (2020).

Stojnić, G., Gandhi, K., Yasuda, S., Lake, B. M. & Dillon, M. R. Commonsense psychology in human infants and machines. Cognition 235 , 105406 (2023).

Wichmann, F. A. et al. Methods and measurements to compare men against machines. Electron. Imaging 2017 , 36–45 (2017).

Yermolayeva, Y. & Rakison, D. H. Connectionist modeling of developmental changes in infancy: approaches, challenges, and contributions. Psychol. Bull. 140 , 224–255 (2014).

Yates, T. S. et al. Neural event segmentation of continuous experience in human infants. Proc. Natl Acad. Sci. USA 119 , e2200257119 (2022).

Yates, T. S., Ellis, C. T. & Turk-Browne, N. B. Emergence and organization of adult brain function throughout child development. NeuroImage 226 , 117606 (2021).

Lerner, Y., Scherf, K. S., Katkov, M., Hasson, U. & Behrmann, M. Changes in cortical coherence supporting complex visual and social processing in adolescence. J. Cogn. Neurosci. 33 , 2215–2230 (2021).

Wilcox, T., Haslup, J. A. & Boas, D. A. Dissociation of processing of featural and spatiotemporal information in the infant cortex. NeuroImage 53 , 1256–1263 (2010).

Bachevalier, J., Hagger, C. & Mishkin, M. in Alfred Benzon Symposium Vol. 31 Brain Work And Mental Activity (eds Lassen, N. A. et al.) 231–240 (Munksgaard, 1991).

Arcaro, M. J. & Livingstone, M. S. A hierarchical, retinotopic proto-organization of the primate visual system at birth. eLife 6 , e26196 (2017).

Ellis, C. T. et al. Retinotopic organization of visual cortex in human infants. Neuron 109 , 2616–2626.e2616 (2021).

Hubel, D. H. & Wiesel, T. N. Receptive fields of cells in striate cortex of very young, visually inexperienced kittens. J. Neurophysiol. 26 , 994–1002 (1963).

Mohammed, C. P. D. & Khalil, R. Postnatal development of visual cortical function in the mammalian brain. Front. Syst. Neuro. 14 , 29 (2020).

Rodman, H. R., Scalaidhe, S. & Gross, C. G. Response properties of neurons in temporal cortical visual areas of infant monkeys. J. Neurophysiol. 70 , 1115–1136 (1993).

Rodman, H. R. Development of inferior temporal cortex in the monkey. Cereb. Cortex 4 , 484–498 (1994).

Kamps, F. S., Pincus, J. E., Radwan, S. F., Wahab, S. & Dilks, D. D. Late development of navigationally relevant motion processing in the occipital place area. Curr. Biol. 30 , 544–550.e543 (2020).

Grotheer, M. et al. Human white matter myelination rate slows down at birth. Preprint at bioRxiv https://doi.org/10.1101/2023.03.02.530800v1 (2023).

Ahmad, Z., Behrmann, M., Patterson, C. & Freud, E. Unilateral resection of both cortical visual pathways in a pediatric patient alters action but not perception. Neuropsychologia 168 , 108182 (2022).

Grinter, E. J., Maybery, M. T. & Badcock, D. R. Vision in developmental disorders: is there a dorsal stream deficit? Brain Res. Bull. 82 , 147–160 (2010).

Pitcher, D. & Ungerleider, L. G. Evidence for a third visual pathway specialized for social perception. Trends Cogn. Sci. 25 , 100–110 (2021).

Weiner, K. S. & Gomez, J. Third visual pathway, anatomy, and cognition across species. Trends Cogn. Sci. 25 , 548–549 (2021).

Braddick, O. & Atkinson, J. Development of human visual function. Vis. Res. 51 , 1588–1609 (2011).

Dubowitz, L. M. S., De Vries, L., Mushin, J. & Arden, G. B. Visual function in the newborn infant: is it cortically mediated? Lancet 327 , 1139–1141 (1986).

Ma, Z., Tu, W. & Zhang, N. Increased wiring cost during development is driven by long-range cortical, but not subcortical connections. NeuroImage 225 , 117463 (2021).

King, A. J., Schnupp, J. W. H., Carlile, S., Smith, A. L. & Thompson, I. D. The development of topographically-aligned maps of visual and auditory space in the superior colliculus. Prog. Brain Res . 112 , 335–350 (1996).

O’Reilly, R. C., Russin, J. L., Zolfaghar, M. & Rohrlich, J. Deep predictive learning in neocortex and pulvinar. J. Cogn. Neurosci. 33 , 1158–1196 (2021).

Sewards, T. V. & Sewards, M. A. Innate visual object recognition in vertebrates: some proposed pathways and mechanisms. Comp. Biochem. Physiol. A 132 , 861–891 (2002).

Arcaro, M. J., Pinsk, M. A., Chen, J. & Kastner, S. Organizing principles of pulvino-cortical functional coupling in humans. Nat. Commun. 9 , 5382 (2018).

Arcaro, M. J., Pinsk, M. A. & Kastner, S. The anatomical and functional organization of the human visual pulvinar. J. Neurosci. 35 , 9848–9871 (2015).

Baizer, J. S., Desimone, R. & Ungerleider, L. G. Comparison of subcortical connections of inferior temporal and posterior parietal cortex in monkeys. Vis. Neuro. 10 , 59–72 (1993).

Gattass, R., Galkin, T. W., Desimone, R. & Ungerleider, L. G. Subcortical connections of area V4 in the macaque. J. Comp. Neurol. 522 , 1941–1965 (2014).

Ungerleider, L. G., Galkin, T. W., Desimone, R. & Gattass, R. Subcortical projections of area V2 in the macaque. J. Cogn. Neurosci. 26 , 1220–1233 (2014).

Webster, M. J., Bachevalier, J. & Ungerleider, L. G. Connections of inferior temporal areas TEO and TE with parietal and frontal cortex in macaque monkeys. Cereb. Cortex 4 , 470–483 (1994).

Mercuri, E. et al. Basal ganglia damage and impaired visual function in the newborn infant. Arch. Dis. Child. Fet. Neonat. Edn 77 , F111–F114 (1997).

Blumberg, M. S. & Adolph, K. E. Protracted development of motor cortex constrains rich interpretations of infant cognition. Trends Cogn. Sci . https://doi.org/10.1016/j.tics.2022.12.014 (2023).

Güçlü, U. & van Gerven, M. A. Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. J. Neurosci. 35 , 10005–10014 (2015).

Conwell, C., Prince, J. S., Alvarez, G. A. & Konkle, T. What can 5.17 billion regression fits tell us about artificial models of the human visual system? In 3rd Worksh. on Shared Visual Representations in Human and Machine Intelligence (SVRHM) (NeurIPS, 2021).

Baker, N., Lu, H., Erlikhman, G. & Kellman, P. J. Deep convolutional networks do not classify based on global object shape. PLoS Comp. Biol. 14 , e1006613 (2018).

Article   ADS   Google Scholar  



  • Buiatti M , Di Giorgio E , Piazza M , Polloni C , Menna G et al. 2019 . Cortical route for facelike pattern processing in human newborns. PNAS 116 : 10 4625– 30 [Google Scholar]
  • Canário N , Jorge L , Silva ML , Soares MA , Castelo-Branco M 2016 . Distinct preference for spatial frequency content in ventral stream regions underlying the recognition of scenes, faces, bodies and other objects. Neuropsychologia 87 : 110– 19 [Google Scholar]
  • Caramazza A , Shelton JR. 1998 . Domain-specific knowledge systems in the brain: the animate-inanimate distinction. J. Cogn. Neurosci. 10 : 1 1– 34 [Google Scholar]
  • Chang L , Tsao DY. 2017 . The code for facial identity in the primate brain. Cell 169 : 6 1013– 28 [Google Scholar]
  • Chao LL , Haxby JV , Martin A. 1999 . Attribute-based neural substrates in temporal cortex for perceiving and knowing about objects. Nat. Neurosci. 2 : 10 913– 19 [Google Scholar]
  • Chklovskii DB , Koulakov AA. 2004 . Maps in the brain: What can we learn from them?. Annu. Rev. Neurosci. 27 : 369– 92 [Google Scholar]
  • Cohen L , Lehéricy S , Chochon F , Lemer C , Rivaud S , Dehaene S. 2002 . Language-specific tuning of visual cortex? Functional properties of the Visual Word Form Area. Brain 125 : 5 1054– 69 [Google Scholar]
  • Connolly AC , Guntupalli JS , Gors J , Hanke M , Halchenko YO et al. 2012 . The representation of biological classes in the human brain. J. Neurosci. 32 : 8 2608– 18 [Google Scholar]
  • Cox DD , Savoy RL. 2003 . Functional magnetic resonance imaging (fMRI) “brain reading”: detecting and classifying distributed patterns of fMRI activity in human visual cortex. Neuroimage 19 : 2 261– 70 [Google Scholar]
  • Deng J , Dong W , Socher R , Li LJ , Li K , Fei-Fei L. 2009 . ImageNet: a large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition 248– 55 New York: IEEE [Google Scholar]
  • Desimone R , Albright TD , Gross CG , Bruce C. 1984 . Stimulus-selective properties of inferior temporal neurons in the macaque. J. Neurosci. 4 : 8 2051– 62 [Google Scholar]
  • Di Giorgio E , Lunghi M , Simion F , Vallortigara G. 2017 . Visual cues of motion that trigger animacy perception at birth: the case of self-propulsion. Dev. Sci. 20 : 4 e12394 [Google Scholar]
  • DiCarlo JJ , Maunsell JH. 2003 . Anterior inferotemporal neurons of monkeys engaged in object recognition can be highly sensitive to object retinal position. J. Neurophysiol. 89 : 6 3264– 78 [Google Scholar]
  • DiCarlo JJ , Zoccolan D , Rust NC. 2012 . How does the brain solve visual object recognition?. Neuron 73 : 3 415– 34 [Google Scholar]
  • Dobs K , Martinez J , Kell AJ , Kanwisher N. 2022 . Brain-like functional specialization emerges spontaneously in deep neural networks. Sci. Adv. 8 : 11 eabl8913 [Google Scholar]
  • Downing PE , Jiang Y , Shuman M , Kanwisher N. 2001 . A cortical area selective for visual processing of the human body. Science 293 : 5539 2470– 73 [Google Scholar]
  • Dujmović M , Malhotra G , Bowers JS 2020 . What do adversarial images tell us about human vision?. eLife 9 : e55978 [Google Scholar]
  • Duyck S , Martens F , Chen CY , Op de Beeck HP 2021 . How visual expertise changes representational geometry: a behavioral and neural perspective. J. Cogn. Neurosci. 33 : 12 2461– 76 [Google Scholar]
  • Dwivedi K , Bonner MF , Cichy RM , Roig G. 2021 . Unveiling functions of the visual cortex using task-specific deep neural networks. PLOS Comput. Biol. 17 : 8 e1009267 [Google Scholar]
  • Elmoznino E , Bonner MF. 2022 . High-performing neural network models of visual cortex benefit from high latent dimensionality. bioRxiv 499969. https://doi.org/10.1101/2022.07.13.499969 [Crossref]
  • Epstein R , Kanwisher N. 1998 . A cortical representation of the local visual environment. Nature 392 : 6676 598– 601 [Google Scholar]
  • Geirhos R , Rubisch P , Michaelis C , Bethge M , Wichmann FA , Brendel W. 2018 . ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. arXiv:1811.12231 [cs.CV]
  • Gibson JJ. 1979 . The Ecological Approach to Visual Perception New York: Psychol. Press [Google Scholar]
  • Gobbini MI , Koralek AC , Bryan RE , Montgomery KJ , Haxby JV. 2007 . Two takes on the social brain: a comparison of theory of mind tasks. J. Cogn. Neurosci. 19 : 11 1803– 14 [Google Scholar]
  • Goffaux V , Dakin S. 2010 . Horizontal information drives the behavioral signatures of face processing. Front. Psychol. 1 : 143 [Google Scholar]
  • Gomez J , Natu V , Jeska B , Barnett M , Grill-Spector K. 2018 . Development differentially sculpts receptive fields across early and high-level human visual cortex. Nat. Commun. 9 : 788 [Google Scholar]
  • Goodale MA , Milner AD. 1992 . Separate visual pathways for perception and action. Trends Neurosci . 15 : 1 20– 25 [Google Scholar]
  • Grabner H , Gall J , Van Gool L. 2011 . What makes a chair a chair?. CVPR 2011: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition 1529– 36 New York: IEEE [Google Scholar]
  • Graziano MS , Aflalo TN. 2007 . Mapping behavioral repertoire onto the cortex. Neuron 56 : 2 239– 51 [Google Scholar]
  • Grill-Spector K , Kushnir T , Edelman S , Avidan G , Itzchak Y , Malach R. 1999 . Differential processing of objects under various viewing conditions in the human lateral occipital complex. Neuron 24 : 1 187– 203 [Google Scholar]
  • Grill-Spector K , Kushnir T , Hendler T , Edelman S , Itzchak Y , Malach R. 1998 . A sequence of object-processing stages revealed by fMRI in the human occipital lobe. Hum. Brain Mapp . 6 : 4 316– 28 [Google Scholar]
  • Grill-Spector K , Weiner KS. 2014 . The functional architecture of the ventral temporal cortex and its role in categorization. Nat. Rev. Neurosci. 15 : 8 536– 48 [Google Scholar]
  • Gross CG , Rocha-Miranda CD , Bender DB. 1972 . Visual properties of neurons in inferotemporal cortex of the macaque. J. Neurophysiol. 35 : 1 96– 111 [Google Scholar]
  • Grossman ED , Blake R. 2002 . Brain areas active during visual perception of biological motion. Neuron 35 : 6 1167– 75 [Google Scholar]
  • Grossman S , Gaziv G , Yeagle EM , Harel M , Mégevand P et al. 2019 . Convergent evolution of face spaces across human face-selective neuronal groups and deep convolutional networks. Nat. Commun. 10 : 4934 [Google Scholar]
  • Güçlü U , van Gerven MA. 2015 . Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. J. Neurosci. 35 : 27 10005– 14 [Google Scholar]
  • Hasson U , Levy I , Behrmann M , Hendler T , Malach R. 2002 . Eccentricity bias as an organizing principle for human high-order object areas. Neuron 34 : 3 479– 90 [Google Scholar]
  • Hebart MN , Zheng CY , Pereira F , Baker CI. 2020 . Revealing the multidimensional mental representations of natural objects underlying human similarity judgements. Nat. Hum. Behav. 4 : 11 1173– 85 [Google Scholar]
  • Hoffman DD , Richards WA. 1984 . Parts of recognition. Cognition 18 : 1–3 65– 96 [Google Scholar]
  • Hong H , Yamins DL , Majaj NJ , DiCarlo JJ. 2016 . Explicit information for category-orthogonal object properties increases along the ventral stream. Nat. Neurosci. 19 : 4 613– 22 [Google Scholar]
  • Hu JM , Song XM , Wang Q , Roe AW 2020 . Curvature domains in V4 of macaque monkey. eLife 9 : e57261 [Google Scholar]
  • Hubel DH , Wiesel TN. 1968 . Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195 : 1 215– 43 [Google Scholar]
  • Hung CP , Kreiman G , Poggio T , DiCarlo JJ. 2005 . Fast readout of object identity from macaque inferior temporal cortex. Science 310 : 5749 863– 66 [Google Scholar]
  • Jagadeesh AV , Gardner JL. 2022 . Texture-like representation of objects in human visual cortex. PNAS 119 : 17 e2115302119 [Google Scholar]
  • Jain N , Wang A , Henderson MM , Lin R , Prince JS et al. 2022 . Food for thought: selectivity for food in human ventral visual cortex. bioRxiv 492983. https://doi.org/10.1101/2022.05.22.492983 [Crossref]
  • James W. 1890 . The Principles of Psychology , Vol. 1 London: Macmillan [Google Scholar]
  • Jenkins R , Dowsett AJ , Burton AM. 2018 . How many faces do people know?. Proc. R. Soc. B 285 : 1888 20181319 [Google Scholar]
  • Josephs EL , Konkle T. 2020 . Large-scale dissociations between views of objects, scenes, and reachable-scale environments in visual cortex. PNAS 117 : 47 29354– 62 [Google Scholar]
  • Jozwik KM , Najarro E , van den Bosch JJ , Charest I , Kriegeskorte N , Cichy RM. 2021 . Disentangling five dimensions of animacy in human brain and behaviour. bioRxiv 459854. https://doi.org/10.1101/2021.09.12.459854 [Crossref]
  • Kaiser D , Quek GL , Cichy RM , Peelen MV. 2019 . Object vision in a structured world. Trends Cogn. Sci. 23 : 8 672– 85 [Google Scholar]
  • Kanwisher N , McDermott J , Chun MM. 1997 . The fusiform face area: a module in human extrastriate cortex specialized for face perception. J. Neurosci. 17 : 11 4302– 11 [Google Scholar]
  • Kar K , Kubilius J , Schmidt K , Issa EB , DiCarlo JJ. 2019 . Evidence that recurrent circuits are critical to the ventral stream's execution of core object recognition behavior. Nat. Neurosci. 22 : 6 974– 83 [Google Scholar]
  • Kayaert G , Biederman I , Op de Beeck HP , Vogels R 2005 . Tuning for shape dimensions in macaque inferior temporal cortex. Eur. J. Neurosci. 22 : 1 212– 24 [Google Scholar]
  • Keller TA , Gao Q , Welling M. 2021 . Modeling category-selective cortical regions with topographic variational autoencoders. arXiv:2110.13911 [q-bio.NC]
  • Khosla M , Murty NAR , Kanwisher NG. 2022 . A highly selective response to food in human visual cortex revealed by hypothesis-free voxel decomposition. bioRxiv 496922. https://doi.org/10.1101/2022.06.21.496922 [Crossref]
  • Kietzmann TC , Spoerer CJ , Sörensen LK , Cichy RM , Hauk O , Kriegeskorte N. 2019 . Recurrence is required to capture the representational dynamics of the human visual system. PNAS 116 : 43 21854– 63 [Google Scholar]
  • Kim E , Rego J , Watkins Y , Kenyon GT. 2020 . Modeling biological immunity to adversarial examples. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition 4666– 75 New York: IEEE [Google Scholar]
  • Konkle T , Alvarez GA. 2022 . Beyond category-supervision: computational support for domain-general pressures guiding human visual system representation. Nat. Commun. 13 : 491 [Google Scholar]
  • Konkle T , Oliva A. 2012 . A real-world size organization of object responses in occipitotemporal cortex. Neuron 74 : 6 1114– 24 [Google Scholar]
  • Kourtzi Z , Kanwisher N. 2001 . Representation of perceived object shape by the human lateral occipital complex. Science 293 : 5534 1506– 9 [Google Scholar]
  • Kravitz DJ , Peng CS , Baker CI. 2011 . Real-world scene representations in high-level visual cortex: It's the spaces more than the places. J. Neurosci. 31 : 20 7322– 33 [Google Scholar]
  • Kravitz DJ , Saleem KS , Baker CI , Ungerleider LG , Mishkin M. 2013 . The ventral visual pathway: an expanded neural framework for the processing of object quality. Trends Cogn. Sci. 17 : 1 26– 49 [Google Scholar]
  • Kriegeskorte N , Mur M , Ruff DA , Kiani R , Bodurka J et al. 2008 . Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60 : 6 1126– 41 [Google Scholar]
  • Krizhevsky A , Sutskever I , Hinton GE. 2012 . ImageNet classification with deep convolutional neural networks. Adv. Neural Inform. Proc. Syst. 25 : 1097– 105 [Google Scholar]
  • Kubilius J , Bracci S , Op de Beeck HP. 2016 . Deep neural networks as a computational model for human shape sensitivity. PLOS Comput. Biol. 12 : 4 e1004896 [Google Scholar]
  • Ito M , Tamura H , Fujita I , Tanaka K. 1995 . Size and position invariance of neuronal responses in monkey inferotemporal cortex. J. Neurophysiol. 73 : 1 218– 26 [Google Scholar]
  • Laiacona M , Barbarotto R , Capitani E. 1993 . Perceptual and associative knowledge in category specific impairment of semantic memory: a study of two cases. Cortex 29 : 4 727– 40 [Google Scholar]
  • LeCun Y , Bengio Y , Hinton G. 2015 . Deep learning. Nature 521 : 7553 436– 44 [Google Scholar]
  • Lee H , Margalit E , Jozwik KM , Cohen MA , Kanwisher N et al. 2020 . Topographic deep artificial neural networks reproduce the hallmarks of the primate inferior temporal cortex face processing network. bioRxiv 185116. https://doi.org/10.1101/2020.07.09.185116 [Crossref]
  • Levy I , Hasson U , Avidan G , Hendler T , Malach R. 2001 . Center–periphery organization of human object areas. Nat. Neurosci. 4 : 5 533– 39 [Google Scholar]
  • Li SPD , Bonner MF. 2021 . Tuning in scene-preferring cortex for mid-level visual features gives rise to selectivity across multiple levels of stimulus complexity. bioRxiv 461733. https://doi.org/10.1101/2021.09.24.461733 [Crossref]
  • Lindsay GW. 2021 . Convolutional neural networks as a model of the visual system: past, present, and future. J. Cogn. Neurosci. 33 : 10 2017– 31 [Google Scholar]
  • Long B , Yu CP , Konkle T. 2018 . Mid-level visual features underlie the high-level categorical organization of the ventral stream. PNAS 115 : 38 E9015– 24 [Google Scholar]
  • Lotter W , Kreiman G , Cox D. 2020 . A neural network trained for prediction mimics diverse features of biological neurons and perception. Nat. Mach. Intell. 2 : 4 210– 19 [Google Scholar]
  • Mahon BZ , Caramazza A. 2011 . What drives the organization of object knowledge in the brain?. Trends Cogn. Sci. 15 : 3 97– 103 [Google Scholar]
  • Maimon-Mor RO , Makin TR 2020 . Is an artificial limb embodied as a hand? Brain decoding in prosthetic limb users. PLOS Biol . 18 : 6 e3000729 [Google Scholar]
  • Malach R , Levy I , Hasson U. 2002 . The topography of high-order human object areas. Trends Cogn. Sci. 6 : 4 176– 84 [Google Scholar]
  • Malach R , Reppas JB , Benson RR , Kwong KK , Jiang H et al. 1995 . Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex. PNAS 92 : 18 8135– 39 [Google Scholar]
  • Marr D. 1980 . Visual information processing: the structure and creation of visual representations. Philos. Trans. R. Soc. B 290 : 1038 199– 218 [Google Scholar]
  • Marr D , Nishihara HK. 1978 . Representation and recognition of the spatial organization of three-dimensional shapes. Proc. R. Soc. B 200 : 1140 269– 94 [Google Scholar]
  • Martin A , Weisberg J. 2003 . Neural foundations for understanding social and mechanical concepts. Cogn. Neuropsychol. 20 : 3–6 575– 87 [Google Scholar]
  • Mehrer J , Spoerer CJ , Jones EC , Kriegeskorte N , Kietzmann TC. 2021 . An ecologically motivated image dataset for deep learning yields better models of human vision. PNAS 118 : 8 e2011417118 [Google Scholar]
  • Miller LE , Montroni L , Koun E , Salemme R , Hayward V , Farnè A. 2018 . Sensing with tools extends somatosensory processing beyond the body. Nature 561 : 7722 239– 42 [Google Scholar]
  • Mishkin M , Ungerleider LG. 1982 . Contribution of striate inputs to the visuospatial functions of parieto-preoccipital cortex in monkeys. Behav. Brain Res. 6 : 1 57– 77 [Google Scholar]
  • Morgenstern Y , Hartmann F , Schmidt F , Tiedemann H , Prokott E et al. 2021 . An image-computable model of human visual shape similarity. PLOS Comput. Biol. 17 : 6 e1008981 [Google Scholar]
  • Mormann F , Kornblith S , Cerf M , Ison MJ , Kraskov A et al. 2017 . Scene-selective coding by single neurons in the human parahippocampal cortex. PNAS 114 : 5 1153– 58 [Google Scholar]
  • Nasr S , Echavarria CE , Tootell RB. 2014 . Thinking outside the box: Rectilinear shapes selectively activate scene-selective cortex. J. Neurosci. 34 : 20 6721– 35 [Google Scholar]
  • Nguyen A , Yosinski J , Clune J. 2015 . Deep neural networks are easily fooled: high confidence predictions for unrecognizable images. 2015 IEEE Conference on Computer Vision and Pattern Recognition 427– 36 New York: IEEE [Google Scholar]
  • Oakes LM , Madole KL. 2008 . Function revisited: how infants construe functional features in their representation of objects. Adv. Child Dev. Behav. 36 : 135– 85 [Google Scholar]
  • Op de Beeck HP , Deutsch JA , Vanduffel W , Kanwisher NG , DiCarlo JJ. 2008a . A stable topography of selectivity for unfamiliar shape classes in monkey inferior temporal cortex. Cereb. Cortex 18 : 7 1676– 94 [Google Scholar]
  • Op de Beeck HP , Haushofer J , Kanwisher NG. 2008b . Interpreting fMRI data: maps, modules and dimensions. Nat. Rev. Neurosci. 9 : 2 123– 35 [Google Scholar]
  • Op de Beeck HP , Pillet I , Ritchie JB 2019 . Factors determining where category-selective areas emerge in visual cortex. Trends Cogn. Sci. 23 : 9 784– 97 [Google Scholar]
  • Op de Beeck HP , Torfs K , Wagemans J. 2008c . Perceived shape similarity among unfamiliar objects and the organization of the human object vision pathway. J. Neurosci. 28 : 40 10111– 23 [Google Scholar]
  • Op De Beeck HP , Vogels R 2000 . Spatial sensitivity of macaque inferior temporal neurons. J. Comp. Neurol. 426 : 4 505– 18 [Google Scholar]
  • Op De Beeck HP , Wagemans J , Vogels R. 2001 . Inferotemporal neurons represent low-dimensional configurations of parameterized shapes. Nat. Neurosci. 4 : 12 1244– 52 [Google Scholar]
  • Papeo L , Stein T , Soto-Faraco S. 2017 . The two-body inversion effect. Psychol. Sci. 28 : 3 369– 79 [Google Scholar]
  • Peelen MV , Downing PE. 2017 . Category selectivity in human visual cortex: beyond visual object recognition. Neuropsychologia 105 : 177– 83 [Google Scholar]
  • Pitcher D , Charles L , Devlin JT , Walsh V , Duchaine B. 2009 . Triple dissociation of faces, bodies, and objects in extrastriate cortex. Curr. Biol. 19 : 4 319– 24 [Google Scholar]
  • Pitcher D , Ungerleider LG. 2021 . Evidence for a third visual pathway specialized for social perception. Trends Cogn. Sci. 25 : 2 100– 10 [Google Scholar]
  • Powell LJ , Kosakowski HL , Saxe R 2018 . Social origins of cortical face areas. Trends Cogn. Sci. 22 : 9 752– 63 [Google Scholar]
  • Proklova D , Goodale MA. 2022 . The role of animal faces in the animate-inanimate distinction in the ventral temporal cortex. Neuropsychologia 169 : 108192 [Google Scholar]
  • Proklova D , Kaiser D , Peelen MV. 2016 . Disentangling representations of object shape and object category in human visual cortex: the animate–inanimate distinction. J. Cogn. Neurosci. 28 : 5 680– 92 [Google Scholar]
  • Rajimehr R , Devaney KJ , Bilenko NY , Young JC , Tootell RB. 2011 . The “parahippocampal place area” responds preferentially to high spatial frequencies in humans and monkeys. PLOS Biol . 9 : 4 e1000608 [Google Scholar]
  • Ratan Murty NA , Bashivan P , Abate A , DiCarlo JJ , Kanwisher N 2021 . Computational models of category-selective brain regions enable high-throughput tests of selectivity. Nat. Commun. 12 : 5540 [Google Scholar]
  • Riesenhuber M , Poggio T. 1999 . Hierarchical models of object recognition in cortex. Nat. Neurosci. 2 : 11 1019– 25 [Google Scholar]
  • Ritchie JB , Zeman AA , Bosmans J , Sun S , Verhaegen K , Op de Beeck HP. 2021 . Untangling the animacy organization of occipitotemporal cortex. J. Neurosci. 41 : 33 7103– 19 [Google Scholar]
  • Rosch E , Mervis CB , Gray WD , Johnson DM , Boyes-Braem P. 1976 . Basic objects in natural categories. Cogn. Psychol. 8 : 3 382– 439 [Google Scholar]
  • Rosenke M , van Hoof R , van den Hurk J , Grill-Spector K , Goebel R. 2021 . A probabilistic functional atlas of human occipito-temporal visual cortex. Cereb. Cortex 31 : 1 603– 19 [Google Scholar]
  • Saxe R , Wexler A. 2005 . Making sense of another mind: the role of the right temporo-parietal junction. Neuropsychologia 43 : 10 1391– 99 [Google Scholar]
  • Saygin ZM , Osher DE , Koldewyn K , Reynolds G , Gabrieli JD , Saxe RR. 2012 . Anatomical connectivity patterns predict face selectivity in the fusiform gyrus. Nat. Neurosci. 15 : 2 321– 27 [Google Scholar]
  • Schrimpf M , Kubilius J , Hong H , Majaj NJ , Rajalingham R et al. 2020 . Brain-score: Which artificial neural network for object recognition is most brain-like?. bioRxiv 407007. https://doi.org/10.1101/407007 [Crossref]
  • Seijdel N , Loke J , Van de Klundert R , Van der Meer M , Quispel E et al. 2021 . On the necessity of recurrent processing during object recognition: It depends on the need for scene segmentation. J. Neurosci. 41 : 29 6281– 89 [Google Scholar]
  • Sha L , Haxby JV , Abdi H , Guntupalli JS , Oosterhof NN et al. 2015 . The animacy continuum in the human ventral vision pathway. J. Cogn. Neurosci. 27 : 4 665– 78 [Google Scholar]
  • Shepard RN , Chipman S. 1970 . Second-order isomorphism of internal representations: shapes of states. Cogn. Psychol. 1 : 1 1– 17 [Google Scholar]
  • Taubert J , Ritchie JB , Ungerleider LG , Baker CI. 2022 . One object, two networks? Assessing the relationship between the face and body-selective regions in the primate visual system. Brain Struct. Funct. 227 : 4 1423– 38 [Google Scholar]
  • Thorat S , Proklova D , Peelen MV 2019 . The nature of the animacy organization in human ventral temporal cortex. eLife 8 : e47142 [Google Scholar]
  • Van den Heiligenberg FM , Orlov T , Macdonald SN , Duff EP , Henderson Slater D et al. 2018 . Artificial limb representation in amputees. Brain 141 : 5 1422– 33 [Google Scholar]
  • Vogels R. 1999 . Categorization of complex visual images by rhesus monkeys. Part 2: single-cell study. Eur. J. Neurosci. 11 : 4 1239– 55 [Google Scholar]
  • Voynov A , Babenko A. 2020 . Unsupervised discovery of interpretable directions in the GAN latent space. PMLR 119 : 9786– 96 [Google Scholar]
  • Yovel G , Kanwisher N. 2005 . The neural basis of the behavioral face-inversion effect. Curr. Biol. 15 : 24 2256– 62 [Google Scholar]
  • Wardle SG , Taubert J , Teichmann L , Baker CI. 2020 . Rapid and dynamic processing of face pareidolia in the human brain. Nat. Commun. 11 : 4518 [Google Scholar]
  • Wurm MF , Caramazza A. 2022 . Two “what” pathways for action and object recognition. Trends Cogn. Sci. 26 : 2 103– 16 [Google Scholar]
  • Zeman AA , Ritchie JB , Bracci S , de Beeck HO. 2020 . Orthogonal representations of object shape and category in deep convolutional neural networks and human visual cortex. Sci. Rep. 10 : 1 2453 [Google Scholar]

Data & Media loading...

  • Article Type: Review Article

Most Read This Month

Most cited most cited rss feed, job burnout, executive functions, social cognitive theory: an agentic perspective, on happiness and human potentials: a review of research on hedonic and eudaimonic well-being, sources of method bias in social science research and recommendations on how to control it, mediation analysis, missing data analysis: making it work in the real world, grounded cognition, personality structure: emergence of the five-factor model, motivational beliefs, values, and goals.

VISUAL REPRESENTATION

This article explores historical aspects of how various psychological concepts and terms, drawn from different psychological systems and schools, have been visually represented in art.


Kevin Leo Yabut Nadal, Ph.D.

Why Representation Matters and Why It’s Still Not Enough

Reflections on growing up Brown, queer, and Asian American.

Posted December 27, 2021 | Reviewed by Ekua Hagan

  • Positive media representation can be helpful in increasing self-esteem for people of marginalized groups (especially youth).
  • Interpersonal contact and exposure through media representation can assist in reducing stereotypes of underrepresented groups.
  • Representation in educational curricula and social media can provide validation and support, especially for youth of marginalized groups.

Growing up as a Brown Asian American child of immigrants, I never really saw anyone who looked like me in the media. The TV shows and movies I watched mostly concentrated on blonde-haired, white, or light-skinned protagonists. They also normalized Western and heterosexist ideals and behaviors, while hardly ever depicting things that reflected my everyday life. For example, it was equally odd and fascinating that people on TV didn’t eat rice at every meal; that their parents didn’t speak with accents; or that no one seemed to navigate a world of daily microaggressions. Despite these observations, I continued to absorb this mass media—internalizing messages of what my life should be like or what I should aspire to be like.


Because there were so few media images of people who looked like me, I distinctly remember the joy and validation that emerged when I did see those representations. Filipino American actors like Ernie Reyes, Nia Peeples, Dante Basco, and Tia Carrere looked like they could be my cousins. Each time they sporadically appeared in films and television series throughout my youth, their mere presence brought a sense of pride. However, because they never played Filipino characters (e.g., Carrere was Chinese American in Wayne's World) or their racial identities remained unaddressed (e.g., Basco as Rufio in Hook), I did not know for certain that they were Filipino American like me. And because the internet was not readily accessible (nor fully informational) until my late adolescence, I could not easily find out.

Through my Ethnic Studies classes as an undergraduate student (and my later research on Asian American and Filipino American experiences with microaggressions), I discovered that my perspectives were not that unique. Many Asian Americans and other people of color struggle with their racial and ethnic identity development—with many citing how a lack of media representation negatively impacts their self-esteem and overall views of their racial or cultural groups. Scholars and community leaders have adopted mottos like "it's hard to be what you can't see," asserting that people from marginalized groups do not pursue career or academic opportunities when they are not exposed to such possibilities. For example, when women (and women of color specifically) don't see themselves represented in STEM fields, they may internalize that such careers are not made for them. When people of color don't see themselves in the arts or in government positions, they likely learn similar messages too.

Complicating these messages are my intersectional identities as a queer person of color. In my teens, it was heartbreakingly lonely to witness everyday homophobia (especially unnecessary homophobic language) in almost all television programming. The few visual examples I saw of anyone LGBTQ involved mostly white, gay, cisgender people. While there was some comfort in seeing them navigate their coming out processes or overcome heterosexism on screen, their storylines often appeared unrealistic—at least in comparison to the nuanced homophobia I observed in my religious, immigrant family. In some ways, not seeing LGBTQ people of color in the media kept me in the closet for years.

How representation can help

Representation can create opportunities for minoritized people to find community support and validation. For example, recent studies have found that social media has given LGBTQ young people outlets to connect with others—especially when the COVID-19 pandemic has limited in-person opportunities. Given the increased suicidal ideation, depression, and other mental health issues among LGBTQ youth amid this global pandemic, visibility via social media can possibly save lives. Relatedly, taking Ethnic Studies courses can be valuable in helping students develop a critical consciousness that is culturally relevant to their lives. In this way, representation can allow students of color to personally connect to school, potentially making their educational pursuits more meaningful.

Further, representation can help reduce negative stereotypes about other groups. In what psychologist Dr. Gordon Allport first described as Intergroup Contact Theory, researchers proposed that the more exposure or contact people have to groups different from them, the less likely they are to maintain prejudice. The literature supports how positive LGBTQ media representation helped transform public opinion about LGBTQ people and their rights. In 2019, the Pew Research Center reported that the general US population had significantly changed its views of same-sex marriage in just 15 years—from 60% opposed in 2004 to 61% in favor in 2019. While many other factors likely influenced these shifts, studies suggest that positive LGBTQ media depictions played a significant role.

For Asian Americans and other groups who have been historically underrepresented in the media, any visibility can feel like a win. For example, Gold House recently featured an article in Vanity Fair highlighting the power of Asian American visibility in the media—citing blockbuster films like Crazy Rich Asians and Shang-Chi and the Legend of the Ten Rings. Asian American producers like Mindy Kaling of Never Have I Ever and The Sex Lives of College Girls demonstrate how influential creators of color can initiate their own projects and write their own storylines, directly increasing representation (and indirectly supporting the mental health and positive esteem of audiences of color).

When representation is not enough

However, representation simply is not enough—especially when it is one-dimensional, superficial, or not actually representative. Some scholars describe how Asian American media depictions still tend to reinforce stereotypes, which may negatively impact identity development for Asian American youth. Asian American Studies is still needed to teach about oppression and to combat hate violence. Further, representation might also fail to reflect the true diversity of communities; historically, Brown Asian Americans have been underrepresented in Asian American media, resulting in marginalization within marginalized groups. For example, Filipino Americans—despite being the first Asian American group to settle in the US and one of the largest immigrant groups—remain underrepresented across many sectors, including academia, arts, and government.

Representation should never be the final goal; instead, it should merely be one step toward equity. Having a diverse cast on a television show is meaningless if those storylines promote harmful stereotypes or fail to address societal inequities. Being the “first” at anything is pointless if there aren’t efforts to address the systemic obstacles that prevent people from certain groups from succeeding in the first place.

Instead, representation should be intentional. People in power should aim for their content to reflect their audiences—especially if they know that doing so could assist in increasing people's self-esteem and wellness. People who have the opportunity to represent their identity groups in any sector may make conscious efforts to use their influence to teach (or remind) others that their communities exist. Finally, parents and teachers can be more intentional in ensuring that their children and students always feel seen and validated. By providing youth with visual representations of people they can relate to, they can potentially save future generations from a lifetime of feeling underrepresented or misunderstood.

Kevin Leo Yabut Nadal, Ph.D.

Kevin Leo Yabut Nadal, Ph.D., is a Distinguished Professor of Psychology at the City University of New York and the author of books including Microaggressions and Traumatic Stress .
