Mixed Reality Images

Lars C. Grabbe, Patrick Rupert-Kruse, Norbert M. Schmitz (eds.)

# Mixed Reality Images

Trilogy of Synthetic Realities III

The Open Access publication of this book was funded by KOALA consortia (https://projects.tib.eu/koala). www.buechner-verlag.de


ISBN (Print) 978-3-96317-365-3
ISBN (ePDF) 978-3-96317-929-7
DOI 10.14631/978-3-96317-929-7
ISBN (Hardcover) 978-3-96317-310-3
ISBN (ePDF) 978-3-96317-859-7

Copyright © 2022 Büchner-Verlag eG, Marburg, Germany

Published in 2023 by Büchner-Verlag eG, Marburg/Germany

All rights reserved. No part of this book may be reproduced in any form by any means without permission in writing from the publisher.

Layout: Büchner-Verlag eG, Marburg, Germany
Proofreading: Stephanie Kramer
Cover design by Büchner-Verlag eG, Marburg, Germany

This work is licensed under CC BY-SA 4.0: https://creativecommons.org/licenses/by-sa/4.0/. The terms of the Creative Commons license apply only to original material. The reuse of material from other sources (marked with source reference) such as charts, illustrations, photos and text excerpts may require further permission for use from the respective rights holder. The German National Library lists this publication in the Deutsche Nationalbibliografie (German National Bibliography); detailed bibliographic information is available online at www.dnb.de.

Print edition Printing and binding: BoD – Books on Demand, Norderstedt

Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at https://dnb.dnb.de.


## Contents




### Acknowledgements

This publication is based on the special scientific cooperation of the University of Applied Sciences in Kiel, the Muthesius Academy of Fine Arts and Design in Kiel, and the MSD – Münster School of Design in Münster.

The basic idea and the core concepts of the *Yearbook of Moving Image Studies* (YoMIS) were systematically developed by the editors Prof. Dr. Lars C. Grabbe, Prof. Dr. Patrick Rupert-Kruse and Prof. Dr. Norbert M. Schmitz.

A special thanks goes to the University of Applied Sciences in Kiel, the Muthesius Academy of Fine Arts and Design in Kiel, and the MSD – Münster School of Design for funding and support.

Finally, the editors wish to thank the authors and the members of the editorial board for excellent work, global thinking, and inspiration.

> *Lars C. Grabbe, Patrick Rupert-Kruse & Norbert M. Schmitz October 2023*

### About the *Yearbook of Moving Image Studies* (YoMIS)

The significant work that led to the concept and idea of the Yearbook dates to 2011 and is closely connected with the initial establishment of the *Research Group Moving Image Science Kiel*|*Münster* in Kiel, Germany. Established as a doctoral seminar at the Christian-Albrechts-University in Kiel, the research group is now working in all areas of modern media theory, focusing on the essential role of visual media, technology and the structures of visual and pictorial media communication in the context of multimodality, intermediality or transmediality. The interdisciplinary research includes media and film studies, image science, philosophy of media and mind, phenomenological and semiotic approaches, art history, design theory, computer graphics, aesthetics, presence research, game studies, theories of perception and psychology and other research areas related to moving, technological, procedural, and dynamic images.

The academic engagement of the research group led to a series of conferences termed *Moving Images* (in 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2022 and 2023), which were intended to discuss and reflect on the concepts and structures of images as used in the traditional image sciences (in terms of static pictures or images) and, from a modern perspective, in relation to new and immersive media and image technologies.

The necessary consideration for the establishment of YoMIS is the interdisciplinary connection of German, European and international media research to improve the academic exchange of ideas. Therefore, YoMIS is innovatively conducted as an electronic and print publication to enhance the range of impact.

The Yearbook is based on a prolific scientific cooperation of the University of Applied Sciences Kiel, the Muthesius Academy of Fine Arts and Design in Kiel, and the MSD – Münster School of Design in Münster, and is edited and published by Prof. Dr. Lars C. Grabbe, Prof. Dr. Patrick Rupert-Kruse and Prof. Dr. Norbert M. Schmitz.

YoMIS is conducted as a periodic forum for international scholarly and intellectual exchange and interdisciplinary discussion; it is not intended as a publication for a specific academic school or tradition. The editors formulate the specific topic of each issue, but the members of the editorial board make the final decision on the publication of articles in a double-blind peer review process. The breadth of the topics and the variety of methodological approaches force a productive opposition of academic perspectives, which can certainly differ from the subjective perspectives of the editors.

> *Lars C. Grabbe, Patrick Rupert-Kruse & Norbert M. Schmitz October 2023*

## Introduction

#### *Lars C. Grabbe, Patrick Rupert-Kruse & Norbert M. Schmitz*

With the concept of the *Trilogy of Synthetic Realities* book series, the editors of the *Yearbook of Moving Image Studies* (*YoMIS*) want to extend the analytical range of research in the field of a modern and progressive image theory or image aesthetics. In this context, there is a need for reconfiguring the analytical frame in image theory with an explicit addressing of the interdependency of representation, action, and image technology. With a focus on recent developments in virtual, augmented, and mixed image media, future research should clearly carry forward what the *Trilogy of Synthetic Realities* has started in a fruitful manner. The goal was to address technological developments, the embedding of images in multi- or intermedial media conditions, the performance and action *of* and *with* images, the visual addressing of the sense modalities of the recipient, the bodily involvement and the corporeality of display images, the different aspects of learning and cognition through images, the shift from analog image patterns to digital image procedures—by hardware-software-dynamics—the aspect of image augmentation and real-world coupling, and the transformation of images from visual surface phenomena to embodied quasi-objects, avatarial bodies, multisensory excitation patterns or augmented and mixed reality display patterns.

In modern urban societies, images have become an important part of technological communication processes and civic media environments, and they affect real-life communication in drastic and intense ways. The mediatized society has already turned into a screen-based media ecology that shapes the polysensual use of a large variety of image visualizations. The editors argue that image communication has evolved from framed images at specific distances from the recipient (museum, art exhibition) and clearly defined image carriers (film reel, photography, sketch, illustration, etc.) to process-driven image operations displayed through complex screen technologies. The technological image is therefore a visual effect and software process, based on hardware variations and embedded in the framework of user interaction in the range of surface and interface. The editors are confident that a modern image theory must develop analytical tools and frameworks that are able to describe and evaluate the process-driven aspects of images between pictoriality and technicity.

The screen-based images are integrated in a variety of display devices that influence the procedural aspects of the image itself. These procedural images could be described as the transmitting effects of data procedures, and the display device as a medium becomes invisible beneath the perception threshold. The digital image is a twofold construct that depends directly on the screen, whereby the screen itself is an elusive medium that allows a pictorial infinity: the image that is generated by the display, whether in virtual, augmented, or mixed reality, is never finished, complete or final at any given moment. The procedural image contains the infinite aspects of the screen as "elusive and difficult to grasp. As surfaces of moving images, continuous flow of text and data, they have the appearance of elasticity, transparency, and immateriality (or even virtuality)" (Strauven 2021, 154). Screen images are thus closer to processes than to classical images. This includes a specific plus or benefit of the screen image (cf. Gotto 2018) regarding its re- and decomposability, which means that it never reaches a finalized version but rather passes through different processing states depending on the media technology (cf. Engell 2000).

With a focus on screen images and technology, the editors argue for a specific extended reality turn that addresses the use of images in the mediatized interplay of physical and digital realities: immersive image technologies like virtual reality, augmented reality, and mixed reality, summed up under the concept of extended or synthetic realities, allow and demand a novel form of interaction and corporeal relation with the procedural images and the digital image objects.

The editors of the third volume of the *Trilogy of Synthetic Realities*, with its specific emphasis on *Mixed Reality Images*, would like to argue that the pictorial aspects of mixed realities are becoming more physical, or more real from a hapto-tactile point of view, based on embedded artefacts and physical interaction. The *Mixed Reality Image* is realized by a technological procedure (in this case a software-hardware relation) that is a mode of movement (in this case a data detection interval, algorithm and software activation and a final image visualization). Finally, the mixed reality dynamic can be described as a temporal interval with a specific image duration (the duration of the visualized digital image). It seems plausible that the screen image enhances the structural aspects of the moved and interactive image when it is activated in the context of mixed reality. This means that the phenomenal structure of a physical extension and coupling becomes more evident in the context of a mixed or merged reality image: based on a full-body 360-degree visual simulation in a VR space or realized as a partially proprioceptive device simulation with AR image overlay that connects digital image objects and physical space or background. In a fully enhanced mixed reality condition, the visual simulation works in the mode of a (partially) proprioceptive and physical image interlinking that connects controllable digital image objects with a physical background in combination with a physical artefact interaction.

It was already shown in the volumes on *Virtual* and *Augmented Images* that working and living with extended reality technologies has become a challenging aspect of everyday life and that this implies some enriching dynamics in the information society with unexpected impulses for aesthetics, art, and design of image media. Additionally, it seems evident that the different media practices will, on the one hand, structure a set of conventional forms, like the development of the *classical style* in film history and aesthetics, and, on the other hand, generate a variety of experimental opportunities in the form processes, as is characteristic of the liberal arts.

The editors of *Mixed Reality Images: Trilogy of Synthetic Realities III* will address the theoretical and analytical aspects of *mixed reality images* that are challenging and enriching life in ways that have already been characterized by science fiction movies, comics, and novels. Thus, the authors of the *Mixed Reality Images* issue of the *Yearbook of Moving Image Studies* concentrate on mixed reality images and physical artefacts, specific augmented media technologies, graphic representations, and material interfaces of mixed and augmented reality. They focus on aspects like perception, simulation, augmented performance, and virtual modes of action. Aspects of mixed reality aesthetics, art and design, and communication will be highlighted as well as forms of interaction and narration in digital media ecologies.


In *The "Art of Immersion" as a Reflection of Human Nature: Illusionistic Forms as Aesthetic Strategies*, Norbert M. Schmitz asks about the anthropological conditions of spatial, in particular stereoscopic, vision in order to establish that "immersion" represents the normal case of human perception, which stands only in a functional relationship to the "objective" world, which remains inaccessible to us. Based on this, Schmitz asks about the possibility of an "art of immersion" as an aesthetic strategy of modern art under the conditions of contemporary concepts of biological constructivism.

In *On the Politics of Augmented Reality*, Jens Schröter focuses on the field of augmented reality, in which 3D virtual objects are integrated into 3D real environments in real time. He differentiates AR historically from virtual reality and discusses different applications of AR. Finally, he discusses the political functions such applications can have in late capitalism in the context of Deleuze's control society.

Niklas F. Becker shows in *Hardware Effects on AR Pictoriality: A Phenomenological Approach* how pictorial media structure images in different ways and evoke modifications in their pictoriality. He argues that augmented reality technologies evoke virtual images, and he uses concepts of Edmund Husserl's phenomenology of the image to analyze the relation of the mesh between AR image carriers (screens and HMDs), the image objects, and the represented image subjects, as well as the perceptive and interactional position of the user.

In *A kind of Mixed, Intermediate Experience: On the Entanglement of Image and Bodies,* Julia Reich and Manuel van der Veen focus on the concept of mixture that primes the description of mixed reality images. They focus on the entanglement of image and bodies and understand the body by the concept of Michel Foucault, who argues that real bodies reach into virtuality as well as virtual bodies require a localization. Based on this viewpoint, they examine three contemporary artworks by the artists Banz & Bowinkel, Sarah Rothberg, and Charlotte Triebus.

Pamela C. Scorzin discusses mixed reality and its prominent role in contemporary art in *The Phygital as the Virtual Real: The Role of Mixed Realities in Contemporary Art*, with reference to the mixed reality experiences of Rimini Protokoll and the digital artists Manuel Rossner and Marie Lienhard. She shows how immersive art experiences refer to structural aspects of immersion, interaction, incorporation, and illusion that transcend the categories of the staging realm and lead to the 'phygital' as the new virtual real.

In *Inhabitable Bodies: On Embodying Virtual Reality Experiences*, Anna Caterina Dalmasso focuses on virtual reality as a medium for providing first-person experiences that transcend the limitations of physical embodiment. She asks how virtual reality engages with the possibility of inhabiting a different body, to provide users with prosthetic or augmented bodies. With a phenomenological analysis of the conditions of embodiment by contemporary immersive environments, she gives insights into the augmentation of the virtual by immersive interfaces and the living and moving body as an aspect of performativity.

In her reflections in *Exploring Architecture with Image Technologies: From Narrative Film to VR, AR and MR Narrative Structures*, Katarina Andjelkovic discusses image technologies in the range of modeling, reconstruction and documentation of architectural buildings, experimental architecture, human tracking and video representation. Her goal is to discuss heritage architecture and narrative film, virtual reality, augmented reality and mixed reality environments to understand various ways of experiencing space, reality and illusion.

#### References


## The "Art of Immersion" as a Reflection of Human Nature: Illusionistic Forms as Aesthetic Strategies

*Norbert M. Schmitz*

#### Abstract

This essay inquires into the "art of immersion" and the significance the recent "progress" of digitally-based three-dimensional and multimodal illusionistic technologies holds for art. Issues addressed include classic applications in design, the artistic deconstruction of these new forms of perfected mimesis, and the potential for positively defining an "art of immersion." The essay expands upon this by addressing the neurobiological and anthropologically determined perceptual preconditions for generating illusionary and immersive artifacts. Early in the essay, the author postulates that immersion is quite simply our normal state of perception. In other words, the perception apparatus of our nervous system interprets certain stimuli as objects, situations, etc., which give the impression of presence as "phenomena" or qualitative sensations neuroscientists call "qualia." But we know nothing about what lies behind this, the Kantian "thing in itself." Thus, in a narrower sense, immersion as the goal of formative artifacts ranging from the Renaissance image to cyberspace reproduces the forms of our perception by creating figurations that stimulate the brain to perceive the presence of something imaginary. Would it then be the task of modern art to address a contemporary epistemology in which the "art of immersion" reflects our everyday perception in a new way?

#### Keywords

Immersion as aesthetic strategy, neuronal construction, digital mimesis, digital realism, aesthetic difference of immersion, art of immersion, artistic reflection of immersion

… Parrhasius …, it is said, entered into a pictorial contest with Zeuxis, who represented some grapes, painted so naturally that the birds flew towards the spot where the picture was exhibited. Parrhasius, on the other hand, exhibited a curtain, drawn with such singular truthfulness, that Zeuxis, elated with the judgment which had been passed upon his work by the birds, haughtily demanded that the curtain should be drawn aside to let the picture be seen. Upon finding his mistake, with a great degree of ingenuous candour he admitted that he had been surpassed, for that whereas he himself had only deceived the birds, Parrhasius had deceived him, an artist. (Pliny, Nat. Hist. XXXV, 64)

*Figure 1: Florence: Schmitz standing before Masaccio.*

In 2001, in his seminal history of virtual art, Oliver Grau offered a conception of the term "immersion" that is both cogent and useful:

In virtual space, both historically and in the present, the illusion works on two levels: first, there is the classic function of illusion which is the playful and conscious submission to appearance that is the aesthetic enjoyment of illusion. Second, by intensifying the suggestive image effects and through appearance, this can temporarily overwhelm perception of the difference between image space and reality. This suggestive power may, for a certain time, suspend the relationship between subject and object, and the 'as if' may have effects on awareness. The power of a hitherto unknown or perfected medium of illusion to deceive the senses leads the observer to act or feel according to the scene or logic of the images and, to a certain degree, may even succeed in captivating awareness. This is the starting point for historic illusion spaces and their immersive successors in art and media history. They use multimedia to increase and maximize suggestion in order to erode the inner distance of the observer and ensure maximum effect for their message. (Grau 2003 (English ed.), 17)

This is Grau's argument. But is it enough to describe and analyze the almost positivistic progress of this illusionism, which is currently working its way into our everyday media usage through new, marketable head-mounted display sets? *Shouldn't the question concerning the "art of immersion" be phrased very differently? It's not the phantasms of poststructuralism we're interested in here, because the agony of the real has failed to materialize, even well over thirty years following the onset of the "digital revolution" and its "technoaesthetics"* (cf. Kerckhove 1993; Weibel 1991).<sup>1</sup>

<sup>1</sup> Peter Gendolla summarizes the problem succinctly using a careful derivation of the concept of simulation: "Ranging in application between the technical-descriptive and the morally evaluating, media theories to date draw inconclusive, contradictory, or paradoxical conclusions, in all likelihood unfavorably influenced by the writings of Jean Baudrillard, most notably his *Symbolic Exchange and Death* (1976). The book refers to technical simulations that have progressed beyond analysis in highly industrialized countries, particularly the USA. With its erratic combination of phenomena, but with a single spectacular thesis on genealogy, structure, and other consequences in a simulated world of this kind, it caused a sensation, especially among cultural studies scholars with little technological knowledge, with its claim that our traditional systems of symbols and references had been dissolved. The genre has gradually made nature, originally inaccessible to us, available and transformed it into reality via technical reproduction and construction, only to finally remove it again through the development of the latest computer-aided and networked media systems, erasing its reference to materials and bodies (physicality), the world both acting and acted upon transformed into a pure drawing process. The inconsistencies, paradoxes, and hasty nature of such a theory (which was at least able to crystallize something akin to a 'basis' of the entire postmodern discussion) were gradually cleared up in the further development of media studies and through more systematic knowledge of the possibilities and limitations of simulations" (Gendolla 2002, 332).

*I'm not referring to its various applications in terms of prolonging mimesis and realism in modern art in the fields of film and design, which has frequently been impressive; my question is aimed at a modern, autonomous aesthetics, one that was, after all, characterized by a "crisis of representation."*

*The term "immersion" can be derived from the baptisms of the ancient and Eastern Churches. I'm using it here in the sense of a secularized world view, of course, as a metaphor for entry into a new reality and not as the entry itself.*

In this context, the terms "virtual reality," "simulation," "immersion," etc., cannot always be used with clear distinctions in the sense of a philosophical system. In keeping with Lambert Wiesing, however, it should be noted that immersion and virtual reality do not necessarily go together, in that there are visual artifacts that function immersively, completely independently of any illusionistic effects (Wiesing 2005, in particular: 107–115), one example being abstract optical art. In contrast, some virtual worlds tend to establish distance, as do most depictions of objects in scientific illustrations. Here, immersion is understood as a particular psychological relationship between perception and the object of perception, and the question is to what extent this is changed by the new media, that is, those that are indeed new.<sup>2</sup> Presence alone does not generate a desire to 'immerse' oneself in the simulated world. This postulation, however, is independent of whether we're looking at a simulation of real or merely possible things (in other words, an assumed reality or the imagination), for instance a military scenario in the Iraq War rendered as realistically as possible, or the fantasy world of a computer game. It nonetheless remains unequivocal in this regard that the new digital media are generating a boost towards increasing immersive potential and thus boosting their importance in visual culture to a previously unknown degree. Lambert Wiesing specifies further:

In the history of the development of digital media, the American Jaron Lanier is one of the key inventors and producers of simulation and cyberspace technologies. In 1990, the same year Flusser published 'The New Imagination', he was asked what vision he was pursuing with his numerous inventions. His answer was short and to the point: 'I want to externalize your mind' (Lanier 1990, 46). In fact, it is precisely this intended goal that constitutes a first clear qualitative leap in the image's development through the new media: the visual object becomes an externalized fantasy object. From this moment on, not only does what someone has thought or imagined become visible in the image, but the process of imagining is itself transformed into the visible. Things can now be seen that one could previously only imagine. It's no longer the products of the imagination that are represented: the act of imagining itself is presented visually in a visible, hence public form. It's a matter of adapting the possibilities for changing image content to the possibilities for changing one's fantasy. (Wiesing 2005, 118f.)

<sup>2</sup> In this regard, Wiesing differentiates between new media and truly new media (cf. Wiesing 2005, 118).

Wiesing describes the special field in which these new media really are new and not mere continuations of older traditions of illusionistic pictorial practices. However, immersive aesthetics combine this new quality with older forms of simulating perception and the affects ranging from the use of perspective in art to a rhetoric of affect as dominant forms of visual culture in modern times. In any case, the only place new media appear as 'pure digital art' is at Ars Electronica.

*What are the aesthetic challenges for an "art of immersion" as an artistic reflection on immersion? To begin with, the central thesis of this essay is as follows: in a neurophysiological sense, immersion represents the normal state of perception, i.e., the human being lives in the continuum of an objective world, which he not only perceives as reality on a daily basis, but also* has *to perceive in order to carry out the practical tasks that guarantee his survival. To reflect on this is the program of an art of immersion.*

It may be possible for Asian monks in a state of complete meditative immersion to remove themselves from their physical existence to such an extent that they can rise above it. But even the ordinary philosopher, reflecting at his desk or in a seminar on questions of idealism in, let's say, Johann Gottlieb Fichte's radical formulations, does not stop at the working hypothesis that both the entire world and the particular room in which he is currently speculating could be no more than mere appearance. At the same time, he 'speaks' to his body in a clear bodily language, and in doing so, confirms his second working hypothesis, less conscious than the philosophical speculation, that the chair he is sitting on is as real as the fact that it is supporting him. Similarly, the neuroscientist's understanding of the way our brains construct the world in neuronal connections has no effect on his actions, whether they be walking in the classroom, assessing the perceptions of his patients and subjects, or his own self speaking. *We are, therefore, constantly in a state of complete immersion; in other words, we consider the world around us to be "real," regardless of the ontological status we ascribe to it theoretically.* Thus, the whole world, as it surrounds us in our self-evident perception, is nothing other than the result of complex neuronal processing, or, as one could say in the spirit of ancient Indian and Buddhist thinking, 'Maya', whose veil we can never move past with our ordinary senses.<sup>3</sup> In the process, Kant's transcendental critique collides with the biology of knowledge. *We have no way of transcending this framework, at least not at the level of our nature.* We can at most reflectively visualize its conditions, its natural-historical development, culturally refined forms, etc., and it is precisely this framework that is the actual object of an epistemology of the aesthetic. *Thus, we've always found ourselves in a space of complete immersion, and so cyberspace is perhaps a better metaphor for the contingency of human knowledge of the world than the Platonic cave.*<sup>4</sup>

*Figure 2: Conference room.*

<sup>3</sup> I would like to point here to an introduction founded in religious studies, but based on a Western philosophical perspective: Schumann 1976.

<sup>4</sup> For this promises an ascent to true ideas beyond deception, while cyberspace claims nothing more than to be a perfect illusion, regardless of whether it represents reality or the imagination.

*Figure 3: Head-mounted display, Oculus Rift.*

In any case, in the context of the following discourse there is no outside of any cave, no ontology of the 'truly real', the idea hidden behind the *essent,* etc., because it's not about possibilities for mystically transgressing sensual experience, but about what can appear within the limitations of this very concrete bodily perception. Even artistic artifacts cannot move beyond this sensual framework, despite counterfactual claims made by all sorts of artistic endeavors.

This all has very little to do with radical constructivism, however, because, as we will also see in outline here, our 'fantasies' refer to an external world that determines our existence and concrete nature, which they do not depict, however, but interpret in terms of very few characteristics which are nevertheless necessary for survival. My approach is therefore that of a 'biological constructivism' as it was understood in the humanities primarily through the writings of Humberto Maturana and Francisco Varela (Riedl 1980). *But what is the function of all the media artifacts from Renaissance painting to cyberspace? To my mind, for a truly modern "art of immersion," this question is crucial.*

#### 1. The Nature of Immersion

Let us first take a contemporary look at our worldview from the perspective of an anthropology of perception, that is to say, at the status of objective reality, the aforementioned sensual appearance that surrounds us every day.<sup>5</sup> Its dubiousness is now an integral part of everyday knowledge, at least for the educated public. Once again: *our original, immediate experience of the world forms in the brain as a construct, the result of the neuronal processing of certain configurations of stimuli which our sense organs selectively perceive. In neuroscientific terms, we experience it subjectively as qualia, as a colorful, separate, dazzling outside reality brimming with fascinating details, many of them mysterious.*

In the following, I will at first largely limit my argument to the sense of sight as the most important sense for Homo sapiens in terms of developing the unique intellectual abilities of the species.<sup>6</sup> The focus here is (almost) exclusively on the light stimuli independent of objects in a light wavelength specifically delineated by human perceptual abilities.<sup>7</sup> These stimuli are separate from the properties of the objects they 'report' about, i.e., they are neutral in terms of beneficial or detrimental effects on humans, and this is one reason why we are so uniquely suited to perceiving them. This is important to note, because artificial stimuli such as brushstrokes on a canvas and pixels on a screen are just as independent of what they represent.

<sup>5</sup> Pitted here against a radical constructivism in the sense of an unrestricted relativism and the 'salon discourse' of excited 'media bohemians' it turned into in the late nineties is a simpler localization of knowledge in the pragmatic spaces of action of living organisms. It remains especially incomprehensible, at least to me, why Maturana and Varela were called upon as 'star witnesses à la mode' of an unrestricted relativism. In fact, it seems to me that the results of 'constructivist biology' have not yet been sufficiently embedded in a modern theory of mimesis (cf. Maturana and Varela 1987).

<sup>6</sup> Methodologically, this is permissible for the following argumentation, which then necessarily includes other sensory modalities. How representational and immersion effects can be enhanced through multimodality is another topic; what is essential, however, is the sophistication of the different sensory modalities with regard to their processing in the brain. The human experience of the world is not simply the sum of all sensory data; rather, the various evolutionarily differentiated senses play a no less differentiated role in the construction of our everyday, objective worldview (see Grabbe et al. 2013).

<sup>7</sup> Recent remnants of archaic exposure to light, for example the effects of excessive or weak solar exposure on the hormonal system, are not taken into account. The processing of this information takes place relatively independently of 'conscious' seeing.

One can visualize the special quality of light's 'lack of properties' with a more archaic sense, such as touch, which is why the role this primal sense plays in every state of immersion, that is, in the perception of reality, is much more difficult to simulate.8 In terms of phylogenetic history, touch can be traced far back to the earliest days of animal evolution and to the time before light was discovered as an information carrier, namely, to the first reaction of a closed organism to any direct 'touch' by the outside world. Such stimuli do not represent a world at this level of organization of living beings, but are already a direct part either immediately useful or detrimental to the survival of the organism. The stimuli offer no information about possible events; they are the event. *Representation and that which is represented, signifier and signified are identical here; I'm speaking of a pre-semiotic level of perception.* Even more highly developed creatures, such as the frogs so popular in experiments, do not yet have an 'idea' of the world around them, since they only perceive it as a specific constellation of stimuli through congenital receptors in the brain, which lead to an immediate, reflexive, non-conscious response.

When the light stimulus of a moving fly is perceived by a frog's eye, which, as a vertebrate eye, is physiologically and morphologically similar to its human counterpart, it isn't an insect the amphibian sees. The amphibian brain merely recognizes a certain stimulus dynamic on account of its speed, and this triggers a reflex of the tongue aimed at the prey. Can the reconstruction of isolated stimuli of this nature, as ethologists attempt in spectacular experiments, be described as an immersion apparatus? The stimulus patterns, naturally selected over countless generations—one might say an organism's "way of reading"—are innate to the animal and ensure its survival. The constellation of stimuli, i.e., the insect's customary manner of flying, matches perfectly with the corresponding, genetically determined neuronal patterns in the amphibian's brain. Evolution also, however, means ongoing changes in environmental stimuli, and so flies survive by altering their flight patterns and frogs by constantly adapting to these ever-changing conditions; the mutation in the wing beat of the prey thus requires a corresponding mutation in the stimulus pattern inside the predator's brain. In the following, however, I am less concerned with natural-historical/evolutionary dynamics than with systematically comparing this "being anchored in the world" of a creature without a cerebrum and the biological structure of our "world view." Things are entirely different in the case of humans, since they can, apart from the archaic remnants of their evolutionary history, perceive the stimuli of the outside world neutrally and independently of their concrete causes.9 *In general, on the basic level of world perception, humans do not select the stimuli of their environment in advance, but absorb them in a fairly remarkable abundance in order to then "interpret" them after the fact, in a preconscious manner. This interpretation is "open to the world" in Gehlen's sense, i.e., a Homo sapiens recognizes his environment independently of any phylogenetically acquired programs, purely from the experience of interacting with it* (Gehlen 1940)*.* This has the effect that similar configurations of stimuli trigger similar and occasionally almost identical interpretations, even if their 'causes' or triggers are completely different. The spectrum ranges from simple stimulus triggers on archaic levels of perception to the complicated stimulus configuration of the physical world we are immersed in every day.

<sup>8</sup> Think, for example, of the fragrance strips in the Odorama patented by Morton L. Heilig in 1962, which were also intended to provide olfactory immersion. An effective 'smell cinema' has not, however, been realized to this day. What is still missing is a universal code that can permanently simulate every possible stimulus. It is precisely the objective, concrete properties of the stimuli in the more archaic sensory modalities of touch, smell, and taste that make technical solutions more difficult than the translation of neutral light and sound stimuli into signals from technical devices.

In most animals, perception is phylogenetically determined. It is not possible to elaborate here on where precisely the extraordinarily complex neuronal performance of an actual consciousness begins when 'recognizing' an autonomous objective world. This means that in most living beings, the organism's ability to 'process information' is limited to a few phylogenetically acquired, i.e., individually non-variable slices of the outside world. The living creature 'responds' to certain configurations of stimuli, which the brain recognizes in a genetically programmed manner, leading to reactions that are physically and neuronally determined, such as the tick reacting to the smell of butyric acid or the stickleback responding to the hue of the belly in a potential sexual partner. Such stimuli, as the experiments of classical ethology teach us, are easy to simulate, and every torero knows how to use a simple red cloth to make the bull charge into an immersive trap.10 Humans, too, harbor archaic remnants of older systems of perceiving the environment such as these, as evidenced by all the pornography11 that advertising is only too happy to profit from beneath the surface of the photographic illusion.12

<sup>9</sup> The following considerations primarily address the neurophysiologically-based explanation of the function and evolutionary genesis of perception and its physical foundations, as in Edelman (1993). Like other neuroscientific positions, this view clearly sets itself apart from the static models of AI research.

*Figure 4: Poster for the Galeria department store around 2010.*


These phenomena are often vital to survival, yet as significant as they are, the distinguishing feature of human perception—the essential adaptation nearly unique to the human animal—is the non-determined objective interpretation of a rich, unspecified flow of stimuli as a world continuum.13 Thus, I would like to reserve the concept of immersion in the further course of argumentation for objective world experience, although its credibility and the impression it makes can be intensified by strong affects, or, to put it another way, emotional excitement all too easily allows us to overlook the technical deficits of illusionistic means.14 Because in the complex structure of our psyche, all stages of our phylogenetic development are still present, and their complex interaction makes up our everyday perception.15 And when we speak of illusion and immersion here, it is chiefly a matter of simulating a halfway conscious perception of an objective world, as we constantly generate it from the sensory data available to us.16

Conversely, one could describe the unambiguity with which we react to ever more roughly formulated stimuli, depending on our hormonal levels, as the highest form of immersion. It is well known that in the Age of Discovery, deprived sailors "saw the bodies of mermaids in Beluga whales".

<sup>13</sup> Irrespective of this, 'remnants' of far more archaic forms of perception have also been preserved in humans, such as the involuntary knee jerk in response to a doctor testing the body's reflexes. At a higher level of neuronal organization, archaic, 'preconscious' patterns can lead to more 'conscious' actions on the part of the individual. More than anything, what should be mentioned here are unvarying stimulus patterns, such as particular sexual attractors or the phenomenon of cuteness, which also function entirely independently of real triggers, i.e., an attractive person or a baby, as in the case of a rough erotic drawing or teddy bear. However, even these biologically primeval forms of perception have something to do with immersion, i.e., diving into a virtual world created by artificial artifacts, in which the awareness of a difference to ordinary reality disappears. However, it was an error of Lorenz's classical ethology to interpret such remnants of earlier evolutionary stages as essential biological structures of Homo sapiens (see Lorenz 1965).

<sup>14</sup> For a comparison of these components cf. also Grabbe 2015.

<sup>15</sup> In this respect, Haeckel's rule, that phylogeny is repeated to some extent in the ontogeny of every living being, refutes all one-dimensional technical conceptions of humans in AI research, which, based on a myopic analogy, compare the structure of the human brain, which can only be understood in evolutionary terms, with the data organization of the Turing machine. The computer, however, was developed by the purposeful designs of the 'creator gods' Alan Turing and John von Neumann, while the human psyche is the 'product' of a structural-logical process taking place over millions of years without an 'intelligent designer' of any kind.

<sup>16</sup> I use the word "halfway" because we are, of course, not really aware, outside the focus of our attention, of the specific space in which we move, but rather consider its continuity a given in a kind of ongoing 'construction work'. When I turn my head to the right to look at a student, the left side never disappears into nothingness. In this respect, it would be interesting to more closely analyze the reception of Renaissance paintings or photographs cropped by a frame.

I'd like to interject a brief remark here. All these considerations are of course the observations of a layman in natural science who is well aware of the fact that the concrete act of 'processing' sensory data is infinitely more complex than could even be summarized here, and part of an ongoing field of research.17 What is relevant in this context is a basic structure and its importance for the humanities and cultural sciences, exclusively in terms of the interface between natural vision and artificial illusion and as such, to my mind, long considered a legitimate framework, despite all the progress made in neuroscience and sensory physiology.18 But back to the techniques of immersion.

*The perceiving subject therefore precedes all media-based implementations of immersion.19 In this sense, pornography is its most potent form. Thus, while we continue discussing technology's progress, it should be noted that an "immersion" as complete as this requires only the slightest technical effort, and we become as blind to the artificiality of the stage sets and props as the stickleback to the red spot.*

<sup>17</sup> The complex selection and translation of stimuli between the eye and the brain alone is hardly comprehensible to the layperson. A popular, somewhat detailed account can be found in: Gregory 2001.

<sup>18</sup> On the basic procedure in more methodological detail: Schmitz 2002. In this vein, an older source can still be considered a reliable and didactically convincing presentation of the evolutionary foundations of the human mind for the interested layperson: Ditfurth 1976. Equally recommended is the series he conceived and moderated, "Cross Sections," broadcast from 1971 to 1989 on ZDF (the so-called *second* German television channel). Incidentally, Ditfurth's masterpiece of scientific journalism would be hard to imagine in today's public media landscape, which operates on reductionism and sensationalist entertainment.

<sup>19</sup> To my mind, this should not be transgressed. Film studies approaches based, e.g., on Bruno Latour's multiperspectivity in the sense of an 'other', non-modern way of thinking such as that of shamanism, radical feminist approaches like that of Laura Mulvey, who identifies central perspective as a male-patriarchal mode of looking, and similar concepts overlook the fact that in every single filmic reconstruction, including digital film, the above-mentioned is a given in technology-immanent terms. It cannot, then, be transcended in the medium, which means that it can be deconstructed, thematized, or symbolically transformed by a wide variety of aesthetic processes, but only by means of a central perspective, its technical precondition. Perhaps one should differentiate more clearly here between medial or perception-anthropological frameworks and their cultural use, a distinction that would certainly be an excellent object of artistic scrutiny (see: Mulvey 1994).

For the immersive experience, the degree of illusionism is only one component alongside an affective willingness to 'believe and want to experience something.' Indeed, my remarks about its role within the 'nature of perception' as a condition of every immersion and thus, in turn, of the artistic reflection of the same should be explained more clearly. The limited space this essay provides, however, only allows me to mention a few points. The skills of art since the Renaissance to simulate the affect-producing stimuli, i.e., the developed visual rhetoric, are an absolute prerequisite for the current 'triumph' of immersive technologies (essential: Knape and Grüner 2007). But the main argument is this: the evolutionary advantage and relative objectivity of this type of perception are evident, because they allow us to freely interpret new environments and thus engage with constantly changing surroundings. Summed up in far too few words, this is all crucial to the question of immersion. It should be noted that a special feature of humans, perhaps also of some higher mammals possessing a cerebrum, is that they are surrounded by a continuous flow of stimuli, which they also continuously interpret as a particular world constellation. This 'biological objectivity' is nothing but a statistical value in the sense of 'biological constructivism.' The adequacy of the interpretation, its objectivity, consists of the fact that this neural construct is sufficient to ensure our ability to survive. It is not a matter of recognizing, let alone understanding, one or another ontological reality. We may well accept or assume its existence, but it is as inaccessible to us as the 'thing-in-itself' in the sense of Kant.

*Figure 5: A red stimulus: Torero.*

Fundamentally, our human perception is 'stimulus-neutral', i.e., depending on the context and stimulus environment, we can interpret a certain configuration of light stimuli as a red spot, a bit of tablecloth, a splatter of blood, or as the abstract red square of a modern painter (cf. also Danto 1991, 17–21). Except for the fact that we can't *not* see anything: the only thing we can do is close our eyes.

*In the context of these considerations, illusion means nothing more than the simulation of certain configurations of stimuli independent of the various objects that trigger them—such as the sight of the Baptistery in Florence, or a blob of paint—if the constellation is close enough.* (What is amazing is the tolerance with which our brain accepts similar stimuli as sufficient.) We are compelled to interpret certain stimulus configurations accordingly, for example when we recognize our grandfather in an arrangement of photographic gelatin on barium sulfate paper, or interpret points of light on a screen as an image from the Syrian Civil War. Incidentally, the name of the first black rectangle publicly presented as a work of art was *Combat de Nègres dans une cave pendant la nuit,* which the caricaturist Alphonse Allais published as a reproduction in his "tableaux célèbres" a good twenty years before the "Last Futurist Exhibition" of 1915.20

*Figure 6: Malevich, Last Futurist Exhibition.*

<sup>20</sup> Cf. Allais 1993. At this point I would like to thank the art student Alexander Wagner, who found this object while researching Kazimir Malevich. For further elaboration see Schmitz 2013.

In the unambiguity of the undifferentiated stimulus constellation of a black field, the image of a black surface, so open to interpretation, offers the viewer perhaps the widest range of hypotheses. We can imagine basically anything and everything in the darkness of this cave of unknown proportions. However, while we readily recognize the Suprematist 'icon' as a flat canvas, when we visit the Cappella Tornabuoni in Santa Maria Novella with Ghirlandaio's frescoes, it's not so easy to attain the same insight that we're looking at no more than a painted and plastered wall.

*Figure 7: Ghirlandaio as a dirty wall, around 1488.*

This becomes particularly clear with the abstract interpretation of classical figurative painting by the classical avant-garde, the "analyses of old masters." It is not an involuntary act, but rather a difficult, purposeful act, to look at a painting by Raphael and see not a deeply spatial Madonna in a Tuscan landscape, but a flat triangular composition (for further elaboration see Schmitz 1999, 242f.).

It is no less difficult to disregard the representational nature of an immersive arrangement that is perhaps inadequate by today's standards, for example in an older computer game, than to create a perfect illusion. Accordingly, even with the simplest technical set-ups, immersive effects appear almost automatically, the affective preconditions are fulfilled, and attention becomes focused.

*Figure 8: Johannes Itten: Analytic sketch of an ancient Egyptian sculpture, 1915.*

In everyday perception, this is inconsequential. Here, interpreting an ongoing flow of stimuli is about generating hypotheses for practical orientation in the world acquired through simple everyday habit, and not about reflecting on them. As a rule, lines that converge in a point indicate depth of space and not a surface. This equivalence between everyday perception and the perception of illusionistic artifacts can be shown using a famous experiment in Gestalt psychology by the artist and scientist Adelbert Ames Jr., which Ernst Gombrich describes as follows:

One of them which can be fairly successfully illustrated makes use of three peepholes through which we can look with one eye at each of three objects displayed in the distance. Each time the object looks like a tubular chair. But when we go round and look at the three objects from another angle, we discover that only one of them is a chair of normal shape. The right-hand one is really a distorted, skewy object which only assumes the appearance of a chair from the one angle at which we first looked at it; the middle one presents an even greater surprise: it is not even one coherent object but a variety of wires extended in front of a backdrop on which is painted what we took to be the seat of the chair. One of the three chairs we saw was real, the other two illusions. (Gombrich 1961, 248f.)

*Figure 9: Adelbert Ames Jr.: The Ames chair demonstrations, from: Ernst Gombrich: Art and Illusion. A Study in the Psychology of Pictorial Representation, New York (Pantheon) 1961.*

The set-up, in fact, treads the border between natural perception and constructed illusion. The experiment allows us to feel for a moment the illusion we always find ourselves within, the inexorability of which leads me to speak of our 'natural immersion space.' Gombrich continues: "What is hard to imagine is the tenacity of the illusion, the hold it maintains on us even after we have been undeceived. We return to the three peepholes and, whether we want it or not, the illusion is there" (Ibid.). And so immersing ourselves in the virtual reality of our own minds is no less compulsive than immersing ourselves in artificially illusive spaces, whether they be Renaissance images or cyberspace. This requires, as Oliver Grau observes, the "most exact adaptation of illusionary information to the physiological disposition of the human senses" (Grau 2003, 14). Artificial illusion and immersion, then, entail nothing more than simulations of this nature, configurations of intrinsically neutral stimuli.

The preceding observations on the anthropologically determined framework of our perception form the basis for the following connection drawn between the classic concept of illusion and immersion, because in our everyday neuronal life, we do not live inside our heads like homunculi and observe the workings of the illusion-producing machine that presents us with deceptive images, but are always in the midst of this illusion, with all our senses.

The actual illusion takes place in the mind of the viewer because, as explained above, his or her brain is forced to process similar stimuli in the same way (cf. Greenlee et al. 2013). It is amazing how the brain can construct extremely complex worlds from a relatively small number of stimuli, which themselves undergo selection and considerable alterations in form in the course of neuronal processing. It is an evolutionary adaptation to the usual hominid environment when, with a reasonable degree of certainty, ordinary gaps—such as prey hidden partially behind a tree, or to take an example from our own everyday surroundings, a leg concealed behind the back of a chair—are automatically 'seen'; or if, in keeping with the phenomenon of perceptual constancy, distant objects in a large space appear to us in their correct proportions instead of disappearing with the perspective, etc.21 Here, too, the brain always interprets the same constellation of stimuli in the same way, regardless of whether they are natural or artificial.22 *This is the insurmountable 'nature of immersion.'*

This compulsion in the illusionistic interpretation of certain stimulus configurations has been endemic to classical illusion painting since the Renaissance: every ordinary peep box has the power to draw us in. However, this form of visual perception is too closely tied to a more complex experience of reality with everything that goes along with it, i.e., the room the museum painting is hanging in, or the sofa in front of the living room TV.23 Even so, and this would be the other major topic I haven't discussed here, the emotional charging of images during normal media consumption alternating between the dispositives of cinema, television, and the Internet succeeds often enough—think of a soccer match—in absorbing our attention to the extent that all surrounding information disappears and we become almost completely immersed in the other world of movie or television images (see Baudry 2001, 95–135). In purely technical terms, it's almost impossible to simulate the dynamic flow of the sum total of all visual—and certainly all sensory—stimuli surrounding us. Leaving aside the manipulation of the visual neuronal areas in the brain, there are basically two ways of 'undermining' this restriction: either we generate a situation through an artificial arrangement, for example a monocular peep-box arrangement in central perspective, in which the stimulus flow is drastically reduced, or we use smartglasses in an attempt to create a dynamic binocular configuration of stimuli that is as detailed as possible.

<sup>21</sup> For a clear explication of these phenomena from the point of view of today's Gestalt psychology, see Irtel 2007.

<sup>22</sup> This becomes clear in the various ways in which differences in distance are processed in the normal anthropological horizontal as opposed to the vertical, which is less customary for the human 'bottom-dwelling animal': an effect that artists such as Moholy-Nagy and Vertov knew how to use aesthetically. For more detail see: Schmitz 1994 and Simmen 1990. The peculiar immersive effect of extreme verticals, as in Hitchcock's *Vertigo*, or with ordinary roller coaster rides wearing smartglasses, is something that merits more discussion.

<sup>23</sup> As is well known, Günther Anders describes the infiltration of television into modern society as an anthropological turning point. And the new digital technologies only increase the effects of the media's penetration into everyday culture. To my mind, in this respect its importance for media theory cannot be overestimated (see Anders 1980).

In the first case, there is no movement, and so it literally remains a *nature morte*. The history of trompe-l'œil painting embodies this very attempt, which aims at creating a complete illusion, at least for a moment. Who hasn't 'fallen' for a violin or the like hanging at an angle behind a partially open (real) door? *This aspect of the paradoxical simultaneity of immediate illusionistic credibility and the aesthetic enjoyment of the illusion itself characterizes the psychology of immersion, or rather illusion's immersiveness.* Winfried Menninghaus elaborates:

… accompanying awareness of the unreality of their scenarios … oscillates between the poles of deep immersion (immersion in a game, music, book, or film, to the point that reality is forgotten) and a distanced awareness of the ontological difference. The two should not be considered alternatives, because they occur in different mixtures: even an immersive reader hardly ever loses complete awareness of the ontological difference. From Cervantes's *Don Quixote* to Woody Allen's *The Purple Rose of Cairo*, works of art use the phantasm of the boundary between reality and the world on the page or canvas dissolving completely in order to generate strong comic effects while revealing the pathology of this loss of distinction. Even the redefinition of Pygmalion's statue as his wife does not, on the part of the observer/recipient, lead to an analogous elimination of the differences between art and reality. Rather, it's precisely this abolition as a rhetorical adynaton—as a spectacular representation of a human impossibility—which becomes the subject of an allegorical reading that lives entirely from the distinction between the cognitive frames of art and life and by no means from the definitive annulment of this difference. (Menninghaus 2011, 211)

*Figure 11: Trompe-l'œil painting from: Cornelis Norbertus Gijsbrechts: Cabinet of curiosities with a Hercules-Group, 1670.*

For this moment, the real exhibition space—a side corridor in a Dutch castle as the combination of an actual spatial situation that can be physically entered and an illusionary image inserted into its visual order—is perhaps a more perfect immersion space than even the most sophisticated smartglasses currently have to offer. But as we know, this always has no more than a short-term effect. The moment we marvel at the painter's artistry, the immersion dissolves before our eyes. It's a bit as if the power suddenly went out in the digital dome of a modern planetarium.

The second possibility, which has only become feasible in recent times, is that of the dynamic three-dimensional immersion spaces we see with smartglasses. These stand at the forefront of a long lineage of immersion spaces, from the wall illusions of Pompeii to the panoramas of the nineteenth century, but chiefly represent a merging of cinematographic and stereoscopic arrangements.24 But with regard to immersion, even the newest technical innovations employ the same strategy of making a stimulus configuration composed of technologically generated pixels as similar as possible to a potential constellation that real objects and spaces in pre-media normal life would have produced. It is irrelevant whether the artificial object is presented as part of reality, as in a topographical view, or as imaginary, such as the creatures of some fantasy world. *In this sense, creating immersion amounts to creating an artificial reconstruction, in a given medium, of configurations of stimuli equivalent to those one finds in the ordinary, "pre-medial" world.*

And that's how we use this term every day. The remarkable examples from classical Gestalt psychology, which have been an integral part of the art school curriculum for decades, tend, like the remarkable works of optical art, to remind us how susceptible and easily deceived our sense of sight can be. But the surprise the famous bistable images supposedly produce turns out to be banal, while conversely, it is amazing how reliably and in what detail Homo sapiens creates a continuous worldview from the infinite number of stimuli surrounding us, an 'image' of the world that makes very complex individual learning possible. This capability for virtual imagination is a prerequisite for key human skills, including the manufacture of tools as a culturally creative achievement.25 *In this respect, the illusion is not a surprise epistemologically; it merely reflects or "doubles" normal, objective world perception, and we can, as I've already proposed, call the effects of perception associated with this "immersive."*

*It's not merely due to technological limitations that any real indistinguishability between reality and artifact remains rhetorical. In general, one would not be able to perceive an immersion this all-encompassing. Even in the immersive space of illusion, the iconic difference here is the aesthetic, if not to say the artistic, surplus of the whole.* 

<sup>24</sup> A good summary can still be found in Grau 2003. Cf. also Halbach 1994.

<sup>25</sup> For more detail see Schmitz 2016. One should also distinguish between a humanly conscious and planned use of tools, as we know it from the great apes in the wild, and analogous phenomena, which have been found among many different animal species.

This inability to distinguish, then, becomes a problem of representation not only in popular culture, but also in auteur films such as *Welt am Draht* (World on a Wire, 1973) by Rainer Werner Fassbinder. The protagonists and locations are in no way different from those in the conventional world. The 'wires' connecting them to the experimenters from a superordinate world are not visible as marionette threads. The complete immersion of the participants in this virtual world can only be experienced discursively, through the conversations between the actors. A popular film like *The Matrix* (1999) by the Wachowski siblings remains, despite all its speculation over immersion, on the level of a simple confrontation between the 'real' and simulated worlds and a corresponding defense of the authentic.

*Figure 12: Computer game EA Dice 2016.*

The usual notion of illusion as deception, often presented with a morally negative undertone, misses the point—at best, it merely describes the inadequacy of a certain equivalence between the stimulus constellation of the signifier, i.e., the image, and the signified, the represented object.26 The only important factor here is that the act of perception works—because in terms of evolutionary biology, objectivity of perception is no more than a statistical phenomenon, while the idiosyncratic constellations in the classic arrangements of perception psychology occur as rarely—or not at all—in nature as the color combinations on the canvases in the Louvre occur in the open countryside.27 In fact, any criticism of mimesis, if it does not want to succumb to a naïve positivism, would have to apply to the normal pre-media perception of the world and not to its simulation in artifacts such as pictures, photographs, and films. It would have to consistently denounce our sensual perception of the world, as a radical idealism would demand, and recognize it as mere appearance, as Maya. However, this philosophical or religious line of thinking also demarcates the limitations of this consideration, because it fails to exclude such radical doubts. It seeks to separate categorically anthropological-functional reality experience from the ontological-essential representation of reality and to allocate immersion only to the first, purely phenomenal level of everyday perception. The location of art is certainly not limited to this. It is full of attempts to symbolically transcend the circumscribed cave of our own neural make-up, or even to skeptically describe the limits of potential experience. At the same time, like any sensory artifact, it is limited to the specific conditions of the nature of our perception. Even an abstract painting by Malevich is still a three-dimensional object in the concrete space of the museum.28

<sup>26</sup> This, as is well known, is the gesture of international modernist literature from Greenberg to Haftmann. In view of the obvious artificiality of computer-generated images, their 'abstract pathos' actually seems dated today, as if from another era. For an example of this attitude, which is particularly characteristic of the postwar years, see Haftmann 1954 and Greenberg 1997b.

Let us summarize these considerations on natural science and perceptual psychology: *the various forms of illusionistic media practice, from central-perspective Renaissance images to today's smartglasses, are indeed nothing other than more or less well-crafted experiments in Gestalt psychology.* 

They are distinct in that they do not simulate a special instance of failure, but a part of our continuously functioning world perception. *If people from a wide array of cultural and historical backgrounds feel drawn to an illusion, then, as indicated above, we can speak of a double immersion, that is, our natural immersion is overlaid by a cultural one. The artifact in no way simulates a possible object, only the potential form of its perception.*

This also applies to dynamic and multi-sensory illusion machines. In contrast to the traditional static illusion images, we can enter into them more or less freely and, if necessary, interactively. But here, too, the images are processed in succession: in the concrete, literal moment of perception, the viewer can only see one view at a time.<sup>29</sup> This has nothing to do with the technical inadequacy of the apparatus, but with the mechanics of human perception, which can only be understood in terms of process. The same applies to other sensory modalities, even though these, with the exception of hearing, can hardly be simulated anywhere near as closely to our everyday perception as vision.

<sup>27</sup> Leonardo's famous reference to the figures the artist should look for in cloud formations for inspiration is the exception, one in which the simplest formal stimuli must first be augmented by the viewer to form complex figures.

<sup>28</sup> Bazon Brock argues fundamentally against the idea that we can withdraw from our sensual origins: "No viewer of the work … can avoid asking why Malevich made it, because we are forced to assume, through the functions of our natural, naive perception, that nothing happens without a reason, and that a person's statement can never be so arbitrary that there is no connection between him and what he states [as a sensual-concrete natural being]" (1990, 312).

#### 2. The Applied Immersion

Let's return to the beginning: what consequences do the observations discussed above entail for the "art of immersion"? Can it even exist in the sense of an autonomous modernity? Up to this point, it has been obvious that it's not a matter of an art historical question in a narrower sense; rather, it can be understood as a comment on media theory. In this respect, it is important to consider once again the status of immersion in the pre-modernist art of the modern era, because an "art of immersion" in the sense of an autonomous modernity can only be understood in relation to its paradigms. Illusionistic art is by no means representative of all art, but it has nonetheless dominated the history of modern painting for centuries. The issue here is no longer the image in general, but the illusionary image and illusionism as a specific, albeit extremely influential and multifaceted artistic strategy employed across a number of epochs and media. And occasionally, this strategy has been combined with the concept of realism and mimesis, without these fields necessarily having to overlap.

<sup>29</sup> Virtual reality or simulations are always models, but it is important to note that not every model, for instance a floor plan for a new building, is immersive. Every illusion is based on models, but what we initially see, even with dynamic artifacts, is only the model of a specific view. *This* is how it looks when I move through Iraq as a GI. In this vein, the player's hypotheses are confirmed in the first-person shooter game *Call of Duty*. If the underlying model of war, military, and weaponry has been well researched, this naturally increases immersion at the level of cognitive assent and affective involvement. But here, and this would be the subject of a separate essay, a distinction must be made between realism and illusionism, no differently than in classical art. For this concept of realism see Dvořák 1924.

Immersion, then, is a modern cultural technique developed by classical art of the thirteenth and fourteenth centuries and extending to the borders of modernity as a natural aesthetic method and an independent visual entity.<sup>30</sup> This use finds its natural continuation in the mass visual communication of media and design and the computer games of our day, while in autonomous modernity it has been at least called into question, if not made entirely taboo. The decisive factor with every artificial illusion in 'classical art' is that it is visible as such, i.e., that immersion remains perceptible as such. It is precisely the artificial, perhaps artistic construction that constitutes the fascination: be it with the birds trying to pick Zeuxis's grapes, or with the spots of light on Vermeer's bare wall.

*Figure 13: Vermeer. White wall in: Lady with the Pearl Necklace 1662–65.*

This also applies to the numerous pre-industrial room ensembles ranging from the *Camera dei Misteri* to the heavenly skies of the Baroque era, the overwhelming effect of which was not least due to their artistic skill. From the panorama to film, later forms of mass entertainment in nineteenth-century industrial culture also derived their power from this fascination for an 'as-if.' In this respect, classical mimesis was always artificial and illusion was always, simultaneously, artistic skill. Depending on the priorities of idealistic or realistic conventions of representation prevailing at a respective time, illusion tended to be of secondary importance, as in the Roman art of Raphael or Michelangelo, or, as was the case with the Florentine Mannerists, it was displayed as a work of art in itself. Even the numerous and widely circulated anecdotes of artists from antiquity and the Renaissance often contain an element of immersion when they describe themselves as being overwhelmed by deception. Ultimately, people's fascination with early photography and cinema was no more than a continuation of this response (cf. Busch 1989).

<sup>30</sup> If, that is, one disregards the (albeit mainly literary) testimonies of antiquity.

As technology advances, however, values shift and media technologies, all of which succeeded in capturing audiences' interest in their infancy, soon become commonplace and boring. In the tradition of a county fair, one marvels at how amazingly real the simulation seems. The benchmark is always a previous standard set by the media. But illusion alone—with which Lumière's cinema, for instance, was once able to fascinate people in the early years of film—is just as uninteresting today as the full-color image, which for a few years attracted audiences to the movies, in competition with TV, which was still black and white at the time (for more detail see Schmitz 2013). But even in the comparatively brief history of digital simulation, this moment of amazement is a thing long gone; today, everyone has access to conventional three-dimensional animation on their personal computer. The short-lived nature of the effect already occurred with the central perspective of the Renaissance, as well as later, with photography. Today, decades after they were made, only historians get excited over the first computer simulations. The dynamic of ever more illusionistic technologies outdoing one another is an inevitable one; today, it once again dictates the promises of the entertainment industry. But this race cannot be won, because viewers become familiar with the latest generation of illusionistic techniques so quickly that they can no longer take anyone in. *On the one hand, for an aesthetics of immersion to be successful in terms of communication, a corresponding* invention *of appropriate spectacular content and specific narrative forms of representation has always been necessary. On the other, once the initial media appeal has faded and the technique has become a social convention, every technology of illusion since Renaissance painting has required an emotional framing, a familiar affective rhetoric.*

*Figure 14: Na'vi in Avatar: James Cameron, Avatar (2009).*

*For the most part, therefore, immersion is also an effect of the emotional charge that pure illusion carries.*<sup>31</sup>

The Catholic Church demonstrated this again and again during the Baroque period, when it promoted illusionism less as realism for its own sake than as an instrument for simulating heavenly promises of paradise, etc.<sup>32</sup> One need only think of the realism of Caravaggio, Bernini, or Spanish painting of the Counter-Reformation, which is still sometimes 'shocking.'

And it's no different today in the entertainment branch of the film industry, whose occasionally silly subject matter is made more compelling by the state of the art in three-dimensional illusion. But even this rhetoric of overwhelming viewers only works when they are aware of the difference, if it promises them an experience "larger than life" and far beyond the ordinariness of the everyday. It's this awareness that creates the particular appeal of first-person shooter games and of films like *Avatar*, because who really wants to share the fate of the troops in *Call of Duty*<sup>33</sup> or the Na'vi in *Avatar*?

<sup>31</sup> For a visually rhetorical example of an individual media-specific analysis see Scheuermann 2009.

<sup>32</sup> As Max Dvořák demonstrated, it was especially the artists of the Baroque period who used the new illusionist artistic skills for church propaganda and a brief, anticipatory view of heaven (see Dvořák 1928, 82ff.). As an aside, I'd like to point out the relevance the observations of the spearhead of 'art history as intellectual history' still hold for media studies. The same applies to the theory of realism. For more details see Schmitz 1994a.

*Figure 15: Catholic Immersion: Bernini, Saint Teresa.*

*Thus, in mass industrial culture, immersion never aims at abolishing the distinction between reality and fiction, but promises, at best, to help the latter resist the impositions of the former.* This is what modern entertainment shares with the images of the church's promise of paradise and the threat to the damned of the fate awaiting them. Even back then, people liked to work with the latest media. The Jesuit Athanasius Kircher used the magic lantern to surprise the faithful, while the painters of the Counter-Reformation used central perspective to offer a view of heaven through the dome of Il Gesù.

The difference between the object and its representation remains crucial for an aesthetics of 'applied immersion.' This is what creates the viewer's desire. In this respect, today's popular culture is the heir of classical art. *The aesthetics of immersion, as occasionally seen in classical art and pervasive in today's popular culture, can be described as paradoxical in that it seeks full immersion in simulated worlds while deriving aesthetic pleasure from the remaining difference between the technological artifact and the world. In the applied arts, immersion therefore has an aesthetic character, but not an epistemic one: a difference that does not apply to autonomous art.*

<sup>33</sup> For instance: *Call of Duty: Black Ops 3*, vol. 10.

#### 3. The Art of Immersion

How does art, in the narrower sense as autonomous art, relate to this? *In any case, the critique of ideology between Brecht and Adorno focused on the epistemological "naivety" of the "illusionists" and their potential for seduction and deception.*<sup>34</sup> Immersion literally marks a 'losing sight' of reality, the pleasure of which has the power of a drug. Immersion as the opium of the people: this can be seen etymologically in the history of the meaning of a related term, "simulation." The Latin *simulare*, which means "to pretend," describes "in the original usage of the word the feigning or pretense of physical or mental illness in order to obtain pecuniary advantages in insurance claims or to be deemed ineligible by the military" (Gendolla 2002, 332).<sup>35</sup>

Immersion, always regarded here in the context of an irresistible emotional connection, seems to exclude any rational distancing, such as that demanded by a 'negative aesthetics.' In 2001, Gernot Böhme described the flaws in ideology-critical approaches of this nature:

Once the world of images has been recognized as a genuine part of human life, one no longer sees its development as a loss of nature in favor of an artificial world, as a farewell to reality in favor of a world of appearances (the simulacra, Baudrillard); one can no longer simply criticize the world of image consumption as fascist (Flusser) or as a culture industry (Horkheimer, Adorno). The criticism will have to be more concrete and not so much a critique of the visual worlds as such, but rather an inner critique, a critique of how they are produced and used. (Böhme 2001, 59)

<sup>34</sup> Herein lies the moral verdict of classical modernism against illusion and immersion, as it served to frame a negative aesthetics in the debates of the postwar period (see Adorno 1973).

<sup>35</sup> The author continues: "The term has been largely detached from this usage, which is strictly bound to the logic of truth, particularly with the advancement of computer-aided processes, and replaced by a more neutral technical semantics. In this, he describes the reproduction of physical, biological, social, and economic processes using models that make possible an analysis and application that is highly comparable to the simulated process, but is cheaper and less dangerous."

In everyday life, highly developed images of spatial illusion have long since found their way into the collective consciousness in ways that are completely different from those of high art. Computer games and other adventure simulations, i.e., forms in which the impression of reality is superimposed by a strong rhetorical appeal to affect, have become a matter of course. In the fast-paced first-person shooter game, there is little time to consider the aesthetic difference, and its absence in perception leads to a loss of that "mental space of prudence" that Warburg placed above a state of being overwhelmed by the media (Warburg 1932, 534). And it is just such a powerful effect that art's surpassing of the phenomena demands, because only in this way can it escape the pornography inherent in it as a confusion between object and depiction. But the modernists' aversion to immersion was more fundamental than a 'political' perspective of this kind.

*Figure 16: Picasso, Still Life with Violin and Grapes.*

*Wasn't it one of the founding legends of the modern "truth seekers" that they questioned the realistic work of art's naive representational character in favor of the deconstruction of mimesis in order to bring the hidden "true reality" to the fore—in good Platonic tradition?*<sup>36</sup>

Even so, nineteenth-century realism was recognized as a pioneer in terms of the way it questioned pictorial means; on the other hand, modernism has been characterized by a large number of 'new realisms' since at least the 1960s. Pure illusionism, that is to say illusionism without reflection, has been excluded from the artistically permissible since, at the latest, the photographic realism of Anton von Werner's historicism, and has drifted into popular culture or kitsch.

*Figure 17: Anton von Werner, The Proclamation of the German Empire.*

<sup>36</sup> The positions of the modernists oscillated between a kind of transcendental critique of visual forms in the tradition of Fiedler and Cézanne and a search for transcendent truths among the newly religious symbolists grouped around Gauguin and Kandinsky. For more detail on these aspects see Schmitz 1993.

Gottfried Böhm concisely described the modern consensus on this: "The old idea of pictorial representation, the idea of representing content by means of the work, is subject to examination. Mirroring, heightening, and celebrating reality with pictorial means can also be seen in modern art. Taking the equivalence between image and reality for granted has become a problem, of course, and the relation itself is up for discussion" (Böhm 1985, 113). But it is precisely this widespread consensus that has faltered over the past 20 years. Although art has largely left them behind since the beginning of the twentieth century, the power of illusion in digital image processes has once again drawn attention to illusion and immersion as media-technical processes for creating virtual spaces. This applies not only to modernism, as Clement Greenberg deemed it the dominant aesthetic norm of 'Western art' in the postwar years,<sup>37</sup> but no less to the many varieties from *nouveau réalisme* to *pop art,* whose ideology-critical interest or affirmation of a late- or postcapitalist media world are ultimately based on the difference between the artificiality of the media and the authenticity of a pre-media reality.<sup>38</sup> But precisely in view of the mass practices of everyday image production and reception in a largely digitized society, this approach seems increasingly old-fashioned, and an attractive political perspective can hardly be developed out of such image criticism. And so one cannot simply repeat old concepts of deconstructing mimetic processes here; rather, one must take into account the above-mentioned advances in natural scientific knowledge, the technical prerequisites of which, for example in neurophysiology, are closely related to those that made the new digital image worlds possible in the first place. Perhaps this is also the real meaning the universal machine of the computer holds for our time—beyond the ontologies and fetishizations of the media materialists (see Bowlter 1990)? What would a truly modern, i.e., literally contemporary, artistically reflected approach to the practices of illusion and immersion described above look like?

<sup>37</sup> In particular, the universal machine of the computer had to question the criteria of self-reflective media specifics in Greenberg's sense, not so much because its own 'materiality', compared to classic media such as painting or woodcut, i.e., the phenomenology of hardware and interfaces in relation to canvas and wood, remains nearly invisible, but because with advancing technology, all, or at least almost all, media can be represented in it (see Greenberg 1997b). The late Friedrich Kittler deeply regretted this when he compared the shift from the huge circuit diagrams laid out in gymnasiums from the pioneering years of IBM with the Intel chips of our day. The beauty of pure media materiality, which is one of the medium and not of representation, disappeared (ca. 1988, at a panel discussion held at the University of Wuppertal with Bazon Brock).

<sup>38</sup> On the importance of rethinking representation in contemporary art see Rebentisch (2013, 150–165).

*If immersion in an ever-progressing quality merely depicts our normal state as physical beings in the 'prison' of our neuronal constructions, then this doubling also denotes the aesthetic difference.* In the applied arts, this distinction becomes the effect of instruction, edification, or entertainment in the tradition of the trident *docere*, *delectare*, and *movere*. *To my mind, the real "autonomous art of immersion" consists in using the most advanced technologies to make us recognize illusionism as the normal state of our nature, which we can never transcend. It is precisely the perfection of today's technological standards that takes us to the limits of our own world construction.*

The history of classical mimesis was by no means that of consciousness or reflection on the above-mentioned susceptibility of human perception. More Aristotelian than not, it always proceeded from the idea of an adequate reproduction of the outside world as an object confronting the subject, right up to positivist objectivism, which to a certain extent found its manifestation in the subjectless mechanical camera. On the other end are the traditions of advanced modernist realism from Édouard Manet to Gerhard Richter.<sup>39</sup> As much as modern realism practices the 'photographic gaze', it has been informed about the constructed nature of our human perception at least since Helmholtz and the spectacular research into the psychology of perception (cf. Crary 1998).

*In this respect, the "art of immersion" marks the end of the classical conception of mimesis imagined as an asymptote in our approach to reality. At the moment when the perfection of illusion's techniques renders the technology responsible for creating them almost invisible, it reveals the way human perception functions, which is a compelling and inescapable immersion.*

Monet's late "abstractions" and the radical intensification in the divisionist methods of the "scientist" Seurat led to what is perhaps the most subtle immersive effect in the whole of art history: if one keeps looking from the right angle, one can eventually see the 'skin' on the surface of the water. This makes seeing itself visible.

<sup>39</sup> And occasionally, the Romantics also made significant contributions to this when they sought to reconcile their transcendent view of the world, the 'sacredness of nature', with the findings of modern natural sciences.

So what is the fundamental problem of the critique of mimesis, which has accompanied the entire discussion of aesthetics since antiquity and has become almost intrinsic to the idea of a self-reflective modernity since the nineteenth century?<sup>40</sup> *When the critics of illusionistic representational art find every depiction inadequate compared to the complexity of visual perception, particularly when regarded in close connection with the other senses, they assume that our sense organs are capable of adequately depicting the world. This is the position of a naive positivism.*

This is where the discourses on the 'crisis of representation' should change, if art does not wish to lose its function as observer and as the avant-garde of visual culture.

*The "art of immersion," whether as a subversive strategy within popular culture or the art system, consists in making the aesthetic difference between the object and its depiction visible again—not in the traditional spirit of deconstructing mimesis, but as a way of addressing the constructed nature of our everyday perception of phenomena as the insurmountable limitation of the human condition.* The conventional critique of representation still presupposes a simple positivism in the negation. The experience of near immersion in particular refers to *natural perception* as permanent immersion. *The immersive artwork is at best a mirror of the mechanisms by which we create our own literal worldview.*

In one respect, the classic trompe-l'œil can be a helpful epistemic tool here, for example when Cornelis Norbertus Gijsbrechts in his image of the reverse side of a canvas, painted with the utmost technical mastery, gave almost involuntary expression to the paradox of dual immersion in the doubling of the body of the painting as a picture. *The paradox here is not merely technological, but also conceptual.*

Through this refusal of immersion—in the illusion we expect in the painting, we see no more than a deceptively painted canvas—we overlook, for a moment, the illusion of the picture as a whole. This was how Parrhasius once triumphed in his contest with Zeuxis. When the latter, after lifting the curtain concealing his picture, proudly presented his still life to the applause of his competitor, he curiously sought, in all likelihood out of the excited atmosphere of the showdown, to lift the curtain covering his competitor's work—only to see that he'd mistaken a painted veil for reality.<sup>41</sup> Indeed, paintings such as the unframed back of a canvas by Cornelis Norbertus Gijsbrechts from 1670 do not stop at this last deception, because the real material picture on the canvas, with all its perfection owed to Parrhasius, is itself the subject of the picture and forms only one marker within an infinite loop of illusion. This is the site of a kairos of immersion—and it can be considered a program for a digital "art of immersion."<sup>42</sup>

<sup>40</sup> On the concept of realism see Kohl 1977.

*Figure 18: Cornelis Norbertus Gijsbrechts: reverse side of a canvas, 1670.*

<sup>41</sup> Plinius, *Naturalis historia,* 35, 64.

<sup>42</sup> Younger contemporaries may imagine the kairos as the inventor Gyro Gearloose, when a light bulb above his head announces an ingenious idea.


#### References


*und (proto-)filmische Apparate*, edited by Lars C. Grabbe, Dimitri Liebsch, and Patrick Rupert-Kruse, 115–140. Cologne: von Halem.


## On the Politics of Augmented Reality

*Jens Schröter*

#### Abstract

This chapter surveys the field of augmented reality (AR), in "which 3D virtual objects are integrated into a 3D real environment in real time" (Azuma 1997, 355).<sup>1</sup> Augmented reality thus means that digitally generated information is superimposed, in real time, on the views of real objects on site.<sup>2</sup> In section 1, AR is historically differentiated from 'virtual reality' (VR). In section 2, some applications of AR are presented and problematized; in particular, the chapter asks what political functions they can have in late capitalism or, as Deleuze (1992) has put it, "control society." Section 3 offers a conclusion.

#### Keywords

Augmented reality, virtual reality, Gilles Deleuze, control society

<sup>1</sup> All quotations from German sources are given in English (author's translation).

<sup>2</sup> Cf. on some of the informatics background: Bimber and Raskar (2005) and Haller et al. (2007). The cultural and media studies debate on AR is small, cf. Fahle 2006: Fahle essentially refers to a special AR project at the Bauhaus University Weimar and its image-theoretical implications. Cf. also Manovich 2006: Manovich, in turn, treats AR only as a subset of his preoccupation with 'augmented space' and mentions the use of smartphones discussed here rather in passing (Fahle does not mention it at all).

#### 1. AR and VR

AR can best be outlined by highlighting its difference from VR.<sup>3</sup> The basic idea of VR was to create an immersive, simulated environment that more or less encloses the user through appropriate display and interaction techniques, in which the user is no longer aware of the outside world that actually surrounds him or her.<sup>4</sup> In contrast, the idea of AR is to combine elements of a simulated environment with elements of a real environment. This is intended to 'enhance' (augment) the perception of reality, e.g., by superimposing certain types of pictorial, written or acoustic information on the image of the real space. Insofar as it is a matter of connecting audio-visually presented information with the currently given surrounding space at the currently given location, AR applications are virtually prototypical examples of location- and situation-related media processes.

In the following, some brief notes on the archaeology of AR are given, which on the one hand show that the concept was already laid out (for good reasons) at the beginning of that development, but which on the other hand initially led to the discourse on VR at the end of the 1980s. One of the names that is always mentioned when talking about the history of VR is Ivan Sutherland (cf. Schröter 2007). This is due, first, to the fact that he published his essay *The Ultimate Display* in 1966, in which he envisioned an ultimate visualization technology whose images would be indistinguishable from reality—one can thus see where the scenarios of, e.g., *The Matrix* (Wachowski 1999) come from (cf. Sutherland 1966). Sutherland describes the final image environment, so to speak. These ideas were also perpetuated in the theorizing of the 1990s; as late as 1995, Elena Esposito wrote: "In a fully [sic] successful virtual reality project, the reality effect is supposed to be so effective that the objects can no longer be distinguished from the objects of 'real reality' independent of the machine" (187). But Sutherland was not only the first 'visionary' of VR. Second, and more importantly, he contributed real technical developments to the genealogy of AR as well as VR—in particular, the head-mounted display (HMD) that virtually became an icon of VR as 'data glasses' in the early 1990s. Fig. 1 shows a typical picture of the time.

<sup>3</sup> Cf. Milgram et al. 1994 for the placement of AR and VR on a continuum of different 'mixed realities'.

<sup>4</sup> Cf. for the following in more detail Schröter 2004a, on 166–168 there are remarks on the genealogy of the concept of the 'virtual', which are decisive for the present essay.

*Figure 1: 'Data glasses' as a typical representation of VR, circa early 1990s.*

Sutherland and his collaborators developed the first HMD by 1968. Their work was published that same year in a paper titled *A Head-Mounted Three Dimensional Display.* The first paragraphs outline the basic idea:

The fundamental idea behind the three-dimensional display is to present the user with a perspective image which changes as he moves. … The image presented by the three-dimensional display must change in exactly the way that the image of a real object would change for similar motions of the user's head. … Our objective in this project has been to surround the user with displayed three-dimensional information. (Sutherland 1968, 757)

At first, it all sounds like VR: the user is environmentally 'surrounded' by information, and the constant recalculation of the image depending on the user's movement causes the virtual environment to change for perception in the same way as it would when looking at real objects (of course, at the time of this writing, it was about simple wireframe graphics). But what is sometimes overlooked in placing this first text in the genealogy of VR is that Sutherland's HMD was semi-transparent, allowing computer imagery to be superimposed on real-space imagery:

Half-silvered mirrors in the prisms through which the user looks allow him to see both the images from the cathode ray tubes and objects in the room simultaneously. Thus, displayed material can be made either to hang disembodied in space or to coincide with maps, desk tops, walls, or the keys of a typewriter. (Sutherland 1968, 759)

In other words, Sutherland's goal in developing the HMD was not solely to create an immersive space (which would seal off the viewer). The HMD was conceived as an interface that should enable the presentation of information in a meaningful and complexity-reduced way (e.g., for scientific visualization or military purposes; see the "maps" Sutherland mentions). HMDs should rather serve to increase the efficiency of the subject.<sup>5</sup> In this sense, it is precisely not a precursor of the illusionist-escapist VR of the early nineties. An example of this discourse: Jaron Lanier is often portrayed as the inventor of the term virtual reality and was long considered the VR guru (cf. Hayward 1993, 198–200). He also produced the first commercially available VR systems (brand names: *EyePhone* and *DataGlove*) with his company *VPL*. In Lanier's view, despite the realism that is otherwise always invoked, the virtual environment is by no means committed from the outset to a realistic rendering of real scenery and real bodies. What's the point? After all, creating a VR that then appears just like 'normal' reality is somehow pointless. Lanier calls for the fictionalization of VR. According to him, a whole spectrum of possibilities is available, which also allows for the self-representation of the user as a fictional character. In VR, Lanier elaborates, "[one] could easily be a mountain range, or a galaxy, or a pebble on the ground" (1991, 72). Thus, at least in principle, a free fictionalization of one's own body also becomes possible, even if it remains unclear exactly what it means to 'be a galaxy'. Lanier repeatedly underscores the recalcitrant character of the material and corporeal world: "The tragedy of physical reality is that it is compelling" (1991, 81).

Lanier's discourse shows quite clearly what the attraction of VR was—another world into which one thought one could escape, as it were. It seemed possible to leave the prison of physical reality. Perhaps it is no coincidence that such ideas flourished around 1990. In 1989/90 the Cold War ended, the 'end of utopias' was proclaimed—and so perhaps utopian charges of the new computer technologies pushed into

<sup>5</sup> Cf., e.g., HMDs as special displays for fighter pilots: Furness 1986.

that vacuum. These utopian charges are precisely what is now called the socio-technical imaginary in STS, in Jasanoff or in Kirby (cf. Kirby 2010 and Jasanoff and Kim 2015). Thus, Bernhard Waldenfels noted, "It may be that the 'old European' illusions of history, after their decline, will be replaced by technological fantasies of omnipotence from the New World" (1998, 197). The liberation from one's own body supposedly possible in VR leads Lanier to the thesis, illustrating its utopian status, that VR "means the absolute abolition of class and race distinctions and all other advanced forms [since] all forms are mutable" (1991, 83). This, too, can be read post-1989 as a displaced return of the otherwise obsolete social utopias that had promised precisely the overcoming of social injustice and racism.

It should hardly come as a surprise anymore that VR (at least in this strong form) never established itself (even if currently, with technologies like the Oculus Rift, a certain comeback of VR seems to be in the offing). For example, the creation of even a reasonably convincing virtual image-sound space is technically demanding (although there's been a lot of progress recently), the simulation of the tactile experience (such as through 'data gloves') is cumbersome and costly, VR perception encounters problems such as the conflict between audiovisual and proprioceptive perception ('simulator sickness'), and collective processes of reception are hampered. Most importantly, its escapist function is hardly compatible with the functional imperatives of the post-1989/90 global capitalist world order (as is the case for drugs). If VR-like environments are used today, it is in simulators to train subjects and optimize them for specific tasks (Schröter 2022). The point is precisely not to replace the world with a VR, but rather to master and control the world with the help of virtual spaces. Our highly technical, high-risk culture (airplanes, nuclear power plants, etc.) needs such "control environments" (Ellis 1991, 327), as one author put it in the journal *Computing Systems in Engineering* in 1991, in order to be able to operate at all (cf. Schröter 2004b).

Therefore, the fact that the possibility of AR, i.e., an overlay of the virtual space with the real place, which was already suggested by Sutherland, is becoming increasingly important today—apart from training simulators—is not surprising. While VR (at least in its phantasmatic form) is supposed to allow escape from this world, AR serves to enrich it with information, i.e., to functionalize and optimize it. Therefore, it is much more important today—and its diffusion ultimately a sign that new media do not (or not only) fundamentally change the world as a rule, but are integrated into the dominant structures in order to accelerate them, for example, and thereby generate productivity advantages in capitalist competition (which does not mean that the new technical processes do not also lead to shifts, disruptions and conflicts).

#### 2. Different forms of AR applications

Due to the proliferation of smartphones, we can all now overlay the world with data and information in real time. A nice overview of more than 50 AR apps for the iPhone is provided by the website *Iphoness* with the article "50+ Best Augmented Reality iPhone Applications" (Ci 2019). These applications use the iPhone's camera to make the image of the location visible on the display and overlay it with information in real time. One can roughly and heuristically distinguish three different categories of AR applications at this site:

1. Location-optimizing applications
2. Place-ludic applications
3. Place-aesthetic applications


In the following, we will discuss these different forms and their implications.

#### 2.1 Location-optimizing applications

As Fig. 2 shows, many of the apps are designed to provide geographic information. The idea is to be able to better orient oneself in a given environment by superimposing GPS data, the image of a compass, etc., on the location in real time. Very practical applications can be included: For example, you can tag the place where you parked your car to easily find your way back to the car (although no image overlay is actually necessary for such a function). That is, the space is functionalized to save time. In this way, AR can serve to erase the figure of the flâneur as it emerged in modernist literature by Baudelaire, Benjamin, and others. If the "minimal definition" holds, "that the flâneur roams the metropolis directionless and aimless" (Neumeyer 1999, 17),<sup>6</sup> then he can be associated with a 'poetics of doing nothing'.<sup>7</sup> And insofar as 'directionless and aimless roaming' is also a refusal of efficiency and functionality, AR can be understood as a technology of increasing the efficiency of the subject.<sup>8</sup> Moreover, the possibility of orientation necessarily means at the same time that the position of the user must be known and, as scandals around this have shown, can also be stored by smartphones: "The close connection between surveillance/monitoring and assistance/augmentation is one of the key characteristics of the high-tech society" (Manovich 2006, 222). The optimization of the moving subject is thus twofold: not only is the movement itself made efficient, but movement profiles also potentially accumulate, which are less likely to be used for political surveillance than for commercial exploitation.

*Figure 2: Excerpt from the article "50+ Best Augmented Reality iPhone Applications: Location-Optimizing Applications."*

<sup>6</sup> Cf. a beautiful example in Bergman: "We roamed the city without purpose, got lost, found our way again, got lost again" (1987, 197).

<sup>7</sup> Cf. Fuest 2008, in particular chapter III.

<sup>8</sup> Basically, the archaeology of increasing efficiency through 'augmentation' can be traced back to Douglas Engelbart's program of an 'augmentation of human intellect' through the targeted use of computers, cf. Engelbart 1962.

At the same time, or in other apps, background information from databases such as Wikipedia, etc., can be superimposed on the image, so it is a matter of charging the surroundings with meaning. For example, the article "50+ Best Augmented Reality iPhone Applications" says of *Wikitude*: "[A]nother cool augmented iPhone application that helps you explore your surroundings effectively on your phone" (Ci 2019). As practical as this is, the question certainly remains as to how to classify this operationalization of the environment through its overlay with virtual information spaces. A paper nicely titled "7 Things You Should Know About Augmented Reality" (Educause Learning Initiative 2005) discusses the didactic use of AR, stating outright that one of the possibilities of AR is to extend learning to everyday life, in a sense turning everything into education. In this, one can see an element of the control society order described by Deleuze, in which "*perpetual training* tends to replace the *school*, and continuous control to replace the examination" (Deleuze 1992, 5). Just as, thanks to geomedial apps, no time should be lost in searching for a car, for example, leisure itself becomes a space for further education: both times it is about optimizing subjects and their actions. At the very least, the question can be raised as to whether the superimposition of information on things does not also limit the scope for interpretation and contribute to a homogenization of the experience of things. The spread of AR apps on smartphones could therefore also lead to a homogenized interpretation of things, a global semantic matrix, as it were, which is part of the globalization processes.

#### 2.2 Place-ludic applications

A large part of the apps in the article "50+ Best Augmented Reality iPhone Applications" are games.

Here, the image of the location is overlaid in real time with game characters or the like; in one application mentioned, you can look through the iPhone at your feet and the image of a soccer ball is overlaid, the app recognizes the feet, and you can kick the virtual ball in front of you. Such casual games are ideal tools for passing the time, e.g., while waiting or on the way to work, and as such contribute to the operationalization of the increasingly demanded mobility.

*Figure 3: Excerpt from the applications presented in the article "50+ Best Augmented Reality iPhone Applications: Place-Ludic Applications."*

Games are obviously not about enriching the outside reality with information in order to functionalize it, as in the applications mentioned in 2.1, but about making the (mostly) known surrounding space the setting of the games themselves. Thus, a virtual game image does not replace the view of the outside world; rather, that world can be experienced in a new way. An almost childlike pleasure in the rediscovery of the world is opened. A certain Harald Ebert of Nintendo remarks precisely in this sense: "There, one's own living room table becomes a video game level" (Ebert 2011). At the same time, handling AR games may also require incessant movement of the body to move the console so that ever new sections of real space become visible and superimposed.

The increasing popularity of AR games, or rather AR 'gimmicks', seems to be an expression of a steady (demographic) expansion of computer game culture: This has probably reached its temporary peak with *Pokémon Go*. It can be observed that computer games appeal to ever broader sections of the population and that the stereotype of the 'hardcore gamer' is now the exception rather than the rule (cf. Newman 2004). Especially the triumph of so-called casual games (Jesper Juul rightly speaks of a casual revolution; cf. Juul 2010) marks an important developmental step here, which was also supported by the success of the *Nintendo Wii* (with its new interface possibilities). If, for example, people play more on the move, the appeal of AR gaming lies precisely in making use of the particular place where one is located in a playful way. The interesting contrast of the applications discussed here to 2.1 is that it is not about an optimization, but a gamification of the location in real time. And as mentioned above: This can be seen as a strategy to make mobility more bearable.

#### 2.3 Place-aesthetic applications

Finally, the aforementioned website about the best AR apps for the iPhone mentions an app that does not quite want to fit into the previous two categories (Fig. 4).

This program, called *Ikea Place*, by the Swedish furniture store Ikea, is about overlaying images of the place with images of pieces of furniture in real time, and in this sense having creative access to one's own environment. Therefore, the term 'place-aesthetic application' was proposed, even if it is not an artistic application in the strict sense—although there are of course such applications (see below).

The AR application thus reduces the anxiety (and again: the time needed) that can accompany furniture shopping, insofar as it allows testing in advance whether a piece of furniture will fit into the home environment. If we disregard the weakness of visual imagination that is thus revealed, another optimizing function of the AR application cannot be overlooked. In a sense, it functions as a new form of catalog that allows the goods presented in isolation in the catalog to be situated and thus enables a better assessment of whether the object to be purchased fits into the overall design of the living space. The catalog begins to overlay the real space. Initially, therefore, the purchase of goods is to be facilitated. Obviously, such applications imply an aestheticization, since questions about the price, workmanship, etc., of a piece of furniture recede in favor of the question of whether the object 'functions' aesthetically in the surrounding space of one's own home. In this respect, a social segmentation in Bourdieu's sense is also evident here, which is not entirely surprising. Users who can afford an iPhone can also put aside the question of the cost of a piece of furniture in favor of their self-stylization. At the same time, this self-stylization, e.g., through 'coherent' furnishings, is an option of gaining difference, or 'individualization' vis-à-vis others. In this respect, this AR application is a technology of the self (Foucault) for the aestheticist individuality production of postmodern consumers. The production of difference is essential for market and brand diversification, because for consumers the connection to certain modes of design can appear as 'selfhood' and thus circumvent the very impression of heteronomy through a 'culture industry' (Adorno/Horkheimer) (which, by the way, should also apply to Apple itself). Insofar as AR applications make the place in real time the permanent field of design of this seemingly autonomous practice of difference, they are a technology of domination.

*Figure 4: Excerpt from the article "50+ Best Augmented Reality iPhone Applications: Place-Aesthetic Application."*

But there are also other aestheticizing practices. AR processes, for example, can actually be a starting point for artistic practices. There are also AR art projects for smartphones.

*Figures 5 and 6: Augmented Reality Art Invasion 2010, http://www.sndrv.nl/moma/.*

These images document a project that took place on October 9, 2010 (Figs. 5 and 6). Visitors with the appropriate smartphones and AR software (Wikipedia contributors 2019) can participate in a virtual and unofficial exhibition at MoMA:

The virtual exhibition will occupy the space inside the MoMA building using Augmented Reality technology. The show will not be visible to regular visitors of the MoMA, but those who are using a mobile phone application called "Layar Augmented Reality Browser" on their iPhone or Android smartphones, will see numerous additional works on each of the floors. (Veenhof 2010)

That is, with AR, the space of MoMA is in a sense occupied and the authoritative selection of works and the narrative of their arrangement are subverted, broken through, and thus shifted. This can certainly be understood as a subversive attack on MoMA's hegemonic function (however, the AR exhibition can equally be seen as a recognition of MoMA's hegemonic role). Here, critical potentials of an AR art are hinted at, which opens the stabilized spatial structures to new ways of interpretation and perception.

### 3. Conclusion

It can be seen that AR addresses an important area of image environments, which allows a wide range of optimizations, gamifications, and aestheticizations of the control society, but also contains critical potentials. Image environments, especially in simulators or even in optimizing AR applications, have functions of control, but the variety of applications can probably not be reduced to this. To investigate the exact forms of application concretely in their situatedness (and this is by definition unavoidable in AR) is a central task of (media ethnographic) research. A media aesthetics of AR or of AR as a form of a more general 'mixed reality' (cf. Milgram et al. 1994) would have to systematically explain how auditorily and visually virtual and real spaces and objects are related to each other, and which parameters are decisive in this process (Schröter 2018). It could also describe which paths of movement, modes of interaction, and perceptual potentials are thereby enabled or obstructed for the potential viewers and users and in what way. Today, the question of the difference between the real and the virtual must be further developed into the question of the media-aesthetic strategies of their connection, as well as their political implications.

#### References


*Navigationen. Zeitschrift für Medien- und Kulturwissenschaften* 7.2. (2007): 33–48.


## Hardware Effects on AR Pictoriality: A Phenomenological Approach

*Niklas F. Becker*

#### Abstract

Different pictorial media let images appear in different ways and evoke differences in their pictoriality. Following the premise that augmented reality (AR) technologies let, inter alia, virtual images appear, this chapter examines changes in the pictoriality of AR images in regard to the hardware that is used for displaying such images. Utilizing the ideas and terminology of Edmund Husserl's phenomenology of the image, the article focuses on the changes of the relational mesh between AR image carriers (namely, mobile screens and HMDs), the image objects they display, and the represented image subjects, as well as the perceptive and interactional position of the user in this mesh.

#### Keywords

Augmented reality, AR pictoriality, displays, interfaces, phenomenology of the image, Husserl

### 1. Theoretical Introduction

Never before has the world of images around us changed so fast as over recent years, never before have we been exposed to so many different image worlds, and never before has the way in which images are produced changed so fundamentally. (Grau 2004, 3)

For whether we regret or rejoice over the undoubtedly epochal step in the history of humanity constituted by the new images, we can no longer stop or even reverse this process. (Wiesing 2010, 101)

Computer technology has changed, and is still changing, the ways images appear to us. The two epigraphs are drawn from works on the subject of virtual reality. If we look at media technologies that have the term "reality" in their names, this very nomenclature already hints at a shift in the understanding of image worlds with regard to their (non-)relation to the objective reality. All of these technologies (virtual, augmented and mixed reality) are computer technologies, but each one shows different hardware manifestations in comparison to the others. Even within the spectrum of hardware manifestations of one of the "realities," differences can be found that might significantly change the pictoriality produced by the technology at hand. In the following I will focus on the pictoriality of phenomena in augmented reality and how this pictoriality is subject to change depending on which hardware is used for making these AR phenomena appear.

The following analysis is based on a phenomenological understanding of pictoriality. While there is more than one "approach … to perception-oriented image theories" (Wiesing 2011, 239), there is "a fundamental idea at the basis of all phenomenological image theories, namely, that the perception of images leads to a perception *sui generis*" (Wiesing 2011, 239). When we look at an image, whether it is painted on a canvas, drawn on a piece of paper or generated on a computer screen, we see an object that is only visible, an object that Edmund Husserl calls the *image object* (*Bildobjekt*). The image object is free from the physics of time and space and therefore has to be differentiated from the image as a physical thing (*Bild als physisches Ding*), that is, the canvas, the paper, the screen, or: the *image carrier*. The third aspect of image consciousness (*Bildbewusstsein*) Husserl names is the *image subject* (*Bildsujet*), the real or fictitious thing which is represented by the image object. When looking at the effects different hardware has on pictorial phenomena in computer technology, the focus will be on the relations between the image carriers (i.e., the hardware) and the image objects. However, the three named aspects are intertwined in such a way that looking mainly at the relations between two of them will not leave out the third one entirely. The mesh of relationships between these three aspects of pictorial perception is characterized by two essential differences.

First, there is always a difference between the image object and the image subject, because "if the image appearance showed no difference whatsoever from the perceptual appearance of the object itself, a depictive consciousness could scarcely come about" (Husserl 2005, 22). The second essential difference, which is more important for the issues discussed here, is the one between the physical image carrier and the image object. Gottfried Boehm introduced the concept of the *iconic difference* for the "difference between the physical image carrier and the imaginary image content" (Wiesing 2000, 22), which is the precondition for a conflict between the two. This conflict, which is the constitutive basis of pictoriality (Wiesing 2000, 22), is one of the central ideas in Husserl's phenomenology of the image.<sup>1</sup> Regarding an engraving, Husserl states:

[T]he paper apprehension … is also there in a certain way, connected with the continuously united *apprehension pertaining to our field of regard*; it is excited by it. However, while the rest of the field of regard enters into appearance, the paper apprehension itself is not in appearance, since it has been deprived of apprehension contents. Its apprehension contents now function as the apprehension contents of the image object. And yet it *belongs* to these apprehension contents: in short, there is *conflict*. (Husserl 2005, 49–50)

While the paper is part of the "*real* surroundings," the appearing image object is "however much it appears, … *a nothing* [*ein Nichts*]" (Husserl 2005, 50). Since the apprehension of the image object and that of the image carrier have at least "in part … the same substratum of sensation" (Husserl 2005, 52), there is a conflict between the two apprehensions, of which only one can appear to the viewer at once.

Even though Husserl's understanding and description of pictoriality (or rather: the occurrence of image consciousness in the viewing subject) is equally applicable to paintings, drawings and sculptures as well as images shown on digital screens, it seems that newer digital media technologies possess the potential to transform aspects of said pictoriality.

<sup>1</sup> This opinion is shared, for example, by Lambert Wiesing (1996, 263), for whom this conflict is "the key concept in Husserl's image theory," and Alexander Haardt (1995, 105), who sees it as the "central idea … at the center of the analysis." In this paper, I endorse Wiesing's and Haardt's reading of the relevance of the concept of conflict. For a contrasting perspective and the presentation of other readings of Husserl's image theory, see Ferencz-Flatz (2009).

While in virtual reality "the screen of the virtual only knows an artificial … horizon" (Quéau 1995, 65) of perception, in augmented reality this horizon is (or at least it appears to be) the real surroundings the subject finds themselves in. This raises the question of what we shall understand as the "reality," which is to be augmented by the media technology. It is admittedly challenging to answer this question satisfactorily, which is why I suggest a short and practical definition for the following analysis based on some thoughts found in Alfred Schutz's work *The Structures of the Life-World*.<sup>2</sup> In analogy to G.H. Mead's term of the *manipulative zone*, which "presents the kernel of reality," Schutz terms the "zone which I can influence through *direct* action … *the zone of operation*" (Schutz and Luckmann 1974, 42). Schutz makes a distinction here that should not be underestimated:

It is of course useful to introduce a distinction between the *primary zone of operation* (the province of nonmediated action, and correspondingly the primary world within reach) and the *secondary zone of operation* (and the corresponding secondary reach), which is built upon the primary zone and which finds its limits in the prevailing technological conditions of a society. (Schutz and Luckmann 1974, 44)

While the term "zone of operation" refers to the radius of action of the subject, Schutz describes the (world in) reach as the perceivable. Thus, the primary reach implies unmediated (visual) perception and is accordingly characterized by a larger radius than the primary zone of operation. Primary reach implies the expectation of being able to bring the perceived thing "into manipulative proximity through a change of location" (Schutz and Luckmann 1974, 42). Technology expands both the radius of action (by means, for example, of bow and arrow or intercontinental missiles; Schutz and Luckmann 1974, 44) and the reach. The technological extension of the latter is above all a media-immanent achievement: "I can telephone, pursue events on the television screen while they occur on other continents, etc." (Schutz and Luckmann 1974, 44). Schutz thus implicitly differentiates between tools and media: the former are to be regarded as determinants of the size of the secondary zone of operation, the latter determine the extent of the secondary reach. Because of the intertwining of AR phenomena and the real surroundings in the perceptual horizon of the user, it shall be sufficient for the following analysis to ground our understanding of the term "reality" in Alfred Schutz's conception of the two zones of operation as well as the two reaches.

<sup>2</sup> The book *The Structures of the Life-World* was completed after Schutz's death by his student Thomas Luckmann, which is why it is certainly correct to speak of co-authorship. However, since Luckmann states in the preface to the work that "this book is the *Summa* of Schutz's life, and as such it is his book alone" (Schutz and Luckmann 1974, xii), I will refer to *Schutz's* perspectives, thoughts, arguments, etc., when referring to the book.

Before we pursue the main analysis, it is necessary to give a short explanation of which hardware will be examined (and why). Looking at the mediated visual perception of subjects using AR applications, the term 'hardware' is used synonymously here with the term 'display'. Because AR technology is computer technology, there are diverse manifestations of displays used for it. As Ronald T. Azuma rightfully anticipated in 1997, "AR systems … place a premium on portability, especially the ability to walk around outdoors" (Azuma 1997, 366). The wide distribution of smartphones, which can be utilized as AR displays, necessitates an analysis of the pictoriality of AR phenomena presented by mobile screens. While mobile screens must be seen as hybrid media technology, usable as AR hardware inter alia, there are technologies that are specifically and exclusively AR displays, namely, head-up displays (HUDs) and head-mounted displays (HMDs). While head-up displays, which nowadays are mainly used in cars, are mobile, they are not portable, but rather attached to a machine. They are therefore limited in the diversity of phenomena they will display. Head-mounted displays, on the other hand, are not necessarily subject to such a limitation; this fact makes a comparison between HMDs and mobile screens more salient than one between HUDs and mobile screens. The display class of HMDs must be further subdivided into HMDs "equipped with a see-through capability … using half-silvered mirrors" (Milgram and Kishino 1994, 1322) and those offering video see-through capabilities. As we will see, the latter should rather be considered analogous to mobile screens with regard to the pictorial relationship between the augmented and the objective reality. Therefore, this analysis of display effects on AR pictoriality focuses on mobile screens and optical see-through HMDs, both of which can arguably be considered predominant forms of AR display manifestations both now and in the near future.

Different displays imply different interfaces. Mobile screens are exemplary for the status quo of "[i]nteractions with digital information … [being] largely confined to Graphical User Interfaces (GUIs)" (Ishii 2008, xv). Natural user interfaces (NUIs), which have an "understanding" of human actions (derived from eye tracking, voice, and gesture control) as well as of physical objects, and thus support the blending of real and virtual objects (Kaushik and Jain 2014), are primarily suitable for the operation of HMDs. When we regard interfaces as "dispositifs of handling" (Wirth 2019) and thus primarily as forms of operativity "opening up accesses" (Wirth 2019, 81), it stands to reason that different types of interfaces imply different accesses to media-technological contents. The type of interface implemented in each case in new media technologies such as AR accordingly plays a crucial role regarding the emergence of different manifestations of pictoriality.

#### 2. AR Pictoriality: Image Objects and Their Subjects

Before we go into the analysis of the effects of different hardware for augmented reality applications, we examine the premise that virtual elements, which are put into the perception of the subject by AR hardware, are what Husserl would have called image objects. In this case, the question of which image subjects are represented seems quite easy to answer. Looking, for example, at an arrow in Google's *Live View* (Google 2019), it is important to consider its semiotic function: "*Whoever* … *lets themselves be shown a direction by means of an arrow, understand the arrow as a sign for the direction, which the arrow itself also has by its own shape*" (Wiesing 2013, 119). The arrow functions as a sign, which designates a direction through a similarity relation; following the terminology of Charles William Morris and Charles Sanders Peirce, it is therefore an *iconic sign* (Wiesing 2013, 122; 216). Other virtual elements in augmented reality, like virtual artistic sculptures, may represent fictional entities, while others can represent non-fictional entities such as an engine. A car engine that is rendered virtually and can be "taken apart" and examined in AR (Holo-Light 2020) represents a real engine; whether this real engine has already been produced or is still in a planning phase is of secondary importance for the representational relation between the image object and the image subject. Another "popular effect of AR is to translate fixed images into moving ones" (van der Veen 2021, 1192). An example of this was shown at the 2017 exhibition *Magic City – Die Kunst der Straße* in Munich. Four black-and-white photographs hung on a wall documenting works from the *Collision* series by New York artist Jordan Seiler, who removed posters from advertising boxes in public spaces and replaced them with his minimalist artworks. In the exhibition situation, the photographs serve as the basis for their own virtual augmentation. Each photograph represents a street scene in which we find the posters the artist hung in advertising showcases in a specific moment in time. But if the camera of the mobile screen, which (also) served as an audio guide for the exhibition, was pointed at the photographs, time-lapse videos appeared which documented the creation of the works in public space (Magic City 2017). The photographs "are superimposed by themselves, but in motion" (van der Veen 2021, 1192): the augmented version of each photograph represents the same street, but a different scene, which precedes the scene shown by the still photograph. In each case, the moving image sequence represents the act of mounting the artworks by the artist. Up to this point, the image relations between the image objects in AR and their subjects seem to be quite unambiguous.

*Figure 1: One photograph from the series Collisions by Jordan Seiler. Source: https://www.instagram.com/p/BOXmM3ZArI7/, Accessed November 23, 2022.*

The new media technology produces new pictorial potentials that are distinguished from those of other, non-augmenting images in particular by the mobility of the displays, which results in the location-based nature of the image objects. The interactivity, which is also immanent in other digital image manifestations, emerges in new ways through AR, generating new forms of image perception while the image character of the appearing phenomena is not dissolved. The fact that possibilities of interaction can generate a more determined transformation of the relationship between subject and image object than the previous examples have shown is exemplified by the AR application *Pokémon Go*. Here, the situation is particularly complex because at least two virtual elements appear simultaneously "as if they are next to a user's real-world location" (Rauschnabel, Rossmann, and tom Dieck 2017, 277) in the game situation "catching": the Pokémon, which is to be caught, and the virtual instrument for catching—the Pokéball. It could be argued that these are two image objects that appear in the same frame of the display, and furthermore, that they are in a direct relation to each other. But this misunderstands the concept of an *image object*, analogizing it too closely to a (real) *object*. We have already seen from the work of Jordan Seiler that image objects do not necessarily need to represent single objects, but that they can also represent scenes. A painting depicting a landscape shows only one image object—just as a portrait of a person against a monochrome background does. In the first case, the image object is the landscape; it is not an accumulation of image objects (e.g., trees and clouds). The image object refers to the image subject: that is, the painted landscape represents a real or fictitious landscape.
Thus, the image object in the case of the catching situation in *Pokémon Go* is neither the Pokémon nor the Pokéball; instead, an image object appears that represents the fictitious situation of catching a Pokémon. The subject's possibility of interaction with(in) the virtual image object is highly relevant for understanding the *image apprehension* (*Bildauffassung*) in this case. If the image object represents the scene "catching a Pokémon," then the subject interacts *within* the image object rather than *with* it.

The player interacts with different elements of the image object and the interaction with one element of the image object takes place by means of another: specifically, the subject acts on the image element "Pokéball" to put it in an interactional relationship with the element "Pokémon." The subject becomes an acting *ego* in the image object. A few years after he gave his lecture on fantasy and image consciousness, Husserl postulates the possibility to "'project' myself into the image" (Husserl 2005, 556): "But that can only mean that I extend the image space over me and over the space of my surroundings, and, excluding the real things that I see, assimilate myself into the image …. My participation is then the participation of a spectator in the picture (the participation belongs to the image object) …" (Husserl 2005, 556). He adds that "sensuous appearance *eo ipso* presupposes an ego-standpoint," meaning that the image-perceiving subject is "*always in* the picture as picture-ego" (Husserl 2005, 556). While in classical image forms the image ego can only play an observing role, the interactive aspects of digital image objects are able to turn the image ego into an acting ego. Here, the subject does not "fantasize" themselves into the image; rather the game invites the user to be part of the image object by offering interaction.

#### 3. Hardware Effects on AR Pictoriality

#### 3.1 Mobile Screens as AR Image Carriers: Framed Reality

Mobile screens can be described in different ways regarding the form of pictoriality they produce as well as regarding the relationship between the AR images produced by them and reality. On the one hand, it can be argued that the appearance, which is transferred to the display via video see-through, is a pictorial appearance. In the mobile screen, which functions as a frame, the real surroundings are seen—the same surroundings that can be perceived when looking past the frame. The supposedly real surroundings *in* the frame are now described as an image of the real surroundings *behind* the frame. This reading of the situation is certainly worthy of criticism, and it will be subjected to such; however, let us first take it up for consideration and reflect upon the AR applications previously mentioned.

If the non-augmented reality displayed on a mobile screen already is an image object representing the real surroundings behind the screen, the virtual elements of augmentation cannot be described as individual image objects. The situation would rather have to be described as follows: the image object appearing on the mobile screen is precisely the virtual-augmented real surroundings. The image subject would consequently be a fictitious version of the real surroundings—fictitious even in cases where the virtual element refers to a non-fictitious entity because the image object represents a version of the real surroundings in which something is visible that is not visible in the currently real surroundings. This can be illustrated by the example of Jordan Seiler's photographs exhibited in Munich. In the "normal" perception, the subject perceives one of the photographs hanging on the wall, and there is a conflict between the analogue image carrier and the image object. The image object wins in the conflict and appears: thus, an image consciousness occurs. In the view "through" the mobile screen, an image consciousness occurs even before the framed photographs appear in the framing of the display. When the mobile screen is directed at one of the photographs, the latter is augmented and appears to show the scene of the artworks being mounted in a public space. Even if the image object of the photograph is not instantly superimposed by itself, in the view through the display it would no longer be an image object, but an element of the image object that appears on the display (that is, a section of the exhibition space). In this case the image subject would still be a non-fictitious one: precisely this exhibition space. If the photograph is now augmented, so that its content begins to move within its frame, only one element of the image object seen on the mobile screen changes; however, the previously non-fictitious image subject acquires a fictitious status. 
That is, since a photograph cannot move, the image subject could only be a fictitious version of the exhibition space in which photographs can "magically" move. A strict interpretation of this perspective on AR images viewed on mobile screens raises the question of whether the term 'augmented reality' can even be used for applications run on mobile screens, because ultimately, they produce an *image* of the real surroundings, that is then augmented.

The other suggested perspective questions the idea that the non-augmented reality in the frame of the mobile screen already implies a pictorial relationship. As long as it can be assumed that the real surroundings are transferred onto the display without interference or latency, no perceptual difference between the alleged image object in the frame and the image subject behind the frame can be identified. The reality in the display ages together with the real surroundings and can thus no longer be described as physics-free or as "*a nothing*" (Husserl 2005, 50). This would mean that it cannot be an image object. Let us consider an example: I look on the display of a digital camera, searching for the right composition. What I see is a section of my real surroundings; only after I have pressed the shutter release, do I see an image of this section of reality on the display. If I have not taken a photograph but shot a video clip, I see a moving image on the display. These phenomena have a pictorial character because they are untethered to the slice of time they depict. It can be assumed that the view through a mobile screen leads to a *perception* of reality for the subject. This is consistent with an understanding of reality in which "reality" corresponds to Schutz's primary world within reach. The reach of the subject is not extended by the video see-through; instead, due to the framing of the mobile screen, one could rather speak of a reduction of the primary reach.

This second perspective on mobile screens as AR hardware allows the mediality of the display to recede into the background to such an extent that the display would have to be thought of almost analogously to an empty picture frame through which the subject looks into their real surroundings. Only the previously described virtual elements would then appear as image objects. However, the metaphor of the mobile screen as an empty frame is unsatisfying. Although the virtual elements appear to be located in the real space, they are bound to the physical image carrier in the hands of the viewer. The subject looks at the mobile screen with the knowledge that any part of the display that appears to be transparent in this moment can be occupied by a virtual image object in the next moment. Even though the temporal dimension of the surroundings shown on a video see-through image carrier corresponds to reality, the physical materiality of the image carrier evokes relations of distance between the real surroundings and their appearance on the display as well as between the subject and the AR image object(s). This distance cannot be resolved—unlike in the case of an empty frame. The ontological status of the "display reality" is ambiguous: neither actual reality nor an image of it. Günther Anders described a similar ambiguity with the term "phantom": the ambiguity of live images on television.<sup>3</sup> Live events seen on a television screen are, according to Anders, "at the same time present *and* absent, real *and* apparent, there *and* not there" (Anders 1961, 131). These live images are not "'images' in the conventional sense," for the reason that the temporal gap, which has always "fundamentally belonged … to the essence of the image, … has shrunk to zero with them" (Anders 1961, 131–132). The same can be said about the display reality in the mobile screen used for AR; admittedly this display reality comes even closer to actual reality than the phantom-like TV images. Here, not only the temporal gap, but also the spatial distance between display reality and objective reality has shrunk; the latter, however, not completely "to zero." The medially evoked distance between the phantom reality on the mobile screen and the objective reality behind the same is small enough, so that it is hardly noticed by the subject in perception, which is why perception of reality occurs. However, the described distance is present to a sufficient degree that the subject does not confuse the phantom reality with the actual surroundings. While the two "realities" may not be sufficiently different (visually) to produce a pictorial relation, the characteristic of the "[p]erception's field of regard" as "an associative combination of several separate sense fields" (Husserl 2005, 74) draws attention to the non-identity of the two.

This minimal perceptual difference again allows for two perspectives on the relation between image object and reality. On the one hand, it may seem that the real surroundings are part of the image object and part of the image subject at the same time. In this view, the phantom reality is part of the image object and thus provides a basis for the referring or interactive relations between the virtual elements of the image object and the objective reality, which lies behind the display and is thus part of the image subject—the augmented reality. The resulting interpretation, in this case, would be identical to the one created by the idea of the display reality as an image object. On the other hand, it could be said that the perception of reality conditioned by the described minimal distance leads to the appearance of the virtual elements being located in the objectively real surroundings. Here, the status of apparentness is not veiled, but is made present to the subject constantly by the minimal distance. Can the phantom reality provide an interchangeable background for the appearance of the virtual (image) objects? The current "reality background" cannot be described completely independently from the image object, as another look at the scenic image "Pokémon catching" shows. Regardless of whether the AR mode is activated in the game situation or not, or whether I am playing with the AR mode activated in front of the Humboldt Forum in Berlin or at home, the image object represents the situation of catching a Pokémon. Nevertheless, it can and must be specified that with deactivated AR mode, the scene takes place on a (fictitious) meadow, while, when the AR mode is activated, it takes place in front of the Humboldt Forum (for example). In AR mode, the image object becomes partially fluid, since it also contains the constantly changing reality background, which leads to the perceived identity of this part of the image object with a part of the image subject.

<sup>3</sup> As Lambert Wiesing (2011, 241) notes, the term "phantom" is not only used by Anders, but already by Edmund Husserl himself as well as by his student Roman Ingarden; while Husserl and Ingarden use the term as a synonym for the image object, Anders's use of the term can help us to examine the specific relationship between the "display reality" in mobile screens and the real surroundings.

The difficulty in fully grasping AR pictoriality mediated by mobile screens is due to the status of mobile screens as AR bridge technology. With the help of a hybrid media technology, virtual content is created and displayed, but the hardware is not designed specifically for this kind of application. The objective of merging virtual images and reality as seamlessly as possible can therefore not be achieved by mobile screens. Optical see-through HMDs, on the other hand, are identifiable as genuine AR technologies, which is why we will focus on them in what follows. As explained before, the term 'HMD' is intended to refer exclusively to optical see-through HMDs, and thus deliberately excludes HMDs that produce a see-through experience by means of video technology.

#### 3.2 HMDs as AR Image Carriers: Dissolution of the Frame

HMDs, which allow virtual elements to appear with the help of a half-silvered mirror system, exemplify a form of media transparency that mobile screens cannot achieve. The slightly darkened perception of the real surroundings through such mirror systems can hardly be considered a significant characteristic of *media* opacity, since otherwise one would also have to speak of such opacity when wearing sunglasses. Thus, HMDs offer an (almost) completely transparent (see-through) view onto the real surroundings. Consequently, it appears as if the virtual elements projected into the subject's field of vision no longer have an image carrier (van der Veen 2021, 1191). When the image carrier is no longer perceptible, the question of the pictoriality of the phenomena arises, since the image-constituting conflict between image object and physical image carrier "is at stake" (van der Veen 2021, 1191). The transparency of HMDs is supported by their interfaces, by means of which the subject interacts with the virtual elements. Looking at mobile screens once again, their touchscreens can at most be regarded as hybrids of graphical user interfaces and natural user interfaces, and the haptic control via the display reinforces the previously described distance immanent to mobile screens. The interaction between subject and virtual elements mediated by HMDs is based on NUIs by means of gesture and voice control. Such input possibilities for the subject reduce the immanent distance to the presented contents. Both in terms of interfaces and with regard to probable future AR hardware, such as contact lenses, the trend in the technological development of AR media is toward increasing transparency. The question of how AR pictoriality is constituted in the face of the loss of the image carrier in perception is thus of increasing relevance. If the empirical image carrier—the display—is no longer perceived due to its technical properties, then the real surroundings appear as the carrier of virtual image objects (van der Veen 2021, 1191). The real surroundings thus become the *perceptive image carrier*. The conflict, which leads to image consciousness, is no longer based on the "difference between the empirical image carrier and the imaginary image content" (Wiesing 2000, 22), but on the difference between the perceptive image carrier and the imaginary image content. Here, the conflict is based on the intrusion of an image object in the perceptual apprehension (*Wahrnehmungsauffassung*) and therefore on the overlapping of apprehension contents (*Auffassungsinhalte*).

In this situation, the subject perceives their real surroundings, and then a virtual element appears and obscures a part of these surroundings. The subject knows about the reality "behind" this virtual "object" and is able to become perceptually aware of it by physically bypassing the virtual. It still seems similar to a paper on which an image object appears. In the apprehension of the subject, the paper bears the character of the physically real up to the limits of the image object; since the image object tends to win the conflict against the *paper apprehension* (*Papierauffassung*), the paper apprehension can occur almost exclusively through damage to the physical image carrier or through deliberate action by the subject. This implies that AR image objects that appear by means of HMDs are in a double conflict: one with the empirical image carrier and one with the perceptual image carrier. The empirical image carrier, the HMD, can become opaque and appear in perception (in the sense that the user is noticing its presence) only through disturbance or damage. The perceptual image carrier, the real surroundings, can come back into apprehension through deliberate action by the subject. In the second case, however, there is no dissolution of the image object, as it still appears to be present and is now overlapping another part of the perceptual image carrier.

The virtual elements appearing by means of HMDs are thus to be considered as individual image objects. Although these image objects have a special relationship to their perceptual image carrier, the latter can be described neither as part of the image object nor as part of the image subject.<sup>4</sup> With regard to this, how could an image object be described that was assumed to represent a scene or situation by means of several virtual image elements? Let us imagine an HMD version of *Pokémon Go* in which both the Pokémon figure as well as the virtual Pokéball appear in the perceptual image carrier and the subject can "throw" the latter towards the former by making a specific gesture.<sup>5</sup> It could still be argued that the image object represents a situation in which the subject and the real surroundings are directly involved, but this argument seems paradoxical. The virtual elements would now be perceived as part of reality. However, this does not seem to be the case: rather, it must be assumed that the perceptual dissolution of the empirical image carrier and the associated elimination of any kind of framing enables the simultaneous appearance of several virtual image objects. This is impressively illustrated in the short film *Hyper-Reality* by the designer Keiichi Matsuda (2016). Here, the diegetic world is almost completely augmented, so that reality only appears among the virtual phenomena (of textual, pictorial and other kinds) when the protagonist's AR system malfunctions and has to be restarted. What can be described as a dystopian collage from an extradiegetic perspective can only be considered as the "co-existence" of several virtual elements (including AR image objects) from a diegetic point of view. When various advertising boards are perceived in a public space, each of which (can) make an image object appear, and the image contents of these boards are superimposed with various virtual artworks by means of an HMD, these virtual artworks are not *one* image object. Furthermore, the example of *Pokémon Go* shows that the virtual image objects are related to and might interact with each other. Here, the subject acts purposefully towards an image object—the Pokéball—in order to bring it into an interactional relationship with another image object—the Pokémon. Thus, the subject interacts with one image object by means of another one.

<sup>4</sup> Since the perceptual image carrier is now actually three-dimensional, I will speak in what follows of image objects appearing *in* the image carrier instead of *on* it. If there are references made to the empirical image carrier (i.e., the display) the image objects will still be described as appearing *on* the image carrier.

<sup>5</sup> What such an HMD version of *Pokémon Go* could (or will) look like was presented during a keynote at the Microsoft Ignite 2021 conference (UploadVR 2021).

Such potential forms of interaction as well as the plurality of image objects *on* and *in* an image carrier may tempt us to at least partially deny the pictorial character of virtual image objects and to ascribe an object character to them instead. So, do some AR technologies or applications make a "strictly iconoclastic use" of the image after all—as Gottfried Boehm (2006, 12) says regarding the simulation image? Do AR media overexert their immanent pictorial potentials to the point of annulling their own pictoriality?

#### 4. Final Remarks

Summarizing the analyses of pictorial AR phenomena displayed by means of mobile screens and HMDs, it becomes evident that technological developments—changes in hardware used which might seem trivial at first—evoke changes concerning the pictoriality of these media phenomena which can be described as significant from an image-theoretical perspective. It is highly probable that the leaps regarding AR pictoriality are smaller from one step in technological development to the next than the leap presented here; thus, further AR display manifestations, such as "'window-on-the-world'" (Milgram and Kishino 1994, 1322) or video see-through HMDs, should ultimately be included in a genealogy of AR pictoriality, even if such displays will arguably not play a significant role in AR technology in the future.

The most significant leap presented here is what I call "the dissolution of the frame." This dissolution allows for the appearance of miscellaneous individual image objects in the perceived surroundings. It was assumed earlier that when using mobile screens, the (potentially) changing "reality background" appears as part of the image object as well as the image subject. If the described minimal difference between phantom reality and objective reality is dissolved when using HMDs and reality becomes the perceptive image carrier, if reality would thus become part of all three aspects of pictorial perception, there could hardly still be an image consciousness. Thus, the (supposedly) transparent carrier medium and the resulting perceptive embedding of image objects in the real surroundings may tempt one to address AR image objects as (virtual) *objects*. Is it therefore legitimate to describe AR phenomena as things rather than images? In normal conditions it must be assumed that AR image objects displayed by means of HMDs are not apprehended as part(s) of the objective reality; if they are apprehended as such, it would be hard to explain how they *augment* said reality.

We have seen that the conflict between the virtual image objects and their empirical image carrier only occurs if there is a disturbance of the HMD. Therefore, it is the second conflict, the one between virtual image objects and the perceivable image carrier (i.e., reality), which mainly constitutes the pictoriality of the described AR phenomena. Shortly after Husserl's lectures on phantasy and image consciousness he describes a conflict different from the one between an "appearing image object and the physical object [i.e., the physical image carrier]" (Husserl 2005, 171) which he applies to hallucinations:

Hence conflict between *what appears and what is demanded empirically*. … [I]t can also relate to the external connection of the object with other objects in the unity of reality (the unity of 'nature'). Here, however, not only the immediate intuitive connection with the surroundings (the intuitive present) comes into consideration, but also the circuit of memories, the 'elaboration in thought' of empirical experience …. What appears directly and without opposition, and is also not contested by any external intentions (hence there is no talk about the pictorial …), '*exists*,' is *valid*. What conflicts with what appears without opposition (with what is given without opposition) does not exist. (Husserl 2005, 171–172)

One might say that the perceptive opposition in which AR image objects stand to the unity of reality "surrounding" them is only a matter of technological development still to take place and that further development might adapt the appearance of these image objects to the appearance of real objects in such a way that the opposition will therefore not endure. The technological development from mobile screens towards HMDs might support such an argument, one based on the transformations of pictoriality this development evoked. Still, the idea of iconoclastic images produced by means of AR media technology does not seem completely persuasive, as it neglects the role of the subjective viewer. While the occurrence of image consciousness has to be described as pre-reflexive and pre-subjective in the case of more "traditional" forms of images, when using AR HMDs this occurrence might still be pre-reflexive, but it is no longer pre-subjective. The subjective experiences and memories constitute an opposition between the AR image objects and the real surroundings. Here, also, the knowledge of the subject about using a technological device to display virtual elements in their real surroundings has to be taken into consideration. Accordingly, there is reason to doubt that further steps in technological development will enable AR displays to present a virtual element which is no longer apprehended as an image, as "a mere figment" (Husserl 2005, 52), as a "nothing."

Even if virtual AR phenomena will never be apprehended as parts of the "kernel of reality," they already are—like all media phenomena—ineluctable parts of our cultural and social world. AR technologies displaying interactive image objects in the real surroundings of the subject are the next "epochal step in the history of humanity constituted by the new images" (Wiesing 2010, 101). When display manifestations such as optical see-through HMDs or even contact lenses become part of the normal configuration of human access to the world, it will be necessary to analyze precisely the transformations of pictoriality produced by such media technologies. Emerging theoretical problems require an image theory that faces an extension of pictorial characteristics without bias, without moving away from a conception that understands "the pictorial phenomenon as a basic component of reality" (Lotz 2010, 167).

#### References


*Geschichte der Philosophie*, edited by Simone Neuber and Roman Veressov, 167–181. München: Fink.


## A Kind of Mixed, Intermediate Experience. On the Entanglement of Image and Bodies

*Julia Reich & Manuel van der Veen*

#### Abstract

In our chapter, we focus on the concept of mixture that primes the description of mixed reality images. The latter do not insist on one augmented or virtual reality; rather, they explicitly install the interaction of multiple realities. However, this does not mean that in the mixture the individual components get lost. Consequently, we selectively pursue those moments in which the entanglement of image and bodies becomes apparent. In the first section we analyze techniques from film and theater, such as the phantasmagoria, in which an interaction of image and body is essential. Therefore, we refer to a concept of the body by Michel Foucault, who argues that real bodies reach into virtuality as well as virtual bodies require a localization. Based on this viewpoint, the main section examines three contemporary artworks by the artists Banz & Bowinkel, Sarah Rothberg, and Charlotte Triebus. Ultimately, in these artworks at least two mixed bodies meet in a mixed reality image in order to enable a "kind of mixed, intermediate experience."

#### Keywords

Corporeality, body experience, mixed reality, augmented reality, contemporary art, encounter, movement, screen

#### 1. Introduction

When it comes to virtuality, one pair and its difference has been of interest until now: virtual reality (VR) and augmented reality (AR). Despite different concerns, this all-too-simple separation was hardly sustainable, because neither a distinct technology can be assigned to the terms, nor can transitions between them be ruled out. Subsequently, terms have been installed that attempt to frame an entire field which outlines the entanglement of virtuality and lifeworlds. Somewhat vaguely, as already with Aristotle's texts, which could not really be assigned to "physics" and therefore provisionally were put together as meta-physics—today the Metaverse shall now follow the universe. In the same way, new umbrella terms are sought for such techniques as VR and AR, which make the above-mentioned entanglement visible: these are extended, crossed<sup>1</sup> and mixed realities.<sup>2</sup> However, to set mixed reality images as the third and last in the sequence of synthetic realities rather than at the center provokes a statement: in this case, it is no longer a continuum where mixed reality occurs as an approach of augmented and virtual images. Outside of this dichotomy, mixed reality is to be understood as a field in its own right. Unlike an alternative virtual reality or a one-way-augmentation of existing reality, mixed reality is a coincidence of distinct realms. Distinct realms that are shaped by a collision to map out a mixed or intermediate field of vision.

We will neither subscribe to an existing order here nor commit to a genre of mixed reality images, and we will not use technical components to define them. Instead, we dedicate this chapter to a mixed experience in relation to the body and its phenomenal consequences in contemporary art. Since mixture, merging, and blending suggest that the different components are no longer distinguishable, we discuss how a mixture can be experienced at all. In the first step we propose a philosophical analysis of what the French philosopher Michel Foucault

<sup>1</sup> In their text, the authors argue that any provisional naming does not do justice to the individual experience. It is therefore necessary to find out on site what realities are crossed and how the result is to be called (cf. Verhoeff and Dresscher 2020).

<sup>2</sup> The term "mixed reality" was already proposed in the 1990s by Paul Milgram and Fumio Kishino to describe a continuum between virtuality and reality, although it is precisely the extreme poles of the scale that seem difficult to maintain. Compare the extended version of the continuum in Milgram and Colquhoun 1999.

called "a kind of mixed, intermediate experience" (Foucault 1998, 179). And in the second part we will pursue this mixed experience of a virtual and a physical body in three aspects: anticipation, alignment as well as attraction/avoidance. These aspects will be applied to contemporary artworks to emphasize the specific entanglement of image and bodies.

#### 2. Experiencing a Mixed or Intermediate Body

As an opening scene for practicing the reflection on mixed reality images, we describe the approach of different bodies in *Blade Runner 2049*. In this science fiction film a very intimate but impossible relationship develops between Joe, a replicant, and Joi, an AI projection without a carrier (which also fulfills the fantasy of a hologram, as she is called).<sup>3</sup> The bodiless AI Joi says, "I wanna be real for you," and Joe replies, "You are real for me." However, in order to be physically in touch with each other, the hologram hires Mariette, a replicant sex worker, and synchronizes with her. Joi adapts the movements of her projection to the movements of Mariette's physical body. We as viewers accept the synchronizing process while looking at their hands, which is why this moment of connection could be called the scene of hand tracking (Fig. 1). For a short moment in the film, we adopt a first-person perspective onto the hands in order to test whether the image hand is "truly ours" by moving it. In fact, the mixture already starts before this test. It begins with the two bodily shapes of Joi and Mariette, and it is not visually evident which one has a physical consistency and which does not. The complexity of the constellation unfolds when the two bodies approach each other and becomes more intense when one body enters the other or is superimposed on it, i.e., at the moment of the mixture. This representation in the film has a fascinating effect, because in the following minutes the real body breaks through the image-body again and again, and shifts the boundaries of the skin. What makes this scene so apt is that we are dealing with a mixed image-body and that this

<sup>3</sup> Ultimately, holograms also depend on a carrier. See the comprehensive study by Jens Schröter (cf. Schröter 2014) or, regarding the carrierless fantasy, Eric de Bruyn (cf. de Bruyn 2015).

mixture is necessary to build a bridge between virtual and physical corporeality. If we additionally consider an HMD-based virtual reality at this point, where our hands are represented too, and able to act within, we realize that the constellation is not a one-way road. In virtual reality, our real body is superimposed by an avatar, through hand tracking, in order to be able to interact with virtual bodies. Consequently, there are not the virtual bodies on one side and the real bodies on the other. Hence in the following we refer solely to mixed bodies and the mediating relationships between them.

*Figure 1: Scenes from Denis Villeneuve's Blade Runner 2049 showing the moment of mixture of Joi (Ana de Armas) and Mariette (Sallie Harmsen) described here as the scene of hand tracking (01:27:18) © 2017 Alcon Entertainment, LLC. All Rights Reserved.*

A mixed-body concept was already outlined by Michel Foucault in his 1966 talk *L'utopie du corps* (*Utopian Body*). In the Foucauldian understanding, Joi has no real body because she has no fixed place in the world. The projection, after all, is able to appear anywhere, even where something else already is. In contrast, our bodies are not able to move from their place: "My body, pitiless place" (Foucault 2006, 229). A body is an absolute place, a *topia*, that fixes us without mercy: everywhere we go, our body is always with us. And even if we travel to virtual worlds, we cannot get rid of our body, which after all is wearing the HMD glasses. *U-topia* is therefore not only the negation of a place, but also the negation of our body. Joi has no place and thus only a virtual body or, speaking with Foucault, only a bodiless body. So, in order to become a "pitiless place," she projects herself onto the place of a tangible body.

However, Foucault lets his argumentation culminate in an inversion, when he ends saying: "My body, in fact, is always elsewhere. … And to tell the truth, it is *elsewhere* than in the world" (Foucault 2006, 233). Unlike before, the body is now the source of all utopia—it is as much a topia as a utopia. Our body is split: it is here *and* elsewhere, superimposing the here with an elsewhere. And this designation is not an exception; it is the very condition of being a body. A designation, which refers directly to a mixed body. Of course, Foucault hardly anticipated virtual reality experiences, but his thoughts provide a helpful framework to think about virtual corporeality, to which we refer here. In summary, this means that the experience of a virtual world—in which we travel to places that do not exist and look at hands that are not ours—on the one side and on the other side the entanglement of the AI and a physical body are to be read in one line. In both, the result is a mixed body and thus, based on Foucault, the technical visualization of an existential fact.

After all, this does not imply that the entanglement works in harmony. Despite the synchronization of Joi and Mariette, each body keeps its autonomy. This becomes obvious when both of them take off their clothes. Since their bodies wear different outfits, they also perform different movements while undressing. An interplay which becomes all the more complex when one considers the production process of this scene: it involves two different actresses playing the same scene one after the other. Both scenes are then mixed into one split image-body. In addition, the two actresses were scanned in 3D in order to let a completely new or at least third mixed figure, which was computed from both scans, intervene at certain moments. The slight shifts between the calculated, projected and physical body maintain the difference between the two women and scenes. This is reminiscent of the theater of *phantasmagoria*, when two bodies interact, each performing on its own level and without seeing each other. For the installation of the famous optical illusion *Pepper's Ghost*, the scenes are not performed one after the other, but layered on top of each other simultaneously. One person is standing under the stage and is illuminated by a light source, so that the reflection is cast onto a pane of glass one level above. Due to the semi-transparent carrier, the luminous figure appears to the audience as if it were on stage. In contrast the actor on stage sees the projection, but not at the same place as the audience. Both actors therefore have to act as if the other body were present, as if there were a connection between the body and the projection, while their location, consistency, and agency are not the same.

Both in the case of the hand tracking scene and in *Pepper's Ghost*, the technical setup for the interaction of two different bodies plays an important role, as it does in Foucault's conception of being a body. There are certain techniques that externalize the split of a body and thus make it possible to examine it, e.g., the mirror. The mirror maintains an existence that is as split as our bodies. Indeed, a mirror occupies a real place, but it cannot occupy a real place without referring to a place that is elsewhere. In another, far more famous text, *Des espaces autres* (*Of Other Spaces*), Foucault revisits the constellation of mirror and body by giving this experience a name: *mixed, intermediate experience*. The indecisiveness of this name is itself programmatic. What appears to be mixed only becomes the object of reflection through intermediation. As the mirror example demands, a mixed, intermediate experience requires an understanding of how this mixture is made possible by technology as intermediation. Based on Foucault, we describe the mixed bodies as follows: mixed reality makes the place that the virtual body occupies at the moment when I look at it in the glass at once absolutely real, which connects it with all the space that surrounds it, and absolutely unreal, since in order to be perceived we have to pass through this virtual point which is over there, there where it is not (cf. Foucault 1998, 179).

This illustrates the importance of considering two perspectives, both production and reception, in understanding mixed reality images and their entanglement with the body. Hence, if we now look at contemporary artworks that have virtual corporeality as their subject, we do not assume that the bodies in the images are virtual and that we, the viewers, are the real bodies. Both parties must appear mixed for this experience, as mixed reality image-bodies: the virtual bodies have to be mapped to an actual place, and we must enter the virtual space with our own bodies. The reading of Foucault's text and the scene from *Blade Runner 2049* already designated various fundamental constellations that will be questioned here. For the first part 3.1, which is subsumed under the term of anticipation, we ask: How to describe an experience in which one's own body is superimposed by another and what happens if our movements are translated into the movements of another body? With regard to the concept of alignment in part 3.2, it must be asked: How do bodies interact which are not at the same place, but between which a relationship takes place? And, ultimately, to approach the concept of encounter in part 3.3: How does an interplay of attraction and avoidance develop as corporeal experience in between image and bodies?

### 3. Three Ways to Entangle Bodies and Their Images

The introductory interpretation of Foucault seems to us to be appropriate, since spatial recognition and localization of both virtual and real bodies play a decisive role in the technologies discussed. However, this does not mean that we reduce the body here to local information. If a body has a place, then it is able to move in the world and subsequently two bodies are able to relate to each other. Furthermore, it is about an imagined and projected place, about translation procedures, and controlling behavior. A body that responds to stimuli *in situ*, that looks and is looked at, that shows itself, that curiously approaches the others and carefully withdraws. To illustrate this, the beginning of these considerations about the entanglement of image and bodies takes the subtitle literally and examines bodies in relation to a painting.

#### 3.1 Anticipation: Banz & Bowinkel's *Bodypaint V 09*

The *Bodypainting* series by Banz & Bowinkel, an artist duo consisting of Friedemann Banz and Giulia Bowinkel, is neither paint on a body, as the title may suggest, nor does it show a body that paints. Instead, and this is our hypothesis, two real bodies at different places meet in the imaginary space of painting. The result could therefore be called a mixed reality image-body that does not rest on a direct confrontation of two bodies, a circumstance that we now analyze in detail. As viewers of Banz & Bowinkel's *Bodypaint V 09* (Fig. 2), we may be surprised to find a painting that is characterized neither by an artist-specific brushstroke, nor by pasty areas of color, nor by different surface textures, such as an underlying canvas structure. The materiality of a painting is leveled here on the flat surface of a CGI print. So, it is a painting that was not even applied directly to the visible piece of paper in front of us. In other words, the carrier of the print is not the arena of the painting, which was rather painted elsewhere.<sup>4</sup> Hence, Banz & Bowinkel negate precisely that traditional understanding of painting which is based on the mythical act of an artist touching the canvas. Instead, the act of painting is delegated to technical translation procedures, and the artist duo thereby refers to the performative act of artistic production.

Prior to the painting that is visible to us, there was a performance made in an almost sterile grid room, which has more in common with a laboratory, than with a messy painter's studio. From Jackson Pollock and his formally related drip paintings, we also know the studio documentations in which the color goes far beyond the image field. In the case of Banz & Bowinkel it is instead a performance without a mass of paint and without a canvas, but not without an observer. In place of Hans Namuth as a human observer and documentarist of Pollock's work, Banz & Bowinkel's moves are registered by an apparatus that tracks and translates the body choreography into lines in space. In reference to the artificial intelligence debate nowadays, it is not only questioned what art is, but also who makes art. However, hidden in the

<sup>4</sup> In his famous article "The American Action Painters" in *art news* from 1952, Harold Rosenberg designated the place of the canvas "as an arena in which to act" and this statement can be found under the heading "getting inside the canvas" (Rosenberg 1982, 25).

*Figure 2: Banz & Bowinkel, Bodypaint V 09, 2019, 150 x 200 cm, CGI pigment print and AR application. Courtesy of the artists.*

current phenomenon is a more specific question, namely whether one needs a physical body to make art. Since the media specificity of painting forbids showing movement, the physical work behind the genre was hidden from the viewers for a long time. And this very question of bodily movement is connected with a procedure that became popular as *action painting*. At first glance, the *Bodypaintings* of Banz & Bowinkel therefore resemble classic drip paintings, such as those known from Jackson Pollock. A mediating role between painting and movement first had to be taken on by the film camera, which it did, for example, in Pollock's work. This aspect is crucial, because with the movement captured through the lenses of a camera, the prominence of the body came into focus.

It can be stated at the outset that in the artificial light of the film setting, painting had to leave the painterly behind (cf. Meister 2014, 139). Driven by the running reels of the film, painting became as linear as it became graphic. Unlike a painted surface, a line has a beginning and an end, and between the two, the movement of the hand or even the entire body appears. The fact that the photographer and filmmaker Hans Namuth pointed his camera at Jackson Pollock in the 1950s was an important factor in understanding his paintings as an action and his brushwork as a performance. It is hardly surprising, then, that with current tracking technology, artistic experiments are once again negotiating body and movement in relation to painting. Like Banz & Bowinkel, Pollock does not touch the canvas. As a result, he needs to anticipate the "brushstroke" in an almost ballistic manner. For the movements of Banz & Bowinkel, however, there is not even a carrier.<sup>5</sup> They use neither brush nor paint; only a body choreography remains for the artists, while the translation of the movements into color is done automatically by the computer. Nevertheless, while they perform in the studio, they have to anticipate all the lines, which are then calculated by the technique that paints.

A form of physical anticipation plays a decisive role in the work of both Pollock and the artist duo, which has to do with the fact that neither of them touches the canvas directly. Additionally, in both artistic practices the act of painting and its performance are emphasized. Even if the obviously dynamic strokes of a drip painting suggest a production in motion, it was the film, and thus Namuth's recording, that put Pollock's action on an equal level with the resulting picture. In Banz & Bowinkel's work, it is not a film but an AR layer that fulfills this task, and it is not someone else who highlights the performance. Rather, via AR, the physical movement becomes another layer of the painting itself. In a sense, this entire process only becomes vivid at the level of the AR on the user's device.<sup>6</sup> With a programmed app, the *Banz & Bowinkel AR*, all body paintings open up another level of reception. In the case of *Bodypaint V 09* a stylized avatar appears on the display, performing the movements of the artists with both hands. Strictly speaking, it is this avatar that paints, and only in AR are we able to see how the movement actually throws a paint path into space. The fact that *Bodypaint V 09* makes the physical painting process repeatable in reception via the avatar illustrates the extent to which the eventfulness of image production and its strong reference to the body are intermediated in the entanglement of the mixed reality image-body. Finally, it is the avatar that mediates between the artist's body and the body of the viewers. Through this avatar the viewers are able to physically comprehend the actual movements, albeit only in an imaginary space.

<sup>5</sup> In their video work *Deamons* (2014), Banz & Bowinkel demonstrate to a certain extent the movement recordings in their studio work. However, since the physics are artificial, the entire body has to reorient itself to anticipate the dripping behavior.

<sup>6</sup> The performance of the avatar can be seen by pointing the screen at Figure 2 via the downloaded app *Banz & Bowinkel AR.* Accessed December 12, 2022. https://www.banzbowinkel.de/apps/.

With the AR layer, it is not only the concept of art and its production that is questioned and stretched anew, but also that of the reception of art, which poses another form of anticipation. The painting anticipates, after all, which movements the viewer will mentally execute. Other *Bodypaintings* in this series are not associated with a painting avatar on the level of AR. Instead, the stream of color appears to take on a life of its own and recede until a nearly empty surface remains. And just before the origin, the empty canvas, the stream begins to spread out in space again, allowing the viewers to join in the physical movement that was once performed in a space. Namuth's film about Pollock documents how a movement becomes a painting. In this case, the painting is not dependent on the technique of the film. The *Bodypaintings*, by contrast, exist only due to a tracked movement that AR turns into a true layer of the artwork. In this tracking procedure, the technique is necessary to convert the movement into a painting. This aspect becomes even more evident if we consider the chronophotographs of the Albanian-American artist Gjon Mili.<sup>7</sup> His technique is based on photography and a long exposure time. Mili attaches small lamps to the bodies of various artists and athletes, whose movements are tracked by the photographic plate in order to produce luminous diagrams: diagrams of the sliding movements of a violinist, the brush swing of Pablo Picasso, or the prancing movements of figure skater Carol Lynne. The moving protagonists of Mili's photographic experiments cannot see what they are doing, or what they are drawing. This indicates that the technology creates this line in the first place.<sup>8</sup> Almost in logical consequence for the movements

<sup>7</sup> Namuth had seen these printed in *Life Magazine* in 1950, and just one year later he made the film about Pollock (cf. Meister 2014, 145).

<sup>8</sup> Mili's photographs bear a resemblance to the chronocyclographs of the couple Frank and Lillian Gilbreth. The latter wanted to capture a perfect workflow via the lines in space, which could then be mimicked by other workers. A strategy that reappears in AR, for example, when it is used in assembly and directly writes lines in the air, thereby dictating the movement to be executed.

to emerge clearly, the surroundings had to be darkened. And along with the environment, the shape of the body disappears—what remains is the record of its movement, a line in space. Movement and lines thus maintain their autonomy at the same time as they mutually constitute each other. The movements develop by anticipating the lines, and the lines pursue the movement. Banz & Bowinkel create a real-life event that is subsequently translated by technology into a resulting work of art, which is a painting. The stereotypical avatar body, modeled after a male figurine, resembles the color traces in its glossy surface texture and equally presents itself as the maker of the corresponding composition. In this way, not only classical topoi of painting, such as those of artistic authorship and creatorship, are caricatured against the backdrop of current artistic production. Moreover, this approach raises the very fundamental question of what constitutes a painting or an artwork and how artistic production has changed in the age of virtuality.

What does this imply regarding the entanglement of image and bodies in *Bodypaint V 09*? There is an artist's body that imagines the painting but cannot see it, and a body of the viewers that may see the painting but is thereby immersed in the artist's movements. In short, the viewer's body at the place of the exhibition and the artist's body at the location of the studio meet in an imaginary space, which is mediated between the painting and the user's smartphone in AR. It is therefore not only a matter of tracking bodies through a certain technique, but also of translating this virtuality back again.

#### 3.2 Alignment: Sarah Rothberg's *Longing*

In Sarah Rothberg's *Longing* (2021) there is no painting, and the movement of the artist's own body is not relevant here at all. In contrast, the viewer's body as a topia is now decisive, as is the fact that this body faces a virtual body that is located "elsewhere." The latter can certainly be understood as an avatar of the artist and thus, despite its "painterly" appearance, opens up a reference to our lifeworld, since it directly connects with the display frame. But as mentioned before, this body does not perform any movement of its own. Rather, the artist's avatar aligns itself with the movements of the mobile screen holder, i.e., the viewer. Understood as an avatar representing and substituting the artist, which

*Figure 3: Sarah Rothberg, Longing, 2021, AR application, AR biennale, NRW Forum, Düsseldorf. Courtesy of the artist.*

Rothberg developed within the context of a synchronous AR performance, *Quarantine Me* (2020), both the viewing conditions and the relationship between closeness and distance are reflected here.<sup>9</sup> The AR starts by showing a purple nude female figure within the display, more drawing than photograph (Fig. 3). Meanwhile, on the acoustic level, the question is asked: *Are you longing for something?* After an initial poetic dialogue, the figure stretches her hands toward the user and fixes them to the frame of the visual field.

If one wants to praise a work of art for its convincing vitality or its represented three-dimensionality, one traditionally does so by speaking of its crossing of the frame. "The figure seems to be looking at me," "it comes towards me" or "almost falls out of the frame": these are strategies that

<sup>9</sup> The reference to the creation and integration of the avatar is taken from Sarah Rothberg's website. "Quarantine Me." Accessed December 12, 2022. https://sarahrothberg.com/QUARANTINE-ME.

were then literally depicted in *trompe-l'œil*, or one thinks of Pere Borrell del Caso's boy in the painting *Escaping Criticism* (1874), who pushes off from the frame into real space with his hands. According to Timo Skrandies, the frame offers us an "existential security, a topographical *effet de réel*" [transl. by the authors] and assures us an existence in the moment of looking (Skrandies 2010, 256). In Rothberg's case, the work comes toward the viewer, the hands coiling like rubber snakes in space, expanding and seeking contact. In doing so, information is used about which the technology is certain—the position of the viewer, which is ultimately identical to that of the image carrier. Because the users hold the carrier in their hands, the image always knows where they are. To vary Foucault again here, the viewers are, after all, where the image is, even when the image appears elsewhere.

The title-giving longing is here of course the longing of the image-body to get in touch with the viewers. Hubert Damisch called this a "désir de mur," the desire of the image to anchor itself in the world or at least on the wall (cf. Damisch 1984). This longing is expressed by the avatar's hands flying towards the viewer in search of grip, and by its alignment with the frame of the screen. As in the all-too-familiar goodbye scenes, when a hand presses against the window of a train from the inside, so here two purple hands touch the screen from one side. The counterpart, the hands of the viewers, occupies the other side of the screen. In this way, it seems that two bodies are facing each other and holding a frame together that connects and separates them in equal measure. Here the window or the screen serves as an unbridgeable distance despite proximity: an ultra-thin distance that shows that the spatial separation is inescapable, or better, that the separation is indeed already the case. But instead of making explicit, as in the departing train, that the two bodies represent two independent places that can move away from each other, the experience here is about a contact that cannot be detached even by movement.

If the color strokes in Banz & Bowinkel's *Bodypaintings* had a beginning and an end, that is, designated the beginning and the end of a movement, then the line here indicates the relationship of two positions in space and two pairs of hands that surround the image carrier. Thus, it is about two bodies, their positions, and a relationship between them. In the process, the viewers inscribe their movements into the body of the other one, forcing them upon it. The "artist's body," however, does not complain, because its position can remain unchanged,

and its extremities are in any case boundlessly flexible. Everything the viewer does is immediately inscribed into the work. The work is what the viewers do. This again shows the autonomy of bodily movement, this time that of the viewers. But the movement of the viewers is autonomous only insofar as the body of the image is able to adapt. Because it knows where we are, it possesses a place itself and attentively registers every one of our movements. In this respect, the avatar also sets limits for our movements. Unlike a traditional sculpture in a park, one cannot walk around this body, because the figure in the image permanently fixes the display that connects the image carrier and the viewer. If the viewer does not move, the arms stay close to the figure. Only if the user walks around or swivels the device are the movements captured and translated into twisted arms. One is bound by the image carrier until the poem ends and the body releases its hands from it. In a sense, Sarah Rothberg's figure can also be placed in relation to the tradition of nude female sculptures in public space, since the chest and pubic area of the purple avatar are visually accentuated. The avatar could be discovered as part of the Düsseldorf AR Biennale,<sup>10</sup> which gathered many AR artworks in the local park called Hofgarten. In addition to AR works, the park is home to numerous sculptures, such as Aristide Maillol's bronze statue *Harmony* (1953), a nude woman standing in classical *contrapposto*, reminiscent of ancient Venus types. It may be a coincidence or curatorial smartness that Rothberg's figure was placed not far from Maillol's bronze. This creates a friction, since Maillol is considered a great sculptor of the female nude. Rothberg's avatar instead challenges the voyeuristic gaze through its permanent alignment with the viewer and thereby prevents the traditional circling of the sculpture. The traditional act is made impossible because of an inescapable coupling of eye contact (the image always knows where the viewers are). While Maillol's nude in the antique manner has no arms attached to her torso to ward off voyeuristic eyes, it is the fixed hands and prolonged arms of Rothberg's avatar that solidify the constant face-to-face.<sup>11</sup>

<sup>10</sup> The AR Biennale (August 22, 2021–April 24, 2022), initiated by the NRW Forum, showed AR works by 19 international artists in the public spaces of the cities of Düsseldorf, Cologne and Essen. "AR Biennale." Accessed December 12, 2022. https://www.nrw-forum.de/ausstellungen/ar-biennale.

<sup>11</sup> This seems to stand in line with feminist performance positions of the 1960s and 1970s, which used the strategy of exposure and (medial) staging of one's own body to reflect

In the case of Rothberg's *Longing*, the bodies (of the avatar and the user) and the image align with each other, coincide in the frame of the display and evoke an interactive experience. The virtual avatar reacts to the real movement, while our movement is virtualized. At the place "where our hands touch," an image emerges, which could be described as a preliminary state of the entanglement between mixed bodies. Unlike the avatars of Charlotte Triebus' work, which are discussed below, the violet nude is designed for a dependent alignment rather than a direct confrontation with the viewer. Rothberg's avatar seems to determine the user's gaze on the one hand, while on the other hand the users impose their own movements on the avatar. In this way, an ambivalence arises between the inescapability of the synthetic counterpart and its physical deformation through one's own going back and forth. On a purely factual level, however, it is the display movements, i.e., those of the user, that are forced onto the avatar. The figure adapts herself physically to the user's body height, and is therefore always at eye level. But through this alignment of the figure viewed on the display and our body movement, a complex choreography develops in which the relationship between active and passive, viewer and performer, seems to oscillate. The hands of the avatar are the contact zone to the user's hands holding the device, and the elongating arms represent the paths completed in the shared experience or one's own tangled mess. While this may initially suggest an interaction, it is intended to be quite one-sided. In contrast to Banz & Bowinkel's *Bodypainting*, in which the painting process and the real-body movements are associated and conveyed to the user via the avatar, Rothberg focuses on the direct movements of the user in conjunction with her avatar.

#### 3.3 Attraction/Avoidance: Charlotte Triebus' *kin\_*

In the AR work *kin\_* (2021) by performance artist Charlotte Triebus, the point of contact is neither an imaginary space, nor is the relationship of two bodies a one-sided alignment. Unlike the previous two works, *kin\_* is furthermore tied neither to a specific location nor to a physical artwork. The work can be experienced via app on a personal device at any time and in any place, because it adapts to situational conditions, i.e., the lighting and the spatial relations, by means of a LiDAR scanner. When opening the app, the user is instructed to first place a portal on a floor surface. Then, three avatars appear outside the display field and only come into visibility with the user panning and moving (Fig. 4). The physiology and physiognomy of the avatars are an adaptation of the real appearance and movements of the artist herself, who was their model and therefore is able to be located "elsewhere." As Steve Dixon mentioned, "the performing virtual body is neither less authentic than the live, nor is it disembodied from the performer" (Dixon 2007, 215). Triebus lends them her face, facial expressions, body shape and movement by making her own body documentable via taking standard 3D-scan positions, from which the motion patterns of the virtual choreography derive. This is also the origin of the peculiar entrance poses of the figures. They derive from the production process: for the benefit of its overall capture and transferability, Charlotte Triebus had to adapt to the technological procedures of image generation and not the other way around. For example, the T-pose, a standard orientation in animation where the figure stretches both arms out at right angles, is often used in game production as a placeholder for an unfinished move but is used in Triebus' work as an important choreographic element. On the one hand, the movements are reminiscent of the classic sculpture poses of standing, sitting, and reclining bodies; on the other hand, they show unusually dehumanized features, e.g., when the heads make strange glitchy-stretching movements. In this way a reference is also made to the technical production process, which is not sidelined in favor of affirming an illusion, but rather forms the work. Unlike Banz & Bowinkel, who were able to move freely, or the completely externally determined body of Rothberg's avatar, Triebus' body adapts to the technology to become an image and enable a mixed, intermediate experience. But this avatar subsequently confronts the body of the viewer.

on the relationship between the gaze's subject and object, between artist-subject and material, just like body and image.

The choreography lasts a total of about 12 minutes and can be divided into three phases that provoke different physical relationships between the user and the avatars ranging from attraction to avoidance. This AR performance is probably the closest to the scene from *Blade Runner 2049* mentioned at the beginning of this chapter, not only because the artist's body is synchronized with that of the avatar in highest resolution, but also because the interplay between synchronization and desynchronization aims at a living relationship between "artist" and viewer.

*Figure 4: Charlotte Triebus, Kin\_, 2021, in collaboration with Brigitta Muntendorf, Inès Alpha and Mirevi Lab, AR application. Source: https://apps.apple.com/de/app/kin/id1580039645. Accessed December 12, 2022. Courtesy of the artist.*

In the first phase of avoidance, the three avatars remain in their respective poses and perform only small movements, for example rolling their eyes or changing their hand position, and to some extent give the impression of being bored. The motionless figures provoke movement, a familiar behavior, and so we circle them like a sculpture. If we come closer to have a look, because we are attracted by their high-resolution appearance, demanding gazes and also their costumes, they paradoxically step back, avoiding the user's approach. The technical background is the so-called collision box, which triggers a corresponding interaction, i.e., stepping back, once the device comes within a certain distance. The collision box here is a kind of intimate area, and now discretion is no longer a task of the viewers. Their physical closeness has an impact and therefore we can move the "sculpture" through space. On the one hand, the figure thus becomes an object that we can move, but there is also the impression that the virtual performers demand a respectful distance from their audience and claim to be perceived as a "techno-organic life form" (Triebus, Geiger, and Družetić-Vogel 2022, 4) in their own right.

After this, in the second phase, the avatars start their odd choreography by alternately synchronizing and desynchronizing their poses. In doing so, they break through the spatial limitation of the mixed reality image in the display frame, so the user must follow with physical movements to watch. They perform each of the recorded standard poses, being synchronous at times and building a formation. From this moment on, we freeze as viewers, not only because of the attraction of the choreography, but also because the individual figures now appear as a group that outnumbers us. However, they do not act as if they were performing together the whole time. Similar to the initial example with Joi and Mariette, *kin\_*'s three avatars do not completely synchronize their movements in their performance. Again and again the alignment is desynchronized, and they regain their individuality. This status between group dynamics and independence is also reflected in the constant overlapping of the bodies, which, however, has no influence on the procedures or the bodies themselves. As a result, they show us their lack of a place, or that they represent only one body. Furthermore, these overlaps seem possible because the bodies are characterized by a permeability, like that of a "hologram," instead of the physical density of human bodies. This corporeal transparency occurs again and again in the choreography when the avatar bodies inflate and thus emphasize the virtuality of Triebus' body.

At the end of their choreography, they inflate into a large blue ball, from which a single avatar emerges and directly addresses the user. Now the situation has turned around completely. Whereas we were moving the figures just a moment ago, we are now approached by this one with threateningly quick steps and pushed away from our place in the world. However, the threat originates not only from the crouched posture and the aggressive step. The threat also arises with the mixed body. During the merging, the different figures have united into a single metabody, in which all the costumes come together as well. The merging intensifies the glitches, as if the union of different virtual bodies in one and the same place causes a repulsive reaction. The performance thus proceeds from the individual figures to the more or less coordinated group to a techno-organic metabody: a mixed body out of control, but capable of a confrontation with the viewer. The viewers can therefore move freely only at the beginning, are then frozen, and finally have to clear the field.

### 4. Conclusion or Preparing an Encounter Between the Real and the Virtual

In this paper, we have asked how the concept of mixture can be used to describe an experience that is made possible by technologies such as VR and AR. Our first assumption was that the mixed is not an inseparable mélange or something in-between two extremes, but rather that the different dimensions can also be seen as distinct in the mixed realities, as both separation *and* connection. In the ensemble of mixed objects or entire superimposed spaces, we have focused here on the bodies in mixed realities, specifically on those experiences in which a body takes place in front of and behind the screen, through which they enter into a relationship with each other. The examples here emphasize the relevance of the screen as reference point, even if the works tend to (supposedly) detach themselves from it. Together with Foucault, we have additionally assumed that the situation is not one of physical bodies on one side and virtual bodies on the other, since from the very beginning a place is attributed to the virtual bodies, just as our place is registered by the technology, in order to allow a rapprochement between the two. As the reflections on the mirror showed, according to which one's own mirror image unites both real and virtual qualities, the body can be understood as a split, as a mixed reality image-body. And as the illusionist example of *Pepper's Ghost* and the scene from *Blade Runner 2049* showed, it can be placed here and elsewhere at the same time.

The second point of departure for our contribution was the question of to what extent current artistic examples can provide us with a new kind of access to corporeality through their various entanglements of image and bodies. As the examples discussed have revealed, it is possible to exist with the virtual bodies in interactive situations of our lifeworlds without losing their actual effectiveness. Based on the artworks of Banz & Bowinkel, Sarah Rothberg and Charlotte Triebus, three ways of this short-circuit between image and bodies were identified: anticipation, alignment and attraction/avoidance.

Banz & Bowinkel's *Bodypaintings* made evident that the image can be a meeting point for bodies and places. As the avatar in the AR directly connects the artist's movement with the static print, it links the place of the studio with that of the exhibition. While Banz & Bowinkel's avatar adapts preceding movements and demonstrates them to the viewer, Sarah Rothberg's naked figurine in *Longing* aligns with the user in the very moment of reception. This suggests an interplay between the avatar and the behavior of the viewer, because they draw a line "together." But it turns out that the viewers are ultimately thrown back on their own movement and thus on a location that takes place and happens through the other. In Charlotte Triebus' *kin\_* the avatars claim more agency. On the one hand, they seem to attract due to the realistic representation of the artist's body. But, on the other hand, they seem to avoid the viewers' approach. In contrast to the other examples, Triebus' multiplied avatars are able to challenge and irritate the viewers. They expand the "here and elsewhere" of their virtual bodies in relation to the user's body. They expand to occupy an entire area and spread across the image.

Thus, as an outlook, one can ask whether an encounter between physical and virtual bodies is conceivable in a mixed reality zone. If we follow a basic definition of encounter, then it is only an encounter if chance is necessarily involved and both parties leave it altered: to encounter means to become altered (cf. Roskamm 2008; Nancy and Meister 2021). Of course, the encounters in the mentioned examples do not happen by chance; after all, one has to install the app or visit the exhibition. However, for the artworks, the place of art has to be recalculated as well as the specific position of the viewers, and thus the result is a conjunction, a reciprocity that prepares the ground for an honest encounter. This corporeality, along with the technical choreography, points out in all clarity the foundation of an encounter between virtual and physical bodies, which involves a mix of both. Contemporary artistic positions are concerned with understanding the virtuality of one's own body and how its projection shapes our lifeworld. Accordingly, virtual bodies are no longer banished to the counter- or other-worldly, but have recently become a figure of awareness for and in our lifeworlds, as reflections in the meeting point of a mixed zone.

#### References


Verhoeff, Nanna, and Paulien Dresscher. 2020. "XR. Crossing and Interfering Artistic Media Spaces." In *The Routledge Companion to Mobile Media Art*, edited by Larissa Hjorth, Adriana de Souza e Silva, and Klare Lanson, 482–92. New York: Routledge. Accessed December 12, 2022. doi: 10.4324/9780429242816.

## The 'Phygital' as the Virtual Real: The Role of Mixed Realities in Contemporary Art

*Pamela C. Scorzin*

#### Abstract

In this contribution, I discuss mixed reality (MR) and the prominent role it plays in contemporary art, with particular reference to strikingly staged cutting-edge mixed reality experiences by the *avant-garde* theater collective Rimini Protokoll and by two pioneering digital artists, Manuel Rossner and Marie Lienhard. Using mixed reality, a subcategory of extended reality (XR), they blend as well as modify reality along a continuum ranging from augmented reality (AR) via augmented virtuality (AV) to virtual reality (VR). In their individual works, mixed reality interpolates in varying degrees between the real and the virtual. I show how their exemplary mixed installations and so-called immersive art experiences such as *Monet's Garden* focus on distinctive aspects of artistic mixed reality projects, most prominently on immersion, interaction, incorporation, and illusion. I further explore how changed ideas and concepts of art as well as viewer behavior have combined to stoke the current boom and popularity of mixed reality projects in the contemporary arts. Immersive and interactive mixed reality installations are not autonomous artworks; rather, they are iterations of the basic network idea. On that foundation, they blend the virtual and the real, the physical and the digital (hence 'phygital'); they bring forth new art forms and contrivances that no longer exist independent of their operating context and the presence and participation (to different degrees) of their audiences. Mixed reality transcends, obliterates, and dissolves the boundaries and categories of the traditional, modern staging realm to excite and please younger and more diverse audiences.

#### Keywords

Immersive art experience, mixed reality, immersion, illusion, interaction, incorporation, phygital, virtual real

#### 1. Introduction: 'Phygitality' and the Metaverse

'Phygital' is the evolving marketing term and buzzword (cf. Horowitz 2016 and Scorzin 2023) that describes our contemporary perception of and (user) experience with the progressive digitalizing of the modern worlds of life and work in which the physical and the real are increasingly permeated by the virtual and the digital. They are gradually merging to varying degrees into a new, enhanced or extended reality, conditioning our ability to experience as we move closer to the Metaverse (cf. Rinaldi 2022) and its vision of an immersive and interactive 3D online world. Movies like *The Matrix* (1999, directed by the Wachowskis) or *Ready Player One* (2018, directed by Steven Spielberg) early on brought the concept of mixed reality (cf. Speicher et al. 2019) to audiences and firmly established it in popular culture even long before their science fiction vision could be realized. In contemporary art, 'phygital' also means using advanced technology to treat audiences to unique interactive, immersive, and illusionistic experiences that mix the digital with the physical. In the current conception of the mixed reality Metaverse, our virtual selves will be able to engage in all manner of activities and actions of our conventional analog life. Thus, for example, we will have avatars as our digital twins who will no longer be bound locally but will range globally and in decentralized fashion.

Also envisioned for the Metaverse are immersive 360-degree dynamic 3D experiences in which motion and time are synthesized and convincingly simulated in space to be palpably experienced as 'in real life.' The holodeck in the Star Trek TV franchise is probably the most evocative illustration of this idea to date in popular science fiction. A new online platform industry with sizeable economic stakes and high profit expectations stands ready to back the further development of existing XR variants toward a Metaverse. But these novel synthetic worlds also have potential for simulation-based learning in general, for training purposes (military included) and as pure entertainment (cf. Cheok et al. 2009). In the latter category, the commercial video/computer game industry and social media giants are betting heavily on these new expanded realities. For example, live concert experiences offered by online game platforms like *Fortnite, Minecraft*, or *Roblox* already augur a future of such highly commercialized endeavors thriving in a purely digital space.

#### 2. Mixed Reality as Extended Reality

In the next two sections, I first discuss mixed reality as a subgenre of extended reality in general terms and then survey the contemporary art and culture scene. Mobile AR, AV, MR, and VR applications have historically been anchored, at least conceptually, in media art, pioneered by media artists like Myron W. Krueger, Lynn Hershman Leeson or Jeffrey Shaw. As early as 2017, Ars Electronica's VRLab, the Linz (Austria) based world leader in media art, proclaimed that:

virtual reality, augmented reality and mixed reality, total immersion in virtual worlds and superimposing data onto our reality – for several years, everybody has been talking up these concepts and ideas once again. The enthusiasm that accompanied the dawn of this new high-tech age in the 1980s and 1990s is back, whereby the technology deployed in today's data glasses (head-mounted displays or HMDs) seems to finally be capable of living up to the visions that preceded it. VR, AR, and MR have become a playground for various pursuits in the gaming sector and film industries, for applications in the educational field and tourism market, for works of art and architecture, the creative economy, performance, and the theater.

Since then, the Ars Electronica Center has specialized in CAVE-like, fully immersive extended (virtual) reality experiences, most recently with *Deep Space EVOLUTION.*

Turning to contemporary digital art, here also we find a current of adopting, with some critical potential, the latest hardware used in creating XR experiences: new headsets, displays, wearable devices, glasses, or (holo)lenses. At the same time, the digital art scene itself in many cases inspires and drives new XR developments. Using techniques such as over-layering, blending, and merging of digital and physical, virtual and natural spheres of reality, new fictional and symbolic worlds are being realized. Characteristically, they present a singular present that participants can experience directly. Digital moving images thus create immersive new environments that audiences ultimately should be able to interact with in a quasi-natural way.

XR serves as an umbrella term for the gamut of computer-based virtuality, including characteristic 'phygital' experiences in everyday life (for example, when using the smartphone). The virtuality continuum introduced by Paul Milgram in 1994 (Milgram et al. 1994; Milgram and Kishino 1994) ranges from the completely real to the completely virtual (see Skarbez et al. 2021). The former, for example, could be a built stage set, an escape room, or an Instagram pop-up museum, while the latter might immerse audience members in an interactive Metaverse of perfectly simulated and seamlessly connected hybrid worlds. The XR spectrum therefore encompasses not only VR and AR but also many in-between hybrid forms. One of these is augmented virtuality (AV), in which a VR staging integrates natural objects and physical props. These variants can also be found today in contemporary space- and time-based media installations, where they serve, for example, as (self-explanatory) interfaces. The concept accommodates, for one, virtual artifacts overlaid on a real-world environment and, for another, real objects projected into and controlled in a virtual world, and lastly, total immersion in an all-encompassing, holistic virtual environment. Here I follow Skarbez et al. (2021) in arguing "that the 'virtual reality' endpoint has yet to be reached, and any form of technology-mediated realities are mixed reality" (Skarbez et al. 2021). A noteworthy recent advance in XR is the WebXR application programming interface (API) for the web. It makes possible the development of web-based applications (e.g., art apps) that can display three-dimensional content on various compatible (mobile) AR and VR devices.

Contemporary artists and art collectives in the dynamic mixed reality field are collaborating on research-oriented and experimental works with developers, coders, and programmers, with designers, scenographers, and cultural institutions, and even with activists. It should be emphasized, however, that so far the variants of hybridized and synthesized multiple realities have yet to be conceptualized into a coherent and continuous augmented reality, which, following Skarbez et al. once more, we would like to understand, at least for the most part, as one mixed reality (MR). They still suffer from discontinuities and related dissonances, which cause audiences to perceive them as primarily artificial, even if this should be self-evident, especially in art projects. However, AR and VR hardware continues to improve (digital imagery with high fidelity) and is becoming more affordable and more widely available. This enables the creation of increasingly illusionistic and realistic quasi-natural hybrid environments constituted of digital moving images. Yet even more crucial for these installations is that today's computing power can adapt the processed moving images to audience members' body movements, viewing directions, and perspectives with barely any time delay.

#### 3. Mixed Reality in Contemporary Art

As of yet, we are still waiting for the definitive monograph to be written on mixed realities and their new 'phygital' and multimodal experiential world in the contemporary arts. However, for digital artists working in this new field, many new exhibition spaces, platforms, international festivals and theme exhibitions have sprung up globally. Often, they appear at the intersection of the commercial game industry and AI. In the broad spectrum of XR, and especially on the mixed realities continuum, i.e., from AR to AV, many different expressions and levels of coherent hybridization and continuous synthesis of differing realities can be found. In contemporary art they create a singular new space of perception and experience. How these realities are merged and synthesized depends in each case on the artistic conception, the topic, or even the context, and crucially on the accessibility and availability of software and hardware assets. In all cases, however, the idea is to artistically marry the real with the virtual, the physical with the digital, the material with the intangible in an innovative and passably original way. Many such mixed reality art projects are funded by cooperating cultural institutions through grants and fellowships for artists interested in exploring and experimenting with the latest advanced technologies and doing research in technical development. Generally speaking, mixed reality art, given its operational context and post-autonomous status, depends on a technological network culture and extensive infrastructure. The current digital art concept also draws on the contemporary phenomenon of collectives, groups, and networks working jointly to create something with a specific agenda in entertainment, in education, and even in activism.

However, the quality of the illusion or simulation of a new dynamic 3D reality depends on more than just the degree and coherence by which different reality spheres are synthesized and hybridized using the latest technologies. I suggest that also at work here is a boundary-dissolving illusion-obliterating effect which directly affects art's framework. But it also impacts immersion (cf. Slater 2006 and Slater 2009), incorporation, and the possibilities for interaction at their designated interfaces. On the other hand, according to Koleva et al. (1999), mixed reality boundaries can also act as transparent windows between physical and virtual spaces. These authors introduce a set of properties that allow configuring such boundaries so they can support assorted styles of cooperative and co-creative activity. They group these properties into three categories:


Media development has a long tail of continuities and disruptions in the apparent dissolution of traditional interfaces between space and image through illusionistic effects and interactive options for action. Western cultural history is rife with examples of immersive spaces that served as static precursors of digital moving-image environments and staged reality with physical materiality and intangible components like light and sound: the nineteenth century's famous painted 360-degree panoramas with their faux terrain and elaborate lighting direction, and planetariums with their full-dome laser-projection technology. They culminated in the conception of today's highly topical Metaverse, which is entered via designed proxies, such as (fashionable) avatars. However, even the Metaverse, although experienced as a dominantly virtual environment, still operates in a material and temporal context, starting with the necessary servers and technical network, hardware, and usable (visual, haptic, auditory, and olfactory) displays.

Still, even with 360-degree moving images now a reality, motivations and intentions will differ, ranging from pure entertainment to experimental artistic research to training and academic instruction. These various demands are also found in contemporary art, which is why it treats mixed reality techniques as merely (innovative) tools and not as ends in themselves. To substantiate this creative ferment, in the remainder of this paper I present three case studies of contemporary art projects situated at different points on the mixed reality continuum. Each produces effects of illusion, immersion, interaction, and incorporation for its audience in differentiated ways with digital moving-image components based on a larger operating context. However, they have in common that audience members become active participants in the completion, or, as it were, the realization of these new mixed reality worlds of sensory experiences. These installations therefore each create a new virtual reality. The question is whether contemporary intangible or material mixed reality projects constitute a new space- and time-based media genre in the art field and if as such they can simulate or even stimulate the art experience.

#### 4. Immersive Art Experiences

Where in the past autonomous artworks traveled the world to be exhibited, today the global trend is to stage immersive art experiences featuring them as reproductions or simulacra. At the tour's various stopping places, audiences now apprehend this art mainly in mediatized ways. Stationary art has for so long been mobilized via its reproductions that the mediated art form ultimately phantomizes into a 'real' simulacrum. Numerous specialized agencies and international companies have emerged in recent years for staging touring art events as immersive art experiences—such as Atelier des Lumières (Paris), teamLab (Tokyo), Meow Wolf (Santa Fe, New Mexico), Factory Obscura (Oklahoma), Frameless (London), or Artechouse (New York) to name just a few prominent examples. Most of these creative companies showcase and exhibit mediated art exclusively. Thus, immersive art experiences must be fitted into the context of the experiential event economy. They also go hand in hand with new educational and cultural policies responsive to social and demographic change. Mostly these spectacular stagings and scenographies draw audiences that are more socially heterogeneous, diverse, and younger.

In line with its cultural or social origins, this new mode of reception has been characterized by gamification and (non-linear) storytelling for some years now. The same trend is changing the concept of what art is, as evident in the global boom of immersive art experiences in major cities, with blockbusters in 2022 such as *Van Gogh – The Immersive Experience; Dali – The Endless Enigma or Genius da Vinci; Cézanne, the Lights of Provence; Klimt, The Immersive Experience;* or *Kandinsky, The Odyssey of Abstraction* (cf. Wiener 2022).

These classic works of art by the great masters are transiently presented by lasers in 360-degree, cinema-sized projection inside huge vacant spaces of the post-industrial age. There they attract larger audiences than the originals in traditional museums, which the younger generations in particular tend to regard as elitist temples of high culture. Here, 'phygitality' in the contemporary arts means using advanced technology to bridge the gap between the digital world and the physical world for (new) audiences looking for unique interactive and immersive experiences. The mixed reality of the future Metaverse will allow all conceivable activities and actions of the old locally-bound analog life to be carried out by twinned digital representations, such as avatars that are free to roam the Metaverse in a decentralized manner. But for now, the immersive and interactive blending of formerly separated spaces with more or less seamless cinematic projections and 3D video mapping in abandoned industrial architecture or onto monumental facades has resulted in a new pervasive genre with enormous media appeal. These immersive experiences are not only assets and exhibits but also represent a new medium. That is not to say they are without their shortcomings: often crude overlays of projected moving images and operative architectural contexts, with all their dissonance, for the time being offer only limited possibilities for intuitive interaction. Still, facial/body/position tracking in the chosen projection space already allows fascinating interplays between a mobile audience and moving images that continue to reduce friction and achieve seamlessness. Consequently, they have the power to move the audience emotionally as it wanders through them. They accomplish this with moving digital images that invade, occupy, and overlay natural physical spaces, creating novel hybrid spaces of perception and experience that can reach broader audiences in touring experiential spaces. In this respect, immersive art experiences once again build on the tradition of the spectacular panoramas of the nineteenth and twentieth centuries (see Oettermann 1980). These were highly illusionistic *vedute* or spectacular panoramic historical battle scenes that toured world fairs for audiences to marvel at. But the modern traveling immersive art experience increasingly allows artists and designers to think in transdisciplinary and scenographic terms, thus combining visual arts and (found) existing architecture in a temporary synthesis. In doing so, they give new prominence to the interactions and entanglements of architecture, but they also transcend limitations or dissolve boundaries to produce unique and visually coherent spaces for a sensory experience.

*Monet's Garden* (Fig. 1) is just such a multi-sensory, immersive exhibition space creatively staged by Roman Beranek. It casts its spell on the audience using projected moving imagery, stage props, and musical sounds to tell the story of French impressionist painter Claude Monet (1840–1926) through his famous garden in Giverny. According to the *Monets Garten 2022* press release, the idea is to offer a new perspective on the artist's creative output by immersing the audience not only in his works but also in a sensory experience that integrates Monet's central themes of light, shadow, wind and water. The aim: to create "an overall poetic concept" with state-of-the-art technology. The sophisticated dramaturgy of the (light) projections, a cinematic collage of the impressionist painter's body of work using images that mix styles and media, combines with a soundscape of selections from Erik Satie, Claude Debussy, Maurice Ravel, and Jean Sibelius to seduce the audience primarily through the dominant visual and auditory senses. At the same time, however, the guests are invited to linger (at least until the next loop begins playing) in the experiential exhibition space's varied seating and lounging arrangements. Guests entering the projection space thus become the casually profane performers of a new event-oriented exhibition culture.

*Figure 1: Monets Garten, Immersive Art Experience, Berlin – Alte Münze, February 2022. Photo: Pam Scorzin.*

Immersive art spaces also set on its ear the idea that art is to be silently worshipped in physical art museums and galleries. As Witherspoon (2021) trenchantly observed: "Art museums and galleries create an environment similar to a temple. Guests cannot touch items, believe they should remain quiet and respectful, and are less likely to interact with others outside their group. Looking at art then becomes akin to silent worship. Interactive art creates a more joyous and less serious environment."

The immersive exhibition concept also mirrors the standard audience-first approach and emphasizes the social moment of gathering and communicating the way players do on online gaming platforms. In *Monet's Garden* 2022, the content blends education and entertainment: 76 of the French painter's most famous artworks are projected on film as oversized, luminous reproductions in real space along with ancillary information. Art history here becomes a form of multimedia storytelling. The last room of this "mediated reality" installation (see Mann 2018) has projected on its surfaces a vast water lily pond, creating the illusion of an endless whole. It serves to hypostasize Monet's conception of art and painterly style as the dissolution of forms and colors into a single impressionist percept. The narrative visualization in and with space through the mixed reality lens thus becomes compelling storytelling that rewrites history.

Current 3D mapping projection systems also allow content such as graphics, animations, texts, images, or videos to be projected onto three-dimensional objects to enrich reality with additional information. In such 360-degree projections, the viewer is illusorily transported into the midst of the cited works of art and brought into a close dialog with them. Most immersive art shows aim to create extraordinary experiences with art that audiences usually cannot access in reality. Here, they are immersed in light, colors, shapes, and sounds that address all their bodily senses. The physical bodies also become part of the staged scenery (Fig. 1)—the mediatized artworks seem to interact with them. Moreover, as promised by the Swiss creative lab 'Immersive Art AG', which developed the immersive film project in cooperation with the tour organizer Alegria Konzert GmbH in 2021, art is heightened to a total work of art (Gesamtkunstwerk) and holistic scenography.

Directors and producers of such experiential immersive art shows like to say that their success and popularity prove that younger audiences want to be 'wowed' by art at an engulfing scale. A telling example that they are on to something is furnished by the *Van Gogh Starry Night* exhibition that debuted in February 2019 at the Atelier des Lumières in Paris. Featured only once in an episode of *Emily in Paris*, a popular Netflix series (cf. Boucher 2021), it had a profound impact on the popularization and proliferation of such mixed realities in the art realm worldwide. Instantly, similar immersive art experiences served as frequent selfie backdrops for the Instagram generation worldwide. The desired synchronization of the audience's movements with the moving images projected onto the interior architectural space here again is not entirely satisfying. It ends up as photographs taken in a mixed reality that become fodder for posting on social media platforms, testifying that one has been "(in) there" at that iconic spectacular immersive place of sensations.

Superficially, the immersive art experiences will thus primarily appeal to new and significantly younger target groups with diverse ways of seeing and understanding: "We are seeing a shift away from traditional structured narratives, more fluidity towards brand, genre and technology constraints, and a more authentic, audience-first approach, considering them as empowered co-creators. The result? More meaningful experiences which resonate deeply and instill purpose" (Di Stefano 2022, n.p.). However, a closer look at the image motifs, the ways of storytelling and world-building, and their visual modalities deployed in these immersive art experiences shows that they also challenge the artistic authorship as well as the traditional or scientific narratives concerning the projected supersized video-animated artworks. Hence, these types of experiential mixed realities also drive a different form of art mediation, of art historiography, as well as new aesthetic politics. Audience members are now fully addressed as participants and active clients (however, mostly, if not solely, in comfortable and non-confrontational environments). Scenographic art exhibitions, instead of purveying pedagogy and education, are now becoming indistinguishable from any number of cultural sites and experiences striving to deliver whatever kind of 'content' (cf. Calise 2021). In the next stage, with more automatic or AI-based interactive elements, such as face- and body-tracking systems in the installation space, participants will be classified as "empowered co-creators." This emancipation to co-creation is also often understood as offering greater accessibility and inclusion in distinct types of mixed realities.

Pundits and critics easily dismiss many of these commercially oriented immersive art experiences as cheesy entertainment, as simulacra and fiction delivered by machinery. They critique them as simple attempts at spectacularization and guilty of nothing less than diminishing the actual cognitive and artistic value of the cultural heritage. Instead of adding layers of understanding or interpretative explanations to the exhibited artworks, the exhibitors rip them off while depriving their audiences of an opportunity for education and knowledge. Contrariwise, some observers regard popular immersive art experiences favorably as an evolutionary stage on the way to more sophisticated and complex 3D projection-mapped (vs. fully immersive) animations. They envision technology-driven immersive installations with elaborate soundscapes that obliterate the boundaries between the audio-visual and the spatial. Other possibilities exist for the creative technical coupling of moving digital images with the operational context/infrastructure to create multimodal entertainments that stimulate the senses and, ultimately, boost cognition in their audiences. Embedded and embodied interaction between visitors and moving images is on the way, the optimists say, despite its relative primitiveness in many popular immersive art experiences.

With the spread of AI-supported body and face/emotion tracking systems, for example, before long visitors will be given the power to (playfully) affect the processual flow of images in the exhibition space through their intuitive movements and (physical and emotional) reactions. With the increased possibilities for interacting via reactive moments of movement, an ever-stronger coupling is also on the horizon for interaction between the projected moving image on display and the visitors in the real space of the mixed reality installation. The technology thus will support a simple form of audience engagement and attention-focusing. It simultaneously reinforces the supposed closedness of the XR world as a holistic 'Gesamtkunstwerk' (after Gottfried Semper and Richard Wagner). At the same time, the components of the digital moving images of this synthetic environment are experienced as virtual, variable, and viable.

Basically, an XR experience always emerges from this interplay between a sensuously perceiving body surrounded by digital moving images that are carried and controlled by an operative context. Synchronization joins synthetization in playing key roles in mixed reality worlds, with their creators artistically and creatively fine-tuning them to enthrall their audiences. The downside, already alluded to earlier in this chapter, is that instead of supplying knowledge and cognition, such mixed reality installations are growing "indistinguishable from any number of cultural sites and experiences, as all become vehicles for the delivery of 'content'" (Calise 2021, n.p.). Add to this that most of these XR exhibitions, like their panorama precursors, tour the globe's major cities across Europe, Asia, and North America. Many of these art events are mounted in empty or transitional spaces of the post-industrial era as stopgaps until a permanent tenant can be found for them. After the pandemic, cities have a surfeit of empty box stores, vast industrial spaces, and even theaters waiting to lure broad swaths of the population with new sensations. These are the more obvious economic reasons behind the growth of the immersive experience phenomenon. It also behooves us to ask what new worlds we are building utilizing mixed realities and similar advanced technologies that were invented and developed by a generation on the US West Coast experimentally expanding their minds with LSD. This history holds the clue for where the dominant general understanding in the art field of the potential of XR installations comes from: they aim to expand reality with memorable sensuous experiences and adventures of the mind—in other words, escapism and dreaming. On the other hand, the physical body and its virtual representation in the moving image, such as a designed avatar, are joined in a significant 'natural' interface function.
At stake, therefore, is not merely dissolving corporeality but instead immersing physically with all the body's natural senses in a staged narrative in space and time—in the quest for a "second nature."

#### 5. Rimini Protokoll: *Urban Nature*, 2022

The XR field in the contemporary arts is still dominated by real stagings in space, i.e., with built stages, sceneries, environments, and installations combined with temporal performative components featuring AR, VR, or holograms. A representative example in this category is furnished by the theater collective Rimini Protokoll. In their work, the theater/stage intersects with the museum architecture and performance space to form a holistic mixed reality experience. In their most recent effort, *Urban Nature* 2022 (concept, text, and direction by Helgard Haug, Stefan Kaegi, Daniel Wetzel), visitors are sent with handheld digital devices (iPads/tablets) on a tour of the exhibition space. Here, in museum architecture configured into seven installation rooms, they interact with live and virtual/recorded performers. However, the visitors themselves act as performers for other participants in the staging. Equipped with tablets and headphones, visitors, both as individuals and in groups following the marked path through the exhibition, experience a complex blend of built (backdrop) spaces, set designs, live performances, video images, recordings, and sound. Behind it all, a precisely timed system of video, sound, and light signals on a common standard timeline closely links to the computer-tracked movement and situations of the visitors. The theatrical mixed reality visit of *Urban Nature* thus evokes a complex shared reality like that of the subjectively perceived world of our modern megacities where spaces and encounters are staged in reality, mediated by media, and experienced individually. The boundaries and separations between private and public, between individual and shared common experience, blur and merge. *Urban Nature*, this expansive mixed reality installation staged inside the concrete architecture of the Kunsthalle in Mannheim (Germany), aims to replicate the play between the physical-real and the virtual-digital, as described in the following quote from the Rimini Protokoll website:

During their visit, members of the public take on distinct roles. These include a financial adviser at a private bank looking to diversify investments in excess of €2 million, or a prison worker who, in a reconstruction of a cell within the exhibition space, explains how many of the inmates earn more in the prison than when free. Theatres and museums are typically used in opposing ways. Whereas theatre audiences are normally immobile for one or two hours as the performance takes place on the stage before them, in museums the public moves through the exhibition. URBAN NATURE blends these two modes of reception: while some visitors follow life stories individually as active spectators with a tactile tablet device, others experience the exhibition as a group. All are able to observe how others take on different perspectives, but they are also challenged to look at themselves in the mirror and experience dependence between various positions and their freedom for personal action. (Rimini Protokoll 2022)

*Urban Nature's* theatrical mixed reality mode has its visitors simultaneously visiting and interacting with the physical and digital worlds. For Rimini Protokoll's scenographer and artist Dominic Huber, this mixed reality constellation in the art museum depicts the experiential and living space of an urban population on different levels that intertwine and are directly experienceable as a symbolic art space in a narrative space. The artistic and creative means deployed are mutually dependent. The rooms, backdrops, and scenes, the concrete processes, actions, and situations inscribed in them, as well as the narratives of the various participants and protagonists—they can no longer be dissociated in experiencing the installation. The scenography of this mixed reality experience comprises not only the artistic design of spaces and surfaces but also a compelling narrative structure in real space that can be experienced situationally in the temporal sequence.

#### 6. Manuel Rossner: *New Float*, 2022

As we move more toward the virtuality side of the current mixed reality spectrum in the art field, we find specifically AR apps and predominantly virtual environments that quote or integrate natural reality and enrich it with additional information (such as visuals, sounds, sceneries, or narratives). Typically, these are digital 3D images synchronized with the audience's natural body movements in the blended space. These mixed reality experiences in the arts are also strongly inspired by the action modes of video/computer games, whose digital environments elicit predominantly motor movements for controlling viewpoints focused on staged visual events. Such experiences may take place in purely virtual museum spaces that are nonetheless connected to a specific place/geo-location and existing art collections.

Exemplary here is Berlin-based artist Manuel Rossner (Figs. 2a and b) and his creation of a purely virtual private art museum in 2021 that can be visited from anywhere in the world via the Internet. Working primarily with AR and VR technology, he built his interactive and immersive architecture with virtual materials/intangible components that are spatial interventions and virtual extensions. The impetus for his project was a museum building boom in Berlin that he thought lacked a museum for digital art and NFTs; he would thus "lend the city a helping hand in the wink of an eye." He debuted the digital exhibition space *New Float* (Figs. 2a and b) on February 2, 2022. In a bold move, he staged it in the open space between the Neue Nationalgalerie (New National Gallery), Philharmonie, and St. Matthew's Church in Berlin, where the Museum of the 20th Century by Herzog & de Meuron is slated to open in 2026.

*Figure 2a: Manuel Rossner, New Float, Digital Space, 2022. Courtesy: the artist.*

*Figure 2b: Manuel Rossner, New Float, Digital Space, 2022. Screengrab: Pam Scorzin.*

*New Float* is a purely virtual exhibition space for presenting and discussing current developments linked to NFTs and post-digital art. It does so in a fitting style designed to create experiences with digital art in a native, immersive spatial context. By applying decentralized technologies, Rossner also explores new ways of curating and exhibiting works of art. This virtual artist museum space with 'real' NFTs on display at a geolocated spot on the globe can now be accessed online, anytime, and anywhere through Spatial.io, the name of the collection *New Float* houses. The private artist collection includes NFTs, tangible generative art, crypto art, 3D sculptures, augmented reality, AI art, video, and (digital) photography. Rossner designed the virtual museum building using only digital tools employed by contemporary architects and built it applying digital principles of dynamic expansion, flexibility, and rapid adjustability. The final building's dimensions were created with physics simulation: digital clay forms the inside walls, each of which is customized for the single work of art displayed on it. *New Float* now forms a permanent part of Spatial.io, a Metaverse of geolocated art experiences built by a digital artist.

#### 7. Marie Lienhard: *Logics of Gold*, 2018

Also found in contemporary arts, in addition to Manuel Rossner's predominantly virtual spaces and the real-staged virtual spaces by Rimini Protokoll, are mixed reality projects combining thematically and scenically coupled stage props with head-mounted displays (HMDs). However, these so-called inter-reality systems coupling a virtual reality system with its real-world counterpart are still rare. One such rarity is Marie Lienhard's somatic mixed reality installation *Logics of Gold* (2018, virtual reality video in a set design, 5'00"). In her words, it is "a two-meter diameter gilded helium balloon, to which a 360-degree panoramic camera is attached, that rises into the skies. This camera films the environment in the round from above; as the balloon continues to the edge of space, the world keeps shrinking until the balloon bursts at 35 km altitude. As its gold-plated fragments drift back to earth, they are also caught in the 360° VR video" (Lienhard 2022).

The fully immersive virtual reality imagery visually gives a mind-blowing physical experience of weightlessness. The viewer dangles freely in space in a rocking seat designed into the built set design. This effect appeals to the body's visceroceptors, which enhances the visual immersion effect in mixed reality. Here the artist's video is visual poetry but, simultaneously, also a documentary. It stimulates the recipient's sensations of actually flying in space—similar to what happens in an enclosed space flight simulator. The physical boundaries are all at once transcended in a double sense—for a brief sensory spacey 'trip.' Embodied and embedded (inter)actions are the prerequisite for this simulated space flight experience. It triggers a cognitive response that focuses consciousness on the limitations of the earthly living space and its thin blue band of atmosphere. Ultimately, this mind-bending mixed reality artwork dialectically couples the dissolution of boundaries and of limits.

#### 8. Conclusion

A mixed reality environment in the contemporary arts is an extended reality in which staged real world and virtual world objects, or artifacts and stimuli, are combined to form a single percept. Art is actualized by the synchronized interplay of moving images or lifelike scenes and participants as active agents in a network. As such, immersive and interactive mixed reality installations perfectly represent today's so-called 'phygital' experiences, in which recipients actively participate. They perceive simultaneously virtual content that is real, directed, produced, and staged in multisensory fashion. The participants thus can experience a mixed reality as a coherent synthetic and symbolic environment for a meaningful duration. Reactive immersive art experiences using, for example, position tracking, facial and emotion recognition, body movements, and novel gestures produce new enhanced forms of (interactive) storytelling and world-building in scenographic terms. Even if only momentarily, audiences can feel as if they are immersed in a processed, staged hybrid reality. The participants' sensory actualization conveys a particular narrative or discourse via visual storytelling in space and time. Especially in the arts, the all-encompassing XR projects in the form of immersive and interactive mixed reality installations essentially adopt a strictly holistic approach, with the audience, whose bodily senses are addressed all at once, explicitly placed at the center.

By heightening the possibilities of (intuitive) interaction between moving images as the new 'phygital' environment and the body via in/tangible interfaces, the mixed reality art installations I have discussed represent another step forward on the long road to the final social Metaverse. Mixed reality art installations are not only popular artistic spaces of contemporary experience but also function as social/communications spaces for shared experiences and dialogue, as when the members of a group share the sensation of being in a virtually real venue like *Monet's Garden*.

Modern 'phygital' experiences ultimately add value to each domain that they artistically transform into a mixed reality. If investors and leading tech companies like Microsoft spearheading this popular trend have their way, mixed reality is on track to 'liberate' humanity from (two-dimensional) screen-bound experiences. They plan to get there by engineering instinctive interactions with data in our living spaces and with our (remote) fellow humans. Gen Z audiences are hungry for experiences that are exciting, dynamic, interactive and shareable on social media. At the moment, mixed reality seems to suffice for filling this need in the contemporary art field—blurring the line between art museums, movies, theatres, games, social events, and theme park attractions. At their best, these mixed reality installations are compelling, unique experiences based on the latest technologies and advanced digital artists' tools that only recently have become accessible and capable.

#### References


## Inhabitable Bodies: On Embodying Virtual Reality Experiences<sup>1</sup>

*Anna Caterina Dalmasso*

#### Abstract

Virtual reality has been repeatedly presented as a medium capable of providing effective first-person experiences and even as an opportunity to transcend the limitations of physical embodiment. As a result, immersive environments are often understood as a rearticulation or remediation of the figure of the point-of-view shot, which dominates contemporary mediality. But how does virtual reality actually engage with the possibility of inhabiting a different body, to provide us with a prosthetic or augmented body? The chapter tries to outline a phenomenological analysis of the conditions of embodiment elicited by contemporary immersive environments, by focusing on: 1) how the virtual dimension does not replace the real with the experience of an "alternative" reality, but gives rise to a two-way movement, since the virtual space is simultaneously augmented by the real; 2) how immersive interfaces put us in contact with a constantly actualizing reality, which unfolds in real time, to the detriment of any representational or referential component; and 3) how virtual environments confront us with an image which generates in accordance with the embodied movement of its experiencer, the living body being the pivot of a process of performativity, shared by the immersants' bodily movement and the virtual image.

#### Keywords

Virtual reality, point-of-view shot, first-person shot, embodied experience, augmented virtuality, presence, frame, performativity

<sup>1</sup> This research has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. [834033 AN-ICON]), hosted by the Department of Philosophy "Piero Martinetti" (Project "Departments of Excellence 2023-2027" awarded by the Ministry of University and Research).

#### 1. Virtual Reality as First-Person Media?

Virtual reality is a medium still in search of a specific form of expression and creative grammar and still in the process of becoming institutionalized, which is why the present state of virtual reality technologies is similar, in many ways, to the early phase of cinema history. In many respects, the settings of virtual reality experiences could be easily classified under the notion of "attraction" elaborated by new film history (Golding 2019) to describe spectatorship of early film audiences as focused on the technology rather than on the content conveyed by it, that is to say, more interested in experiencing the novelty of cinema as a medium and technology, than in the films that were projected (Gunning 1990, Gaudreault and Gunning 1989, Strauven 2006). In the context of virtual environments, the "attraction" effect revolves around the strong sense of presence and so-called "place illusion" (Slater 2009) that are made possible by the interface, through the simulation of plausible sensorimotor contingencies.

Precisely because of this effect, immersive experiences, ranging from VR cinematography to gaming, displaying or hybrid contents, demand a reassessment of our understanding of aesthetic experience and spectatorship. The sense of presence conveyed by immersive technologies is able to challenge the user's awareness of mediation and their image consciousness. In such a germinal phase, the analysis of storytelling strategies, narrative and normative discourses is just as important as focusing on a phenomenological analysis of VR technologies and their implementations in terms of aesthetic strategies.

The spectators, whom we should rather call "immersants," wearing a head-mounted display, are suddenly enveloped in a virtual unframed 360-degree environment, which conveys a feeling of "being there" that vividly imposes on the senses the impression of inhabiting another reality. Presence studies have played a pioneering role by focusing on how remotely operated machinery and virtual reality technologies (Held and Durlach 1992, Slater 2003 and 2018, Farocki 2004, Calleja 2011, Lombard et al. 2015, Paulsen 2017) can elicit the feeling of being in a place other than our physical location, defined as a sense of "immediate transparency" (Bolter and Grusin 1999) and as a "perceptual illusion of non-mediation" (Lombard and Ditton 1997).

Nowadays, since the new wave of virtual reality began in the early 2010s, virtual reality has been repeatedly presented as a medium capable of providing effective first-person experiences (Mateer 2017, 14) and of putting the immersants "in the shoes" of other individuals. This way of understanding VR, which has characterized public discussion, is mainly due to the fact that a large part of immersive content was very much concerned with humanitarian and prosocial objectives (Rose 2018), in the belief that virtual reality represents the "ultimate empathy machine," to quote Chris Milk's famous claim (Milk 2017, Fisher 2017, Sanchez Laws 2020, Bollmer 2017, Pinotti 2021, Morriet 2021), as it grants the possibility of putting the audience directly into the event and therefore of eliciting a transformative experience. By dissolving the boundaries of the frame, we would potentially be able to make real contact with the spectacle instead of objectifying it, by breaking what Alejandro González Iñárritu has called the "dictatorship of the frame" (Iñarritu 2017). In fact, in presenting his first virtual reality production at the Cannes Film Festival in 2017, the Mexican director emphasized how the two-dimensional format of the cinema screen hinders the spectator's engagement with the narrated events, whereas immersive technology can instead overcome this limitation: thanks to the dissolving of the frame boundaries, we would be able to undo the detachment that traditionally separated spectator and spectacle. This line of argumentation accords with the inspiration behind the work of Nonny de la Peña, a pioneer of immersive journalism, whose ambition was precisely to bring the audience "on scene," thus inaugurating an affective and visceral experience, to be lived *in first person*.
However, the emphasis on first-person narratives in virtual reality also permeates other fields in which immersive technologies are notably employed, such as video games and pornography, mainly featuring point-of-view perspectives and focusing on a strong sense of embodiment provided by the new medium.

Such an understanding of virtual reality has decisive consequences on the way we can think of our embodied experience of immersive environments: as media scholar Melanie Chan points out, since the appearance in the 1980s and 1990s of media representations of virtual reality, specific to the imaginary and visual culture of the end of the twentieth century, "virtual reality was often represented as a wondrous technology that could provide an opportunity to *transcend the limitations of physical embodiment*" (Chan 2015, 1). The growing interactive community inhabiting VRChat worlds (Fig. 1) and the still uncertain future of the so-called Metaverse and the epistemic, imaginary and technological construction of its advent by Meta's communication strategy, are underpinned by and rely precisely upon this assumption.

*Figure 1: A still from Joe Hunting's We Met in Virtual Reality, 2021, HBO.*

Virtual reality is presented not only as providing an extremely immediate and enthralling experience of simulated worlds, i.e., allowing participants to come into contact with the spectacle instead of objectifying it, but also as an opportunity to immerse oneself in the perceptual experience and point of view of other individuals. As a matter of fact, in the last few years, a vast part of the virtual reality content produced (ranging from immersive journalism and fictional experiences up to therapeutic and prosocial applications) seems to engage precisely with the possibility of *inhabiting a different body*—be it human or non-human—and thereby having access to an extracorporeal experience, without leaving one's own body. But when one enters virtual reality, what body will one inhabit (and this, regardless of the visible presence of an avatar body)? What is it like to live from within another embodied perception? What are the aesthetic strategies that make possible an actual embodiment of someone else's corporeal experience? Do such attempts lead to failure, or can they succeed in providing the feeling of inhabiting the body of another person?

Nonetheless, the attempt to access subjectivity through media, i.e., to mediate an internal point of view, is not new in the history of media and devices (Reinerth and Thon 2016). The idea of sharing someone else's vision, not just in the form of a visual document, but of coinciding with the very movement of their gaze, is deeply rooted in the history of the moving image. Therefore, by following this objective, virtual reality pursues a cinematic drive, a desire that emerges very early in the history of cinema, namely the will to embody the perception of the other, to see what the other sees. The cinematic apparatus elaborated an effective means to let the spectator embody the view of another: the so-called point-of-view shot, or subjective shot (Branigan 1975 and 1984) emerged in early cinema, contributing to the elaboration of film experience and spectatorship (Dagrada 2014, Gaudreault 1988, Casetti 1997). Measuring themselves against such a cinematic form, the beholder is called to occupy a utopian position, standing at the same time as both an intradiegetic character and the device that enables the production of moving images. This aesthetic form most thoroughly expresses the striving to see what the other sees, which became embedded in the cinematic medium. The early debate that developed between the 1920s and the 1940s around the visual construct of the point-of-view shot was already very much concerned with the epistemological possibility of externalizing vision as lived from the inside (Münsterberg 1916, Merleau-Ponty 1945a and 1945b, Mitry 1965, Wall-Romana 2012), namely of visualizing one's own inner perception, especially as regards altered bodily states (such as vertigo, vision loss, or drunkenness).

Both the point-of-view shot and its post-cinematic evolution, the first-person shot (Eugeni 2012), have been the object of an extensive study in cinema history and theory, having been mostly investigated in connection with narrative and semiotic theories, as a cinematic adaptation of literary first person (Jost 1987 and 2004, Jost and Gaudreault 1999). But, if we understand first-person perspective following a phenomenological approach, the operation realized by the cinematic POV also attempts to provide us with a prosthetic or augmented body. Drawing on this theoretical perspective, film itself has been understood as a "viewing subject," able to make the perceptual experience of its own body available to the audience. Cinema has been understood as a medium that endows the spectator with a virtual body, the body of the film as a "viewing-viewed" subject (Sobchack 1991).

But should we then understand virtual reality as an evolution of the cinematic point-of-view shot (Bédard 2019) and the postcinematic first-person shot, which Ruggero Eugeni has defined as the "symbolic form" of our time (Eugeni 2015), as manifested nowadays by the proliferation of first-person media? Is it within the framework of this aesthetic and narrative figure that we should understand the experience of VR technologies and access to virtual worlds?

### 2. Augmented Virtuality: The *Hic* of Virtual Environments

Extended realities and especially the environments we can experience through virtual reality technologies have come to challenge our classic conception of images as being separated from the material world by the boundaries of the frame (Pinotti 2021, Conte 2020)—be it an architectural, pictorial or simply imaginary limit—and as traditionally having a referential or transitive structure (Marin 2001), i.e., as referring to an extra-iconic dimension (Husserl 2006). For these reasons, virtual images are no longer to be understood as "icons," they should rather be called "an-icons" (Pinotti 2021), that is to say, images that tend to "negate" their status of representations and rather present themselves to their experiencers as environments, as worlds in their own right.

As a result of these features, the simulated environments made accessible by virtual reality technologies are often described as giving access to an *alternative* world, that is, as a parallel dimension which excludes or at least temporarily conceals our contact with the physical world, experienced in real life (IRL), as if we were given access to a hermetically sealed *trompe l'œil* (Grau 2004). In fact, ever since its designation (the trope of "virtual reality" coined by Jaron Lanier during the 1980s), the virtual medium has borne a problematic relationship to the "real": the very definition of virtual reality seems to imply a virtual dimension that is meant or believed to be, even if only temporarily, "as if" real, in other words, a virtual real as opposed to a "real" real.

In fact, nothing could be more misleading in describing the user experience of the virtual environment: instead of "leaving out" the real world, VR technologies rather inaugurate a porosity between different coexisting sensorial dimensions, producing a blurring of the threshold between the world of the image and the physical or material world. Being immersed in a simulated audio-visual environment, in some cases complemented by multisensory stimuli, while inhabiting a material space with our bodies and even being allowed to physically move within it, constantly requires us to blend our simultaneous perception of multiple overlapping realities.

Hence, we need to reverse the assumption that the virtual is what comes to *augment* the real, to extend it or rather to replace the real with *another reality*, for what actually occurs in the immersive environment is also that the virtual space is simultaneously *augmented* by the real. Therefore, the relationship between the virtual world and the "real" or physical space occupied by the user must be reframed as a two-way communication.

Indeed, even from a technical point of view, the relationship between physical space and the construction of the plausibility of the virtual environment has been discussed, since the 1990s, as a *reality-virtuality continuum* (Fig. 2). In this regard, Paul Milgram et al. introduced the concept of "augmented virtuality" (Milgram et al. 1995, Skarbez et al. 2021), referring to environments created entirely in computer-generated imagery (CGI) into which elements of "reality" are included: objects present in the physical environment are introduced into the graphic world, such as the user's hands, which appear in the form of a partial avatar in order to point, grasp, or manipulate something in the virtual environment. But, in fact, the expression "augmented virtuality" could be applied to immersive VR experience as such, inasmuch as it always features a constant two-way passage between elements of the real and the virtual.

*Figure 2: The reality-virtuality continuum (Milgram et al. 1995).*

Only in this way is it possible to understand the specific embodied conditions that regulate what media studies call presence or telepresence, i.e., the feeling of being part of the simulated environment and also of being able to move within the virtual space. The visualization and tracking systems of immersive headsets, in fact, generate sensorimotor contingencies sufficiently congruent with those of physical reality (Slater 2009) to be able to deceive the human brain and to create the illusion of sharing the same space (the phenomenon defined as *place illusion* previously referred to) and, under certain plausibility conditions, to provoke responses similar to those that the physical environment would elicit. It is worth noting that this applies regardless of the degree of verisimilitude and photorealistic appearance of the virtual image (Salen and Zimmerman 2003), which can be realised by means of 360-degree video shooting, photogrammetry, volumetric capture or entirely processed by means of computer-generated imagery (CGI).

If this translation or transduction between the two dimensions becomes possible, it is not because our brain "substitutes" one set of stimuli (those coming from physical reality) for another (coming instead from the virtual image). The possibility that I could be fooled into taking the simulated environment for real is a hyperbolic metaphor (J. Murray 2020), forged by science fiction narratives and sometimes leveraged by marketing and communication professionals, since the very fact of wearing a head-mounted display keeps reminding me of the ongoing operation of mediation.

But, rather than being fooled by the verisimilitude of virtual worlds, what is far more impressive for those who have experienced VR technologies is that any immersant is subject to place illusion *in spite of the fact* that they are perfectly aware of the simulated nature of the stimuli they are receiving from the virtual world, and yet they cannot help being perceptually involved, in some cases in an uncontrollable manner. Indeed, I, along with my brain-body system, never abandon the perception or at least the awareness of being situated in a certain place, inside a room or any physical space, while I simultaneously produce bodily responses to the stimuli that originate from what I know to be a simulated reality. In this context, the body is constantly called to merge different dimensions of experience, hence the fatigue and sensation of motion sickness that are frequent especially in VR beginners.

In fact, this process of innervation or hybridization of our body with the aesthetic functions of a device is by no means new in the history of media technologies. We can try to compare the situation described here to a multisensory experience that is much more familiar to us and which has long since innervated our media experience: that of audio-visual media. Just as the *Flatland* thought experiment served the purpose of understanding the fourth dimension, let us try to compare the simultaneous stimulation of the soundtrack and the visual track of film to the experience of virtual environments. In a movie, the sound design that constitutes the soundtrack is a constructed reality that has nothing to do with the sounds we experience in our everyday life. It is the product of foley artists and later of post-production sound effects designers; even when it comes from live sound capture it still differs from our ordinary perception of sound. Consider the fact that even the lack of noise in a film is conveyed with the recording of so-called room tone or room sound, that is, the "silence" recorded at a location or space when no dialogue is spoken, whereas the complete absence of sound results in an uncanny feeling, generally used to arouse suspense and fear in the audience. Nevertheless, as cinema spectators, we have become accustomed to merging what we see on screen with this independent dimension of complex stimuli, made up of human voice, music, and noises, as well as to accepting that they do not adhere to the visual phenomena appearing on screen but can be external to the diegesis (as in the case of extra-diegetic music) and can even point to an imaginary or spiritual dimension (the voice of an absent omniscient narrator or a character's voice-over).

Similarly, in the context of a simulated virtual environment, the immersant is engaged in a process of constant negotiation between two overlapping "tracks" of stimuli: the ones coming from the physical environment and those that are generated by the virtual environment. Not surprisingly, some of the most cutting-edge VR experiences realized increasingly combine the virtual environment with the setting of material props that come to effectively support the merging of the overlapping dimensions, so that the virtual incorporates the real and the real comes to include a virtual dimension.

### 3. Being Present: The *Nunc* of Virtual Environments

How does the feeling of being included in the simulated environment affect the aesthetic experience of those who immerse themselves in virtual realities? *Place illusion* in itself is not sufficient to motivate the engagement, the effect of participation or even the emergence of an empathic response conveyed by virtual environments—as is suggested by the dominant discourse with which virtual reality contents are often promoted. Let us try to further analyze the material and phenomenological conditions of VR experience, to attempt to observe its effects on the way we experience the reality of the virtual image.

In fact, since the impression of "being present" in the virtual environment is widely investigated as a key feature of the medium, the emphasis is often placed on the capacity of VR to recreate a world spatially surrounding the audience, by building 360-degree explorable landscapes, 3D objects, avatars, and so on. Yet, there is another meaning of "presentness" that is rarely highlighted when examining VR media technologies which concerns the specific temporality entailed by the experience of virtual environments.

We can compare virtual reality with photographic media, which have been at the center of aesthetic and mediological reflection in the twentieth century. Simplifying to the extreme the bases of film theory and semiotics of media, we can say that what is considered to be the distinctive trait of the photographic image, according to the reflection first developed by André Bazin and Roland Barthes, is the fact of presenting us with its referent by placing us before the evidence of its "having been" (Barthes 1981), that is, before something that has been absolutely and irrefutably present, but which we always necessarily encounter in a delayed manner, as a "mummy of change" (Bazin 1959). In this perspective, what characterizes the aesthetic reception of photography and cinema is the impact with a *temporally delayed reality*, which is opposed to the dimension of the present characteristic of the expressive form of theatre, which, as György Lukács wrote in his *Reflections for an Aesthetics of Cinema*, is an "absolute present" (Lukács 1913): a present that is not the mere present of life, but in which the audience is brought to abandon the parameters of everyday life in order to embrace a different system of rules.

Virtual reality technologies, precisely by virtue of the embodied conditions described above, project their audience into a similar "absolute present." Before connecting us with a re-presented real from which it originated, the image that surrounds us at 360 degrees puts us in contact with a reality which *is taking place*, which is currently unfolding, and which, indeed, as we shall further emphasize, co-constitutes itself as a result of our bodily movements (Fuchs 2017). In such a context, instead of a "real" that reaches us from its factual, historical, or memorial dimension, we are rather confronted with a real which is constantly actualizing, a real which sensorially envelops us and which we cannot escape in its being shaped in real time.

In other words, the unfolding and the occurrence of the merging of the virtual environment, together with the embodied dimension we experience in VR, tend to overcome the reference (although always potentially present) to an indexical or even documentary dimension of what is presented to us. Hence, the VR audience experience needs to be reframed as radically pertaining to a performative dimension, to the detriment of any representational component and reference to an extra-iconic dimension: we are no longer confronted with an image *of* something, but with a world that imposes itself as such in its presence (Pinotti 2021, XV). In this sense, the image, even when photorealistic or manifestly the result of a photographic capture in non-fiction contents, seems to be partially devoid of the factual force of a referential past, since the dimension of its actual unfolding overwhelmingly prevails over its possible emanation from a referent. In a nutshell, we could say that we experience virtual environments *in the present tense*, and that the real with which we come into contact is located in the excess of our embodied response, in the bodily feedback that ensures the editing of the immersive experience, occurring even independently of our rational and conscious processes.

If virtual reality is to be experienced in the present tense, it is because it puts us in contact with a real which needs to be actualized and "brought about". In this perspective, the immersive environment has been understood as the site of a *re-enactment* performance. As Luca Acquarelli has suggested, this often concerns the production process of VR contents, as much as the moment of their reception, in which the audience becomes the protagonist of the unfolding of narration or experience (Acquarelli 2020). In fact, one of the recording techniques utilized for the creation of immersive contents entails the use of motion capture technologies, in which actors or even the real protagonists of historical events or individual stories are asked to stage their own involvement in the narrated facts. But re-enactment is also what occurs when the trace of human gestures and their memory are reactivated by the immersants engaged with the virtual environment, in a computational AI-assisted recomposing of the point of view that re-articulates the bodily movements captured in a preliminary phase.

Such strategies draw on and take from various forms of historical re-enactment as well as the long-standing tradition of re-enactment as an artistic practice (Baldacci, Nicastro and Sforzini 2022). In this respect, discussing the multiple applications of virtual reality technologies in archaeology and museum projects (Gaitatzes et al. 2001), Elisabetta Modena has pointed out that several contents aiming at visualizing heritage sites or even reconstructing buildings and works that have disappeared, are based on the practice of re-enactment to revive the past and collective memory, setting out a sort of *narrative anastylosis* (Modena 2022, 95–98), in which the process of digital recomposing is not limited to the reconstruction of environments and buildings, but also includes their animation with stories, rituals and daily practices.

There is also a distinctive kind of re-enactment, which not only implies a performative revival unfolding before the eyes of those experiencing the immersive environment, but also a re-enactment of one's own perception, that is, the possibility of recreating a first-person experience. Indeed, as noted above, the experience of virtual reality is often understood as the possibility of stepping into the other person's shoes, that is, of assuming their situated point of view and individual perspective. This is a peculiar variant of the immersive experience that implies being brought to coincide with a precise point of view, realizing what Andrea Pinotti has called a "360-degree autopsy," in the etymological sense of "*autòs optòs,*" "to see with one's own eyes" (Pinotti 2019, 29–30). At the same time, this capture of a subjective point of view is also an autopsy in the mortuary sense of the term, in that it freezes the reality it immortalizes in its mere perceptual and factual reception in order to offer it, in its absence, to another person for inspection. It is worth noting that, even though in these cases the genesis of the virtual image implies a re-presentation, i.e., a repetition of experience as suggested by the very concept of re-enactment (Holzhey and Wedemeyer 2019, Tore and Colas-Blaise 2021), for the immersant embodying virtual environments, the reality unfolding during the experience will still be experienced as happening and unfolding in the present, rather than in the form of a past being evoked or reproduced. Thus, as these different interpretations suggest (in different but compatible ways, by referring to the notions of re-enactment, anastylosis or autopsy), in the sensible encounter with immersive worlds we come into contact with a real that, rather than reaching us from the past, demands to be activated, as well as built, informed, and constructed *in the present* by the immersant who experiences it.

These preliminary investigations, trying to single out the *hic et nunc*, the here and now, of immersive experience, have hopefully prepared the ground for a phenomenological discussion of the experience of 360-degree virtual environments. In the following, we will focus on an analysis of the gaze, to be understood not merely as the product of ocular vision, but more broadly as the situated embodied point of view that is the pivot of the perceptual articulation of the virtual environment. This will allow us to interrogate the ways in which the immersants are led to situate themselves in relation to this unfolding real and thereby to question their own inclusion in and participation in the virtual world.

Focusing on the embodied gaze will also allow us to investigate embodied experience in virtual reality regardless of the visible presence of a full or partial avatar body, which has attracted much attention in recent accounts of virtual immersive experience (Murray and Sixsmith 1999, Murray 2000, Dolezal 2009, Popat 2016, D'Aloia 2018, Zimanyi and Ben Ayoun 2019).

### 4. The Body as Virtual Frame: The Performativity of the Immersive Image

In order to understand the access to virtual immersive environments as an embodied experience from a phenomenological perspective, we can now resume our initial hypothesis: if VR provides a view in first-person, should it be understood as an evolution of the point-of-view shot, part of the multiplicity of contemporary first-person media and genres?

Of course, the specificity of VR lies in its being "subjective" (Bédard 2019), in the sense that the body of the immersant is the point that generates the constitution of the image in a process of performative negotiation: for instance, what appears within the environment is the result of the synthetic graphic elaboration or, in the case of 360-degree cinema, the pre-rendered recording of the environment, *and* at the same time of the physical and attentional movement of the embodied gaze that wanders within it. To put this in another way, the specific mode of presence that is articulated by virtual environments can be interpreted as generating a "self-centered world" (Eugeni and Catricalà 2020), as, in semiotic terms, the experiencer is granted a role of co-enunciator of the virtual world. But, in what sense is virtual reality experienced as self-centered?

In this respect, virtual reality has brought about an epistemological shift in the conception of spectatorship, similar to that which was effected in the history of cinema by the introduction of depth of field and the long take, which provoked a radical reassessment of the spectator's role. These aesthetic constructs and visual strategies obliged theorists to think of the spectator's experience not just as an essentially passive reception, but as being constantly involved in an attribution of meaning and its progressive readjustment, in which the interplay between the belief in the world represented and the reflection on such a reality results in the "active" participation of the beholder (Bazin 1959, Dufrenne 1981, Buscemi 2022).

Thus, while analyses of the new status of spectatorship inaugurated by VR technologies often focus mainly on the spectator's sensorimotor interaction, playfulness and transmedia agency (Neumann et al. 2018, Cowan and Ketron 2019), paradoxically, the operationality and performativity of the embodied gaze are still largely overlooked. In fact, the interactivity proper to immersive environments cannot be limited to the fact that the experiencer is now able to move, respond, and direct their actions towards certain goals within a virtual space; it also brings into play the gesture of looking (as well as of being or not being seen), along with new forms of voyeurism and narcissism of vision (Wang 2021).

In view of phenomenology and especially of Merleau-Ponty's account of embodied experience (Merleau-Ponty 1945, 244 and 2011), perception has to be reframed as an active exercise, as a form of expression: perceiving is never purely passive reception, but already a way of *acting*, since it always entails the movement of the body (even when we are immobile, the gesture of looking implies ocular and muscular movements) and, in other words, since our perception, by expressing the world, recreates it. Also, according to the enactive approach drawing on phenomenology and cognitive science, "perception is not something that happens to us, or in us. It is something we do," since the world makes itself available to the perceiver through bodily movement (Noë 2004, 1).

Therefore, even though virtual environments can provide dramatically different degrees of freedom of movement, according to their production process and the conditions of the interface (allowing different so-called degrees of freedom, 3DOF or 6DOF), the experiencer's interactivity cannot be limited to their capacity to explore or manipulate the environment and the objects included within it. In other words, the line of discontinuity which marks VR spectatorship and differentiates it from other forms of media experience cannot coincide simply with the measure of user interactivity, which is necessarily a gradient. On the contrary, I suggest that we acknowledge that the embodied encounter with the virtual image always entails, as its *minimal but intrinsic form of interactivity*, a *shared performativity*, in which a reciprocity is established between, on the one hand, the gestures of inspection and multisensory exploration of the environment and, on the other hand, the very unfolding of the visual, audio-visual or multisensory material experienced.

In fact, a process of negotiation between the experiencer and the work's perceptible material characterizes the aesthetic experience and media spectatorship as such, and yet, the interface of virtual reality confronts us with *an image that co-constitutes itself in accordance with the embodied movement of its experiencer*, as the tri-dimensional environment relies precisely on the immersant's mobility to unfold. Even in cases where the interface or immersive storytelling does not entail more complex forms of interactivity, virtual reality technology always involves the experiencer in the process of the *real-time construction of the image*. Thus, if within virtual environments we come into contact with a real-in-image, it is a real that the experiencer contributes to in-forming or at least, as affirmed above, to actualizing or re-actualizing in the present.

Thus, as much as immersive media seemingly inaugurate an experience of unframedness, as a new condition for the perception of the image, opposed to the classic conception investigated in the history of art and media, the process of in-formation of the image traditionally ensured by framing in 2D media, does not completely dissolve. On the one hand, framing persists as a symbolic, psychic, aesthetic, or semiotic threshold (Pinotti 2021). But, more radically, I would claim that, instead of disappearing, the very perceptual function of framing is assumed by the experiencer's body, by the performance of their bodily gestures and embodied gaze. Indeed, even in 360-degree films which do not allow an interactive exploration of the environment, the experiencer can direct and point their gaze inside the tri-dimensional surrounding, adopting different patterns of visual behavior, tracing with their eyes what in film analysis would result in panoramic shots, tracking shots, changes of perspective, and so on. As a result, being constantly tracked by the sensors of the interface, the body of the experiencer acts like a *virtual frame* (Dalmasso 2019a), so ensuring the functions of selection, comparison, association, and dissociation, hitherto described—in film and media theory—as framing, camera movements and editing. This needs to be understood not merely in physiological terms, but in its social and biopolitical implications: as the body of the immersant is historically situated, it brings along a background determined by socio-cultural conditions and norms, gender, ethnicity, and so on, acting as well as a receptacle for their performative response to the image.

However, it is worth noting that the performativity of the experiencer is just the reverse of the performativity of the image itself, as operational (Hoel 2018) and "perceiving" image. This co-constitution of the virtual image is possible because head-mounted displays are equipped with sensors which constantly track the user's movements and reconstruct their position in space in real time, so that the resulting image that appears on the screen (which, although not experienced as a two-dimensional surface, is nevertheless in front of the user's eyes) is continually modified in accordance with bodily movements. Hence, by virtue of this reciprocity, the performativity of the virtual image describes a structure in which we can no longer assign categories of activity or passivity to one of the two parties involved, that is, the image and its experiencer, since, to borrow Merleau-Ponty's expression, they are always in an "imminent reversibility" (Merleau-Ponty 1961).

### 5. Whose Body? Becoming a "Moving Cast"

But, if this configuration structurally concerns the conditions of use offered, by design, by the virtual interface (at least in the form that this technology has taken in its current stage of development), how does the shared performativity of the virtual image come to reshape the experience of embodiment in immersive environments? Our analysis could now be deepened through an investigation of the forms of embodiment that can be elicited by the different interaction designs implemented in contemporary VR productions. Our goal here is not to outline a comprehensive taxonomy (Dalmasso 2019b), but to discuss the assumption of the framing function by the immersant's body and to single out the aesthetic strategies that can emerge from this feature of the virtual medium, stimulating the audience to cognitively and bodily situate themselves in relation to the unfolding real, potentially bringing them to question their own participation and inclusion in the virtual world.

We will focus our analysis on a few examples from one of the genres most explored in contemporary immersive productions, namely so-called immersive journalism or, more generally, non-fiction VR contents. The examples we will analyze share, in different ways, the intention to thematize and raise awareness of the experience of migration, often focusing in particular on the currently topical moment of the crossing of the border.

If we consider some of the early pioneering works of immersive journalism,<sup>2</sup> we can observe that the vantage point that is offered to those who experience the virtual environment proposes a sort of degree zero of observation, aiming to achieve a complete illusion of non-mediation, which is the very definition of presence effect: as if I were there. When understood in this sense, virtual reality would seem to realize the dream of idealist philosophers of being able to transcend the existence of our material body so as to achieve a pure inner vision, a pure act of perception experienced from within. However, such productions tend not to forego institutional enunciative indexes proper to traditional documentary film-making. Hence, the audience immersing in the virtual work, although locating themselves within the diegetic space, still maintain their privileged space of external witness (Nicolae 2018; Nash 2018).

<sup>2</sup> See, for instance, ground-breaking works like *Clouds Over Sidra* by Chris Milk and Gabo Arora (2015), *The Displaced* by Ben C. Solomon and Imraan Ismail (2015), or Nonny de la Peña's *Gone Gitmo* (2007), *Hunger in Los Angeles* (2012), *Use of Force* (2013).

More recently, also by building on the expressive research carried out by the early productions of immersive journalism, many works of non-fiction VR have begun to call into question the conditions under which we experience virtual environments, through their storytelling strategies. The interaction design they implement essentially leans on the performative dimension of the virtual image previously examined, and attempts to articulate expressive choices that effectively exploit the spectator's inclusion in the spectacle.

Realized by means of 360-degree filming and CGI effects, Stefania Casini's *Mare Nostrum – The Nightmare* (2019) follows the journey of a young migrant boy from the Sahara to the Mediterranean Sea. Stepping into the virtual environment, the audience first witnesses the tragic farewell between the young Tuareg Atambo and his mother, as he is about to leave his native land. Initially, this closeness appears to be an intrusion: we feel uncomfortable, as we are intruding into an intimate situation, until an unexpected interpellation occurs: the mother turns towards the immersant, she addresses *us* and asks *us* to protect her son. Our excessive proximity thus assumes a specific meaning on a narrative level: we are invited to position ourselves in relation to the image that surrounds us. However, the interpellation upon which immersive storytelling here relies is not simply linear; in other words, the question addressed to us is not univocal. On the one hand, it invites the immersant to ask the question: what role should I inhabit if I place myself within the diegetic universe? Am I one of the smugglers who manage migration routes from Sub-Saharan Africa, or am I a friend, a travelling companion who will share the border crossing with the boy? In fact, as I will claim, on closer inspection, the kind of conundrum that is posed to the immersant is articulated by every VR experience. Yet, here, the question raised by the mother does not merely call for a diegetic interactivity; it reaches us also on another level, as the question she poses also hints at the outside of the fictional universe: is it perhaps I myself, with my first-world citizenship status, to whom the mother addresses a symbolic appeal?

These different diegetic and non-diegetic instances are gathered in my point of observation, when, instead of voyeuristically witnessing this journey, I am immediately called upon to situate myself—both aesthetically and ethically—within this world. Later in the development of the VR script, my position will be repeatedly called into question: in a Libyan prison I will be one of many prisoners, I will become a companion in the trip by sea, and so on, following step by step Atambo's destiny.

The essential but effective narrative choice put into practice by *Mare Nostrum* reveals that, even when in a 360-degree experience we are devoid of a sensible corporeal appearance within the virtual world and cannot therefore be identified in a visible avatar, we still can feel visible, addressed, and subject to observation.

A similar but more intensive use of interpellation is put into play by Neil Bell's interactive installation *The Crossing* (2022). The immersant witnesses a clandestine night-time rendezvous on the Libyan coast, where a smuggler meets a group of migrants preparing to embark on their journey. The smuggler must decide who, among the migrants, will act as the captain of the rubber dinghy. Some put themselves forward for the role, but suddenly the smuggler turns to us and lets us choose the person who seems best suited to captain the boat, knowing that the responsibility for the consequences of this decision will ultimately fall on us. The virtual experience thus unfolds along the lines of a serious game, confronting us with decisions to be made during the trip and with the crucial outcomes that result from our choices.

An opposite aesthetic strategy is pursued by what has undoubtedly become the most famous virtual reality work to deal with the subject of migration: Iñárritu's *Carne y Arena*, already mentioned above. Here, plunging into the Sonora desert, where a group of South American migrants are crossing the border between Mexico and the United States, the immersant is not visible to the other characters, being, as the subtitle of the experience suggests, "virtually present" but "physically invisible." This ambiguous dual status confronts us with the frustration of not being perceived by the characters we encounter, who ignore us and pass through us, thus attributing to us a ghostly existence. Such bodily invisibility points towards the political and social invisibility of the migrants' bodies, of which the immersant can thus have a glimpse. Through the complex installation that is the framework of the piece, those who experience the virtual environment are called upon to embody a first-person perspective, stepping into the shoes of a migrant. Yet such a process of alteration depends upon their freedom of movement and choice, in such a way that it may or may not result in an overlap and coincidence, for instance, with the refugees caught in the night or, perhaps, also with the border patrol agents.

In different ways, in the encounter with these VR works, we are invited to undertake the gesture of a continuous "gearing" onto the visual and sensible material, to situate ourselves in relation to a reality that touches us precisely insofar as it questions our being located in a perceiving-perceived body.

This process of embodied situation also implies positioning oneself within a complex visual culture, which loads the perceptible image that surrounds us with cross-references and stratifications of meaning. It is at this network of markedly extradiegetic echoes and references that Sara Tirelli's *Medusa* (2018) hints (Pirandello and Tirelli 2021). From its very title, the 360-degree experience establishes a direct association with one of the most crucial myths in the history of images—the Gorgon capable of petrifying with her gaze—and at the same time locates its vanishing point in the event of the shipwreck to which Géricault's famous work of the same title refers. *Medusa* moves away from the narrative grammar of immersive journalism to place the immersant at the heart of a theatrical and cinematic performance which envelops them, bringing together events and elements that permeate humanitarian visual culture, up to a symbolic re-enactment of the shipwreck of the Medusa. Through virtual storytelling, those who experience the immersive work realize little by little that the place they occupy—as privileged Western citizens with European or Schengen passports—is that of the spectator of the shipwreck which from Lucretius' *De rerum natura* onwards characterizes our relationship to the spectacle and that the artistic exploration of the virtual medium dramatically puts into question today.

The stylistic trait that characterizes the four works we briefly examined calls upon the immersants to situate themselves in relation to the diegetic world, that is to "mold" their bodily presence and consequently their identity within the immersive environment, resulting in a corporeal and spatio-temporal, but at the same time also social and political positioning. This dynamic seems to emerge as an absolute expressive and aesthetic specificity of the virtual technology that is still taking shape as an expressive medium, namely, the possibility of interrogating the process of "gearing" between virtual and real. This is the point in which the performativity or shared agency of the immersant and the virtual image meet: with my simple bodily movement in space, I have the power to shape the tri-dimensional image that appears to me, but this image that is molded around my "moving cast," has, in turn, the power to fashion me as a perceiving body included within the perceptible (at the same time real and virtual) world.

To employ the terms used above, while discussing the advent of a virtual reality to be re-enacted or re-constructed, we can say that the virtual *anastylosis* does not concern only the perceivable material within the immersive environment, but first and foremost the position of the subject experiencing it. As mentioned above, not only do the immersants contribute to constituting the virtual image, but they are in turn shaped by the immersive environment, asked to fashion their own identity, adapting themselves to the unfolding of the experience in order for it to take place.

Indeed, regardless of the genre and media context from which the VR content springs, the game space (*Spielraum*) of any 360-degree experience is systematically organized as an enigma, a conundrum, in which the immersant is invited *to figure out their own position within the spectacle*; and this holds regardless of the presence of a specifically playful component and of the degree of interaction and manipulation of the environment granted by the interface. In fact, every VR experience places us inside a puzzle or rebus whose constant question is: who am I? Who am I supposed or meant to be? What degree of engagement and participation is required of me? What kind of identity should I adopt so that the experience can work and make sense? Even in the absence of an interactive engagement, in virtual environments I need to keep questioning my role as a mobile virtual frame, attempting to adapt my bodily gestures to the perceptible image that I contribute to informing around me, in which the "real" takes shape as that which makes me re-emerge from the image as a subjective position, to be continuously reconstructed around the gaze that I am invited to embody.

#### References


Gaudreault, André, ed. 1988. *Ce que je vois de mon ciné*. Paris: Klincksieck.


## Exploring Architecture with Image Technologies: From Narrative Film to VR, AR and MR Narrative Structures

*Katharina Andjelkovic*

### Abstract

Contemporary applications of image technologies are not limited to computer vision, but extend to three-dimensional environment modeling, the reconstruction and documentation of architectural buildings, experimental architecture, human tracking and video representation. The chapter also presents opportunities directly related to representational capacities, spatial experience in architecture, and how architectural ideas can be challenged through image technologies. Dissecting the diverse ways in which we experience both existing physical spaces and technologically created environments, this chapter explores the relationship between architecture and image technologies in several parts. It brings the latest image technologies and narrative studies into the spatial realm of architecture with the aim of extending the general idea of experiencing, representing and communicating architectural ideas from the screen to immersive environments. Given that the potential of narrative studies in relation to media like narrative films, virtual reality, augmented reality and mixed reality environments is a new field of inquiry, this chapter explores how new narrative contexts extend the field of representational media through which heritage architecture can be practiced in the future. The overall goal of the chapter is to intersect heritage architecture with narrative film, virtual reality, augmented reality and mixed reality environments in order to understand various ways of experiencing space, reality and illusion.

#### Keywords

Heritage architecture, image technologies, narrative film, virtual reality, augmented reality, mixed reality, illusion, immersion

#### 1. Introduction

As early as the 1930s, science fiction writers, inventors, and thinkers dreamt of an environment where one could escape from reality via art and machines. We were weighing questions about virtual reality *versus* augmented reality *versus* mixed reality long before we had the technology to make them possible. Recently, researchers have claimed that some of the latest architectural design practices have recognized that film, using its specific screen environment, can provide a source of new architectural imagination while contextualizing our experience of both physical and on-screen spaces. Accordingly, by intersecting experimental architecture and narrative film, this chapter aims to tackle these issues of environments created via technologies and narratives. The first part of the chapter deals with early twentieth-century artistic experiments in relation to the new modes of spatial representation brought about by cinema. These experiments in turn made their way into architectural design. More precisely, architects learn from filmmakers and from their exploration of spatial experience in film, observing how the filmmaker uses a unique formal film language to explore the relationship between film time and spatial experience. Ever since the 1990s, experimental architectural and narrative films have been dedicated to the "complicated spaces" of both canonical and marginalized modernist architects, translating their spaces into cinematic, imaginary "architectures of time." Heinz Emigholz (1948–) is a pioneer of experimental architecture and narrative film who has been deciphering the experience of film, time, and space since the 1970s. Thanks to his unique formal language, he has radically distanced himself from conventional representations of space and architecture in film, creating instead an alternative narrative of modernist architecture.

In the second part, the chapter deals with how virtual reality, augmented reality and mixed reality technologies have embellished structures and challenged the boundaries between the digital and the real in architectural space. The increasing establishment of virtual and augmented communication media allows architects to create real-time immersion, influencing the emergence of new space-time conventions. With the increasing dominance of virtual reality, augmented reality and visual media, digital technologies extend the field of architectural inquiry to develop media conventions and support sustained illusion and immersion. For the purpose of creating illusion, I look at narrative solutions in 360-degree immersive devices, not only in the picture but also in the way space, sights and time are organized. Mimicking a real personal experience, the selected 360-degree panorama mode corresponds to events occurring in real time and thus enables us to discuss the transition between reality and illusion. Here, I explore the potential of virtual reality, augmented reality and mixed reality to provide access to different simultaneous temporalities of the past and present of the heritage site. To convey the different experiences and narratives that exist within the site, I scrutinize the procedure of creating these immersive experiences, specifically a tension which is identified in the process. Depending on the position of the researcher, different narratives are sewn together by using multiple archives and stitching the materials together in diverse ways. Introducing the contemporary architectural critique of spatial representation through image technologies is aimed at benefiting an architecture of multiple heterogeneous temporalities that does not inscribe the narrative into a continuous space-time. The overall goal of this research is to demonstrate the capacities that virtual reality, augmented reality and mixed reality hold as technological platforms for critical thinking in architecture and, more specifically, to provide answers as to how narrative can be developed to open crucial epistemological questions in architecture.

#### 2. Narrative, Space, Film

In building design, architects defer to a concept as an idea-driving vehicle that defines architecture and justifies final outcomes. Narrative is another important layer that reaches beyond literary works and significantly defines visual … arts [and architecture] (Zarzycki 2016, 201). Richard Koeck argues in his book *Cine|Scapes* (2013) that architectural spaces and urban landscapes can have narrative qualities which link them with film and cinema. To be able to project their relationship with film, we first need to address the term narrative in relation to space. The study of narrative has its origins in linguistics, where it began life as a structuralist pursuit of a formal system and was subsequently adapted to other fields, such as film studies. This means that we need to address the rationale for the study of narrative in the context of urban landscapes before looking at narrative mechanisms with regard to screen space and on-location space (spaces where films are shot), as well as how these can have a presence in actual urban spaces.

Scholars such as Roland Barthes and Gérard Genette, or Jacques Derrida and Jonathan Culler, advanced an understanding of narrative from a semiotic to a poststructuralist approach in the second half of the twentieth century. Edward Branigan (1945–2019) recalls a similar shift taking place in the field of film studies where, in the mid-1960s, film theory began as an object-centered epistemology (where the goal was to present numerous methods by which to segment and analyze the parts of a film) before repositioning itself as a subject-centered epistemology (where the goal was to investigate the actual methods employed by a human perceiver to watch, understand, and remember a film) (Branigan 1992, XI). In this context, it is important to understand the limits of applying a formal structure to a system that is as complex as that of architectural and urban spaces.

In addition to these limitations in the reading of spaces, another concern lies with the term narrative itself, whose meaning is debated, if not contested, to a considerable degree. For example, Genette notes that "one will define narrative without difficulty as the representation of an event or sequence of events" (Genette 1982 [1966], 127), while Gerald Prince states that a "narrative is the representation of at least two real or fictive events in a time sequence, neither of which presupposes or entails the other" (Prince 1982, 4), and Onega and Landa declare that narrative is a semiotic representation of a series of events. The definition of narrative has since expanded to question the salient role of 'sequences of action' (as stated by the scholar Monika Fludernik), and it moves to define the essence of narrative as the "communication of anthropocentric experience—the experientiality, which is inherent in human experience and feelings, and depiction perceptions and reflection" (Fludernik 2009, 59). Accordingly, narrative is seen not simply as a sequence of events but as what renders such sequences, and thereby narrative itself, an integral part of human experience.

Of particular importance in this context is Fredric Jameson and Jean-François Lyotard's concern about the impact of postmodernity on the human condition. If we consider that narrative is part of all human dialogue, then this seems to open up the possibility of including spatial characteristics in general, as well as cities and architectural spaces in particular, as active agents in a narrative discourse. "If narrative exists," as Roland Barthes (1915–1980) concludes, "as a written, oral, visual discourse in an infinite diversity of forms, then cities and architecture become agents that can be studied for their narrative significance in a represented or mediated form (e.g. film space) as well as unmediated existence (e.g. actual space)" (Barthes 1997 [1967], 158–172).

For nearly forty years, considerable research has been dedicated to the study of narrative and its relationship to space in the context of film. Stephen Heath's work on narrative space made a key contribution to this field, and it is probably fair to say that ever since the publication of "Narrative Space" (Heath 1976) and *Questions of Cinema* (Heath 1981), the study of film has become unimaginable without a consideration of spatial narrative dimensions. Heath alludes to the fact that "film makes space, takes place as narrative, and the subject too, set – sutured – in the conversation of the one to the other" (Heath 1976, 107). Accordingly, the concept of narrative space does not foreground the narrative qualities of film space, but the notion of film as narrative space (Ibid. 75), which can perhaps be summarized as the hypothesis that film forms a dynamic space that is held together by a narrative (Ibid. 75). Moreover, it was felt that there is a need to study the spatio-temporal organization of narrative.

#### 3. Heinz Emigholz's Experimental Architecture and Narrative Film

Heinz Emigholz is a pioneer of experimental architecture and narrative film who has been deciphering the experience of film, time, and space since the 1970s, while also exploring the texture of memory and consciousness. With his film experiments, Emigholz created a kind of 'autobiography of modernism' by filming modernist architecture. Thanks to his unique formal language, he has radically distanced himself from conventional representations of space and architecture in film, creating instead an alternative narrative of modernist architecture. In his *Slaughterhouses of Modernity* (2022), the multi-layered story presents the literal modernist slaughterhouses designed in the 1930s by the architect Francisco Salamone (1897–1959) in the area of Buenos Aires. Instead of representing the symbol of progress of an epoch, these modern architectural relics were turned into symbols of time, of the absurd, of unlikely events, and into the places where "modernism itself is slaughtered," as Emigholz puts it in the film. *Slaughterhouses of Modernity* contains four such episodes, which act as chapters. It seems that Emigholz is putting these architectural stories out there, each fascinating in its own right, and partially contextualizing them, to challenge us to form our own ideas (Kees Driessen 2022). Changing the way architecture is filmed calls for alternative narratives. For example, the first shot deals with the ways in which modernist architecture was meant to be photographed: symmetrical, imposing, in full view. Then, Emigholz takes different angles, including Dutch angles and close-ups, which often show the wear and tear of the buildings, as well as shots from behind trees and bushes, as if nature will end up devouring these symbols of industrial progress (Kees Driessen 2022).

Emigholz's cinematic "archives" of the depicted buildings, with minimal commentary, provide a rare opportunity for careful contemplation and study of the space, light, and materials of architecture. Since his early work analyzing cinematic movement formations in the 1970s, Emigholz has developed a unique formal film language. He uses it to explore the relationship between film time and spatial experience, between memory structure and consciousness, and the gap between ideological horizons of expectation and materially formed conditions. Emigholz's approach is interesting not in terms of analyzing or re-reading a certain type of narrative film, but in relation to how it attempts to explore the transformation of filmic architecture into narrative cinema. For example, his *Goff in the Desert* (2003)<sup>1</sup> shows how Goff and Emigholz design and experience space, each through his own medium. Emigholz depicts many of Bruce Goff's (1904–1982) sixty-two buildings as they pave alternative approaches to mainstream American Modernism. By connecting organic forms, graphics and artwork, Goff's focus was to explore how patterns and chaos, from nature, relate to composing space. Particularly after 1940, his buildings are imaginative, both ordinary and extraordinary. Through the creativeness of Midwestern architects, the language of composing forms, organization, and structuring was inspired by patterns such as those used in the repetitive compositions of the Beaux-Arts (Andjelkovic 2019, 82). For example, using brick, wood, glass, and stone, his window design is variously circular, triangular, or diamond shaped, while roofs and buildings take surprising and radical shapes. In Emigholz's film, the presentation of Goff's buildings offers the constructs of narrative film, defined as a representational film that tells its audience or spectators a fictional story or narrative. Ever since the 1990s, experimental architectural and narrative films have been dedicated to the "complicated spaces" of both canonical and marginalized modernist architects, translating their spaces into cinematic, imaginary "architectures of time."

<sup>1</sup> *Goff in the Desert* is a sweeping, cinematic meditation on 62 buildings designed by the American architect Bruce Goff. It is one installment of a series of films Emigholz has made under the title *Architecture as Autobiography*.

#### 4. Virtual Reality, Augmented Reality, Mixed Reality

The increasing establishment of virtual and augmented communication media allows architects to create real-time immersion for their projects, and consequently to advance high-end research and establish a theoretical framework for new space-time conventions. With the increasing dominance of virtual reality, augmented reality and visual media, digital technologies extend the field of architectural inquiry to develop media conventions and support sustained illusion and immersion. For the purpose of creating illusion, I look at narrative solutions in 360-degree immersive devices, not only in the picture but also in the way space, sights and time are organized. Mimicking a real personal experience, the selected 360-degree panorama mode corresponds to events occurring in real time and thus enables us to discuss the transition between reality and illusion. In regard to the effect of illusion created in cinema, "contemporary film theorists have tended to assume that the average film spectator is fundamentally deceived into believing what is seen is real" (Allen 2003, 226). For example, in his influential essay, "Ideological Effects of the Basic Cinematographic Apparatus," Jean-Louis Baudry argues that projection and narration in film work together to 'conceal' from the spectator the technology and technique that underpin the production of the cinematographic image, so that the film viewer believes she or he is in the presence of unmediated reality (Baudry 1986, 286–298).<sup>2</sup> However, Noël Carroll (1947–) has argued, arguably correctly, that the "epistemologically pernicious" sense of illusion implied by contemporary film theory's account of spectatorship, in which the spectator involuntarily takes the cinematic image to be real, does not reflect our experience of the cinema (Carroll 1993, 21).<sup>3</sup> The film spectator is not duped by the cinematographic apparatus or forms of narration in the cinema; the spectator is fully aware that what is seen is only a film (Ibid. 21). The problem of this kind of rejection of illusion, and of its applicability to the cinema, on the basis that the cinematic image does not differ in any essential aspects from other forms of pictorial representation that do not involve illusion (Carroll 1988, 90–106),<sup>4</sup> has been further challenged by Carroll. Allen stands for a kind of 'reproductive illusion' which trades upon the reproductive properties that the cinematic image and the photograph share. This form of illusion, derived from the photographic properties of the cinematic image, will be the basis for our further discussion of narration in virtual reality, augmented reality and mixed reality environments, as they all share the photograph as their basic means of operation.

Employing narration in virtual reality means telling a story to open the time interval to an ambivalent reading. Time is now converted into space, pushing the action from frame to frame. Observing and describing the narrative elements in response to a gradual transformation of spatio-temporal conditions enabled me to keep track of the encounters between temporal, narrative, and visual effects. I take the case of an immersive approach to built heritage in relation to how a three-dimensional digital reconstruction could be made and disseminated through virtual reality, augmented reality and mixed reality applications. I look at the potential of these technologically created environments to provide access to different simultaneous temporalities of the past and present of the heritage site. To convey the different experiences and narratives that exist within the site, I scrutinize the procedure of creating immersive experiences, specifically a tension which is identified in the process. Depending on the position of the researcher, different narratives are sewn together by using multiple archives and stitching the materials together in diverse ways. Introducing the contemporary architectural critique of spatial representation through image technologies is aimed at benefiting an architecture of multiple heterogeneous temporalities that does not inscribe the narrative into a continuous space-time. This said, virtual reality, augmented reality and mixed reality hold capacities as technological platforms for critical thinking in architecture and, more specifically, provide answers as to how narrative can be developed to open crucial epistemological questions in architecture.

<sup>2</sup> However, the argument that Baudry and certain other contemporary film theorists make about illusion is more complex than this summary implies.

<sup>3</sup> Noël Carroll, cited in Allen (1993).

<sup>4</sup> See Carroll 1988, 90–106. An earlier version of this argument is contained in "Address to the Heathen," *October* 23 (Winter 1982): 103–109.

Technological advances have also enabled the rendering of realistic three-dimensional virtual reality, augmented reality and mixed reality representations of actual buildings and places from the past, including buildings and streets that have long since disappeared. In terms of the possibilities that these technologies can offer, virtual reality can fully immerse users in a virtual environment, allowing them to travel to distant locations, while augmented reality combines virtual elements with reality, e.g., by showing digital artifacts at the original location where they were recovered and adding a new dimension to the visitor experience (Garro, Sundstedt and Sandahl 2022, 1988). Mixed reality (Milgram and Kishino 1994, 1321–1329),<sup>5</sup> on the other hand, is a blend of physical and digital worlds, unlocking natural and intuitive three-dimensional human, computer, and environmental interactions. In order to enhance understanding of the dimension and shape of a lost building in a heritage site, a three-dimensional digital reconstruction could be made and then disseminated through an augmented reality or virtual reality application (Massio 2020, 229). As demonstrated by the most recent scholarly research on built heritage, this method makes available experiences that present a bricolage of the site's lost buildings and demand a discussion of the potential of digital representation and immersive technologies as design and visualization tools. Furthermore, using advanced simulation techniques, three-dimensional scanning, and real-time rendering, along with archival and scholarly resources, architects may challenge the traditional approach to the study of heritage sites and make available their complex histories anew and from nonlinear perspectives. To achieve these goals, they aim to further the potential of immersive technologies and open critical questions based on the research process. For example, in their "Digital Archeology and Virtual Narratives: The Case of Lifta" workshop,<sup>6</sup> researchers from the Massachusetts Institute of Technology identified a whole new logistics between archival research and the collection of onsite evidence, with a rich body of photographs, drawings and artworks, stories and other narratives.

<sup>5</sup> Mixed reality is based on advancements in computer vision, graphical processing, display technologies, input systems, and cloud computing.

The workshop instructors, Eliyahu Keller and Eytan Mann, elaborated on a successful project, stating that it "resulted in the design of immersive and virtual experiences of the village and its multiple narratives" (Keller and Mann 2019, 287). My particular interest is in the process of creating these immersive experiences, specifically a tension that they identified between narrative and representation in the first stage of the process. Keller and Mann further claim to have achieved immersion, a shifting between scales, and the animation of the materials by the voices of former residents, in order to convey the different experiences and narratives that exist within the site (Keller and Mann 2019, 289). Essentially, the panoramic format of a 360-degree continuous viewing experience enabled the effect of immersion. It offered multiple narratives through a rich body of photographs, drawings, artworks and collected stories. In this way, virtual reality provided access to different simultaneous temporalities of the past and present. This is possible because with each new image representation the heritage site recontextualizes the previous one. Walter Benjamin identified this radically alternative conception of time and of historical experience in the notion of the *dialectical image*. As a method, "the dialectical image serves for distancing an image from the reality it presents" (Andjelkovic 2020, 96). In addition, some of the most recent scholarly research has identified similar issues and paved the way for an entirely new approach to virtual reality, augmented reality and mixed reality research, asking how an immersive form of representation has enabled the historical evidence of the site to change our approach to history.

<sup>6</sup> This collaborative workshop, "Digital Archeology and Virtual Narratives," designed by Eli Keller, Eytan Mann, Takehiko Nagakura, and Mark Jarzombek, brought together the resources and expertise of the MIT Department of Architecture, MISTI-ISRAEL and the Department of Bible Archeology and Ancient Near East Studies at Ben Gurion University (BGU). It was conducted at the MIT School of Architecture and Planning in 2018. Read about the workshop in "Digital Archeology, Virtual Narratives: The Case of Lifta," virtualXdesign (MIT Virtual Experience Design Lab).

Undoubtedly, these new environments play a key role in the presentation of the past, and can create new meanings by integrating storytelling, gamification and other interactive experiences for visitors. While virtual reality adheres to the limits of traditional historical studies, it also stretches the boundaries of historical inquiries and challenges them by immersing the viewer in "realistic" environments. As Keller and Mann correctly put it, "the immersive quality facilitates a reciprocity between the site as it is recorded, represented and narrated, as well as the numerous existing and constructed archives, or the various testimonies about the site" (Keller and Mann 2019, 287–298). As these intermingle with one another through the work and investigation, the site itself becomes yet another archive, while the archive transforms, or better yet, is exposed as what it always has been: a site of intervention and design (Ibid.). In that regard, the process of creating these environments gave the researchers an operational role, working between various epistemological registers. Music, images, objects and augmented reality clips, which all form part of an architectural model in the cultural heritage site, now serve as a structural motif for the unfolding narrative.

It is clear that an entirely new digital approach for managing built heritage is needed at the juncture of historiography and ethnography of the recent past. With the increasing dominance of virtual reality, augmented reality, mixed reality and visual media, digital technologies extend the field of art historical and architectural inquiry to include hybrid practices constantly negotiating between operative and representational demands. The significance lies in providing the means to link our current technological reality with the agents involved in the historical research of built heritage. For this reason, it becomes urgent to investigate the possibilities of such an alternative approach by establishing critical dialogues between them. To realize the potential of these dialogues, contact narratives can be established where the polarities of traditional representations, visual media, virtual reality, augmented reality and mixed reality meet and negotiate between the operative and representational demands of newly formed hybrid practices. For example, the analysis of virtual reality representation experiments undoubtedly testifies to a discrepancy between the reality of presented historical artifacts and what might be inscribed through the new reality of network aesthetics regimes. The increasing tendency towards digitalization of image archives further intensifies diverse types of critical dialogue about how this transition from an object-based to a network aesthetics affects the syncretization of cultural differences and the problematization of uneven development and inequalities between different sites. Another issue is the relationship between visual agency, forms of image creation, collection and patterns of memorialization, which form part of the process.

#### 5. Engaging Viewers in Interactive Experiences

Recent studies have shown that users' interaction with immersive reality technologies and mixed reality environments, where the real world and virtual objects co-exist and interact, can determine their learning outcome and overall experience. This innovation is particularly important for providing target audiences with exhilarating, meaningful, and inclusive cultural experiences. The wide range of opportunities that these hybrid environments provide for learning, communication, and entertainment is now securing the employment of virtual reality, augmented reality and mixed reality technologies by cultural heritage organizations and beyond. This is even more so in the context of new tools and techniques which are developed simultaneously and can enable new means of exploration, interaction, and interpretation of cultural assets and heritage architecture. For example, interactive exhibitions based on digital technologies, which require the viewer to imagine and understand lost architecture, have become common. With the immersive experience of different types of users as their primary objective, these exhibitions offer the possibility of interacting with the content instead of reading or listening to an explanation. Different interactive elements, both three-dimensional and two-dimensional, are integrated into each augmented reality clip to engage the viewer. Some can be experienced in augmented reality, while others are moved directly on the screen. This choice is related to complex interactions, in which visitors usually feel more comfortable interacting with touch screen elements (e.g., smartphones and tablets) than with augmented elements (Yin et al. 2019, 17663–17674).<sup>7</sup> Besides the main interactions, during the experience visitors can also freely explore the virtual recreation of heritage architecture, manipulating the three-dimensional models and focusing on aspects that would otherwise remain undiscovered (Spadoni et al. 2022, 1370–1394). Additionally, the integration of different narrative modes and media, defined as transmedia storytelling,<sup>8</sup> makes the experience more engaging and complete for visitors.

While mobile augmented reality applications are widely available on desktops and mobile devices, fully immersive and interactive mixed reality applications are difficult to find. Combining collaborative and multi-modal interaction methods with mixed reality allows multiple users to interact with each other (social presence) and with a shared real-virtual space (virtual presence) (Bekele 2021, 1448). As a result, the combination of these methods can readily establish a contextual relationship between users and cultural contexts. This is important in the museum and heritage site context, where physical manipulation of artefacts is typically not permitted. By providing a dynamic and interactive environment, mixed reality applications enable users to collaboratively manipulate and interact with digital three-dimensional models of heritage sites and architectural objects via an interactive and immersive virtual environment, either in museums or remotely.

#### 6. Conclusion

With the increasing dominance of virtual reality, augmented reality and mixed reality environments, digital technologies extend the field of heritage architectural inquiry to develop the media conventions and support sustained illusion and immersion. The new technological reality presents opportunities directly related to representational capacities, spatial experience in architecture, and how architectural ideas can be challenged through image technologies. Consequently, the importance of implementing these technological innovations is to contribute not only to an immersive experience, building on the idea of viewers' participation, but also to achieve the synthesis of immersion and interactivity in a real-time mode as the ultimate goal of institutions dealing with cultural heritage.

<sup>7</sup> Yin et al. 2019. IEEE Access. Also see: Keighrey et al. 2021. IEEE Trans. Multimed.

<sup>8</sup> See more in: Jenkins, H. 2006. Convergence Culture: Where Old and New Media Collide. New York, NY: New York University Press.

In addition, this chapter explored how the new narrative contexts extend the field of representational media through which heritage architecture can be practiced in the future. Diverse applications of these innovative methodologies are already seen in articulating an innovative account for protecting and preserving cultural assets, not only for the promotion of cultural tourism, vivification of museums and monuments, and regeneration of historical centers, but also as a vessel for visibility and accessibility in both their physical and virtual forms, as well as for the criticality of historical knowledge. By utilizing the capacities of virtual reality, augmented reality and mixed reality spaces in architectural heritage research, an immersive form of representation enabled the historical evidence of the site to change our approach to history. In other words, this research demonstrated that virtual reality, augmented reality and mixed reality hold potential as technological platforms not only for immersive experiences but also for critical historiography. Their use in the research process did not merely challenge the discursive debates around emergent technologies that impact the often-warring methodologies of heritage studies; rather, it contributed to reintroducing history into the debate about the scientific analysis of the built heritage and, in return, provided new inroads into the exploration of our current technological reality. Moreover, this research demonstrated the capacities that virtual reality, augmented reality and mixed reality hold as technological platforms for critical thinking in architecture and, more specifically, provided answers as to how narrative can be developed to open crucial epistemological questions in architecture. In a wider perspective, the established dialogues promise to provide a firm historical foundation on which to base current architectural debates caught in an ever-changing world.

#### References


Fludernik, Monika. 2009. *An Introduction to Narratology*. London: Routledge.


## Authors

*Katarina Andjelkovic* (Serbia), Ph.D., M.Arch.Eng., is a theorist, practicing architect, researcher and painter. Katarina has served as a Visiting Professor, Chair of Creative Architecture, at the University of Oklahoma, U.S.A., the Institute of Form Theory and History in Oslo, the Institute of Urbanism and Landscape in Oslo, and the University of Belgrade, and has guest lectured at TU Delft, AHO Oslo, FAUP Porto, DIA Anhalt Dessau, SMT New York, and ITU Istanbul. Katarina has published her research widely in international journals (Web of Science) and won numerous awards in architecture design and urban design competitions. Katarina has published two monographs, an upcoming book chapter, and several journal articles with Intellect UK.

*Niklas F. Becker* (Germany), M.A., is a doctoral candidate at the DFG-graduate research program *Media Anthropology* at Bauhaus-University in Weimar. He studied media, film and media dramaturgy in Siegen, Mainz and Berlin. After obtaining his degree in media studies, he taught at the Technical University of Berlin. In his doctoral thesis, he focusses on questions regarding the constitution and transformation of human perception and modes of existence under technological conditions in general and under the conditions of an 'augmented reality' in particular.

*Anna Caterina Dalmasso* (Italy) is assistant professor in Film and Media Studies within the ERC project "An-Icon" at University of Milan, where she teaches "Media Archaeology". Combining visual culture, film and media studies, phenomenology and aesthetics, her current research focuses on post-cinema and immersive media. She devoted her doctoral thesis to Merleau-Ponty's philosophy of the visual and to its implications for contemporary mediality and technoculture (*Le corps, c'est l'écran. La philosophie du visuel de Merleau-Ponty*, Mimesis, 2018; *L'œil et l'histoire. Merleau-Ponty et l'historicité de la perception*, Mimesis, 2019). She has co-edited several interdisciplinary collective books and journal issues on screen studies and philosophy of cinema.

*Julia Reich* (Germany), M.A., is a research associate at the subproject on Virtual Art of the Collaborative Research Center *Virtual Lifeworlds* at Ruhr University Bochum. Since 2019 she has been a doctoral candidate in art history at the graduate research training group on *Documentary Practices: Excess and Privation* at Ruhr University Bochum, where she is working on a dissertation on figurations of absence in performance art that documents itself through media. She holds a Master's degree in Art History from Heinrich-Heine-University Düsseldorf. Her research focusses on performance art, its relation to documentary practices, artistic performative forms and body discourses of the virtual.

*Norbert M. Schmitz* (Germany), Dr. phil., Professor of Aesthetics at Muthesius University of Fine Arts and Design, Kiel, art and media historian. Teaching positions at universities and art academies in Wuppertal, Bochum, Linz, Zürich, Siegen, and Salzburg; international lectures at universities and academies in Chicago, Berkeley, New York, Atlanta, Minneapolis, Paris, Zürich, Bern, Salzburg, New Delhi, Lahore, Kathmandu, Seoul, Hangzhou. Research interests: iconology and intermediality between art and film, iconology of the old and new media, discourse theory of the art system, and global art. Publications: Schmitz, Norbert M. 2017. *Media Time as Aesthetic Strategy in Modernism: On the Aesthetics of Time and Media between Avant-Garde Film, Classical Style, and New Media.* In: Image Temporality: Time, Space and Visual Media, edited by Lars C. Grabbe, Patrick Rupert-Kruse and Norbert M. Schmitz, 16–37. Marburg: Büchner. Schmitz, Norbert M. 2011. *The resistance of the object or the subversion of the intimate: Jan Švankmajer's 'surrealisme intime'*. In: Das Kabinett des Jan Švankmajer/The Cabinet of Jan Švankmajer, exhibition catalogue, Kunsthalle Wien, edited by Ursula Blickle, Gerald Matt, and Kunsthalle Wien, 96–110, Nuremberg: Verlag für Moderne Kunst. Schmitz, Norbert M. 2008. *Hopper's Modernity*. In: Western Motel: Edward Hopper and Contemporary Art, exhibition catalogue, Kunsthalle Wien, edited by Gerald Matt, 240–259. Nuremberg: Verlag für Moderne Kunst. Schmitz, Norbert M. 2005. *Interaction Design – Design as Interaction: On the Emergence of Design as a Symbolic Form of Modern Communication.* In: Total Interaction, edited by Gerhard Buurmann, 43–51, Basel: Birkhäuser.


*Jens Schröter* (Germany), Prof. Dr., is chair for media studies at the University of Bonn since 2015. He was Professor for Multimedial Systems at the University of Siegen 2008–2015. He was director of the graduate school *Locating Media* at the University of Siegen from 2008–2012. He was member of the DFG-graduate research center *Locating Media* at the University of Siegen since 2012. He was (together with Prof. Dr. Lorenz Engell, Weimar) director of the DFG-research project *TV Series as Reflection and Projection of Change* from 2010–2014. He was speaker of the research project (VW foundation; together with Dr. Stefan Meretz; Dr. Hanno Pahl and Dr. Manuel Scholz-Wäckerle) *Society after Money – A Dialogue*, 2016–2018. Since 4/2018 director (together with Anja Stöffler, Mainz) of the DFG-research project *Van Gogh TV. Critical Edition, Multimedia-documentation and analysis of their Estate* (3 years). Since 10/2018 speaker of the research project (VW foundation; together with Prof. Dr. Gabriele Gramelsberger; Dr. Stefan Meretz; Dr. Hanno Pahl and Dr. Manuel Scholz-Wäckerle) *Society after Money – A Simulation* (4 years). Director of the VW-Planning Grant *How is Artificial Intelligence Changing Science?* (Start: 1.5.2020, 1 Year, Preparation of Main Grant); April/May 2014: "John von Neumann"-fellowship at the University of Szeged, Hungary. September 2014: Guest Professor, Guangdong University of Foreign Studies, Guangzhou, People's Republic of China. Winter 2014/15: Senior-fellowship at the research group *Media Cultures of Computer Simulation*, Summer 2017: Senior-fellowship IFK Vienna, Austria. Winter 2018: Senior-fellowship IKKM Weimar. Recent publications: (together with Project *Society after Money*) *Postmonetär denken*, Wiesbaden: Springer 2018; (together with Project *Society after Money*): *Society after Money. A Dialogue*, London/New York: Bloomsbury 2019; (together with Armin Beverungen, Philip Mirowski, Edward Nik-Khah): *Markets*, Minneapolis/London: University of Minnesota Press and Lüneburg: Meson (Series: In Search of Media); *Medien und Ökonomie*, Wiesbaden: Springer 2019. Visit www.medienkulturwissenschaft-bonn.de / www.theorie-der-medien.de / www.fanhsiu-kadesch.de.

*Pamela C. Scorzin* (Germany), Prof. Dr., born in Vicenza (Italy), studied European art history, philosophy, history, and English/American studies; Magistra Artium in 1992 and doctorate in 1994 at the University of Heidelberg. After assisting in 2001, habilitation at the Department of Architecture at TU Darmstadt. Freelance work as an art critic; a member of the AICA since 2006. Since 2008 Professor of Art Studies at the Department of Design of Dortmund University of Applied Sciences and Arts, and since 2020 Vice-Dean. Numerous publications (German, English, French, and Polish) on the history of art and culture from the 17th to the 21st century. Lives, works, and researches in Dortmund, Milan, and Los Angeles and travels in the metaverse under the pseudonym 'Levania Lehr'.

*Manuel van der Veen* (Germany) is an author and artist. After studying Fine Arts and Philosophy, he did his Ph.D. in Art Science at the Academy of Fine Arts Karlsruhe from 2018 to 2022 under the supervision of Carolin Meister and Stephan Günzel on the topic *Augmented Reality. Trompe-l'oeil and Sculptural Relief as Technique and Theory*. At the moment he is a research associate at the subproject on Virtual Art of the Collaborative Research Center Virtual Lifeworlds at Ruhr University Bochum. His current research involves the juxtaposition of traditional image production with current digital techniques, theories of image carriers and site specificity in the age of the virtual.