# Beyond Media Borders, Volume 1: Intermedial Relations among Multimodal Media

*Edited by* Lars Elleström

*Editor* Lars Elleström Department of Film and Literature Linnaeus University Växjö, Sweden

ISBN 978-3-030-49678-4 ISBN 978-3-030-49679-1 (eBook) https://doi.org/10.1007/978-3-030-49679-1

© The Editor(s) (if applicable) and The Author(s) 2021. This book is an open access publication.

**Open Access** This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cover illustration: Jay's photo/getty images

Cover design: eStudio Calamar

This Palgrave Macmillan imprint is published by the registered company Springer Nature Switzerland AG.

The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

# Foreword: Mediations of Method

As the subtitle of the two volumes of *Beyond Media Borders: Intermedial Relations among Multimodal Media* makes clear, these reflections on media have the mission to begin where medium-specificity or, what I call slightly irreverently, medium-essentialism ends. The media under discussion here, considered from a great variety of perspectives, are all 'multimodal', set in more than one semiotic mode. The most readily understandable example we have rehearsed for so long would be, of course, cinema or television, the study of which in monodisciplinary departments seems to take for granted that they *are* media, whereas the inevitable combination of words and images, colour, sound, narrativity and technological effects clearly demonstrates that no single disciplinary framework will do. As I am also a maker of films and video, I feel I am in a good place to say this. But as the essays in these volumes make clear, practically all media deploy more than one modality.

The point is not so much, however, that 'multi-' aspect, although that, too, is important, since it advocates an anti-purist view of the media products—Lars Elleström's term for 'texts' and 'images', 'sounds' and 'words', and what have you, that it is the Humanities' mission to study. What catches my eye is primarily that word 'relations', in combination with the preposition 'inter-', which is particularly dear to me, as I have explained more times than I care to remember. Briefly, 'inter-' stands for, or *is*, relation, rather than accumulation. It is to be distinguished in crucial ways from that currently over-used preposition 'trans-', which denotes a passage through, without impact from, another domain. With his consistent interest in media *as* intermedial and his prolific publication record, many edited volumes, and as director of the Linnaeus University Centre for Intermedial and Multimodal Studies (IMS), Elleström has become a primary authority in that domain that is best characterized as one that doesn't fit any of the traditional disciplinary concepts, yet is probably the largest, most frequently practised mode of communication among humans, indispensable for human life. Elleström's ongoing focus on—his intellectual loyalty to—the idea of the *semiotic*, a concept and field that on its own already indicates the need for the 'beyond' in the books' main title, demonstrates a resistance to ephemeral academic fashion and a consistency of thought without dogmatism which I consider characteristic of the semiotic perspective. Briefly again, a semiotic perspective asks how we make meaning. The interest of these volumes lies in the importance of communication in general, without which no human society is possible.

Media, as the editor explains, are always-already 'inter-', as the century-old debates about inter-arts clearly demonstrate. The preposition is a bridge, and the articles brought together here explore what the bridge bridges. This requires reflection on the concept of media itself. One cannot understand intermediality without a sense of what a medium is; even if, as such, in its purity, it doesn't exist. With exemplary clarity, Elleström begins his substantial opening and synthesizing article with five tendencies he finds damaging for intellectual achievement in (inter)media studies. Anyone interested in this field of study will recognize these tendencies and agree with the editor's critique of them. But then, the challenge is how to remedy these problems. This is where Elleström earns his authority: he proceeds to announce how these tendencies will be countered, or overcome, in the present volumes. If only all academics would take the time and bother to lay out what they are up against and then redress it: academic bliss would ensue. In other words, this is real progress in the collective thinking of cultural analysis. Felicitously refraining from short definitions, he embeds the relevant concepts in what he calls a 'model', but what those of us with a mild case of 'model-phobia' (the fear of a certain scientistic demand of rigour before all else) may also see as a theoretical frame. Felicitously, he calls his activity 'circumscribing' rather than defining.
His approach alone, then, already demonstrates in the first pages of his long introductory text an academic position that integrates instead of separating creativity and rigour and thus not only helps us understand the general principles of communication but, through detailed analysis, makes us 'communicationally intelligent', if I may follow discursively the example of psychoanalyst Christopher Bollas who, in his 1992 book *Being a Character*, sensitizes us to the complexities and, *thereby*, the clarity of how people are able, and by the media products enabled, to communicate effectively, with nuance.

There is not a term or concept here that is not both circumscribed and relativized and put to convincing use. The length of the introductory essay is, in this sense, simply a demonstration of generosity. For example, the central concept of 'transfer' that we can hardly do without when talking about communication is neither defined in a simplistic way, as a postal service that goes in one direction only, nor theorized into incomprehensibility. The idea of transferring means that a message goes from a sender to a receiver, we were told in the early days of semiotic theory. Of course, in order to discuss communication, we must consider the idea that a message is indeed transferred from a sender to a receiver; without it, we are floundering. In this, Elleström is realistic; he doesn't reinvent the wheel. Yet, the implicit (but not explicit) notion that the content of a message, as well as its form, goes wholesale from sender to receiver, as endorsed in traditional semiotic theory, is clearly untenable. For, the sender's message, with the sender always already 'in' communication, will always be influenced, or coloured, by what the sender expects, and has reasons to expect, the receiver will wish, grasp, appreciate.

What do we do, then? Instead of casually rejecting the idea, concept or notion, Elleström and his colleagues in these volumes recalibrate and nuance what we consider a message to be, with the help of the relationality that the preposition 'inter-' implies. This makes the sender-message-receiver process an interaction, mutually responsive, hence, communicative in the true sense. The change from 'sender' to 'producer' intimates that the former sender has *made* something. The former 'receiver' has shed her passivity by becoming a 'perceiver', a term that adds the activity performed at the other end of the process. And when the term 'meaning' is hurt by a long history of rigid semantics, as is the case of many of the concepts we use as if they were just ordinary words, they come up with alternatives, but not without bringing these in 'discussion' with the simpler but problematic predecessors. The need for a concept that cannot be reduced to dictionary definitions compels the authors, guided by the experienced and ingenious editor, to come up with richer terms that are able to encompass all those nuances that were always a bit bothersome and that we liked to discard or ignore. Thus, 'cognitive import' cannot be reduced to 'meaning', and neither can it be confined to language. That would make the substitution of a well-known term by a new one pointless. Instead, the new term necessarily includes the embodied aspect of communication. This eliminates the mind-body dichotomy to which we are so tenaciously attached; not because we believe in it, but because, until these volumes, we had no alternative vision.

The word 'dichotomy', here, is perhaps the most central opponent in these volumes' discourse. And as with 'inter-' as implying relationality, I feel very close and committed to an approach that does not take binary opposition as its 'normal', standard mode of thinking. And once we are willing to give up on dichotomies such as mind/body, it becomes possible to complicate all those dichotomies that structure what we have taken for granted and should let go in order to recognize the richness of mental life—mental in a way that does not discard the body but endorses it, along with materiality, as integrally participating in the thinking that communication stimulates, helps along and substantiates. Both the partners in communication, who can be singular and, at the same time, plural, and the site of communication, are necessarily material or bound to materiality. Moreover, the sense-based nature of communication makes the abstract ideas surrounding communication theory, not only untenable, but futile, meaningless. Getting rid of, or at the very least, bracketing, binary opposition as a way of thinking is for me the primary merit of the approach presented here.

So, the first thing these books achieve is to complicate things, in order to get rid of cliché simplicity, and then, right after that, to clarify those complicated ideas, concepts and the models that encompass them. This is perhaps the most important merit of these volumes. They complicate what we thought we knew and clarify what we thought was difficult. With that move as their starting point, the enormous variety of topics of the chapters becomes, thanks to the many cross-references from one article to another, a polyphony. Rather than a cacophony of loud divergent voices, this polyphony constitutes a symphony that, as a whole, maps the enormously large field of the indispensable communication that is human culture, without pedantically demanding that every reader be an expert in all those fields. I don't think anyone can master all the areas presented and examined in the contributions, but the taste of it we get makes us at the very least genuinely interested. This is not a dictionary or an encyclopaedia but a beautifully crafted patchwork of thoughts.

The conceptual travels are stimulating, never off-putting, because they are completely without the plodding idiosyncrasies one so often encounters when new concepts are proposed. And devoid of the polemical discussions with other terminologies, well explained and labelled meaningfully, the conceptual network towards which these books move fills itself as we read along and thus ends up offering a ground for cultural analysis that I am eager to put my feet on. Solid, reliable and, still, exciting. What more would we wish from in-depth academic work? This collective, collaborative work is based on a deep understanding of what scholarly work should be: an act of communication between producers and perceivers, as the view presented here would have it, one that makes its readers feel involved. This is the only way they can learn something new.

Mieke Bal, University of Amsterdam, Amsterdam, The Netherlands

# Preface

In 2010, Palgrave Macmillan published a volume entitled *Media Borders, Multimodality and Intermediality*, which I had the pleasure to edit. It included my own rather extensive introductory article, 'The Modalities of Media: A Model for Understanding Intermedial Relations', which has since then attracted some attention in intermediality studies. It is my most quoted publication, and scholars and students still use the book and my introductory article in research and education. For my own part, I apply the core concepts of 'The Modalities of Media' as a basis for all my research, including in the two Palgrave Pivot books *Media Transformation* (2014) and *Transmedial Narration* (2019). Over the last decade, however, I have also deepened, developed and slightly modified the original ideas because I think some of them were formulated prematurely. I have also noted that people sometimes misunderstand certain parts of the article because of my somewhat inadequate and occasionally confusing ways of explaining some of the concepts. Therefore, I decided to rewrite 'The Modalities of Media: A Model for Understanding Intermedial Relations'.

However, the reworking became more substantial than I had expected, resulting in a text that is not only modified and updated but also significantly expanded, incorporating ideas that I have presented in other publications during the last decade. Therefore, I have called it 'The Modalities of Media II: An Expanded Model for Understanding Intermedial Relations'. The new version more clearly frames mediality and intermediality in the context of inter-human communication and defines the central concept of media product as the intermediate entity that makes communication among human minds possible. It retains, but slightly modifies and expounds, the central idea of characterizing media products in terms of four media modalities, four kinds of media traits. For instance, the discussions now include not only virtual (represented) time and space but also virtual (represented) materialities and sensory perceptions. Providing a fuller picture of representing and represented media traits, as well as adding discussions of cross-modal cognitive capacities of the human mind, makes it possible to offer a much-developed understanding of the concepts of media types and media borders and what it means to cross media borders. As a result, the new article hopefully better explains the intricacies of media integration and media transformation. Overall, most of the concepts have been fine-tuned, leading to a more consistent and developed framework. However, attentive readers will note that I have not mentioned a few ideas that I briefly discussed in the original article. This does not necessarily mean that I have abandoned them; rather, I have decided to develop them further in other publications instead of trying to squeeze even more into an already extensive article. Nevertheless, 'The Modalities of Media II' is supposed to replace rather than complement the original article.

This means that the two-volume *Beyond Media Borders: Intermedial Relations among Multimodal Media* is effectively a completely new publication. All of its other contributions are entirely novel compared to *Media Borders, Multimodality and Intermediality* (2010) and are written by authors that (with only one exception) are not the same as those in the earlier book. The main idea of the new publication is not only to launch an updated version of 'The Modalities of Media' but also to present it together with a collection of fresh articles written by scholars from a broad variety of subject areas, united by their references to the concepts originating from 'The Modalities of Media'.

Besides being highly original pieces of scholarship in themselves, the accompanying articles practically illustrate, exemplify and clarify how the concepts developed in 'The Modalities of Media II' can be used for methodical investigation, explanation and interpretation of media traits and media interrelations in a broad selection of old and new media types. To provide space for analysis of such a wide range of dissimilar media types, without reducing the complexity of the arguments, two volumes are required. Their title, *Beyond Media Borders: Intermedial Relations among Multimodal Media*, reflects the underlying idea that all media types are more or less multimodal and that comparing media types requires that these multimodal traits be analysed and compared in various ways. As different basic media types have diverging but also partly overlapping modes (for instance, several dissimilar media types have visuality or temporality in common), and because humans have cognitive capacities to partly overbridge modal differences (between, for instance, space and time or vision and hearing), media borders are not definite; in that sense, one must move 'beyond' media borders.

Overall, the two volumes form a collection with strong internal coherence and abundant cross-references among its contributions (not only to 'The Modalities of Media II'). Simultaneously, they cover and interconnect a comprehensive range of very different media types that scholars have traditionally investigated through more limited, media-specific concepts. Hence, the two volumes should preferably be read together as a unified, polyphonic and interdisciplinary contribution to the study of media interrelations.

Lars Elleström, Växjö, Sweden

# Acknowledgements

The support and help of my colleagues at the Linnaeus University Centre for Intermedial and Multimodal Studies (IMS) have been invaluable for my work. I am also indebted to all the contributors to these two volumes, including Mieke Bal, who is currently a guest professor at IMS and kindly agreed to write the foreword. Moreover, I am grateful to those esteemed colleagues who acted as peer reviewers: Kamilla Elliott, Anne Gjelsvik, Pentti Haddington, Carey Jewitt, Christina Ljungberg, Jens Schröter, Crispin Thurlow and Jarkko Toikkanen. Finally, I would like to acknowledge the financial support from the Åke Wiberg Foundation and IMS, which made it possible to make the publication open access.

# About the Book

Although all of the contributions can be read as separate articles, the two volumes of *Beyond Media Borders* form a whole. Because the contributions are written in concert and include some dialogues, reading the publication in its entirety adds substantial value. Part I in Volume 1, 'The Model', contains the extensive theoretical framework presented in 'The Modalities of Media II: An Expanded Model for Understanding Intermedial Relations'. Part II in Volume 2, 'The Model Applied', offers a brief summary and some elaborations that end the two volumes. Between these two opening and closing parts, one finds Part II in Volume 1, 'Media Integration', and Part I in Volume 2, 'Media Transformation', which contain the majority of contributions. As explained in 'The Modalities of Media II', media integration and media transformation are not absolute properties of media and their interrelations, but rather analytical perspectives. Hence, the division of articles into two parts only reflects dominant analytical viewpoints in the various contributions; a closer look at them reveals that they all, to some extent, apply an integrational as well as a transformational perspective.

This is the first volume of *Beyond Media Borders*. The complete table of contents for both volumes is as follows:

### *Volume 1*

#### **Part I The Model**

1. The Modalities of Media II: An Expanded Model for Understanding Intermedial Relations *Lars Elleström*

#### **Part II Media Integration**


### *Volume 2*

#### **Part I Media Transformation**


#### **Part II The Model Applied**

8. Summary and Elaborations *Lars Elleström*

# Notes on Contributors

**Mark Crossley** is an associate professor at De Montfort University in Leicester, UK, specializing in contemporary intermedial theatre and applied performance. He recently edited *Intermedial Theatre: Principles and Practices* (Red Globe/Macmillan, 2019).

**Lars Elleström** is Professor of Comparative Literature at Linnaeus University, Sweden. He presides over the Linnaeus University Centre for Intermedial and Multimodal Studies and chairs the board of the International Society for Intermedial Studies. Elleström has written and edited several books, including *Divine Madness: On Interpreting Literature, Music, and the Visual Arts Ironically* (2002), *Media Borders, Multimodality and Intermediality* (Palgrave Macmillan, 2010), *Media Transformation: The Transfer of Media Characteristics Among Media* (Palgrave Macmillan, 2014), *Transmedial Narration: Narratives and Stories in Different Media* (Palgrave Macmillan, 2019) and *Transmediations: Communication Across Media Borders* (2020). He has also published numerous articles on poetry, intermediality, semiotics, gender, irony and communication. Elleström's recent publications, starting with the article 'The Modalities of Media: A Model for Understanding Intermedial Relations' (2010), have explored and developed basic semiotic, multimodal and intermedial concepts aiming at a theoretical model for understanding and analysing interrelations among dissimilar media.

**Iben Have** is Associate Professor of Media Studies at Aarhus University, Denmark. Her research focuses on media sound and audio media such as radio, podcasts and digital audiobooks. She has published the books *Listening to Television: Background Music in Audiovisual Media* (in Danish 2008), *Digital Audiobooks: New Media, Users, and Experiences* (2016), *Tunes for All: Music in Danish Radio* (2018) and *Quietude* (in Danish 2019). She is one of the founding editors of the academic online journal *SoundEffects*.

**Andy Lavender** is Vice-Principal and Director of Production Arts at Guildhall School of Music & Drama, London, UK. He is the author of *Performance in the Twenty-First Century: Theatres of Engagement* (2016).

**Heather Lotherington** is a tenured full professor at the Faculty of Education and the Graduate Program in Linguistics and Applied Linguistics at York University in Toronto, Canada. Her research focuses on language and literacy education in superdiverse, digitally connected societies. She is engaged in researching mobile language learning with a view to developing appropriate production pedagogies. Her latest book, co-edited with Cheryl Paige, is entitled *Teaching Young Learners in a Superdiverse World: Multimodal Perspectives and Approaches* (2017).

**Birgitte Stougaard Pedersen** is associate professor at the School of Communication and Culture, Aarhus University, Denmark. Her research interests cover sound, literature, digital culture, digital reading and phenomenological aspects of aesthetic experiences. She has published a book entitled *The Digital Audiobook: New Media, Users, and Experiences* (2016). From 2018 to 2021 she is leading the collaborative research project Reading Between Media—Multisensorial Reading in a Digital Age. She is one of the founding editors of the academic online journal *SoundEffects*.

**Chiao-I Tseng** is a senior researcher affiliated to the University of Bremen, Germany. Her research focuses on narrative designs across different media such as films and graphic and interactive media. Tseng specializes in developing frameworks for analysing narrative forms and contents, particularly methods for systematically tracking types of events and actions, character features, narrative space and motivation and emotion. Her publications include the monograph *Cohesion in Film* (Palgrave Macmillan, 2013) and over 25 international peer-reviewed journal articles and book chapters.

**Andrea Virginás** is associate professor at the Media Department of Sapientia University in Cluj-Napoca, Romania. Her research interests include film genres, European cinema, cultural theory, intermediality and narratology. Her main publications are *(Post)modern Crime: Changing Paradigms? From Agatha Christie to Palahniuk, from Film Noir to Memento* (2011); *The Use of Cultural Studies Approaches in the Study of Eastern European Cinema: Spaces, Bodies, Memories* (2016); and *Film Genres in 21st Century Eastern Europe: Global Puzzles and Small National Solutions* (Lexington Books, forthcoming 2020).

# The Model

# The Modalities of Media II: An Expanded Model for Understanding Intermedial Relations

*Lars Elleström*

L. Elleström

Linnaeus University, Växjö, Sweden

e-mail: lars.ellestrom@lnu.se

© The Author(s) 2021

L. Elleström (ed.), *Beyond Media Borders, Volume 1*, https://doi.org/10.1007/978-3-030-49679-1\_1

# 1.1 What Is the Problem?

All human beings use media, whether in the form of gestures, speech, news programmes, websites, music, advertisements or traffic signs. The collaboration of all these media is essential for living, learning and sharing experiences. Understanding mediality is one of the keys to understanding meaning-making in human interaction, whether directly through the capacities of our bodies or with the aid of traditional or modern external devices.

Media can be understood as communicative tools constituted by interrelated features. All media are multimodal and intermedial in the sense that they are composed of multiple basic features and can be thoroughly understood only in relation to other types of media with which they share basic features. We do not have standard communication on one hand and multimodal and intermedial communication on the other. Therefore, basic research in multimodality and intermediality is vital for further progress in understanding mediality—the use of communicative media—in general. Intermediality is an analytical angle that can be used successfully for unravelling some of the complexities of all kinds of communication.

Scholars have been debating the interrelations of the arts for centuries. Now, in the age of mass media, electronic media and digital media, the focus of the argumentation has been broadened to the interrelations among media types in general. One important move has been to acknowledge fully the materiality of the arts: like other media, they depend on mediating substances. For this reason, the arts should not be isolated as something ethereal, but rather seen as aesthetically developed forms of media. Still, several of the issues discussed within the old interart paradigm are also highly relevant to multimodal and intermedial studies. One such classical locus of the interart debate concerns the relation between the arts of time and the arts of space. In the eighteenth century, Gotthold Ephraim Lessing famously argued in *Laocoön* that there are, or rather should be, clear differences between poetry and painting (1984 [1766]). Lessing's core question of what implications spatiotemporal differences have for media remains acutely relevant today.

I believe it is equally important to highlight media differences and media similarities when trying to get a grip on multimodality and intermediality. If we have earlier seen a bent towards emphasising differences, recent decades have shown a tendency to deconstruct media dissimilarities, not least through the writings of W. J. T. Mitchell (1986), who criticised ideologically grounded attempts to find clear boundaries between media types and particularly art forms. Other scholars, like Shlomith Rimmon-Kenan, have emphasised that media differences come in grades: 'It seems to me that (1) most of the distinctions between media will turn out to be matters of degree rather than of absolute presence or absence of qualities; and (2) what is a constraint in one medium may be only a possibility in another' (Rimmon-Kenan 1989: 161). I feel that this is a productive view that still needs to be developed methodically. I find it as unsatisfying to continue talking about 'writing', 'film', 'performance', 'music' and 'television' as if they were like different people who can be married and divorced as to find repose in a belief that all media are always fundamentally blended in a hermaphroditical way.

In brief, one might say that the crucial 'inter' part of intermediality is a bridge, but what does it bridge over? If all media were fundamentally different, it would be hard to find any interrelations at all; if they were fundamentally similar, it would be equally hard to find something that is not already interrelated. However, media are both different and similar, and intermediality must be understood as a bridge between media differences that is founded on media similarities. The primary aim of this article is to shed light on precisely these differences and similarities in order to better understand intermedial relations.

I identify five tendencies in exploration of mediality, including what is known as multimodality and intermediality studies, which I find problematic. Although these tendencies were stronger a decade ago when I published the initial version of 'The Modalities of Media' (Elleström 2010), and several scholars have proposed ways to tackle them, they still exist.


compare no more than two media types at a time. Countless publications have focussed on word and image, word and music, film and literature, film and computer games, visual art and poetry and other constellations including two or perhaps three media types. While such studies are legitimate and may offer great insights, they usually delimit the field of vision in such ways that the outcomes are not helpful for analysing other forms of media interrelations. This results in a multitude of incompatible terms and concepts that blur the essential core features of media in general.


relationship between so-called texts and images are doomed to fail, leaving us with nebulous and insufficient ideas of 'mixtures' of text and image unless more fine-grained explanations are made. Similarly, the 'verbal' vs. 'visual' media dichotomy is inadequate. Although it may be practical for upholding rough differences between some media types, it is actually confusing and counterproductive when trying to understand media similarities and differences in a deeper way. Because being visual is a sensorial trait and being verbal is a semiotic trait, it is pointless to oppose the two. Some media are verbal, others are not; some media are visual, others are not; and some media are both verbal and visual.

5. *Media traits are not distinguished from media perception and signification*. Another recurring problem is the failure to distinguish between inherent media traits and the perception of those traits. This is understandable since it is, in practice, impossible to separate the two. Nevertheless, it is crucial to discriminate theoretically between the modes of existence of media and the perception of these modes in order to apprehend media differences and similarities. Although this is doubtless a slippery business, it is important to acknowledge that, for instance, the quality of time in a movie, understood as a mode of existence, is not the same as the time required to perceive a still photograph. Furthermore, time can be said to be present in many forms in the same medium. A still photograph, which does not have time as a mode of existence, can nevertheless represent temporal events. If one avoids taking notice of these intricacies, one is left with a featureless mass of only seemingly identical media that cannot be compared properly.

The goal of this article is to suggest solutions to these problems through the following means:



5. A nuanced investigation of the relations among basic media traits, perception and signification

I hope that fulfilling this objective will make it possible to understand better what media borders are and how they can be crossed, how one can comprehend the concept of multimodality in relation to intermediality, what it means to combine and integrate different media and how it is possible for different media types to communicate similar things.

My suggested conceptual solutions are not the only ones available. However, to keep my lines of argument as clear as possible, I refrain from engaging in excessive critique of other positions. Furthermore, my ambition is not to propose anything like a complete model for analysing communication; instead, the objective is to scrutinise precisely intermedial relations. Understanding such interrelations may be vital for various forms of investigations, and, depending on the aims and goals of those investigations, the concepts and principles that I propose here must be complemented with other research tools.

The term 'medium' is widely employed, and it would be pointless to try to find a straightforward definition that covers all the various notions that lurk behind the different uses of the word. Dissimilar notions of medium and mediality are at work within different fields of research, and there is no reason to interfere with these notions as long as they fulfil their specific tasks. Instead, I will circumscribe a concept that is applicable to the issue of human communication. However, a brief definition of medium would only capture fragments of the whole conceptual web and could be counterproductive. Instead, I will try to form a model (which actually constitutes a conglomerate of several models) that preserves the term 'medium' and still qualifies its use in relation to the different aspects of the conceptual web of mediality. Thus, the concept of medium can be divided into several deeply entangled concepts in order to cover the many interrelated aspects of mediality.

The core of this differentiation consists of setting apart four media modalities that may be helpful for analysing media products. A media product is a single physical entity or phenomenon that enables interhuman communication. Media products can be analysed in terms of four types of traits: material, spatiotemporal, sensorial and semiotic traits. I call these categories of traits media modalities. During the last decades, the notion of multimodality has gained ground (Kress and van Leeuwen 2001; Bateman 2008; Kress 2010; Seizov and Wildfeuer 2017), stemming from social semiotics, education, linguistics and communication studies. Although my notion of media modalities is inspired by this research tradition, it differs significantly in ways that will become evident. Likewise, I am strongly influenced by the research field of intermediality, which has its historical roots in aesthetics, philosophy, semiotics, comparative literature, media studies and interart studies (for details, see Clüver 2007, 2019; Rajewsky 2008). These research traditions have been decisive for how I have come to circumscribe the various aspects of mediality.

As my arguments unfold, I will distinguish among media products, technical media of display and media types (basic media types and qualified media types). Basic and qualified media types are categories of media products, whereas technical media of display are the physical entities needed to realise media products and hence media types. Consequently, the term 'medium', when used without specifications, generally refers to all of these media aspects.

Thus, various media aspects are not groups of media. Instead, they are complementary, interwoven, theoretical aspects of what constitutes mediality. Accordingly, the wide concept of medium that I will present in this article comprises several intimately related yet divergent notions that I will distinguish terminologically. I believe that multimodality and intermediality cannot be fully understood without grasping the fundamental conditions of every single media product, and these conditions constitute a complex network of both physical qualities of media and various cognitive and interpretive operations performed by the media perceivers. For my purpose, media definitions that deal only with the physical aspects of mediality are too narrow, as are media definitions that only emphasise the social construction of communication. Instead, I will emphasise the critical meeting of the physical, the perceptual, the cognitive and the social.

# 1.2 What Are Media Products and Communicating Minds?

#### *1.2.1 A Medium-Centred Model of Communication*

The starting point of this investigation of media interrelations consists of an examination of the concept of media product, which is the core of all further elaborations in this study. To delineate the concept of a media product properly and thoroughly, it is necessary to have a developed model of human communication that is devised for highlighting precisely the notion of medium (Elleström 2018a, b, c). Although I have designed my model to scrutinise primarily human communication, it is at least partly applicable to communication among other animals as well. It consists of what I take to be the smallest and fewest possible entities of communication and their essential interrelations. If one of these entities or interrelations is removed, communication is no longer at hand; thus, the model is irreducible. I submit that three indispensable and interconnected entities can be discerned:

1. Something being transferred
2. Two separate places between which the transfer occurs
3. An intermediate stage that makes the transfer possible


These three entities of communication have been circumscribed in various ways in established and influential communication models. In the following, I refer to some of these classical models (from linguistics, media and communication studies and cultural studies) to anchor my concepts in well-known communication paradigms and make clear the many ways in which I depart from the standard concepts. Although it is debatable, I have kept the traditional concept of transfer because I think it is part and parcel of the concept of communication. While the term 'transfer' may have misleading associations with material things being moved around, one can hardly avoid the deep experiential similarity between sharing and transferring material and mental entities—as in human communication. These issues will be continuously scrutinised in the ensuing discussions.

Roman Jakobson used the term 'message' to capture the first entity, 'something being transferred', but did not delineate the notion underlying his term (Jakobson 1960). Wilbur Schramm vacillated between two incompatible arguments: that there is no such thing as an entity being transferred, and that the transferred entity is a 'message'—not ideas or thoughts (Schramm 1971). Stuart Hall was also rather vague when he implied that 'meaning' is transferred in communication. Instead of clearly stating that communication is about transferring meaning, he emphasised that 'meaning structures 1' and 'meaning structures 2' may differ; there are degrees of 'symmetry' and degrees of 'understanding' and 'misunderstanding' (Hall 1980: 131). In other words, if there is transfer of meaning in communication, this involves transformation of meaning. This contention is certainly feasible.

While the second entity, 'two separate places between which the transfer occurs', arguably consists of two units, they can only be outlined in relation to each other. Jakobson's terms were 'addresser' and 'addressee', but Schramm preferred 'communicator' and 'receiver'. Finally, Hall avoided outlining the two separate places between which the transfer occurs as persons; in fact, he avoided pointing to such places at all. However, his notion that 'meaning structures' are to some extent transferred implies that such meaning structures indeed need to be located at places that are capable of holding 'meaning'—which must be understood as the minds of human beings, given that human communication is at stake.

The third entity, 'an intermediate stage that makes the transfer possible', has also been conceptualised differently. Jakobson's 'contact' notably incorporates both a material and a mental aspect; it was described as 'a physical channel and psychological connection between the addresser and the addressee' (1960: 353). Schramm used the term *message* to represent not only the transferred entity, but also the intermediate stage of communication (he seems to understand the message as something that is both 'transferred' and 'transferred through'). Importantly, however, Schramm described the transmitting message not only as a material entity—such as 'a letter'—but also as 'a collection of signs', thus indicating the capacity of the material to produce mental significance through signs (1971: 15). Hall also emphasised the semiotic nature of the intermediate stage of communication. His term for this entity was *'meaningful' discourse*; however, his terminology is generally rather incoherent, resulting in uncertainty about the more precise nature of the intermediate stage.

Regarding the first entity of communication, 'something being transferred', there is certainly a point in Schramm's notion that no ideas or thoughts are transferred in communication. As Hall indicated, transfer of meaning is likely to entail a change of meaning; this modification may be only slight or more radical. Nevertheless, I claim that communication models cannot do without the notion of something being transferred. If there is no correlation at all between input and output, there is simply no communication, given the foundational idea that to communicate is 'to share'; thus, a concept of communication without the notion of something being transferred is nonsensical. However problematic it may be, the notion of something being transferred must be retained and painstakingly scrutinised, instead of being avoided. To begin with, I think it is clear that one cannot confine the transferred units or features to distinct and consciously intended conceptions, and perhaps not even to 'ideas' as Schramm understands them.

My suggestion is to use the term 'cognitive import' to refer to those mental configurations that are the output and input of communication (thus, 'import' should not be understood here in contrast to 'export'). The notion that I want to convey with this term is closely related to those captured by terms such as 'meaning', 'significance' and 'ideas', although the term 'cognitive import' is perhaps less burdened with certain connotations that a term such as 'meaning' seems to have difficulty shedding. Meaning is often understood as a rather rigid concept of verbal, firm, definable or even logical sense. Instead, cognitive import should be understood as a broad notion that also includes vague, fragmentary, undeveloped, intuitive, ambiguous, non-conceptual and pragmatically oriented meaning that is relevant to a wide range of media types and communicative situations. It is imperative to emphasise that although cognitive import is always a result of mind-work, cognition is embodied and not always possible to articulate using language; hence, according to my proposed model, communication cannot be reduced to simply communication of verbal or verbalisable significance.

The second entity, 'two separate places between which the transfer occurs', is usually construed as two persons. However, this straightforward notion is not precise enough for my purposes. Because it is imperative to be able to connect mind and body to different entities of the communication model, it is also essential to avoid crude notions such as that of Jakobson's addresser–addressee and Schramm's communicator–receiver. These notions give the impression that the transfer necessarily occurs between two persons consisting of minds and bodies and with a third, separate, intermediate object in the middle, so to speak, an intermediate object in the form of a 'message' that is essentially disconnected from the communicating persons. It is better to follow Hall's implicit idea that communication occurs between sites that are capable of holding 'meaning'. Warren Weaver's description of communication as something that occurs between 'one mind' and 'another' is simple and to the point (Weaver 1998 [1949]).

My suggestion is to use the terms 'producer's mind' and 'perceiver's mind' to refer to the mental places in which cognitive import appears. First, there are certain mental configurations in the producer's mind, and then, following the communicative transfer, there are mental configurations in the perceiver's mind that are at least remotely similar to those in the producer's mind. The term 'mind' should generally be understood as denoting (human) consciousness that originates in the brain and is particularly manifested in perception, emotion, thought, reasoning, will, judgment, memory and imagination. The term 'mental' refers to everything relating to the mind. The term 'cognition' should be understood as representing those mental processes that are involved in gaining knowledge and comprehension, including, among other higher-level functions of the brain, thinking, remembering, problem-solving, planning and judging. However, even though the mind and its cognition are founded on cerebral processes, mental activities are in no way separated from the rest of the body. On the contrary, I subscribe to the idea that the mind is profoundly embodied—formed by experiences of corporeality (Johnson 1987).

Most of the researchers that I refer to here have recognised, either explicitly or implicitly, that the third entity, 'an intermediate stage that makes the transfer possible', is in some way material. As stated succinctly in a more recent publication, any act of communication 'is made possible by some form of *concrete reification* of the message, which, at its most elementary level, must abide by physical laws to exist and take shape' (Bolchini and Lu 2013: 398). Furthermore, Schramm and Hall clearly discussed the intermediate stage in terms of signs. In line with this, I suggest that the intermediate entity connecting two minds with each other is always in some way material, understood broadly as consisting of physical entities or phenomena, although it clearly cannot be conceptualised only in terms of materiality. As it connects two minds in terms of a transfer of cognitive import, it must be understood as materiality having the capacity to trigger certain mental responses.

My suggestion is to use the term 'media product' to refer to the intermediate stage that enables the transfer of cognitive import from a producer's to a perceiver's mind (what Irina O. Rajewsky called 'medial configuration' (2010)). As the bodies of these two minds may well be used as instruments for the transfer of cognitive import, they have potential to attain the function of media products. I propose that a media product may be realised by either non-bodily or bodily matter (including matter emanating directly from a body), or a combination of the two. This means that the producer's mind may, for instance, use either non-bodily matter (say, paper) or her own body and its immediate extensions (moving arms and sound produced by the vocal cords) to realise media products such as printed texts, gestures and speech. Furthermore, the perceiver's body may be used to accomplish media products; for instance, the producer may realise a painting on the perceiver's skin or push her gently to communicate the desire that she move a little. Additionally, other bodies, such as the bodies of actors, may be used as media products.

In contrast to influential scholars such as Marshall McLuhan, who conceptualised media as the 'extensions of man' in general (McLuhan 1994 [1964]), I define media products as 'extensions of mind' in the context of inter-human communication. Thereby, I avoid the classical distinction in communication studies between mediated and interpersonal communication, that is, between communication that needs and communication that supposedly does not need mediation. This distinction has been criticised because of practical difficulties in upholding it (see Rice 2017). I avoid it also because of the theoretical and more profound obstacle of thinking about interpersonal communication as not being mediated (it would be absurd to consider interpersonal communication independent of media capacities and media limitations). The only thing that justifies such a distinction is that so-called interpersonal communication is entirely dependent on specific (but not fundamentally different) forms of media products, namely, those that rely on the producer's and the perceiver's human bodies and their immediate extensions instead of external devices.

#### *1.2.2 Media Products*

Given that being a media product must be understood as a function rather than an essential property, virtually any material existence can be used as one, including not only solid objects but also all kinds of physical phenomena that can be perceived by the human senses. In addition to those forms of media products that are more commonly categorised as such (like written texts, songs, scientific diagrams, warning cries and road signs), there is an endless row of forms of physical objects, phenomena and actions that can function as media products, given that they are perceived in situations and surroundings that encourage interpretation in terms of communication. These include nudges, blinks, coughs, meals, ceremonies, decorations, clothes, hairstyles and make-up. In addition, dogs, wine bottles and cars of certain makes, sorts and designs may well function as media products to communicate the embracing of certain values or simply wealth, for instance. Within the framework of a trial, surveillance camera footage and spoken word testimony from witnesses both function as media products, as do fingerprints, DNA samples and bloodstains presented by the prosecutor—because they are drawn into a communicative situation.

Because the function of being a media product is initially triggered by the producer's mind, media products can be said to be *produced* by the producer's mind. As I define these concepts, producing a media product does not necessarily mean fabricating it materially. Fingerprints presented in a criminal trial are evidently produced by the prosecutor not in the sense that she materially fabricates them, but in the sense that she gives them a communicative function by placing them in the context of the trial.

It may also be the case that someone uses an 'old' media product, produced by someone else, to communicate. For instance, one could play a recorded love song, written and sung by others, to communicate love to someone special on a certain occasion. In this way, the recorded song, which already has the function of a media product, is appropriated, so to speak, and given a more specific and partly new communicative function. Like the fingerprints (disregarding other differences), the recorded love song is not fabricated by the (new) producer's mind, but rather exposed and given a (new) communicative function.

Given this conceptualisation, it is pointless to try to distinguish between physical existences that are and that are not actual media products. Instead, it is important to have a clear notion of the properties of physical existences that confer the function of media products on them. Clearly, these properties, which I will investigate in the following, are in no way self-evidently present. Perceiving something as a media product is a question of being attentive to certain kinds of phenomena in the world. As humans have been able to communicate with each other for thousands and thousands of years, this attention is partly passed on by heredity, but it is also deeply formed by cultural factors and the experience of navigating within one's present surroundings. Knowledge of musical performance traditions, for example, leads to specific attention to certain details while others may be ignored; thus, accidental noises and random gestures may be sifted out as irrelevant for the musical communication and not part of the media product. Practical experience of the environment normally makes us pay attention to what happens on the screen of a television set rather than to its backside. However, if the television set is used in an artistic installation, or if a repair person tries to explain why it does not work by way of pointing to certain gadgets, it may be the backside that should be selected for attention in order to achieve the function of a media product.

Thus, media products are cultural entities that depend on social praxis; media products and their basic characteristics are (more or less) delimited units formed by (often shared) selective attention on sensorially perceptible areas of communication that are believed to be relevant for achieving communication in a certain context. This means that there is no such thing as a media product 'as such'. I argue that not even a written text is a media product in itself; it is only when its function of transferring cognitive import among minds is realised that it can be conceptualised as a media product. The archaeologist who inspects the marks on a bone and believes that they are caused by accidental scraping is not involved in communication. If the archaeologist believes that the marks are some sort of letters in an unknown language, she may be engaged in elementary communication to the extent that she understands a communicative intent. If the marks are eventually deciphered, communication that is more complex may result. If the deciphering actually turns out to be mistaken, the belief that communication occurred is an illusion. Of course, border cases like these could also be exemplified by everyday interaction among people who may or may not be mistaken about the significance of all kinds of movements, glances and sounds.

McLuhan suggestively argued that not only the spoken word, the photograph, comics, the typewriter and television are media, but also money, wheels and axes (1994 [1964]: 24). In relation to that, I argue that whereas nothing is a media product as such, virtually everything can attain the function of a media product. In that sense, money, wheels and axes *may* also function as media products, although they do not actually do so as regularly as spoken words and photographs.

#### *1.2.3 Elaborating the Communication Model*

I will now display my communication model in the form of a visual diagram (Fig. 1.1) and explain some of its implications. Construing this diagram from left to right, the act of communication starts with certain cognitive import in the producer's mind. Consciously or unconsciously, the producer forms a media product, which may be taken in by some perceiver. Thus, the media product makes possible a transfer of cognitive import from the producer's mind to the perceiver's mind. This is certainly not a transfer in the strong sense that the cognitive import as such passes through the media product (which lacks consciousness), but in the sense that there is, ultimately, cognitive import in the perceiver's mind that bears some resemblance to the cognitive import in the producer's mind.

**Fig. 1.1** A medium-centred model of communication (Elleström 2018a: 282)

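The visual diagram contains the three entities of communication circumscribed above:

1. Something being transferred: cognitive import
2. Two separate places between which the transfer occurs: producer's mind and perceiver's mind
3. An intermediate stage that makes the transfer possible: media product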


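Additionally, the visual diagram displays four essential interrelations among these entities:

1. An act of production 'between' the producer's mind and media product
2. An act of perception 'between' the media product and the perceiver's mind
3. Cognitive import 'inside' the producer's mind and the perceiver's mind
4. A transfer of cognitive import 'through' the media product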


I will now elaborate on these interrelations, especially the fourth one. I submit that the notion of media product, and the question of how cognitive import may be transferred through a media product, is essential for understanding communication.

The first interrelation, 'an act of production "between" the producer's mind and media product', is always initiated by the producer's mind and always, to begin with, effectuated by the producer's body. Sometimes, this primary bodily act will immediately result in a media product. For instance, when one person begins talking to another person who is standing beside her, the speech emanating from the vocal cords constitutes a media product that reaches the perceiver directly. At other times, the primary bodily act is linked to subsequent stages of production, and the primary bodily act can be connected to a broad range of actions and procedures before a media product comes to be present for a perceiver. For instance, talking through a telephone often requires manual handling of the telephone in addition to the activation of the user's vocal cords, and always requires constructed, technological devices that are suitable to transmit the initial speech to another place, in which the actual media product is constituted, that is, the speech that can be heard by the perceiver. Similarly, a child drawing a picture for her father who is sitting at the same kitchen table only has to perform, in principle, one primary bodily act in order to create a media product that is immediately available for the perceiver. However, if the father is in another place, additional stages of actions and procedures must be added: the drawing may be posted and physically relocated, or scanned and emailed, after which it appears in a slightly transformed way as a media product that is realised by a computer screen. Thus, the act of production may be simple and direct, as well as complex and indirect. It may also include stages of storage.

There is an abundance of devices for the production and storage of media products. Although they are involved in mediality and are often called media of production and media of storage, I prefer not to call them media, in order to keep the terminology clear. Thus, cameras are technical devices of production (with the capacity to register light chemically or physically) that can be said to be attached, more or less distantly, to technical devices of display with various properties, such as silver-plated sheet copper, photographic paper or a screen (a computer screen or a display on the camera itself). Book pages are technical devices of storage and technical devices for the display of visual sensory configurations. In contrast, because they quickly disappear, sound waves generated by vocal cords do not store sensory configurations but only display them.

The second interrelation, 'an act of perception "between" the media product and the perceiver's mind', is always initiated by the perceiver's sense organs and always, to some extent, followed by and entangled with interpretation. Interpretation should be understood as all kinds of mental activities that somehow make sense of the sensory input; these activities may be both conscious and unconscious and are no doubt already present in a basic way when the sense impressions are initially processed. Thus, compared to the potentially extensive act of production, the act of perception is brief and quickly channelled into interpretation, which of course occurs in the perceiver's mind. Nevertheless, the type, quality and form of sensory input provided by the media product, and actually taken in by the perceiver's sense organs, are crucial for the interpretation formed by the perceiver's mind.

For the moment, I will only comment briefly upon the third interrelation among the entities of communication, 'cognitive import "inside" the producer's mind and the perceiver's mind'. One cannot state, without intricate implications, that there is a certain amount of confinable cognitive import inside a mind, and it is undoubtedly difficult to judge the actual extent of similarity between the two amounts of cognitive import in the two minds. Deciding this in a more precise way is probably beyond the reach of known scientific research methods. However, I find the notion that the transferred cognitive import is only one part of the producer's and the perceiver's minds unproblematic. The cognitive import is 'inside' the minds, in the sense that it is closely interconnected with a multitude of other cognitive entities and processes and, ultimately, with the total sum of mental activities in general that surrounds it.

The fourth interrelation, 'a transfer of cognitive import "through" the media product', is central for my arguments. Until now, I have only described the media product simply as the entity of communication that enables a transfer of cognitive import from a producer's mind to a perceiver's mind—a material entity that has the capacity of triggering mental response. However, to give a somewhat more detailed account of this notion, the very capacity itself must be scrutinised.

Of course, the transfer of cognitive import is only partly comparable to other transfers, such as the transfer of goods between two cities by train. The cognitive import transfer is not a material transfer but a mental transfer aided by materiality. In one respect it can be compared to teleportation, which is the transfer of energy or matter between two points without traversing the intermediate space: the cognitive import is indeed transferred between two points (two minds), and, contrary to the transfer of goods, it does not traverse the intermediate space. Nevertheless, as the transfer depends on the media product, it is reasonable to say that the cognitive import goes 'through' the media product. Actually, the media product is neither a neutral object of material transfer, like a freight car, nor an intermediate space without effect, as in teleportation; it constitutes a crucial stage of transition, not only transmission. As Beate Schirrmacher suggested to me in personal communication, the transfer of cognitive import 'through' the media product might alternatively be described as 'a chain of interactions' involving producer's mind, media product, perceiver's mind and everything in between.

Explaining this in some detail requires attention to the whole spectrum, from the material to the mental. My angle for coping with this challenge is to suggest that all media products can be analysed in terms of four kinds of basic traits. As already noted, I call these categories media modalities (Elleström 2010). I will describe these modalities briefly to prepare the ground for further elaboration of the communication model and then come back to them in a lengthier discussion later in the article.

The first three modalities are the material modality, the spatiotemporal modality and the sensorial modality. Media products are all material in the sense that they may be, for instance, solid or non-solid, or organic or inorganic, and comparable traits like these belong to the material modality. All media products also have spatiotemporal traits: a media product that has neither spatial nor temporal extension is inconceivable. Hence, the spatiotemporal modality consists of comparable media traits such as temporality, stasis and spatiality. Furthermore, media products must reach the mind through at least one sense. Hence, sensory perception is the common denominator of the media traits belonging to the sensorial modality: media products may be visual, auditory, tactile and so forth.

Of course, these kinds of traits are not unknown to communication researchers. For instance, Hall discussed the two sensory channels of television (1980), David K. Berlo highlighted all five external senses (1960), and Schramm at least briefly mentioned that 'a message has dimensions in time or space' (1971: 32). However, a thorough understanding of the conditions for communication requires systematic attention to all modalities. It is clear that cognitive import of any sort cannot be freely communicated by any kinds of material, spatiotemporal and sensorial traits. For instance, to use some blatant examples, complex assertions cannot easily be transferred through the sense of smell, and it is more difficult to effectively transfer detailed series of visual events through a static media product than through a temporal media product.

The fourth modality is the semiotic modality. Whereas the semiotic traits of a media product are less palpable than the material, spatiotemporal and sensorial traits, and in fact are entirely derived from them, they are equally essential for realising communication. The sensory configurations of a media product do not transfer any cognitive import until the perceiver's mind comprehends them as signs. In other words, the perceived sensory configurations are meaningless until one understands them as representing something through unconscious or conscious interpretation. This is to say that all objects and phenomena that act as media products have semiotic traits, by definition. By far the most successful effort to define the basic ways in which to create meaning in terms of signs has been Charles Sanders Peirce's foundational trichotomy icon, index and symbol.

Understanding this trichotomy requires us to comprehend an even more foundational semiotic trichotomy: the three sign constituents. In brief, Peirce held that signs, often called *representamens*, stand for *objects* and that this relationship results in *interpretants* in the perceiver's mind: 'A sign, or *representamen*, is something which stands to somebody for something in some respect or capacity'. This means that the *representamen* stands for an *object* in some respects and thus 'creates in the mind of that person' an *interpretant* (Peirce 1932: CP2.228 [c.1897]). This entails that signs are not pre-existing static items, but rather dynamical functions established by relational constituents that exist only in interaction with each other. Signification is a mental process, although both representamens and objects may be connected to external elements or phenomena; however, the interpretant is entirely in the mind. I would argue that my notion of cognitive import created in the perceiver's mind in communication is a vital example of Peirce's notion of interpretants resulting from signification.

Hence, a media product can be understood as an assemblage of representamens that, due to their material, spatiotemporal and sensorial traits, together with contextual factors, represent certain objects (that are available to the perceiver), thus creating interpretants (cognitive import) in the perceiver's mind.

Peirce defined his three central sign types based on some fundamental cognitive abilities that make representamen–object relationships possible. Icons stand for (represent) their objects based on similarity, indices do so based on contiguity, and symbols rely on habits or conventions (1932: CP2.247–249 [c.1903]; Elleström 2014a: 98–113). I take iconicity, indexicality and symbolicity to be the main media traits within the semiotic modality, which is to say that no communication occurs unless cognitive import is created through at least one of the three sign types. Iconicity, indexicality and symbolicity are simply indispensable for semiosis, and they work because of our capacity to perceive similarities and contiguity and to form habits.

I use the term 'semiosis' here to denote the widest and least strict notion of sign activity and sign use, where signs are always to be understood as results of interpretation—not inherent qualities of objects or phenomena. 'Semiosis' is a catch-all term for everything that involves signs, which may be applied when there is no need for precision. Peirce himself only used the term sporadically, without ever giving it a prominent or particularly specific place in his vocabulary (something close to a definition can be found in 1934: CP5.484 [c.1907]). Briefly, I take *signification* to be the process of meaning creation. While signification is always a mental process, it may also include material aspects; for instance, the mind may perceive physical qualities through media products. *Representation* should be understood more specifically as representamens triggering the presence of objects in the mind; thus, representation is a core part of signification.

Again, processes of signification are not unknown in communication research. Among the scholars quoted in this article, Schramm clearly related to some basic semiotic features. For instance, he accurately noted that 'it is just as meaningful to say that B [the receiver] acts on the signs [the message], as that they act on B' (1971: 22). Indeed, the mind of the perceiver is very active in construing the signs of the media product. In addition, Hall spoke in terms of semiotics, albeit with a distinct linguistic bias. Peirce's semiotic framework is fruitful because it incorporates sign types that work far outside of the linguistic domain, dominated by symbolicity in the form of verbal language.

Furthermore, I wish to emphasise the notion that a semiotic perspective must be combined with a material perspective. Communication is equally dependent on the material, spatiotemporal, sensorial and semiotic modalities. What one takes to be represented objects called forth by representamens (objects such as persons, things, events, actions, feelings, ideas, desires, conditions and narratives) are results of both the basic features of the media product as such (the mediated material, spatiotemporal and sensorial traits) and of cognitive activity, connected to surrounding factors, resulting in representation. While signification is ultimately about mind-work, in the case of communication this mind-work is fundamentally dependent on the physical appearance of the media product, although some representation is clearly more closely tied to the appearance of the medium, whereas other representation is more a result of interpretation and hence of the context of the perceiving mind.

As with material, spatiotemporal and sensorial traits, the semiotic traits of a media product offer certain possibilities and set some restrictions. Obviously, cognitive import of any sort cannot be freely created based on just any sign type. For instance, the iconic signs of music can represent complex feelings and motional structures that are largely inaccessible to the symbolic signs of written text; conversely, written symbolic signs can represent arguments, and the appearance of visual objects, with much greater accuracy than auditory icons. Flagrant examples like these are only the tip of the iceberg in terms of the (in)capacities of signs based on similarity, contiguity and habits or conventions, respectively. Therefore, the semiotic traits of the medium make possible—but also delimit—the communicative transfer of cognitive import through a media product.

In line with this proposal, it is appropriate to bring the notion of noise into the discussion. Many researchers engaged in communication of meaning have picked up Claude E. Shannon's (1948) idea that signal disturbances in communication can be conceptualised as noise. The basic phenomenon of disruptions that occur on the way from the producer's mind to the perceiver's is clearly relevant to the transfer of cognitive import. For instance, Schramm noted that noise is 'anything in the channel other than what the communicator puts there' (1955: 138). As an example, speech can be disturbed by other sounds, and a motion picture can be disrupted because of material decay or censorship. Noise in this sense occurs both in the act of production and in the act of perception. My visual model of communication (Fig. 1.1) shows this noise as disruptions in the arrow representing transfer of cognitive import—both before and after the transfer through the media product—reflecting the unsatisfactory conditions of production and perception.

The problem with the notion of noise when applied to communication of meaning, or cognitive import, is that it might imply that the complete absence of noise would bring about complete transfer of cognitive import, as in the case of technical transmission of computable data, which is clearly not the case. The technological notion of noise is simply not sufficient to understand communication of cognitive import. According to Hall, 'distortions' or 'misunderstandings' are also due to, among other things, 'the asymmetry between the codes of "source" and "receiver" at the moment of transformation into and out of the discursive form' (1980: 131).

This contention is definitely a step in the right direction in terms of offering a more complex notion of possible disruptions in the communication of cognitive import. However, it does not provide a more complete view of restraining factors in the transfer of cognitive import. It is also important to emphasise that creators of media products generally do not have access to, or do not master, more than a few media types. Consequently, they are often unable to form media products that have the capacity to create cognitive import in the perceiver's mind that is similar to the cognitive import in their own mind. Therefore, I argue that important restraining factors of communication are found in the material, spatiotemporal, sensorial and semiotic traits of the media products.

Many exceedingly complex factors are clearly involved when the perceiver's mind forms cognitive import. Furthermore, as Mary Simonson has accurately noted, media products are sometimes 'envisioned and created precisely so that they will likely not transmit meanings and ideas in a straightforward way' (2020: 4). My proposed model highlights one particular cluster of crucial factors: media products have partly similar and partly dissimilar material, spatiotemporal, sensorial and even semiotic traits, and the combination of traits to a large extent—although certainly not completely—determines what kinds of cognitive import can be transferred from the producer's mind to the perceiver's mind. Songs, emails, photographs, gestures, films and advertisements differ in various ways concerning their material, spatiotemporal, sensorial and semiotic traits and hence can only transfer the same sort of cognitive import to a limited extent. Figure 1.1 shows this communicative restriction as disruptions in the arrow representing transfer of cognitive import as it passes through the media product.

#### *1.2.4 Communicating Minds*

Outlining only the fewest possible entities of communication and their essential interrelations, my suggested model of communication (Fig. 1.1) is irreducible but certainly expandable. I have already fleshed it out by suggesting various ways of conceptualising the notion of media product in some detail. I will now also sketch a more multifaceted comprehension of communicating minds: the minds of the producer and the perceiver and their interrelations.

The minimal level of complexity consists of simply one mind producing a single media product of which another perceiving mind makes sense. This, I believe, is the core of human communication. In actual communicative situations, however, the perceiver's mind is often also a producer's mind. Based on the cognitive import generated by an initial media product, the perceiver becomes a producer in terms of creating another media product (of the same or another kind) that reaches an additional perceiver's mind, thereby forming new cognitive import that is more or less similar to that in earlier producers' minds. Hence, a communicative chain is formed. When the communicative chain involves the initial producer and perceiver constantly changing roles and forming new media products (of the same or another kind), we have two-way communication. The creation of new media products in two-way communication is often conceptualised as feedback that may result in the creation of cognitive import that is either only slightly or significantly developed. Communicative chains that are uni- and bidirectional may be combined in a multitude of ways.

Furthermore, media products are often produced or perceived by several minds. For instance, a motion picture is normally both produced and perceived by more than one mind. While the minds of scriptwriters, directors, actors and many others combine to create the motion picture, the audience consists of a multitude of perceiving minds. In contrast, a plenary talk is, as a rule, produced by one mind but perceived by many. An unsuccessful theatre performance may be produced by many minds but perceived (from an off-stage position) by only one.

Another level of complexity consists of the case when perceivers take in their own media product. Although I would not say that pure thinking is communication (as suggested by Berlo [1960: 31]), perception of one's own media product created earlier may mean that the mind tries to construe cognitive import on the basis of the media product rather than on the memory of what one had in mind on the occasion of production. In this case, a transfer of cognitive import actually occurs through a media product from one mind to another, in the sense that the mind, when perceiving the media product, is in a different state than it is during production. The effort of writing a scholarly text is a good example of this sort of internal communication: communication sometimes fails when one cannot understand the words one has written just the day before.

Of course, one can also combine this level of complexity with others, as in the case of interactive video games. Such games are normally constructed and designed by several minds, but the point here is that the actual media products (the many realised sensory configurations that are mediated by screens and sounding loudspeakers each time the game is being played) are also created by the players. Accordingly, we have a kind of communication involving several producing minds that have created certain frames for interaction and resulting consequences (when designing the game), one or several producing minds that create the actual media product in their interaction with the evolving media product (when playing the game) and one or several perceiving minds that are actually the same as those minds that interact with and hence produce the media product: the specific realisation of the possibilities of the video game. Naturally, additional minds that are not co-producers (i.e. an audience) may also perceive this media product.

The notion of the producer's mind and perceiver's mind may well be simple, but it is certainly not reductive. On the contrary, it offers a solid basis for analysing all kinds of communicative complexities. While the examples above do not exhaust the intricacies, they may hint at the many complicated ways in which producers' and perceivers' minds may be positioned in various communicative circumstances.

In addition to developing the basic notion of transfer of cognitive import *between* two separate minds, I will now also elaborate on the notion of cognitive import *in* the producer's and especially the perceiver's mind. As the irreducible model of communication only states that cognitive import is transferred between minds, it is appropriate to suggest not only a way of understanding how it is formed by basic media traits (which was done in the section on media modalities), but also a way of comprehending how it is moulded by surrounding factors. In addition to its innate basic capacity to perceive and interpret mediated qualities, the mind is inclined to form cognitive import based on acquired knowledge, experiences, beliefs, expectations, preferences and values—preconceptions that are largely shaped by culture, society, geography, history and various communities in the mind's surroundings. This concept is immensely important for the outcome of communication. The perceiver's mind acts upon the perceived media product on the basis of both its hardwired cognitive capacities and its attained predispositions. Evidently, the cognitive import that was stored in the mind before the media product was perceived has a significant effect—to varying degrees—on the new cognitive import formed by communication.

This widely recognised fact has been extensively theorised in various ways. Jakobson discussed it in terms of 'a context [that is] seizable by the addressee, and either verbal or capable of being verbalized' (1960: 353). While context is important for all kinds of communication, I think it is a mistake—even for a restricted focus on verbal communication—to say that the context must be verbalisable in order for it to be relevant. Hall distinctly emphasised the 'social relations of the communication process as a whole' and the 'frameworks of knowledge' (1980: 129–130) and discussed them in detail. The research area of hermeneutics has minutely scrutinised these and other issues that are central to the formation of meaning in a broad context.

Here I will only suggest a complementary semiotic way of circumscribing how surrounding factors form cognitive import in communication. Although the focus is on the perceiver's mind, the suggested basic principles are also relevant for the formation of cognitive import in the producer's mind.

I have already established that the representamens that initiate semiosis in communication come from sensory perception of media products. One perceives configurations of sound, vision, touch and so forth that are created or brought out by someone and understood to signify something; they make objects (in the Peircean sense) present to the perceiver's mind and result in interpretants based on the representamen–object relation. These interpretants, and interpretants resulting from further chains of semiosis, constitute the cognitive import being transferred in communication. The objects emerge from earlier perceptions, sensations and notions that are stored in the perceiver's mind, either in long-term or short-term memory that may also cover ongoing communication. 'Earlier' could be a century before or a fraction of a second before.

In semiotic terms, the stored mental entities may be direct perceptions from outside of communication, interpretants from semiosis outside of communication, interpretants from semiosis in earlier communication or interpretants from semiosis in ongoing communication. This is to say that objects of semiosis always require 'collateral experience' (Peirce 1958: CP8.177–185 [1909]; cf. Bergman 2009) that may derive both from within and without ongoing communication. In other words, collateral experience may be formed by semiosis inside the spatiotemporal frame of the communicative act or stem from other earlier involvements with the world, including former communication as well as direct experience of the surrounding existence.

In line with this twofold origin of collateral experience, I distinguish between two utterly entwined but dissimilar areas in the mind of the perceiver of media products: the *intracommunicational* and the *extracommunicational domains*. This distinction emphasises a difference between the formation of cognitive import in ongoing communication and what precedes and surrounds it (related but divergent distinctions in cognitive psychology have been proposed by Brewer [1987: 187]). I also find it appropriate to make a corresponding distinction between *intracommunicational* and *extracommunicational objects*, both of which are formed by collateral experience from their respective domains.

The extracommunicational domain should be understood as the background area in the mind of the perceiver of media products. It comprises everything with which the perceiver is already familiar. As it is a mental domain, it does not consist of the world as such but rather of what the perceiver believes and knows through perception and semiosis. The perceiver's stored experiences not only consist of raw perceptions, such as foundational sensations of being a body that physically interacts with a spatiotemporal surrounding, but also of perceptions that have been contemplated and processed by the mind through semiosis. This involves estimations and evaluations of encounters with people, societies and cultures that are consciously or unconsciously accepted, put in doubt or rejected. It involves shared experiences and ideas, cultural norms and common beliefs, but also more individual understandings, impressions and values, all of which are well known to be crucial factors for the outcome of communication.

The extracommunicational domain includes experiences of what one presumes to be more objective states of affairs (dogs, universities, music and statistical relations), what one presumes to be more subjective states of affairs (states of mind related to individual experiences) and everything in between. Thus, it is actually formed in one's mind not only through semiosis and immediate external perception but also through interoception, proprioception and mental introspection. Hence, the extracommunicational/intracommunicational domain distinction is different from exterior/interior to the mind, world/individual, material/mental and objective/subjective.

Vital parts of the extracommunicational domain are constituted by perception and interpretation of media products. Therefore, former communication is very much part of what precedes and surrounds ongoing communication. Together, non-communicative and communicative prior experiences form 'a horizon of possibilities', to borrow an expression from Marie-Laure Ryan (1984: 127). The extracommunicational domain is the reservoir from which entities are selected to form new constellations of objects in the intracommunicational domain.

In contrast to the extracommunicational domain, the intracommunicational domain is the foreground area in the mind of the perceiver of media products. It is formed by one's perception and interpretation of the media products that are present in the ongoing act of communication. It is based on both extracommunicational objects, emanating from the extracommunicational domain, and intracommunicational objects, arising in the intracommunicational domain, that together result in interpretants making up a salient cognitive import in the perceiver's mind. However, the intracommunicational domain is largely mapped upon the extracommunicational domain. Rehashing Ryan's 'principle of minimal departure' (1980: 406), I argue that one construes the intracommunicational domain as being the closest possible to the extracommunicational domain and allows for deviations only when they cannot be avoided. In other words, familiar ideas and experiences are not questioned until it is necessary to do so.

As the intracommunicational domain is formed by communicative semiosis, it can be called a *virtual sphere*. The virtual should not be understood in opposition to the actual, but as something that has the *potential* to have real connections to the extracommunicational—to be truthful (Elleström 2018b). Therefore, I define the virtual as a mental sphere, created by communicative semiosis and consisting of cognitive import formed by represented objects.

A virtual sphere can consist of anything from a brief thought triggered by a few spoken words, a gesture or a quick glance at an advertisement, to a scientific theory or a complex narrative formed by hours of reading books or watching television (Elleström 2019). Ultimately, everything that is possible to think may be part of a virtual sphere.

Depending on the degree of attention to the media products, the borders of a virtual sphere do not necessarily have to be clearly defined. As communication is rarely flawless, a virtual sphere may be exceedingly incomplete or even fragmentary. It may also comprise what one apprehends as clashing ideas or inconsistent notions. As virtual spheres result from communication, they are, by definition, shareable among minds to some extent.

The coexistence of intracommunicational and extracommunicational objects results in a possible double view on virtual spheres. From one point of view, they form self-ruled spheres with a certain degree of experienced autonomy; from another point of view, they are always exceedingly dependent on the extracommunicational domain. The crucial point is that intracommunicational objects cannot be created *ex nihilo*; they are completely derived from extracommunicational objects. This is because one cannot grasp anything in communication without the resource of extracommunicational objects. Even the most fanciful narratives require recognisable objects in order to make sense (cf. Bergman 2009: 261). To be more precise: intracommunicational objects are always in some way parts, combinations or blends of extracommunicational objects. To be even more exact, intracommunicational objects are parts, combinations or blends of interpretants resulting from representation of extracommunicational objects.

It is possible to represent, say, griffins (which, to the best of our knowledge, exist only in virtual spheres) because of one's acquaintance with extracommunicational material objects such as lions and eagles that one can easily combine. A virtual sphere may even include notions such as a round square, consisting of two mutually exclusive extracommunicational objects that together form an odd intracommunicational object. Literary characters such as Lily Briscoe in Virginia Woolf's novel *To the Lighthouse* are composite intracommunicational objects consisting of extracommunicational material and mental objects that stem from the world as one knows it. You cannot imagine Lily Briscoe unless you are familiar with notions such as walking, talking and eating; what it means to refer to persons with certain names; what women and men, adults and children are; what it means to love and to be bored; and what artistic creation is. In addition, more purely mental extracommunicational objects can be modified or united into new mental intracommunicational objects. Objects such as familiar emotions can be combined into novel intracommunicational objects consisting of, say, conflicts between or blends of emotions that one perceives as unique although one is already acquainted with the components.

The question then arises: if all intracommunicational objects are ultimately derived from extracommunicational objects, why do we often experience virtual spheres as having a certain degree of autonomy? This is because we may perceive them, in part or in whole, as new *gestalts* that disrupt the connection to the extracommunicational domain. This happens when we do not immediately *recognise* the new composites of extracommunicational objects. The reason why they are not being re-cognised is that they have not earlier been cognised in the particular constellation in which they appear in the virtual sphere. Several such disruptions lead to greater perceived intracommunicational domain autonomy. Even though intracommunicational objects are entirely dependent on extracommunicational objects, they can be said to emerge within the intracommunicational domain.

Having described the interrelations between the intracommunicational and the extracommunicational domains in some detail, I will now present an overview with the aid of a visual diagram (Fig. 1.2). Whereas the intracommunicational domain simply consists of one virtual sphere, the extracommunicational domain consists of two rather different elements: on the one hand, other virtual spheres, and, on the other hand, what I propose to call the *perceived actual sphere*. This means that, from the point of view of a virtual sphere, there are three more or less distinct spheres: the virtual sphere itself, other virtual spheres and the perceived actual sphere.

The *perceived actual sphere* consists of *extracommunicational*, *immediate and presented* material and mental objects *beyond the realm of communication* that the perceiving mind is acquainted with. 'Perceived' shall be understood in a broad sense to include exteroception, interoception and proprioception, joined by mental introspection and semiosis based on perception of the actual sphere. 'Immediate and presented' shall be understood in contrast to communication: the perceived actual sphere does not consist of mediated representations formed by media products brought out by minds and their extensions, but is *immediately present* to us. Note that immediately present does *not* mean that the perceived actual sphere is independent of the mediating mental mechanisms that connect sensation to perception or the complicated mediating functions that connect perception to the external world.

**Fig. 1.2** Virtual sphere, other virtual spheres and perceived actual sphere (Elleström 2018b: 432)

The *other virtual spheres* consist of *extracommunicational, already mediated and represented* material and mental objects that the perceiving mind is acquainted with. As virtual spheres are thoroughly semiotic, these objects are always made out of former interpretants. The other virtual spheres result from communication and comprise mediated representations formed by media products brought out by minds and their extensions.

Hence, the *virtual sphere* consists of *extracommunicational*, *immediate and presented* material and mental objects from the perceived actual sphere + *extracommunicational, already mediated and represented* material and mental objects from other virtual spheres + *intracommunicational, mediated and represented* material and mental objects *that emerge within the virtual sphere*.

Together, the intra- and extracommunicational domains constitute *the world as one knows it*, which corresponds to what Siegfried J. Schmidt called *actuality*, 'our world of experience'; 'we have to postulate a strict separation between reality, which is cognitively inaccessible but has to be presupposed as existing at least for logical reasons, and actuality, which is constructed by the real brain' (Schmidt 1994: 499). Hence, everything outside of these domains—*the unknown*—corresponds to what Schmidt referred to as the cognitively inaccessible *reality*.

Like all schematic representations, this model is intended to provide an overview of an intricate state of affairs. Nevertheless, it not only points to mental areas that are fundamentally different in certain respects, but also reveals their complex interrelations. Thus, one must emphasise that every virtual sphere, from the point of view of that sphere, is intracommunicational, and is therefore composed of objects that are derived from itself (to the extent that parts, combinations and blends of extracommunicational objects may be understood as distinct), as well as from other virtual spheres and the perceived actual sphere. This comprises a *mise-en-abyme*: intracommunicational virtual spheres are formed by perceived actual spheres and by other extracommunicational virtual spheres that are, in turn, formed by perceived actual spheres and by other extracommunicational virtual spheres *ad infinitum*.

Adding the diagram in Fig. 1.2 to the diagram in Fig. 1.1 might give a sense of how one may expand the irreducible model of communication in terms of a more complex understanding of the transferred cognitive import. In brief, the totality of the intracommunicational and extracommunicational domains in Fig. 1.2 (the outer circle) is equivalent to the whole perceiver's mind in Fig. 1.1 (the outer circle). The intracommunicational domain, comprising the virtual sphere in Fig. 1.2 (the inner circle), consists of the cognitive import in the perceiver's mind according to Fig. 1.1 (the inner circle). This virtual sphere is not only formed by the perception and interpretation of the specific traits of the media products that are present in the ongoing act of communication, as emphasised in Fig. 1.1. It is simultaneously based on a combination of extracommunicational and intracommunicational objects that, together, result in interpretants making up salient cognitive import in the perceiver's mind, as demonstrated in Fig. 1.2. In other words, the cognitive import in the perceiver's mind, bringing about a virtual sphere, is formed by both ongoing experience of the particular traits of the media product and the general collateral experiences of all sorts in the perceiver's mind.

The extent to which cognitive import may be shared between the producer's and the perceiver's minds is undoubtedly determined in part by how much the extracommunicational domain of the perceiver's mind overlaps with the extracommunicational domain of the producer's mind (understood as the background area in the mind of the producer of media products). This conclusion corresponds well with established views on the importance of shared experiences and knowledge for successful communication.

#### 1.3 What Is a Technical Medium of Display?

#### *1.3.1 Media Products and Technical Media of Display*

At this stage of the account, it is necessary to introduce a delicate but sometimes vital distinction between media products and *technical media of display*. I have stated that media products are physical entities or processes that are necessary for communication because they interconnect minds. More precisely, I should also emphasise that being a media product is a function that requires some sort of perceptible physical phenomenon to come into existence. I call these physical items or phenomena technical media of display (cf. Jürgen E. Müller's distinction between 'technical conditions' and 'media products' [1996: 23]).

I choose the term 'technical' to attach to one of the meanings of the Greek word *téchnē*: practical skill and the methods employed in producing something. Accordingly, technical media of display should be understood as entities that realise media products; they distribute sensory configurations with a communicative function. Terms such as 'technical media of distribution', 'dissemination' or 'presentation' would all be accurate. The cumbersome term 'technical media of display of sensory configurations' is perhaps the most precise one for my purpose.

I define a technical medium of display as any object, physical phenomenon or body that *mediates* sensory configurations in the context of communication; it realises and displays the entities that we construe as media products. Technical media of display are those perceptible physical items and processes that, when used in a communicative context, acquire the *function* of media products. Strictly speaking, this means that when the same physical items and processes are not used in a communicative context, they are not technical media of display.

My definition of the notion of technical medium of display is narrower than that of 'physical media' circumscribed, for instance, by Claus Clüver (2007: 30). Devices used for the realisation of media products, but not tools used only for the production or storage of media products, are technical media of display. The brush and the typewriter are tools for production that are normally separated from the material manifestations of media products and are, as such, not normally technical media of display according to my definition, although they count as physical media in Clüver's sense (2007). For the same reason, a computer hard disk—a device for storage—is not routinely a technical medium in the sense that I emphasise here. The video camera is partly a tool for production and partly a device for the realisation of media products (if it includes a screen for film display), so it can be habitually seen as a technical medium of display. A guitar, which can produce and realise musical sound simultaneously, also often works as a technical medium of display if one considers its immediate extensions in the form of sound waves. Some physical existences, such as ink on paper, may both store and display sensory configurations and thus work as technical media if present in communicative situations. Such pieces of paper can mediate sensory configurations that we understand to be, say, written words, whereas a pen, which can only produce and not display written words, is not, in its role as producer of writing, a technical medium of display.
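The classification in the preceding paragraph can be read as a simple rule: an item counts as a technical medium of display only when it is used to display sensory configurations and only within a communicative situation. The following is a minimal sketch of that rule, not part of the original account. The names `Role`, `PhysicalItem` and `is_technical_medium_of_display` are hypothetical, and the roles assigned to each item describe its typical use as discussed above rather than any fixed property of the object.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Role(Flag):
    """Hypothetical labels for how an item is used in a given situation."""
    PRODUCE = auto()   # tool for producing media products
    STORE = auto()     # device for storing media products
    DISPLAY = auto()   # realises and displays sensory configurations


@dataclass(frozen=True)
class PhysicalItem:
    name: str
    roles: Role


def is_technical_medium_of_display(item: PhysicalItem,
                                   in_communicative_context: bool) -> bool:
    """Apply the rule: display role AND communicative context are both required."""
    return in_communicative_context and bool(item.roles & Role.DISPLAY)


# Typical uses taken from the examples in the text (assumed role assignments).
BRUSH = PhysicalItem("brush", Role.PRODUCE)
HARD_DISK = PhysicalItem("computer hard disk", Role.STORE)
CAMERA_WITH_SCREEN = PhysicalItem("video camera with screen",
                                  Role.PRODUCE | Role.DISPLAY)
GUITAR = PhysicalItem("guitar with its sound waves",
                      Role.PRODUCE | Role.DISPLAY)
INK_ON_PAPER = PhysicalItem("ink on paper", Role.STORE | Role.DISPLAY)
PEN = PhysicalItem("pen", Role.PRODUCE)

if __name__ == "__main__":
    for item in (BRUSH, HARD_DISK, CAMERA_WITH_SCREEN,
                 GUITAR, INK_ON_PAPER, PEN):
        print(item.name, is_technical_medium_of_display(item, True))
```

Because the roles belong to the use and not to the object, the same physical item can be modelled with different roles in different situations.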

Technical media of display clearly exist in diverse forms. I have already suggested that media products can be realised by either bodily or nonbodily matter. From the perspective of the producer's mind, being situated in a human body, this means that there are external technical media (extra-bodily materialities such as clay, screens, ink on paper, sound waves from loudspeakers or just about anything chosen from the surroundings, including other bodies) and there are internal technical media (the producer's body in its entirety, parts of it or physical phenomena emanating directly from it, such as a voice). All forms of external and internal technical media of display can be combined with each other in countless ways.

Regarding external technical media of display, any perceptible physicality can be used in the function of a media product. A stone and a tree branch lying on the ground are only a stone and a branch. However, if someone picks them up and uses them to intimidate somebody else (to communicate threat) or to manufacture sculptures (to communicate something aesthetic), they become technical media of display—physical entities with a communicative function, the function of being media products.

Harold A. Innis (1950) emphasised the importance of technical media such as stone, clay, papyrus and paper for the historical development of communication—more specifically writing—and society at large. More modern technical media of display include electronic screens and sound waves produced by loudspeakers. Thus, very different kinds of physical entities may act as external technical media of display and realise media products. They may simply be at hand in the environment of the producer's mind and body (like directing a waiter's attention to an empty glass to communicate the desire to be given a new drink) or they may be more or less crafted with a communicative purpose (like using a piece of paper to display the words 'one more beer, please'). They may also be internal and consist of corporeal actions and immediate extensions of the body (like a movement of hand and arm imitating the act of drinking or a voice saying 'one more beer, please').

These examples do not in any way exhaust the many possible modes of existence for technical media. For instance, one may note that items that are manufactured for producing media products, not displaying them, may actually be used as technical media of display in certain circumstances. A pen, which is not a technical medium of display in its role as a producer of writing, may become a technical medium of display if, say, it is placed in a shop window in order to indexically communicate the notion that pens are for sale in the shop.

The distinction between a media product and the technical medium of display is clearly theoretical rather than a distinction between two different kinds of material entities. On the contrary, the physical technical medium is a prerequisite for the existence of a media product, and, in a communicative situation, the perceiver identifies only one level of presence: the perceived sensory configurations emanating from some physical existence. However, the distinction is needed in order to demonstrate the difference—and mutual interdependence—between, for example, what one construes as a piece of music (a media product) and the sound waves emanating from a music audio system (a technical medium of display). Confronted with the famous question in William Butler Yeats's poem 'Among School Children'—'How can we know the dancer from the dance?'—the distinction allows us to give two different but fully compatible answers. On one hand, the dancer and the dance are inseparable in the sense that they are the same material entity occupying physical space and time. On the other hand, they are two different things. Whereas the dancer is a body acting as a technical medium of display, the dance is a function of the material body—a media product.

Although this distinction is sometimes hard to grasp, it often aligns well with everyday parlance and thinking. Allow me to illustrate this further. Some technical media of display, such as audio systems, are well fitted to be reused many times. This is also the case for a technical medium such as a television set (which actually consists of two kinds of technical media of display: a screen that emits photons and loudspeakers that set the air into pulsation) that may realise several different media products (many television programmes). A communicating human body may be conceptualised in a similar fashion. When moved in certain ways and in certain circumstances, the body mediates certain sensory configurations and realises what one understands as gestures (media products). As long as the memory of these gestures is kept in the producer's mind, similar gestures can be performed by the same technical medium of display—the body—thus creating a large number of equivalent media products. Of course, the same body may also be used for realising a multitude of different media products. Conversely, many types of technical media of display can realise a media product such as a television programme; not only television sets but also, for instance, laptop computers, which also consist of a screen and loudspeakers.

On the other hand, because of their physical qualities, some technical media of display tend to be used only once or a few times. A marble block being cut to a certain form mediates certain sensory configurations and realises a sculpture and can usually be reused only a limited number of times. As the block not only displays but also stores the sculpture, the reuse of the technical medium of display implies the destruction of the initial media product.

However, common language does not always provide words to properly describe the distinction between technical media of display and media products. This is because of the boundless and dynamic nature of human communication: for reasons of mental economy, only the most common and salient media products are categorised and given names. An example can be given through the communicative acts performed by the thirsty person discussed above. The movement of the person's hand and arm is used as a technical medium with which to realise what is commonly known as a gesture, a kind of media product. The paper is used as a technical medium for realising a media product that may be called, for instance, a written note. The raised empty glass, however, resists being described in ordinary language; one may say that 'glass' or 'a glass' is used as a technical medium, but what kind of a media product does it realise? This is not clear. Nevertheless, the media product is there, whether there is a proper term to denote it or not.

All these observations call for some discussion regarding duplication of media products. According to my definition, the concept of media product implies that every single display through a technical medium constitutes a specific media product. This display may last for a very short time (a cry of warning, for instance), for a very long time (such as a rock painting) or anything in between. In any case, the display of such media products can be repeated in various ways. Several cries of warning can be heard, several rock paintings can be seen, and some of these are very similar. In some cases, the similarity between media products is so detailed that it is more than reasonable to think that they are 'the same'. When I watch the movie *Fantasia*, I believe that it is the same movie that I saw some years ago, having the same title and being identical in virtually all details, although it was then displayed on the screen in a movie theatre and not on the screen of my television set.

However, the two realisations of *Fantasia* are not the same media product. On a theoretical level, it is important to be able to acknowledge that every display of a media product is unique, even though several media products may be extremely similar indeed—like the thousands of copies of operating instructions for a certain kind of toaster. On a pragmatic level, however, it is efficient to operate with the notion of sameness. Life outside the domain of scholarly writing would become very difficult to handle if we did not recognise that different people, at different times, located at different places, may actually watch 'the same television programme', such as a specific episode of *Monty Python's Flying Circus*. However, people actually perceive different media products that are, in general, virtually indistinguishable but slightly different when it comes to qualities such as the size and resolution of the moving images and the quality of the sound—differences that may or may not affect how cognitive import is construed.

Under theoretical pressure, the 'sameness' of different actual displays becomes diffuse and problematic. Are my toaster operating instructions, covered in coffee stains and almost illegible, the same media product as your unblemished copy? As they can hardly communicate the same cognitive import (understanding how to handle the toaster), I would say not. If I argued that two unstained copies of the operating instructions are the same, the obscure question arises: how many stains or torn pages are required to render them different? In the end, the question of sameness becomes a somewhat metaphysical question. Therefore, strictly speaking, different media products may only be the same in the respect that they are very similar. Although different media products are never ontologically the same, they may be thought of as being 'the same' in many other important respects. One could perhaps say that very similar media products are variations of an abstract but recognisable communicational composition that may be reproduced more or less efficiently.

#### *1.3.2 Mediation and Representation*

As postulated above, media products are the entities through which cognitive import is transferred among minds in communication. Such products require technical media of display in order to be realised. Different forms of technical media of display have different capacities to mediate sensory configurations and make them present to the perceivers' minds, which has consequences for the outcome of communication. The perception of media products is also deeply entangled with cognitive operations, resulting from the encounter with the sensory configurations. These perceptual and cognitive functions can be broadly described as interpretation, and more specifically analysed in terms of signification.

As this complex process of transfer of cognitive import from a producer's mind to a perceiver's mind involves both material and mental aspects, I find it helpful to distinguish between two profoundly interrelated but nevertheless discernible basic facets of the communicative process: *mediation* and *representation*. Mediation is the display of sensory configurations by the technical medium (and hence also by the media product) that are perceived by human sense receptors in a communicative situation. It is a *presemiotic* phenomenon that should be understood as the physical realisation of entities with material, spatiotemporal and sensorial qualities—and semiotic potential. For instance, one may hear a sound. Representation is a semiotic phenomenon that should be understood as the core of signification, which I delimit to how humans create cognitive import in communication. When a perceiver's mind makes sense of the mediated sensory configurations, sign functions are activated and representation is at work. For instance, the heard sound may be interpreted as a voice uttering meaningful words.

To say that a media product represents something is to say that it triggers a certain type of interpretation. This interpretation may be more or less hardwired in the media product and the manner in which a person perceives it with her or his senses, but it never exists independently of the cognitive activity in the perceiver's mind. When something represents, it calls forth something else; the representing entity makes something else, the represented, present in the mind. In terms of Charles Sanders Peirce's foundational notions, this means that a sign or *representamen* stands for an *object*. Peirce's third sign constituent, the *interpretant*, can be understood as the mental result of the representamen–object relation (see, for instance, 1932: CP2.228 [c. 1897]). As stated earlier, one may further understand my notion of cognitive import created in the perceiver's mind in communication as an example of Peirce's notion of interpretant—and of course, the concept of interpretation has everything to do with the semiotic idea of interpretants in signification.

Representation, the very essence of semiosis, occurs constantly in our minds when we think without having to be prompted by sensory perceptions. However, it is also triggered by external stimuli; in this context, focusing on external stimuli resulting from mediation is appropriate. Thus, although representation also occurs in pure thinking and in the perception of things and phenomena that are not part of mediation, I delimit the account of representation to the creation of cognitive import based on mediated sensory configurations—stimuli picked up by our sense receptors in communicative situations. My contention is that all media products represent in various ways as soon as sense is attributed to them; or, in other words, when they are attributed sense, they become media products. Hence, one can understand media products as assemblages of representamens that, due to their mediated material, spatiotemporal and sensorial traits—and because of collateral experience in the intra- and extracommunicational domains—represent certain objects, thus creating interpretants (cognitive import) in the perceiver's mind. It is through representation, and more broadly signification, that virtual spheres are created in the perceiver's mind. Hence, according to my terminology, the idea of nonrepresentative media products is self-contradictory.

My current emphasis is on the notion that basic encounters with media have both a presemiotic and a semiotic side. Whereas the concept of mediation highlights the material realisation of the media product, made possible by a technical medium of display, the concept of representation highlights the semiotic conception of the medium. Although mediation and representation are clearly entangled in complex ways, it is vital to uphold a theoretical distinction between them. This theoretical distinction is helpful in analysing complex communicative relations and processes. In practice, however, mediation and representation are deeply interrelated. Every representation is based on the distinctiveness of a specific mediation. Furthermore, some types of mediation facilitate certain types of representation and render other types of representation impossible; different kinds of mediation have different kinds of semiotic potential. As an obvious example, vibrating air emerging from the vocal cords and lips that is perceived as sound but not words is well suited for the iconic representation of bird song, whereas such sounds cannot possibly form a detailed, three-dimensional iconic representation of a cathedral. However, distinctive differences among mediations are frequently more subtle and less easily spotted without close and systematic examination.

# 1.4 What Are Media Modalities, Modality Modes and Multimodality?

#### *1.4.1 Multimodality and Intermediality*

To facilitate such systematic examination of mediality, I will now expand on what I have already introduced as the four modalities of media. This requires a brief discussion of the two research fields of multimodality and intermediality. Although they focus on similar issues, cross-references between these two interrelated research fields are rare. Nevertheless, Mikko Lehtonen combined the notions of intermediality and multimodality two decades ago when, in an article in a journal of media and communication studies, he accurately stated, 'multimodality always characterises one medium at a time. Intermediality, again, is about the relationships between multimodal media' (Lehtonen 2001: 75; cf. also the rewarding discussions in Fornäs 2002). Although Lehtonen used the concepts in different and not very developed ways, compared to the framework that I have sought to elaborate here, I subscribe to the basic idea that intermediality is about the relationship between media having a multitude of vital traits, or modes.

Nevertheless, it is not evident how this notion should be operationalised. The term 'medium' simply means 'middle', 'interspace' and so forth, and the term can justifiably be used in an abundance of different ways. The term 'modality' is related to 'mode', and these terms are also, for good reason, widely employed in different fields. A 'mode' is a way to be or to do things. Just like 'medium', the term 'mode' can be, has been and should be used to stand for different notions in diverse contexts. Therefore, certain ways of using terms such as 'modality' and 'mode' need not necessarily compete or be in conflict with very different ways of using them. However, in trying to form a terminologically and conceptually coherent research branch, it is essential to interrelate terms as well as concepts in lucid ways.

In the context of media studies and linguistics, 'multimodality' sometimes refers to the combination of, say, text, image and sound, and sometimes to the combination of sense faculties (the auditory, the visual, the tactile and so forth). Thus, multimodality has been defined as 'the use of two or more of the five senses for the exchange of information' (Granström et al. 2002: 1). The idea that multimodality is the combination of several human (primarily external) senses is also widespread in research areas such as medicine, psychology and cognitive science. However, the field of multimodality itself uses less clear-cut definitions. Gunther Kress and Theo van Leeuwen (2001) understood a mode or modality as any semiotic resource, in a broad sense, that produces meaning in a social context: the verbal, the visual, language, text, image, music, sound, gesture, narrative, colour, design, taste, speech, touch, plastic and so on. While this approach to multimodality has some pragmatic advantages, it produces a rather indistinct set of modes that are hard to compare and correlate since they overlap in many ways (Kress and van Leeuwen 2001: vii, 3, 20, 22, 25, 28, 67, 80; Kress and van Leeuwen 2006: 46, 113, 177, 214). Despite recent suggestions for systematic analysis of multimodality (Bateman et al. 2017), the fundamental notion of multimodality remains circumscribed rather haphazardly by researchers adhering to the Kress and van Leeuwen tradition. However, Kress's book *Multimodality* (2010) circumscribed the notion of mode more firmly within a frame of social semiotics (Chap. 5, 'Mode'), and the selection of what might constitute modes is narrower than in earlier publications. On the other hand, Kress emphasised the distinctiveness of modes such as images and writing.

By emphasising the distinctiveness of modes, Kress's notion of multimodality comes close to the view that media types are inherently different. Earlier efforts to describe relations among different media generally started with precisely the same conceptual units that we also find in multimodal research—image, music, text, film, language (verbal media) and visuality (visual media)—presuming that it is appropriate to compare these entities. The indistinctness of such comparisons is confusing if one treats the compared units as fundamentally different media with little or nothing in common.

In contrast to such views, Mieke Bal has convincingly demonstrated that 'word' and 'image' are interrelated and integrated in complex ways (1991). W. J. T. Mitchell is another scholar who has successfully criticised this mode of thinking by pointing to the way in which media types (more specifically art forms) that are generally seen as opposites actually share various traits (1986). However, Mitchell's use of traditional dichotomies such as text vs. image and verbal vs. pictorial makes it difficult to grasp the nature of the similarities of media. Meanwhile, most other scholars working with similar issues have continued to operate with the dichotomy of verbal vs. visual media types. This is problematic because of what I would describe as the modal incommensurability of the two notions: whereas the verbal is a variation of the symbolic, in Peirce's sense, and hence a semiotic property, the visual belongs to the domain of sense perception. Hence, the two notions belong to different categories of media traits, different modalities, and are not fit to form a dichotomy, just as there is little point in contrasting blue cars with fast cars.

As long as such obscurities continue, it remains unclear how to understand notions such as multimodal and intermedial and how they are related. Generally, ambiguities remain even in the most qualified scholarly publications (see, for instance, Moser 2007a, b). The fuzziness of concepts termed 'media' and 'mode' also remains in a central research area such as communication studies (as demonstrated in Parks 2017).

It is no wonder, then, that the discourses on media and modalities tend to be either separated or mixed up. Why bother to combine, or to keep apart, notions that seem to be fuzzy in rather similar ways? There are many media types, which might be the same as saying that there are many modes of communication. In ordinary situations, a language use that simply equates 'media', 'modalities' and 'modes' is unproblematic. However, I think it is a good idea to separate the meanings of 'medium', 'modality' and 'mode' to make it possible to differentiate between intermediality and multimodality in the way that Lehtonen proposed—namely, to see intermediality as 'the relationships between multimodal media' (2001: 75).

To the best of my knowledge, there is nothing in the etymology of the words 'medium', 'modality' and 'mode', or in their established uses, that clearly determines how they should be interrelated. Therefore, I see it as my task to raise a theoretical construction and propose how to use these central terms in relation to each other.

My starting point is the idea that media are both similar and different and that media cannot be compared without clarifying which aspects are relevant to the comparison and how these aspects can relate to each other. Therefore, I propose a model that starts not with the units of established media forms, or with efforts to distinguish between specific types of intermedial relations between these recognised media, but with the basic categories of features, qualities and aspects of all media. As already explained briefly, I propose to think in terms of media modalities—types of media traits. The modalities are the indispensable cornerstones of all forms of media, integrating physicality, perception and cognition. Separately, these modalities constitute complex fields of research and are not related to the established media types in any definitive way. However, they are crucial in efforts to describe the character of every single media product. They are all familiar to researchers, even though their interactions have not been accounted for systematically. As stated earlier, I call them the material modality, the spatiotemporal modality, the sensorial modality and the semiotic modality, and they are found on a scale ranging from the material to the mental. The first three modalities are presemiotic and concern mediation. The semiotic modality concerns representation or, more broadly, signification: how the mediated sensory configurations come to signify cognitive import in the perceiver's mind and form a virtual sphere.

Scholars constantly describe and define media based on one or more of these modalities. However, this is not always sufficient, because all media are necessarily realised in the form of all four modalities. Therefore, I argue that all four of them should be considered. In this respect, there is a fundamental difference between my approach and the systematic, often hierarchic but simplistic classifications and divisions of the arts, the aesthetic media types, which were put forward from the eighteenth century and well into the twentieth century (see Munro 1967: 157–208). Nevertheless, the roots of thinking in terms of media modalities go way back in time. An important early thinker who saw things clearly was Moses Mendelssohn, who built a typology with the aid of distinctions such as 'natural' versus 'arbitrary' signs, 'the sense of hearing' versus 'the sense of sight' and signs that are represented 'successively' versus 'alongside one another' (1997 [1757]: 177–179). The typology is sketchy but instructive since Mendelssohn clearly realised that the borders of the arts 'often blur into one another' (1997 [1757]: 181).

Much later, the systematic thinking of the linguist Roman Jakobson came close to the idea of media modalities. He discussed and interrelated the five external senses, spatiality and temporality, as well as Peirce's sign trichotomy icon, index and symbol (1971a, b, c). Jakobson also made important but undeveloped efforts to put this in the context of 'communication systems', albeit with language as the undisputed centre and measure (1971c). This linguistic bias implies that Jakobson thought of communication at large as 'systems', which I believe gives a warped picture of the wealth of communication that occurs outside the boundaries of systems. Another reason for his failure to achieve a nuanced overview of communication is the common tendency to reason in terms of false dichotomies. A question such as 'What is the essential difference between spatial and auditory signs?' (1971b: 340), contrasting a spatiotemporal and a sensorial mode, offers a tilted starting point for investigating signs in communication.

Similar tilted starting points are detectable in Jiří Veltruský's comparison of artistic media forms (1981). In Veltruský's account, it remains unclear what the 'material' of an art form is. According to the author, materials can be divided into the 'auditory and visual'; the material of music is said to be 'tones' and the material of literature is said to be 'language'. Furthermore, the material of literature is supposed to oscillate 'between materiality and immateriality' (1981: 110). Although this categorisation is representative, it is not at all illuminating. The category of material is untenable since it includes media traits that cannot be treated as equals: tones, language and even the immaterial. Tones must be seen as related primarily to the sensorial modality, whereas language must be understood in semiotic terms; however, spoken language actually also consists of some sorts of tones. What the immaterial material is, I do not know.

Mitchell came closer than Veltruský to the idea of media modalities. In one publication, he discussed 'four basic ways in which we theoretically differentiate texts from images'. Three of these ways are 'perceptual mode (eye versus ear)', 'conceptual mode (space versus time)' and 'semiotic medium (natural versus conventional signs)' (1987: 3). Although limited to a comparison of texts and images, this description contains three of the media modalities in their embryonic forms. Moving from text and image to the more specific media types poetry and painting, Mitchell also argued that 'there is no *essential* difference between poetry and painting, no difference, that is, given for all time by the inherent natures of the media, the objects they represent, or the laws of the human mind' (1987: 2–3). Although it is important not to exaggerate the differences between media, I would say that it is fully possible 'to give a theoretical account of these differences' (1987: 2), essential or not, which Mitchell doubted.

Later interesting discussions of these issues, including actual efforts to systematise several of those media traits that I categorise in modalities, are found in publications by Helen C. Purchase (1999) and Eli Rozik (2010). However, although constantly recurring, the material, the spatiotemporal, the sensorial and the semiotic types of media traits tend to be fused and mixed up in fundamental ways. Perhaps the most common mistake in these discussions is to confuse the notions of visual and iconic: whereas the visual is about using a specific sense faculty (whether this is connected to iconic, indexical or symbolic signs), the iconic is semiosis based on similarity (whether this similarity can be seen, heard, felt or otherwise sensed) (see Elleström 2016).

#### *1.4.2 Media Modalities and Modes*

In 2010, I published the first version of this article: 'The Modalities of Media: A Model for Understanding Intermedial Relations' (Elleström 2010). In that piece, I introduced a distinction between two levels to facilitate and sharpen methodical descriptions and analyses of media products. On one hand, there are *the types of traits* that are common for all media products, without exception; on the other hand, there are *the specific traits* of particular media products or types of media products. To make the distinction transparent, I call the former modalities and the latter modes. In brief, then, *media modalities* are categories of basic media traits, and *media modality modes* (or simply *media modes* or *modality modes*) are basic media traits.

I have argued that there are four media modalities, four types of basic media modes. For something to acquire the function of a media product, it must be *material* in some way, understood as a physical matter or phenomenon. Such a physical existence must be present in space and/or time for it to exist; it needs to have some sort of *spatiotemporal* extension. It must also be perceptible to at least one of our senses, which is to say that a media product has to be *sensorial*. Finally, it must create meaning through signs; it must be *semiotic*. This adds up to the material, spatiotemporal, sensorial and semiotic modalities. It follows from the definition of a media product as the intermediate entity that enables the transfer of cognitive import from a producer's to a perceiver's mind, where a virtual sphere is created, that no media products or media types can exist unless they have at least one mode of each modality.

The modalities should be understood as categories of related media modes that are *basic* in the sense that all media products have traits belonging to all four modalities. All media products appear as specific combinations of particular modes of the four media modalities. A certain media product must be realised through at least one material mode (as, say, a solid or non-solid object), at least one spatiotemporal mode (as three-dimensionally spatial and/or temporal), at least one sensorial mode (as visual, auditory or audiovisual) and at least one semiotic mode (as mainly iconic, indexical or symbolic). Hence, the four media modalities form an indispensable skeleton upon which all media products are built.

By 'modalities', I thus mean the four necessary categories of media traits ranging from the material to the mental, and by 'modes' I mean the specific media traits categorised in modalities. I do not define entities such as 'text', 'music', 'gesture' or 'image' as modalities or modes; in the following section of this article, I will instead explain them in terms of media types.

As emphasised, three of the four modalities are presemiotic, which means that they cover media modes that are involved in signification—the creation of cognitive import in the perceiver's mind—although they are not semiotic qualities in themselves. Thus, the material, spatiotemporal and sensorial modalities are not *a*semiotic; they are *pre*semiotic, meaning that the modes that they cover are bound to become part of the semiotic as soon as communication is established. The presemiotic media modes concern the fundamentals of mediation, which is to say that they are necessary conditions for any media product to be realised in the outer world by a technical medium of display, and hence for any communication to be brought about. All four modalities obviously depend strongly on each other—just as the modes may be entangled with each other in several ways, depending on the character of the media product.

With the aid of this theoretical framework, basic media differences and media similarities can be pinpointed. Crucial divergences and fundamental parallels can be highlighted among all conceivable sorts of media—existing and yet to be devised—which provides a firm ground for understanding, describing and interpreting the most elementary media interrelations. Of course, I can only hint here at the complexity of the innumerable interrelations that can be derived from the four modalities and their modes.

*The material modality* is a category of material media modes. All media products are material, or more broadly physical, which makes them perceptible and hence accessible to the perceiver's mind in various ways. However, distinctions can be made among material properties in ways that may overlap. I discern at least two vital ways of distinguishing material modes. As described in physics, there are different states of matter, four of which are relevant for everyday life: a media product may be solid or in the form of liquid, gas or plasma. As examples, consider a solid road sign made of painted metal, liquid water used in an art installation, gas in the form of vibrating air (sound waves) produced by vocal cords and plasma in a television screen or other device for communicative display. Another way of distinguishing material modes is to separate organic and inorganic matter. For instance, whereas an outstretched arm with a pointing finger is an organic media product, a tailor's dummy is an inorganic media product. This is a biological rather than physical distinction, although the two are equally relevant for everyday life.

*The spatiotemporal modality* is a category of spatiotemporal media modes. As they consist of physical matter, all media products have spatiotemporal properties and can therefore be grasped by human minds. Following well-established models in physics, the three spatial dimensions and the temporal dimension can be considered as a unit. Thus, space and time form a four-dimensional spatiotemporal entity consisting of width, height, depth and time. Although all media products actually exist in such a four-dimensional world, the relevant properties of media products (those properties that, because of selective attention, perform the function of a media product) may be more restricted. I argue that media products must have at least one and may have up to four spatiotemporal modes.

However, these modes cannot be freely combined: to perceive space with the senses, at least two spatial dimensions are required. This means that the only conceivable monomodal spatiotemporality would be exclusively temporal media products. Speech or song emanating from a single point might be considered as instances of media products that are only temporal, although I think it is reasonable to state that even such media products have some rudimentary spatial qualities. Tracing media modes is seldom a question of definitely affirming or dismissing them. Nevertheless, it is important to discern differences. Thus, temporality, a mode of the spatiotemporal modality, is an aspect of songs, speeches, gestures and dance, but not of stills and most sculptures. Whereas a photograph has only two dimensions (width and height), a sculpture has three spatial dimensions (width, height and depth). A dance and a mobile sculpture have four dimensions (width, height, depth and time). Dance performances and political speeches have a beginning, an extension and an end situated in the dimension of time, while a photograph, as long as it exists, simply exists. If you close your eyes or block your ears in the middle of a performance or a speech, you miss something and cannot grasp the spatiotemporal form in its entirety. If you close your eyes while looking at a photograph, you miss nothing and the spatial form remains intact. In these respects, there are distinct and relevant spatiotemporal differences among media products and media types, even though the presence or absence of certain modes may sometimes be disputed.
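As a rough illustration, the combination rule stated above (at least one spatiotemporal mode, at most four, and never exactly one spatial dimension) can be sketched in Python. The mode labels and the function name `is_conceivable` are assumptions made for this sketch only; they are not terminology from the model, and the sketch ignores the borderline cases that the text explicitly leaves open.

```python
# Hypothetical labels for the four spatiotemporal modes named in the text.
SPATIAL = {"width", "height", "depth"}
TEMPORAL = {"time"}


def is_conceivable(modes):
    """Check a set of spatiotemporal modes against the rule in the text.

    A media product needs at least one spatiotemporal mode, and perceiving
    space requires at least two spatial dimensions, so exactly one spatial
    dimension is ruled out.
    """
    modes = set(modes)
    if not modes or not modes <= SPATIAL | TEMPORAL:
        return False
    spatial_count = len(modes & SPATIAL)
    return spatial_count == 0 or spatial_count >= 2


# Examples loosely following the text: speech, photograph, sculpture, dance.
speech = {"time"}
photograph = {"width", "height"}
sculpture = {"width", "height", "depth"}
dance = {"width", "height", "depth", "time"}
```

Under these assumptions, all four examples pass the check, whereas a set such as `{"width", "time"}` does not.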

All media products, like all objects and phenomena, are necessarily perceived in time and space before they create cognitive import in the perceiver's mind. Semiosis is also a spatiotemporal phenomenon. However, because media products are constituted only by parts of the physical surroundings that are chosen for selective attention in acquiring a communicative function, this does not rule out the actual media differences. Also, some media types, such as visual, verbal (symbolic) signs on a flat but static surface (such as printed texts), are conventionally decoded in a fixed sequence, which makes them second-order temporal, so to speak: sequential but not actually temporal, because the physical matter of the media products does not change in time.

*The sensorial modality* is a category of sensorial media modes. All media products have sensorial properties in the sense that their materiality, somehow existing in time and space, must be perceived by one or more of our senses to reach the mind and trigger semiosis. Media products simply do not exist unless they are grasped by the senses. We usually think about the five external sense faculties of humans, which I here describe as the five main modes of the sensorial modality: seeing, hearing, feeling, tasting and smelling. The visible is a mode of signboards, gestures, films, websites and tattoos. The audible is a mode of instrumental music, recited poetry, films, radio weather forecasts and the shouting of salespeople in the street. Communication can also be accomplished by how the surface of a gift feels, how a meal tastes or how a flower smells.

Still, there are other human senses, described in terms such as interoception (sensing the internal state of the body) and proprioception (sensing body position and self-movement), and these senses may be relevant for human communication and vital for the perception of media products, especially when the human body itself is used as a media product. Someone who physically makes someone else lose her balance by pushing her may communicate threat, which is perceived by sight and touch but also by the perceiver's proprioception—the perceiver's body constituting the media product.

*The semiotic modality* is a category of semiotic media modes. While the material, spatiotemporal and sensorial modalities form the framework for explaining the presemiotic processes of mediation, the semiotic modality is the frame for understanding representation. All media products are semiotic because if the sensory configurations with material, spatiotemporal and sensorial properties do not represent anything, they have no communicative function, which means that there is no media product and no virtual sphere in the perceiver's mind. Hence, all objects and phenomena that act as media products have semiotic traits, by definition.

Whereas the semiotic traits of media products are less palpable than the presemiotic ones and are in fact largely derived from them—because different kinds of mediation have different kinds of semiotic potential—they are equally essential for realising communication. The mediated sensory configurations of a media product do not transfer any cognitive import until the perceiver's mind comprehends them as signs. In other words, the sensations are meaningless until they are understood to represent something through unconscious or conscious interpretation.

Although the sensory configurations have no meaning in themselves, the process of interpretation begins in the act of perception. Conception does not come after perception; rather, all our perceptions are results of the endeavours of an interpreting, meaning-seeking mind. The moment we become aware of a visual sensation, for instance, the sensation is already meaningful at a basic level because meaning-making already starts in the unconscious apprehension and arrangement of what is perceived by the sense receptors. Meaning-making continues in the more or less conscious acts of creating sensible patterns in the intracommunicational domain and relevant connections to the extracommunicational domain.

These observations are not valid only for the perception of media products. The world at large is meaningless in itself; its significance is the result of interpreting minds—perceiving and conceiving subjects situated in social circumstances—attributing import to states of affairs, actions, occurrences, natural objects and artefacts. Following Peirce, meaning can be described as the result of sign functions, and although there are no signs until some interpreter has attributed significance to something, it is possible to distinguish between different sorts of signs.

Earlier, it was common to distinguish between conventional signs and natural signs. Peirce's most important trichotomy—icon, index and symbol—attaches to this division even though it avoids the slightly misleading idea that some signs exist 'in nature'. It is far beyond the scope of this study to account for all of Peirce's complex semiotic ideas, so I simply state that I follow his specific idea that signs result from mental activity based on, as I would have it, certain cognitive capacities.

As noted, Peirce defined the three sign types in terms of the representamen–object relationship. Icons stand for (represent) their objects on the ground of similarity, indices do so on the ground of contiguity, often described as 'real connections', and symbols operate on the ground of less durable habits or stronger conventions (see, for instance, 1932: CP2.303–304 [1902], CP2.247–249 [c.1903]; Elleström 2014a: 98–113). I regard perceiving similarity and contiguity and forming habits as fundamental cognitive abilities. I also take iconicity, indexicality and symbolicity to be the three main semiotic media modes; no communication can occur unless cognitive import is created in the perceiver's mind through at least one of the three sign types—icons, indices and symbols.

This sign division is also echoed in research branches that do not engage in semiotics. During the twentieth century, it was common to distinguish between different but complementary ways of thinking. Some cognitive functions have been said to be mainly directed by 'pictorial representations', whereas others have been understood to mainly rely on 'propositional representations'. The pictorial is more concrete and related to perceiving similarity and contiguity, while the propositional is more abstract and related to forming habits. Brain research has shown that the two ways of thinking can largely be located in the two cerebral hemispheres. Cognitive science involves an almost universal dichotomy—cognition based on similarity and cognition based on rules—although there are different opinions regarding their interrelations and dominance (Sloman and Rips 1998).

I suggest three terms to denote the processes of iconic, indexical and symbolic representation. Although these terms are widely used for different purposes in diverse contexts, they fit the rationale of this study. Hence, I propose calling iconic representation *depiction*, referring to indexical representation as *deiction*, and denoting the process of symbolic representation with the term *description*. The manner in which I use these three terms makes their significance both broader and narrower than in many other contexts; I annex them only to be able to efficiently distinguish verbally among the three main types of signification.

Depiction, deiction and description are not mutually exclusive; as modes of the other modalities, they are often (perhaps even always) combined to create multimodal media, that is, media that are both visual and auditory, spatial and temporal, iconic and indexical and so forth. According to Peirce, who stressed that the determinate aspects of all signs are 'in the mind' of the interpreter, the three modes of signification are always mixed, but often one of them can be said to dominate (1932: CP2.228 [c.1897]). In most written, verbal texts, the symbolic sign functions of the letters and words dominate the signification process. In instrumental music and all kinds of visual still images (such as drawings, figures, tables and photographs), iconic signs generally dominate, although photographs also have an important indexical character. Depictions in music and visual still images differ, of course, since the musical representamens are auditory and perhaps mainly represent motions, emotions, bodily experiences and cognitive structures, while the visual representamens of still images can effortlessly represent a broad range of objects from many areas. Nevertheless, all of these iconic sign functions are based on similarity. I am well aware of the lack of consensus when it comes to the question of musical meaning, but my point is that no matter how one defines the semiotic character of a media type, it must include semiotic particularities that are sometimes at least partly media-specific. Music and visual still images simply do not communicate in the same way.

As I have already stressed, a semiotic perspective must be combined with a presemiotic perspective. Communication is equally dependent on the presemiotic media modalities and the semiotic modality. Represented objects called forth by representamens are results of both the basic features of the media product as such and of semiotic activity situated in a social context. While signification is ultimately about mind-work, in the case of communication this mind-work is dependent on the physical appearance of the media product. However, some representation is clearly more closely tied to the appearance of the medium, whereas other representation is more a result of interpretation, and hence the setting of the perceiver's mind.

Thus, the spatiotemporal, the sensorial, the material and the semiotic modes together form the specific character of all media products, and generally also media types as they are circumscribed at certain periods. Traditional sculpture is three-dimensional, solid and non-temporal. It is primarily perceived visually, but it also has tactile qualities that can be understood as part of its defining qualities. Generally, the iconic sign function dominates. An animated movie, as we understand this media type today, with its moving images and evolving sounds, is temporal. It is mediated by a flat surface with visual qualities combined with sound waves. The images are primarily iconic, and they lack the specific indexical character of images produced by ordinary movie cameras. The sound generally consists of voices, sound effects and music: the musical sounds, but often also much of the voice qualities, are very much iconic, while the parts of the voices that one can discern as language are mainly interpreted as habitual signs. Printed advertisements, as they are normally understood, have a solid, two-dimensional, non-temporal materiality and are perceived by the eye. Most of them gain their meaning through verbal symbols combined with iconicity in the visual form of their elements, including the verbal symbols. Printed advertisements that contain readable words are sequential but not temporal as such, although the conventions of language make it necessary to read the letters and words in a certain order to make sense. As already emphasised, the presemiotic and semiotic modes of a media product offer certain possibilities and set some restrictions: not just any kind of cognitive import can be freely created on the basis of just any media type.

The concept of media modalities that I have outlined here roughly supports ideas about media always containing other media (McLuhan 1994 [1964]: 8, 305) and media always being mixed media: 'the very notion of a medium and of mediation already entails some mixture of sensory, perceptual and semiotic elements' (Mitchell 2005: 257, 260; cf. Mitchell 1994: 95, 2005: 215, 350). However, the concept of media modalities also accounts, in some detail, for how media are differently entangled in each other, and in which respects media may *not* be contained by or mixed with other media: different media necessarily share the four basic modalities, but they have the modes of the modalities only partly or not at all in common. There are media similarities and media dissimilarities and media are mixed, or multimodal, in dissimilar ways.

All media are multimodal in that they must have at least one mode from each modality. Most media are also multimodal in the sense that they have several modes from the same modality: they may be materially multimodal, having both solid and liquid modes, for instance. They may be spatiotemporally multimodal, being both two-dimensionally spatial and temporal, for example. They may be sensorially multimodal, being dependent on being both seen and heard. They may be semiotically multimodal, for instance, by forming cognitive import through icons and indices as well as symbols. Because signification requires at least some degree of activity of all three sign types, all media are probably semiotically multimodal. Some media, such as computer games and theatre, are multimodal on the level of all four modalities.
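The two senses of multimodality distinguished above can be made concrete with a minimal Python sketch. The class `MediaProduct`, its field names and the mode labels are hypothetical choices for this illustration, not part of the model, and the example records only the salient or dominant modes of a media type rather than every mode that may be weakly present.

```python
from dataclasses import dataclass

# The four modality names come from the text; representing modes as sets of
# plain strings is an assumption of this sketch.
MODALITIES = ("material", "spatiotemporal", "sensorial", "semiotic")


@dataclass(frozen=True)
class MediaProduct:
    name: str
    modes: dict  # modality name -> set of mode labels

    def __post_init__(self):
        # First sense: at least one mode of each modality is required.
        for modality in MODALITIES:
            if not self.modes.get(modality):
                raise ValueError(f"{self.name}: no {modality} mode given")

    def multimodal_in(self):
        """Second sense: modalities in which several modes are recorded."""
        return [m for m in MODALITIES if len(self.modes[m]) > 1]


# Salient modes of traditional sculpture as described earlier in the text.
sculpture = MediaProduct(
    "traditional sculpture",
    {
        "material": {"solid"},
        "spatiotemporal": {"width", "height", "depth"},
        "sensorial": {"visual", "tactile"},
        "semiotic": {"iconic"},  # dominant sign type only
    },
)
```

Here `sculpture.multimodal_in()` reports the spatiotemporal and sensorial modalities. Because only the dominant sign type is recorded, the sketch does not capture the point that all media are probably semiotically multimodal to some degree.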

The four media modalities are categories of basic media traits. However, the traits that they cover, the various modes, are not isolated, self-sufficient traits. Therefore, the proposed model offers no simple, mechanical way of checking off the modality modes, one after another, but it instead suggests a method of minutely investigating the features of various media and ways of analysing and interrelating them. This is a more detailed and specific way of outlining media multimodality compared to multimodality understood as the combinations of socially constructed entities such as writing, music and gesture. However, the model of media modalities does not in any way exclude the social aspect of communication, which I have already accounted for in the discussions of how communicating minds are formed, and which will return in the following section on media types.

# 1.5 What Are Media Types?

#### *1.5.1 Basic and Qualified Media Types*

Having outlined the concepts of media modalities, modality modes and multimodality, I can now suggest a way of thinking about media types. Reasoning in terms of types can involve several pitfalls. Nevertheless, it is virtually impossible to navigate in one's material and mental surrounding without categorising objects and phenomena; otherwise, everything would be difficult to grasp and to explain. Categorisation brings about borders—or at least border zones—and borders should always be disputed. The area of communication is no exception: it is unavoidable to categorise media into types, and it is not evident how these categorisations should be made.

What, then, does one categorise in communication? I suggest that a central element for categorisation in this broad area is the media product, understood here as a single entity in contrast to types of media. Whereas media products are individual communicative entities, media types are clusters of media products. In everyday discourse, and in this article (unless otherwise specified), the term 'medium' may refer to an individual media product as well as a media type. More specifically, 'a talk' and 'a photograph' refer to specific media products, and 'talk' and 'photography' refer to types of media.

Despite the complex nature of media products, it is fully possible to categorise them in various ways. A discussion of media categorisation requires that proper attention be paid to the basic qualities of media products, understood as physical intermediate entities that enable transfer of cognitive import between at least two minds, resulting in a virtual sphere in the perceiver's mind. This involves qualities that must be understood as being situated within the range from the purely material to the purely mental. I have already described these traits that involve physical properties as well as cognitive processes in terms of media modalities.

In the end, each media product is unique. However, thinking species such as humans feel the need to categorise things in order to navigate the world and communicate efficiently. This leads to the categorisation of media products, and, as is often the case with classification in general, our media categories are usually quite fluid. Nonetheless, thinking in terms of media modalities is helpful for understanding media differences and similarities and hence for understanding how media can be categorised. This is not the whole story, though. Some categorisations are more solid and stable than others are because they depend on partly dissimilar factors. There are simply different types of media categories.

That is why I find it helpful to work with the two complementary notions of *basic media types* and *qualified media types*—two types of media types. People sometimes pay attention to the most basic features of media products and classify them according to their most salient material, spatiotemporal, sensorial and semiotic properties. For instance, people sometimes think in terms of still images (most often understood as tangible, flat, static, visual and iconic media products). This is what I call a basic medium (a basic type of media product), and it is relatively solid because of its perennial fundamental traits. Basic media types are categories of media products grounded on basic media modality modes.

However, when such a basic classification is not enough to capture more specific media properties, we qualify the definition of the media type that we are after and add criteria that lie beyond the basic media modalities. We also include all kinds of aspects about how we produce, situate, use and evaluate media products in the world. We tend to talk about a media type as something that has certain functions or that we use in a certain way at a certain time and in a certain cultural and social context. Qualified media types are simply categories of media products grounded not only on basic media modality modes but further qualified.

For instance, we may want to delimit the focus to still images that are handmade by very young people: children's drawings. This is what I call a qualified medium (a qualified type of media product), and it is more indefinite than the basic medium of a still image, simply because the added specific criteria are vaguer than those captured by the media modalities. It may be difficult to agree upon what a handmade drawing actually is: Should drawings made on computers or scribble on the wall be included? When does a child actually become a young adult? The notion of childhood varies significantly among cultures and changes over time, not to mention the individual differences in maturity. Therefore, the limits of qualified media are bound to be ambivalent, debated and changed much more than the limits of basic media are.

Because processes of categorisation are multifaceted, serve different purposes and often involve vague terminology, the distinction between basic media types and qualified media types is not always clear in actual media classifications. Also, because the modes of the modalities are not always easily isolated entities, there is no definite set of basic media types. There is also an abundance of basic media that we have no terms for at all, which makes explaining and discussing them a cumbersome exercise. In fact, everyday language only covers a few rudimentary media types. Here I think of the terms 'text' and 'image' that, in various terminological constellations, come close to standing for several related basic media types.

If 'text' is defined as any media type primarily based on (verbal) symbols, it becomes possible to discern variations such as 'auditory text' (consisting of sound waves in air or possibly water or some other gas or liquid that are heard in a temporal flow), 'tactile text' (consisting of solid, three-dimensional signs on a surface that does not evolve in time) and various forms of 'visual text' (consisting of, say, non-organic or organic materials in two or three spatial dimensions that are either temporal or not). Likewise, if 'image' is defined as any media type primarily based on icons, it is possible to differentiate between basic media types such as 'auditory image' (consisting of sound waves that are heard in a temporal flow and resulting not primarily in verbal symbols but in icons), 'tactile image' (consisting of solid, three-dimensional signs on a surface that does not evolve in time) and several forms of 'visual image' like 'visual still image' (non-temporal) and 'visual moving image' (temporal) in various material appearances.
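The naming scheme in the preceding paragraph can be read as a small lookup from modes to terms. The following Python sketch is one possible encoding, under the assumption that three inputs suffice: the sensorial mode, the dominant sign type and whether the product is temporal. The function name `basic_media_label` and the argument labels are hypothetical, and material modes are left out for brevity.

```python
def basic_media_label(sense, dominant_sign, temporal):
    """Return an everyday term for a basic media type, or None if none exists.

    sense:         e.g. "visual", "auditory" or "tactile"
    dominant_sign: "symbolic" or "iconic" (other values yield None)
    temporal:      True if the media product evolves in time
    """
    kinds = {"symbolic": "text", "iconic": "image"}
    if dominant_sign not in kinds:
        # The passage above suggests no term for, say, index-dominated media.
        return None
    kind = kinds[dominant_sign]
    if sense == "visual" and kind == "image":
        # The passage distinguishes still and moving visual images.
        return f"visual {'moving' if temporal else 'still'} image"
    return f"{sense} {kind}"
```

For instance, `basic_media_label("auditory", "symbolic", True)` gives `'auditory text'`, and `basic_media_label("visual", "iconic", False)` gives `'visual still image'`. Many mode combinations fall outside this handful of terms, which is consistent with the observation that there are basic media for which no terms exist.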

Because of the almost infinite possible modal combinations, we must accept that some basic modal groupings are commonly distinguishable at a certain time and in a certain culture, and that the future may hold new habits and technical solutions that make novel basic media types relevant. For example, imagine a basic media type consisting of organic materiality in the form of a liquid that is perceived as both a spatial extension and a temporal flow, which can be both seen and felt and which produces mainly iconic meaning. Assuming that a technical medium of display capable of realising media products with such traits was invented and grew popular, we might expect an increasing need for a term to represent such a basic media type.

Categorising media products in basic media types is about categorising what are considered the relevant features of all perceived sensory configurations and how they trigger semiosis. We have observed that something becomes a media product because it attains a communicative function in mediating between several minds, but not all traits of the mediating physical entity or process are involved in the communicative function. The selective attention of the perceiver's mind, which is often formed by social praxis, decides what material, spatiotemporal and sensorial qualities of certain parts of the physical entity or process become involved in signification, resulting in a virtual sphere. When we perceive a standard book page, we usually ignore its slightly three-dimensional features and think of it as a flat surface; we also look at it in certain ways rather than try to taste it or listen to it.

Hence, basic media types, such as inorganic, flat, static, visual texts, are really categorisations of salient traits that enable communication in certain ways, not simply of objectively existing traits of physical items or occurrences. This becomes apparent especially considering the semiotic modality. Although they are based on the presemiotic modes, it is the semiotic modes that fulfil the communicative function of the media product, and different sign types, that is, different forms of representation belonging to different basic media types, may well result from similar forms of mediation depending on different forms of expectation and interpretation. For instance, when trying to make sense of certain inscriptions on an old monument, exactly the same visual, ornamental configurations can be understood either as icons representing natural objects or abstract ideas on the ground of perceived similarity or as symbols representing names or places on the ground of conventions.

As noted above, it is often insufficient to consider only the media modalities when seeking to understand how media products are categorised. One must also consider their communicative functions in societies and a world of constant change. In addition to basic media types, there are qualified media types, which depend on history, culture and communicative purposes. They include classes such as lectures, music, television programmes, news articles, visual art, Morse Code messages, sign language and email. Although they are normally based on one or several basic media types, and may therefore have a certain degree of stability, their defining features are formed by fluctuating conventions. My understanding of qualified media types comes fairly close to how other scholars have defined media at large: '"medium" could be defined in a moderately broad sense as a conventionally distinct means of communication, specified not only by particular channels (or one channel) of communication but also by the use of one or more semiotic systems serving for the transmission of cultural "messages"' (Wolf 1999: 35–36); 'what we identify as a specific "medium"—as well as what we consider "natural" about and how we perceive and use both traditional and new media—are shaped by a wide variety of factors, ranging from physical material, technological infrastructure, means of access, social conventions, media habits, preferences of communication partners, and institutional structures' (Rice 2017: 536).

One could say that the dependence of qualified media types on basic media types moderates the potentially radical changes of qualified media types. Although societies, technologies, cultures, values, habits and communicative expectations change, there is often a natural resistance towards complete metamorphoses of qualified media types. For instance, few would find a point in letting a qualified medium such as music be developed in such a way that its basic presemiotic modal qualities (sound evolving in time) were counted out. Likewise, one would hardly accept a qualified media type such as surveillance video to include media products that do not contain temporally evolving visual iconicity. Whereas painting is a qualified medium because expected aesthetic qualities are to be presented within certain social and artistic frames that are bound to undergo changes, its expected modal traits are relatively stable and provide a useable starting point for discussing the limits of the media type. For instance, few would accept that a media product that cannot be seen is a painting, and if it is strongly three-dimensional, rather than two-dimensional, a strong case could be made for it being a relief rather than a painting.

By the same principle—qualified media types depending on basic media types—there are categorisations that are often understood to form single qualified media types, whereas they might be seen as several interrelated qualified media. I argue that literature as art is preferably treated as at least two qualified media types: literature that one sees (reads) and literature that one hears. Of course, visual (written) and auditory literature are deeply entangled; we constantly transform the auditory to the visual and vice versa when we write down literature and read it out loud but still expect the different media products to function in roughly the same way. Hence, the qualifying processes are partly similar for the two qualified media, but they are still significantly different in certain respects since they are based on at least two different basic media.

Thus, qualified media types often contain more solid cores of basic media types, which partly justifies the much debated idea of medium specificity and the controversial notion that there are sometimes also essential differences between qualified media types. Whereas many scholars load their revolvers when they hear the word 'essential' (because media qualities that are described as essential are often just social constructions), I think that similarities and dissimilarities among qualified media types in terms of basic presemiotic and semiotic features can be said to be essential. Most people in most cultures now understand a qualified medium such as film to be a combination of visual, predominantly iconic signs (images) displayed on a flat surface and sound in the form of icons (as music), indices (sounds that are contiguously related to visual events in the film) and symbols (as speech), all expected to develop in a temporal dimension. The combination of these features is no doubt a historically determined social construction of what we call the medium of film, but given these qualifications of the medium, it has a certain essence.

Because qualified media types are cultural conceptions that are created, perceived and defined by human minds, there are no media types 'as such' and therefore no independent essences of qualified media 'as such'. However, once we agree that, for pragmatic reasons, it is meaningful to say that there are dissimilar media types, essential presemiotic and semiotic modes are inscribed into these conventionally defined qualified media. It would be nonsensical to argue that a static collection of visual symbols (letters and words) displayed on book pages or a screen actually constituted a film. This is because there are essential dissimilarities on a basic level between our conceptions of written literature and film. A century ago, the two qualified media were construed slightly differently, so the essential dissimilarities between what was then called written literature and film were slightly different; the same terms were used to refer to somewhat different qualified media types.

However, it is not always possible to trace cores of basic media in qualified media. A qualified media type such as popular science is so broadly conceived that it can be realised by all kinds of presemiotic and semiotic modes as long as scientific ideas are communicated in a way that is not too complicated. Whereas such qualified media types are vague in terms of modality modes, they may well be precise in terms of communicative functions.

Furthermore, not all media products are regularly categorised. As we have noted, there is an abundance of variations of media products, especially considering that any physical item or phenomenon may be drawn into communication and acquire the function of media product, but only the most institutionalised types of media products are clearly categorised as qualified media types. This is the case for non-professionals as well as scholars. Thus, there are several kinds of media products that we normally do not categorise in qualified media types. For example, certain television programmes are readily understood as instances of the nature documentary qualified media type; however, when using an empty glass to communicate the desire to get more beer, it is unclear to what type of qualified medium such a glass might belong. Although not urgent, this problem should be noted.

#### *1.5.2 The Contextual and Operational Qualifying Aspects*

The grounds on which media types are qualified can be divided into at least two main aspects. The first is the origin and delimitation of media in specific historical, cultural and social circumstances. This can be termed the *contextual qualifying aspect* and involves forming media types on the grounds of historically and geographically determined practices, discourses and conventions. We tend to think about a media type as a cluster of media products that one begins to use in a certain way, or that gain certain qualities, at a certain time and in a certain cultural and social context. This is in line with Joseph Garncarz's notion that media must be seen 'not only as textual systems, but as cultural *and* social institutions' (1998: 253). Visual art, Morse Code messages, sign language and email are not eternal media types, although they could be neatly described in terms of media modalities—they appear, they perhaps eventually disappear, and they are fully intelligible only in certain shared circumstances.

Sometimes it is more or less radical technological developments, such as the invention of new materials or forms of reproduction, that quickly trigger the genesis of what one takes to be new qualified media types (as is the case with various forms of so-called digital media). It may also be the case that new technology only slowly gives rise to new qualified media types. It has been argued that 'cinema' did not become 'cinema' the day the technique was invented (Gaudreault and Marion 2002). It took a while before a sufficient number of media products, created through cinematographic techniques, were original and characteristically similar enough to be thought of as a new media type. Eventually, two notions came to be attached to the same term: 'cinema' as a set of techniques and 'cinema' as a qualified media type developed within the frames of, but not determined by, the technological aspects. Video presents a similar case. First, a set of technical devices for the production, storage and distribution of media products were launched, and only later did these devices give birth to a qualified medium with certain communicative qualities (Spielmann 2008 [2005]). It is sometimes instead media products based on old techniques that are seen as a new qualified media type when they are adopted in new contexts, as when photographs are exhibited at galleries and museums and come to be seen as photographic art.

The second of the two qualifying aspects is the general purpose, use and function of media, which may be termed the *operational qualifying aspect*. This aspect encompasses construing media types on the ground of claimed or expected communicative tasks. Whereas communication is generally a goal-driven activity, the goals may be very different, so it is natural to associate individual media products with other familiar media products that are known to have certain purposes and functions. Therefore, media products tend to be categorised to enhance understanding of what they could or should achieve. This means that such classification is not only descriptive but also prescriptive; it may deeply shape the effects on the perceiver's mind. Here, I can hint at only a few of the myriad existing communicative functions.

On an overarching level, media products can be thought of as more private or more official; there is a difference between how secluded communication is expected to work compared to communication with open access for everybody. This is why the idea of a category of mass media (often referred to as simply 'media') is so widespread. It is a common evaluation that one's more private affairs are preferably communicated among a limited group of people that one trusts, whereas some media types are capable of reaching large groups of people and are therefore suited for communicating things of more general interest. In this way, the media types under the umbrella term 'mass media' are qualified operationally. However, media types are also qualified contextually. So, even though the distinction between private and mass media has never been sharp, we have seen in the last few years how the boundary has become increasingly blurred in so-called social media, where private and even intimate matters are commonly communicated openly and are at least potentially accessible to a mass audience. Although still useful for most people, the distinction between private and mass media types will clearly continue to be debated and modified.

On a more specific level, crossing the fragile border between private and mass communication, media products may be claimed or expected to bond, create trust or share affections among people. We think in terms of caresses, consolations, promises, gifts and acts of courtesy. Although it may feel unusual to think of these things as media products, they are precisely such intermediate entities that enable transfer of cognitive import among minds, and we categorise them according to their claimed or expected communicative functions. Similarly, media products may have main functions to warn, threaten or frighten.

It is also common for media products to be claimed or expected to communicate various forms of truthfulness. Although it is not always clearly detectible in terms of how we categorise media products, this kind of purpose and use probably permeates a majority of media types (with the obvious exception of decidedly misleading communication). Qualified media types that are maintained to communicate news—television news, articles in newspapers, public announcements on streets and town squares and perhaps even gossip—are mainly expected to be truthful regarding factual and weighty recent events and their interconnections. Qualified media types called documentaries are largely construed on the purpose and function of representing truthfully and in some detail the interconnections of a specific set of persons and events in the past or in the present. There is also a multitude of media types that overtly function to educate, inform, instruct, train, provide wisdom and the like—media types that can be circumscribed in terms of various forms of expected truthfulness. Similarly, artistic media types, even those that are termed 'fiction', are expected to communicate truthfully, albeit in ways that are partly different from those media types mentioned previously. Art is generally claimed and believed to communicate general rather than particular truthfulness, for instance, not necessarily what a living person with a certain name said, did and felt in a specific place on a particular date, but rather what many people are likely to say, do and feel under certain circumstances.

Other forms of claimed or expected communicative functions that steer the construction of qualified media types include entertaining and aesthetic qualities. A performer would not produce stand-up comedy if her or his performance was not at all amusing; videogames need to be pleasurable to some degree to be regarded as games; movies that fail to be scary in an engaging way are not likely to be seen as horror movies; and jokes that are not funny for anyone are not really jokes—or at best they are failed jokes. Disregarding the obvious difficulty of distinguishing art from entertainment (which is perhaps not really necessary), artistically qualified media types such as music, dance, calligraphy, poetry and architecture are construed on the assumption that to deserve to be included in these art forms, the media products must fulfil certain aesthetic standards. Although this view has been contested in various ways, it remains a central factor for most people.

One can highlight the importance of the operational qualifying aspect with a comparison of dance, gesture and so-called body language. Although dance is generally considered an art form that is governed by aesthetic standards, it is closely related to and dependent on gesture and body language—media types that are also seen as part of everyday practical communication. All three media types are probably among the most perennial and widespread forms of communication (less dependent on the contextual qualifying aspect), and they are virtually inseparable in terms of modality modes. The primary modes involved in dance, as well as in gesture and body language, are organic and solid materiality (the human body), all four spatiotemporal dimensions and visuality. Semiotically, I believe that all three media types are equally dependent on icons (signification based on similarity with elements, chains of events and ideas), indices (signification based on contiguity with entities and developments in the body's external surrounding as well as emotional and cognitive processes within the body itself) and symbols (signification based on habits—both personal habits and collective conventions). Therefore, the difference between dance on the one hand and gesture and body language on the other remains to be found in the operational qualifying aspect. Whereas dance is supposed to fulfil certain current aesthetic criteria in order for it to be accepted as such, the same does not apply for gesture and body language.

All of these particular qualifying aspects can exist side by side, and they may well overlap. As we have seen in some of the examples, the contextual and operational qualifying aspects often interact. As Jürgen E. Müller (2008a, b, 2010; cf. Bignell 2019) emphasised, the communicative functions of a media type often arise, become gradually accepted or disappear at certain moments in history and in certain socio-cultural circumstances. The qualifying aspects are, precisely, *aspects* of the multifaceted mechanisms that lie behind categorisations of media products, and it is probably feasible to split these aspects into three, four or even more specific aspects.

It is impossible to avoid noticing the relativity of most qualified media types. Sometimes, a qualified media type may also seem to contain several more finely restricted media types. These more limited qualified media types might be referred to as qualified submedia types, or simply submedia. The concept of a submedium is effectively the same as most notions of genre. In other words, a genre is a qualified media type that is qualified also within the frames of an overarching qualified medium: a submedium. However, some genres, such as Western novels and Western movies, being subtypes of novels and movies, attach to each other across the borders of qualified media types and exist as twin submedia.

In the end, it is probably always possible to add criteria to make further distinctions among qualified media types (cf. Ettlinger 2015). Because qualification and requalification of media types are bound to continue as long as humans exist and are able to communicate, total agreements are utopic and unnecessary. Consequently, my ambition here is not to argue in favour of certain ways of circumscribing particular qualified media types, but rather to highlight the general mechanisms behind basic and qualified categorisations of media products.

#### *1.5.3 Technical Media of Display, Basic Media Types and Qualified Media Types*

Having explained the concepts of basic and qualified media types as different forms of categorisation of media products, I will now clarify the relation between technical media of display and basic and qualified media types. I have defined technical media of display as any objects, physical phenomena or bodies that mediate sensory configurations in the context of communication; they realise and display the entities that acquire the function of media products. Thus, every technical medium of display can be described according to the range of basic media it can and cannot realise—or, more precisely, which presemiotic modes it is more or less fit to mediate. One could also argue that different technical media of display can realise basic media types more or less completely and successfully. However, strictly speaking, it is a contradiction in terms to say that a basic media type may be realised only in parts; if one or several modality modes are missing, it is actually another basic medium, and one must think in terms of media being transformed. This line of thinking is ultimately self-evident, considering that basic media types are categories of media products and media products are functions of sensory configurations mediated by technical media of display.

Given that every technical medium of display can only realise certain basic media types, it follows that they can also only realise certain qualified media types. This is because many qualified media are construed on cores of basic media and are therefore dependent on particular technical media of display. One can only realise a theatre performance by a combination of technical media such as human bodies, some form of indoor or outdoor area and props. A television set, which displays a feature film very well (apart from the size of the screen), is only capable of partly realising a theatre performance: the three-dimensional spatiality, complex corporeality and multisensoriality of the theatre are reduced to a flat screen and a concentrated source of sounds—which means that it is not really theatre that one sees and hears on the television, but theatre transformed to something else.

Since the existence of certain technical media of display is a facet of every historical moment and cultural space, several qualified media types are more or less strongly dependent on specific technical media having a socially determined existence (the contextual qualifying aspect). Technical media of display inevitably also play a crucial part in the forming of the general purpose, use and function of media (the operational qualifying aspect). An oil painting can be described as a qualified medium characterised not only by certain modality modes but also by unique aesthetic qualities linked to the technical medium of oil colour, which was invented and developed at a certain time and in a certain cultural context. Similarly, qualified media types such as computer games are inconceivable without the resource of recently invented technology, and more specifically, they depend on electronic screens as technical media of display, which have only existed relatively recently.

This historical and functional closeness between physical existents (technical media of display) and qualified ways of categorising media (qualified media types) explains why the same term is often used to represent both, which sometimes creates confusion. We have already noted that 'cinema' (technologies for producing but also displaying cinema) did not become 'cinema' (a qualified media type) the day the technology was invented. Likewise, the term 'photography' can refer to devices and techniques for production, to several technical media of display (paper in books and magazines, electronic screens, t-shirts and even cakes), or to one or several qualified media types (photography as documentation or as art).

On the other hand, some qualified media types are broadly conceived and not so determined by specific technical media of display. The way that sculpture is usually conceived means it can be realised by all technical media of display that can mediate solid, three-dimensionally spatial and visual materiality, which includes technical media such as bronze, stone, plaster, plastics, sand, ice and metal. This allows for a larger variety of individual media products within the same media category.

# 1.6 What Are Media Borders and Intermediality?

#### *1.6.1 Identifying and Construing Media Borders*

With a deeper understanding of the multimodal nature of media products as well as their categorisation in media types, it is now possible to return to the issue of intermedial relations. For good reason, scholars have argued that intermediality is a result of constructed media borders being trespassed. Indeed, nature does not give any definite media borders, which means that it is not evident what intermedial relations are. Werner Wolf emphasised that media borders are created by conventions and defined intermediality as a relation 'between conventionally distinct media of expression or communication: this relation consists in a verifiable, or at least convincingly identifiable, direct or indirect participation of two or more media in the signification of a human artefact' (Wolf 1999: 37). Christina Ljungberg stressed the performative aspect of border crossings, arguing that intermediality is something that sometimes 'happens', an effect of unconventional ways of performing medial works (Ljungberg 2010).

However, there are at least two kinds of media borders. As we have seen, media differ partly because of modal dissimilarities and partly because of divergences concerning the qualifying aspects of media, and the conventionality and performativity of media borders are mainly a facet of the qualifying aspects (Rajewsky drew a similar conclusion [2010]). Intermedial relations between basic media types such as visual moving images and visual still images can be relatively clearly described within the framework of the four modalities, whereas intermedial relations between qualified media types such as auditory literature and music largely also rely on the two qualifying aspects.

In the first case, the border between the two basic media (visual moving image and visual still image) lies in the spatiotemporal modality, since still images are spatial, whereas moving images are both spatial and temporal. In the second case, the border between the two qualified media (auditory literature and music) is partly modal in character and partly qualified in character. It is modal because of differences in the semiotic modality: all auditory literature is primarily (but not exclusively) symbolic, and music is primarily (but not exclusively) iconic. It is qualified because the boundaries between what one counts as auditory literature and music largely depend on different communicative ambitions and expectations. A reading of a poem that is reasonably close to the sound of ordinary speech is generally considered to be literature, whereas a singing performance of the same poem counts as music. However, there are many performance variants between the literary and the musical that cannot be clearly classified as either auditory literature or music since there is no definite border to be crossed. Instead, there is a border zone that is located differently in different periods and cultures. The classification is sometimes simply a question of whether the poem is performed within the frames of a poetry event or a musical concert. However, this cultural and aesthetic ambiguity of the difference between auditory literature and music is clearly linked to the semiotic modality. Even a neutral reading of a poem has some iconic potential, and what one takes to be the increasing musicality of a more varied, rhythmic and melodic reading is, in fact, strongly linked to increased iconicity.

Thus, I subscribe to the idea that the borders between what I refer to as qualified media types are largely relative. Boris Eikhenbaum's brief comment from nearly a century ago about the media types that we call art forms remains relevant today: 'None of the arts are fully bound entities, since syncretic tendencies are inherent in each of them; the whole point is in their inter-relationship, in the grouping of elements under one sign or another' (1973 [1926]: 124–125). I also believe that Mitchell's later contention that there are no 'essential' differences between media that are 'given for all time by the inherent natures of the media, the objects they represent, or the laws of the human mind' (1987: 2–3) is broadly correct—if we consider the qualifying aspects of media types. However, it is also the case that several qualified media types have indispensable cores of basic media types, which means that once a community has formed these qualified media types on the ground of contextual and operational qualifications, and as long as they are of service, they may differ 'essentially', regarding modality modes, from other qualified media types. As long as we think that a weather forecast on the radio is something that we have to hear and a printed newspaper article is something that we have to see, there will be an 'essential' difference between the sensorial modes of these two qualified media types.

In brief, then, the classification of basic media types is relatively stable, whereas the classification of qualified media types is relatively unstable. It follows from this that media borders can be stronger and weaker; in other words, media borders can be understood to be both identified and construed, depending on whether one considers basic media borders or qualified media borders.

#### *1.6.2 Crossing Media Borders*

One might understand the crossing of media borders as the phenomenon that one can classify a particular media product in different ways. For instance, one might categorise a certain three-dimensional, solid artefact as both an artistic sculpture and an object for religious adoration, which means that it, in a broad sense, bridges over qualified media borders. This is possible because the processes of categorising media products in qualified ways are largely open-ended, overlapping and changing.

However, one might also understand the crossing of media borders in a narrow sense as bridging over basic media borders. To explain this, it is important to consider the cross-modal cognitive capacities of the human mind, which no doubt evolved to make it possible to cope with a multimodal world. Practically all media borders can be bridged over to some extent, although certainly not completely, through these cross-modal cognitive capacities. They are central for mediality as such and indispensable for understanding intermedial relations.

Within a semiotic framework, cross-modal cognitive capacities refer to the abilities to create *cross-modal representations*. In the context of communication, these abilities explain the imperative phenomenon that meaning-making often goes beyond the media product's actual presemiotic modality modes. For instance, a visual, two-dimensional and static image may represent something that is perceived to be both three-dimensionally spatial and temporal, such as a deer running in the forest. Whereas we perceive only two actual dimensions with our eyes, we perceive (or rather construe) virtual third and fourth spatiotemporal dimensions in our mind. Similarly, we regularly construe virtual materialities and sensory perceptions. A relief on a temple wall that is actually made of stone may be understood to represent a living organism such as a lion, which means that the representation crosses the border between non-organic and organic materiality. When studying a musical score, we only actually perceive visual configurations, but we understand them to represent auditory patterns: virtual sound is construed in our minds. All of these virtualities, these represented objects that are made present to our minds through signs in communication, result from semiotic activity: iconicity, indexicality and symbolicity. Thus, virtual spheres are partly made of interpretants resulting from cross-modal representation.

Cross-modal representation in communication involves a difference between the presemiotic modality modes of the media product and the material, spatiotemporal and sensorial traits of the virtual sphere that it represents, which requires cross-modal cognitive capacities. In single-modal representation in communication, the material, spatiotemporal and sensorial modes of the media product are akin to the traits of the virtual sphere that it represents (such as a solid, visual, two-dimensional, static image representing a solid, visual, flat and unchanging object). This is arguably less cognitively demanding.

The term 'cross-modal' is used in various ways in a multitude of research areas. In the context of communication, it usually refers to connections among the external senses (see, for instance, Brochard et al. 2013). However, in line with the concept of media modalities, cross-modal here means the linking of all forms of different presemiotic modes within the same media modality. More specifically, cross-modality should be understood here as *cross-material, cross-spatiotemporal and cross-sensorial representation through iconicity, indexicality or symbolicity*. For instance, solid media products may represent non-solid objects, static media products may represent temporal objects, and auditory media products may represent visual objects—through iconicity, indexicality or symbolicity. Importantly, this means that dissimilar basic media types can partly represent the same objects. For instance, the notion of a running dog—a solid, organic, spatiotemporal and largely visual and auditory object—can be represented by a variety of different basic media types, not just solid, organic, spatiotemporal and visual or auditory media. This is what I mean when I state that cross-modal cognitive capacities can bridge over basic media borders: our minds are, to some extent, capable of leaping from mode to mode in the act of representation.

The functions of icons, indices and symbols—iconicity, indexicality and symbolicity—may be simple and straightforward as well as complex and sometimes difficult to grasp. All three sign types may cross the boundaries of what Peirce called the representamen, in the respect that something visual can represent something tactile, something static can represent something temporal and so forth. However, cross-modal representation may also mean that something material represents something mental. Our minds' capacity to connect the experience of concrete objects and phenomena with the experience of thinking, feeling, perceiving and imagining is fundamental for our ability to communicate cognitive import. Whereas a visual circle may be an icon for a material, concrete object such as the sun, it may also work as an icon for mental, abstract phenomena such as harmony, satisfaction or eternity because of a perceived similarity between the visual form and the cognitive notions. A visual circle may also function as an index for the earlier presence of a material object like a pen or a brush that actually created the circle. Similarly, it could be understood as an indexical sign for the mental act of wanting to draw a circle: there is a real connection between the producer's intention and the realised circle. The pen *was there*, but also the idea *was there*. Finally, a visual circle may be understood as a symbol, a sign based on habits, such as the letter O. In English, the written letter O signifies symbolically in at least two different ways. On one hand, it stands for a certain kind of sound (or rather a group of related sounds), and sound is a material phenomenon that we perceive with our external senses. On the other hand, the letter O stands for something abstract and conceptual in the sense that it represents a linguistic function—to form meaningful words—that can only be realised in conjunction with other letters.

Although abundantly present in all three sign types, cross-modal representation is perhaps most noteworthy in iconicity (Ahlner and Zlatev 2010; Elleström 2017). The ability to perceive cross-modal similarities is a remarkable cognitive capacity. While similarities are most clearly perceived among visual and auditory phenomena, respectively (a photograph of a boat clearly looks like a boat and a skilled whistler is able to sound just like a blackbird), similarities can be established across material, spatiotemporal and sensorial borders—and between the material and the mental. This is because mode-specific dissimilarities of details can be disregarded and similarity can be perceived on higher, more abstract and cross-modal levels. For example, visual traits may depict auditory or cognitive phenomena, and static structures may depict temporal phenomena. Hence, graphs may depict both changing pitch and altering financial status. Similarly, a variety of media types can depict similar ideas and concepts, such as the notion of speed, because they are abstracted from a broad range of sensory perceptions of different materialities and also mental experiences.

Initially, the purpose of my account of material, spatiotemporal and sensorial modes was to clarify the basic properties of media products working as *representamens*. However, as I have just demonstrated, it is clear that the modalities can also be used to characterise the *objects* of media products—what they represent, what they call forth in the mind of the perceiver in creating a virtual sphere. While represented objects such as abstract concepts may have an almost purely cognitive character, objects that are made present to the mind in signification may also be more or less concrete and physical. A painting of a face represents a face because the features of the painting are similar to the features of actual, physical faces as they are stored as recollections in our minds (Elleström 2014a). Hence, media products have certain material, spatiotemporal and sensorial modes, and, similarly, the objects that they depict, deict or describe may have either the same or other material, spatiotemporal and sensorial modes—or they may have a cognitive nature.

#### *1.6.3 Intermediality in a Narrow and a Broad Sense*

Given that media types and media borders are of various sorts and have different degrees of stability, it follows that media interrelations are multifaceted. Therefore, it may be helpful to provide some elementary divisions regarding the general nature of media interrelations. I first postulate that *mediality* is everything pertaining to media in communication. *Intramediality* concerns all types of relations among similar media types, and *intermediality* involves all types of relations among dissimilar media types. However, considering that there are (at least) two kinds of media borders, there are (at least) two ways of understanding media interrelations, making the classes intramediality and intermediality broader or narrower.

The term 'intramedial' is commonly used to refer to slightly different conceptions depending on how the notion of medium is circumscribed (see, for instance, Rajewsky 2002: 12). This is the case also for 'intermedial'. Here, I follow the distinctions that I have recently expounded and suggest that media interrelations can be intramedial in a broad and in a narrow sense. Intramediality in a broad sense regards relations among (media products belonging to) similar basic media types, and intramediality in a narrow sense regards relations among (media products belonging to) similar qualified media types. Similarly, I suggest that media interrelations can be intermedial in a broad sense and in a narrow sense. Intermediality in a broad sense regards relations among (media products belonging to) dissimilar qualified media types, and intermediality in a narrow sense regards relations among (media products belonging to) dissimilar basic media types.

Thinking of intramedial and intermedial relations in a narrow and a broad sense is useful for disentangling the intricate notion of crossing media borders. To avoid confusion, it is recommended to keep the intramediality and intermediality classes together, which entails combining one broad and one narrow notion. Intramediality in a broad sense (meaning relations among similar basic media types) belongs together with intermediality in a narrow sense (meaning relations among dissimilar basic media types). Intramediality in a narrow sense (meaning relations among similar qualified media types) belongs together with intermediality in a broad sense (meaning relations among dissimilar qualified media types).

Elaborating on intermediality, it can be concluded more specifically that intermedial relations in a narrow sense are relations among (media products belonging to) dissimilar basic media types, that is, relations among media types based on different modality modes. This involves transgressing relatively strong media borders when moving between them. Intermedial relations in a broad sense, on the other hand, are relations among (media products belonging to) dissimilar qualified media types *including cases where no differences in modality modes are present*. Because several qualified media types are based on the same modality modes, they belong to the same basic media type, and their interrelations are intermedial only in a broad sense. This involves transgressing relatively weak media borders when moving between them. For instance, the two media types *written poetry* and *scholarly article* are clearly qualified in different ways, although they are both typically understood to consist of visual, static and mainly symbolic signs on a flat and generally solid surface. Whereas the interrelation between written poetry and scholarly article is intermedial in a broad sense, it is not intermedial in a narrow sense. Sections of poetry can normally be seamlessly incorporated into scholarly articles (and vice versa) without modifying modality modes.

Thus, intermedial relations in a narrow sense are largely a question of 'finding' or identifying media borders between dissimilar basic media types. Intermedial relations in a broad sense are more a question of 'inventing' or construing media borders between dissimilar qualified media types based on similar basic media types. As the mechanisms for classifying media products into media types are anything but clear-cut, it is often not evident how to apply this seemingly straightforward distinction between different forms of media interrelations. However, the division of intermedial relations into a narrow and a broad sense offers a methodical way of considering the intricate nature of intermediality.

# 1.7 What Are Media Integration, Media Transformation and Media Translation?

#### *1.7.1 Heteromediality and Transmediality*

Media interrelations are multifaceted. I now wish to add another viewpoint on media interrelations, to be placed on top of the ones already discussed. I suggest distinguishing between a synchronic and a diachronic perspective on media interrelations. Having a synchronic perspective means considering how media features appear at a certain moment. Having a diachronic perspective means considering how media features appear in relation to preceding and possibly subsequent media. Evidently, these two perspectives are analytical outlooks; I do not suggest using them to categorise media products. All media products can be investigated from both a synchronic and a diachronic perspective. While there is no doubt that certain media products are remarkably apt for diachronic analysis, no media products exist that cannot be treated in terms of diachronicity with some profit.

I propose calling the synchronic perspective on media interrelations *heteromediality*. With references to Mitchell (1994) and Elleström (2010), Jørgen Bruhn defined heteromediality as 'the multimodal character of all media and, consequently, the *a priori* mixed character of all conceivable texts' (2010: 229). I think this is an apt description of how media exist from a synchronic perspective. For me, the term 'heteromediality' refers to the general concept that all media products and media types, having partly similar and partly dissimilar basic presemiotic modes, overlap and can be described in terms of amalgamation of material properties and abilities for activating mental capacities that can be understood as various sign functions. This implies that media products and media types can only be properly understood in relation to each other. In my view, heteromediality, the synchronic perspective on media interrelations, is equally relevant for intra- and intermedial relations. It is the fundamental condition for mediality as such.

I also propose calling the diachronic perspective on media interrelations *transmediality*. Transmediality has been widely discussed and defined in various but fairly consistent ways. For instance, Irina O. Rajewsky circumscribed transmediality in terms of phenomena that are not media-specific, such as parody (Rajewsky 2002). I propose a very broad delineation of transmediality to match the comprehensive concept of heteromediality. For me, the term 'transmediality' refers to the general concept that media products and media types can, to some extent, mediate equivalent sensory configurations and represent similar objects (in Peirce's sense of the notion); in other words, they may communicate comparable things (Elleström 2014b: 11–20). This means that there may be transfers in time among media. Even though multitudes of more or less different media products and media types are used, communication can be grasped as a succession of interconnected representations, chains of overlapping virtual spheres. Clearly, transmediality, the diachronic perspective on media interrelations, cannot be properly understood without profoundly comprehending heteromediality, the synchronic perspective on media interrelations. Like heteromediality, transmediality is relevant for both intramedial and intermedial relations. However, because of the complicated nature of media differences, transmediality in intermedial relations will be discussed separately and receive more attention. In these discussions, intermediality means intermediality in a narrow sense (relations among dissimilar basic media types), and intramediality means intramediality in a broad sense (relations among similar basic media types). This is because I want to focus specifically on the role of media modalities.

Heteromediality concerns the *combination and integration* of media products and basic or qualified media types. How can media be understood, analysed and compared in terms of the combination and integration of modality modes and qualifying aspects? This viewpoint emphasises an understanding of media as coexisting modality modes, media products and media types. Therefore, (intramedial and intermedial) heteromediality can also be called *media integration*.

Intermedial transmediality concerns *transfer and transformation* of media products and basic or qualified media types. How can the transfer and transformation of cognitive import represented by different forms of media be adequately comprehended and described? This viewpoint emphasises an understanding of media involving temporal gaps among modality modes, media products and media types—either actual gaps in terms of different times of genesis or gaps in the sense that the perceiver construes the import of a medium based on previously known media. Because media differences bring about inevitable transformations, intermedial transmediality can also be called *media transformation*.

Intramedial transmediality concerns the *translation* of media products and basic or qualified media types. I use the term 'translation' to adhere to the common idea that translation involves transfer of cognitive import among similar forms of media, such as translating written verbal language from Chinese to English. Therefore, intramedial transmediality can be broadly referred to as *media translation*.

#### *1.7.2 Media Integration*

As stated, the synchronic perspective on media interrelations, heteromediality, is foundational for comprehending mediality as such, and there is little point in distinguishing between intramedial and intermedial heteromediality. It is imperative to emphasise both the notion of combination and the notion of integration, stressing that sharing and combining media properties always entails integrating them to some degree. That is why I also refer to heteromediality as *media integration*. Compared to other intermediality scholars, I more strongly emphasise that there is a floating scale between combination and integration and avoid stricter divisions. For instance, Hans Lund made a heuristic distinction between three kinds of word–picture relations: combination, integration and transformation (1992 [1982]: 5–9). Claus Clüver distinguished between multimedia texts (separable texts), mixed-media texts (weakly integrated texts) and intermedia texts (fully integrated texts) (2007: 19).

The core of heteromediality consists of the multimodal character of media products, as explained in some detail in the earlier sections of this article. Every media product is made of a combination of media modality modes, generally including several modes from at least some of the modalities. Consequently, it is fair to say that media products consisting of many different modes are integrated or even mixed already as single media products, as Mitchell emphasised (1994). However, it is vital to note that media types are modally mixed or integrated in very different ways, allowing different kinds of media integrations with other media types composed of dissimilar modal mixtures.

Heteromediality also involves the combination and integration of different media products (that are already integrated on a more basic level). The circumstances under which a person is motivated to decide that she or he is dealing with 'one' media product rather than 'several' media products are rarely evident. Therefore, it may be that one and the same act of communication can be accurately analysed as consisting of one highly multimodal media product as well as of several thoroughly integrated media products. For instance, two people engaged in face-to-face communication both continually produce temporal, auditory and visual sensory configurations with a multitude of other modality modes, using their bodies and their immediate extensions, and perhaps other items, as technical media of display to realise a stream of communication. As the two minds give each other feedback, the continuous communication is nevertheless segmented in turn-taking to a certain degree; there are moments of relative silence or immobility on one side or another, after which something at least partly new is produced.

Although it may be impossible to determine exactly when or where one media product ends and another one begins, it is reasonable to think that each communicating mind in a case like this produces several media products rather than one. Similarly, one may experience that there is a certain autonomy in what one sees and hears. In other words, gestures and body language might (or might not) be perceived as media products that are not fully integrated with speech, because we are all familiar with hearing speech without seeing gestures and body language, and vice versa. However, these mental mechanisms of perceiving either single or several media products are certainly affected not only by the representing sensory configurations but also by the represented objects. The more successfully a single coherent virtual sphere is created, the more one is probably inclined to say that the media products are deeply integrated or actually constitute a single media product forming one perceptual gestalt. This means that, in each communicative situation, such as when one encounters a multitude of impressions during a lecture involving a variety of educational aids, it may be an open question whether one is guided by the disparity of material, spatiotemporal, sensorial or semiotic modes and feels that one encounters several combined and more or less integrated media products, or rather perceives a single total and highly multimodal media product. In any case, the heteromedial perspective offers theoretical tools for disentangling the interrelations.

Media types are categories of media products, which means that it may be an equally open question whether we are dealing with a weak combination or a strong integration of several basic or qualified media types, or in fact just a single highly multimodal, inclusive media type. This is because media categorisations are subjective and follow pragmatic communicative incitements rather than systematic rules. Nevertheless, it is clear that highly multimodal media types, compared to less multimodal media types, are more often perceived as combinations and integrations of several media types, most likely because one is used to experiencing and thinking of the various parts separately.

For instance, a qualified media type such as documentary photography can be said to be grounded on the basic media type materially solid, visual and flat still images. Similarly, a qualified media type such as animated cartoons for children might be said to be grounded on a single broad basic media type that is materially both solid and in gas form, spatiotemporally consisting of time and at least two spatial dimensions, sensorially audiovisual and semiotically dominated by icons as well as indices and symbols. However, it is probably more enlightening to think in terms of an integration of several basic media types (that can actually be perceptually separated). On one hand, there are materially solid, visual and flat moving images; on the other hand, there is what might briefly be described as auditory text (verbally symbolic, temporal sounds that are heard) and non-verbal sounds (iconic and indexical, temporal sounds that are heard).

Theatre, to take another example, potentially combines and integrates a multitude of basic media types; almost anything can be brought into a scene and made part of the performance. The aesthetic aspects of these combinations and integrations of basic media are part of how many people understand and define theatre as a qualified media type. Each basic medium has its own modal characteristics, and when combined and integrated according to certain communicative ambitions and expectations, the result is known as 'theatre'. Theatre consists of different kinds of materialities—which are both profoundly spatial and temporal, appeal to both the eye and the ear and produce meaning by way of all kinds of signs—and it is contextually and operationally qualified in several ways. Therefore, theatre could be described as a profoundly multimodal qualified medium that is susceptible to intermedial analysis. It makes sense to say that it not only integrates several basic media, but also several qualified media; one may recognise parts of a theatre performance as, say, music, architecture, gesture, dance and speech. However, it might be an overstatement that 'theatre is a hypermedium that incorporates all arts and media' (Chapple and Kattenbelt 2006: 20; cf. Kattenbelt 2006: 32) because once the different media types are integrated, they become something else: the qualified medium of theatre.

To compare, one could argue that the pop song (here narrowly understood as something that one listens to without access to live performance) is a qualified medium that combines the two basic media types of auditory text (verbal symbols that are heard in a temporal flow) and auditory image (icons that are heard in a temporal flow). The consequences of combining and integrating these two basic media are not as far-reaching as the combination of several basic media in theatre. Auditory text and auditory images have the same materiality: sound waves that are taken in by the organs of hearing. Their way of being fundamentally temporal, but also to a certain degree spatial, is similar. The difference between auditory text and auditory image is clearly in the semiotic modality: whereas signification in auditory texts is mainly based on symbols and grounded on habits, signification in auditory images is mainly based on icons and grounded on similarity.

However, an unqualified combination and integration of these two basic media types is not enough to produce a pop song. Normally, both the auditory text and the auditory image need to have certain qualities that confer on them not only the value of 'lyrics' and 'music' but also of 'pop lyrics' and 'pop music'. The qualities of qualified media types become even more qualified when aspects of qualified submedia types, or simply genres, are involved. We usually consider the lyrics produced by the singer to be music in themselves, as is the sound produced by the instruments. Consequently, the integration of the two basic media in a pop song is deep, since the two media types are virtually identical when it comes to three of the four modalities. Concerning the fourth modality, the semiotic, it is perfectly normal to integrate the symbolic and the iconic sign processes in the interpretation of both lyrics and music. Whereas literary texts are generally more saliently symbolic, and music is generally more saliently iconic, the combination and integration of lyrics and music stimulates the perceiver to find iconic aspects in the text and to realise the symbolic facets of the music.

Compared to theatre, the basic media of pop songs are strongly integrated because of their identical sensory configurations, which may make it seem that they are actually based on one basic media type and constitute one qualified submedium rather than an integration of several submedia. By contrast, because of its strongly multimodal character, theatre might be seen as comprising several integrated basic and qualified media types rather than just one.

#### *1.7.3 Media Transformation*

As noted, the diachronic perspective on media interrelations, transmediality, is relevant for both intermedial and intramedial relationships. It covers all kinds of actual and potential diachronic media interrelations. This goes beyond the general field of media history, that is, the study of how media types evolve throughout the centuries (we find this narrower sense of a diachronic perspective on media in, for instance, Rajewsky 2005: 46–47). Regarding the diachronic perspective on *intermedial* relationships among dissimilar media (which I comprehend here as intermedial in a narrow sense: relations among dissimilar basic media types), I find it imperative to emphasise both the notion of transfer, indicating that identifiable represented characteristics are actually or potentially relocated among media (the narrative of a comic strip can be clearly recognised in a movie), and the notion of transformation, stressing that transfers among different media always entail changes (the narrative in the movie can hardly be identical to the one in the comic strip). For the sake of brevity, however, I refer to this perspective simply as *media transformation*; thus, media transformation equals intermedial transmediality.

Just as a combination of media products and media types involves grades of integration, transfer of cognitive import among media products and media types involves transformation, to different degrees. The human body, a technical medium of display, perfectly realises a solo dance or a gesture. In order to communicate something similar to the dance or the gesture, the technical medium of a television screen will work quite well, a printed still image will do the job less well, and the sound emitted by a radio will only be able to realise media products that are radically altered, although they may still be able to create recognisable virtual spheres. This depends on the dissimilar modal capacities of the various technical media of display, suitable for realising different basic media types. Therefore, when the transfer of cognitive import among media is restricted by the modal capacities of the technical media of display, or when the technical media allow of modal expansion—in brief, when the transfer brings about more or less radical modal changes—it can be described as transformation.

More specifically, transmediality generally involves the idea that different media products (belonging to the same or dissimilar media types) may trigger the same or similar cognitive import; they may create the same or at least similar virtual spheres. Therefore, it is only a short step from the idea that virtual spheres may be transmedial, to varying degrees, to recognising that cognitive import can be transferred among similar or different kinds of media. When inserting a temporal perspective, it often makes sense to acknowledge not only that similar cognitive import is or may be signified by various media, but also that parts of or even whole virtual spheres, that are similar enough to be recognised, may recur after having appeared in another medium. Thus, transmediality involves actual or potential transfers of cognitive import not only among minds (which is the indispensable core of communication as such), but also among media, that is, among minds perceiving different media.

When describing how a media product is perceived and construed in prompting specific cognitive import forming a particular virtual sphere, it is convenient to simply refer to its *characteristics*. I used the term 'compound media characteristics' earlier to represent the concept that media products and media types bring into being individual or typical cognitive import that forms specific (types of) virtual spheres in the perceiver's mind (Elleström 2014b). The term includes the word 'compound' to avoid mixing up the material, spatiotemporal and sensorial media traits that *represent* (the presemiotic modality modes) and the multifaceted characteristics that are *represented*. Therefore, it might be clearer to instead use the term 'represented media characteristics' or simply 'media characteristics', while recalling that 'media characteristics' refers to the represented cognitive import.

Represented media characteristics include everything that one might think of. They may be concrete or abstract and they may be conceived in terms of form or content: animals, persons, minds, structures, stories, rhythms, compositions, explanations, contrasts, themes, motifs, ideas, events, interrelations, moods and so forth. Some of the things and phenomena that media represent have material, spatiotemporal and sensorial traits. However, all things that media represent, in the broad sense of making them present to the perceiver's mind, are media characteristics.

The advantage of sometimes using the term 'represented media characteristics' instead of simply 'cognitive import' is that it emphasises the specificity of what certain media products or media types represent. Certain media characteristics are attached to particular media products and some are attributed to particular basic and qualified media types. Ultimately, though, 'represented media characteristics' means the same as 'specific cognitive import created by the perceiver's mind in communication'. The point here is that represented media characteristics are more or less transmedial, meaning that they can be more or less successfully transferred among different media products or even different basic and qualified media types (Elleström 2014b: 39–45). This largely, but certainly not solely, depends on the present or absent modality modes of the involved media.

Returning to specifically intermedial transmediality, I distinguish between two forms of media transformation (intermedial transmediality). The first is *transmediation* (repeated representation of media characteristics by a different form of medium, such as a person orally communicating the same story as a computer game), and the second is *media representation* (representation of another medium of a different type, such as a written review that describes the performance of a piece of music).

Transmediation, another kind of medium that again represents some media characteristics, can more precisely be described in terms of my previous distinction between intracommunicational and extracommunicational domains. The intracommunicational domain consists of the virtual sphere—represented cognitive import. The extracommunicational domain consists of the perceived actual sphere and other virtual spheres: cognitive import stemming from previous representations in earlier communication. Transmediation occurs when already represented objects from other virtual spheres, created by other media types, become part of a virtual sphere; this is the same as saying that media characteristics are represented again by another form of medium. For instance, the people in a newspaper photograph or the visual actions in a film may be described by spoken words; a musical score may be performed by a musician; the oral statements of a witness may be written down; a story and characters in a theatrical play may be adapted to a movie; the gist of a scientific account may be rendered into a visual diagram; and written alphabetical text may be transformed to Braille writing. Even the recipe in a cookbook being realised as a meal communicating, for instance, affection, contrasts or the sense of a certain season of the year, can be understood in terms of transmediation.

Examples of media representation, a medium representing another medium of a different kind, are dialogues, gestures or photographs being heard and seen in a film; a scholarly treatise discussing media interrelations; pictures of drawings on a website; a song about love letters; and a written article in a magazine describing social media. If a written article in a magazine not only describes social media in general but also, say, events that have already been communicated on social media, we have media representation *and* transmediation. The two types of media transformation are not in any way mutually exclusive; on the contrary, they often coexist. Furthermore, they include not only transformations among specific media products but also among qualified media types and between media products and qualified media types. Filmic qualities in a written article in a magazine are a case of transmediation from the qualified medium of film to a specific media product. The artistic genre ekphrasis is generally defined as poems representing paintings, which is a case of qualified submedia representing other qualified media and normally includes transmediation of media characteristics from painting to poem.

I want to emphasise that it is not necessarily the technical medium of display that 'forces' the transformations in media transformation. Naturally, media transformations may also result from communicative choices to take advantage of the modal possibilities offered by the target medium. In the classical example of novels being adapted to films, modal differences between the two qualified media types clearly make it necessary to alter many things; however, transmediations of this kind also offer possibilities for creative choices and voluntary transformations that are desirable. In this case, transmediation can be seen as a possibility rather than a problem. In other cases, such as transmediations among statements, written reports and footage from surveillance cameras in criminal trials, transmediation is definitely a problem rather than a creative opportunity; judges rarely appreciate inventive new versions of earlier media characteristics.

Obviously, there are many kinds of media transformation. These sometimes involve fairly clear and complete relations between media products, such as when a particular newspaper article is evidently recognisable in its online version (albeit with fewer words and added animations and hyperlinks), or when a specific novel can be identified as the source of a feature film (although the narrative has been abridged and sound and visual iconicity have been added). It is sometimes rather a question of less definitive and fragmentary media characteristics that travel among media products and media types, such as when musical form is traced in a short story, when visual characteristics associated with comic strips can be said to have found their way to a television commercial, or when certain formal media characteristics of literature are transmediated to dance (cf. Aguiar and Queiroz 2015).

As demonstrated in the section on media borders, transfer of media characteristics over modal borders is often possible despite essential presemiotic and semiotic dissimilarities among media. This is not least because our brains have cross-modal abilities; they can make meaningful transmissions between, say, visual and auditory information, or spatial and temporal forms of presentation. This allows for media characteristics being more or less transmedial. Hence, the fact that there are fundamental or even essential media dissimilarities does not preclude shared representational capacities and the transfer of media characteristics among dissimilar media. Over thirty years ago, Dudley Andrew noted that in order to explain how different sign systems can represent entities that are approximately the same (such as narratives), 'one must presume that the global signified of the original is separable from its text' (Andrew 1984: 101). This is no doubt true, especially if one relativises the proposition and adds that the represented media characteristics are *to some extent* separable from the representing sensory configurations. Represented objects are ultimately cognitive entities in our minds, and these entities can be made present by different kinds of signs, although media differences will always ensure that they are not completely similar when represented again by another kind of medium.

#### *1.7.4 Media Translation*

Although I have discussed transmediality primarily within the frames of intermediality (in a narrow sense), the diachronic perspective on media interrelations is relevant also for *intramedial* relations (which I comprehend here as intramedial in a broad sense: relations among similar basic media types, which may actually involve dissimilar qualified media types). I refer to intramedial transmediality as *media translation*. I choose this term because 'translation' attaches to the common notion of translation as transfer among verbal languages. Hence, media translation is an extension of this idea to include transmediality among all forms of similar media types, not just media types based on verbal language. Much of what I have said about media transformation is also applicable to media translation, with the obvious difference that whereas media transformation involves dissimilar media types, media translation involves similar media types, which makes media translation somehow less complicated to grasp. Nevertheless, basic media transformation categories such as transmediation and media representation have their equivalences in media translation. Intramedial transmediation would then include phenomena such as cover versions of pop songs, remakes of feature films, rephrased oral statements and translations of menus from Spanish to English. Intramedial media representation could include dinner talks mentioning any form of speech, paintings representing other paintings, television shows discussing television shows in general or specific television programs and news articles referring to themselves. However, a lengthy discussion of media translation would not add much to what I have already concluded regarding media transformation.

# 1.8 What Is the Conclusion?

How can media be circumscribed within the realm of communication and how can media interrelations be conceptualised? These questions have been at the heart of this article from start to end. The incompatibility of many of the suggested answers in the past is largely caused by the shifting approaches of different scholars and research traditions. Technological features, as well as modal and qualifying aspects, have been emphasised in diverse and often exclusive ways in the efforts to find slim and efficiently operable definitions of the concept of medium. Jürgen E. Müller emphasised this problem several decades ago (1996: 81–83). One alternative has been to lean on conceptions of media that are open-ended and mind triggering but difficult to handle analytically, such as McLuhan's (1994 [1964]). The advantage of working with a set of entangled and complementary concepts—media product, technical medium of display, media modalities and modes and basic and qualified media types—is that such a conglomerate of concepts sets certain parameters at the same time as it incorporates most of the actual comprehensions of mediality. Therefore, I have tried to offer an array of interrelated analytical perspectives that may be used for careful analysis of media interrelations, without strictly compartmentalising media products and their interrelations.

Although I have provided a few detailed accounts of media and their interrelations, my overview requires a more exhaustive elaboration and exemplification. I have offered a *model* for understanding media and intermedial relations, and the point of models is precisely to put aside specific details to make possible a view that is more generally valid. Therefore, I hope that the model may also offer a starting point for methodical analyses in the service of various research questions attaching to mediality at large and more specifically media interrelations.

In a certain sense, the presented model is bottom-up in nature. Instead of beginning with a small selection of established media types and their traits and interrelations, which is the usual scholarly methodology, it is founded on observations of all kinds of media, leading to a broad but firm definition of the concept of media product and an explanation of media modalities that are shared by all media products and hence also media types. Hence, the conceptual framework can properly deal with any individual media product even if it is found outside of established and well-researched areas of communication. The model can also account for the plain but central fact that media products and media types are both similar and different. While there are four media modalities that underlie all conceivable media, each modality encloses several modes that vary among media products and media types. However, these modality modes are not always easily detectable properties; rather, they are found on a scale from physical traits to perception, cognition and interpretation.

The existence of several modality modes belonging to different media modalities means that the concept of media multimodality can be comprehended in various ways. In the broadest sense, a media product or a media type is multimodal if it combines, for instance, solid materiality, temporality, visuality and iconicity; in this respect, all media are definitely multimodal because they must be realised by at least one mode of each modality. In a more restricted sense, media multimodality means that a media product or media type includes several modes of the same modality. In this specific sense, there are material multimodality (multimateriality), spatiotemporal multimodality (multispatiotemporality), sensorial multimodality (multisensoriality) and semiotic multimodality (multisemioticity). Considering this narrower sense of multimodality, all media are at least slightly multimodal because the modality modes are generally either overlapping or mutually dependent in complex ways that I have only hinted at.

However, I have demonstrated in more detail the ways in which the concepts of media products, technical media of display, media modalities, modality modes, multimodality and basic and qualified media types make it possible to delineate properly concepts such as mediality, media borders, intramediality, intermediality, heteromediality and transmediality. Taking the intricacy of the many aspects of mediality into account, intermediality could actually be described as 'media intermultimodality'. As argued, I think it is worth viewing intermediality as a complex set of relations among media that are more or less multimodal in various ways, although I hesitate to use the cumbersome term 'media intermultimodality'. Nevertheless, the concept that it stands for has proven fertile (see Lavender 2014).

Multimodality is vital for mediality, and although an intramedial perspective is necessary for understanding many communicative phenomena, an intermedial perspective is essential for grasping the intricate field of mediality at large—because crossing media borders is the rule rather than the exception in communication. Because of their ubiquity and complexity, I do not think it is possible to circumscribe a specific corpus of multimodal media products or intermedial relations, although I find many of the scholarly systems of intermedial 'works' and 'relations' valuable (cf. the enlightening overview of intermedial positions and issues in Rajewsky 2005). Intermedial relations can only be pinned down to a certain extent and intermedial analysis cannot live without its twin sister, intermedial interpretation.

While intermediality is certainly about specific intermedial relations, it is also, and perhaps primarily, about *studying* all kinds of media with an awareness of media differences and similarities. As stressed by Jørgen Bruhn (2010), what makes intermedial studies important is that they offer insights into the nature of all media, not only a selection of peripheral media. Although the objects of intermedial studies may well be, for instance, media that have been categorised as 'intermedial' or 'multimodal', they may also be what have been taken to be (for the moment) 'normal' media. The outcome of the studies depends less on the objects of investigation than on the way the studies are performed. The ambition of the model that I have here outlined, first presented in an initial form a decade ago (Elleström 2010), is that it continues to offer helpful tools for careful analysis and interpretation of all forms of media interrelations, regardless of the inducements and goals of the investigations.

#### References


———. 2019. From the "Mutual Illumination of the Arts" to "Studies of Intermediality". *International Journal of Semiotics and Visual Rhetoric* 3: 63–74.


———. 2017. Bridging the Gap between Image and Metaphor Through Cross-Modal Iconicity: An Interdisciplinary Model. In *Dimensions of Iconicity*, Iconicity in Language and Literature, 15, ed. Angelika Zirker, Matthias Bauer, Olga Fischer, and Christina Ljungberg, 167–190. Amsterdam: John Benjamins.

———. 2018a. A Medium-Centered Model of Communication. *Semiotica* 224: 269–293.

———. 2018b. Coherence and Truthfulness in Communication: Intracommunicational and Extracommunicational Indexicality. *Semiotica* 225: 423–446.

———. 2018c. Modelling Human Communication: Mediality and Semiotics. In *Meanings & Co.: The Interdisciplinarity of Communication, Semiotics and Multimodality*, ed. Alin Olteanu, Andrew Stables, and Dumitru Borţun, 7–32. Cham: Springer.

———. 2019. *Transmedial Narration: Narratives and Stories in Different Media*. Basingstoke: Palgrave Macmillan.


———. 1971c. Language in Relation to Other Communication Systems. In *Selected Writings II, Word and Language*, 697–708. The Hague: Mouton.

Johnson, Mark. 1987. *The Body in the Mind: The Bodily Basis of Meaning, Imagination, and Reason*. Chicago and London: University of Chicago Press.


———. 2006. *Reading Images: The Grammar of Visual Design*. 2nd ed. London: Routledge.


———. 1987. Going Too Far with the Sister Arts. In *Space, Time, Image, Sign: Essays on Literature and the Visual Arts*, ed. James A.W. Heffernan, 1–10. New York: Peter Lang.

———. 1994. *Picture Theory: Essays on Verbal and Visual Representation*. Chicago: University of Chicago Press.

———. 2005. There are No Visual Media. *Journal of Visual Culture* 4: 257–266.

Moser, Sibylle. 2007a. Iconicity in Multimedia Performance: Laurie Anderson's *White Lily*. In *Insistent Images*, Iconicity in Language and Literature 5, ed. Elżbieta Tabakowska, Christina Ljungberg, and Olga Fischer, 323–345. Amsterdam: John Benjamins.

———. 2007b. Media Modes of Poetic Reception: Reading Lyrics versus Listening to Songs. *Poetics* 35: 277–300.

Müller, Jürgen E. 1996. *Intermedialität. Formen moderner kultureller Kommunikation*. Münster: Nodus Publikationen.

———. 2008a. Perspectives for an Intermedia History of the Social Functions of Television. In *Media Encounters and Media Theories*, ed. Jürgen E. Müller, 201–215. Münster: Nodus.

———. 2008b. Intermedialität und Medienhistoriographie. In *Intermedialität Analog/Digital. Theorien—Methoden—Analysen*, ed. Joachim Paech and Jens Schröter, 31–46. Munich: Wilhelm Fink.

———. 2010. Intermediality Revisited: Some Reflections about Basic Principles of this *axe de pertinence*. In *Media Borders, Multimodality and Intermediality*, ed. Lars Elleström, 237–252. Basingstoke: Palgrave Macmillan.


———. 1958. In *Collected Papers of Charles Sanders Peirce VIII, Reviews, Correspondence, and Bibliography [CP8]*, ed. Arthur W. Burks. Cambridge, MA: Harvard University Press.

Purchase, Helen C. 1999. A Semiotic Definition of Multimedia Communication. *Semiotica* 123: 247–259.

Rajewsky, Irina O. 2002. *Intermedialität*. Tübingen: A. Francke.

———. 2005. Intermediality, Intertextuality, and Remediation: A Literary Perspective On Intermediality. *Intermédialités* 6: 43–64.

———. 2008. Intermedialität und *remediation*: Überlegungen zu einigen Problemfeldern der jüngeren Intermedialitätsforschung. In *Intermedialität Analog/Digital: Theorien—Methoden—Analysen*, ed. Joachim Paech and Jens Schröter, 47–60. Munich: Wilhelm Fink.

———. 2010. Border Talks: The Problematic Status of Media Borders in the Current Debate about Intermediality. In *Media Borders, Multimodality and Intermediality*, ed. Lars Elleström, 51–68. Basingstoke: Palgrave Macmillan.


Schramm, Wilbur. 1955. Information Theory and Mass Communication. *Journalism Quarterly* 32: 131–146.

———. 1971. The Nature of Communication between Humans. In *The Process and Effects of Mass Communication*, eds. Wilbur Schramm and Donald F. Roberts, Rev. ed., 3–53. Urbana: University of Illinois Press.


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# Media Integration

# A Recalibration of Theatre's Hypermediality

# *Mark Crossley*

M. Crossley, De Montfort University, Leicester, UK; e-mail: mcrossley@dmu.ac.uk

L. Elleström (ed.), *Beyond Media Borders, Volume 1*, https://doi.org/10.1007/978-3-030-49679-1\_2. © The Author(s) 2021

# 2.1 Introduction

As my wife and I take a seat in the lobby of The Lansdowne, we are asked if we would like a tour of the building in case we are interested in renting one of their "stylish one, two or three bedroomed apartments", close to Birmingham city centre and with a range of in-house communal amenities including the Fitness Studio and the Cultural Mixer with a home cinema and pool table (The Lansdowne 2019). We say politely that "we are fine thanks, we already have a house", and that we are "just here for the performance". The member of The Lansdowne staff smiles politely and walks away.

Having been to see a range of theatre pieces in recent years that deftly and mischievously fashion artifices of commercial behaviour as an entrée into performance (dreamthinkspeak's *Absent* 2015 springs to mind, in which the façade of the hotel enterprise is utterly convincing until you literally see behind the doors), I am always intrigued by events such as that transpiring at The Lansdowne: Who and what is "performing" in this 18-storey residential tower? Specifically, we have come to see Stan's Cafe's *It's Your Film*, a piece originally created over twenty years ago in 1998. I was fortunate enough to see a performance in that year and was fascinated to see it again all this time later, particularly with its added spectatorial dimension for 2019. To quote the marketing blurb from the theatre company:

This unique four and a half minute show is performed to an audience of one. Viewers are seated in a passport photo style booth, with the show unfolding through a letterbox-sized aperture*—it looks like a film but is performed just for you!* A story of lost love and detection set in Birmingham, the show takes its inspiration from film noir and uses special slide and video projections, and even a Victorian theatre trick called Pepper's Ghost, which allows live actors to magically dissolve or be layered over each other. For the first time ever, audience members will be invited to watch the performance a second time from behind the scenes to see how the magic is done. (From a promotional email received 24th April 2019)

My previous knowledge of the piece, as a self-contained performance, suppresses my incertitude as to the boundaries of the event, but nevertheless the scale of the apartment block, its metropolitan bravura and the efficient commercial disposition of its employee inform and frame my expectation of this urban noir love story. Before we even begin to experience the piece itself, the qualified media of architecture and what might floridly be referred to as the performance of real estate surround us at a presemiotic and semiotic level: the polished concrete, the airy lobby and the industrial signage, signifiers of modernity, youth, urbanity and wealth. Elleström's term, "media product", which he defines as "the intermediate stage that enables the transfer of cognitive import from a producer's to a perceiver's mind" (2020: 13), has pertinence in this context as the media products of property lettings are seemingly at work in the prelude to, or in the all-consuming service of, theatre. The request from The Lansdowne developers for Stan's Cafe to re-stage the show as part of their publicity and sales drive is resonant of what Richard Schechner refers to as proto-performance (2002), a precedent or impetus to performance, in this case an event prompted both by a specific request and by its own previous incarnation in 1998 and subsequent iterations. We only experience this proto-performativity briefly and superficially in the seemingly innocuous collision of newly developed property and re-fashioned theatre, but in that momentary juncture, the playful ambiguity of theatre's encroachment into its wider environment and the uncertainty this triggers within us over its parameters are unmistakable.
The door to the booth is situated to our right, the liminal divide, but where we sit is more than a theatrical foyer: we are embedded within a hybrid of qualified media, with architecture and real estate co-opted (perhaps unknowingly by the developers) into the theatrical signification of *It's Your Film* (a city that is bigger than us, hidden from us) and theatre co-opted more consciously and strategically into the qualified medium of real estate performance, or what I later refer to as the architecture of commerce.

# 2.2 Recalibration

This capacity of theatre to shapeshift and ingest other media is captured in Chiel Kattenbelt's suggestion that "when two or more different art forms come together a process of theatricalization occurs" (2008: 20). This chapter outlines and recalibrates theatre's position as a hypermedium, in other words its capacity to envelop a seemingly endless profusion of modes, modalities and media (basic, qualified and technical; see Elleström 2020) within its ambit and reframe them as theatrical performance. This quality of theatre offers both opportunities and tensions for contemporary artists, audiences/participants and those engaged in the study of intermediality as it complicates and multiplies unpredictability into the processes of mediation and representation. This hospitality of theatre with its open invitation to other qualified and technical media creates a dynamic yet crowded environment, alongside which the proliferation of digital and post-digital options creates ever-greater challenges to authorial agency historically afforded to writers, directors and actors as well as increasingly complex challenges to audiences/participants in understanding the layers of signification presented to them or experienced by them.

The premise that theatre has a particular capability to assimilate all other media was notably elucidated in Chapple and Kattenbelt's influential text *Intermediality in Theatre and Performance* within which they state that "theatre has become a hypermedium and home to all" (2006: 24). The argument has long been rehearsed throughout the late twentieth and early twenty-first centuries, in both multimedial and intermedial theories, that specifically theatre has the capacity to embrace other artistic forms without fundamentally altering their structure. Peter M. Boenisch, for example, described theatre as a "fully transparent medium" (2006: 112) with the ability to leave its incorporated media free of "any palpable fingerprints of its mediatisation" (112) unlike television or film. He exemplifies this by stating that "a video might be projected as part of a theatre performance, which is then recorded for TV; yet the video on stage is still a video, whereas on the television it will be the broadcast of the showing of a video" (112). Kattenbelt developed this point further when he proposed that film, television, video and DVD (when they appear in a theatre setting) become staged and "in this capacity, not only cinematic, televisual, videographic or digital, but at the same time theatrical" (2008: 22–23). However, my contention in this chapter is that these recent perspectives are limited in the rigour of their analysis and greater interrogation is needed of the complex reality of what is happening in contemporary theatre, both in terms of what theatre-makers are seeking to achieve and in how audience/participants are interpreting the plethora of media products. My perspective builds upon Elleström's own recurring dissatisfaction with the generalised definition and assumptions of theatre as a hypermedium. Having originally outlined this objection in 2010, Elleström reiterates the point in the opening chapter of the present publication:

Therefore, theatre could be described as a profoundly multimodal qualified medium that is susceptible to intermedial analysis. It makes sense to say that it not only integrates several basic media, but also several qualified media; one may recognise parts of a theatre performance as, say, music, architecture, gesture, dance and speech. However, it might be an overstatement that "theatre is a hypermedium that incorporates all arts and media" (Chapple and Kattenbelt 2006: 20; cf. Kattenbelt 2008: 32) because once the different media types are integrated, they become something else: the qualified medium of theatre. (Elleström 2020: 77)

In pursuit of this closer interrogation of theatre as a hypermedium, I will refer to a range of contemporary productions but with particular attention to three performances that I have experienced in recent years: *The Ferryman* (2017) by Jez Butterworth, which opened at The Royal Court Theatre in London before transferring to the West End, where I witnessed it at The Gielgud Theatre in 2018; *In Many Hands* by the Brussels-based experimental artist Kate McIntosh, performed at festivals across the world and experienced by myself at Utrecht's Spring Festival in 2018; and finally, as already discussed in part, *It's Your Film* by Birmingham-based Stan's Cafe Theatre Company. I have selected these productions as they span a diverse range of theatrical styles, from the more traditional naturalism of *The Ferryman*, viewed en masse in a proscenium arch theatre, and the hands-on participatory experience of *In Many Hands*, in which the audience are invited to touch and share a plethora of objects, through to the fleeting solo cine-theatrical experience of *It's Your Film*. These brief descriptions alone are indicative of the breadth of contemporary theatre practice, not just in terms of genre (what Elleström refers to as submedia in 2020: 78) but also in terms of materiality, spatiotemporality, sensoriality, diverse semiotic signs as well as contextual and operational aspects that draw from a profusion of qualified media, not merely conventional theatre.

# 2.3 Hypermedium and Hypermediacy

Before considering these productions in more detail, it is important to establish the most current debates regarding theatre and hypermediality. Recently, in *Intermedial Theatre: Principles and Practice* (Crossley 2019), I began to reconsider and modify the conception of theatre as a hypermedium, citing Claudia Georgi's more nuanced articulation. Her argument, which at times uses Elleström's modal theory, has resonances of Kattenbelt and Boenisch, as she writes that theatre is notable for "its ability to integrate other media without affecting their respective materiality and mediality" (2014: 46). Georgi's theoretical approach becomes more granular, however, as she distinguishes certain aspects of theatre's mutability, contending that any sign can be incorporated within theatre but, as this is a trait to be found in other "plurimedial media" such as film, theatrical distinctiveness is actually evidenced in the material mobility:

What is unique to theatre is thus not its semiotic mobility as such, but what could in analogy be termed its 'medial mobility,' i.e. the ability to leave the materiality of the incorporated media intact while their respective signs acquire an additional semiotic quality as theatrical signs. (Georgi 2014: 47)

In response to this, I reflected on how greater clarity may be gleaned beyond the term "additional", arguing that "[t]he additional layer that Georgi refers to may not be easily distinguished as an addition when the original semiotic signification may be denuded to the point where we are actually perceiving a hybrid between the original and the theatricalized" (Crossley 2019: 19). My contention was that the interrelationship between materiality and other presemiotic modalities was so intricate that any repositioning of, or interaction with, material objects on stage (book, statue, film, text, fabric, etc.) disrupts and reframes these elements to such a degree that further investigation is required to articulate the nature and complexity of these cognitive imports (the input and output of communication from producer's mind to perceiver's mind as defined in Elleström 2020), moving beyond a binary distinction of pre-theatrical and "additional" theatrical signification.

In addition to the specific term hypermedium, hypermediacy as a related but discrete concept has been scrutinised, stratified and defined in recent years, initially within Bolter and Grusin's influential *Remediation: Understanding New Media* (1999) in which they delineated two significant types of mediation, referred to as transparent immediacy and hypermediacy. Andy Lavender summates the distinction: "In *Remediation* Jay Bolter and Richard Grusin describe processes of *immediacy*, that efface the appearance of the artwork by giving the spectator an apparently direct access to its matter; and processes of *hypermediacy*, through which you see the medial arrangement that presents the artwork for your engagement (1999: 33–34)" (2019: 54). In theatre, these two concepts can easily find themselves in close proximity and partnership, and in this context of hypermediality, I propose certain gradations as to their distinctiveness, suggesting they are simultaneous conditions in correspondence during all performance. Immediacy can be defined in a variety of forms, including direct communication from producer to perceiver (e.g. stand-up comedy), but it is also in evidence in a production such as *The Ferryman*. The performance follows a naturalistic narrative, focusing on the impact of "The Troubles" in Northern Ireland on one family in County Armagh during the 1980s. Our spectatorship, within a classic West End theatre auditorium, is, perhaps counter-intuitively, an invitation to frame our experience as immediate and unmediated, emotional and direct as we invest in the grief and the longing of the protagonists. Despite the overtly theatrical context in which we sit, the social praxis informing this event directs us towards a persistent, yet agreeable, naivety.
Whilst our experience is significantly enhanced by our extracommunicational knowledge of the sociopolitical events portrayed in the play and our cultural understanding of what is expected of us in the stalls with programme in hand and so forth, we are persuaded to uncouple the two domains and immerse ourselves in the autonomy of the virtual sphere created by the semiosis of intracommunicational objects on stage, as Elleström proposes: "This is because we may perceive them, in part or in whole, as new *gestalts* that disrupt the connection to the extracommunicational domain" (2020: 30). If we wished to disengage for a moment and look around us, we would quickly recognise the medial arrangements of the artwork, our fellow spectators, the gilded proscenium and the specific modes of naturalistic theatre in such a space (the artifice of the rustic mise en scène, the stairs to nowhere upstage left, etc.), but we decide to resist the hypermediacy of the context and remain committed to the virtual, immediate domain.

# 2.4 Temporality and Sensoriality

Hypermediacy also manifests itself within the construction and participation of *In Many Hands*. At the beginning of the performance, the audience of only forty-five people are invited to wash their hands before entering into a large auditorium within which there are three long tables, arranged in a triangle, with chairs down one side of each of them. Aside from assistants directing where to sit, there is no one overtly performing for us. We are intrinsic to the performance as participants but also more radically as agents, activating the event and controlling the temporality. Such participatory experiences echo Lavender's contention in the present publication, as he notes that "the actor/performer takes on a more protean form in this environment" (2020: 120). Our own centrality is evident in this Fierce Festival description of the performance:

This project steps away from the stage—instead bringing the audience into a series of aesthetic sensory situations, inviting them to experiment with materials and encounter physical phenomena themselves. *In Many Hands* is part laboratory, part expedition, part meditation—as it unfolds, visitors take their time to engage and explore as they wish, following their noses and curiosities. (Fierce Festival 2019)

As the performance progresses, we are passed, one by one, objects of interest, all the way down one table, then the other before reaching the final participant on the last table—sponges, shells, mud, rocks, seeds, bird skulls and dozens more, a multitude of textures, temperatures and resistances, eliciting degrees of intrigue, humour and squeamishness. It feels part interactive museum display and part ceremony. The material modality of these objects has not been altered to any significant extent; their mobility from original context into performance is seamless, yet it is the temporality of the event which is so specific and theatrical. In contemporary performance, I would argue that it is our control and manipulation of time that is often the most significant factor in creating the "additional" layer, or perhaps what I might suggest is a hybrid signification. In *The Ferryman*, the audience are relatively passive in temporal terms (putting aside the virtual sphere of temporality), as we sit dutifully through the duration of the play. Essentially, it has a fixed temporality, akin to film or recorded television. In one sense, *In Many Hands* also has an overarching fixed temporality as the show has a total duration of approximately ninety minutes. However, within that timescale the participants construct the temporality of both personal and collective moments quite significantly. As each object approaches, we decide the point at which we accept it from the audience member to our right and when we shall pass it on to the person to our left. The temporality of the sensorial engagement with each object is delineated predominantly by every individual with adjustments made in response to prompts (impatience, anticipation) from their "collaborators" left and right. 
The collective autonomy over temporality is heightened in the latter stages of the performance as the scale of objects increases to include swathes of damp fishing net and finally a large gauze carried aloft across the heads of the whole crowd. The intensity of sense data and subsequent sensations is closely intertwined with the ephemeral temporality of the touching. The objects which have a static temporality in and of themselves are animated by our examination of them and the brevity of our connection to them. This temporal agency that theatre is able to exploit over media is commented upon by the performance maker Jo Scott:

I am suggesting that performance is always in a process of undoing the temporalities of its media […] because of the intersection between the 'timing' of the performer's action and the temporality of the media with which they intersect. Intermedial activation in live performance wrestles the fixed media from their moorings and sets them loose, so we *feel* their happening differently and the temporality of the piece itself also shifts. (Scott 2019: 112)

As the participant performers in this event, we negotiate the different temporalities and sensorialities for ourselves and between ourselves, executing subtle split-second judgements over our experience. In this regard, the sensations of the finale are acutely individual as we feel the soft fabric on our fingertips, yet also undeniably communal as we are all shadowed by, and share the delicate weight of, the expanse of cloth. It may be argued that temporal and sensorial control are in evidence in other qualified media, with parallels often being drawn between gaming and theatre. It could also be suggested that in our encounters with certain art objects, fine art and sculpture, for example, we have agency over our temporal engagement as we decide to move towards, away from or around such objects. However, the particularity of theatre is that the temporalities and sensorialities of these other qualified media can themselves be subsumed within theatre and theatre-makers can then consciously seek to engineer or encourage such participations within the theatrical event.

From a hypermedial perspective, this event combines a number of qualified media: certainly theatre in terms of a performance within an auditorium but also museum exhibition and ceremony, facilitated by Kate McIntosh and her collaborators, but orchestrated by the audience. Modes of theatre such as the atmospheric lighting are intertwined with crossmodal modes; the objects are both exhibits and props. To advance Georgi's notion of an additional layer, I would emphasise the dialogic nature of this signification as the cognitive import flexes between the extracommunicative knowledge we have of these inert objects but then seeks to place this in relation to their presence in the theatrical space. McIntosh has spoken of the extensive and meticulous selection process for each of the objects (2017), but as they pass through our hands, we have only a transitory moment to wrestle with the juxtaposition of modes and what these may represent. As Elleström reminds us, "compared to the potentially extensive act of production, the act of perception is brief and quickly channelled into interpretation, which of course occurs in the perceiver's mind" (2020: 18). Immediacy and hypermediacy are both present in the event: we have immediate access to the objects and one another, and they are materially immediate to us, yet simultaneously the construction of the artwork is confidently in evidence, the assistants silently mediating our experience through the passing of objects, the presence of each audience member positioned closely in relation to one another, necessarily in touching distance.

# 2.5 Signification and Participation

The distinction between art and real-world experience is increasingly blurred in many contemporary performance genres from performance art to gaming and social media–based practice. It is also at work in site-based theatre that intentionally draws upon built and social environments such as dreamthinkspeak's *Absent* or less overtly or consciously such as Stan's Cafe's *It's Your Film*. Immersive theatre, which has seen a global exponential rise in recent years, may initially seem disconnected from the real world as we are enveloped within the complex mise en scène of a performance from Punchdrunk, WildWorks, You Me Bum Bum Train and the like. However, the centrality of our own material body as close observers or participants continues to bombard us with real-world sense data (breathlessness as we run, the heat of our face behind a mask, the scents carefully embedded in the scenography), and hence this creates a visceral dialogue between the virtual and actual spheres. The frisson is in the hinterland between the two, as a Punchdrunk performer leads you by the hand for a confidential chat in a caravan in *The Drowned Man* (2013) or kisses you tenderly goodnight in *Sleep No More* (2003–present). These encounters, which I have experienced first-hand in London and New York, respectively, are too immediate to consider as discrete or additional layers of signification; they are more an instant perception of both/and (to loosely appropriate Robin Nelson's term for intermediality, 2010), as it is both distinctly fictional *and* real. It is hypermediated theatrical signification.

Patrice Pavis was alert to this movement in diffuse performance framing towards the end of the twentieth century. He wrote that many contemporary artists sought to create "the impression that there is no division between art and life, contemporary art has often endeavoured to invent forms in which the frame is eliminated" (Pavis 1998: 155–156). Such practices bring our attention to the impact of "noise", as Elleström refers to it: "The basic phenomenon of disruptions that occur on the way from the producer's mind to the perceiver's" (2020: 23). Such a phenomenon is of major significance within the capacious hypermedium of theatre. When a single media product (painting, sculpture, poem, solo dance, etc.) is presented within its own qualified medium of fine art and so on, there is a more predictable bandwidth in the potential cognitive import from producer to perceiver's mind. Whilst the perceiver may draw upon a range of extracommunicational domains to inform the intracommunicational domain, there is a greater degree of direct and bounded modes, and therefore signifiers, that can be employed and thus interpreted. Elleström refers to this modal stability in relation to art: "Whereas painting is a qualified medium because expected aesthetic qualities are to be presented within certain social and artistic frames that are bound to undergo changes, its expected modal traits are relatively stable" (2020: 58). However, as theatre is able to host a plethora of qualified and technical media within which are countless modal options, the increase in disruption and "promiscuous" cognitive import is inevitable. Compounding this promiscuity are what may be described as the intentional acts of "noise disruption" embedded within work by the artists themselves, in terms of collision, contradiction and juxtaposition. 
This premise finds correspondence with Simonson's articulation of "intermedial gaps" in her contribution to the present publication, "moments that withhold as much as they communicate, and that communicate withholding" in which there is the fertile potential for "acts of concealment" (2020: 4–5). Such intentions can be seen in, amongst others, Montage, Fluxus and Assemblage practices which gathered pace within the twentieth century and were often appropriated into theatre experimentations. These practices problematise the premise of intentionality in relation to the act of production and transfer outlined in Elleström's Fig. 1.1 (2020: 16). The media product may be clearly framed as performance, but specific modes of transfer (text, image, audio recordings and so on) may have intentionally been found and selected at random, rejecting established norms of authorial control. In such instances, the act of production is proportionally at a greater divide, in terms of predictability, from the act of perception compared to media products within other qualified media. This disruption of cognitive import is amplified when the capacity for participant agency is factored in, as we make unpredictable interventions into the processes of mediation.

The movements and timings of all forty-five participants during *In Many Hands* make infinite adjustments to the cognitive import for each and every one of us within the room. The tactile nature of the event and the plethora of objects we touch multiply the extracommunicational domains at play as each person confronts every object with intimacy and a degree of vulnerability. Kate McIntosh, in reference to the performance, notes how "people's threshold of what is challenging is really different from one person to another" (2017); a fragile bird's skull may be anatomically fascinating to one person whilst an unnerving glimmer of mortality to another. In such instances, the "overlaps" between producer's and perceiver's minds are unpredictable at best. This is certainly not to say that such uncertainty is problematic or a weakness. It is one of the great strengths of theatre, particularly in the experimentations of recent decades, that coincidence, happenstance and participant agency are central to the construction of cognitive import and representation. It may be argued that one of the fundamental and particular features of the hypermedium of theatre is the facility for collaboration and participant authorship through real-time transformation of events.

# 2.6 Angles of Mediation and Exclusivity

This particular capacity of theatre to proliferate new angles of mediation and hence profusions of representations can also be seen in *It's Your Film* as we are taken behind the scenes for the second viewing of the piece. In the original version, our viewpoint is intentionally restricted as we look ahead towards a small aperture in which images are projected into our eyeline, never quite knowing if these are pre-recorded or live, vignettes of furtive assignations no more than a few seconds each. For the following version, we are taken through a different door and sat alone on a chair to the side of the backstage area. What unfolds before us is a choreography of three performers, constructing each image in the moment, projected through the Pepper's Ghost device into the booth for the next audience member. Such a performance obfuscates the line between extracommunicational and intracommunicational domains as the second experience is intensely informed by the preceding experience in the booth. The rich, virtual domain of noir intrigue from the first performance is still imprinted on the memory as you are witness to its deconstruction a few minutes later. The intracommunicational experience of the first bleeds into the latter's intracommunicational world as we partly project ourselves back into the original experience in order to take pleasure in the conspiracy of the backstage reveal.

This playfulness can also be seen in more mainstream work such as *Network* (2018), starring Bryan Cranston at The National Theatre in London, for which during certain performances you could book a table on stage for a fine dining *Foodwork* experience. Whilst these diners may not also have seen the play from the auditorium, as per *It's Your Film*, it is setting a similar mediating puzzle as participants enter into a dialogic mode of spectatorship, partly immersed within their onstage mise en scène and partly projecting themselves out into the perspective of the auditorium to imagine the viewpoint usually afforded to the audience. The pleasure, according to several audience members I spoke to who participated in the *Foodwork* events, was the thrill of knowing what the main audience could and could not see. Pivotal to this thrill is their contextually qualified understanding of proscenium arch productions such as *The Ferryman*.

In both of the examples at The National Theatre and The Lansdowne, theatre has created and afforded angles of exclusivity which are a specific quality distinct from other qualified media. An art exhibition may have a private viewing, but in the end all visitors to the gallery see the same art object from a similar angle. A film premiere may have an illustrious and select guest list, but the media product is the same in any cinema once on screen, whilst a great novel is the same set of symbolic signs in my hand as it is for any literary critic. The context of perception is different in each of these latter cases but the modalities and modes are the same. However, in theatre, the spatial modality can be fundamentally adjusted within any given performance event, often affording simultaneously different spectatorial dimensions as in *Network* and *It's Your Film*. It could be argued that gaming has a similar capacity as multiple gamers can be perceiving a virtual environment from multiple angles online but these are virtual recalibrations of space whereas theatre can accommodate such virtual spaces materially intact as well as real-world physical displacements. Alongside the material mobility already alluded to by Georgi and others, and the temporal and sensorial mobility that I have foregrounded, it is clear that spatial mobility is also of considerable significance when defining the properties and potency of the theatrical hypermedium.

# 2.7 Architecture of Commerce

Having considered, with greater nuance, the qualities of theatre as a hypermedium, what may be fathomed in regard to theatre's capacity for housing other media and reframing them with hybrid both/and signification? Is there something unique or dominant in this complex interplay of mediation and representation? Before we ascribe theatre such a preeminent position in a hierarchy of qualified mediation, I am reminded again of the experience at The Lansdowne. Earlier in the chapter, I suggested an interdependence between commercial architecture (specifically real estate) and theatre. Whilst theatre can strategically and creatively colonise such space, it can be argued that the dominant qualified medium in such scenarios is the commercial environment itself. In free market neo-liberal economies, it would be naïve to think that there is always a mutual exchange within such practices. At times theatre can subvert or question the values of such economies and the marketisation and commodification of everyday experiences whilst housing itself in the very centre of such commercial institutions. *vHotelling* (2016) by the Australian company Not Yet It's Difficult is a prime example of this, set within the hotels of Gold Coast yet designed to question the façade of hospitality. However, alongside such work that pushes back against corporate influence, there is undoubtedly an argument that, in many instances, the architecture of commerce is a far more voracious and all-consuming qualified medium than theatre. It is immediately evident in the Gielgud Theatre, owned by the Delfont Mackintosh group, which staged *The Ferryman*, as £100 ticket prices are quickly complemented by £10 glasses of champagne sipped within the auditorium. The modes of theatre are often aligned to the modes of commerce. This correlation is overtly celebrated in the Theatre in the Clouds experience at The Shard in London which presents theatre near the top of the tower.
*The Handbook* style magazine writes that:

Once seated, the £95 per person ticket allows for you to indulge in two glasses of Champagne and a brimming selection of Shangri-La canapés; fairly priced considering all that's involved, plus solely travelling up to The Shard's viewing point costs £32 alone. What's more, the audience is made up of a truly intimate number of people, just shy of 20, making it an idyllic date-night experience or something for those seeking a night that subverts the norm of dinner dates and drinks. (The Handbook 2019)

The angles of exclusivity in this instance are predominantly driven by financial rather than directorial decisions, and the same argument could be made of *Network* and other such productions starring marquee names. It must be stressed that these observations of theatre co-opted or central to economic imperatives are not offered, on my part, to make broad political points, but are placed as a reminder of our contemporary economic context. Jane Jacobs, the renowned activist on urban planning, famously stated that "A city cannot be a work of art" (1961: 372), yet works of art are often integral, beyond their capacities to entertain, to the economic functions of "the city" and likewise they are sustained by these functions. The contextually and operationally qualifying aspects of theatre are often indivisible from those of commerce, and if not indivisible, then they are brought into convergence for mutual gain. The "communicative task", to use Elleström's term for the operationally qualifying aspect (2020: 61), is often, in theatre as well as commerce, to reaffirm and re-sell lifestyle aspirations. The architectural spaces of commerce are technical media, be that skyscrapers staging Noel Coward or stately homes staging Shakespeare, but they are also a cornerstone of the qualified medium of commerce which often houses theatre, or certain modes of theatre, as part of its own mise en scène. Moving towards a conclusion, it may seem that I am suggesting that theatre has capitulated to commercial imperatives, but on the contrary my specific argument is that theatre's status as a hypermedium needs framing, as Pavis would remind us, in the contemporary context of obscured divisions between art and, in this case, commercial life. Theatre is adept at negotiating this relationship, sometimes out of financial necessity, sometimes out of creative curiosity, and sometimes both.
If not in partnership with the architecture of commerce, then theatre, as in the case of *vHotelling*, has the armoury to expose the inconsistencies or inequalities of the economic systems in which we reside as it has the capacity to inhabit and subvert the modalities and modes of commerce.

# 2.8 Conclusion

At the beginning of this chapter, it was highlighted in many of the references that theatre was a particularly special form of qualified medium, a hypermedium and an expansive and generous host to all other qualified media whose materiality it deftly embraced intact but reframed with theatrical signification. Using Elleström's modalities of media, my intention in this chapter has been to enhance these statements by Georgi, Kattenbelt and others by establishing the significance of the spatiotemporal and sensorial modalities, alongside the material modality, in realising the hypermedium and to shed greater light on what this specific hybridised theatrical signification may look like and what it may accomplish.

The mobility of materiality is undoubtedly the most obvious of distinctions for theatre in this context as objects, bodies, recordings, screens, social media feeds and so on can all be enfolded within a theatrical event. This material mobility can be extended to the consideration of theatre as a technical medium, a physical host for performance. Unlike other qualified media, theatre can inhabit and shift through endless technical hosts, from traditional West End theatres to Birmingham residential towers, from London skyscrapers to Australian hotel complexes and far beyond. A single theatre performance can commence in one type of space and traverse through multiple other spaces (pavements, bedrooms, churches, etc.), appropriated as the technical media of theatre for the time they are required and then jettisoned back to their original context. Other qualified media are arguably not so flexible, as despite ongoing experimentations, fine art generally requires (or is perceived commercially to require) some form of canvas to realise the basic medium of visual images and literature needs the pages of a book or a digital device to realise the basic medium of written text. It may be initially argued that art, for example, can be galleried in any location but the gallery is not the technical medium of art; it is just one location, contextually agreed by society, to observe the qualified medium of fine art. The act of perception in fine art can be substantially, if not entirely, delineated from the context in which the media products of art are observed. We may profoundly contemplate a masterpiece by Hockney or O'Keeffe in the privacy of our own home with a copy purchased online, but this is not possible with theatre, even when spectating online, as we are always cognisant of the physical context. Location for theatre is the equivalent of canvas or a book; it is intrinsic to the form, enmeshed within the media products of theatre and hence the cognitive import.

Beyond the significance of material mobility, the influence of specific spatiotemporal and sensorial affordances within the hypermedium is likewise not to be underestimated. Through such affordances, new and imaginative angles of mediation are crafted by theatre-makers to reveal the medial arrangement whilst simultaneously saturating you, often inculcating you, in the virtual domain. Theatre, particularly participatory and immersive practice, has the extraordinary capacity to suspend the audience between the extracommunicational and intracommunicational domains as immediate sense data and subsequent sensations converge or conflict with the richness of the virtual domain that we are inhabiting *and* the complex set of discourses that we are referencing from beyond the immediate experience. The acceptance into our hands of the bird's skull in *In Many Hands* compels us to respond to the surface and appearance of the bone, as sense data and affective sensation, whilst simultaneously sensing and perceiving the gravity of the ceremony from the pace of movement around us and our external knowledge of ceremonial practices extrinsic to this event; we navigate both the moment and the context.

Back at The Lansdowne, my wife and I emerge out of the darkened rooms of *It's Your Film*, blinking into the light of a Birmingham street. Whilst the architecture of commerce may be an undeniably persuasive medium, today we did not rent a flat; we saw some theatre.

#### References


Elleström, Lars. 2020. The Modalities of Media II: An Expanded Model for Understanding Intermedial Relations. In *Beyond Media Borders: Intermedial Relations among Multimodal Media, Volume 1*, ed. Lars Elleström, 3–91. Basingstoke: Palgrave Macmillan.


Lavender, Andy. 2020. Multimodal Acting and Performing. In *Beyond Media Borders: Intermedial Relations among Multimodal Media, Volume 1*, ed. Lars Elleström, 113–140. Basingstoke: Palgrave Macmillan.


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# Multimodal Acting and Performing

# *Andy Lavender*


# 3.1 Modes, Modalities and the Actor as a Medium

This chapter considers how the work of the actor or performer might be understood with reference to ideas of multimodality. I should explain straight away that I sometimes use the terms 'actor' and 'performer' interchangeably, below, since I am less concerned here with distinctions that might be drawn between actors who play characters and performers who present something other than a fictionalised figure. 'Actor' and 'performer' for present purposes are individuals who convey an act of performance that we are interested to describe in a systematic way. I should also say immediately that such description is an attempt to transpose Lars Elleström's ideas about media modes and modalities (2020) to a consideration of performing. Of course, the actor is not a 'medium' in the way of the media in which she performs (theatre, film, radio drama, etc.). And yet the actor has a communicative function that can be elucidated according to the principles by which media communicate. There are some conceptual similarities between this project and that of Miriam Vieira in the present publication, which demonstrates how architecture as a medium intersects with considerations of embodiment and perspective (2020). I examine below how we can see the actor as what Elleström describes as a 'technical medium of display' (2020: 33–40) and how this is always a matter that involves the perception of the spectator.

A. Lavender (\*)

Guildhall School of Music & Drama, London, UK e-mail: andy.lavender@gsmd.ac.uk

© The Author(s) 2021

L. Elleström (ed.), *Beyond Media Borders, Volume 1*, https://doi.org/10.1007/978-3-030-49679-1\_3

Elleström makes his case in an essay entitled "The modalities of media: A model for understanding intermedial relations" (Elleström 2010) and revises and expands upon it in the current publication (2020). As Jørgen Bruhn explains in a draft to his article in the present publication, Elleström's approach was novel in taking "seriously the fact that many of the insights of both media theory and interart studies had very clear parallels in the field of multimodal studies" and in elaborating a scheme that combined intermedial and multimodal perspectives within a single system of communications analysis (Bruhn 2020; see also Lotherington 2020: 218, 226). In multimodal approaches, communicative modes are notably diverse and include, for instance, written texts, visual images, diagrams, typography, facial gestures, and nods of the head (see, e.g., Bateman et al. 2017: 16; Djonov and Zhao 2014: 1; Fernandes 2016: 1). A key principle is the *transactional* nature of modes, their work as communicative elements within a process of signification. As Gunther Kress suggests, "*Mode* is a socially shaped and culturally given resource for making meaning" (2014: 60, original emphasis; see also Jewitt 2014: 22). This aligns with scholarly interests elsewhere in media and mediation, and Elleström's work brings a semiotic orientation (concerned with meaning making) to considerations of the technical and aesthetic infrastructures of media, which are themselves pressured by specific historical and industrial situations. As Heather Lotherington observes, "[m]edia and mediation are exceedingly complex in today's communication landscape. A media product or medium exists in historical-social-cultural space as well as physical-sensorial-cognitive space." As she suggests, "Elleström's […] intermediality paradigm offers analytical specificity and categorization that is helpful in understanding contemporary communication" (2020: 222, 229).

In some quarters, 'mode' (as a way of doing something, or the particular form that a phenomenon, condition or activity takes) precedes 'modality', which describes ways in which the activity/phenomenon/object can be seen to express its mode. However, we are with Elleström, for whom modalities (of which there are four) should be ascribed conceptually before modes. The modalities and modes then operate synchronously within a larger interrelated system of communication. Indeed, the communicative aspect (shaped by Kress's 'culturally given') is key. This functional approach emphasises the role of the receiver/perceiver amid mediation and circumscribes the formation and efficacy of media in the first place. As Elleström summarises:

there are four media modalities, four types of basic media modes. For something to acquire the function of a media product, it must be *material* in some way, understood as a physical matter or phenomenon. Such a physical existence must be present in space and/or time for it to exist; it needs to have some sort of *spatiotemporal* extension. It must also be perceptible to at least one of our senses, which is to say that a media product has to be *sensorial*. Finally, it must create meaning through signs; it must be *semiotic*. This adds up to the material, spatiotemporal, sensorial and semiotic modalities. […] no media products or media types can exist unless they have at least one mode of each modality […] the four media modalities form an indispensable skeleton upon which all media products are built. (Elleström 2020: 46)

The skeleton is formed of two distinct kinds of bone (if you will allow the analogy to be continued). Three of the modalities—the material, spatiotemporal and sensorial—are 'presemiotic' (2020: 47), in that they do not originate from cognition but rather describe structuring aspects that will *then affect* cognition. The fourth modality, the semiotic, Elleström describes as "the frame for understanding representation. All media products are semiotic because if the sensory configurations with material, spatiotemporal and sensorial properties do not represent anything, they have no communicative function" (2020: 49). The four modalities, then, are always mutually in play in some way in any process of mediation, and each of the modalities can be described by way of a subset of modes. As Elleström suggests: "All media are multimodal in that they must have at least one mode from each modality" (2020: 53).

The modalities and modes are given in tabular form in the 2010 version of Elleström's essay, under the headings 'Modality', 'What the modality is', and 'The most important modes of the modality'. For example, the 'sensorial modality' is described as "The physical and mental acts of perceiving the interface of the medium through the sense faculties" and its most important modes given as "seeing, hearing, feeling, tasting, smelling" (2010: 36). This schematic organisation allows for a critical perspective and a procedure—one based on recognising specific features and distinctions and accounting for media operations as, continuously, a combination of modalities and modes (see Elleström 2010: 24).

The model is geared towards elucidation and concerns communication that is itself, Elleström observes, always about conveying "cognitive import" (2020: 12–13; the revised version of the essay provides greater emphasis on this aspect). The four modalities combine within media processes that are at the heart of signification, understood here in terms of 'representation' in its semiotic sense—that is, as a means to transact perception on the part of the perceiver, in a domain of cognition. The fulcrum is the communicative act. A medium in this sphere of analysis is only effectual insofar as its operation permits what Elleström calls a "media product", defined as "the intermediate stage that enables the transfer of cognitive import from a producer's to a perceiver's mind" (2020: 13). This is to an extent metaphorical for, as Elleström goes on to say, the model envisions not solely a single line of transfer from one individual to another but the possibility of multiple producing elements that converge into media products conveyed to multiple perceivers. In any case:

A media product is a single physical entity or phenomenon that enables inter-human communication […] [it] may be realised by either non-bodily or bodily matter (including matter emanating directly from a body), or a combination of the two. […] other bodies, such as the bodies of actors, may be used as media products. (2020: 8, 13, 14)

Let us remain with the bodies of actors. As Elleström suggests, a media product "requires some sort of perceptible physical phenomenon to come into existence" (2020: 33)—and such phenomena he calls (in a slightly self-deprecating way, noting the "cumbersome" nature of the term) "technical media of display of sensory configurations" (2020: 34). The actor is one such, for she fulfils a central criterion of the technical medium of display—she "*mediates* sensory configurations in the context of communication; [she] realises and displays the entities that we construe as media products" (2020: 34).

In what way can the actor be described as a medium of communication? We might start by considering the actor as character, probably the dominant paradigm of Western representational performance, whether in theatre, film, television or radio drama, or other kinds of performance such as appearance in adverts or short-form online videos. The communicating aspects are manifold, which means that the actor can be thought of as a technical medium comprised of multiple internal technical media of communication: the actor's voice, conveying signifying information both through sonic inflections and verbal utterance; the actor's gesture, communicating through learned socio-cultural codes and aesthetic codes; the actor's costume, but also the manner in which costume items are worn (a jaunty angle to a hat, an overweening deportment of a uniform, an embarrassed donning of a tie by a teenager); the rhythm of movement; the signifying aspect of entrances, exits and other positional manoeuvres; the nature of interaction with other actors and scenic objects; and any significant liaison—by eye contact, gesture or direct address—with the spectator. Elleström, following Peircean semiotics, describes such elements variously as iconic, indexical and symbolic (2020). These prospective signifying aspects can variously be conceived as media products if and where they effect perception on the part of the spectator. The actor operates within the structures and processes of the medium in which she performs—this could be film, theatre, television and so forth (for there are others) which will itself be comprised of media products that depend upon the organisation provided by, for example, script, direction, lighting, sound design and other features that help construct the medium.
The actor is thereby herself a complex technical medium, disporting media products by way of (for instance) gesture, utterance and movement, within a larger media product that may form what Elleström calls "one perceptual gestalt" (2020: 76). There may be differences to be drawn between the appearance of an actor in a live situation to a co-present audience; an actor in a live situation appearing to a remote audience; and an actor in a situation that was recorded previously, whether through analogue or digital technologies, accessed at a different time by one or more audience members. In terms of Elleström's model, we are in the realms of distinct qualified media types that operate through slightly different arrangements of technical media of display. For present purposes, however, I suggest that we consider 'actor/performer' as a technical medium comprised of multiple internal technical media of communication and move from this to a modal description of the actor's work and function.

How do we make sense of the actor's performance as an isolable feature of mediation within this larger mediation machine? Elleström addresses Yeats's question: "How can we know the dancer from the dance?"

On one hand, the dancer and the dance are inseparable in the sense that they are the same material entity occupying physical space and time. On the other hand, they are two different things. Whereas the dancer is a body acting as a technical medium of display, the dance is a function of the material body: a media product. (2020: 36)

This also puts me in mind of ripples from the throw of a stone into a pond or the reflections of one mirror in another—a set of interconnected iterations that *layer* perception as an act, not only as a single instance but as a composite in which we can find signification in the part as well as the whole. This gives us a clue as to how performance might be 'read'—we can find its agency and affect in something as simple as a change of eyeline or the travel of a finger on the part of an actor; as multiple as the interactions of a group of actors within a sustained scene of physical interchange; and as multi-layered as the transactions achieved when (for example) a film captures a relation between actors within a *mise en scène* that draws attention to its signifying devices while narration 'lands' for the spectator in a moment of clarity, exuberance or awful realisation. This is perhaps to say that the actor's performance in itself will in some manner always allow only a *partial* reading; for it cannot be self-contained but will necessarily require contextual coordination with the film, staging or other set of circumstances in which the performance is presented. No performance is innocent of the medium in which it appears. That said, we must start somewhere, and for the purposes of this chapter, the actor's body is as good a place as any.

# 3.2 On Analysing Acts of Performance (in a Multimodal Situation)

My proposition here is that we can transpose Elleström's idea of modalities and modes from a consideration of specific media to a consideration of specific acts of performance. I will come back to how we might do this, but first I should address the obvious question: Why would we want to do it in the first place?

One answer is that it would be helpful to have a system to analyse performance that can accommodate the range of performances that now inhabit the contemporary cultural scene. This includes character-based acting (whether in film, television, online performances, radio or theatre); acting that slides between presenting a character and presenting the self; the appearance of performers in a wider array of events, situations and installations; the performance of singers and other artists in diverse kinds of real-time and recorded presentation; the appearance of individuals, who may or may not be 'celebrities', in reality-based entertainments and scenarios; the actions or interactions of members of the public (for want of a better phrase—we might also say 'spectators' or 'participants'), who become active in events in more or less significant ways; and the appearance of politicians and other figures in the public sphere that can be assessed through the critical instruments of performance studies.

You might reasonably suggest that this is too disparate a list, gathering dissimilar kinds of performance, and I wouldn't necessarily disagree except that a second answer to the question ('why consider modes of performance?') is that performance takes place in this highly interconnected and pluralised environment, where we are often witness to (and sometimes party to) different kinds of performance every day and where a cultural slippage across media and types of performance is now routine rather than irregular. It is no accident that Elleström developed his model expressly for an intermedial situation, with distinct media operating in interdependence (Elleström 2010: 12). The field of acting and performance lies in a contemporary performance scene that is heavily mediated and profoundly plural—not just in the sorts of performance that we see but in its varying registers and slippage across forms. Indeed, there has been a growing consensus in both communications studies and performance studies concerning the mixed and plural nature of the cultural and communicative sphere. In *Multimodality*, for instance, Bateman et al. describe an increasingly interdisciplinary communicative environment, in which "we no longer have separate media; we have instead media that are capable of doing the jobs of many" (2017: 14). The instances given of this include an iPad showing newspaper pages or a website playing music—and while these may be more prevalent in some parts of the world than others, the acceleration of cross-medial communication is indisputable. Performance theorist Shannon Jackson makes a related point from a different disciplinary location, in her discussion of "the hypercontextuality of performance", which she sees as both a condition (performance operates across genealogical fields) and a challenge (performance is subject to dispersal and has an "intensely contingent status") (Jackson 2004: 6).
The figure of the actor/performer takes on a more protean form in this environment (see, for instance, Dunbar and Harrop 2018: 13). In which case, it becomes useful to consider whether there is a common way of calibrating performance amid such varied and disparate instances.

There are many extant systems for analysing performance. One way of taking a perspective on this new project of 'modal' analysis is to consider the viability of established systems for current performance situations. This is not to argue that all the old models must be done away with—far from it. Rather, it is a check upon the aesthetic- and context-specific aspects of acting/performing systems, which tend to be more adequate for some kinds of work than for others. Every system of performance and performance analysis is located culturally and has its own specific history. I turn first, briefly, to the most celebrated of all (at least, in a modern Western context): Konstantin Stanislavski's system of actor training, developed over the first part of the twentieth century and consolidated broadly between the 1910s and the 1930s. Stanislavski's ideas had a troubled journey of transmission into print, subject to delays and editorial partialities. His influential volume *An Actor Prepares* (first published in English in 1936), for example, was described by Jean Benedetti as "a pale shadow" of the larger intended oeuvre, *An Actor's Work* (Benedetti 1999 [1988]: 366). The approach and practices that Stanislavski recommended likewise enjoyed a mixed journey into different spheres of influence. In *An Acrobat of the Heart*, Stephen Wangh provides a summary of a widely argued historical trajectory, tracing selective strands of Stanislavski's 'system' through their migration to other territories. Stanislavski, Wangh suggests,

searched for a method that would depend on *inner*, psychological practices. […] [He] developed the sense-memory and 'affective memory' exercises. It was these 'internal' techniques that Stanislavski's students Richard Boleslavsky and Maria Ouspenskaya brought from Russia to New York in 1923. And it was this work that they taught at their American Laboratory Theater where Harold Clurman, Stella Adler, and Lee Strasberg came to study. (Wangh 2000: xxxiii)

As students of theatre know, this set of practices provided the structure and impetus for what would become the American Method system, geared around the inner exploration on the part of the actor of the individual and psychological dynamics of the character, drawing extensively on personal experience and individual intuition. Meanwhile, as Wangh describes, "Stanislavski […] realized that by concentrating so completely on the actor's mind, he had ignored the actor's body. In his later years Stanislavski developed a system of what he called 'physical actions'" (xxxiv). The diffusion of influence is complicated by way of the inflections made by acolytes, including Michael Chekhov's work during the 1930s. Chekhov (nephew of playwright Anton Chekhov) had been a member of Stanislavski's First Studio and had worked on Stanislavski's more physically oriented approach. He disseminated his own adaptation and interpretation of this, focusing on the 'psychological gesture', in his work in Germany, Lithuania, England and the US. 'Media products' associated with Stanislavski's approach, then, are not uniform.

Stanislavski's teaching had a fraught history in its own right, both in terms of its codification through published writings—which took time and required extensive editing that Stanislavski himself was only partly involved in—and the differing set of understandings that ensued, as the work gained traction in different countries and through the work of different disciples. I do not mean to unpick that legacy and its differences here (for useful discussion, see Carnicke 2009 and Pitches 2006). The larger point is that however you nuance it—whether as predominantly concerned with internal psychology, the actor's interior state, or external gesture and the physical work of the performer—Stanislavski's system applies expressly to the presentation of character that typically derives from work in relation to a playtext. And that simply doesn't apply to the context, artistic type, mode of production, or technical requirement of many instances of performance today. Stanislavski's system remains useful for the development of performance in narrative-based drama in which characters interact within a broadly realist aesthetic. It is inadequate as a means by which to prepare or explain performance in, for instance, Heiner Müller's *Hamletmachine* (1977, first presented at the Théâtre Gérard Philipe, Paris, France, in 1979) or Punchdrunk's *Sleep No More*, the company's site-specific version of *Macbeth* (presented at the McKittrick Hotel in New York from March 2011, and still running as I write)—let alone a repertoire of performances in the wider field, including gallery-based events, virtual reality projects and postdramatic dance-based pieces. Stanislavski's methods, with their apparatus of 'magic ifs', intentions, circles of attention, and units and objectives, are applicable in some of these instances, but not in all and certainly not uniformly.

The same is true of other systems for producing and analysing performance. Let us consider Michael Kirby's celebrated spectrum (1972) describing different modes from acting to performance—an important contribution to understanding performance in a gathering postmodern context and an account whose attempt to cover a range of performance manifestations evinces a not-dissimilar ambition to the present essay. Kirby remarks that his work was inspired in particular by the Happenings of the early- to mid-1960s, where

every aspect of theatre in this country has changed: scripts have lost their importance and performances are created collectively, the physical relationship of audience and performance has been altered in many different ways and has been made an inherent part of the piece, audience participation has been investigated, 'found' spaces rather than theatres have been used for performance and several different places employed sequentially for the same performance, there has been an increased emphasis on movement and on visual imagery. (Kirby 1972: 12)

In response to this situation—which has become a norm in many different locations—Kirby proposed a performance spectrum from 'nonacting' to 'acting', with (for example) 'Non-matrixed Performing' at one end of the spectrum and 'Complex Acting' at the other (Kirby 1972: 8). The value of Kirby's spectrum lies in its attempt both to incorporate and distinguish between a wide range of performance activity and its insistence that the project is one of classification rather than evaluation (value judgements as to whether the acting is 'good' or not, Kirby insists, are irrelevant). Its difficulty, however, is that it doesn't elaborate a set of technical calibrations of any degree of subtlety for use by either the critic or the performer—rather, it describes the place of a performance on a spectrum that is largely determined by the extent to which the individual performer can be seen to be engaging in representation-based characterisation, which thereby provides a norm against which things are judged. Kirby suggests that the "simplest characteristics that define acting […] may be either physical or emotional. If the performer does something to simulate, represent, impersonate and so forth, he is acting" (1972: 6). 'Acting' is the standard against which other kinds of performance—non-acting—are adjudicated.

The spectrum is yet more problematic in relation to contemporary performances where we may well observe 'acting' and 'non-matrixed representation' in the same performance—as has been the case for over a generation, at least. Twenty-five years after Kirby's "On acting and not-acting", Philip Auslander's influential collection of essays entitled *From Acting to Performance* was published (1997). Auslander was attempting to assess performance in and amid the ascendant paradigm of postmodernism. The New York-based company The Wooster Group provides a reference point. As Auslander says, "Wooster Group performances […] are less representations of an exterior reality than of the relationship of the performers to the circumstances of performance. Their style of performing, which at once evokes and critiques conventional acting, could be described as performance 'about' acting" (1997: 41). Auslander's essays on this new performance scene are perceptive but, as with Kirby, provide a theoretical perspective rather than a set of calibrations for use by contemporary performers or that enable precise technical descriptions of contemporary performance. From the 1960s onwards, the incursion of modes of performance other than acting has been noted, analysed, and indeed taught. Can we conceive of a rubric for this extensive diversity of performance that might help us to place it, critique it and teach it by way of a single continuum of analysis?

One further brief detour, by way of a partial affirmative. In 1928, Rudolf Laban published *Kinetographie Laban*, which set out the basis for 'Labanotation', his system for describing and classifying human movement. Laban describes four categories: Body, Shape, Space and Effort (Dynamics). These key components can be organised by way of eight efforts (as outlined below), each of which has four components: Space/Focus (Direct or Indirect), Time (Quick or Sustained), Weight (Heavy or Light) and Flow (Bound or Free). The system is represented in tabular form in Table 3.1.

**Table 3.1** Rudolf Laban's eight efforts of movement and their four components (the system is tabulated in various forms; the one given here is from Espeland 2015)

Source: Todd Espeland, The Drama Teacher, 2015; used by permission from Theatrefolk Inc

Laban's system addresses human movement, so while it has proved to be consistently useful for actors and performers, it cannot provide a comprehensive means of classifying performance activity. You might also suggest that this modernist effort at exhaustive categorisation is in any case now inappropriate to the mixed and messy, hybrid and fluid scene of contemporary performance. The task before us is to outline a scheme that allows for close analysis of a wide range of instances and that recognises variety within a system designed to accommodate difference. Allow me to attempt just such an effort in a post-postmodern moment, based on a reworking of Elleström's modalities and modes. As we saw above, Elleström described the relation between modes and modalities in tabular form, and I propose to retain this schematic approach in the section that follows—noting in passing that it provides a means of calibrating interrelations between components that is not entirely dissimilar to Laban's system outlining categories of movement and their respective modal expressions.

The challenge, then, is whether we can elaborate a scheme for analysing performance that is adequate to a range of contemporary performance acts. I am aware of the dangers, for any system that seeks to be widely inclusive may then appear to be context-free (so apolitical) and only vaguely specific (so inappropriate for a detailed analysis). I would prefer a cultural materialist approach that accommodates local specificity within a common scheme, one that seeks to provide a useful tool for analysis of apparently diverse objects of performance. The effort is not to reduce everything to a single paradigm but to move towards a system that is sufficiently flexible to recognise new forms and combinations whilst both *allowing* and *accounting for* their differences.

# 3.3 Modes and Modalities of Performance

Elleström describes four key modalities that provide the underpinning skeleton upon which any media product is conveyed. As we have seen, these can be represented schematically, along with the respective modes that attach to the modalities (see Elleström 2010: 36). I do not propose a direct reading of the work of the performer by way of Elleström's four modalities—although such a reading would be possible, allowing that the actor herself is a medium of communication. Rather, I suggest that we can identify four key modalities that apply to any human act of performance, taking performance here in its presentational/representational aspect: what Jon McKenzie describes as 'cultural performance' rather than 'organizational performance' or 'technical performance' (McKenzie 2001). In other words, we are concerned with the scope of performance that in Kirby's spectrum moves from character-based acting through to non-matrixed performance—but we are looking for a set of coordinates to help describe the composition and effect of each sort of performance before us.

The notion of modalities of performance provides us with such coordinates. We can assign four key modalities: the *emotional*, the *physical*, the *discursive* and the *contextual* (these are further defined in Table 3.2). Each can be addressed in relation to specific modes that apply to the modality, one or more of which will be present in the performance modality in question. All four modalities will be describable in any performance act, even if one or more are not predominant. The flexibility and plural potential combinations mean that the system of analysis can be applied to a wide range of performance instances. You might protest that this is like creating a map of the world that is as large as the world, if it allows for any and all possible performance—except that this is a portmanteau scheme. The four modalities are always structurally defining in some manner, their interconnection and combination variable, the specific arrangement of modes flexible.

In Elleström's table, the modalities of media are accompanied by respective modes that apply to the modalities. It is worth noting that Elleström has removed the table from the extended version of the article presented in the present publication (2020). In correspondence during the editing process, he suggested that this was "not at all because I think it's wrong as such to use such a table, but because it prompted some simplified and misleading uses of the concept of media modalities". Informed by Elleström's model—but mindful of his caution—we can outline something similar in relation to modalities and modes of performance, as seen in Table 3.2.

**Table 3.2** Modalities of performance (Lavender, after Elleström 2010: 36)

I should emphasise that the model is presented initially as a way of *reading* and *categorising* performance. It may subsequently inform ways of preparing for and producing performance—it would certainly be possible to conceive a range of exercises and rehearsal-room approaches that focus on a modal approach to the work in hand, in order to fine-tune presentation within a particular modality and interrelations between modes. This is not my immediate priority in the present essay, where I am approaching performance as a critic after the act, rather than from the perspective of an actor or director of the act. That said, the two perspectives are by no means irreconcilable, and the work of developing a modal approach to performer training could be highly productive.

We can tabulate the relationship between modalities and modes as seen in Table 3.3. The modalities operate *in relation to the perceiver*. That is to say, for example, the emotional modality is relevant insofar as the emotional state presented by the performer is ascribed by the perceiver. It matters less what the actual emotional state of the performer is (although this may also be relevant). If a character played by an actor, for example, appears morose, the emotional condition of the actor herself is a second-order concern. The first-order concerns are the suitability of the apparent emotional state to the material in hand and (for this is not quite the same thing) the effect (potentially both in cognition and affect) of the apparent emotional state on the perceiver.

**Table 3.3** Relationship between performance modalities and modes

The physical modality likewise may involve techniques of performance that separate the appearance of physical effort from the actual physical effort of the performer (for example, being breathless after running may belong to the character rather than the actor). A range of techniques are employed in drama schools across the world to help performers demonstrate subtlety and range within emotional and physical modalities. The point of the system presented here is to delineate how the modes of these modalities might be read in the performance act.

The discursive modality is to do with the meaning and significance that the spectator ascribes to the performance that is presented. This will depend upon additional information presented by (for example) other characters/performers in the piece, the arrangement of narrative, and scenic organisation such as, for instance, the configuration of lighting to create a threatening environment. The performer herself does not necessarily produce the communicative material that circulates here, but her performance is inextricably bound up with it and is enmeshed in a meaning-repertoire such that every instance of her performance is readable in and through the discursive modality. This applies even in more abstract pieces, where story is not at issue but nonetheless figurings of performance dynamics produce information for the spectator.

Something similar applies in relation to the contextual modality. For example, it matters to some members of the audience who go to see Benedict Cumberbatch playing Hamlet that Cumberbatch is a film star whose performances carry their trace in the memories of the spectator. It matters to some to see a 'first night', which comes with the added frisson of performers under the pressure of press night, or the sense of being among the first to witness a new piece. It matters to some to see a performance at a specific venue (one of my pilgrimages of a kind was to the Theater am Schiffbauerdamm in Berlin). It matters to some that the performer is one's daughter in a school play. If the discursive modality describes signifying details and meaning-ensembles that are generated by or presented *within* the performance itself, the contextual modality describes significant circumstances and situational details *surrounding* the performance that may have a bearing for perceivers.

In order to explore how the modalities and modes might be deployed, I will examine briefly four diverse instances: a performance by the actor Olivia Colman in the film *The Favourite* (character-based acting working from text); an excerpt from the opening scene of debbie tucker green's play *ear for eye* (a theatre performance that is text-based but without the sense of character depth that is entailed by more narrative-based drama); the appearance of the pop star Miley Cyrus in the episode of the Netflix drama series *Black Mirror* entitled 'Rachel, Jack and Ashley Too'—Cyrus plays an intertextual version of herself as before-the-camera pop star in a narrative in which she is behind-the-scenes victim of her own celebrity; and the appearance of performers in Brett Bailey's installation *Sanctuary*. The selection deliberately moves across different sorts of characterisation, different dramatic genres and distinct modes of presentation. With all the analyses that follow, I will restrict myself to a fairly limited palette of modes, in order to suggest predominant features.

#### *3.3.1* **The Favourite** *(2018)*

Written by Deborah Davis and Tony McNamara, produced by Fox Searchlight and directed by Yorgos Lanthimos, *The Favourite* is a movie set in 1708 and thereafter in the Court of Queen Anne, monarch of a newly unified Great Britain. It focuses on the triangular relationships between Anne and the cousins Sarah Churchill, Duchess of Marlborough, and Abigail Hill (later Baroness Masham), depicting a rivalry between the two women for Anne's favour. As such, the film is both a period drama and a character study, exploring in particular the interpersonal dynamics between the three women. Olivia Colman, who played Anne, won the Academy Award (Oscar), BAFTA Award and Golden Globe Award (among others) for best actress for her performance in this role.

The scene that I focus on here is between Anne and Abigail (played by Emma Stone). Anne is presented as a tetchy and capricious monarch afflicted with physical ailments (she notably had gout) and emotional vulnerabilities that are in part ascribed to her loss of seventeen children in or close to childbirth. She has seventeen pet rabbits, one for each child, and the scene shows Abigail manoeuvring her way into the Queen's attention and affection by way of her interest in this peculiar colony, housed in the Queen's bedchamber.1 The modal analysis in Table 3.4 is geared to Colman's performance—we could undertake something similar for Stone's performance, but for the sake of concision, we will focus on Colman.

The larger part of Colman's work in this scene, I suggest, is in the emotional modality. She conveys a range that moves from being bored with having rather too much time on her hands, to having her interest pricked by questions concerning the pets—reinforced, we understand, by Abigail's presumed empathy at the loss of her children, which opens into a reading of the character of Anne as defined by sustained and partly repressed grief. The emotional registers here are conveyed in part physically—alterations of eyeline, tilts of the head, flashes of the eye and so forth—along with a rhythm to the performance that evokes its emotional throughline. Colman presents the queen precisely as a regal figure, with a characteristic physicality to do with monarchical sway and inherited position, but also suffused with individual attributes derived from illness, age and circumstance. There is, too, the particular reading of persona in which the actor's deportment meets the imagined corporeality of the character.

**Table 3.4** Modal analysis of Olivia Colman's performance in an excerpt from *The Favourite*


The 'contextual' modality is differently pertinent. The film is based on historical circumstances, so there is inevitably a negotiation in the viewer's mind concerning the balance between an envisaged actuality and the conventions and tactics of dramatization. Colman's appearance as the lead character in an international movie means that we read the performance in relation to conventions and comparators pertaining to Hollywood movies, and not least the trajectory provided by female leads in films dealing with similar monarchical topics: Judi Dench as Queen Victoria in *Mrs Brown* (1997) and *Victoria and Abdul* (2017), Cate Blanchett in *Elizabeth* (1998) or Helen Mirren in the mini-series *Elizabeth I* (2005), for instance. We may also bring to the movie foreknowledge of previous performances by Colman, Stone and Rachel Weisz (who plays Sarah Churchill), so that the performance mode is coloured by *relationality* arising from other screen appearances in the lineage of each particular actor.

#### *3.3.2* **ear for eye** *(2018)*

*ear for eye* is a play by the British playwright debbie tucker green (who prefers the lower case for her name and the titles of her works). Its inaugural production opened on 31 October 2018 at the Royal Court Theatre in London with tucker green as director. The play is in three parts. The first focuses on exchanges between small groups of characters, specified as African Americans or Black British—for example, a son and his parents, and an activist and an older mentor. The second focuses on a discussion (in part about a mass shooting in the US) between a white male academic and a black female student. The third presents filmed segments in which Caucasian individuals in friendship or family groups recount, verbatim, protocols from the Jim Crow laws affirming racial segregation in the US, and British and French slave codes. The play as a whole, then, presents a series of scenes and sequences in which characters (none of whom we get to know in any close or detailed way) discuss or negotiate nuances of interrelation with others, particularly elaborating on racial perspectives and prejudices, and in which individuals unpack historic and schematic constructions of racial oppression. Here is the beginning of the first scene:

**PART ONE Scene One** *US*.


(tucker green 2018: 4; the excerpt from *ear for eye* by debbie tucker green is reprinted courtesy of Nick Hern Books www.nickhernbooks.co.uk)

Given the stage direction—mother and son are African Americans in the US—we understand the scene in the context of race relations in the US but also more generally. It becomes clear in the scene that from the mother's perspective there is no physical attitude on the part of her son that could not be construed as threatening or provocative by a (presumed white) policeman who chose to interpret his posture in this way. How might we describe the performance envisaged by this text (see Table 3.5)?

The modes calibrate differently depending on which performer we observe, although the scene as a whole can be characterised in relation to the attributes above. In terms of the emotional modality, the scene depends on our understanding that the mother is *concerned* for the safety of her son, while he becomes *angry* at the lengths to which he has to go to avoid her (or an imagined other's) ascription of threat. In the physical modality, both characters are gesturally *reactive* in response to utterances of their interlocutor, and the physicality of the scene is generally *tense* and held, rather than, say, relaxed and fluid. Discursively, the scene (as with the play as a whole) does not follow key characters in order to tell a story, but rather presents a nugget of exchange that thematises the lived experience of black people amid racism; hence the work of the performers is *informational* and expressly requires that we read the scene *metatextually* as an emblem of black experience in the face of oppressive social and civic authority structures. Contextually, the performance provides aesthetic pleasure through the jagged, schematic and non-naturalistic dialogue, thereby asking spectators to recognise it as overtly *compositional*, while simultaneously it invites recognition of the larger social situation that it depicts, so that the performers' bodies are literally *representational*—representing (through a set of transparent dramatic devices) the perspective of black citizens—and *intertextual* in calling to mind the Rodney King beating and other instances of racially motivated police aggression.

**Table 3.5** Modal analysis of performance in an excerpt from *ear for eye*

The modes and modalities are not entirely self-contained—the *intertextual* designation, for example, could also belong to the 'discursive' modality. Modes within modalities may vary for individual spectators. For example, the 'contextual' modality might contain aspects of meaning that arise from watching a performance at the Royal Court Theatre, which may be different for a serial visitor interested in 'new writing' (the theatre's programming focus) than for a first-time visitor interested specifically in the topic of the play. Whilst we should be careful to acknowledge that reception can vary widely, the modal ensemble allows us to be reasonably precise in categorising the operation of performance in this piece as distinct from other sorts of performance in other situations.

#### *3.3.3* **Black Mirror***—'Rachel, Jack and Ashley Too' (2019)*

*Black Mirror* is a dystopian science fiction series created by Charlie Brooker. Its first two series were broadcast by Channel Four in the UK (the first episode aired on 4 December 2011), before the programme was bought by the video streaming service Netflix. Each episode is self-contained, typically telling a story with a twist or playing out the envisaged logical conclusions of a future development of contemporary technology—the programme is largely geared around the interface between human agency or insufficiency in relation to the imagined use or exploitation of technology. The episode that I consider here, entitled 'Rachel, Jack and Ashley Too' (directed by Anne Sewitsky), was released as part of Season Five on Netflix and first aired on 5 June 2019. It brings two parallel storylines together. Ashley O, played by Miley Cyrus, is a pop star (like Cyrus in real life). Ashley's management team develops a small robotic toy doll that can interact (largely by talking in a motivational and saccharine way) with its owner. When Ashley starts to rebel against her controlling manager, she is incarcerated. Meanwhile the teenage Rachel (played by Angourie Rice), an Ashley fan, receives one of the dolls as a present. After a device malfunction, the doll speaks not in marketing platitudes but with the 'authentic' voice of the incarcerated star. Various adventures of liberation follow.2 In Table 3.6 I suggest some of the predominant modes of Cyrus's performance across the piece as a whole.

'Rachel, Jack and Ashley Too' is inherently multimodal, in the intermedial sense, in that it features pop performance by Ashley alongside scenes showing her being interviewed on television and 'actuality' scenes out of the eye of any diegetic camera. This makes Cyrus's performance as a whole more complex than might be the case in a less modally diverse format. There is—as we might expect—a wide range of modes in the physical modality, covering Ashley's performance as a singer, her appearance as 'pop star' and the more private (and in places comically vulgar) behaviour of the character in a domestic setting. The discursive aspect of Cyrus's performance is not the least complicated part of this piece. Functionally, it reveals plot information; operates across generic registers while indicating a witting manifestation of these registers; presents a character who does not know what is happening to her and who seizes the initiative in a way that resonates with feminist and #MeToo discourse. Contextually, the episode gains traction precisely for the casting of a music star in role as a music star within a dark narrative of exploitation. Meanwhile Cyrus performs a set of modal negotiations (of pop iconicity, trauma, liberation narrative and postmodern intertext) within a single drama—a black mirror indeed.

**Table 3.6** Modal analysis of Miley Cyrus's performance in *Black Mirror*, 'Rachel, Jack and Ashley Too'

#### *3.3.4* **Sanctuary** *(2017)*

Conceived and directed by South African artist Brett Bailey and presented by Third World Bunfight, *Sanctuary* was first presented at the Fast Forward Festival in Athens on 3 May 2017.3 I visited the installation on 9 June 2017 at the Theater der Welt Festival in Hamburg, Germany, and write about it in Lavender (2019: 57–59). *Sanctuary* is an installation for spectators to walk through. It creates the figure of a maze—the myth of the minotaur is invoked—as the holding pattern for a series of 'stations', each of which features a performer playing a fictional character presented as if actual. The performers do not speak and barely perform any physical action, instead simply sitting or standing *in situ*, so the human figures here are not so much *characterised* as *displayed*. Accompanying segments of text give us to understand that, for example, one character is 'Mahmoud, 36, dress shop owner' and another is a 23-year-old make-up artist. All are connected by the theme of migration and immigration, and they variously inhabit their stations amid scenographic devices (fences, police tape, orange life jackets) that evoke the holding pens and arrival centres of the contemporary refugee situation. How might we categorise the work of performance in this piece (see Table 3.7)?

**Table 3.7** Modal analysis of performance in *Sanctuary*

The performance of the human figures is modally limited, strikingly so, particularly in the emotional and physical modalities. That is not to say that it is less important to the effect of *Sanctuary* than the performance in our previous instances. It attains its potency through the discursive and contextual modalities, as we read into it a set of circumstances and histories that are geographically specific while also responding to the larger civic and political crisis of migration (amid systemic failure on the part of nation states to find an adequate set of responses). You might argue that some of the modes that I have included in (for instance) the contextual modality—such as *scenographic*, *compositional*, *presentational*—are open-ended and would apply to any kind of performance. I do not necessarily disagree, but the point is that these modes *come to the fore*, and the performance thereby attains its distinctness through the specific use of the figure as a symbolic scenographic presence rather than, as with two of my previous examples, a (so to say) fleshed out character. The *environmental* and *relational* aspects of the piece partly concern its figurings of actuality and fiction, both thematically (it draws on actual instances of migration, even if it alters these within the fictional scenarios of the event) and by way of the audience's encounter, navigating an actual space which is also designed as a fictive labyrinth. In many ways, *Sanctuary* is a good example of what Mark Crossley describes as work that is "both distinctly fictional *and* real" and as "hypermediated theatrical signification" (Crossley 2020: 104).

*Sanctuary* does not feature character-based or narrative-driven performance, so it is useful here in helping to delineate the scope of a modal approach. In her contribution to the present publication, Kate Newell considers how adaptations of Margaret Atwood's *The Handmaid's Tale*, and in particular accompanying graphic representations such as cover art and illustrations, shape particular responses to the material. As she suggests, "each adaptation foregrounds certain modalities to lead perceivers toward particular interpretations of the communication transfer" (2020: 36). Whilst I have transposed modalities from those outlined by Elleström, a similar principle applies, in that the foregrounding of specific modes within the modalities of performance leads spectators towards particular interpretations. The issue is not that any particular mode is unique or requires definition in an entirely bespoke way in relation to the performance in view. Rather, it is that the modal ensemble defines the performance, and a modal analysis provides insights as to how this ensemble is prepared and presented.

#### 3.4 Towards a Multimodal Performance Analysis

Elleström suggests that

Every medium consists of a fusion of modes that are partly, and in different degrees of palpability, shared by other media. […] Since the world, or rather our perception and conception of the world, is utterly multimodal, all media are more or less multimodal on the level of at least some of the four modalities. (2010: 24)

If we transpose this to a consideration of modes and modalities of performing, the shared principle is that in performance acts we also observe a fusion of modes, to different degrees, where performance is multimodal with respect to interrelating modes across the four modalities (emotional, physical, discursive and contextual). A focus on modes and modalities allows us to set aside questions about the psychology of characters or the interior drives of actors, in favour of a more interrelational exploration of actions, interactions and conveyances of communicative information. It provides a way of calibrating—either in advance (for training, rehearsal, preparation) or after the event (by way of analysis and critical review)—the constitutive features of any particular performance act.

I have presented here a speculative set of starting points, both as a form of analysis and as a potential set of coordinates for performance preparation. A next step would be to test this in workshop and rehearsal situations, with actors and performers preparing work for public presentation. Might performance be helped, or changed, if the performer pays attention to the respective modalities? Might it take on different nuances if the performer explores the relevant fusion of modes and how the multimodal dynamic shifts as the performance moves along? This is for fresh exploration. We can pause meanwhile at the observation of Bateman et al. that "Modes presented together then need to be interpreted with respect to one another and so cannot be considered independently" (2017: 17). A multimodal approach to performance requires this critical disposition, for the performers in front of us present a complex set of communicative possibilities. They are a channel for intracommunicational elements within the world of the drama or performance event; they operate in relation to extracommunicational features derived from our knowledge of the world; and they are phenomenal figures in their own right, summoning and shaping a communicative repertoire that is inherently multimodal.

### Notes


#### References



# Electronic Screens in Film Diegesis: Modality Modes and Qualifying Aspects of a Formation Enhanced by the Post-digital Era

*Andrea Virginás*

# 4.1 Screens and Frameworks

Examining phenomena of intermediality (with)in moving images requires supplementary methods imposed by the current post-digital era that "no longer seeks technical innovation or improvement, but considers digitization something that already happened and can be played with" (Cramer 2013). Besides the most basic substratum of media history and/or philosophical aesthetic theory, as introduced by Paech (2011), Pethő (2011) and Bruhn and Gjelsvik (2018) in the analysis of filmic intermediality, insights from semiotics, communication theory and narratology need to be invoked and combined systematically. Media history and philosophical aesthetics allow for cataloguing the cases of twenty-first-century communication technological and media cultural developments being incorporated, with relative ease and rapidity, by filmic diegesis1 originally formatted for the analogue platform. It is of the latter process that Joachim Paech wrote in the early 2010s that "[w]e shall see what happens with the intermediality of film in those new media surroundings where film cannot be distinguished any more from what it is not" (2011: 19). Trying to answer this question, descriptions of cases of intermediality in/of film need to be complemented with a communication and media theoretical meta-framework which I outline here. Thus, a space may be configured where the poetic methods generated by the relatively quick transformations along 'the analogue to the digital to the post-digital' axis may be conceptualized and structured, simultaneously accounting for the extraordinary 'multimodal heteromediality' of the cinematic medium (cf. Elleström 2020: 73–75).

A. Virginás (\*)

Sapientia University, Cluj-Napoca, Romania

© The Author(s) 2021

L. Elleström (ed.), *Beyond Media Borders, Volume 1*, https://doi.org/10.1007/978-3-030-49679-1\_4

Theorizing the specific condition of moving images in the post-digital era, Thomas Elsaesser envisages a formation that "does not project itself as a window on the world nor requires fixed boundaries of space like a frame", but "it functions as an ambient form of spectacle and event, where no clear spatial divisions between inside and outside pertain" (Elsaesser 2016: 133). In a similar vein, and based on the analysis of moving-image art installations in the twenty-first century—among them Pipilotti Rist's *Layers Mama Layers* from 2010—Giuliana Bruno observes that "We no longer face or confront a screen only frontally but rather are immersed in an environment of screens" (Bruno 2014: 102). Rist's 2007 installation *Dawn Hours in the Neighbour's House* definitely fits Bruno's description of the process "where one becomes an integral part of a pervasive screen environment in which it is no longer preferable or even possible to be positioned in front of the work" (Bruno 2014: 102). From the windowpanes of the terrace, on to the plasma TV screen, through the floor and the edge covers of the books on the shelf in *Dawn Hours* every element functions as a screen that lights up and then fades in the dark, creating a "fluid, haptic world of surrounding screens" (Bruno 2014: 102).2

The argumentation of this chapter starts from the observation that it is possible to isolate an intermediary screen(ic) formation that may be situated somewhere amongst analogue photographic cinema shown on a fixed (canvas) screen that necessitates a fixed spectator; Elsaesser's postphotographic, possibly digital cinema without clearly fixed, window-like boundaries and finally Bruno's surround screen environments that are fully immersive. This intermediary screenic formation may be described as the narratively significant embedding of electronic screens in film diegetic worlds designed for vertical cinematic screens—be they fixed analogue or mobile digital ones. It may be exemplified with television sets that the characters watch, with computers or mobile phones used by characters in action, or indeed CCTV cameras that convey unusual angles on otherwise well-known diegetic spaces. While the filmic narrative which embeds electronic screens to the extent that even "multiple diegetic worlds" (Elsaesser 2016: 69) may be generated was already present in the television and video era, our post-digital age and its givens of digital image-making, image-processing and image-display have led to its enhanced proliferation. Analysing the proposed intermediary screenic formation will constitute the discussion part of the chapter, governed by the hypothesis that its features are most adequately understood in the above sketched multidisciplinary framework, that is therefore demonstrated to be a suitable one to examine the changes in phenomena of intermediality pertaining to film "in those new media surroundings where film cannot be distinguished any more from what it is not" (Paech 2011: 19).

The framework of philosophical aesthetics allows us to observe that such embedded electronic screens tend to be neutralized as pro-, or even afilmic objects,3 which are there to emanate Roland Barthes' "effect of the real" (1968).4 In this capacity, these intermediary screenic formations mirror the numeric increase of electronic digital screens as conditioned by the technological changes along the turn of the twenty-first century. Furthermore, and as suggested by Roger Odin's observation, such electronic screens are understood as frames that aestheticize, and also re-order levels of reality (2016: 183). This aspect is also supported by my analyses of such electronic screens in Euro-American arthouse films that create Second Cinema-type filmic diegeses adhering to conventions of (hyper)realism, non-hypermediation and character-centred storytelling (Virginás 2018). These screens also focus, in a hypnotic manner, the viewers' attention, as Dominique Chateau so convincingly argues (2016: 197). Finally, thanks to what Jacques Derrida names "the labour of the frame", such embedded electronic screens "[labour] (travaille) indeed [and generate a] structurally bordered origin of surplus value, overflowed (debordée) on these two borders by what it overflows, it gives (travaille) indeed" (Derrida 1987: 75). While these observations definitely may be invoked to characterize diegetic electronic screens that introduce frames and edges in the diegetic worlds as constitutive backgrounds, they can be used, with the same validity, to describe paintings or photographs hung on film diegetic walls too.5 In order to account for the medium specificity/ies involved in constructions that involve electronic screens in the process of building the film diegetic world, semiotics and its offshoot, communication and media theory, are to be invoked.

Lars Elleström's "The Modalities of Media II" (2020) offers itself as an adequate framework in this respect due to its multi-level design, its multi-angle medium sensitivity6 and its reckoning with historical change. In a draft of his contribution to the present publication, Jørgen Bruhn identifies "the model's main strength" as "encompass[ing] all imaginable material units that enables communicative interaction", thus "cut[ting] through the oft repeated discussions whether it is the canvas or the motif of an oil painting that is the 'medium' or whether a mobile phone is a medium or a technical device" (2020). Its adequacy is also signalled by Bruhn and Gjelsvik's building on it in their recent *Cinema Between Media: An Intermediality Approach* (2018) or indeed by the successful application of the media modalities model to the examination of moving images by authors in the present publication (Crossley 2020; Lavender 2020; Lutas 2020; Newell 2020; Simonson 2020; Tseng 2020). Obviously, the application of any model also involves its testing on fuzzier cases, thus extending its validity or, conversely, suggesting its limitations; a number of such adjustments must be signalled at the outset of the present examination. The scope of the current endeavour is definitely broader than the basic entity of analysis in Elleström's media theory, constituted by "the transfer of cognitive import from a producer's to a perceiver's mind" through "the intermediate stage" named "media product" (Elleström 2020: 13). At least two ways must be mentioned in this respect: the higher number of producer and perceiver minds as well as the complexity of the media products themselves involved in film(ic) communication which is dependent on interlaid electronic screens.

The fundamental importance of the first aspect—namely, that "the minds of scriptwriters, directors, actors and many others combine to create the motion picture, [while] the audience consists of a multitude of perceiving minds" (Elleström 2020: 25)—must be acknowledged, even if it is only fleetingly touched upon in this analysis. Second, the complexity of the media products involved in the current examination means that "transfers of cognitive import" need to be accounted for. The "clusters of media products" or "media types" (54) are conceived of as "realized by either bodily or non-bodily matter" (35) and ultimately shown to be dependent upon "technical media of display" for their realization (35). The electronic screens present in film diegetic worlds could be easily overlooked as the "technical media of display" par excellence, or indeed simply categorized as metaleptic devices allowing for the change of narrative levels,7 at the same time contributing to creating the filmic diegesis and consequently forming a part of the filmic medium.

Thanks to Elleström's model, a more detailed scrutiny of these formations becomes possible, and in order to proceed in this direction, Sect. 4.2 is devoted to characterizing diegetic electronic screens as "basic media types", which are defined as the combination of four "media modality modes" (Elleström 2020: 55–58): "at least one material mode (as, say, a solid or non-solid object), at least one spatiotemporal mode (as three-dimensionally spatial and/or temporal), at least one sensorial mode (as visual, auditory or audiovisual) and at least one semiotic mode (as mainly iconic, indexical or symbolic)" (Elleström 2020: 46). My specific task in this respect is in many ways similar to how Mark Crossley examines theatre performances with the aim of "establishing the significance of the spatiotemporal and sensorial modalities, alongside the material modality, in realising the hypermedium and to shed greater light on what this specific hybridised theatrical signification may look like and what it may accomplish" (Crossley 2020: 109).

Though the mentioned four modality modes are evidently interrelated,8 applying this grid to the specific case of electronic screens embedded in film diegetic worlds highlighted the strong interdependence of the material and the spatiotemporal modes, as well as the chain-reaction triggered in all the four modality modes by one of the modes being changed. These changes in the modality modes of electronic screens may be demonstrated to have a connection to the 'analogue to digital to post-digital' platform and paradigm changes, especially since the media modalities model also includes historical change through the differentiation between "basic" and "qualified media types" (Elleström 2020: 54–66). Thus, Sect. 4.3 in the present study will focus on the embedded electronic screens' "qualifying aspects" in order to offer a historically grounded characterization.9

The proliferation of television screens, video monitors, computer or mobile screens (with)in film diegetic worlds is an apparently simple numeric increase of certain objects within the filmed space, a phenomenon conditioned by, and thus mirroring technological changes during the twentieth and twenty-first centuries. According to the main argument of this chapter, this intermediary screenic formation should be considered a dense, complex and versatile audiovisual and narrative method that could have emerged only in our current post-digital era. With the aim of fine-tuning the model of media functioning presented in "The Modalities of Media II" (Elleström 2020) for this specific phenomenon, while simultaneously hoping to achieve a systematic description of electronic screens in film diegetic worlds, Sect. 4.4 will aim for a description of the intermedial processes at work in such examples.

# 4.2 Diegetic Electronic Screens as "Basic Media Types"

Media products and their ensuing communicative effects are characterized by three such modalities—the "material", the "spatiotemporal" and the "sensorial"—that are considered "presemiotic", as compared to the fourth, "semiotic" modality (Elleström 2020: 41–54). While the author stresses that all four modalities are equally relevant, the semiotic modality is seen to somehow sustain all the others since "if the sensory configurations with material, spatiotemporal and sensorial properties do not represent anything, they have no communicative function, which means that there is no media product and no virtual sphere in the perceiver's mind" (Elleström 2020: 49). In line with this observation, we can conceive of the electronic screens in film diegetic worlds as always being—partially or fully—within the semiotic modality. The content that these inlaid electronic screens display might be graphs, texts, videos, television programmes and, evidently, other films: thus all the three semiotic modes (iconic, indexical and symbolic) might characterize their functioning in communicative situations. However, as the diegetic electronic screens are par excellence "technical media of display" as well, a more precise description of the process along which the three presemiotic modalities morph into the semiotic one becomes possible. Hence, a description of embedded electronic screens from the perspective of the presemiotic modality modes comes first, and the subsection after it discusses how these screens assume their semiotic modality modes within the diegetic worlds.

#### *4.2.1 Changes in the Material, the Sensorial and the Spatiotemporal Modality Modes of Diegetic Electronic Screens*

The (electronic) screens that I deal with are generally solid as for their material modality and are made of inorganic canvas, plastic, steel or glass. Very diverse examples fitting the above characterization may be cited: a televisual screenic image watched by the main protagonists and showing an undressing Bette Davis in Joseph L. Mankiewicz's 1950 *All About Eve* as embedded in the credit sequence of Pedro Almodóvar's 1999 *All About My Mother*. The final 'love or death' duel from Billy Wilder's 1944 *Double Indemnity* appears in a somewhat similar manner in Brian de Palma's 2002 *Femme Fatale*, on a television screen on which the female protagonist's profile is mirrored simultaneously. Finally, reference is made to the projection on a portable canvas of a moving image excerpt from a 1940s Veronica Lake-movie in Curtis Hanson's 1997 *L.A. Confidential*, again unfolding under the watchful eyes of the hero couple in the film (Virginás 2019).

The standard screenic materiality is, however, disrupted in the genre (or "submedium"10) of science fiction, which may be described as predicated upon the "main formal device [of] an imaginative framework alternative to the author's empirical environment" (Suvin 1972: 375). In this context, non-solid, near-plasma and even liquid screens as well as organic screens may be mentioned. In Steven Spielberg's 2002 *Minority Report*, the computer screens—as objects in the first-level diegesis—look like translucent windowpanes hanging horizontally, resembling air or water drops as for their texture and mode of existence. They are easy to manipulate, information may be organized and grouped, or processed through hand gestures and also by voice. These *Minority Report* screens may be turned off and integrated seamlessly in the background or they may shine full of information when needed—recalling Rist's rhythmically lighting unusual screenic surfaces in *Dawn Hours in the Neighbour's House*. As for the organic screen, in David Cronenberg's 1983 *Videodrome*, the bulky TV set in producer Max Renn's bachelor apartment is developing veins and lips in the hallucinatory scene of its transforming into the producer's lover, Nikki Brand (or her body).

However, the conception of non-solid and fluid screen surfaces is perhaps nowhere exploiting to a more astonishing degree the multimodal "heteromediality" (Bruhn 2010) of the cinematic medium than in Denis Villeneuve's 2016 *Arrival*. Here, the screen's mediality can be read as possibly indexing and symbolizing the perception of the alien entities landing on Earth, simultaneously with conveying the perplexed emotional state of the protagonist, Dr Louise Banks, a linguist establishing contact with the outer space creatures. Even on the first occasion of its appearance the giant screen interface separating the aliens from the humans is definitely displaying gas, smoke and plasma-type materiality, being focalized primarily by the members of the human crew and, occasionally, also framed by the more than three-dimensional spatial perception of the aliens (Fig. 4.1).

This aspect of non-solid materiality is further emphasized thanks to a number of elements conditioned by the digital cinematic medium's specificities: the gut-deep roars and fluid movement of the heptapod aliens created through composite animation-and-CGI techniques; the detailed view of the vapour blinding Louise's view from within her spacesuit; and finally, the specific mode of writing that the aliens have, which deforms, disperses and flows away after it has performed its basic role of (possibly) creating cognitive import in Louise and the team's minds. On the occasion of the third visit in the aliens' tower-like spaceship, Louise takes off her astronaut suit and advances towards the two alien entities, in an effort to make them integrate the (written/symbolic) word 'Louise' with the object of her (self/body). Shown from one side, as a small figure dwarfed by an aquarium-like screenic entity containing the heptapods, Louise's human figure serves to counterbalance the liquid, non-solid, possibly organic materiality that makes the aliens perceptible to the human eye as if through a screen interface (Fig. 4.2).

**Fig. 4.1** De-solidifying alien and solid human screens in *Arrival* (dir. Denis Villeneuve, 2016). All rights reserved

The fully transparent materiality of this dividing screen is re-configured as partly solid when—after Louise has placed her palm on it—the heptapod also presses a floral-shaped limb against it. This scene in the film closely corresponds to how media theorist Sybille Krämer sees the "material modality modes" of all media as dependent on what she defines as transparency: "[m]edia are indeed bound to materiality, but their transparency is practically required: air, water or crystals are thus the most favourable materials for media of perception", she observes (Krämer 2015: 32).

However, it is not only the dividing screen within the spaceship of the alien heptapods—where earthly physical laws of gravity and three-dimensionality do not apply—that de-solidifies. When we are shown the army and the scientific team's common efforts at deciphering the heptapod auditive strings on the large computer screens positioned inside the earth base, these electronic screens' content is effortlessly transferred and complexly mirrored on the transparent plastic dividing sheets of the military tents (Fig. 4.3). A similar effect is created by such set design when several large screens are positioned side-by-side to ensure simultaneous reception of developments on all the twelve earthly sites where the aliens have landed.

**Fig. 4.2** Transparency as an essence: Louise facing the alien creatures in *Arrival* (dir. Denis Villeneuve, 2016). All rights reserved

**Fig. 4.3** Human screens losing solid materiality in *Arrival* (dir. Denis Villeneuve, 2016). All rights reserved

This process of de-solidifying the interface screen between humans (Dr Louise Banks and the team) and aliens (the two heptapods, Abbott and Costello) may thus be positioned as one striking marker of how space, time, memory and ultimately identity will de-solidify in *Arrival*. The process reaches its climax in the scene in which Louise is transported to the alien ship within a capsule that they sent for her. Here the screenic interface is first suggested, or indeed "transmediated" (Elleström 2020: 81–83), through a number of non-(electronic) screenic entities: smoke that is later shown to be emanating from the frost-like cubes Louise lands upon; Louise's slowly fluttering hair; and even her sentiment of angst and extreme fright from maximum exposure in a possibly hostile environment, perceptible to her (and to us) as layers of clouds where the alien heptapods move/swim/fly freely. However, the ultimate screen frame, that of the cinematic image, remains firmly in place, as suggested by the total view of a small Louise facing a giant heptapod, while both of them are limited to the right by a black rectangle, recalling the initial screen that separated the two worlds all throughout the alien–human contact narrative.

When solid, these embedded and (generally) electronic screens are characterized by two spatial coordinates, height and width, and by the temporal coordinate, with their sensorial (multi)modality being an audiovisual one. However, when the embedded electronic screens are shown to acquire non-solid and/or organic materiality traits, the fourth spatiotemporal dimension of depth is added and activated as well. When screens de-solidify as in *Minority Report* or *Arrival*, or are attributed organic qualities as in *Videodrome*, they open towards depth. An interesting case where depth is added to a materially "solid and flat" and sensorially "audiovisual [electronic] screen" is to be found in Ridley Scott's 1982 *Blade Runner*, in the famous sequence when Deckard, the detecting figure, is analysing a photograph he found in replicant Leon's apartment. Deckard sits opposite a computing device that seems to be a mix of a scanner, a printer, a computer and a television set, on which he performs the analysis of the found photograph. The device is governed by Deckard's voice, and he quarters, zooms in and out on the originally printed photograph, up to the point when, among details reminiscent in their figurative manner of old Dutch masters, a new figure, unseen up to now in the mentioned setting, appears: a female replicant known as Zhora. That this computer screen in *Blade Runner* is a passageway in depth to an equally important, yet different(-level) diegesis is also suggested by the last element Deckard discovers on the analysed photograph: the fake scales of which club dancer Zhora's shawl is made, which will become the next element in advancing the investigation into the rebellious replicants, among them Leon and Zhora (Virginás 2014).
Yet, this opening of a solid screen towards depth is accompanied by a change in one of the other modality modes: a change of proportions within the sensorial modality—this is a voice-governed, rather than just watched screen—engenders depth being added to the other three spatiotemporal coordinates in this scene from *Blade Runner*.

In these scenes quoted from *Minority Report*, *Videodrome* or *Arrival* we can observe that, besides watching and hearing, the sensorial mode of tactility is also added to the functioning processes of the screens within the examined diegetic filmic situations. Simultaneously, we can notice another instance of how change in one of the modalities, in this case the sensorial one, entails changes in at least one of the other modalities too. This might be the material one, with the 'audiovisualtactile' screens de-solidifying as in *Minority Report* or *Arrival*, or being attributed organic qualities as in *Videodrome*; or, indeed, change occurs in the spatiotemporal modality, with tactile screens opening towards depth. To hint at one of the main conclusions of this analysis, material modality changes of the examined screens seem to trigger changes in the spatiotemporal and sensorial modalities too, and vice versa. These processes support the displacement of (qualified) media boundaries that we have been witnessing between analogue filmic, analogue electronic, digital filmic and digital electronic media, a phenomenon sometimes referred to as 'the death of cinema' and dealt with in detail in the next section (entitled "The Qualifying Aspects of Electronic Screens").

### *4.2.2 Diegetic Electronic Screens on the Verge of the Presemiotic and the Semiotic Modalities*

As demonstrated in the previous subsection, diegetic electronic screens allow for conceptualizing the interdependences present between the three presemiotic media modality modes: the material, the spatial and the sensorial. Furthermore, electronic screens inlaid in film diegetic worlds bridge the difference between the Elleströmian categories of "technical media of display" not creating cognitive import and the "media types" that create cognitive import. Thus, they are adequate units of analysis on which to base a description of the passage from the three presemiotic media modality modes to the semiotic one, which covers iconicity, indexicality and symbolicity in Peircean terms. Andy Lavender's work pondering this aspect is also illuminating since he conceives of the material, the spatiotemporal and the sensorial modalities as "rather describ[ing] structuring aspects that will *then affect* cognition", with the semiotic modality evidently originating from cognition (2020: 115, emphasis in the original). Interestingly, presenting a clear case of academic serendipity, as the articles in the present publication were written simultaneously and independently, Tseng's contribution also touches upon these issues. Using the umbrella term of "digital mediated images" she considers that it "should be read as a broader conception than that of just the new digital media used diegetically by fictional characters in the film […]: in this chapter, it describes various forms of added realism, among them news footage, intra-diegetic camera, and computer screen" (Tseng 2020: 175–176). My contribution adds to this observation the categorization of embedded screens to be presented in what follows.

The mediating capacities of the electronic screens examined here may be conceptualized, and also categorized, starting from the observation that in the absence of a communicative function and a virtual sphere "created in the perceiver's mind" the sensory configurations will not become media products and thus do not represent anything (Elleström 2020: 21). Based on this, I differentiate between three types of diegetic and non-cinematic (electronic) screens embedded in film diegetic worlds: *decor screens*, *diegetic screens* and *metadiegetic screens*. Screens belonging to the first type constitute a background or an atmosphere-like environment in Bruno's sense, existing in a presemiotic condition always on the verge of bursting into semiotically meaningful surfaces of communication: the diegetic television, video, computer or mobile screens may form part of the decors and will be called *decor screens* henceforth. The second type of diegetic screens are watched, manipulated or otherwise used by diegetic characters, thus illustrating the activation of the semiotic modality too, besides the other three; these will be called *diegetic screens*. The third type of screens is primarily there for the viewer in the afilmic, actual sphere (the extracommunicational domain) to watch and to create cognitive import from; these might serve the narration and be visible, evident or meaningful only for the actual viewer, and I will therefore refer to them as *metadiegetic screens*. Obviously, the positioning of the respective screens may change from scene to scene, even within the same filmic narrative.

My classificatory scheme may be seen as somewhat bordering on what Gérard Genette defines as "the main types of relationships that can connect the metadiegetic narrative to the first narrative, into which it is inserted" (1983: 232). Obviously all three types of embedded electronic screens are capable of carrying metadiegetic content with respect to the first diegesis as unfolding on the cinematic screen. However, this aspect must not be equated with these screens assuming a fully semiotic modality within the respective diegetic scene: as we shall see, there are interesting correlations between embedded electronic screens as chiefly characterized by the presemiotic modality modes (or the *decor screens*), inlaid electronic screens as chiefly characterized by the full emergence of the semiotic modality within the diegetic reality (*diegetic screens*), embedded electronic screens as chiefly characterized by the full working of the semiotic modality in the extracommunicational domain of the actual viewer (*metadiegetic screens*) and finally the three types of relationships as described by Genette.

#### *Decor Screens*

In the first case, the respective diegetic electronic screens might serve the purpose of connoting a familial, private environment and its social positioning. As an illustrative example, we can think of the rugged TV set that Carla Jean Moss—the declassed wife of one of the chief protagonists—is watching in their even more derelict cabin home in the Coen brothers' *No Country for Old Men* (2007). In contrast, such screens might index an institutional, thus public context, perhaps a secret headquarters with magnificent-scale operations as in the case of Q's base in the Bond sequel *Skyfall* (Sam Mendes, 2012). We can categorize *decor screens* as belonging to the sphere of profilmic reality, "the reality photographed by the camera" (Buckland 2003: 47), with evident links to afilmic reality, which "exists independently of filmic reality" (Buckland 2003: 47). Souriau and Buckland's afilmic reality resembles what Lars Elleström defines as "the extracommunicational domain" preceding and surrounding ongoing communication (Elleström 2020: 27–33), the latter being, in our case, everything pertaining to the film (containing electronic screens) that a viewer is in the process of watching. Relating such *decor screens* (be they televisions or surveillance camera images) connoting a- and profilmic reality to the extracommunicational domain is even more pertinent in the light of the observation that "[v]ital parts of the extracommunicational domain are constituted by perception and interpretation of media products" (Elleström 2020: 28).

The second type of relationship that Genette conceives of between the first narrative and the metadiegetic narrative "consists of a purely *thematic* relationship, therefore implying no spatio-temporal continuity between metadiegesis and diegesis: a relationship of contrast […] or of analogy" (1983: 233). Interestingly enough, it is *decor screens*, which are foremost in connoting and also indexing afilmic reality and the extracommunicational domain, that are bound to perform this Genettian "thematic, contrastive or analogical" relationship between the filmic diegesis and the (Genettian) metadiegetic level as embodied by the electronic screens. What I name *decor screens* constitute a 'presemiotic screen environment': thus they draw attention to the aspect of "mediation" rather than that of "representation" (Elleström 2020: 38–40)¹¹ while communication is going on, and this feature is mirrored in the Genettian model as non-existent "spatiotemporal continuity". An adequate example in this respect may be cited from David Cronenberg's *Maps to the Stars* (2014): in one scene, Agatha, the evil-doer incognito who is working as a personal assistant to Hollywood star Havana Segrand, arrives at her employer's home. In the luxurious, English country-style kitchen the "vertical viewing dispositif" (Strauven 2016: 144) stands out through its minimalist, technologically up-to-date look, while showing a live television talk show in which Havana repeats the story of her long-dead actress mother, with essentially no new piece of information added to what has been presented up to now in the filmic diegesis unfolding on the cinematic screen. However, the superficial flatness of the television talk show as mediated through this *decor screen* is in a Genettian 'thematic contrast' to the actress Havana's inner torments regarding her abusive mother and, in addition, indexes the hardships of her getting the role about which she is interviewed.

#### *Diegetic Screens*

Apparently, the same objects that functioned or will function as *decor screens* may re-appear as *diegetic screens* having further function(s) and role(s) within the diegetic world/reality (or "the fictional story world created by the film" (Buckland 2003: 47)). A survey of what I call *diegetic screens* could start with examples of diegetic characters being interpellated by televisual screens, like producer Max Renn being addressed by his secretary through a televisual screen in *Videodrome*, or indeed manipulating data through screens, as journalist Mikael Blomkvist does when examining the digitized celluloid photographs taken on the day of a four-decade-old crime in *The Girl with the Dragon Tattoo* (Niels Arden Oplev, 2009). Thus, if the *decor screens* highlight the extracommunicational domain, *diegetic screens* will direct our attention to the "intracommunicational domain" or "the formation of cognitive import in ongoing communication" (Elleström 2020: 27).

Those scenes where the characters watch, examine, analyse and dissect—usually digitally—stored and displayed audiovisual moving images are detailed examples of representing and conceptualizing the perceivers' minds and those brief moments of perception as followed by lengthy processes of interpretation. The moments with the embedded electronic screens not only draw attention to the various and active modality modes of the involved qualified media types but also dramatize and thus prolong "the act of perception", which "is brief and quickly channelled into interpretation" otherwise (Elleström 2020: 18). Such *diegetic screens* result in the creation of communicative situations where cognitive import might emerge, with the representation conceived of as always already dependent on the material modality of the video, the television or the computer. However, as "[t]he mediated sensory configurations of a media product do not transfer any cognitive import until the perceiver's mind comprehends them as signs", and therefore "the sensations are meaningless until they are understood to represent something through unconscious or conscious interpretation" (Elleström 2020: 50), scenes with *diegetic screens* present us with the messy process of creating cognitive import while faced with electronic screens. Characters using or watching television or mobile screens, video monitors or laptops may be positioned as providing detailed analyses of the "border zones" between the material modality ("the latent corporeal interface of the medium"), the sensorial modality ("the physical and mental acts of perceiving the present interface of the medium through the sense faculties") (Elleström 2010: 17) and, evidently, the semiotic modality (which is necessary to create cognitive import in communication).¹²

Genette delimits "direct causality between the events of the metadiegesis and those of the diegesis, conferring on the second narrative an *explanatory* function" (1983: 232, emphasis in the original). This is evidently the case when *diegetic screens* are employed to show content that founds, explains or perhaps precedes the diegetic events, thus performing a temporal re-ordering as well, on the level of the plotline. One of the most striking examples is provided by Alex Garland's *Annihilation* (2018), the story of a five-member female expedition sent to an alien-dominated zone, the so-called Shimmer. The main protagonist, Lena, only accepts participation in the dangerous trip to somehow help her soldier husband, who returned from a similar previous mission deeply hurt and deranged. When quite advanced in the territory and also in their process of understanding how the Shimmer decomposes DNA, the group finds a memory card, which they watch on the minuscule screen of their portable digital video camera. The activity repeats itself when Lena enters the dangerous Lighthouse, where a similar video camera on a tripod faces a sitting corpse covered in ash. Both recordings contain sequences from the previous expedition's experiences and therefore their accumulated knowledge; thus, the electronic *diegetic screens* inlaid in this filmic diegesis reveal the past of, and therefore explain, the diegetic world itself. The small video screen often morphs into covering the whole cinematic surface (screen) in a creative effort to convey to the actual audience the extraordinary destructive effects of the Shimmer, but also as a method to represent the emotional involvement, sadness and painful reminiscences that Lena, as a focalizer character, goes through.
Lena is able to see her moribund husband, possibly genetically transformed by the alien forces in the diegetic present, as a fully human, yet already seriously damaged person in the diegetic past as framed by this small electronic *diegetic screen*, which therefore directly re-connects to the cinematic diegetic level too.

#### *Metadiegetic Screens*

The third type of represented screens, *metadiegetic screens*, are there only for the viewer in the afilmic, actual sphere (the extracommunicational domain) to watch, who is quite different from the intracommunicational domain viewer existing within the diegetic reality of the given film, such as Lena in the previously quoted *Annihilation*. These screens might serve the narration and be visible, evident or meaningful only for the actual viewer of a given film: no character in the film diegesis possibly or fully senses what I name *metadiegetic screens*, and therefore no diegetic character is capable of creating semiotically meaningful cognitive import based on them. An interesting example of such a screen may be recalled from Olivier Assayas' 2014 *Clouds of Sils Maria*. The last part of the film presents the theatrical performance of the play entitled *Maloja Snake*: the story of a powerful company executive (Helena as played by an older actress in the diegetic world) and her painful lesbian love story with her ruthless young assistant (Sigrid as played by the rising star, Jo-Ann). Sigrid enters the cubes signifying the company offices, takes files from the desks of the office workers, and at the end of the theatrical scene, but also that of the filmic sequence, she exits the geometrical, sterile office space towards the audience, stopping at the extreme edge of the stage. The camera focuses on Jo-Ann-as-Sigrid's angry, disillusioned, tired and sad face: this female face is filmed in real time and projected on the huge canvas of the stage in magnified proportions, with a bluish lighting effect superimposed on it. The view created is that of a beautiful female head squeezed through the grid of pixels and geometrical lines that define such a body in a digital environment of 1s and 0s.
The analogue narrative filmic image of an actress performing a role in the sketchy environment of a theatre play is transmediated into the digital filmic image of the same theatre actress in front of our very eyes, creating a hybrid representation that is neither analogue filmic image, nor filmed theatre scene, nor digital filmic image, but all of these at the same time.

Such (intra)diegetic shots transforming into (meta)diegetic, long-duration, fixed shots, which often are close-ups, exemplify what Roger Odin calls "inclusion", for example, those moments when "the mental cinema screen encompasses and somehow erases the physical space" (Odin 2016: 179). These long-duration shots, ambiguous as to their diegetic status since no focalizer character's optical point of view matches them, turn into moments of true spectacle offered to the afilmic, extracommunicational domain film viewers in a digital era, staging the process of immobilizing animate images, of which Gaudreault and Marion write that "within the flow of digital visual media and through the widespread animation of these media, the 'moveable' image has become almost the norm and the still image the exception" (2015: 77). The urge towards an aesthetic attitude that framing entails is also definitely present in such moments: as Roger Odin argues, "the desire to see something 'framed' reflects a will to transform the world into an aesthetic" (Odin 2016: 183).

This scene with the young actress Jo-Ann's face projected on the huge theatre canvas—doubling as an embedded screen—while shown as a super close-up for the cinematic viewer, also exemplifies the third type of relationship between the diegetic level (in this case, the cinematic) and the metadiegetic level (in this case, conveyed through the [electronic] screenic embedded within the diegetic world). This is described in *Narrative Discourse: An Essay in Method* as "involv[ing] no explicit relationship between the two story levels: it is the act of narrating itself that fulfils a function in the diegesis, independently of the metadiegetic content—a function of distraction, for example, and/or of obstruction" (Genette 1983: 233). Thus, the narratological roots of an apparently intermedial analysis have become evident: besides their evident function as (afilmic) indices of our post-digital era, capable of conveying what Elleström calls "extracommunicational truthfulness", the embedded television or video screens also create "intracommunicational coherence" (2018) through complex metaleptic narrative structures that constitute the fictional spatiotemporal continuity of the film. Thus, a dual functionality may be attributed to them: as a- and profilmic indices and also as framing devices that re-order narrative levels. In this respect, this line of analysis may be added as a further argument, achieved through semiotic and narratological methods, to Tseng's statement that "[i]t is the contextualization of these digital frames in the broader narrative structures, which achieve specific narrative functions" (Tseng 2020: 181).

However, such a clear-cut differentiation of *decor, diegetic* and *metadiegetic screens* is a conceptual possibility rather than an always functional method of practical analysis. While it offers a semiotic and narratological basis for understanding the multitude of embedded electronic screens, it is also an adequate tool for describing fuzzier examples. Spike Jonze's 2013 *Her*, for example, introduces us to a futuristic world where humans occupy the cinematic diegetic space, and the digital Artificial Intelligence inhabits the diegetic computer screens. This is how the romance of ghostwriter Theodore Twombly, surrounded by muted sounds and warm colours, and operating system Samantha, a sensual voice and computer screen operations, unfolds in a fully metaleptic manner, jumping from cinematic to computer screen(ic), from diegetic to metadiegetic level and back. As Liviu Lutas formulates it, "metalepsis should be the violation of the frontier between different levels of representation" (2020: 155). To Theodore's understated question referring to her functioning, Samantha confesses that "basically I have intuition. The DNA of who I am is based on the millions of personalities of all the programmers who wrote me". Significantly, when Samantha utters this sentence, we leave the spatial parameters of an interior with a human figure seated in front of a computer desk, and we get a view from behind a glassy, transparent surface: a possible space-divider in Theodore's apartment, or perhaps we move outside his apartment's windows. The effect is that contours lose their sharpness, light effects and colour patches become more pronounced, and the cinematic filmic image and screen transform into a screenic surface with abstract forms and patterns.
So, parallel to the digital objects, graphics and consciousnesses pertaining to the embedded electronic *metadiegetic screens* asking for and getting their place in the cinematic realm, the diegetic filmic image starts to acquire pixelated qualities. Thus, an interesting composite moment of transmediality is offered between analogue (scanning) representation and digital (sampling) representation. It may be suggested to be a variation on the process that Joachim Paech describes as "the repetition or retake of characteristic cinematographic forms in digitally produced films" (2011: 18), as here we witness a further layer of digital characteristics superimposed on it.

### 4.3 The Qualifying Aspects of Electronic Screens

As already suggested, the complexity of the media products involved in the presently examined "transfers of cognitive import" needs to be accounted for. As such a reference to the concept of "qualified media [types]" realized through "technical media of display" (Elleström 2020: 33–37) is an important aspect of this complexity, it shall be dealt with in this section. The "qualifying aspects" of the media types—previously described as based on the four media modalities—refer to "all kinds of aspects about how we produce, situate, use and evaluate media products in the world" (Elleström 2020: 55). There are two qualifying aspects: the so-called contextual and the operational aspects.

The "contextual qualifying aspect" involves "the origin and delimitation of media in specific historical, cultural and social circumstances" (Elleström 2020: 60), and it is in reference to film that Elleström notes that "[t]he combination of these features is no doubt a historically determined social construction of what we call the medium of film, but given these qualifications of the medium, it has a certain essence" (2020: 59). The fundamental aspects of film—described as "a combination of visual, predominantly iconic signs (images) displayed on a flat surface and sound in the form of icons (as music), indices (sounds that are contiguously related to visual events in the film) and symbols (as speech), all expected to develop in a temporal dimension" (Elleström 2020: 59)—come into question when technological change is as quick and self-evident as in our analogue-to-digital era. Thus, one needs to acknowledge that "some basic modal groupings are commonly distinguishable at a certain time and in a certain culture, and that the future may hold new habits and technical solutions that make novel basic media types relevant" (Elleström 2020: 56). Friedrich Kittler, among so many others, was right in drawing attention to the diminishing chances of separating film, video or television with the advent of the digital (era) when he observed that "[i]f the historical synchronicity of film, phonograph, and typewriter in the early twentieth century separated the data flows of optics, acoustics and writing and rendered them autonomous, current electronic technologies are bringing them back together" (Johnston 1997: 5–6). Meanwhile, our present is still characterized by the culturally (and perhaps also cognitively) founded differences, or the contextual qualifying aspects, among the mentioned technical and electronic media.

These differences might also be sustained by such constructions in film diegetic worlds where these various media, indexed by the corresponding screens, are present as apparently afilmic, but actually profilmic objects with serious functions in the narrative development. In the framework provided by the concept of the "contextual qualifying aspects", the embedded electronic screens may definitely be described as contributing to fixing the specificity of the media involved, especially in such cases when these screens convey moments of glitch and noise, de-neutralizing television or video as in David Cronenberg's *Videodrome* or David Lynch's *Lost Highway*. With the aim of supporting the general hypothesis of medium specificity being sustained, a number of close readings of the noisy non-neutralization of a medium through diegetic electronic screens follow.

*Videodrome* sets up the rules of its diegetic electronic screen use, aiming at making the medium visible and filling it with noises of all kinds, already in the introductory credit sequence. First, animated letters fill the cinematic screen, their candy colours and rudimentary design evidently disturbing the cinematic immersion, and a briefly visible screenic glitch of a black-and-white nonfigurative formation informs the actual viewer that the sensible surface of this screen does not bear messages in the usual, norm-abiding way. In this context, the decrease in image quality may be identified as a Krämerian (media) noise that makes the medium—in this case, video and television image and apparatus—apparent and perceivable to the actual cinematic viewer, as "only noise, dysfunction and disturbance make the medium itself noticeable" (Krämer 2015: 31). The opening sequence from *Videodrome* ends with a cinematic close-up on producer Max Renn's hand and face, while the television screen in the background recedes, its texture and sensible surface losing features, becoming a simple patch of colour in the diegetic space, illustrating Krämer's formulation that "[a]t the same time that media bring something forth, they themselves recede into the background; media enable something to be visualized, while simultaneously remaining invisible" (Krämer 2015: 31).

In David Lynch's 1997 *Lost Highway*, the ominous video cassette left on the dream pair's villa staircase definitely presents a differently scaled virtual world (Manovich 2001: 112), hypnotically capturing the attention of its diegetic viewers and of the actual viewers too (Chateau 2016: 197). The content of the cassette, and therefore that of the television screen, is full of visual glitches and auditive noises (Fig. 4.4) that often cover the whole cinematic screen. As if an effect of the noiseful video and televisual medium, in *Lost Highway* most prominently the whole cinematic screen becomes blurred and is covered with nonfigurative patches of light, recalling Florian Cramer's observation that "the characteristics of any medium only reveal themselves in its misbehavior at the low end" (2013).

**Fig. 4.4** When noise specifies a diegetic electronic screen: *Lost Highway* (dir. David Lynch, 1997). All rights reserved

Noise is introduced to (re)present the cinematic medium, "unaisthecizing it" to use Krämer's thesis: "*The implementation of media depends on their withdrawal*. I will call this 'aisthetic self-neutralization'. […] The invisibility of the medium—its aesthetic neutralization—is an attribute of media *performance*" (Krämer 2015: 31, emphasis in the original). These instances where diegetic electronic screens are scattered within the filmic diegetic spaces examined are non-neutralizing the media involved, making them 'visible' according to the Krämerian model, also demonstrating their non-noise-free use primarily for the actual viewer and occasionally for the diegetic spectator too.13

Meanwhile, "[t]he second of the two qualifying aspects is the general purpose, use and function of media, which may be termed the *operational qualifying aspect*. This aspect encompasses construing media types on the ground of claimed or expected communicative tasks" (Elleström 2020: 61, emphasis in the original). In their co-authored volume, *The End of Cinema?*, André Gaudreault and Philippe Marion set up a system based on twentieth-century media history, taking as a principle the substitution of the cinema silk screen by the electronic cathodic television screen, and then by the electronic portable small computer screen.14 They argue that "[w]e might even view the emergence of the small (but highly cathodic) screen as the point of rupture between a 'hegemonic cinema' and this 'cinema in the process of being demoted and shared,' which is often called 'expanded cinema' but which we believe would be more appropriately described as 'fragmented cinema'" (Gaudreault and Marion 2015: 11, citing Guillaume Soulez's conference intervention). Thus, "hegemonic cinema" would denote the first part of the twentieth century when the cinema theatre silk screen was the sole framed surface which displayed electronically mediated and also always pre-recorded moving images. "Expanded cinema" should denote developments of the second part of the twentieth century, when television and then video-camera screens appeared as electronic surfaces where cinematic worlds and narratives would expand, obviously altering the nature and the significance of framed storytelling based on moving images. Finally, the twenty-first century brought us into the era of what Gaudreault and Marion (2015) name "fragmented cinema", with the same cinematically constructed narrative worlds scattering further on "the electronic portable small computer screen", becoming compatible with such surfaces.

Pertaining to how the "operational qualifying aspects" of screen-based technical media evolve in the post/digital era, such constructions as (smaller) electronic screens (with visible frames) inserted in film diegetic worlds may be attributed the role of training the film viewers for experiences of expanded and fragmented cinema. They force the audience to constantly shift between the actual cinematic screen conventions and the mental screen (Odin 2016) of smaller formats. This may be exemplified with further and more recent examples: news presenter Anna living through her diegetic marital (melo)drama on the television screen while she presents the news in Thomas Vinterberg's *Commune* (2016); the emotional happenings banished on mobile or television screens as opposed to the rigidity and frozenness of the diegetic cinematic world in Andrei Zvagintsev's 2016 *Loveless* (Fig. 4.5); or indeed the most important viral video of the diegetic world as encaged on the museum curator Christian's mobile phone screen in Ruben Östlund's *The Square* (2017).

**Fig. 4.5** Caught between *decor screens* and *diegetic screens*: *Loveless* (dir. Andrei Zvagintsev, 2016). All rights reserved

# 4.4 The Intermedial Processes at Work in the Examined Filmic Sequences

As signalled in the introduction, the identifying of intermedial processes at work in such filmic construction involving electronic screens must close and also generate the analysis that has just been performed. What are the media that are interrelated in film scenes where characters appear on a black-and-white television screen as news presenter Anna in *Commune* or where they watch low-resolution videos on their mobile phone screens as the museum curator Christian in *The Square*? Can we meaningfully assert that an entangled intermedial happening is at stake when Anna's flat, desaturated, electronic TV image morphs into a cinematic close-up of fine-grained texture, with brighter colour qualities (Fig. 4.6)? Is it possible to argue that the disappearance of the frame belonging to the smaller electronic screens in such instances induces intermedial tensions between film/cinema and other electronic artistic screens of television, video or mobile phone—such as in the scene where the guerrilla marketing video made to promote the contemporary art museum leaves behind Christian's mobile screen to widen and cover the whole cinematic screen area?

The previously presented characterization based on the four media modalities—and the two qualifying aspects of the cinematic, the television, the video, the computer and the mobile (phone) screens—should guide us in this respect. Answering this string of questions requires one to establish the media borders that are crossed whenever (non-cinematic) electronic screens are inserted into film diegetic worlds. This can be achieved via the two operations proposed by the media modalities model: first, "'finding' or identifying media borders between dissimilar basic media types" and second "'inventing' or construing media borders between dissimilar qualified media types based on similar basic media types" (Elleström 2020: 72). The first operation would leave us with "intermedial relations in a narrow sense", while the second with "intermedial relations in a broad sense" (Elleström 2020: 71–73).

Ours is evidently a case of 'broad intermediality', when borders between "dissimilar qualified media types based on similar basic media types" are crossed, with the medium of film being the paradigm-case for television, video and computer screen-based audiovisual media. However, this border is not simply the one apparently existing between analogue (cinematic/celluloid, analogue television and electronic video) and post/digital (computer and mobile media) qualified media types. As the examples of *Videodrome* and *Blade Runner* show, the modality changes that de-solidify or enliven the diegetic electronic screens, adding depth to their otherwise flat spatiotemporal modality, or sound and tactility to their sensorial modality, cannot be fully identified with the analogue/digital/post-digital divide, even if the post-analogue and post-digital eras present us with more numerous screens that share these characteristics. Therefore, what become pertinent are the changes which seem to exist in the sphere of the so-called presemiotic modalities: the material, the spatiotemporal and the sensorial modalities of these predominantly electronic screens that play a role in the film diegetic worlds. One of the chief results of the analyses performed is thus to have demonstrated the mutual chain reactions between modalities: a change in one (presemiotic) modality of the examined interlaid screens triggers changes in the other two as well. The interconnectedness of solid materiality/non-organic materiality/two-dimensional spatiality/audiovisual sensorial modality, and that of non-solid materiality/organic materiality/three-dimensional spatiality/audiovisual-tactile (synaesthetic) sensorial modality, with respect to electronic screens embedded in film diegetic screens should have become evident.

**Fig. 4.6** A meta/diegetic embedded electronic screen in *Commune* (dir. Thomas Vinterberg, 2016). All rights reserved

The above summarized and interrelated modality changes (with the mobility of screens a subcase in this respect) may be in turn understood as routinely employed to argue for the 'fluctuating qualifying aspects' that separate the television/video era from the digital one. This is a further argument for the case of 'broad intermediality' at work whenever diegetic and non-cinematic electronic screens appear in film diegeses, as the constructedness of these media borders is relatively easy to reveal. Or, as Kate Newell states in her reading of *The Handmaid's Tale* adaptations in various media in the present publication: "such borders, while useful theoretically, are always constructed and perceptual. That is, no material 'border' exists between, say, the animated and live-action segments of a particular film, yet audiences perceive aesthetic differences, and articulate that difference in terms of juncture and border crossing" (Newell 2020: 35).

The cases presented definitely draw our attention to how the borders between the qualified media are displaced, since even if they may "have a certain degree of stability, their defining features are formed by fluctuating conventions" (Elleström 2020: 57). In the framework of the media modalities model, these questions pertain to the sphere of mediation, or "the material realisation of the media product, made possible by a technical medium of display", as opposed to representation, or "the semiotic conception of the medium" (Elleström 2020: 40). Thus, the analyses pursued in this chapter offer proof of what I consider an important axiom of "The Modalities of Media II", namely that "[a]lthough mediation and representation are clearly entangled in complex ways, it is vital to uphold a theoretical distinction between them" (Elleström 2020: 40). Through the examination of the electronic screens dispersed in film diegetic worlds, a distinction between mediation and representation may be shown to exist, as well as fixed through/in the analyses. Furthermore, analysing the changes in the modalities of the examined screens also allows us a more precise or even more fine-grained examination of how "the transfer of cognitive import among media is restricted by the modal capacities of the technical media of display" or of cases when "technical media allow of modal expansion" (Elleström 2020: 79). Thus, we can have a better grasp of what happens when we see the same thing on a filmic image, as a happening or a view in a film diegetic world, and with the embedded electronic screen's more pixelated, more blurry image, in a mise-en-abyme-type structure.

Both the cinematic screen and the diverse electronic screens dispersed within film diegetic worlds may be situated at the intersection of the categories presented previously: "[b]asic and qualified media [that] are categories of media products" and the "technical media of display" which are "physical entities needed to realise media products and hence media types" (Elleström 2020: 9). To some extent, an analogy to Friedrich Kittler's system of media functioning may be shown to exist. Kittler emphasizes that storage and information manipulation are interweaving with transmission in the case of media as "[t]here are, first of all, media of transmission such as mirrors; secondly, storage media, such as film; and thirdly […] machines that manipulate words or figures themselves" (Kittler 1997: 132–133). Within this context, screens may be described as framed spectacles related to electronic and technical media: film, video, television and computer or mobile (phone). These media not only produce or store but also distribute content, in accordance with Lars Elleström's definition of a technical medium, which "should consistently be understood not as a technical medium of production or storage but of 'distribution' in the precise sense of disseminating sensory configurations" (Elleström 2014: 14). This definition allows one to fix the screens in the moment of "distributing/disseminating sensory configurations" according to the various media(l) apparatuses they are the endpoint of. It is this aspect of the electronic screens embedded in film diegetic worlds that quite blatantly shows their transitory or hybrid position between "technical media of display" as "[d]evices used for the realisation of media products" and media types with semiotic qualities (Elleström 2020: 34). 
This hybrid nature is also a manifestation of the fact that although "[c]inema, written narrative literature, and sculpture are examples of qualified media types […] it is important to stress that not all qualified media are aesthetic" (draft of Bruhn 2020).

In order to position the examined phenomenon—the functioning of the electronic screens in film diegetic worlds—as one worthy of "careful analysis and interpretation" and also to argue for its presenting a form of "media interrelations" (Elleström 2020: 86), I have crossed a number of checkpoints. I characterized the media products and media types that film and the embedded electronic screens cover according to the corresponding framework of the media modalities model; I presented the filmic examples and established a taxonomy of embedded electronic screens based on the previous descriptions; and finally, I showed that the media borders that are crossed need to be construed (Elleström 2020: 66–68). However, this does not mean that the crossed media borders are arbitrary; moreover, a finely tuned system of interrelations on the level of the presemiotic modalities of the embedded electronic screens has been revealed, and this may be suggested as feeding the currently upheld differences between the various qualified media types—cinema, television, video and computer—involved.

#### Notes


functions are activated and representation is at work. For instance, the heard sound may be interpreted as a voice uttering meaningful words" (Elleström 2020: 39, emphasis in the original).


#### References


*Arrival*, directed by Denis Villeneuve. 2016. Universal Pictures UK, 2017. Blu-ray.

Barthes, Roland. 1968. L'effet de réel. *Communications* 11: 84–89.

*Blade Runner*, directed by Ridley Scott. 1984. Warner Home Video, 1997. DVD.

Bruhn, Jørgen. 2010. Heteromediality. In *Media Borders, Multimodality and Intermediality*, ed. Lars Elleström, 225–236. Basingstoke: Palgrave Macmillan.

———. 2020. Towards an Intermedial Ecocriticism. In *Beyond Media Borders: Intermedial Relations among Multimodal Media, Volume 2*, ed. Lars Elleström, 117–148. Basingstoke: Palgrave Macmillan.


*Theoretical Reassessment*, ed. Dominique Chateau and José Moure, 186–199. Amsterdam: Amsterdam University Press.

*Clouds in Sils Maria*, directed by Olivier Assayas. 2014. Artificial Eye, 2015. DVD.

*Commune*, directed by Thomas Vinterberg. 2016. Artificial Eye, 2016. DVD.


———. 2020. The Modalities of Media II: An Expanded Model for Understanding Intermedial Relations. In *Beyond Media Borders: Intermedial Relations among Multimodal Media, Volume 1*, ed. Lars Elleström, 3–91. Basingstoke: Palgrave Macmillan.


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# Truthfulness and Affect via Digital Mediation in Audiovisual Storytelling

# *Chiao-I Tseng*

# 5.1 Introduction

This chapter investigates the intermedial relation of different media frames blended in narrative film, and the ways in which these intermedial blends, particularly the digital mediated images in films, affect two narrative functions: the viewer's interpretation of message truthfulness and the viewer's affective engagement. The term 'digital mediated images', as it is applied in this chapter, should be read as a broader conception than that of just the new digital media used diegetically by fictional characters in the film (cf. Tseng 2018b): in this chapter, it describes various forms of added realism, among them news footage, intra-diegetic camera, and computer screen.

C.-I. Tseng (\*)

University of Bremen, Bremen, Germany e-mail: tseng@uni-bremen.de

<sup>©</sup> The Author(s) 2021

L. Elleström (ed.), *Beyond Media Borders, Volume 1*, https://doi.org/10.1007/978-3-030-49679-1\_5

We live in a culture that is increasingly reshaped by transformations in audiovisual media. Within the last 20 years, our film experience and its presence in everyday life have changed rapidly: the film experience now encompasses a larger, ever-shifting human interaction with technologies of perception and expression. Against the backdrop of our increasingly technologically mediated society, this chapter addresses why digital mediation in film is a valuable resource for exploring the persuasive and rhetorical capacities of audiovisual storytelling itself and how it addresses questions of perception, affect, and truthfulness.

For decades, new media features have been frequently blended in the making of narrative films. In the 1960s, McLuhan (1964) already proposed that new inventions and technologies produce variations in sensory input that customarily require adjustments from our sense organs. More recently, Brown has argued that "digital technology has expanded cinema and the psychological sciences have expanded our understanding of perception to such a degree that new theories of cinema and our perception(s) of it are urgently required" (Brown 2013: 8). In summary, it is generally believed that the emergence of new media forms leads to constant modifications of the way we experience audiovisual storytelling. Drawing on these technology-centered proposals, one focus of this chapter is precisely to interrogate *to what extent* evolving technologies blended in narrative film modify the viewers' narrative interpretation process and viewing experiences.

Academic discussions of digital mediation in film have often focused on particular genres, for instance, found footage in horror films (Heller-Nicholas 2014; Sayad 2016), or computer screen and diegetic camera in thriller or war films (Stewart 2009; Pisters 2010). This chapter poses the question from a different perspective and instead broadly asks how the aesthetic choices of digitally mediated images generate two seemingly paradoxical narrative impacts and interpretative outcomes: subjective affective intensity and objective narrative truthfulness. In principle, mediated images in film often function to enhance the viewer's interpretation of truthful storytelling. For example, historical docudramas or biopics often use news reports or historical archive footage to endorse the authenticity of the information source. Thrillers or horror films use the strategy of found footage, namely, images shown via the character's handheld camera or computer screen, to add narrative realism by simulating the visual aesthetics we are familiar with in our day-to-day life. In many other cases, mediated images in film are dominantly subjective and evoke a viewer's affective attachment rather than detaching the viewer to reflect on objective fact. Section 5.2 will first review the perennial paradox of two seemingly mutually exclusive narrative effects: emotional engagement and message truthfulness. Section 5.3 will then tackle the paradox by proposing a multi-leveled, semiotic framework synthesizing cognitive research findings and semiotic conceptualization (cf. Tseng 2018a; Tseng et al. 2018), distinguishing two analytical levels: the presemiotic level of media properties and the semiotic level of narrative semantic structure (cf. Elleström 2020). This chapter will then employ the framework to analyze the various forms of digital-mediated images in film. The analyses will shed light on how the affective and interpretative functions are closely intertwined rather than paradoxical.

# 5.2 Perennial Paradox: Achieving Affective and Truthful Impacts

For decades, affective engagement and cognitive-logical interpretation in audiovisual narratives have been regarded as incompatible processes. Several film scholars, for instance Deleuze (1986), have argued that the transformative and immersive power of cinema specifically arises from an ability to produce non-cognitive affective immersion that can persist beyond the conscious consideration of narrative interpretation. While cognitivism sees cinema as naturally conducive to human systems of meaning-making, the affect-centered perspective argues that film narrative is able to destabilize the schemata through which human thought imposes a logical order on the perception of events and actions, and that this non-cognitive, bodily experience is the basis for generating affective response in film (Deleuze 1989: 169).

The paradox of emotional engagement and cognitive-based interpretation of message truthfulness has also been raised in the broader realm of communication research. Recent empirical evidence has shown that affective engagement in the character-based narratives can powerfully achieve narrative persuasive impact, and the narrative focusing on cognitive reasoning of truthful events decreases the strength of such impact (Green et al. 2002). One explanation for this is that the fundamental constituents of human memory and social communication rest on a person's experiences and the experiences of others, rather than the cognitive activities of knowledge reasoning (Schank and Abelson 1995). In other words, research suggests that it is the default mode of human thought that a narrative way of thinking drawing on someone's personal, emotion-laden experience endows individual cases and anecdotes with significant weight toward evaluating evidence (Bruner 1991).

Moreover, the paradoxical relation between affective engagement and logical reasoning about truthful events and actions is also at the center of debates on educational impact. Some scholars believe that emotional engagement in fictional stories might distract from the learning of socially and politically important issues, because when a narrative is engaged with in an affective way, the communication fails to reach the goal of engaging scientific, logical reasoning (Bogost 2007, 2017). Scholars also indicate that, since affect-centered storytelling can be highly persuasive regardless of the validity of the underlying truthful claims, the use of narrative messages becomes an oversimplification at best or manipulative at worst (Dahlstrom and Scheufele 2018). Even an otherwise desired outcome (such as correct knowledge about climate change) could be viewed negatively if the desired reasoning process of message truthfulness was not engaged. Hence, in terms of educational purposes, the paradox comes into focus: affective storytelling can engage people and make educational information relevant to personal experience, while simultaneously encouraging a narrative way of thinking that places scientific stories on a similar level to any other plausible story that may or may not support message truthfulness.

In sum, it is believed that the processes of affective engagement and the representation of message truthfulness are mutually exclusive. One decreases the impact of the other: the more affective engagement is triggered, the more uncertain and the less truthful the narrative becomes. Conversely, the more authentic the message one intends to communicate, the more affect-neutral the narrative one needs to construct. The next section tackles the issue by showing the interconnectedness, rather than the mutual exclusiveness, of the two processes. This tricky relationship can be best unraveled by analyzing how the functions of affect and truthfulness coexist in films that use the techniques of mediated images.

# 5.3 Tackling the Paradox via Semiotic Approach to Narrative Functions

As various digital, dynamic forms of mediated images emerge in cinema, film researchers have argued that inserting multiple frames stirs up the viewer's awareness of non-diegetic manipulation of filmmaking and hence destabilizes the coherent semantic contents of the narratives (cf. Stewart 2009; Ecke 2010; Poetzsch 2012). Nevertheless, our previous work has insisted that the multiple frames in the film actually do not disrupt the narrative construction at the semantic level (Tseng 2016, 2018b). In other words, despite the ever-present framing devices of digital mediated images, such as timecodes, hyperlinks, and shaky images, the linearity and sufficient cohesion mechanisms of the narrative semantics still construct a straightforward meaning comprehension path for the viewers. Our analysis untangled the confusion between digital mediation and narrative incoherence or disruption, on the basis of a semiotic framework distinguishing two fundamental analytical levels: media materialities and semantics of narratives.

To tackle the paradox of affective engagement and meaning interpretation in film, this chapter expands our previously proposed semiotic framework and puts forward a multi-leveled approach to narrative functions that integrates semiotic and cognitive findings. It addresses how the media affordances of digital mediation can actually synchronize different narrative functions at the narrative semantic levels.

In particular, we discuss these issues by exemplifying and analyzing the *beginnings* of several recent films substantially using digital frames, such as *Redacted* (2007), *Cloverfield* (2008), *Searching* (2018), and *Profile* (2018). We focus on the beginnings of the films because beginnings in all films function specifically to establish a hypothesis and emotional expectation, to provide first impressions that later developments of the narrative will be measured against (Hartmann 2009). In psychological terms, the function of the initial portions of a film has been described according to the primacy effect and priming (Luchins and Luchins 1962) or anchoring bias (Tversky and Kahneman 1974). A distinctive structuring function has also been theorized in studies of text linguistics, for example, by Martin (1992), who develops the notion of "macro-themes" to describe a communicative function that serves the role of signposting the organization of the text following.1

In addition, we select these films for analysis because they represent the application of evolving media technology in narrative films over the past two decades. Some films were made in the 2000s, when portable digital cameras were widely used for documenting and preserving events, while others were made ten years later, when online communication and social media had become a crucial part of communication and information distribution. As the following sections will show, despite the evolving technologies, the main narrative functions of truthfulness and emotional engagement remain stable, while the different media affordances of camera and computer screen indeed bring about some distinctive ways of realizing these narrative functions.

#### *5.3.1 Multi-leveled, Semiotic Approach to Narrative Functions*

Several film scholars have proposed that a general distinction needs to be made between the process of narrative interpretation and the perception of individual media techniques. This distinction is particularly significant when examining the evolution and functions of film style. More specifically, it has been frequently argued that a high degree of narrative and discourse stability is the basis of narrative inference and genre expectation for spectators, despite the gradually dynamic deployment of audio-visual devices in recent decades. For instance, Bordwell (2006) identifies major features of spatial-temporal styles that have been astonishingly robust throughout the evolution of filmmaking. Jones (2015) compares 3D and 2D formats and shows how they function similarly in terms of narrative coherence and effect. Bordwell and Jones both argue that despite the evolving visual techniques over time—such as shorter average shot lengths, the use of wider range lenses of 3D format and computer-generated images—the composition of "space, time, and narrative relations (such as causal connections and parallels)" in mainstream films remains straightforward to identify and leads the viewer to effortless comprehension and prediction of film narrative (Bordwell 2002).

The distinction between filmic semantic meaning and media properties is shown in the diagram in Fig. 5.1. The analytical levels are developed for film, building on the theoretical notion of semiotic stratification, in particular, on the distinction between a stratum of discourse and one of form (Martin 1992). The left part of the diagram is the multi-leveled framework, generally divided into media techniques at the bottom level and narrative semantics and social-cultural domains at the higher levels. The distinction is reminiscent of the stratification that occurs in semiotic approaches, such as the distinction between expression plane and content plane by Hjelmslev (1953).

**Fig. 5.1** Strata of narrative functions in film analysis

The more important notion, as far as meaning making in film is concerned, is the interrelationship between these strata. These strata are interrelated by realization: that is, film semantic configurations are 'realized in' the contextualization of media properties, and at the highest level, social-cultural aspects such as genres or aesthetic styles are realized in the contextualization and configuration of semantic structures. On the basis of this framework, what needs to be particularly noted is that a single type of material or film device does not lead directly to any specific kind of meaning interpretation. For instance, as mentioned in the previous section, the media property of 'digital mediated frames' does not directly lead to any dynamic or demanding meaning interpretation processes. It is the contextualization of these digital frames in the broader narrative structures that achieves specific narrative functions.

A similar distinction has also been explicitly drawn by Elleström (2020), who separates the crucial categories of the presemiotic phenomenon of mediation and the semiotic phenomenon of representation. In his words:

Mediation is the display of sensory configurations by the technical medium (and hence also by the media product) that are perceived by human sense receptors in a communicative situation. It is a *presemiotic* phenomenon that should be understood as the physical realisation of entities with material, spatiotemporal and sensorial qualities—and semiotic potential. For instance, one may hear a sound. Representation is a semiotic phenomenon that should be understood as the core of signification, which I delimit to how humans create cognitive import in communication. When a perceiver's mind forms sense of the mediated sensory configurations, sign functions are activated and representation is at work. For instance, the heard sound may be interpreted as a voice uttering meaningful words. (Elleström 2020: 39)

Applying these analytical strata to digital mediation in film, the rest of this chapter addresses the following hypothesis: the two seemingly paradoxical narrative functions, message truthfulness and affective impact, can actually complement each other when the presemiotic property of digital media frames is contextualized/semiotized with narrative strategies, such as restricted narration or media channels widely used in the societies of the audience.

To address this hypothesis, the stratified semiotic framework put forward above needs to take into account recent findings of cognitive research in order to explain how the functions of truthfulness and affective engagement are achieved by blending digital media frames into films. This is elucidated in the following subsections.

#### *5.3.2 Media Frames, Human Memory, and Truthfulness*

In general, blending conventional cinema with the media frames that viewers use in their day-to-day lives increases the viewers' perception of message authenticity and enhances the persuasive and rhetorical function of narratives. This hypothesis draws on recent findings in the neurocognitive sciences (Zacks 2015: 101–107). The empirical evidence shows that a piece of information does not necessarily have a persuasive impact at the moment we perceive it, but its significance may grow in our memories over time. Moreover, our memory of the media channel that carried the information may blur over time. We have probably all experienced this: we remember the content of a certain piece of information but do not quite remember where we saw, read, or heard it. Hence, the plausibility and truthfulness of a message framed by mediated images in film may increase over time. This is precisely the empirical foundation for the claim that mediated images serve the narrative function of enhancing truthfulness in film.

Mediated images have been used in cinema for decades to increase truthfulness. For instance, Oliver Stone's film *JFK* (1991) cuts between actual footage of the alleged assassin Lee Harvey Oswald and staged images of the actor Gary Oldman, who plays Oswald. It mixes fact with fiction to propagate the idea that Kennedy was the victim of a conspiracy. Phyllida Lloyd's *The Iron Lady* (2011) uses a similar strategy, intercutting between close-ups of Meryl Streep and real news footage, which sometimes includes archival images of Margaret Thatcher, in an attempt to blur the boundary between factual frames and fictional frames. Apart from biopics, several recent war films also use online news reports, YouTube clips, Skype chats, and diegetic cameras to add to the realism and truthfulness of the representation of soldiers' experiences and trauma. Along the same lines, mediated images enhance the persuasive function, message truthfulness, and ideological impact by blending fictional narrative with the media frames that we use in our day-to-day lives (Tseng 2018b: 54–55).

#### *5.3.3 Distinguishing Embodied and Contemplative Affects*

The narrative functions of affective engagement and truthfulness can be seen as intertwined rather than paradoxical if we consider the multilayered research framework of human emotion recently put forward by several affective psychologists (cf. Asma and Gabriel 2019).

In general, human emotions are filtered through three interrelated layers of mind. At the lowest, primary level, affective intensities such as fear, lust, and thirst prod human beings toward the exploitation of resources. At the middle level, the human brain creates a close link between these primary affective systems and the experiential learning and conditioning that we undergo in our daily lives. At this secondary level, fear, for instance, becomes more specific through day-to-day encounters with people and objects. For example, we tend to be afraid of the dark, we fear heights, and we feel uncomfortable in a restricted, cramped space or when hearing grating sounds. At the highest level, emotion is enmeshed with higher-level conceptual and narrative thinking. At this level, we arrive at emotions tied to social-cultural contexts, such as ruminations and elaborate, contemplative feelings. The empirical evidence also shows that, although higher-level emotions are still rooted in the lower level of primary human affective instinct, the higher level plays a crucial role in the cognitive executive functions of the mind, slowing and policing our more automatic primary responses.

We can generally map these basic layers onto the affective functions of audiovisual storytelling reviewed above. The affective types that are bodily grounded and regarded as non-cognitive can be categorized as emotions at the two lower layers; examples are the embodied affective responses triggered in film by restricted space, distorted images, and creaking sounds. Digital media frames can trigger these embodied affective responses when they are contextualized in particular narrative strategies. This is exemplified in the war film *Lebanon* (2009), in which digital media frames are contextualized with the strategy of confined space. It is a compact war film focusing on a group of Israeli soldiers operating a tank in hostile territory during the 1982 conflict in Lebanon. Example screenshots of the film are displayed in Fig. 5.2. Many scenes use the limited, rounded-off perspective of the frames (implying the soldiers' perspective from the tank toward the battlefield outside). In other words, *Lebanon* blends the frames of the soldier's telescope with restricted space to evoke a cramped, suffocating, embodied affective response.

Finally, the higher-level emotions, namely the more elaborated, empathy-related emotions, are triggered by cognitive reasoning within particular social-cultural contexts. Mapping this onto affective functions in film, this emotional level is the 'product' of a cognitive, semiotic process, generated by the interpretation of film narrative events. This is precisely the level where the function of message truthfulness complements emotional engagement: the viewers' contemplative emotion is supported by the metaphorical link between film events and the viewers' truthful, factual life experiences and is in turn the crucial factor for perspective taking, that is, for reflecting on the significance of the social-cultural and educational issues dealt with in the film.

**Fig. 5.2** Selected screenshots of *Lebanon* (dir. Samuel Maoz, 2009). All rights reserved

Several recent films composed substantially of computer screen communications make use of affective engagement based on this higher level. The film *Profile* (2018), shown in Fig. 5.3, is one such example. It follows a British journalist who dives into the online propaganda machine of the so-called Islamic State. The entire film is composed of online communication between the journalist and an ISIS fighter. At the beginning of the film, the journalist Amy Whittaker creates a new Facebook profile under the alias of Melody Nelson in order to investigate the recruitment of young European women by ISIS. She creates an online persona of a woman who has recently converted to Islam. Soon Amy is contacted by Bilel, an ISIS fighter from Syria. They begin to talk to each other via Skype, before she dangerously develops real romantic feelings for him. Throughout the film, the use of computer screens and online chats establishes a metaphorical link to a truthful form of communication that is widely used in our digital age. This link then enhances empathetic affective engagement by drawing on the audience's familiarity with Skype and other social media, which in turn supports the contemplative emotion triggered by the issues and stories depicted in the film.

#### *5.3.4 Forms of Digital Mediation in Film and Affective Engagement*

It is often argued that blending mediated frames such as diegetic camera or computer screen footage into film enhances the viewer's emotional attachment to the character due to the dominant use of point-of-view shots. Although point-of-view shots may trigger an embodied affective response, as analyzed in the *Lebanon* example, empirical investigation has suggested that the film technique of the point-of-view shot does not in itself automatically trigger any higher-level empathetic emotion (Andringa et al. 2001). In other words, the use of the point-of-view shot does not directly function to evoke particular emotional engagement in film (Smith 1994: 39). This chapter argues that this aspect again needs to be unpacked using the distinction between presemiotic media properties and semiotic structures. The affordances of point-of-view shots for generating empathetic, affective engagement come to the fore only when these media properties are contextualized in particular semantic structures of narrative, for instance, when *restricted narration* (Bordwell and Thompson 2010: Chapter 3) is constructed.

**Fig. 5.3** Selected screenshots from *Profile* (dir. Timur Bekmambetov, 2018). All rights reserved

Restricted narration refers to confining the viewer's narrative knowledge to a character's first-person documentation and observation. It is a narrative strategy that was used in filmmaking for decades before the emergence of digital mediation. Ever since the emergence of the portable camera, films such as *Georg* (1964), *The War Game* (1965), and *84 Charlie MoPic* (1989) have made use of mediated images to present a compilation of first-person observations. The use of footage from a mediated first-person perspective confines the audience's knowledge to what a specific character knows. One major affective function of such restricted narration is that it powerfully builds suspense and uncertainty (this is why handheld camera footage or computer screen footage is widely used in horror films) and forces the viewer to empathize with the character's experiences in the story world. In other words, restricted narration through first-person observation is an effective channel for positioning the audience to empathize with the character's development as contextualized within the narrative structures.

The horror film *Cloverfield*, shown in Fig. 5.4, is a particularly interesting example of restricted narration. The entire film is composed of camera footage filmed by the characters. The film begins with a clear exposition that endorses the truthfulness of the footage source: it is framed as a government SD video card. The official-looking writing tells us we are about to see video recovered from a camera found in Central Park. When the tape starts, showing the main characters in happy times in the bedroom of an apartment overlooking Central Park, its readout date of April plays the role of an omniscient opening title. In the course of the film, the digital readouts explicitly tell the viewers when the story events take place in the earlier phase of the characters' love affair and when we are seeing the horror attack by a monster in May. In other words, although the footage looks fragmented in order to convey the truthfulness of footage recovered by the government, the fragments are cohesively bound together into a conventional horror genre structure.

**Fig. 5.4** Selected screenshots from *Cloverfield* (dir. Matt Reeves, 2008). All rights reserved

Due to the restricted narration throughout the film, the viewers know no more than the characters being attacked by the monster. But the viewers are first given glimpses of the main characters' relationships and emotional responses to their peril and are thus able to build empathy drawing on the character-centered narrative developments. The film also gives more wide-ranging information about what is happening outside the characters' immediate situation by showing newspaper reports, radio bulletins, and TV coverage of action occurring elsewhere.

Another form of digital mediation draws on simulating the interactive experiences that we encounter each day, for example, social media on computer screens, smartphone interaction, and YouTube videos. As mentioned in the analysis of *Profile*, this form of digital mediation has the potential to trigger our empathetic attachment by drawing on the metaphorical projection from the communication that we encounter within our daily social circle. Nevertheless, along the same lines, the affective intensity of digital mediation is built upon combining our familiarity with the media forms and the contextualization of these digital media properties within the narrative semantics of story contents. Often the affective impact substantially relies on the dramatically emotional story themes that the characters are dealing with. In other words, digital frames might add a layer of emotional attachment to the overall emotional impact due to our daily experiences with these media platforms; nevertheless, the overall arc of story forms and contents remains the main trigger of empathetic, affective intensity.

This proposal can be best exemplified in recent films such as *Redacted* (2007). This war film is based on a real event that took place in Samara in 2006, when a group of young American soldiers raped and murdered a 14-year-old girl and killed her family. In an interview at the 45th New York Film Festival, the film director Brian de Palma explained that he was adopting an experimental method to tell the story by using footage he found on the Internet. However, in bringing these fragmented media stories together, he had to fictionalize and restructure that existing material, and it is this process that ultimately gives the film a classical, coherent narrative structure.

The film begins with a video diary recorded by Private Angel Salazar's camcorder, which provides the main media source for the events depicted in the film. This is followed by several other mediated formats: a French documentary with voiceover narration, reports from Arab news channels, camera recordings played on Al Qaeda sites, embedded journalist reports, clips from 'Soldiers' Wives' and the 'Get Out of Iraq' campaign websites, military surveillance cameras, recordings of military hearings, and so on. Pisters argues that in this film, "all these different formats and screens are entangled in complex ways and present different points of view of the same events" (Pisters 2010: 238). The dominant strategy of the film, namely, representing points of view, is used right at the beginning, when the film opens with footage from a soldier's handheld camera.

As discussed above, these media devices are contextualized in the soldier's war narratives drawing on the discourse of restricted narration. Right at the beginning of this film, we are confined to seeing through the soldier's eyes and are led to interpret the truthful experiences of these American soldiers. The main character and his interaction with other soldiers are presented by him filming himself and his surroundings, as shown in Fig. 5.5. This is the technique of mediated first-person point-of-view within restricted narration defined by Bordwell and Thompson (2010: Chapter 3): although the character captures story events for the viewers, the footage nevertheless fits the premise of video recording to the demands of coherent, conventional narrative structures.

A decade later, several films started to use the computer screen to represent first-person point-of-view. While several film critics have celebrated this as a new way of filmmaking, the semantic strategies of restricted narration, for representing truthful message sources and emotional response drawing on characters' narrative developments, remain similar to other forms of digital media frames used decades ago.

**Fig. 5.5** Selected screenshots from *Redacted* (dir. Brian de Palma, 2007). All rights reserved

**Fig. 5.6** Selected screenshots from *Searching* (dir. Aneesh Chaganty, 2018). All rights reserved

*Searching*, shown in Fig. 5.6, for instance, deals with a father who breaks into his missing daughter's laptop to find out everything he possibly can about her sudden disappearance. The entire film is presented from the perspective of computer screens, which not only truthfully demonstrates the crime-solving potential of the Google search engine and social media information but also shows just how much of ourselves exists in online spaces. The film begins with several video clips and pictures shown directly from the main character's computer. These videos and pictures linearly depict the story background: the Kim family's life and finally their loss of the mother/wife. The film's beginning, like those of *Redacted* and *Cloverfield*, first anchors the viewer's emotional attachment to the main characters by showing their intimate life experiences and character features, before the traumatic events unfold.

As our example analyses show, films that substantially use digital mediation often start by endorsing the truthful media sources and move on to a fragmented first-person point-of-view; nevertheless, these films still construct conventional and coherent narrative structures, providing sufficient background knowledge for the viewers to interpret the linear, straightforward story events, which evoke empathetic, emotional attachment to the main characters.

#### 5.4 Final Remarks

To date, several film scholars have conflated the perception of media with narrative functions. This chapter has pointed out the problems caused by conflating media properties with the process of meaning interpretation. It has proposed distinguishing levels of mediation and semiotic representation, drawing on the findings of cognitive and semiotic research. In particular, the chapter has used the film technique of digital mediation to show the process of contextualizing mediated frames in order to realize semiotics-based narrative functions, above all the representation of message truthfulness and affective engagement, which have long been regarded as paradoxical.

Mediated images have been used in cinema for decades to enhance affective engagement and to endorse first-person truthful narration. On the basis of the multi-leveled framework delineated in Fig. 5.1, the digitally mediated frame is a presemiotic property: it can be contextualized in horror films to achieve the semiotic function of affective intensity; it can also be semiotized in drama or war films to enhance first-person truthful narration. As the film examples presented in this chapter show, digital mediation in film can often interconnect the affective and truthful narrative functions. Moreover, in the last two decades, digital mediation has moved the mediated first-person point-of-view from the portable camera to the computer screen. As the above example analyses show, despite the change of media frames, the narrative functions of restricted narration, representation of truthful message sources and straightforward character engagement remain dominant.

Nevertheless, the distinctive media affordances of computer screens used in film bring about other narrative functions which camera footage does not construct. While camera footage reflects a medium that bears narrative actions, computer screens simply bear narrative actions. Computer screens add a touch of *immediacy* to the first-person point-of-view. For instance, the character's typing and deleting of a message on the screen, dialing of Skype calls, clicking on webpages, and so on are all part of an immediate point of view. Essentially, the viewers are just watching words and dialing, clicking, and browsing actions on a screen, but the actions feel relatable and human. In our real lives, we do not just passively watch computer screens as we watch camera footage; we constantly interact with them as a means of communication, which, in turn, has given rise to new forms of human behavior in our age. The computer screen-based, immediate point of view in narrative film depicts this new interactive format of ours, a new media affordance that is being explored in cinema today.

Nevertheless, the comparison between computer screen and camera footage holds only when we base the analysis on the multi-leveled semiotic framework: the function of enhancing immediacy through computer screens can be achieved only when particular semiotic structures are realized, for instance, when the character's actions of clicking and typing directly on the computer screen are shown to the viewers within the first-person film frames. Simply showing a computer screen without co-patterning these event actions would not achieve the same effects. For instance, in the war film *Redacted*, analyzed in the previous section, several websites, YouTube videos, and Skype chats are also shown via computer screens (see Fig. 5.7). However, the semiotic structures of these scenes in *Redacted* do not reflect the character's interaction with the media features of computer screens.

**Fig. 5.7** Selected screenshots of computer screen scenes from *Redacted* (dir. Brian de Palma, 2007). All rights reserved

Here, the computer screen is a platform similar to the intra-diegetic camera screen which carries the mediated actions. This might still endorse the message truthfulness due to the digital frames the viewers are engaged in daily, but the first-person perspective is not realized in this kind of event construction.

In summary, this chapter has analyzed films made in the last two decades that are entirely composed of digitally mediated images. The analyses compared the different media affordances and narrative functions of camera footage and computer screens. While the narrative functions of truthfulness and affect remain stable, the affordances of computer screens add the immediacy of actions to the first-person point of view in film. Through the analyses, I hope to have shown that subtle comparisons of semiotic representation, media perception, different types of affective response, and emotional engagements can be more effectively unraveled with a multi-leveled framework, which encompasses sufficient research synthesis across empirical findings of cognitive studies and semiotic theories.

### Note

1. The application of 'macro-theme' to film narratives has also been discussed in our previous work, which unravels how puzzle films or narratively complex films construct beginning structures that guide the viewers' affective responses and narrative predictions (Bateman and Tseng 2013).




# Reading Audiobooks

*Iben Have and Birgitte Stougaard Pedersen*


This chapter is a revised and translated version of the article "At læse en lydbog" in *Litteratur mellem medier* (Aarhus University Press, 2018). Reprinted by permission of Aarhus University Press. Translation: Christopher Sand-Iversen.

I. Have • B. S. Pedersen (\*)

Aarhus University, Aarhus, Denmark

e-mail: birgittestougaard@cc.au.dk

© The Author(s) 2021

L. Elleström (ed.), *Beyond Media Borders, Volume 1*, https://doi.org/10.1007/978-3-030-49679-1\_6

#### 6.1 Introduction

Smartphones, tablets and computers make texts available in a number of different technological formats and have changed the way we read in the digital age. This raises the question of whether we need to understand reading on new terms. This can be done by theoretically embracing and investigating the multimodality of texts as a precondition for media-specific analysis, as suggested by Lars Elleström (2020). In continuation of this, it can also be investigated in relation to the multimodal aspects of reading as well as to the technological conditions for reading. To illustrate the current changes in digital reading, this chapter discusses a medium that has gained huge global popularity because of the development of digital technologies. It is also a medium that challenges the traditional conception of what it means to read a book, because can you read with the ears?

An audiobook is an electronic book format which is listened to instead of being read in the traditional sense. Long before e-books became available, literature appeared in the form of audiobooks in electronic and digital formats. Historically, the audiobook has been described as a kind of by-product of the printed book and as a service for readers who for various reasons have difficulty reading printed books—either because they do not see well, have not learned to read (yet), or because they are dyslexic.

This has changed with the advent of digital media. First, the audiobook is no longer a by-product which, if the sales figures for the printed book are high enough, is recorded long after the book is printed. Today, the market for audiobooks is so big that they are often published at the same time as both the printed book and the e-book, which creates a flexibility of reading choice from the moment of publication. Second, audiobooks are no longer for the few, but for everyone. The digital audiobook appeals to a much broader group of consumers than audiobooks did previously. As early as 2006, an American study showed that audiobook listeners had on average become younger and better off compared to earlier years (Audio Publishers Association 2006: 1). In addition, around half of audiobook customers are men, who otherwise buy only one in four books sold (Arvin 2010). By definition, an audiobook is a recording of a printed, published book (Have and Pedersen 2016), but the explosion in the usage of audiobooks has caused a detachment from the printed original, so that the audiobook is recognised as a medium in its own right. The mobility of the audiobook and the possibility for readers to engage with literature while they are, for example, exercising or commuting by bike, train or car also fit well with a modern lifestyle.

The digital audiobook raises a number of interesting issues regarding its modal aspects, not least when compared to the experience of reading a printed book. We have previously discussed the distinct features of the experience of book reading and audiobook reading, building on Lars Elleström's ideas of the modalities of media (Elleström 2010; Have and Pedersen 2012, 2016). In doing so, we have highlighted that, according to Elleström's model for understanding media (2020), the digital audiobook and the printed book differ in a number of aspects, which makes it evident that we need to understand the different literary experiences in a media-sensitive way. This means that we underline the importance of technology and the context of the reading situation, while being sensitive to the specifically auditive sensory aspects of the audiobook (e.g., the voice and the temporal aspects of the experience).

The chapter takes a context and user perspective on the experience of audiobooks and asks the fundamental question of to what extent we can say that we 'read' an audiobook. Reading is conceptualised as an institutionalised skill which is learned in school and which is connected to the cooperation between sight and cognition. Research on reading in schools is often tied closely to national contexts, and in that sense, there are both similarities and differences in national research on reading. It seems, however, that concepts like phonemic awareness, phonics, fluency, vocabulary and comprehension are among the ingredients of definitions of reading, in this case that of the National Reading Panel of the United States (Read Naturally 2018).

If reading in everyday usage means, among other things, visually "perceiving the content of written or printed texts", as Gyldendal's Den Store Danske dictionary (2019) defines it, we cannot say that we read an audiobook. If, on the other hand, reading is about "recreating mental images on the basis of identification of the text's words", as it is defined in the same source (2019), we may argue that we can read by listening to literature in the same way that we also use the concept of reading when speaking of Braille. In a Danish context, the aspect of written word decoding has for many years been at the core of the reading definition put forward by leading reading researchers, which makes it difficult to include the use of audiobooks as a reading practice.

We wish to emphasise the sensorial as well as the technological aspects of the reading activity; in that sense we regard the audiobook reading activity as multisensory in itself. The same can be said of printed book reading: you read a book in a specific setting, using your sight to visually and cognitively perceive the written text, so reading a printed book also takes place in a situation where sounds, smells and other bodily senses are important. Audiobook reading, however, does not use sight and therefore creates new possibilities for mobile reading experiences, such as listening while walking, bicycling or driving, and this mobility takes part in the meaning-creating, semiotic process of reading a book. The multisensory aspects act beyond language, yet they take part in the meaning-creating process. In this sense, we shift the focus of interest in the analytical strategies from a more classical close reading of a text to include the technological aspects as well as the sensory, situational and context-oriented aspects of the reading situation.

The first two sections of the chapter are about what sort of medium the audiobook is, how we use it, and what sort of reading experience it affords. Thereafter three sections follow which suggest, on the basis of Helle Helle's novel *Ned til hundene* (*Down to the Dogs*) (2008), how the audiobook experience as a whole may be analysed with regard to 'technological framing', 'reading situations' and 'the voice', respectively. After these three sections, the reading of audiobooks is discussed in relation to the experience of time and depth, before the chapter concludes with a brief résumé.

#### 6.2 The Formats of the Audiobook

Whether it makes sense to call an audiobook a book is an open question, since as a medium, as an experience and in its usage it is fundamentally different from the printed book (Have and Pedersen 2016). Technologically and materially, the audiobook has nothing in common with the printed book; rather, it shares its technology and formats with music. Thus, the technological histories of the audiobook and of recorded music run parallel. The starting point was Edison's invention of the phonograph in 1877, the original aim of which was to record speech. Later, around 1900, the vinyl record was invented, and in the 1970s, the cassette tape became the audiobook's primary storage medium, so that it could now be listened to on cassette recorders, Walkmans and the inbuilt tape decks in cars. It was also with the invention of the cassette tape that the term 'audiobook' began to be used for recorded books (Rubery 2011: 8).

In the 1980s, the digital CD slowly began to take over the market for audiobooks. First as digital audio CDs to be played on traditional CD players, later as MP3 CDs. MP3 CDs can be played in the CD player at home or in the car, but the compressed files can also be transferred via a computer to most computer-based playback media such as smartphones (Sterne 2012). Even though audiobooks are today still in circulation as both CDs and cassette tapes, audiobooks have become less tangible and are now primarily disseminated via the Internet as downloads or via streaming. Due to technological developments, audiobooks have thus become easier to use. To take an example from Matthew Rubery's introduction to the book *Audiobooks, Literature, and Sound Studies* (2011), the development of storage media for audiobooks means that Tolstoy's *War and Peace* in an unabridged edition has gone from demanding 119 vinyl records, 45 cassettes or 50 CDs, to having become today, with the MP3, weightless (Rubery 2011: 9).

That it nevertheless makes sense to speak of an audio*book*, in spite of the technological, aesthetic and usage-based differences from the printed book, is because our definition of an audiobook requires a prior or contemporary printed book and an institutionalised literary context in the form of authors, publishing houses, bookshops, libraries and so on. This means that not all recordings of texts read aloud are audiobooks and that a recorded oral tale without a written source is not an audiobook either. It also means that there are differences between audiobooks, talk radio and podcasts, even though they all more or less consist of texts read aloud, because the latter two typically arise from media institutions and 'on-demand blogging culture'. While the audiobook is part of the literary ecology, it is also part of the culture surrounding mobile sound media, which Michael Bull described as "iPod culture" (2007), but which has today become part of a broader smartphone culture. With the smartphone as the primary platform for listening to audiobooks, the discussion of the audiobook as a medium is also inscribed in a broader discussion about media convergence, where it merges with various other everyday private and social digital activities (Schulz 2004). By defining the audiobook as a sound recording of a literary or academic book which is read aloud, usually by professional actors or by the author, we understand the audiobook as a remediation of the book (Bolter and Grusin 2000), which underlines that the auditive mediation of literature adds substantial new aspects to the work. The narrative and its structure are the same, but in the audiobook the way in which it appears, and thus is experienced, changes radically (Bednar 2010: 80). Seen historically, the audiobook is not just a remediation of the printed book but also refers back to the tradition of oral tales and the reading aloud of novels, long before literature became an institutional concept (Ong 2002).

# 6.3 Do We Read an Audiobook?

Today, we access texts in a number of different ways, among them in books, on paper and on smartphones, tablets and computers. When we listen to an audiobook, are we then reading the book, or are we listening to it, and are these two activities fundamentally different? In the chapter "Close listening" in *The Gutenberg Elegies* (1994), Sven Birkerts wrote of both a fascination with and an aversion to the audiobook experience:

[O]nce we grant the audio book its attractions, we are still confronted with the question of its whatness. This is no mere epiphenomenon; it is a full-fledged trend. As life gets more complex, people are likely to read less and listen more. The medium shapes the message and the message bears directly on who we are; it forms us. Listening is not reading, but what is it? (Birkerts 1994: 145)

Audiobook researchers today do not necessarily agree with Birkerts that we do not read an audiobook when we listen to it. In order to discuss the differences between listening and reading, it is however necessary to speak of the activity of listening as something other than reading understood as a visual decoding of writing. One argument for such a differentiation is that in our everyday usage of media, we also change between platforms when 'reading'. Don Katz, the founder of Amazon's audiobook service Audible, who has a promotional rather than research-based perspective on the matter, says:

We're moving toward a media-agnostic consumer who doesn't think of the difference between textual and visual and auditory experience […] It's the story, and it is there for you in the way you want it. (Don Katz cited in Alter 2013)

The individual semiotic and sensorial expressions, individually or together, function, according to Katz, to a greater degree than previously as channels which mediate stories. Katz is perhaps right in saying that in everyday usage we think less about the specific sensorial medium (e.g., sound or writing) in which the story appears. It may be argued, on the other hand, that the technological medium, the experience and the physical usage are different depending on whether you are reading F. Scott Fitzgerald's novel *The Great Gatsby* from 1925 as black ink on white pages between hard covers, whether you are listening to Jake Gyllenhaal reading the novel aloud through headphones on a smartphone or whether you are allowing yourself to be absorbed in the darkness of the cinema together with other cinemagoers while watching Baz Luhrmann's film version from 2013. The medium is of importance: this transposition is not a frictionless transition but alters the object itself. An analysis of an audiobook must therefore be medium-specific and medium-sensitive. A story changes when it is moved to another medium, and strategies of analysis must therefore be developed which are sensitive to these material and technological differences (Hayles 2004). At the same time, the concurrent presence of both audio and visual text on, for example, the smartphone offers the technological possibility of reading with both the eyes and the ears on the same platform. Amazon's feature *Whispersync for Voice*, for example, makes it possible to change 'seamlessly' (as they describe it) between an e-book and an audiobook version, which supports Katz's argument about the 'agnostic' media consumer.

In everyday speech, it is still widespread to speak of the activity of reading an audiobook as listening, but in our experience, consumers of both audio and paper books do not specifically differentiate between which books they have *read* and which they have *listened* to (cf. also Bednar 2010: 81). When you have listened to Jake Gyllenhaal reading *The Great Gatsby*, or to Helle Helle reading her own *Ned til hundene* (2008), have you then read that book? In one sense, the person reading aloud has read the book for you, but in this chapter, we want to insist that reading should not be reduced to the visual decoding of writing but can also be an auditive decoding of an audiobook, which offers a different form of literary experience.

Using the Danish author Helle Helle's novel *Ned til hundene* as a recurring example, we will in the following describe which parameters are important to include in an analysis of the audiobook as a medium and of particular audiobook experiences. The specific technology, access and reading situation are important bearers of meaning in a medium-sensitive analysis. Methodologically, we draw on the American philosopher of technology Don Ihde's so-called postphenomenology, an analytical strategy which seeks to integrate a material perspective into an experience-oriented phenomenological philosophical position. Ihde writes that "*postphenomenology* is a modified phenomenology hybrid" (2009: 23). The fundamental premise of a postphenomenological position is that we understand technologies as objects which act and which, together with the situation as a whole, constitute a dynamic understanding of our lifeworld. In this way, this chapter also tries "to probe and analyze the role of technologies in social, personal, and cultural life" (Ihde 2009: 23).

If we analyse audiobook reading and book reading in continuation of Elleström's conception of modalities (2020), it is not possible to accept the idea of a seamless transition. Instead, taking into account the material, the sensorial, the spatiotemporal and the semiotic modalities, the audiobook performs the reading experience on distinct and almost totally different terms from the printed book, as we have elaborated on earlier (Have and Pedersen 2012). In this sense, the theoretical framework for analysing the audiobook is inspired by Elleström (2010, 2020), although we add some perspectives. We also study audiobook reading situations from an everyday, sociologically oriented perspective, including discussions of how the audiobook takes part in circuits of cultural value and renegotiates the production side of digital publishing (Have and Pedersen 2019). As we see it, reading an audiobook takes place in a triangulation between everyday practice, specific technological formats/conditions and specific aesthetic or modal literary experiences of reading.

# 6.4 Narrative and Themes in *Ned til hundene* by Helle Helle

Helle Helle's novel *Ned til hundene* came out in 2008, and the audiobook was published by Gyldendal Lyd (Gyldendal Sound) the same year, read by the author herself. Helle Helle usually reads her texts herself, and she has often been singled out as exemplary with regard to successful author recordings. A brief presentation of the novel's narrative and its narrative characteristics is followed by some important analytical focus points, which attempt to encompass some of the audiobook's particularities as a medium and reading experience.

The novel takes place in the countryside somewhere in Denmark, where in the opening scene we meet a 42-year-old female main character, the author Bente, who finds herself with a suitcase in an unspecified place. We understand that the main character has left her home without a clear aim and that she is waiting for a bus in a corner of provincial Denmark. The bus is cancelled because of a storm warning, and the main character is invited by Putte and John into their home, where they live with their dogs. We are witness to a tense situation, where someone has left something behind for an unclear but potentially problematic reason. This reason is never quite revealed, but the reader follows the main character's life with John and Putte and their friends, while we are presented with glimpses of Bente's past: her husband Bjørnvig, who is a dermatologist, and her woes as an author with writer's block, who has left her disintegrating marriage, and perhaps her oeuvre, behind.

The novel follows life with John and Putte, with whom the author has taken up residence, taking part in everyday life by feeding the dogs, filling in lottery tickets, playing a number of board games and visiting John and Putte's family. In general, the story describes this scenario from the Danish provinces where fundamental existential questions probably lurk beneath the surface but are rarely exposed or taken up explicitly in the text. Finally, the narrator receives a phone call from her husband, and Putte puts this potentially existential question to the protagonist: "Do you want to be found?" (or "Do you want to exist?"; in Danish the sentence is "Vil du findes?", and 'findes' means both to be found and to exist).

Whereas the introductory scene suggests a kind of formative journey, the story ends up taking place in and depicting the everyday provincial scenario in a peculiarly laconic and loyal tone of voice. The situation which we as readers are thrown into from the beginning of the novel intones a tension which suggests a dramatic arc, but the expectant position remains stationary, among other things *qua* the story's tone of voice, its static plot and its preference for registering details.

A fair amount of direct speech is used in the novel, broken by descriptive, registering and reflective passages. It is built up around a number of apparent contradictions, which in terms of narrative technique appear in the shifts between interior monologue and a plane of action consisting of descriptions and direct speech. In terms of context, details such as the difficulty of eating a croissant with chicken salad, and dwelling on knee-length socks, are in this way contrasted with a tacit existential crisis and an apparent escalation of the drama: a coffin is, for example, described as well-suited both for storing board games and for trying out how it must feel to be dead. The concluding remark about whether she wants to be found or not suggests a short-circuiting of meaning or an existential doubt, which lurks in the wings of the novel, but which is seldom articulated.

# 6.5 Technological Framework

*Ned til hundene* is a relatively short novel of 157 pages, and as an audiobook it only lasts 3 hours and 39 minutes, which is short for an audiobook. If you buy the audiobook as a download from one of the major Danish Internet bookshops such as Saxo, you pay half the price of the printed edition. Once the audiobook is bought and the purchase confirmed, it is sent to the buyer's email address as a compressed ZIP file. *Ned til hundene* can now be downloaded and unpacked, which takes a few minutes, in order to gain access to the 53 MP3 files which make up the audiobook itself. It is now up to the consumer to name the file folder, which in this case comes with a visual book cover, which is important in giving the purchase a recognisable aesthetic representation and 'materiality'. The audiobook can thereafter be transferred to the consumer's other devices, such as a smartphone, but cannot be legally shared with others, unlike the printed book, which can be lent to friends and family.

The audiobook can also be accessed via a streaming service, for example, Mofibo, the North European counterpart to Audible, where you pay a monthly subscription in order to be able to stream or download audiobooks. When streaming, you can only read audiobooks when you have an Internet connection, and you therefore do not have the same feeling of owning the audiobook as you do with a download. Recently, the Internet bookshop Saxo also introduced a streaming service. From their webpage, you can now decide whether you want to buy *Ned til hundene* as a download or subscribe to 'Saxo Premium' and get access to the book from an app downloaded to your smartphone or tablet.

On eReolen, a service offered by the Danish libraries, it is possible to both download and stream e-books and audiobooks for 30 days at a time. The lending of audiobooks via eReolen has been such a great success that in 2015 some authors and several large publishing houses, among them Gyldendal, chose to recall all publications from eReolen to protect their own turnover and the authors' fees. *Ned til hundene* is therefore no longer to be found on eReolen. The unlimited digital copies of books on eReolen have of course caused debate, since the free access of Danish citizens to literature stands in opposition to the ability of the producers to earn money. Attempts have been made to limit the lending by restricting the number of audiobooks a person can borrow at any one time to three and by limiting the number of digital copies of new releases that are available for lending.

Once *Ned til hundene* has been transferred to the telephone, it is saved with the other sound files in the phone's music library. In iTunes it manifests itself as 53 'songs' on the 'album' *Ned til hundene*. In contrast to both downloads from eReolen and streaming services, the book is not saved as one long sequence but as 53 separate files, nor does iTunes remember where you got to if you listen to other sound files in between sessions with the audiobook. As the technology stands at the time of writing, a form of virtual bookmark is lacking, a function which works a lot better in streaming. Apart from this, the interface looks like a traditional playback device, with icons for 'play', 'stop', 'rewind' and 'fast forward', as well as an indication of the volume and the ability to shuffle the order of the files during playback, which is hardly desirable with regard to a linear narrative. Audiobooks are often listened to through headphones, which also allow other audio signals from the telephone to break into the reading. In this way, the story stops or briefly fades out when someone calls or texts. Audiobooks are usually not recorded in stereo, so the same signal comes out of both sides of the headphones. It is obvious that small *earbuds* let in more ambient sound than large headphones, and they may therefore be better suited to situations in which, for example, you want to be able to orient yourself in traffic while listening to the audiobook.

In an analysis of the audiobook (both a particular audiobook and the audiobook as a technical medium), it is initially important to take into account the actual technologies and forms of distribution which frame listening, by interrogating the following aspects:


Each of these also includes some visual representations and functional structures which can be further described: book covers linked to MP3 files, interfaces, functions and organisation on the providers' Internet sites or apps and icons for using the playback medium.

The technology behind the actual production of audiobooks lies outside the focus of this chapter and will not be treated in more depth, but in the following, we will give some examples of how various situations of consumption also have an influence on the experience of an audiobook.

# 6.6 Reading Situations

The act of reading an audiobook can occur in a relatively large number of ways. When reading a paper book, we may find ourselves in all sorts of different locations, but the activity of reading always holds the attention of our sight as long as the activity is taking place. We can read at the beach, on the train, in bed, in a comfortable chair or on the bus. When reading a digital audiobook, we can furthermore read in a car, and when using a mobile technology, we can read on the way to work, in the fitness centre, while knitting, gardening, running, biking or while at work (if doing primarily manual labour).

These differences in possible reading situations at the same time point to great differences in modal spaces of experience. For example, does something happen to our experience of Helle Helle's novel if the reading takes place on a long walk through a town, compared with reading the book in bed in the dark before going to sleep? If we walk through an urban area and then along a beach and the edge of the woods while listening, the everyday spaces we find ourselves in will of course have an influence on the space of reading. The internal images of the woman waiting for the bus, of the chicken salad on her sleeve, of dwelling on the knee-length socks, of the lottery ticket in her hand, will evoke a different response when they are experienced in a bodily and mobile relation to a lifeworld. There is a transfer of impressions from the real to the literary universe, and at the same time, we can imagine that the characters and the locations in the book potentially also influence our experience of the physical surroundings. The character of the landscape in the actually sensed world can evoke a response in the literary space of imagination, or the book can add an atmosphere to what we are in the process of doing. The body in motion can also be experienced as a factor in the whole experience. It lends rhythm and respiration to the reading, which can work either with or against Helle Helle's diction, pauses and vocal qualities.

When reading in the dark while falling asleep, other image-forming processes may arise. We can in this case imagine that the detailed nature of the language, and the potentially absurd situations described, will stand out more clearly, possibly in interplay with an increased emphasis on the diction of the voice and changes in tone from description and reflection to direct speech. Potentially, the atmosphere may here be more strongly tied to a stylistic sensibility, both with regard to the character of the language and—perhaps in particular—to an increased feeling for the voice of the performing narrator.

The use of the terms listening, reading and experience is related to ideas about concentration in the reader. There may be differences in the degree of identification with and immersion in an audiobook when driving a 12-hour route in a lorry, compared to walking around the house doing housework, activities that require differing degrees of concentration. The audiobook has suffered under the idea that listening is a more distracted form of reading than reading a paper book, among other things because we can do other things at the same time (Kozloff 1995; Have and Pedersen 2016). We would like to argue, however, that attentiveness to the world and bodily participation have the potential to lead to more 'deep', 'immersed' readings. In the article "Reading on the Move", Lutz Koepnick suggests that:

To read between an audiobook's lines—to read an audiobook deeply means to open your minds and senses to the productive interplay of ears, eyes, and bodily motions during the act of attending to the movements of a text. (Koepnick 2013: 236)

Things which may be perceived as disturbing or distracting elements in the listening experience, such as the landscape which passes by or an overtaking bike, may actually contribute to reinforcing rather than impairing the feeling of identification and immersion in a story. At the same time, we wish to reject the idea that listening to an audiobook is a more passive form of reading than reading a paper book. It can be just as captivating and gripping, but it is different and therefore requires different methodological approaches than the analysis of a traditional reading experience. The analysis of sensorial aspects of the reading situation is here taken beyond the relationship with the text as implied positions to also include the real reader as well as the specific context for use.

# 6.7 The Voice

The voice is the technical medium of display (Elleström 2020: 33–40) which most clearly points to the difference between the paper book and the audiobook, and an analysis of the audiobook ought therefore to include the role of the voice. The voice delivers an interpretation of the text and in so doing becomes a new medium for literature. In implementing such an analysis, it is relevant to include the five points listed below, which can flow into one another and be combined as needed (Have and Pedersen 2016: 87).


The recording of *Ned til hundene* comes across as a very clean, well-soundproofed production. Only a few cuts are heard in the recording, which we notice when the intonation of the voice changes. A few sounds of the performing narrator's mouth opening and closing can be perceived, as well as a few words which are spoken with a quiet whistling or lisp.

Helle Helle's voice has a pleasant sound, which resonates in the mid-range. Her voice is neither particularly compressed nor especially airy, but is sonorous, calm and pleasant to listen to, in the sense that it is a voice which does not draw attention to itself in a disturbing way. The reading is delivered with a fine and balanced diction, slightly dramatised in direct speech, during which the voice is raised to a slightly higher tone. When descriptive passages are read, the voice becomes more monotone. Generally speaking, Helle Helle has chosen to emphasise the ends of words, creating a pleasant sense of rhythm. Her feeling for dramatic pauses works in support of the narrative without demanding too much attention. Generally, a relatively transparent reading has been aimed at, in which the attempt has been made to stay loyal to the written word by not drawing too much attention to the mediation of the reading.

Rhetorically, the reading style seems relatively monotone with controlled fluctuations, which supports the minimalist, soft-voiced style of the book's narrative. Stylistically, it seems almost deliberate that the contradictions in the text are not marked by pauses and/or changes in the intensity of the reading voice: the sentences "Osteoporosis—hell on earth", "It's Bente", "Do you want liver?—There's a fine smell of onions here", for example, are delivered in a flow without any particular expressivity or dynamic fluctuation. The existential difficulties are not elaborated on or interpreted in the text but revealed in the cracks of, for example, everyday observations, whether they be about lottery tickets or foodstuffs. This quality is supported here by the use of the voice, the intentional pauses of which typically seek to indicate changes in the register of the text.

The enunciation of the text is primarily about who is speaking. In a text there may be several narrators, who may be reliable or unreliable. These narrators are, as we know, not identical with the author, but when the author herself records her audiobooks, something happens to the voice's relation to the text (see, e.g., Have and Pedersen 2016: 116). The text's narrator and the author are physically tied closely together, and the text's positions are thus negotiated in new ways, which in Helle Helle's case has several interesting implications. On the one hand, there is the metafictional layer of the novel, the narrator of which is an author with writer's block who writes about 'ordinary people who drink coffee and chat and that sort of thing'. This metafictional quality is reinforced in the audiobook version. On the other hand, there is the relationship between the author who reads aloud and the figures the author describes. Is the performing narrator and author Helle Helle operating with an ironic stance towards the figures depicted, or is the voice in the novel loyal to John and Putte's relatively humble everyday universe? This is in a way undecided, but the ambivalence (that we laugh both at and with John and Putte) seems perhaps less pronounced in the audiobook version, since the reading actually comes across as relatively loyal to the novel's cast of characters.

When looking at the voice in context, it could in this case be a question of gender and social class. The aforementioned discussion of ambivalence could be continued, since Helle Helle's novels to a certain degree speak to a world of high culture, while the environment she describes lies far from the literary world's ideas about itself. Even though the audiobook's recording thus demonstrates a loyalty to or love for the figures it describes, exposes and lives with, it still encompasses an explicit collision between what is described and the literary world of which Helle Helle's novels are part.

Above, we highlighted three important parameters in the analysis of a specific audiobook experience. The combination of the specific technology, the reading situation and the reading voice contributes to defining the reading experience in the meeting with the individual listener's personal disposition and experience, which, incidentally, we have not considered methodologically in this analysis.

By including technology, situation and the voice in the analysis, a reading experience is sketched out which is significantly different from reading the book as a printed book or e-book. Depending on how the three parameters combine in the individual reading experience, it can be further analysed in relation to two general aspects of experience, which focus on how the use of audiobooks negotiates the relationship of the experience to time and on the experience's degree of immersion or, in metaphorical terms, on a horizontal and a vertical aspect of the appropriation.

# 6.8 The Aspects of Experience in Reading an Audiobook: Time and Depth

Empirical studies have shown that the reading of audiobooks has an effect on our experience of time (Dalsgård et al. 2015; Have and Pedersen 2016). Although this is also the case with visual reading, the popularity of the audiobook in the last few years perhaps has something to do with the modern person's relationship to time. Reading audiobooks gives waiting and wasted time a positive aspect by adding an extra mental layer of experience to the bike ride, walk or waiting time on the platform, so that we feel we are making the best possible use of time.

An American questionnaire from 2012 showed that 62% of those asked mentioned that they chose audiobooks rather than printed books because they could be read while they drove their car, 46% highlighted the mobility of the audiobook and 31% ticked the box "it helps me multitask". Only 11% answered that they chose audiobooks because they had difficulty reading printed books (Have and Pedersen 2016: 101). This was confirmed by a qualitative study we ourselves undertook, in which reading audiobooks qualified the time our informants otherwise felt was wasted on, for example, transport or cleaning and was seen as a choice that added value to everyday activities (Have and Pedersen 2016: Chapter 6). Our informants also confirmed a point from an earlier American survey (Audio Publishers Association 2006: 1) that audiobook readers typically also read many printed books and therefore do not replace the printed books with audiobooks but rather create new routines for reading in situations in which they cannot read with their eyes (Have and Pedersen 2016). As the report *Når Danmark læser* (*When Denmark Reads*) also confirms, the audiobook offers new possibilities for reading in a time when many people do not feel that they have time to read printed literature (Dalsgård et al. 2015). People with routine jobs or commuters also have additional time or attention with which to listen to literature (Dalsgård et al. 2015: 33).

There has been a tendency to view listening to audiobooks as a distracted and superficial form of reading and getting to know literature, but when audiobook readers say that they make wasted time pass faster by reading audiobooks, it is also a question of compressing time through immersion. We can therefore differentiate between two forms of listening to audiobooks, each with its own qualities. It is an analytical differentiation, since the two forms in practice enrich one another: *atmosphere-oriented audiobook listening*, which is reminiscent of listening to music and which emphasises the aesthetic aspects of the voice and a 'thickening' of the linguistic in a sonorous-stylistic sense, and *content-oriented audiobook listening*. Immersion may occur in both cases, depending on whether we immerse ourselves in the audiobook's narrative or in the atmosphere created by the sonorous qualities of the language and by the narrator's voice, which in, for example, Helle Helle's case can create presence and mental calm. It is, then, a question of two different ways of reading 'deeply', and the audiobook can be said to increase sensibility to the *connection* between the sonorously atmosphere-oriented and the narrative-oriented levels. Sound is experienced in real time, but audiobook reading can either make time pass more quickly or give time an atmospheric and content-based wealth. The temporal dimension is therefore closely connected to the vertical dimension, which is about immersion and concentration. Depending on which reading situation we find ourselves in, reading an audiobook 'deeply' may mean that we are absorbed in a story driven by a hermeneutic desire, but it can also mean that the audiobook's tone creates a partly sensorially based atmosphere for the experience, which affects the reader's relation to his or her surroundings.

Like the printed book, the audiobook can be read with varying degrees of attention, and like traditional reading, it demands practice to read an audiobook with concentration, to understand and immerse oneself in the stream of words that meets the ear at a high tempo. One of the points we wish to draw out in this chapter is that it is possible to speak of different ways of reading an audiobook and that habituation and discipline may be required to approach audiobooks in different ways, in the same way that the reading of writing presupposes practice and discipline. In this sense, the audiobook reading practice negotiates the relation between spatiotemporal aspects of reading on new terms, not least by questioning the hierarchies of cultural values when it comes to conceptions of close and deep reading.

# 6.9 Conclusion

In this chapter, we have tried to give the digital audiobook a voice in the analysis of the reading of literature. We define the audiobook as an audio recording of a printed book (the technical medium of display), which is read aloud, but do not see it as a by-product of the book; it is not only a remediation but also an independent medium, which offers other and expanded forms of reading literature. The large circulation of the digital audiobook calls for new methods, and by presenting some different parameters of analysis which can supplement the traditional literary analysis, we have sketched an analytical method which is sensitive to the medium and points out the importance of the technology, reading situation and voice for the reading experience. Together with the narrative content of a fiction-based audiobook, the various combinations of these parameters make possible a broad range of audiobook analyses, which can be further qualified by discussing them in relation to two types of reading, the atmosphere-oriented and the content-oriented reading, respectively, which in different ways build upon the audiobook's particular potentials within time and depth as aspects of experience. By insisting on the fact that we *read* audiobooks, we also hope that we can contribute to challenging the preconceptions that may still exist about the reading of audiobooks as superficial and compensatory.

#### References

Alter, Alexander. 2013. The New Explosion in Audio Books: How They Re-emerged as a Rare Bright Spot in the Publishing Business. *The Wall Street Journal*, August 1.


Elleström, Lars. 2020. The Modalities of Media II: An Expanded Model for Understanding Intermedial Relations. In *Beyond Media Borders: Intermedial Relations among Multimodal Media, Volume 1*, ed. Lars Elleström, 3–91. Basingstoke: Palgrave Macmillan.



# Language in Digital Motion: From ABCs to Intermediality and Why This Matters for Language Learning

*Heather Lotherington*


# 7.1 Introduction

The context with which this chapter grapples is *mobile language learning*. Recent research (Lotherington 2018) indicates that top commercial mobile (m-)learning apps tend to rely heavily on outmoded structural models of language and tap dated behaviouristic pedagogies in their lessons, typically applying gamification veneers to attract and maintain users. Popular m-learning apps predominantly feature levelled vocabulary study, and to a lesser extent, language structure, aka *grammar*. Linguistic communication, however, has moved from simple alphabetic encoding to multimedia design in digital environments; this challenges the fit of structural theories of language to digital language learning contexts. In language teaching and learning literature, *multimodality* describes communication employing diverse semiotic resources in texts not limited to alphabetically (or logographically or syllabically) encoded language. In reality, all communication is multimodal; however, professional awareness of *multimodal communication* in language teaching emanates particularly from digital texts and discourses, which are created using a broader palette of meaning-making resources than static print texts, ergo, multimedia texts. The predominating trend has been to approach digital multimodal communication from a *social semiotics* paradigm (e.g., Bezemer and Kress 2008, 2016; Jewitt 2008; Kress 2000, 2003, 2005, 2009, 2011).

H. Lotherington (\*), York University, Toronto, ON, Canada; e-mail: HLotherington@edu.yorku.ca

© The Author(s) 2021. L. Elleström (ed.), *Beyond Media Borders, Volume 1*, https://doi.org/10.1007/978-3-030-49679-1\_7

Shortly after the turn of the century, Kress (2003: 1) predicted: "the combined effects on writing of the dominance of the mode of image and of the medium of the screen will produce deep changes in the forms and functions of writing." *Mode*, though, is not defined. Jewitt (2004: 84) clarifies the concept of *mode* as indexing "technologies of representation (the modes of 'multimodality')," and contrasts mode with *media*, which indexes "technologies of dissemination (the media of multimedia)." As social semiotic conceptualizations of multisemiotic composition have grown in concert with increasingly sophisticated, grammatically differentiated digital communication, the terrain of multimodal communication can be seen to overlap considerably with that of intermediality theorizing. The fundamental building blocks of *mode* and *media*, which, in a social semiotics reading based in linguistics, rely largely on cultural interpretation and exemplification, gain precision from an intermediality analysis based in interart studies.

Elleström's landmark intermediality model (2010, 2020) delineates the concept of a *mode* from a semiotic perspective, illuminating how *modality* characterizes *media*, a generally nebulous and capacious concept. Elleström's model offers a valuable resource for examining *multimodality* in digital communication from an innovative theoretical stance grounded in art, rather than linguistics.

Taking an intermediality lens, this chapter parses two selected features of mobile digital communication that are only vicariously captured in linguistic analysis. The analysis draws on Elleström's (2020) intermediality paradigm to help delineate how diverse meaning-making resources could be conceptualized from a perspective decentred from linguistics and leveraged in reconceptualizing the m-learning of language as it is used in contemporary interactive, multimedia texts and discourses.

# 7.2 Language and Literacy

Language has traditionally been described as a *medium*: of communication, and of learning. Language is, in fact, an abstract until it is materialized: mediated physically, in speech and signed conversations, and technologically, in printed documents, social media sites, roadside signs, movies, games, and suchlike. Gershon and Manning (2014: 539) point out that curiously "the mediality of language is rarely explored in media studies."

Language is often complexly remediated. Ong (1982) theorized *secondary orality* as the remediation of speech into writing with subsequent re-voicing, as in scripted televised newsreading (also see Eide and Schubert 2020). Have and Pedersen (2020) describe contemporary remediation in books: traditional media products that can be remediated as audiobooks and *read* aurally. In this chapter, remediation of the human voice is described as material in artificial intelligence (AI) programs that create conversational digital agents.

Language is innate to human beings, hardwired as individual cognitive capacity, though the developmental trajectory of the child learning to speak (or to sign in the case of deaf children) requires appropriate socialization to activate. Languages, thus, live in society as well as in the minds of individuals, social use constituting their lifeblood.

Speech is dynamic in character, spoken language evolving in tune with the social community/ies in which it has currency. Different accents signify regional location in spoken language populations, for example, English as spoken in Toronto, Canada, as compared to English as spoken in Edinburgh, Scotland, or in Cape Town, South Africa. Literate norms, though, tend to be much more fixed, coded into a body of literature relying on common conventions that developed for the social expectations and the mediating technologies of the time of publication.

The technological interface encoding language into literate forms for most of the history of mass literacy was the printing press, which, through pioneering use, forced the conventionalizing of spelling and sentence mechanics to make printed literature easily readable. Times have changed, however, and the predominant canvas for communication is now the screen, and increasingly, a screen that is individualized, mobile, and ubiquitously wifi-connected. Communication is now ineluctably and indelibly multimodal.

#### *7.2.1 The Literate Bias of Education*

In education, there is a distinct bias towards literate learning. At the grade school level, literacy is seen as the keyhole through which school learning progresses. In the province of Ontario (Canada), which is the policy jurisdiction for children in Toronto, *language* is conflated in curriculum documents with *literacy*. As such, literacy references language written down. This makes for a messy transition to digital multimodal communication.

In the case of sequential language learning (language learning following initial child language development), sometimes referred to as *second (or foreign) language learning*, language competence has long been characterized in terms of four skills: *speaking*–*listening*–*reading*–*writing*. This characterization, which ostensibly mirrors child language development (as theorized in pre-digital, majority language, middle-class contexts), has tenaciously persisted for decades. Though the validity of discrete language skills (reading as separable from writing, and so forth) has been in contention for decades, formal language testing in gate-keeping tests, such as the TOEFL, has been complicit in cementing a four-skills model into social and economic benchmarks, so this limited twentieth-century thinking about what language comprises and entails for learners continues to be reinforced in many teaching contexts.

#### *7.2.2 Mobile Language Learning*

Mobile language learning is geared to second and foreign language learners. Despite myriad possibilities in using mobile devices for language learning, the evolution of m-learning tilted towards app-based learning following the release of the iPhone in 2007. Apps are third-party software packages that are directly downloadable to mobile devices on a trial, cost-free, or low-cost basis. Apps for language learning take a variety of generic approaches from language courses to games to vocabulary resources to memorization drills (Lotherington 2018), but each offers a proprietary package, marketed by a software developer. The digital marketplace for language teaching apps is a veritable wild west of unregulated products, educationally speaking, in which *learners* are treated as *users*. The user is, in fact, the product in app design, their learning behaviours collected and sold as *data*, though there may also be a more direct profit motive in paid course upgrades, unlocked learning resources, and so forth.

Mobility in the context of digitally mediated communication references capacity for ubiquitous interactive communication untethered to physical location. Contemporary m-learning uses smart mobile devices that embed powerful multifunction toolkits for digital text making. Because of this, mobile connection has the potential to invite dynamic pedagogical approaches that put agency in the hands of the learner for collecting and creating resources towards directed language learning, contextualized in space. Mobile devices, however, frequently meet with disapproval in formal education sites, if traditional curricular agendas of language and literacy learning dominate. Worse, smartphones are banned in some schools, precluding creative classroom m-learning.

Top-selling apps for language learners, recently surveyed and road-tested (Lotherington 2018), revealed a tendency to import dated structural models of language and behaviourist pedagogies into the fluid mobility and complex functionality of a smart device. Features of current interactive multimedia communication are used here and there; most apps apply a gamification engine to mask the tedium of vocabulary drills. Some apps also utilize a social media feature to connect random chat partners selected on the basis of home country, which is an unreliable indicator of language proficiency in any case. Generally speaking, however, social media affordances were poorly utilized and digital discourse forms largely avoided.

This chapter presents data from an extended review of the research literature on how language has morphed in form and function in coevolution with technological change. This literature review forms the basis of an active exploratory pedagogical design study to build mobile production pedagogies (Thumlert et al. 2015) for language learning that utilize the powerful resources of smart devices and activate a contemporary palette of semiotic resources. Our research team's aim is to invite interactive multimedia textual composing in ways that work agentively for language learners. To do this, we need to understand the kinds of changes language has undergone and how we can document these changes from a theoretically rigorous standpoint. Elleström's (2020) intermediality paradigm offers analytical specificity and categorization that is helpful in understanding contemporary communication from a perspective not grounded in linguistics that can be intelligently merged with what we understand about how language works to create meaning.

# 7.3 The Expanding Borders of Language in Digital Communication

Ignoring the paralinguistic information embroidering the borders of spoken language (e.g., body language, facial expressions, and tone of voice) was pedagogically reinforced in both school and second language learning. A related principle was held in books where nonlinguistic material, such as illustrations and charts, was acknowledged but looked at as supportive of rather than integral to overall meaning contribution. This was particularly the case in second language learning, where the focus was traditionally on the structural elements of language. In grade schools, where language is more holistically approached, books were nonetheless assessed as to content and nature according to the density of language carrying, as it were, the message; image-centred genres, such as comic books, were derided as insufficiently serious. Nowadays, the graphic novel has taken on a new life, imparting content taught in grade school and university classes, for example, *Persepolis* (Satrapi 2003).

Digital mobile connection has redrawn the literate borders of language, challenging the print-centred skills of second language teaching and learning, and the linear linguistic encoding of text. With Web 2.0 technological upgrades enabling interactivity, literacy has morphed from discrete reading and writing of the static page into a *multimedia read/write (R/W) capacity* underpinning social media posting, collaborative authorship (wikis), tweeting (microblogging), texting, and so forth. Furthermore, the shift from page to screen entailed radically new encoding capabilities that have dislodged the stuff of literacy from the letter to the pixel, which indifferently encodes still and moving images, sound files, and spoken language (Cope and Kalantzis 2004). Theories focused on written instantiations of language, namely linguistic structuralism and literacy theories built on linear encoding and decoding of the static page, have become inadequate to the task of understanding and creating multimedia textual products.

Though the evolution of page to screen has taken place over decades, the period of most rapid change in language form and function has taken place over the past 15 years, consistent with Web 2.0 and 3.0 technological advancements. The initial iteration of the public World Wide Web in 1991, retroactively referred to as Web 1.0, was essentially a digital bulletin board where content could be publicly posted. It was Web 2.0, circa 2004, that opened the floodgates to rapid co-evolution in social practices, economic opportunities, cultural life, and associatively, language form and function. Web 2.0, or the *social web*, gave birth to social media forums, for example, Facebook, LinkedIn; collaborative authoring tools or wikis; video-streaming services, such as YouTube; and novel sites for information sharing and social networking, such as blogging, podcasting, and microblogging, for example, Twitter. The evolving capacity for integrated artificial intelligence in Web 3.0 enabled *smart technologies* that yield new functionalities. These include *conversational digital agents*, AI programs that use natural human language in spoken form to respond to user voice or text inquiries in environments such as mobile phones, business call centres, and GPS systems in cars.

#### *7.3.1 DIY Language Norms and Conventions*

Historically, lexical and grammar conventions were based on literature, which, having met publication standards, provided the guide rails for language standards. Dictionaries and grammar books encapsulated accepted spelling, grammar, punctuation, and lexis. Figure 7.1 provides an amusing excerpt from an eighteenth-century grammar of English, which would be met with general hilarity by even the most tolerant of language teachers today.

**Fig. 7.1** An excerpt from Henson's grammar (1744)

The established quality controls (and built-in biases) of commercial publication, however, are far less influential in an interactive digital environment enabling self and open publication, collaborative authoring and editing, and user-developed generic conventions, co-evolving with technological affordances, yielding novel spelling norms in texting (*txting*), for example, brb, omg; tweet grammars (280 characters, incorporating semantic punctuation such as # and @ as well as other digital media), and suchlike. These new forms challenging the standard spelling, punctuation, and grammar relied on in schooling, however, are *user-driven*, emerging, ironically, from the same historical basis as literary and grammatical standards: *publication*. This creates a conundrum for formal education, which educates learners for tomorrow, not yesterday, because the stuff of literate learning that underwrites formal education continues to draw, in the main, from yesterday's norms and standards.

#### *7.3.2 Language in Mobile Digital Context*

How exactly is language outgrowing print era borders? In an ongoing study of evolutionary changes in language form and use co-evolving with digital technologies,<sup>1</sup> our team is documenting a number of key linguistic evolutions permeating everyday communication, including but not limited to:


These and other evolutions in communication have co-developed with technological advancement, generating new genres (e.g., fanfic, unboxing), discourses (e.g., #photooftheday, comments sections in social media), and cultural practices (e.g., massive multiplayer online role-playing games [MMORPG]; blogging; podcasting), each of which grows corresponding conventions (e.g., #, lol, emoji). These become norms within the generating genres, though given the speed of technological change, the norms, too, are in a state of change; for example, the microblogging site Twitter has gone from 140 characters to 280 characters per tweet. Interestingly, some conventions have jumped generic borders, such as hashtags. For instance, #MeToo as a topic heading has migrated across microblogging and social media forums to protest signs and newspaper headlines, and even into speech, pronounced: "hashtag me too."

Given the plethora of novel language forms being generated in digital fora, appropriate theorizing of what constitutes *language* and *literacy* for learners in this day and age is urgently needed.

# 7.4 Theorizing Multimodal Communication: Two Views

"'Multimodality' names the field in which semiotic work takes place, a domain for enquiry, a description of the space and of the resources that enter into meaning in some way or another," states Kress (2011: 38). He further asserts that analysing multimodal discourse "needs to encompass all modes used in any text or text-like entity, with each described both in terms specific to its material and historical affordances and in terms shared by all modes" (2011: 38). This description signals the underlying complexity of multimodality as theorized in social semiotics and indicates significant overlap with Elleström's intermediality theory. There is commonality in conceptual range, including the modalities of basic media (material, spatiotemporal, and semiotic modalities though less of sensorial), as well as qualified (historical, social, cultural) aspects of media. Epistemologically, however, the approaches vary considerably.

*Social semiotics* is a theoretical perspective derived from Michael Halliday's (1978) systemic functional linguistics (SFL), which theorized language in social use. Halliday's SFL departed from prior structuralist theories of language form, which explained language as abstract structure, detached from social use. The concept of social semiotics was further developed by Hodge and Kress (1988), predating the evolution of digital communication. Kress' (2000, 2003, 2009, 2011) continued work on social semiotics, multimodality, and literacy provides a dominant theoretical lens for understanding the innovative multimedia texts that have evolved over the digital era.

In a social semiotics paradigm of multimodality, *mode* is the basic concept, indexing the means of semiotic representation. *Media* refers to technologies of dissemination (Bezemer and Kress 2008; Jewitt 2004; Kress 2005). Multimodal texts are designed and "composed of different modes, resting on the agentive semiotic work of the maker of such texts" (Kress 2011: 36). Modes are typically exemplified rather than analytically explained, for example:

If, going one step further, we compare a contemporary textbook with 'pages' on the Web dealing with the 'same' issues, we see that modes of representation other than image and writing—moving image and speech for instance—have found their way into learning resources, with significant effect. (Bezemer and Kress 2008: 167)

The affordances of modality can become exceedingly complex:

It varies in line with the affordances of each mode: here in a contrast of speech and image—of lexis vs depiction; of possession vs proximity or distance, of centrality or marginality; as a verb-form vs spatial co-location; sequence (as temporal succession in speech or linearity in writing) vs simultaneity (of appearance and arrangement) of the entities. (Kress 2011: 45)

In Elleström's (2020) conceptualization, multimodality is a feature that helps to define intermediality. His theorizing follows Mikko Lehtonen's explanation that "multimodality always characterises one medium at a time. Intermediality, again, is about the relationships between multimodal media" (cited in Elleström 2020: 41). Elleström explains, "intermediality is about the relationship between media having a multitude of vital traits, or modes" (2020: 41). Four modalities, that is, types of modes, form indispensable cornerstones of all media: *material*, *spatiotemporal*, *sensorial*, and *semiotic*. Together they build in physicality, perception, and cognition.

*Media* in a social semiotics paradigm may index what Elleström categorizes as *technical media of display* (2020: 33–40). However, a discussion of technical media invites an ontological lens on what constitutes technology, which exceeds the purview of this article, calling into question the relationship of qualified (socio-historical aspects of media) and technical media of display. My reluctance to do this, in brief, follows Lawson's (2008: 48) argument that technology is "irreducibly social." Bijker (2010), speaking from a social construction of technology theory, elaborates on how detached views of technology are impossible. In his words, "technology is socially (and politically) constructed; society (including politics) is technically built; technological culture consists of sociotechnical ensembles" (Bijker 2010: 72). Disconnecting technical media from their contextualizing socio-history is, thus, specious.

# 7.5 Modality, Mode, and Media in Digital Communication

Digital text creation is a process of multimedia design, not simply alphabetic encoding, and it cannot be taught as if it were. This is not a new realization, as can be seen in Kress' (2000: 339) prediction two decades ago:

The semiotic modes of writing and of image are distinct in what they permit, that is, in their affordances. Image is founded on the logic of display in space; writing (and speech even more so) is founded on the logic of succession in time. Image is spatial and nonsequential; writing and speech are temporal and sequential. That is a profound difference, and its consequences for representation and communication are now beginning to emerge in this semiotic revolution.

Multimedia design, however, is far more complex now than indicated then. Worth commentary is that Kress is on record (Kress and van Leeuwen 1996) describing the evolution of alphabetic writing from images—the linkage being very much still evident in logographic systems, such as Mandarin Chinese. So how can the logic of writing be so polarized from the logic of image, either in process—which in both cases requires temporal logic: you have to write or create the product—or in product—which, depending on the kind of image and kind of writing, may be spatial in orientation but is likely to be more complexly intertwined with temporal and perhaps even tactile perception? In terms of writing, consider subtitles at an opera, electronic highway signs, moving advertising on rotating e-bulletin boards. These require timed spatial perception. In terms of image, consider sculpture, which requires spatial perception but also requires movement around the sculpture, and thus is temporal in perception, too. Also sculpture may invite tactile perception as part of the sensory experience.

Text composition today engages and remixes graphic resources, sound files, and moving images from gifs to videos into alphabetic text in a keystroke. Elemental semiotic resources, such as pictures and letters, are essentially mashable by virtue of their shared materiality: the pixel. Mobile smart devices embed a portable digital toolkit for designing, producing, and sharing multimedia texts. Making sophisticated multimedia texts on a smartphone is easy-as-pie.

Language education requires principled conceptualization of the semiotic elements being functionally captured and remixed in contemporary digital communication so they are included pedagogically. Given the immense changes occurring in digital language form and function, gaining a full understanding of how multimedia textual production varies from linear alphabetic writing is a vast project. In the following section, two radically novel digital twists to language today are examined through Elleström's intermediality lens: the inclusion of *emoji* in digital writing and the incorporation of AI interlocutors, or conversational digital agents, in spoken communication in digital environments such as mobile phones.

#### *7.5.1 Emoji*

Language learners routinely expect vocabulary and grammar to constitute basic language learning. However, the question "What is a word?" must be asked in an environment where novel word-like formations that have no pre-digital era precedent, including hashtags (e.g., #BlackLivesMatter) and emoji, have taken on a life of their own. This section examines emoji as a novel element in digital R/W vocabulary.

Pardes (2018) describes *emoji* as "tiny, emotive characters … [which] represent the first language born of the digital world, designed to add emotional nuance to otherwise flat text." Emoji developed in Japan following early digital chat play with emoticons using the ASCII keyboard, for example, :-). Japanese emoji were seen as a valuable contribution to digital platforms and adopted by tech giants in the early 2000s, who then petitioned the Unicode Consortium for their inclusion in the Unicode Standard (Pardes 2018). This stabilized emoji for universal keyboard use in terms of the binary code computers use.
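What stabilization in binary code means in practice can be illustrated with a minimal Python sketch. It is offered only as an illustration, not as part of the study reported here: the three sample characters and the helper name `describe` are assumptions chosen for the example. Each character, whether letter or emoji, has one standardized code point and one fixed byte sequence in the UTF-8 encoding.

```python
import unicodedata

# Sample characters chosen for illustration only (an assumption, not from the chapter):
# a Roman letter, an emoji, and a public-sign pictogram.
SAMPLES = ["a", "\U0001F600", "\u267B"]


def describe(char: str) -> dict:
    """Return the Unicode name, code point, and UTF-8 bytes of a single character."""
    return {
        "name": unicodedata.name(char),
        "code_point": f"U+{ord(char):04X}",
        "utf8_hex": char.encode("utf-8").hex(" "),  # bytes.hex(sep) needs Python 3.8+
    }


for ch in SAMPLES:
    print(describe(ch))
```

Because the code point is fixed by the standard, any device that supports it can store and transmit the same emoji, even though the glyph drawn on screen differs from platform to platform.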

Emoji are pictograms, as are many public signs, for example, the recycling symbol, indexing "recyclable." Pictograms differ slightly from logograms, which are words encoded in a script; for example, 你好 in traditional Chinese, pronounced *nǐ hǎo* (literally "you good"), encodes the greeting "hello," which in English requires five letters. You can speak a logogram but not a pictogram. As such, emoji exist only in literate form; they do not directly encode a spoken form, though they can be translated: the familiar smiling emoji, for example, is read as a *smiley face*, though it is just one of several smiley variants.
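The distinction between logogram and pictogram has a loose parallel in how the Unicode Standard classifies characters, sketched below in Python. The parallel is an illustrative assumption rather than a claim made in the intermediality literature, and the helper name `script_role` is hypothetical. Unicode assigns Chinese characters a *letter* category, alongside alphabetic letters, while emoji and sign pictograms fall into a *symbol* category.

```python
import unicodedata


def script_role(char: str) -> str:
    """Map a character's Unicode general category to a coarse role label."""
    category = unicodedata.category(char)  # e.g. 'Ll', 'Lo', 'So'
    if category.startswith("L"):
        return "letter"
    if category.startswith("S"):
        return "symbol"
    return "other"


# Sample characters are illustrative choices, not taken from the chapter's tables.
for ch in ["a", "你", "\U0001F600", "\u267B"]:
    print(ch, unicodedata.category(ch), script_role(ch))
```

The classification is coarse: it says nothing about pronunciation, only that the encoding standard itself treats writing-system characters and pictographic symbols as different kinds of unit.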

Emoji utilize an elemental iconic keyboard, which makes an emoji as easy to insert into written text as a letter (e.g., *a* or *b* or *c*). The emoji keyboard is analogous to the *qwerty* keyboard as *technical* interface (used with a screen as *technical medium of display*). The *qwerty* keyboard has been a primary technical interface between writer and text since its mechanically driven design in the nineteenth-century manual typewriter (Noyes 1983), used with paper. There are parallel keyboards for languages, for example, accented Roman (French, Swedish); non-alphabetic scripts (Chinese, in traditional or simplified logograms); and emoji, which can be imported into text-making at the touch of a button on a smartphone (see, e.g., Fig. 7.2).

Given that all basic media are characterized by four modalities—material, spatiotemporal, sensorial, and semiotic—and that a medium may demonstrate multimodal characteristics within a single modality, how might emoji be modally described? As emoji exist only in literate form, they are visually perceived, spatially interpretable symbols that are encoded materially in pixels, using a keyboard as technical interface and screen as technical medium of display. Consider the attempt to parse these complementary modalities in Table 7.1. It is critical to state up front that this untangling of the modalities of emoji is intended to illustrate the complexity and relative complementarity of media, not to deconstruct emoji into isolatable component parts that do not blend into each other. Parsing is intended to highlight more and less prominent modalities in interpretation.

Media and mediation are exceedingly complex in today's communication landscape. A media product or medium exists in historical-social-cultural space as well as physical-sensorial-cognitive space. According to Elleström, "qualified media" are "media types, which depend on history, culture and communicative purposes" (2020: 57). A more delineated understanding of emoji in contemporary communication comes to light in the qualified aspects of the medium.

**Table 7.1** Basic and technical media of emoji

Emoji are media products; they are also a media type that is endemic to informal texting and digital chat environments. Despite their immense cross-cultural popularity and widespread use, they have not migrated to more formal writing environments. The qualified aspects of emoji help to flesh out a broader understanding of emoji as semiotic resource, going some distance in explaining the reluctance of formal educational infrastructures to absorb this element of contemporary communication into language study (see Table 7.2).

**Table 7.2** Basic, qualified, and technical media of emoji

Intermediality is, of course, not neatly packaged in a grid, as the prefix *inter-* suggests: each aspect of media seeps into and colours complementary aspects. Hence, the unreality of separating technical media of display from social and historical context despite the practicality of knowing the intended or most suitable device. Nonetheless, creating a constituent media analysis of emoji is a valuable exercise for understanding how and where this literate innovation works in digital composition. Let us now turn to a more nebulous digital phenomenon: the surreptitious permeation of AI into digital conversation.

#### *7.5.2 Conversational AI*

Of keen interest in understanding communication in mobile digital context is the emergence of the conversational digital agent. The expansion of voice recognition software into the sophisticated conversational digital agent historically coincides with the release of smart mobile devices, which, because of their limited screen size, benefitted from a voice to provide assistance rather than relying on tiny written instructions (Pinola 2011). The integration of global positioning system (GPS) receivers in smart devices enabled digitally voiced navigation. AI voices are also incorporated incognito in language teaching apps.

How might a disembodied digitized computer voice be described in terms of the four modalities of basic media? To hear a voiced message from a disembodied computer-activated conversational agent, switch on the voiceover accessibility feature on a computer; put a question to your in-phone conversational digital assistant (e.g., Siri); or plug in a destination on a GPS device and enable voiceover directions. This will result in a media product that could be basically parsed as in Table 7.3.

Digital voices are, of course, a vehicle for a linguistic message, which would require elegant semantic and structural delineation within the semiotic category. This analysis can rely on the extensive attention paid to how language conveys meaning to whom, when, where, why, how, and so on that is entailed in linguistic theories. However, the basic media product of the voice of the conversational digital agent (as well as less complex AI *chatbots*) is also a media type. The digital conversational agent has permeated not simply auditory media but also robotic shapes. In the spoof Amazon advertisement about the integration of the Amazon conversational agent into the Internet of things, as shown during the 2019 Super Bowl (American football), Alexa, the voice-activated digital agent, is capable of, among other things, interpreting dog bark commands ordering dogfood.2 Though very silly, the advertisement showcases the voices of male and female humans (speaking English), a dog, and a digital agent with attendant media properties, semantic comprehensibility, and so forth.

**Table 7.3** Basic media modalities of conversational digital agent voice

Quite apart from the exceedingly complex technical embedding of AI in media products that are more multifaceted (a smart mobile device, a refrigerator, a thermostat, and so forth), the emergence of the conversational digital agent in language learning is a qualified media question mark. In the messy and dispute-ridden world of education, formal and nonformal, the contextualizing aspects of media may be a total dealbreaker. Whereas novel digital genres and discourses have reshaped communication practices, normalizing activities such as social media posting (Facebook, Instagram), blogging, videologging (YouTube), microblogging (Twitter), and texting (instant messaging), to mention just a few, the uptake of such discourses and their conventions in formal language and literacy teaching contexts has been spotty. Some schools encourage a bring-your-own-device (BYOD) approach to students' using their personal mobile phones for learning; others expressly forbid mobile devices in class. In some classroom contexts, designing new media texts is encouraged, whether using institutional or personal devices; in others, curricula default to the conventions of static print media, stuck in *speaking–listening–reading–writing* skills. An analysis of the digital conversational agent in terms of qualified media in addition to the complex morass of the technical display in our parsing exercise might look as in Table 7.4.


**Table 7.4** Basic, qualified, and technical media of conversational digital agent voice

This simplified grid is intended to spotlight modalities of media in terms of their prominence in the media product, not to cleanly disambiguate modalities from each other, as previously disclaimed. Qualified aspects of digitized voices, such as their legitimation for an educational activity, may colour the acceptability of modalities such as voice quality as well as semiotic aspects of the accent programmed into the digitized voice, and vice versa. The following discussion on materiality indicates the mammoth complexity of a disembodied digitized voice.

### *Language from Whose [sic] Perspective: The Complex Materiality of AI Communication*

Given the infusion of conversational digital agents in daily communication practices, from following voiced GPS navigation in the car to asking Siri for help on the iPhone to including the robotic Amazon digital assistant, Alexa, in family conversations at home, we are justified in asking: Is language still human?

Interestingly, humans using voice-activated digital assistants often assign human genders to them (e.g., "Thank you, lady in the computer!"; "She'll tell you when to turn"). These voices, though, are programs: circuits, not people, despite the fact that the original mediating material was produced by humans prior to being subjected to complex technological remediation. Peña and James (2016, 2018), writing about glitch art and pedagogy, problematize materiality in digital transmediation across sensory domains, making the case that computers can interpret and create sounds that are only partially interpretable to humans.

Human talk and text do not travel through computer circuits in humanly recognizable form. Computers function on a binary 0-1 code. The materiality of human–computer–human communication loops with digital conversational agents, such as Siri or Alexa, has complex technological layers, tapping a *material* modality that is a veritable iceberg reaching into depths that are only partially interpretable to human interlocutors.
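As a small illustration of this point, the following Python sketch shows what a short written message looks like once it is reduced to binary code. It is only a sketch under stated assumptions: it assumes the text is encoded as UTF-8 (one common encoding among several), the helper names `to_bits` and `from_bits` are hypothetical, and it says nothing about the many further layers (audio capture, speech recognition, speech synthesis) that an agent such as Siri or Alexa involves.

```python
def to_bits(text: str) -> str:
    """Return the UTF-8 encoding of `text` as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Invert to_bits: turn space-separated 8-bit groups back into text."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


message = "Hi"
encoded = to_bits(message)
print(encoded)             # 01001000 01101001
print(from_bits(encoded))  # Hi
```

The two-letter greeting becomes sixteen binary digits that few human readers could interpret unaided, yet the program recovers the original text exactly. Under the same assumption, a single emoji such as 😀 occupies four such 8-bit groups.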

So the answer to "Is language still human?" is convoluted: computer programs were originally programmed by humans—though whether this continues to be true in Web 3.0 environments is another question.

#### 7.6 Conclusion: From ABCs to Intermediality

The speed of development in literate conventions, genres, texts, and discourses as they mutually develop with socio-technical advancement has not been matched in formal language and literacy instruction, which still tends to prioritize socio-historically and politically sedimented literate forms. One problem in moving conceptions of writing into multimedia composition is that rapid changes in previously understood constancies, such as spelling, are *user-driven* in the digital age.

At the beginning of the school year, children go off to school to learn their ABCs. At least that is what used to happen and indeed is still the idea parents have in their heads. It is not so different with adult (and school) learners of second languages. But times have changed. Everyday digital communication practices, such as social media posting (e.g., Facebook, LinkedIn), microblogging (Twitter), videologging (e.g., YouTube), and photography-centred posting (e.g., Instagram, Pinterest), are co-encoded with emojis and other graphic and sound media, such as photography, film, music, random sounds, gifs, and on and on. These forms are co-encoded with alphabetic symbols that can also be voiced, by software programs.

Conventions evolving in digital forums have revolutionized concepts as basic as the word: novel word-like forms such as the hashtag (for example, #TimesUp and #foodporn), which do not follow historical semantic or structural word formation patterns, have crossed from digital environments into print media. Similarly, AI, which is commonly used for narrating GPS directions or giving help instructions on mobile device screens, is also being programmed into language learning apps.

Given this reality, language professionals and educational policy makers need to be aware of how media resources work to create texts and textual meaning from an arts-based as well as a language-based perspective. Elleström's (2020) intermediality paradigm offers a much-needed vector for analysing multimedia communication from an arts perspective that can be integrated with linguistic theories to understand truly digital language as it is evolving.

This chapter undertook a comparative look at *modality*, *mode*, and *media* through a social semiotic and an intermediality lens to clarify the complex task of understanding the contribution of different semiotic resources in multimedia textual products. The raw data on new conventions in digital language use are from an extensive literature review of how mobile digital access has affected linguistic communication. This review informs current exploratory research to build appropriate mobile production pedagogies for language learners, using, rather than ignoring, today's communicative potential.

Two innovative communicative features that evolved with advances in communications technologies were selected for analysis using an intermediality lens: the use of emoji in textual products and spoken communication with conversational digital agents. Though it must be said that emoji are unlikely to make a breakthrough into academic writing any time soon, other visual-spatial lexical innovations in digital communication have already begun to cross from specific digital forums into print manifestations, such as #hashtag topics: forms that do not cohere with lexical borders or alphabetic principles. The AI voice anonymously joins us in everyday communication. Many questions ensue.

#### Notes


#### References


Elleström, Lars. 2010. The Modalities of Media: A Model for Understanding Intermedial Relations. In *Media Borders, Multimodality and Intermediality*, ed. Lars Elleström, 11–48. Basingstoke: Palgrave Macmillan.

———. 2020. The Modalities of Media II: An Expanded Model for Understanding Intermedial Relations. In *Beyond Media Borders: Intermedial Relations among Multimodal Media, Volume 1*, ed. Lars Elleström, 3–91. Basingstoke: Palgrave Macmillan.


Kress, Gunther. 2009. *Multimodality: A Social Semiotic Approach to Contemporary Communication*. London: Routledge.

———. 2011. Multimodal Discourse Analysis. In *The Routledge Handbook of Discourse Analysis*, ed. James Paul Gee and Michael Handford, 61–76. London: Routledge.

Kress, Gunther, and Theo van Leeuwen. 1996. *Reading Images: The Grammar of Visual Design*. London: Routledge.

Lawson, Clive. 2008. An Ontology of Technology: Artefacts, Relations and Functions. *Techné: Research in Philosophy and Technology* 12 (1): 48–64.



# Index

#### **A**

Actor, 14, 25, 96, 97, 101, 113–118, 120, 121, 123, 125, 127–131, 138, 144, 183, 201 *See also* Actress; Performer Actress, 129, 154, 157, 158 *See also* Actor; Performer Adaptation, 121, 137, 165 *See also* Transmediation Affection, 61, 81, 130 Almodóvar, Pedro, 147 Analogue, 117, 142, 143, 145, 151, 157, 159, 165 *See also* Digital Andrew, Dudley, 83 Architecture, 62, 77, 96–98, 107–109, 111, 114 Artificial intelligence (AI), 158, 219, 223, 224, 228, 231–236 Assayas, Olivier, 157 Atwood, Margaret, 137 Audiobooks, 198–214, 219

Audio-visuality, 180 Aural, *see* Hearing, sense of Auslander, Philip, 123

#### **B**

Bailey, Brett, 129, 135 Bal, Mieke, 42 Barthes, Roland, 143 Basic media, *see* Media types Bateman, John, 6, 8, 42, 114, 119, 138, 192n1 Benedetti, Jean, 120 Berlo, David K., 20, 25 Bezemer, Jeff, 218, 226 Bijker, Wiebe E., 227 Birkerts, Sven, 202 Blogs, 201, 223, 225, 233 Body body language, 63, 76, 222 embodiment, 114 Boenisch, Peter M., 98, 99

Note: Page numbers followed by 'n' refer to notes.

Bolchini, Davide, 13 Bolter, Jay D., 100, 201 Bordwell, David, 180, 186, 188 Brewer, William F., 27 Brooker, Charlie, 134 Brown, William, 176 Bruhn, Jørgen, 73, 86, 114, 142, 144, 148, 167 Bruno, Giuliana, 142, 143, 153 Buckland, Warren, 154, 155, 167n1, 168n3 Butterworth, Jez, 99

#### **C**

Chapple, Freda, 77, 98 Chateau, Dominique, 143, 161 Cinema, v, 60, 65, 95, 107, 143, 152, 157, 162–164, 167, 168n4, 176, 177, 179, 182, 183, 190, 191, 203 *See also* Film Clüver, Claus, 9, 34, 75 Coen, Ethan, 153 Coen, Joel, 153 Cognition, 12, 13, 43, 51, 85, 115, 116, 127, 152, 199, 226 Cognitive import, vii, 12, 13, 16, 17, 19–27, 29, 33, 38, 39, 46–48, 50, 51, 53, 54, 62, 70, 74, 75, 79–81, 96, 100, 103, 105, 106, 110, 116, 144, 145, 148, 152, 153, 155–157, 159, 166, 168n11, 169n12, 182 Colman, Olivia, 129–131 Comics, 16 Communication, 27–33, 40, 50, 81, 101, 105, 106, 110, 153–157 extracommunicational domain, 27–33, 40, 50, 81, 101, 105, 106, 110, 153–157 intracommunicational domain, 27–31, 33, 50, 81, 105, 106, 110, 155, 156

Cramer, Florian, 141, 161 Cronenberg, David, 147, 154, 160 Crossley, Mark, 99, 100, 137, 144, 145, 167n2 Cyrus, Miley, 129, 134, 135

#### **D**

Dalsgård, Anne Line, 212, 213 Davis, Deborah, 129 Deleuze, Gilles, 177 Derrida, Jacques, 143, 144 Diegesis/diegetic diegetic camera, 134, 176, 183, 185 diegetic screen, 152, 153, 155–156, 163, 165 diegetic world, 143–147, 152, 155–158, 160, 163–167 intra-diegetic, 152, 157, 176, 192 metadiegetic, 152–154, 156–159, 164 Digital, 97, 98, 110, 117, 142, 143, 145, 148, 151, 152, 156–160, 162, 165, 169n14, 175–192, 217–236 *See also* Analogue

#### **E**

Education, xi, 9, 220, 221, 224, 228, 233 *See also* Learning; Teaching Eikhenbaum, Boris, 67 Ekphrasis, 82 *See also* Media representation Elleström, Lars, v–vii, 5, 10, 16, 20, 21, 29, 31, 46, 50, 70, 71, 73, 74, 80, 81, 86, 96–101, 103–105, 109, 114–119, 123–127, 137, 142, 144–146, 150, 152, 154–156, 158–160, 162, 164, 166, 167, 168n8, 168n10, 169n11, 169n12, 177, 181, 182, 198, 199, 204, 210, 218, 219, 222, 225, 226, 228, 229, 235 Elsaesser, Thomas, 142, 143 Email, 24, 57, 60, 96, 206 Emoji, 228–231, 235, 236 Enns, Anthony, 169n13 Extracommunicational domain, *see* Communication

#### **F**

Fiction, 62, 134, 137, 147, 183 Fictionality, 104, 136, 137, 152, 155, 158, 175, 178, 183 Film, 5, 96, 98, 114, 141–167, 175, 203, 235 *See also* Cinema Focalization, 148

#### **G**

Garland, Alex, 156 Garncarz, Joseph, 60 Gaudreault, André, 60, 157, 162, 169n14 Genette, Gérard, 153, 154, 156, 158 Genre, 64, 78, 82, 99, 104, 129, 147, 168n10, 176, 180, 181, 187, 222, 224, 225, 233, 235 *See also* Submedium Georgi, Claudia, 99, 100, 103, 107, 109 Gershon, Ilana, 219 Gesture, 4, 13, 15, 24, 29, 36, 37, 42, 47–49, 53, 63, 76, 77, 79, 81, 98, 114, 117, 121, 147 Gjelsvik, Anne, 142, 144 Graphic novel, 222 Grusin, Richard, 100, 201

#### **H**

Hall, Stuart, 10–13, 20, 22, 23, 26 Halliday, Michael A. K., 225 Hanson, Curtis, 147 Have, Iben, 198–200, 204, 209–213, 219 Hearing, sense of, 44, 219 Helle, Helle, 200, 203–206, 208–213 Heteromediality, 73–75, 85, 142, 148 Hypermediacy, 99–101, 103

#### **I**

Iconicity, 21, 51, 52, 58, 67, 69, 70, 82, 85, 135, 152 *See also* Icons; Sign types Icons, *see* Sign types Ihde, Don, 203, 204 Illustration, 137, 222 Images, v, 6, 7, 38, 41, 42, 45, 47, 51, 52, 55, 56, 59, 66, 68, 69, 77, 78, 105, 106, 110, 114, 141, 142, 144, 147, 150, 152, 154, 155, 157, 159–162, 164, 166, 169n14, 175–180, 182–184, 186, 190, 192, 199, 208, 218, 222, 226–228 *See also* Icons; Sign types Immediacy, 100, 103, 190–192 Indexicality, 21, 51, 69, 152 *See also* Indices; Sign types Indices, *see* Sign types Innis, Harold A., 35 Internet, 188, 201, 206–208, 232 Interpretant, *see* Sign constituents Intracommunicational domain, *see* Communication

#### **J**

Jackson, Shannon, 119, 120 Jacobs, Jane, 108 Jakobson, Roman, 10–12, 26, 44 Jewitt, Carey, 114, 218, 226 Johnson, Mark, 13 Johnston, John, 160 Jonze, Spike, 158 Journalism, 185

#### **K**

Kattenbelt, Chiel, 77, 97–99, 109 Kirby, Michael, 122, 123, 125 Kittler, Friedrich A., 160, 166 Koepnick, Lutz, 209 Krämer, Sybille, 149, 161, 162 Kress, Gunther, 6, 8, 42, 114, 115, 218, 225–227

#### **L**

Laban, Rudolf, 123, 124 Language, vii, 6, 12, 16, 22, 37, 42–45, 52, 53, 56, 57, 60, 63, 75, 76, 83, 200, 209, 213, 217–236 *See also* Sign types; Symbols; Texts; Verbal Lanthimos, Yorgos, 129 Lavender, Andy, 85, 100, 101, 126–127, 135, 144, 152, 169n12 Lawson, Clive, 227 Learning, 4, 178, 183, 217–236 *See also* Education; Teaching Leeuwen, Theo van, 6, 8, 42, 227 Lehtonen, Mikko, 41, 43, 226 Lessing, Gotthold Ephraim, 4 Linguistics, 7, 9, 10, 22, 41, 44, 70, 168n6, 179, 213, 218, 219, 222, 224, 225, 232, 235, 236 Literacy, 219–222, 225, 233, 235 Literature, 6, 9, 45, 58, 59, 66, 67, 82, 110, 167, 198, 199, 201, 202, 207, 210, 213, 214, 218–221, 223, 235 Ljungberg, Christina, 66

Lotherington, Heather, 114, 168n6, 168n9, 217, 221, 236n1 Lu, Amy Shirong, 13 Lund, Hans, 75 Lutas, Liviu, 144, 158, 168n7 Lynch, David, 160, 161

#### **M**

Mankiewicz, Joseph L., 147 Manning, Paul, 219 Maps, viii, 125, 184 Marion, Philippe, 60, 157, 162, 169n14 Materiality, viii, xii, 4, 13, 19, 35, 45, 49, 52, 56, 63, 65, 68, 70, 77, 78, 85, 99, 100, 109, 147–150, 165, 179, 206, 210, 228, 234 McIntosh, Kate, 99, 103, 105 McKenzie, Jon, 125 McLuhan, Marshall, 14, 16, 53, 84, 176 McNamara, Tony, 129 Media borders, xii, xiii, 8, 66–73, 82, 85, 86, 164, 165, 167 Media characteristics, 80–83 Media integration, xii, 73–84 Media modalities material modality, 20, 44, 47, 102, 109, 145, 147, 149, 151, 155, 234 semiotic modality, 20–22, 44, 46, 49, 52, 57, 66, 67, 78, 115, 146–147, 152–159, 225, 226, 229 sensorial modality, 20, 22, 44–47, 49, 107, 109, 115, 116, 145–152, 156, 165, 225, 226, 229 spatiotemporal modality, 20, 22, 44, 46–49, 66, 109, 115, 145–152, 165, 225, 226, 229

Media representation, 81, 83 Media transformation, xii, 73–84 Media translation, 73–84 Media types, xiii, 8, 9, 26, 46, 47, 53, 55–59, 64–69, 71, 72, 74, 77–79, 83, 98, 115, 145–160, 164, 225, 229, 232 basic media types, xiii, 9, 54–60, 64–67, 69, 71, 72, 74, 77–79, 83, 145–160, 164 qualified media types, 9, 54–67, 71, 72, 74–78, 80–85, 117, 145, 155, 164, 165, 167, 168n10 Mediation, v–ix, 14, 38–40, 44, 47, 49, 50, 53, 57, 97, 100, 105–107, 110, 114, 115, 118, 154, 166, 168n11, 175–192, 201, 210, 229 Medium specificity, v, 58, 144, 160 Mendelssohn, Moses, 44 Mendes, Sam, 154 Metalepsis, 158, 168n7 Mind perceiver's mind, 12, 13, 17–21, 23, 24, 26–29, 33, 38–40, 44, 46, 47, 49–52, 54, 57, 61, 71, 80, 96, 100, 103–106, 116, 144, 146, 152, 155, 168n11, 182 producer's mind, 12, 13, 15, 17, 19, 23, 24, 26, 27, 33, 35, 36, 39, 46, 96, 100, 104–106, 116, 144 Mitchell, W. J. T., 5, 42, 45, 53, 67, 73, 75 Modalities, v, 4–86, 97, 113–118, 124–137, 141–167, 199, 218, 227–234 *See also* Media modalities Modality modes, 41–55, 59, 63–65, 67–69, 72, 74–76, 80, 81, 85, 141–167 Modes, 65, 97, 113–118, 124–137, 141–167, 178, 218, 227–234 *See also* Modality modes

Müller, Jürgen E., 34, 63, 84 Multimedia, 75, 218, 219, 221, 223, 226–228, 235 Music, 4–6, 22, 28, 36, 42, 45, 47, 49, 51–53, 57–59, 62, 66, 67, 77, 78, 81, 98, 119, 135, 160, 200, 207, 213, 235

#### **N**

Narration narrative, 22, 29, 42, 79, 82, 83, 100, 128, 129, 135, 143, 145, 146, 150, 153, 154, 156–158, 160, 162, 167, 168n4, 175–192, 192n1, 201, 204–207, 210, 211, 213, 214 narrator, 205, 209–211, 213 Nelson, Robin, 104 Newell, Kate, 137, 144, 165

#### **O**

Object, *see* Sign constituents Odin, Roger, 143, 157, 158, 163 Ong, Walter J., 202, 219 Opera, 227 Oplev, Niels Arden, 155 Orality, 219 *See also* Speech Östlund, Ruben, 163

#### **P**

Paech, Joachim, 142, 143, 159, 168n5 Palma, Brian de, 147, 188, 189, 191 Pardes, Arielle, 228 Participation, 66, 101, 103–106, 122, 156, 209 Pavis, Patrice, 104, 109, 167n2 Pedersen, Birgitte Stougaard, 198–200, 204, 209–213, 219

Peirce, Charles Sanders, 21, 22, 27, 39, 42, 44, 50, 51, 69, 74 Perception, xii, 7, 8, 13, 17, 18, 20, 23, 25, 27, 28, 31–33, 38, 39, 43, 49, 50, 68, 70, 85, 103–105, 107, 110, 114, 116–118, 137, 148, 149, 154, 155, 168n4, 176, 177, 180, 182, 190, 192, 226–228 Performance, 5, 96, 113, 118–138, 145 Performer, 62, 101–104, 106, 113, 117, 119–124, 127–129, 132, 133, 136, 138 *See also* Actor; Actress Pisters, Patricia, 176, 188 Print, 120, 218, 224, 233, 235, 236

#### **Q**

Qualified media, *see* Media types Qualifying aspects contextual qualifying aspects, 60, 63, 65, 159, 160 operational qualifying aspects, 60–65, 162

#### **R**

Radio, 49, 67, 79, 114, 117, 119, 187, 201 Rajewsky, Irina O., 9, 13, 66, 71, 74, 79, 86 Reading, 29, 67, 118, 124, 125, 130, 160, 165, 198–214, 218, 220, 222, 233 Remediation, 201, 202, 214, 219, 234 Representamen, *see* Sign constituents Representation, 22, 30, 32, 38–40, 44, 49, 51, 52, 57, 68–70, 74, 81, 83, 97, 106, 107, 115, 116, 123, 137, 154, 155, 157–159, 166, 168–169n11, 178, 181–183, 190, 192, 206, 208, 218, 226, 227 Rice, Ronald E., 14, 58 Rimmon-Kenan, Shlomith, 5 Rist, Pippilotti, 142, 147 Ryan, Marie-Laure, 28, 29

#### **S**

Satrapi, Marjane, 222 Saussure, Ferdinand de, 6 Schechner, Richard, 97 Schirrmacher, Beate, 19, 168n5 Schmidt, Siegfried J., 32 Schramm, Wilbur, 10–13, 20, 22, 23 Science, 42, 51, 59, 134, 147, 176, 182 Scott, Jo, 102, 103 Scott, Ridley, 151 Screens, 15, 18, 25, 34–37, 47, 59, 65, 79, 107, 109, 131, 141–167, 176, 177, 180, 185–192, 218, 220, 222–224, 229, 231, 235 Semiotics presemiotic modalities, 68, 69, 80, 100, 146, 153, 165, 167 semiosis, 21, 22, 27–29, 31, 39, 45, 49, 57, 101 semiotic modality, 20–22, 44, 46, 49, 52, 57, 66, 67, 78, 115, 146, 147, 152–159, 204, 225, 226, 229 Sensoriality, 99, 101–104 Sequentiality, 49, 53, 220, 227 Sewitsky, Anne, 134 Shannon, Claude E., 23 Sight, sense of, viii, xiii, 6, 27, 44 Sign constituents, 21, 27, 29, 30, 32, 33, 39, 40, 69 interpretant, 21, 39 object, 21, 39 representamen, 21, 39

Sign types, 158 icons, 21, 44–46, 50, 51, 53, 56, 59, 63, 69, 77, 145, 146, 152, 160 indices, 21, 44–46, 50, 51, 53, 59, 63, 69, 77, 145, 146, 152, 160 symbols, 21, 44–46, 50, 51, 56, 59, 63, 69, 77, 145, 146, 152, 160 Simonson, Mary, 24, 105, 144 Smell, sense of, 20 Social media, 61, 81, 109, 180, 185, 187, 189, 219, 221–225, 233, 235 Social semiotics, 9, 42, 218, 225, 226, 235 Song/sing, 14, 15, 24, 40, 48, 78, 81, 83, 207 Space three-dimensional space, 46, 58, 65, 68, 145, 149, 165 two-dimensional space, 53, 58, 68 *See also* Spatiotemporality Spatiotemporality, 48, 99 Spectator, 100, 101, 114, 117–119, 128, 133, 135, 137, 143, 162, 180 Speech, 4, 13, 17, 18, 23, 42, 48, 59, 67, 76, 77, 84, 160, 200, 203, 205, 209, 210, 219, 225–227 *See also* Orality Spielberg, Steven, 147 Spielmann, Yvonne, 61 Stanislavski, Konstantin, 120, 121 Story, 80–82, 96, 128, 133, 134, 154–158, 167n1, 178, 185–190, 202, 203, 205, 207, 209, 213 Submedium, 63, 64, 78, 147, 168n10 *See also* Genre Suvin, Darko, 147 Symbolicity, 21, 22, 51, 69, 152 *See also* Sign types; Symbols Symbols, *see* Sign types

#### **T**

Taste, sense of, 42, 49 Teaching, 121, 218, 220–222, 231, 233 *See also* Education; Learning Technical media, 9, 33–38, 64–66, 76, 79, 85, 97, 105, 109, 110, 116–118, 145, 146, 152, 159, 162, 166, 167, 168n9, 226–227, 230, 231, 233 Technology, 58, 60, 65, 117, 134, 160, 168n9, 176, 180, 198–200, 203, 204, 207, 208, 212, 214, 218, 219, 223, 224, 226, 227, 236 Television/TV, v, 5, 15, 16, 20, 29, 36–38, 47, 57, 60, 62, 65, 79, 82, 84, 98, 102, 117, 119, 134, 142, 143, 146, 147, 151, 153–155, 158, 160–167, 187 Texts, 6, 98, 114, 146, 179, 198, 218 *See also* Language; Sign types; Symbols; Verbal Theatre/theater, 25, 37, 53, 65, 77, 78, 95–111, 114, 117, 119, 120, 122, 128, 129, 133, 145, 157, 158, 162, 167n2 Theme, 80, 136, 187, 204–206 Time, v, 4, 96, 115, 145, 180, 198, 212–214, 219 *See also* Spatiotemporality Touch, sense of, 42, 102 Translation, *see* Media translation Transmediality, 73–75, 79–81, 83, 85, 159 Transmediation, 81–83, 234 Truthfulness, 62, 158, 175–192 Tseng, Chiao-I, 144, 152, 158, 168n4, 176, 177, 179, 183, 192n1 Tucker Green, debbie, 129, 131, 132

#### **V**

Veltruský, Jiří, 45 Verbal, 7, 12, 22, 26, 42, 49, 51, 52, 56, 75, 78, 83, 117 *See also* Language; Sign types; Symbols; Texts Vieira, Miriam, 114 Villeneuve, Denis, 148–150 Vinterberg, Thomas, 163, 164 Virginás, Andrea, 143, 147, 151 Virtual reality (VR), 121 Virtual sphere, 29–33, 40, 44, 46, 49, 54, 57, 69, 71, 74, 76, 79–81, 101, 102, 146, 152 *See also* Intracommunicational domain Vision, *see* Sight, sense of Voice actor's voice, 117 AI (digital) voice, 231, 232, 234, 236

human voice, 219 recorded voice, 211 voiceover, 188, 232

#### **W**


#### **Z**

Zvagintsev, Andrei, 163