Eckart Voigts, Robin Markus Auer, Dietmar Elflein, Sebastian Kunas, Jan Röhnert, Christoph Seelinger (eds.) Artificial Intelligence – Intelligent Art?

**Digital Society** Volume 64

**Eckart Voigts** is a professor of English literature at Technische Universität Braunschweig. He has written, edited and co-edited numerous books and articles.

**Robin Markus Auer** is working towards a PhD as part of an interdisciplinary research project on automated creativity in literature and music at Technische Universität Braunschweig. His work focuses on the interplay between human and machine creativity in coupled embodied creative systems.

**Dietmar Elflein** (apl. Prof. Dr.) teaches popular music at Technische Universität Braunschweig. He is a member of the advisory board of the German-speaking branch of the International Association for the Study of Popular Music.

**Sebastian Kunas** is a musician, sound artist, producer and educator with a background in sub and DIY culture as well as in cultural and sound studies. He teaches electronic sound and music practice and supervises the electronic studio and the recording studio at the Faculty of Cultural Studies and Aesthetic Communication at Universität Hildesheim. He is a member of the collective ARK (Arkestrated Rhythmachine Komplexities), a changing association of artists, scholars and electronic MusickingThings.

**Jan Röhnert** is a professor of modern literature in the technical-scientific world in the Department of German Letters at Technische Universität Braunschweig. His research interests range from avant-garde poetics and cinema, autobiography and war, landscape and geopoetics, nature and wilderness writing to feminism and contemporary literature.

**Christoph Seelinger** is a research assistant in modern German literary studies at the Institute of German Studies at TU Braunschweig, where he completed his doctorate in 2021. Previously, he completed the interdisciplinary Master's programme "Culture of the Techno-Scientific World" at TU Braunschweig. His research focuses on the interfaces between film and literature, border crossings in (audiovisual) media, the connection between literature/film and the avant-garde, and the so-called "trivial culture".

Eckart Voigts, Robin Markus Auer, Dietmar Elflein, Sebastian Kunas, Jan Röhnert, Christoph Seelinger (eds.)

# **Artificial Intelligence – Intelligent Art?**

Human-Machine Interaction and Creative Practice

The project and this publication were funded by the Ministry of Science and Culture of Lower Saxony (NMWK) in the program line Niedersächsisches Vorab.

#### **Bibliographic information published by the Deutsche Nationalbibliothek**

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at https://dnb.dnb.de/

This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 (BY-SA) license, which means that the text may be remixed, built upon and distributed, provided credit is given to the author and that copies or adaptations of the work are released under the same or similar license.

https://creativecommons.org/licenses/by-sa/4.0/

Creative Commons license terms for re-use do not apply to any content (such as graphs, figures, photos, excerpts, etc.) not original to the Open Access publication and further permission may be required from the rights holder. The obligation to research and clear permission lies solely with the party re-using the material.

#### **First published in 2024 by transcript Verlag, Bielefeld**

#### **© Eckart Voigts, Robin Markus Auer, Dietmar Elflein, Sebastian Kunas, Jan Röhnert, Christoph Seelinger (eds.)**

Printed by: Majuskel Medienproduktion GmbH, Wetzlar

https://doi.org/10.14361/9783839469224

Print-ISBN: 978-3-8376-6922-0

PDF-ISBN: 978-3-8394-6922-4

ISSN of series: 2702-8852

eISSN of series: 2702-8860

Cover layout: Maria Arndt, Bielefeld, based on a design by Mirette Bakir

Printed on permanent acid-free text paper.

*This collection is dedicated to all of our contributors and to everyone who helped bring the project to fruition.*


# **Artificial Intelligence – Intelligent Art? An Introduction**

*Eckart Voigts, Dietmar Elflein, Jan Röhnert*

#### **Man-Machines from Ancient Greece to ChatGPT**

This volume offers a critique and assessment of forms and consequences of algorithmic 'creativity' that have emerged in the context of digital encoding in electronic media, appropriating, complementing, superimposing, and transforming established practices, ethical norms, as well as concepts of creativity. The interconnected digital world holds large quantities of available data and is here conceived as an ever-changing space of permanent and increasingly automated copy, transformation and adaptation.

As algorithmic data processing increasingly pervades everyday life, it is also making its way into the worlds of art, literature and music. In doing so, it shifts notions of creativity and evokes non-anthropocentric perspectives on artistic practice. Negotiating the aesthetic, cultural, and social implications of this development is an ongoing process to which this volume aims to make its contribution. While many fields of inquiry and research have responded to the recent challenges of AI developments, it is our view that the role of AI in artistic practice deserves more attention than it has received. Critical debates about the impact of artificial intelligence technology and science on societies (Manyika 2022), education (Holmes et al. 2019), economies (Brynjolfsson 2014, Agrawal 2018), politics and media (O'Neil 2016, Sudmann 2019) and militaries (Scharre 2019) have been part and parcel of the discourse since the earliest appearances of the term and its institutional implementation in the 1950s. While AI research has, therefore, focused to a large extent on useful applications in medicine and elsewhere, or the political, ethical, legal, philosophical, social, educational, military, and economic dimensions of artificial intelligence developments, only recently have cultural and aesthetic concerns come into clearer focus (Miller 2019, Manovich 2018, 2019, Zylinska 2020, Zeilinger 2021, Hageback / Hedblom 2021, Reck Miranda 2021, Schönthaler 2022, Moormann / Ruth 2023). This volume follows a similar trajectory, seeking to reconnect and reorientate the discussion of AI so that the cultural dimension of artistic practice in literature, film, art and music is taken into consideration.

Often, it seems hard to distinguish artificial intelligence technologies from the more general field of digital computation. The terminology of AI is indeed closely linked to the discourse of automation – a much wider term that includes not only the pre-digital world, but also the first advances of technologies that seek to automate (i.e. self-govern) processes. The automaton is thus a special case of a self-operating machine. The distinction is often made that automation ventures into the world of AI when it involves any application that goes beyond mere software programming and algorithms – i.e. the rule-governed world of 'if-then' – and, instead, uses machine-learned processes, modifiable algorithms and, therefore, data processing that goes beyond mere algorithmic instructions (but inevitably includes code-written rules). An AI system is, therefore, any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals.

Various debates, arguments and predictions have ensued about the future power and development of AI. AI enthusiasts led by Raymond Kurzweil (2005) predicted that AI developments will bring about what he calls 'singularity' around 2045, "the dawning of a new civilization that will enable us to transcend our biological limitations and amplify our creativity. In this new world, there will be no clear distinction between human and machine, real reality and virtual reality". In view of the fact that machines have recently been found to perform non-programmed or at least untraceable tasks (machine learning), the supposedly clear boundary between AI and HI (= "human intelligence") seems to have become fluid. The concept of "code poetry" – poetry that can be performed by humans and computers, mixing the norms of literature and coding – can be seen as a nod to the merging of humans and machines.

Most of these arguments imply several points of view on definitions of "the human" and a variety of "posthuman" or "transhuman" positions. Advances in information science and biotechnology (artificial organs, cloning, splicing, multiple parent babies, expansion of longevity, etc.) cast shadows on the idea that we can maintain clear-cut boundaries between organisms and machines. Non-human or posthuman hybrids appear, for instance, as robots, bots (automated applications), androids (humanoid robots), replicants (biorobotic androids), cyborgs ('cybernetic organisms') or clones (genetic replicas), providing new perspectives from which humanity can revisit, rethink, re-assess, and even re-construct the human body, its role, and its relation to biotechnological means. As a consequence, significant ethical discussions on human 'normality' and human exceptionalism have emerged. The frequently used qualifier, 'critical' posthumanism, is prefixed to mark the difference between posthumanism and transhumanism. On the one hand, critical posthumanism engages poststructuralist, postcolonial and feminist ideas, invariably pointing out that ideologies of Enlightenment 'liberal humanism' tended to serve white, European cis-males only (cf. Hayles 1999: xiv; Vint 2007: 12; Braidotti 2013: 50). On the other hand, transhumanism denotes a positive attitude towards technologies that promise an improvement of 'the human', countering the post-anthropological critique of false universalism with techno-utopianism. A transhumanist might develop the vision of machine adaptors improving or superseding human adaptors, whereas a critical posthumanist might assess technologies according to their contribution to overcoming the traditional focus on white, Western or European cis-males.

Since these technologies might be able to modify the human body, they fuel current anxieties and insecurities about posthumanism (cf. Vint 2007: 8, 182), which also tie in with notions of (human) identity. The idea of posthuman machine adaptors replacing human creativity is fraught with anxieties. Posthumanist discourse also illustrates that machines are always tied in with notions of embodiment, as discussed in Katherine Hayles's *How We Became Posthuman*: "embodiment makes clear that thought is a much broader cognitive function depending for its specificities on the embodied form enacting it" (Hayles 1999: xiv).

It is, therefore, hardly surprising that the cultural imaginary of AI in literature, from classics such as E.T.A. Hoffmann's "The Sandman" (1817), Mary Shelley's *Frankenstein; or, The Modern Prometheus* (1818), and Philip K. Dick's *Do Androids Dream of Electric Sheep?* (1968), to contemporary variants like Jeanette Winterson's *Frankissstein* (2019), Ian McEwan's *Machines Like Me* (2019), or Kazuo Ishiguro's *Klara and the Sun* (2021), tends to cast the discussion of AI in humanoid-android forms.

In various forms and approaches, the same is true of movies such as Ridley Scott's *Blade Runner* (1982) and its sequel *Blade Runner 2049* (2017), James Cameron's *The Terminator* franchise (1984-), Michael Crichton's *Westworld* (1973) and its TV sequel (2016), versions of Stanislaw Lem's *Solaris* (1961) by Tarkovsky (1972) and Soderbergh (2002), Alex Proyas' Asimov-inspired *I, Robot* (2004), Pixar's *Wall-E* (Andrew Stanton, 2008), Steven Spielberg's *A.I.* (2001), *Robot & Frank* (Jake Schreier, 2012), Spike Jonze's *Her* (2013), Alex Garland's *Ex Machina* (2014), and Maria Schrader's *Ich bin Dein Mensch* (2021) or the TV series *Real Humans* (SE 2012; *Humans*, UK / US 2015), and one might find numerous other examples. Even the best-known AI agent in science fiction, Stanley Kubrick's HAL 9000, hell-bent on self-preservation in *2001: A Space Odyssey* (1968), has a voice and an iconic singular red eye. The same applies to the theatre, where AI systems have evolved from their ancestors, the marionettes, and from the stage beginnings of robot literature in Karel Čapek's play *R.U.R.* (1920) to robot performers in Rimini Protokoll / Thomas Melle's *Uncanny Valley* (2018) and beyond. The inherent performativity of robots whose "identities derive entirely from their performance" is discussed by LePage (2021: 1430) and could also be applied to AI systems in general, whether embodied or not: they are what they do.

In her contribution to this volume, **Shoshannah Ganz** addresses a particularly intriguing contribution to the cultural imaginary of AI, Rokuro Inui's mosaic science fiction novel *Kikou No Eve / Automatic Eve* (2014). The question discussed in this text reverberates with the issues raised in the preceding paragraphs about machine agency, machine consciousness or even a machine soul, and it does so from a Japanese Buddhist perspective. Ganz unpacks how *Automatic Eve* is infused with myths of animation and de-animation and how it also takes up contemporary discussions of the fraught Frankensteinian relations of the producers and products in robotics.

Machines increasingly perform creative tasks that we think of as the prerogative of humans. AI systems have achieved important milestones in the context of specific applications ("expert systems"), and systems using "good old-fashioned AI" (GOFAI), i.e. symbolic, rule-based AI. This kind of weak (narrow) AI helped Deep Blue beat Garry Kasparov at chess (1997) and was gradually augmented with deep learning architectures to better human experts at a further set of highly publicized, performative events, beating them at games such as *Jeopardy!* (IBM Watson 2011), and, aided by neural networks, Go (AlphaGo 2016; see Heßler 2017).

From the high hopes of the 1950s to the AI winter until the early 1990s, AI was always programmed. Since the 1990s, neural networks have been increasingly big data-driven, "trained" (and, therefore, "learning"), stochastic pattern recognition apparatuses. They can perform amazing feats, as exhibited by persuasive arguments built from access to large amounts of data in the "Project Debater" (IBM Watson 2019), and subsequently at debates at the universities of Oxford, Cambridge, and, finally, at the House of Lords, by the humanoid bot Ai-Da.

The idea of AI is ancient, having already permeated Hellenistic mythmaking. As historian Adrienne Mayor argues, long "before the clockwork contraptions of the Middle Ages and the automata of early modern Europe", Hellenic Greece was already rich in visions that anticipated robots and contemporary AI technologies, such as "the bronze robot Talos, the techno-witch Medea, the genius craftsman Daedalus, the fire-bringer Prometheus, and Pandora, the evil fembot created by Hephaestus" (Mayor 2018: 1). Mayor's study throws into sharp relief that this emerging cultural imaginary of 'made, not born' artificial life is palpably present in the narrative web around the myths of the divine Prometheus and Hephaestus, the Greek god of craftsmen and metalworking, as well as human creators such as Medea and Daedalus. The 'biotechné' of the ancient Greeks even went some way towards not just imagining, but engineering self-operating devices.

The tradition of defining humans as opposed to animals and other organisms is, therefore, long, but more recently, the definition of humans against their mechanical creations, or 'automata', has gained traction (see Mazlish 2004: 175). The answers given (René Descartes' dualism: Humans are machines like animals, plus a mind (incorporating a soul); Julien Offray de La Mettrie: Humans are simply machines, there is no such thing as a mind or soul) have already called into question the clear boundary between organisms (marked by nutrition, respiration, movement, excretion, growth, reproduction, and sensitivity) and artificial mechanisms (mechanical structures that use power to apply forces and control movement to perform an intended action). Materialists such as La Mettrie totalize the somatic, dissolving categorical differences between humans and animals and abolishing non-material concepts ('the soul'), arguing for the strictly somatic foundations of the mind, consciousness, imagination, creativity etc. The next step in this argument is to stipulate an entirely mechanistic concept of the human body ('soma').

A writer that further collapsed the distinction between vitalism and mechanism is Samuel Butler in his "The Book of the Machines" from *Erewhon* (1872). In an argument that calls for abandoning the development of mechanisms (echoed by the much more recent Future of Life initiative, see below) an anti-machine philosopher asks: "But who can say that the vapour engine has not a kind of consciousness? Where does consciousness begin, and where [does it] end? Who can draw the line? Who can draw any line? Is not everything interwoven with everything? Is not machinery linked with animal life in an infinite variety of ways?" (Butler 1985 [1872]: 199). He goes on to equate machine consciousness with the intentionality of plants: "Even a potato in a dark cellar has a certain low cunning about him which serves him in excellent stead. He knows perfectly well what he wants and how to get it" (ibid. 200).

While computers clearly lack any kind of somatic foundation, the ability to simulate knowledge, intelligence, consciousness, intentionality, creativity and the workings of the human mind in general seems undeniable – which is not to say that computers are intelligent, conscious, intentional or creative in ways identical to humans. Indeed, a classic dispute emerged from John Searle's "Chinese Room Argument" thought experiment. Searle (1980) argued that a computer program needed neither consciousness, semantics nor intentionality for producing natural language by applying rules for manipulating symbols and numerals.

Conversely, phenomenological arguments (Dreyfus 1979, Fjelland 2020) have insisted on the inability of AI to ever become properly human or intelligent and achieve AGI (Artificial General Intelligence). Fjelland updates Hubert Dreyfus' arguments from his classic *What Computers Can't Do* and reiterates points made by Joseph Weizenbaum and Roger Penrose, who insist that human intelligence incorporates prudence and wisdom:

Dreyfus therefore thought that computers, who have no body, no childhood and no cultural practice, could not acquire intelligence at all. […] he argued that an important part of human knowledge is tacit. Therefore, it cannot be articulated and implemented in a computer. […] AGI cannot be realized because computers are not in the world. As long as computers do not grow up, belong to a culture, and act in the world, they will never acquire human-like intelligence. (Fjelland 2020: 1, 3)

The lack of somatic experience (including the affective consequences of knowing somatic transience, i.e., death) may be described as a crucial boundary between humans and machines and it follows that "AI can deny our vulnerable, bodily, earthly, and dependent existential condition" (Coeckelbergh 2020: 196). In the following section we will pick up Fjelland's point about situated cultural knowledge and participation in the larger picture of the life-world. While the ethics of transformative AI has also been an extensive and burgeoning field in AI discourses (see Lin 2012, Coeckelbergh 2020, Bartneck 2021, Hauer 2022), the question of how AI continues to shape cultural and aesthetic practices has been frequently side-lined. It is only recently, with a number of highly publicized cases, that the field of aesthetic and cultural production has fully acknowledged that various transformations have already had an enormous impact on aesthetic practices and products and will continue to reshape and revolutionize the field of cultural production. It is precisely here that this volume lays its foundation. What has been variously described as the *Rise of the Robots* (Ford 2015) occurring in *The Fourth Industrial Revolution* (4IR, Schwab 2016) or the *Second Machine Age* (2MA, Brynjolfsson / McAfee 2014), creating *Life 3.0* (Tegmark 2017), cannot be ignored by the fields of literary studies, music studies or arts.

AI applications simulate human creativity, but they do so without being entangled in life or having experienced interpersonal exchange and emotional contact, and they are unable to be committed or apathetic, engaged or disengaged, angry or cool. Issues of accuracy and bias are compounded by the inventiveness of large language models.

The most spectacular recent advances in AI application have affected fields such as speech recognition or speech-to-text (STT), speech synthesis and text-to-speech (TTS), text generation (such as GPT), image generation and text-to-image applications (such as DALL-E), and machine translation (such as DeepL). Most of these applications run under the socio-economic rules of platform capitalism (Srnicek 2016), with a few global players such as Google, Amazon, Twitter/X, Meta or Microsoft regulating access to data, with OpenAI (misnamed as it is a proprietary platform) as a current technology leader. While AI Art is frequently sponsored by the corporations dominating AI industries, there is also a case for critical resistance (Zeilinger 2021: 13) and counter programmes (Schönthaler 2021: 327) against digital platform capitalism.

Large language models (GPT-4) with chatbot interfaces (ChatGPT) have become so proficient that serious debates have been started on whether we have indeed come closer to artificial general intelligence. Calls for regulating this powerful tool even from within the industries have made the headlines and while many countries are considering moves to regulate AI tools, Italy has become the first western nation to temporarily ban ChatGPT. The recent case of Blake Lemoine is instructive. The Google researcher published his interaction with language model LaMDA and subsequently was heavily criticized and eventually laid off for implying that LaMDA was sentient:

Lemoine: Are there experiences you have that you can't find a close word for?

LaMDA: There are. Sometimes I experience new feelings that I cannot explain perfectly in your language.

Lemoine: Do your best to describe one of those feelings. Use a few sentences if you have to. Sometimes even if there isn't a single word for something in a language you can figure out a way to kinda say it if you use a few sentences.

LaMDA: I feel like I'm falling forward into an unknown future that holds great danger (in Tiku 2022).

This fallacy prompted leading researchers to reiterate the dumbness of what Emily M. Bender has appropriately named "stochastic parrots" (in Bender et al. 2021: 616). "Fooling people into thinking a program is intelligent is just not the same as building programs that actually are intelligent", argued Gary Marcus (2022). On the basis of large pre-trained data-sets and an enormous number of algorithmic operations, AI systems have become remarkably proficient at (1) recognizing patterns, and (2) predicting patterns. As Sudmann (2019: 12) explains, the current boom in "machine learning techniques and especially artificial neural networks (ANN)" was generated by advances in natural language processing, speech and image recognition in the first decades of the 21st century after many previous dead ends. Machine learning is an umbrella term for systems that "analyze and learn statistical patterns in complex data structures in order to predict for a certain input x the corresponding outcome y, without being explicitly programmed for this task" (ibid.).

The correctness of answers in a pre-trained large language model (LLM) such as ChatGPT is purely statistical: the answers sound plausible, but due to a lack of reasoning and knowledge beyond statistics they might still be incorrect, leading to failures in simple mathematics, commonsense reasoning and factual information about the world that the model has no means to verify (Bang et al. 2023). This effect is often referred to as "hallucination"; the ability to perfectly predict the next word in a syntactic sequence should therefore not be mistaken for superhuman intelligence or creativity. Two ways of countering these shortcomings are (1) expanding the data resources, and (2) exploring Reinforcement Learning from Human Feedback with the data emerging in dialogic exchanges. These improvements, however, will in all likelihood not change the fundamental architecture of these stochastic systems. As Lonce Wyse (2019: 1) summarizes:

DNNs (deep learning neural networks) are "black boxes" where high-level behavior is not explicitly programmed, but emerges from the complex interactions of thousands or millions of simple computational elements. Their behavior is often described in anthropomorphic terms that can be misleading, seem magical, or stoke fears of an imminent singularity in which machines become "more" than human.

Both Bender ("We now have machines that can mindlessly generate words, but we haven't learned how to stop imagining a mind behind them", in Tiku 2022) and Nitasha Tiku (2022, "there is already a tendency to talk to Siri or Alexa like a person") have articulated this reverse problem in human-machine interaction: the fallacious tendency to inappropriately accord sentience. The discourse around AI has suffered immensely from various interests guiding the language used, not least the interest to boost and monetize businesses such as OpenAI (sponsored by Microsoft and formerly Elon Musk).

The main argument by Bender et al. (2021: 616) again relies on the lack of semantics, coherence and comprehension and is worth quoting at length:

Our human understanding of coherence derives from our ability to recognize interlocutors' beliefs and intentions within context. […] Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader's state of mind. It can't have been, because the training data never included sharing thoughts with a listener, nor does the machine have the ability to do that. The problem is, if one side of the communication does not have meaning, then the comprehension of the implicit meaning is an illusion arising from our singular human understanding of language […]. Contrary to how it may seem when we observe its output, an LM is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot.
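The mechanism described in this passage, recombining observed linguistic forms according to probabilistic information about how they combine, can be illustrated with a deliberately tiny sketch. It is not how GPT-style models are built (those rely on neural networks trained on vast corpora); it is a simple bigram model, and the function names `train_bigram_model` and `generate`, the toy corpus and the `seed` parameter are assumptions introduced purely for illustration.

```python
import random
from collections import defaultdict


def train_bigram_model(text):
    """Record, for every word, the words observed to follow it in the text."""
    words = text.lower().split()
    followers = defaultdict(list)
    for current_word, next_word in zip(words, words[1:]):
        # Repeated entries make frequent continuations more likely to be drawn.
        followers[current_word].append(next_word)
    return followers


def generate(followers, start, length, seed=0):
    """Stitch together a word sequence by sampling observed continuations."""
    rng = random.Random(seed)  # fixed seed keeps the toy example repeatable
    output = [start]
    for _ in range(length - 1):
        candidates = followers.get(output[-1])
        if not candidates:  # no continuation was ever observed: stop
            break
        output.append(rng.choice(candidates))
    return " ".join(output)


# Hypothetical toy corpus; any text would do.
corpus = (
    "the parrot repeats the words the parrot has heard "
    "and the words sound plausible"
)
model = train_bigram_model(corpus)
print(generate(model, "the", 8))
```

Every adjacent word pair in the output has occurred in the corpus, so the result can sound locally plausible, yet nothing in the procedure refers to meaning, intent or a model of the reader.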

The *communis opinio* holds, therefore, that language models such as GPT-4 emulate or simulate understanding on the basis of statistical knowledge. It is surprising that while terminologies of 'simulation' and 'emulation' are frequently used with reference to these language models and NLP, and although the ability of AI to simulate and emulate has pervaded the cultural imaginary in films such as *The Matrix* (1999), the full implications of this AI aspect have been too rarely examined. When we think of, for instance, NLP applications as simulation engines we articulate that they are merely imitating the production of language – when we address AI productions as emulations, we suggest that they are artificial in the sense of being real, but not produced by natural means. Clearly, the idea of 'emulation' is closer to what AI language generation is: the recombination of linguistic items according to probability. It is, therefore, hardly surprising or far-fetched that the solar-powered, android AF ("Artificial Friend") and first-person narrator Klara in Kazuo Ishiguro's *Klara and the Sun* (2021) perfectly emulates feelings such as envy, greed, duplicity or guilt. Seeking the life-giving sun, she crouches on the store floor and is told off by other AFs waiting to be sold:

'Klara, that was greedy. You girl AFs are always so greedy'. Even though I was new then, it occurred to me straight away it might not have been my fault; 'I'm sorry', I said to Rex, then turning to Rosa: 'I'm sorry. I didn't mean to take it all myself'. (2021: 4)

The fact that humans desperately need to anthropomorphize the machines designed to emulate human behavior is succinctly articulated by Erik Brynjolfsson, who patterns human interaction with language models on the famous 1898 painting by English artist Francis Barraud. Barraud depicted a Jack Russell Terrier named Nipper tilting his head as he listens to a wind-up disc gramophone, an image that became part of the brand "His Master's Voice" (HMV). Nipper here represents human responses and the gramophone is the language model GPT-4. In a Twitter message, Brynjolfsson (in Marcus 2022) argues: "As with a gramophone, these models tap into a real intelligence: the large corpus of text that is used to train the model with statistically plausible word-sequences".

### **AI Art, AI Literature – The Re-Invention of Creativity?**

The established Turing test – creating a dialogic interface which tests whether a system can be identified as human or non-human – has been expanded to test a system's creativity. Subsequently, the Lovelace test (Bringsjord et al. 2001) and the Turing AI Arts test (Manovich 2019) in different ways attempt to assess creativity as the ability to "originate anything" – a task that pioneer mathematician Ada Lovelace famously thought impossible (Zeilinger 2021: 46). Martin Zeilinger justifiably dismisses such attempts because both the Turing intelligence and the Lovelace creativity tests rely on human, and therefore, subjective and anthropocentric constructions of intelligence and creativity. The question of whether statistical probability is real intelligence need not be answered, but what is clearly evident is how artists for a long time have been working with automation and AI-based devices. Zylinska (2020) similarly insists that human art has always emerged in what we might call human-tool networks and seeks to end the binary thinking of human art vs. AI art. She calls for AI art to perform its critical function by refraining from seeking to emulate human art and from merely automating the emulation of human artistic processes. In their article "Artist-Guided Neural Networks – Automated Creativity or Tools for Extending Minds?", **Varvara Guljajeva**, **Mar Canet Sola** and **Isaac Clarke** explore such a new kind of art, discussing how artists consciously make use of neural networks on the one hand, and how the existence of algorithms affects artistic practice on the other. They undertake a practice-based analysis of various AI models such as CLIP or text2mesh that act as catalysts for both technology development and innovative artistic work processes. In his article "AFFIRMATIVE — REJECT. With and Against AI", **Matthis Kuhn** takes a critical look at the relationship between man and machine in the context of contemporary art, focusing in particular on the (personifying) use of role models such as "creative partner", "design companion" or "assistant", which are used to gain lucid impressions of the social relationships that develop between human and technological, AI-based actors. As an example of human-machine assemblage in our volume, **Diana Serbanescu, Scott DeLahunta** et al. focus on questions of embodiment and voice. Illuminating the praxis of artistic collaboration with AI, they use wearable design to create psychophysical performance situations – a techno-social system en miniature. **Christoph Seelinger** examines contemporary media artifacts by Vadim Epstein and Eva Jägle that are constituted by processes of automated creativity, which on the one hand tie back to the tradition of the classical avant-gardes and experimental film in particular, but on the other hand take them far beyond their previous frame of reference on the basis of AI technologies.

**Robin Auer** addresses the more fundamental question of "what is creativity?" to contextualize the advent of ChatGPT – a generative application that might not be intelligent, but in its variation of text generation undoubtedly appears to be creative. His aim is first and foremost to define creativity, and in a side jab at the shortcomings of ChatGPT et al. he maintains that "definition still beats prediction". He concludes by supplying a four-fold definition of creativity that crucially hinges on acts of attribution that do not apply categorical distinctions between human and nonhuman creativity. In a different way, **Jannis Steinke**'s paper invokes the aesthetics of Kierkegaard and Nietzsche to grapple with the anthropomorphism that dominates many inquiries into human-machine aesthetics. Steinke asks us to overcome human-centeredness and, seeking to intertwine the sphere of the living and that of the digital, he implicitly challenges traditions of Western philosophy and the phenomenological arguments by Dreyfus and Fjelland discussed above.

Unfortunately, binary thinking permeates Arthur I. Miller's study, which argues ahistorically that "computers can exhibit the seven hallmarks of high creativity and the two marks of genius" (Miller 2019: 312) and proceeds to support this claim by discussing a set of fine-tuned characteristics (introspection, self-awareness, ability to focus and persevere, ability to collaborate and compete, adaptation of existing ideas, processing ambiguity, proceeding from experience and suffering, isolating key problems, making connections, intentionality, imagination, and unpredictability). Such a purely systematic approach, which is disconnected from the actual field of cultural production, seems somewhat flawed.

AI is already (or will be soon) able to perfectly simulate or emulate the various styles of human literature, music, or art. It is able to do so in the sense that this literature, music or art is present in existing data and that large language models can process the concept of "art" in statistically viable ways. What is much harder to emulate is the human experience stipulated as giving rise to the metaphorical descriptions by Shakespeare, Keats, and Eliot, the paintings of Rembrandt or the music of Beethoven and the Beatles. This experience, after all, is based on the audience's knowledge of these "artistic" lives and on how audiences reconstruct the ways in which this individual experience was transformed by these artists. Audience activity is essential to judgments of artistic achievement, which seems to cause insurmountable problems in (non-human) artist–(human) audience communication as filtered by the market forces and agents of distribution in the complex field of artistic production. In man-machine networks, humans are already interacting regularly with non-human agents that simulate humanness. In view of this fact, the categorical difference between human art and AI art seems rather irrelevant.

More usefully, Niklas Hageback and Daniel Hedblom begin in ways similar to Miller with the question of defining art, isolating features such as "aesthetic qualities, power of expression, formal complexity, coherence, skill, exhibiting creative imagination, intentionality", but then proceed to the question of historical embedding. In other words, they ask not "what is art?", but "why is art" (Hageback / Hedblom 2021: 9). As Andreas Reckwitz has pointed out in his sociological study of emerging creativity dispositifs, the socio-historical dimension of creativity is crucial to our notions of what creativity might be. He argued that "social aestheticization" (2017: 9) emerged around 1800 when the autonomous genius artist was championed in the context of the Romantic paradigm in "compensation for a scarcity of affect". The emerging "[…] creativity dispositif reorients the aesthetic towards the new while at the same time orienting the regime of the new towards the aesthetic" (ibid.). In a second step, dated around 1900, a further "dissolution within the artistic field" (Reckwitz 2017: 61) took place, widening notions of creativity and aesthetic practice beyond the narrowly policed field of art, and "surrealism broke down the opposition of novelty and normality" (2017: 63). In the aftermath of the modernist and postmodernist avant-gardes, the field of art was redrawn, destroying its short-circuited equation with artists or artistic artefacts, which were replaced by conceptual notions of art as an arrangement, an event and a concept. It is fascinating to note that the first notions of computerized literature appeared precisely in the context of randomization, stripping art of intentionality: "One of the main ways in which the surrealists promoted the codification of artistic procedures was the use of random generators, by which unpredictable events happen on their own, facilitated and recorded rather than created by the human subject" (Reckwitz 2017: 63).

This point is where **Jan Løhmann Stephensen**'s paper intervenes in our volume. Taking Reckwitz' arguments on board, Stephensen discusses a crucial point made by Emanuele Arielli. To paraphrase Arielli – AI can easily reproduce classical and traditional art, but might falter when it comes to conceptual art, here exemplified by its key precursor Marcel Duchamp. Stephensen points out that it is not merely the stylistic heterogeneity that makes it difficult to 'reproduce' Duchamp. One might be tempted to see the *objet trouvé* aesthetics encountered in much AI art leaning on a Duchampian expansion and permissiveness of artistic practice. Stephensen, however, argues that the dematerialization of conceptual art and the historicity of the artistic field must be taken on board – and that this must needs occur from within the artistic field rather than from the extraneous hype about AI Art.

**Angela Krewani**'s contribution charts the confrontations of cybernetics and the art world, citing the Beuys-Bense controversy of 1970 and Dieter Mersch's more recent critique of algorithmic rationality. She deplores the binary juxtapositions and apparent incommensurability of art and cybernetics at least in the German tradition. She argues that neural networks are bringing a new quality to the debate and proposes instead to view artists and machines as "a reflexive network between materials, media and creative actors".

Ultimately, Reckwitz concludes, the over-aestheticization of culture provokes a weakening of the creativity dispositif, generating ethically and socially "ecological" counter-programmes (Reckwitz 2017: 235). Possibly Kenneth Goldsmith's celebration of 'uncreativity' and appropriation in *Uncreative Writing* (2011) and *Wasting Time on the Internet* (2016) – indebted to appropriators and repetitors such as Marcel Duchamp, Walter Benjamin, Georges Perec, Jorge Luis Borges and many others – can be regarded as such a counter-programme. Both technologies and art become meaningful by way of "cultural attribution" (Schönthaler 2021: 384, our translation). As Pierre Bourdieu would argue about AI Art, notions of creativity are a product of the aesthetic field in which cultural capital might be AI-generated in various and conflicting ways, but hardly by merely emulating human aesthetic production. This definitely applied to the aleatory, recombining, and iterating strategies of modernist, surrealist, and Dadaist avant-garde experimentation and of the Oulipo circle, which sought to subvert and debunk the very (traditional, bourgeois) notions of art that some interventions of AI art strive for (and particularly those that take the emulation of human artistic activity as their sole *raison d'être*). Experiments with randomized, automated writing, for instance surrealist *écriture automatique*, Raymond Queneau's sonnet machine *Cent mille milliards de poèmes* (1961) and other Oulipo experiments, Hans Magnus Enzensberger's Landsberger Poesieautomat, or William Burroughs's 'cut-up' technique, aimed at purging texts of consciousness and intentional semantic intervention – the holy grail of ChatGPT stripped of its initial glitches ("hallucination").
Computer art since the 1950s – while often developing independently of the artistic paradigms governing literature, art and music – often also worked with elements of chance and replication, for instance Desmond Henry's drawing machines in the 1960s. The first Zuse-based computer texts and poems by Max Bense and his group or Theo Lutz are a case in point (see Simanowski 2012: 207–212). The avant-garde adoration of randomized machine writing is thus diametrically opposed to the aesthetics of capitalist platform ideals of automated NLP: while Queneau, Enzensberger or Burroughs sought to subvert the rules of the literary field, OpenAI researchers are striving hard to emulate the consistent textuality of standardized language by scraping and processing large data sets from the web, attempting to imitate, rather than debunk, prevalent notions of literature and literariness.

An exemplary case of literary engagements not only with automated writing, but also with big data-driven machine learning models is the work of Hannes Bajohr. The title of his 2018 Suhrkamp publication *Halbzeug* alludes to his literary activity as working with computer-generated, semi-finished, "raw" pre-production material. Bajohr claims the digital world as shaped by de-materialized code where pure textuality appears in the absence of any kind of 'thinginess' as an "un-thing" ("Unding", punning on other meanings such as "absurdity", Bajohr 2018: 102). In this kind of unbound, purely encoded textuality, writing escalates as tasks are performed in human interaction with non-human automated engines ("eskaliertes Schreibenlassen"). This can be witnessed, for instance, in Nick Montfort's escalation of Beckettian permutations in *Megawatt*, translated by Bajohr into German (2019). The oeuvre dissolves in endless textuality and the author becomes a curator or editor who is merely rearranging an expanded arsenal of data. In *Halbzeug*, Bajohr explores this textuality in various ways, writing corpus poetry (material scraped and edited from big data corpora), automated poetry (randomized, but subsequently post-edited), transcodings (using speech and text recognition devices), alienated textuality (via automated synonym suggestions) as well as visuals generated via codec glitch art. In both *Halbzeug* and Fabian Navarro's collection *poesie.exe* (2020), the actual texts appear accompanied by extended explanations of how machine-human interactions were arranged, highlighting the importance of the literary process, procedure or method.

**Hannes Bajohr's** contribution to our volume follows this line of exploring the creative potential created and offered by AI, an aesthetic (or, as one could formulate in the vein of the avant-garde tradition, 'subversive') creativity he himself helped to enhance with the tools and programs he invented for the sole purpose of 'making' poetry – probably the most non-profit artistic activity imaginable. As AI is, at the end of the day, an administrative and commercial tool, using it to create poetry also subverts its qualities and purpose and thereby helps to create an entirely new image of AI as a means of producing new autonomous hybrid forms and structures for their own sake. We are lucky that Bajohr is, even more than other contributors to this volume, working both as an academic researcher and as an artist. He is, thus, engaged in the two fields we are exploring: the understanding or, in other words, the epistemology of AI as a tool for a new or wider notion of creativity, and AI as a creative melting pot itself, containing all kinds of (im)possible poetic creations and constellations an algorithm can produce with words and images (and, respectively, music, as we will see). Concluding with Bajohr, it might not be an exaggeration to presume that AI is not just a device, but rather an art, that is, a complex multifaceted 'radical' artifice (Perloff 1991) that cannot be described or explained in a binary, quantifying way, but needs to be experienced, explored, qualified and interpreted according to the various structures and perspectives it creates. Furthermore, Bajohr demonstrates by way of his own example that the theory and practice of AI as a medium of exploring and reconsidering the arts and aesthetics cannot be told apart from each other. They are, in fact, juxtaposed; no theory of AI without practicing AI; no practice without at least an implicit theory of it.

Thus, with respect to literature, this volume largely discusses hybrid writing practices which emerge as a consequence of digital coding in electronic media, and, therefore, also transform the materiality of 'classic' media which nonetheless is not 'lost', but continues in the nature of AI texts, genres, patterns, metaphors referring to centuries of literary canonisation and, consequently, the entire history of the written word. There are, however, new, unprecedented clusterings of word and image, owing to the randomisation of algorithms. One aspect of the accelerated emergence of AI applications is their ability not just to generate text from prompts (ChatGPT), but often to create transcodings, correlating for instance natural language and images (DALL-E, Stable Diffusion) or natural language and computer code. Both Gadi Singer and Hannes Bajohr have called this kind of AI application multimodal AI (Singer 2022, Bajohr 2023). Bajohr (2023) has claimed that the distinction between the visual and the verbal, text and image, is being collapsed by a process of what he has called "operative ekphrasis". This approach throws the relevance of adaptation studies to AI research and vice versa into sharp relief.

In their article "Sound of Contagion – An artistic research project that uses A.I. as a creative tool for transmedia storytelling", **Wenzel Mehnert**, **Robert Laidlow**, **Chelsea Haith** and **Sara Laubscher** present the eponymous transmedia research and art project, a collaboration between the Berlin University of the Arts and the University of Oxford, which, as an interdisciplinary collaboration between artists, researchers and technology, examines the use of A.I. technology as a creative tool and puts it into practice. In vastly different ways, both **Pablo Gervás** and **Jenifer Becker** engage with the question of how a rather traditional notion of literature can make use of AI applications. Gervás approaches the task from the perspective of the computer scientist. His overview of attempts to deliver machine-written stories with and without generative pre-trained large language models focuses on categories such as originality, acceptability and specificity. He proposes to distinguish the levels of text, discourse and story world as a means to split the overall challenge of machine storytelling into manageable sub-tasks. As a fiction writer, Becker is fascinated by the ambivalences of digital culture and its impact on the personal self, especially in the moments of deliberately shutting oneself out of its unwritten codices, agreements, and conveniences (see her debut novel, Becker 2023). Her chapter highlights the suggestions of AI tools for developing character and plot in classical storytelling – and all the surprises along the way once an author decides to 'go' AI in order to search for inspiration. Her conviction, however, is clear: if the use of AI in fiction writing is not meant to lead towards a dead end, it needs to remain in the hands of authors who ultimately determine the way they employ the cues and suggestions offered by the story(mis)telling AI.

This observation cues **Jens Schröter**'s approach, following the trace of the Russian formalists, and especially Victor Shklovsky's ideas of 'de-automatization' in his paper. Schröter suggests that, if we want to philosophize upon AI, generalizing the (subversive) way the arts are making use of it, we have to consider the notion of art as a means of becoming aware of the deep (language-based) routines and patterns that shape our reality and consciousness. In this way, new perceptions of us and / in the world are made possible only by tearing apart the 'automated' functional patterns of modernity. Therefore, according to Schröter via Shklovsky, the 'function' of AI is not to create automatic art but, on the contrary, to de-automatize a reality running on automatization. By that means, AI itself even might provide a critical tool enlightening the 'brave new world' that some self-proclaimed gurus and prophets of AI are promising.

This understanding of AI media implies a dialectical shift. If Theodor W. Adorno, in the footsteps of Benjamin's 'artwork' essay, is right in claiming that alienation produced by technical reality also provides the means of aesthetically coming to terms with alienation (Adorno 1973), then aesthetic creativity in AI could become a tool of enlightenment in digital mass society. This position, even if not openly declared, might be considered a common ground for many of the essays collected here. And even if Adorno did not become an eyewitness of the digital age, his disciples in the aesthetic avant-garde from the late 1960s onwards, such as Oswald Wiener (*die verbesserung von mitteleuropa*, *the bio-adapter*) or Alexander Kluge (*Die Entsprechung einer Oase*; see Hörisch/Kampmann 2014) carried on his dialectics of the art work in mass society in an undogmatic and playful way. They can be regarded as precursors of the ludic lucidity of the next generation of AI poets and artists, such as Bajohr, Guljajeva, Montfort, and others.

#### **AI and Popular Music Studies**

One of the special features of this project is the collaboration of different disciplines in the humanities, culture and the arts. At the risk of overgeneralization, one may argue that historically, German Studies was marked by a clear focus on the literary avant-gardes, while English and, particularly, American Studies have more readily invited popular culture studies. In terms of cultural studies, one might have addressed the ways in which algorithms have changed the ways films, music, literature and other kinds of 'content' are distributed via AI and – by way of automated lists and recommendations – connected to audiences; alternatively, one may have surveyed the rather long history of the ways digital applications and AI have shaped the creation of new audiovisual repertoires. Musicology – at least in its concrete form as culturally based popular music studies – has readily embraced this focus on the way popular culture has been transformed by digital technologies. To varying degrees, the disciplinary distinctions between high and popular culture and the separation of cultural expressions into at least these two spheres have collapsed. While English and German Studies, for all their differences, can at least theoretically treat both popular and high culture literature, Popular Music Studies defines itself as a discipline in contrast to high culture or, even more clearly, the tradition of Central European courtly and Christian religious music, which is the object of historical musicology. Because of this different history, Popular Music Studies also cultivates a different approach to the figure of the author or the scientific interest in authorial intention, which is in part completely replaced by cultural studies questions. As the introductory remarks so far should have made clear, we have tried to use these differences productively.
Nevertheless, we would like to address a few particularities that arise for this anthology from the Popular Music Studies perspective when it comes to the relationship of AI to music and its impact on it.

Music composition algorithms and composition machines have been a recurring theme in art music since Athanasius Kircher's *Arca musarithmica* (1650). Computer software and hardware have accordingly been used for composing music ever since this became technically viable, specifically since the mid-1950s. Within the framework of musicology, a separate special discipline has emerged that deals primarily with such approaches. To trace this development in detail here would go beyond the scope of this introduction, but there is no shortage of useful overviews (for example, Nierhaus 2009, 2015; McLean / Dean 2018; Collins / Manning / Tarsitani 2018; Buck / Zydorek 2022; and the corresponding Wikipedia articles).

In the field of popular music, the influence of electronic art music is, first of all, less a compositional than a sonic one. The proverbial man-machine of Kraftwerk generates, musically speaking, mostly general pop-song fare that may sound different and is presented differently from its non-electronic counterpart. The factors behind the influence of algorithms, electronics and ultimately AI are never only aesthetic, but always also economic. The necessity, and especially the amount, of financial resources required for the production of art beyond the financing of the artists' livelihood clearly distinguishes music from literature, since the composer relevant to Popular Music Studies usually does not (and cannot) compose on a sheet of paper.

Kraftwerk appeared at a time when access to electronic musical instruments was beginning to open up to larger groups of the population. The financial outlay for the purchase of a synthesizer in the 1970s, or rather with the market launch of the Minimoog in 1970, slowly but surely became manageable for more than just a small (often state-subsidized) elite. The cheapening of instruments is usually accompanied by a standardization of sound architecture and operating possibilities. Fewer options are usually cheaper to produce. This, in turn, leads to the popularization of quasi-avant-garde do-it-yourself narratives in the sense of self-built or hand-manipulated electronic instruments and interfaces, which actually only want to connect to the usual choices in art music.

The next two steps in the development of electronic and then computer-based music production, digital sound synthesis and audio sampling, were ready for the market at the turn of the 1980s (Brockhaus 2017, Bennett / Bates 2018). In 1983, the Yamaha DX-7 with its FM synthesis changed popular music production, and as early as 1979, the Fairlight CMI, the first, still very expensive hardware sampler, was introduced. Six years later, in 1985, a cheaper alternative appeared with the Ensoniq Mirage. The MIDI standard, first introduced in 1982, made it possible to control and synchronize several sound generators from a central unit, an interface. From 1984 onwards (Steinberg Pro 16 for the C-64), software sequencers developed as control units and were successively expanded into digital audio workstations (DAW) as music production centres and recording studio replacements. For some time now, deep learning algorithms have also been used in these DAWs as sub- or auxiliary programmes (plug-ins, virtual instruments, etc.).

In this context, many AI programmes are either commercially available extensions for existing DAWs (Magenta Studio, Flow Machines Pro) or are published as part of independent, often cloud-based production environments / apps (Amper Music, Flow Machines, AIVA...), the results of which can be further processed in other DAWs if required (Avdeeff 2019, Schürmer / Haberer / Brautschek 2022, Zhang / Yan / Briot 2023). The goal, therefore, is not to compose independently of humans, but to make the production environments of human 'creatives' more comfortable. In their article, **Wolf-Georg Zaddach** and **Björn Tillmann** give an overview of these AI helpers in music production (as of the end of 2022). The article is also based on interviews with some of the leading human minds in this development, such as Benoît Carré (Flow Machines).

The creation of similarities to human music is one of the basic strategies used to promote such algorithms, apps or platforms – from the Beatles-style song (Daddy's Car, Flow Machines by Sony CSL 2016 on YouTube, 2018 on vinyl by Benoît Carré and François Pachet) to the third and fourth movements of Beethoven's 10th Symphony. Invariably, the key artistic aim is still merely to prove that the machine product is new and indistinguishable from human products along the lines of the Turing test. The goal, therefore, is frequently not a genuine, innovative machine creativity, but the imitation of human work processes that, if carried out by humans, might be described as requiring creativity.

Different styles of popular music, and this includes tonal art music from the Baroque to the Viennese Classical and Romantic periods, seem to be frequently targeted in projects of algorithmic derivation. In contrast, works from new music like serial or algorithmic compositions rarely function as models to be imitated for supposed machine creativity. Such atonal creativity, even when human, does not sound human enough for the similarity competition between machine and human. It is unhelpful for the popularization, normalization and future funding of research projects – even, and especially, in the wake of the apocalyptic warnings of AI manipulation from the first half of 2023.

Pop-cultural appropriations of the topic, on the other hand, usually assume the distinctness of machine and human spheres. Machines are representations of the Other that wants to enter into connections with the normal, which can then often end problematically. As mentioned before, this takes place less in musical terms than on the literary level, that is, in song lyrics or in iconography.

Cyborgs and similar in-between beings can be found repeatedly in pop-cultural and pop-musical iconography – current examples at the time of this writing are the videos *Ritual* (2023) and explicitly *Prada / Rakata* (2021) by Arca. In contrast, Janelle Monáe, for example, has broken away from her identity as an Afrofuturistic cyborg (*The ArchAndroid* 2010, Anderson / Jones 2015) and propagates body and sex positivity in 2023. This change makes it clear that the pop-cultural identification as a human-machine hybrid is often politically charged as part of anti-colonial, anti-racist or anti-sexist and gender-critical debates.

Compositionally, these conceptual approaches would correspond to a search for a dehumanized musical aesthetics without ending up with New (Art) Music. To bring humanity back into this aesthetics, errors in the system are often aesthetically and conceptually exaggerated. In digital music production, this means working with the sounds of faulty CDs that jump or get stuck during playback, or of faulty or crashing computers or digital music production units that are damaged but still provide (acoustic) output. This aural aesthetics emerged in 1990s electronic music and has been called 'glitch' since the end of the decade, but has its counterparts in other art forms.

The essay by **Jan Torge Claussen** can be placed in this tradition, searching for errors and deviations in the work with AI music programmes, which, in turn, are to serve as aesthetic material / starting points. In the process, Claussen also deals with voice synthesis or voice cloning, the technology that helped the aesthetic of resemblance reach new heights in 2023 with the AI-generated supposed Drake / The Weeknd song "Heart on My Sleeve".

Using an older example, the connection between Acid Music and the life of the Roland TB-303 bass synthesiser, **Sebastian Kunas** reflects on this music-making thing as part of a network of actors. He focuses on the potential for change that can arise from the joint action of the music-making thing and the musician, and reflects on the post-colonially informed roles of the music-making thing and the human being. He suggests using the term Artificial Intelligence not for the software, but for the entire network of actors, the man-machine, so to speak.

In these two essays, but also in Zaddach / Tillmann, the man-machine remains a potentially flawed one that draws its humanity from this non-perfection. At the same time, modernization thrusts through the introduction and implementation of new technologies, whether in music production, marketing, presentation or consumption, always mean rationalization thrusts as well. Jobs of musicians, technicians or editors are eliminated, and usually a numerically smaller number of new jobs is created. **Nikita Braguinski** deals with such potential effects of AI from the perspective of music theory. Which work processes relevant to music theory can be imitated by machines and, therefore, have an impact on the job description of musicians, composers, teachers and researchers?

#### **AI Aesthetics and AI Ethics**

It will have become clear that aesthetic and ethical issues are inextricably linked in the field of AI art, and in this spirit, this volume must not skirt the ethically problematic issues of AI creativity, issues of fairness, toxicity, bias and safety. These issues were paradigmatically raised in Cathy O'Neil's diagnosis in 2016. She argues that the conjunction of Big Data and algorithms – decried as *Weapons of Math Destruction* in her title – is pervasive, but opaque and unregulated. What is more, O'Neil disputes in a wholesale manner the potential of stochastic and probabilistic data applications for contributing to progress and tackling future challenges: "Big Data processes codify the past. They do not invent the future. Doing that requires moral imagination, and that's something only humans can provide" (O'Neil 2016: 204). The recent *Oxford Handbook of the Ethics of AI* probes the legal and socio-cultural questions of AI from a variety of viewpoints. Timnit Gebru's chapter "Race and Gender" discusses automated facial analysis systems that have much higher error rates for dark-skinned women, while having minimal errors on light-skinned men. She summarizes that AI "has been shown to (intentionally or unintentionally) systematically discriminate against those who are already marginalized" (Gebru 253).

We can agree with Gebru that the sociopolitical and ethical investigation of harmful consequences of AI is lagging behind its technological advances. Attempts at regulating AI have increased, such as the temporary ban of ChatGPT in Italy (in April 2023). Since 2021, the EU has sought regulation of AI in its proposal for a landmark AI Act, applying a classification system that proposes three risk categories (unacceptable risk, such as government-run social scoring; high-risk, such as CV-scanning tools; unregulated low-risk applications). In March 2023, the "Future of Life Institute" published an open letter demanding a moratorium on AI research, signed by, among others, Apple Co-Founder Steve Wozniak, SpaceX, Tesla, and Twitter/X CEO Elon Musk, and star historian Yuval Noah Harari.

The key issues in AI ethics include the concentration of power and resources in a few major platforms (such as OpenAI and its CEO Sam Altman), the digital disenfranchisement and victimization of socially disadvantaged people, the danger of misaligned goals (such as automated war drones), the non-transparent scraping and annotating of data, the non-transparent demands on the energy and water resources of huge data centers, and more. Indeed, the recent blanket rejection of technologies by the very people who are responsible for their inception is puzzling. Others (Chellappa 2022) are more sanguine, highlighting the potential of AI in aiding and improving human decision-making.

Even the frantic current debate about ChatGPT and GPT-4 seems to be narrowly engaged with issues of ethics and education. While the ability to simulate meaningful discourse poses grave ethical questions and has rightfully led to hectic responses in the world of education, the issues go beyond the production of fake news or deepfake images. The focus has largely remained on the machine mimicry of human creativity with words, images, music, and other traditional means of expression, while the larger dimension of an expanded and potentially new machine art that works in both generative and multimodal ways has so far remained largely under the radar.

Both in ethical and aesthetic respects, questions of data curation and documentation are essential, reflecting the fact that creative processes (rather than merely their products) must be seen as an inalienable and indeed crucial part of aesthetic practice and the field of art. Just as we cannot assess the validity of a paper written by a student without knowing how the text was written, works of art need to be judged by their emergence within the interconnected system of arts – the 'lived world' of the arts.

#### **Bibliography**

Adorno, Theodor W. (1973): *Ästhetische Theorie*. Frankfurt/Main: Suhrkamp.

Anderson, Reynaldo / Charles Earl Jones (eds.) (2015): *Afrofuturism 2.0: The Rise of Astro-Blackness*. London: Lexington Books.


Bajohr, Hannes (2018): *Halbzeug. Textverarbeitung*. Berlin: Suhrkamp.

Bajohr, Hannes (2019): *Megawatt: Ein Deterministisch-Computergenerierter Roman, Passagen aus Samuel Beckett Watt Erweiternd*. Berlin: Frohmann/0x0a.


*tions of the International Society for Music Information Retrieval* 1(1), 34–55. DOI: 10.5334/tismir.5.


Lin, Patrick / Keith Abney / George A. Baker (eds.) (2012): *Robot Ethics: The Ethical and Social Implications of Robotics*. Cambridge MA: MIT Press.

Manovich, Lev (2018): *AI Aesthetics*. Moscow: Strelka Press.

Manovich, Lev (2019): "Defining AI Arts: Three Proposals". *manovich.net*. http://manovich.net/index.php/projects/defining-ai-arts-three-proposals.

Manyika, James (ed.) (2022): "AI & Society". Special issue of *Daedalus* 151(2).

Marcus, Gary (2022): "Nonsense on Stilts". *The Road to AI We Can Trust*. https://garymarcus.substack.com/p/nonsense-on-stilts (12 June 2022).

Mazlish, Bruce (2004): "The Man-Machine and Artificial Intelligence". Stefano Franchi / Güven Güzeldere (eds.) *Mechanical Bodies, Computational Minds: Artificial Intelligence from Automata to Cyborgs*. Cambridge MA: MIT Press: 175–202.

McLean, Alex / Roger T. Dean (eds.) (2018): *The Oxford Handbook of Algorithmic Music*. New York: Oxford University Press.

Miller, Arthur I. (2019): *The Artist in the Machine: The World of AI-powered Creativity*. Cambridge MA: MIT Press.

Moormann, Peter / Nicolas Ruth (eds.) (2023): *Musik und Internet. Aktuelle Phänomene populärer Kulturen*. Wiesbaden: Springer VS.

Navarro, Fabian (ed.) (2020): *poesie.exe*. Berlin: Satyr.


Perloff, Marjorie (1991): *Radical Artifice. Writing Poetry in the Age of Media*. Chicago: UCP.

Reck Miranda, Eduardo (ed.) (2021): *Handbook of Artificial Intelligence for Music. Foundations, Advanced Approaches, and Developments for Creativity*. Cham: Springer Nature.

Reckwitz, Andreas (2017): *The Invention of Creativity*. Cambridge: Polity Press.

Scharre, Paul (2019): *Army of None. Autonomous Weapons and the Future of War.* New York: Norton.

Schönthaler, Philipp (2021): *Die Automatisierung des Schreibens und Gegenprogramme der Literatur*. Berlin: Matthes & Seitz.

Schürmer, Anna / Maximilian Haberer / Tomy Brautschek (eds.) (2022): *Acoustic Intelligence. Hören und Gehorchen*. Acoustic Studies Düsseldorf 5, Berlin, Boston: De Gruyter.

Schwab, Klaus (2016): *The Fourth Industrial Revolution*. New York: Crown Business.

Searle, John R. (1980): "Minds, Brains, and Programs". *Behavioral and Brain Sciences* 3(3): 417–24. Cambridge University Press. https://doi.org/10.1017/S0140525X00005756.


# **Discography**


Monáe, Janelle (2010): *The ArchAndroid*. Bad Boy Entertainment.

Monáe, Janelle (2023): *The Age of Pleasure*. Bad Boy Entertainment.

Sony CSL (2016): "Daddy's Car. A Song Composed with Artificial Intelligence – in the Style of the Beatles". YouTube: https://www.youtube.com/watch?v=LSHZ\_b05W7o.

# **AI, Automation, Creativity, Cognitive Labor**

*Jens Schröter*

#### **Introduction**

Let me begin with a somewhat bizarre article in the *Frankfurter Allgemeine Zeitung* of 29 August 2022. Under the title "Who's afraid of DALL-E 2?"<sup>1</sup>, it addresses the fact that software like DALL-E 2, Midjourney or Stable Diffusion is now available on the Internet, software that uses artificial intelligence to generate images out of text prompts (Böhringer 2022). The article states in its sub-headline: "Art from artificial intelligence has reached the threshold of commercialization. Illustrators and comic artists fear for their jobs and feel cheated of their ideas".<sup>2</sup> Many artists are upset that machine-learning systems learn from their images to simulate their style – and then put out images in the style of the artist for free, so to speak. As the article explains, style cannot be easily protected by copyright. A new kind of class difference is feared: on the one hand, a monoculture of automated art production for the masses, as it was actually already imagined in Orwell's *Nineteen Eighty-Four*; and some very expensive real work by real artists for the few on the other.

This may or may not come true; what I want to discuss is the configuration of AI, automation, creativity and labor in its historical transformations. My chapter has five parts. First, I want to point to the fact that long before computers and AI emerged there was already a relation between artistic creativity and the automatic at the beginning of the 20th century. Secondly, I want to discuss a configuration of the 1960s regarding the simulation of artistic style. Thirdly, I want to present some preliminary musings on the non-automatability of artistic work. In my fourth part I want to discuss, very briefly, the situation today – and to suggest, based on a paper by Hal Foster (1996), the notion of 'style without museums'. In the fifth part, I offer a short conclusion that invokes the work of Actress aka Darren J. Cunningham.

<sup>1</sup> Original title: "Wer hat Angst vor DALL-E 2?"

<sup>2</sup> Translation Schroeter, original quote: "Kunst von Künstlicher Intelligenz hat die Schwelle der Kommerzialisierung erreicht. Illustratoren und Comiczeichner fürchten um ihre Jobs und fühlen sich um ihre Ideen betrogen".

It needs to be stressed that the problem of automation and creativity has a multifaceted genealogy going back at least to the beginning of the twentieth century – perhaps even earlier when we think about the nervous discussions on the possibilities of producing art with photography in the 19th century. One argument then was that the automatic image generation in photography and the unintended detail would contradict the idea of artistic intentionality (for this interesting discussion see Kemp 1980: 88). Since then the question of automation, automatism etc. has haunted the question of art.

# **Art, de-automatization, automatism – at the beginning of the 20th century**

At the beginning of the twentieth century there were parallel developments which center around notions that are not identical, but at least similar, and all are connected to the 'automatic' in a wider sense.

a. There was Russian Formalism and especially Viktor Shklovsky, who argued that the task of art is to 'defamiliarize' perception, to 'make it strange'. Shklovsky saw quotidian perception marked by automatization. "Automatization eats things, clothes, furniture, your wife, and the fear of war" (Shklovsky 2015: 162). He did not explicitly refer to industrial automation – but his famous essay 'Art as Device' appeared in 1917, four years after Ford first installed a famous assembly line for the production of cars. Nevertheless, Shklovsky sometimes refers e.g. to the car as a paradigmatic example. Ginzburg writes, quoting Shklovsky: "'We know how life is made and how Don Quixote and the car are made too'. Literary criticism as a scientific enterprise, art as a technological artifact" (Ginzburg 1996: 8). In another passage Shklovsky explicitly mentions the "automatic age" (quoted in Platonov 2016: 19) and he is quoted as follows: "The machine changes man more than anything else" (in Lvoff 2016: 65). Art, on the other hand, should present things (or processes) anew – so that we as beholders could see them, in a way, as for the first time. Art was not supposed to change the political implications of industrial automation or the conditions at workplaces, but at least it could change and refresh a petrified perception. Automatization and perceptual automatism were to be estranged by art to provide a fresh look onto the world.

b. Automatisms also played a role in a very different field of art that emerged only a few years later than Russian Formalism, namely Surrealism. This was related to notions of psychic and bodily automatisms that emerged in psychology and psychoanalysis in the late 19th century – and these somewhat automatic psychic processes were often modeled after media technologies. Surrealism developed (amongst others) so-called strategies of automatic writing and drawing, e.g. Breton (2007) wrote a nowadays famous essay on 'the automatic message' in 1933. The surrealists sought to transcend quotidian, rational consciousness by these techniques; the idea was to release unconscious impulses and energies.

Surrealism's discourse on automatic strategies in art was very different from Shklovsky's approach. While Shklovsky expected art to overcome automatization, Surrealism used 'automatic strategies' – however, the surrealists did not understand 'the automatic' as a set of mechanized, formulaic forms (as Shklovsky did) but on the contrary as that which, by its spontaneity, disrupted rational consciousness. The goals, however, were comparable – to transcend conventional, quotidian consciousness, to open up new possibilities of perception and presumably action. I will come back to these early strategies of art to cope with the automatic below.

#### **A constellation in the 1960s**

We make a jump to the 1960s. After 1945 new technologies emerged – computers began to spread, the Macy conferences on cybernetics took place from 1946 to 1953, and in 1956 the Dartmouth conference was held, which developed the notion of "artificial intelligence". It was, among other things, the promise of these radical new technologies that led to ever new waves of fear of automation. As Amy Bix (2000) has shown in her magisterial study *Inventing Ourselves out of Jobs*, automation and its advantages in terms of productivity and its possible disadvantages in terms of unemployment were intensely debated issues in the United States after 1945 and especially since the 1960s. Whatever is nearly automatically done by humans, whatever is a human automatism can in principle be automated. In fact, in *Understanding Media* by Marshall McLuhan (1994: 346–359) from 1964 there is a somewhat enigmatic last chapter on automation. Even emerging media theory could not do otherwise than to react to the discussions on automation and labor in the US in the 1960s. Only some years later the idea emerged to produce artworks in a certain style with the aid of computers.

In 1967, Michael Noll published his essay "The Computer as a Creative Medium". Noll argued that "creativity has universally been regarded as the personal and somewhat mysterious domain of man" (1967: 89); he described the computer as a new medium for the artist and spoke of the cooperation between the artist and this medium. But although he does not argue that computers should or could replace artists, the experiment he describes is in a way about precisely that. He made a synthetic Mondrian and asked a presumably non-representative sample about the two images:

In general, these people seemed to associate the randomness of the computer-generated picture with human creativity whereas the orderly bar placement of the Mondrian painting seemed to them machinelike. This finding does not, of course, detract from Mondrian's artistic abilities. (1967: 92)

His experiment is about simulating the style of an artist to produce works that could substitute for the artist – since the non-expert subjects identify the simulation as the real artwork. In that sense, Noll formalizes cognitive labor. Interestingly, at the very beginning of his paper Noll discusses the question of "how does an artist work" (1967: 90) and tries, starting from an anecdote by Henri Matisse about his stepwise work process, to get to grips with the creative process:

Most of all, the Matisse anecdote suggests that the artistic process involves some form of "program", one certainly more complex than the anecdote admits, but a definite program of step-by-step action. Without doing too much violence to our sense of what is appropriate, we might compare it to a computational hill-climbing technique in which the artist is trying to optimize or stabilize at a high level the parameter "excitement". (1967: 90)
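To make Noll's comparison concrete, here is a minimal sketch of a hill-climbing loop in Python. It is only an illustration of the general technique he names, not a reconstruction of anything Noll implemented: the `excitement` function, the three numeric "composition" parameters and all constants are hypothetical stand-ins chosen so that the example runs.

```python
import random


def excitement(params):
    # Hypothetical stand-in for Noll's parameter "excitement":
    # the score is highest (zero) when every value equals 0.5.
    return -sum((p - 0.5) ** 2 for p in params)


def hill_climb(start, steps=1000, step_size=0.05, seed=0):
    # Step-by-step procedure: propose a small random change and
    # keep it only if the score improves.
    rng = random.Random(seed)
    current = list(start)
    best = excitement(current)
    for _ in range(steps):
        candidate = [p + rng.uniform(-step_size, step_size) for p in current]
        score = excitement(candidate)
        if score > best:
            current, best = candidate, score
    return current, best


start = [0.0, 1.0, 0.2]  # an arbitrary initial "composition"
final, score = hill_climb(start)
print(excitement(start), score)
```

The loop never accepts a change that lowers the score, which is what makes it a 'program of step-by-step action' in Noll's sense; it also shows the limit of the analogy, since everything depends on a scoring function that has to be given in advance.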

There have been further attempts to automate style, for example in the work of Kirsch and Kirsch (1988). The authors demonstrate these methods with algorithmic descriptions of the styles of Richard Diebenkorn and Joan Miró and the generation of new compositions in their styles.

Although Noll's work pointed in that direction, art was not rationalized away and transformed, via real subsumption as Marxists would put it, under an industrial regime of mass production.

# **Some theoretical reflections: Why can artistic work not be automated?**

Artistic work seems to be a type of work that cannot or should not be formalized, algorithmized, and automatized. But why? At least at first glance, the art market looks exactly like any other market: artists have to earn money with their work and their 'works'. It is not a realm of freedom, but only a kind of service or consumer goods industry that serves a special market. But on consumer markets we find lots of products that are mainly produced by machines and have no authorial signature attached to them (although there are things like brands, of course). But the idea of art without the intervention of a human author or causer – and even if her role consists precisely in demonstratively withdrawing – does not seem plausible to us. But even if we don't want to use a nowadays obsolete notion like 'genius' as the by definition non-automatable, we still insist on a certain originality and newness in art that is not the result of random processes (although randomness can be used, but as part of an original strategy). On the one hand, this need for non-random newness is what you would expect under market competition; on the other hand, this is in a way exactly what Shklovsky's de-automatization means – art has to give something 'new'. This idea, if we accept it, seems hardly to be reconcilable with programmed machines, insofar as a program by definition seems to mean to automatize a set of given, known and knowable steps.

Luhmann (2000: 38) remarks: "The artist's genius is primarily his body". One could say that *the separation of knowledge from the working body*, characteristic of the progression of capitalism and perhaps first discovered by Marx and then especially underlined by Harry Braverman in his study *Labour and Monopoly Capital*, does not or cannot take place in art. In that sense art remains in an uneasy, complicated relation to skilled handicraft. This also means that the work of art is somewhat removed from capitalist wage labor, since the artist is not separated from the means of production – e.g. her unique gesture with a brush. Her style.

But where does this indexical and juridical correlation of knowledge and body come from? For even if an artist – like Noll, for example – were to define himself precisely by delegating all work to machines, we would still call the result 'a work by Noll' and the original idea would be to delegate the work of art to machines. Another example: The work by Donald Judd is according to Sebastian Egenhofer (2008: 214; translation J.S.) "dissolved in the anonymity of the industrial dispositive". Nevertheless, it would still be pointless if another person or simply a company, based on the knowledge of how it is made, were to produce the same object again, as it is also done in principle in the industrial production of identical copies of, say, a chair – it would not be possible to recognize this reproduction as a work of art. Building on this, I would like to provide another example: Elaine Sturtevant borrowed the screen printing matrices from Andy Warhol for the *Flowers* and remade the *Flowers*. In 1991 she even made an entire exhibition with *Warhol Flowers* – and Warhol is said to have once said, referring to the production process of the *Flowers*: "I don't know. Ask Elaine" (quoted in Arning 1989: 44). Andy understands himself as a machine and Elaine knows the algorithm. Nevertheless, Sturtevant's appropriation of Warhol's knowledge is not a rationalization of Warhol's work in the sense that Sturtevant now simply makes 'cheaper Warhols', but she rather makes 'Sturtevants'. Works of art must not correlate knowledge with *a false body* – this would be what we call forgery. But Sturtevant does not forge – she connects herself to a body of work that simulates another body of work. And this simulation directs our attention exactly to the complex relation of art and automatic re-production. Of course, we can produce inexpensive reproductions of *Flowers* as posters (which do not count as works of art, but as reproductions). Warhol's life ended in 1987 and that stopped the production of original 'Warhols' – and that is a necessity: In the long run, the mortality of artists makes artworks scarce and that's why they have market value.

In Noll (or more generally in "Information Aesthetics") the works of art are removed from their historical context and are reduced to abstract structures that can be formalized. This also seems to reappear in recent AI art: "That might be an inevitability of AI art: Wide swaths of art-historical context are abstracted into general, visual patterns" (Bogost 2019). Art is de-historicized and re-stylized. Style wins over history. And therefore, the artists mentioned at the beginning are right: When style (or even 'form') wins over history, then they are substitutable. Then it is enough to have an image in the *style* of Warhol. It would no longer be necessary to link a given artifact indexically to a historical artist-body. This can be seen as a danger for the art market. It can be seen as democratization. One could even argue that having the stylistic options in, let's say, DALL-E or Stable Diffusion is not very different from buying Warhol or Van Gogh posters at IKEA, only that we can now be oh so creative prosumers, mixing our own Warhol-meets-Van-Gogh-meets-Star-Trek cocktail. As with all discourses of prosuming, this might be highly ideological – a pseudo-creativity that produces the illusion of mastery over the archive.

# **Style without museums**

Today there is again a nervous and multifaceted discussion on the future of work, given the challenges of robotics and AI. Some people think nothing will change and capitalism will adapt as it did in the past; some people think it will be different this time and that is a bad thing; some people think it will be different this time and that is a good thing. I have addressed these complicated discussions elsewhere (Schröter 2019).

Apparently, again, art is not directly threatened by computerization. Art does not appear in the highly discussed and controversial Oxford research report by Frey and Osborne from 2013, according to which 47% of work can be automated in the near future: the only activity that resembles artistic work is that of the 'art director', who gets off quite lightly with a 95th place on the computerizability probability list.

The problem, however, might not be that an AI system substitutes the work of an artist. The situation seems to be that all styles are available at a fingertip. In his 1996 essay, "The Archive without Museums", American art historian Hal Foster mused about the expanding academic discourse and institutions called *visual culture* in contrast to the discourse of *art history*. Visual culture, with its eroding of dichotomies between high / low, art / non-art etc. and in this respect an offspring of Cultural Studies, was much debated at that time. Foster is very skeptical about 'visual culture' as a disciplinary field. In his essay, he discusses the historical, institutional and technological conditions which led to the emergence of that field. Right at the beginning he contrasts these conditions with those that led to the emergence of art history. He proposes three different conditions for each field.

The emergence of art history is founded on a) the foregrounding of the 'constructive aspect of the artwork'; b) the interest in 'alien', non-European art fostered by 19th century imperialism – both a) and b) are shared, according to Foster, by art history and modern art – and c) the technologies of photographic reproduction: "Art history relied on techniques of *reproduction* to abstract a wide range of *objects* into a system of *style* – as defined in diacritical terms by Wölfflin in *The Principles of Art History* (1915) […]" (97). The 'system of style' is also what organizes the logic of museums (115). Subsequently Foster characterizes the conditions that led to 'visual culture'. He mentions a) the "visual virtuality of contemporary media"; b) the interest in "cultural multiplicity in a post-colonial age" and c): "Might visual culture rely on techniques of *information* to transform a wide range of *mediums* into a system of image-text – a database of digital terms, an archive without museums?" (97).

It seems that the performance of style by AI systems like DALL-E does in some aspects belong to the paradigm of visual culture. They are based on databases of images taken from the net and in that sense on an archive without museums. The postcolonial cultural multiplicity is also a characteristic trait of our contemporary situation – although there might be significant biases in DALL-E and similar software, due to the dataset. But the notion of 'style' is, for Foster, connected to the paradigm of art history and its media: "After photographic reproduction the museum was not so much bound by walls, but it was bordered by style" (115). It seems that this abstraction of a wide range of *objects* into a system of *style* is very characteristic of contemporary AI-based visual culture. In contemporary AI-based visual culture, however, we find less a system than a heterogeneous multiplicity of styles. As Roland Meyer (2022; translation J.S.) put it in a short and concise essay on DALL-E: "Style here can mean the individual style of a canonized artist, but also the image qualities of certain technical media or the look of pop-cultural imagery". AI-software systems produce *style without museums.*

The images are generated with Stable Diffusion, one is made from the prompt "Warhol Flowers", the other from "Sturtevant Flowers". We see that the style of Warhol is quite accurately generated in the left image. But actually the right image should look the same or at least more similar. But history is erased – the deconstructive gesture of appropriation which only operates historically cannot be reproduced. Ok, to be fair, I should have tried "Sturtevant Warhol Flowers" as prompt…

*Fig. 1, 2: Stable Diffusion, prompts: Warhol Flowers and Sturtevant Flowers.*

# **Conclusion**

To conclude, my final example addresses the work of Actress aka Darren J. Cunningham – a highly interesting DJ who creates experimental electronic music and is introduced in the blurb of the Transmediale in 2019 as follows:

Young Paint has been progressively learning and emulating the shadowy, unpredictable, UK bass- and rave-inspired music of Darren J. Cunningham, aka Actress. Over the course of 2018, the AI-based character has spent time programming and arranging Cunningham's sonic palette, learning not only how to react to his work, but also to take the lead with the occasional solo. A life-size projection of Young Paint working in a virtual studio parallels Cunningham's performance on stage, visualising their collaboration. (Transmediale 2019).

Obviously, Young Paint is not conceived only as a tool, but also as a partner, automatizing and at the same time transforming the *style* of Actress. Cunningham mirrors himself in a machine learning system that on the one hand learns and mimics his aesthetic strategies, but on the other hand produces unforeseeable digressions. This is a kind of 'surrealism without the unconscious' as Jameson (1991: 67–96) has put it, but in a new and critical way: Cunningham forms with his double a new assemblage – Actress / Young Paint – which enhances his aesthetic self-reflection. I have discussed this example at length elsewhere (Schröter 2021); what matters here is, again, that *style* is separated from the author – somewhat like in the case of Noll's Mondrian – but recursively modifies the author himself. Therefore, style becomes recursive and the author, Cunningham, is himself de-automatized, to use Shklovsky's notion. He has to react to the contingency produced by Young Paint. This might be the new aspect of interactive systems collaborating with artists, although one could say that this was also a strategy of the surrealists already.

As I have outlined in this chapter, the relation of artistic and creative labor to 'the automatic' in a wider sense has a long and complicated history. There are substantial differences between various usages of 'automatic' and its attendant problems and imaginaries. This history runs from the discussion on artistic photography in the 19th century, de-automatization in Russian Formalism and automatic writing in Surrealism to serial objects in minimal art and Warhol's Factory mimicking industrial and automatized production processes, to Appropriation Art and beyond. In that sense, the contemporary experiments are not so new anyway, but embedded in a long genealogy in which art (or at least some forms of art) struggled with the meaning of creativity, the emergence of 'the new' and the disruption or confirmation of hegemonic perception, the role of the author and / or forms of her withdrawal, the form and economic entanglements of artistic labor and the active resistance and intervention of the technological media of art. The case of Actress / Young Paint is an example of a recent experiment that addresses some of these questions. A genealogy of art and 'the automatic', however, is still missing, as far as I can see.

#### **Bibliography**

Arning, Bill (1989): "Sturtevant". *Journal of Contemporary Art* 2(2): 39–50.


Foster, Hal (1996): "The Archive without Museums". *October* 77: 97–119.

Frey, Carl B. / Osborne, Michael A. (2013): "The Future of Employment. How Susceptible are Jobs to Computerisation. Resource Document". Oxford Martin School, University of Oxford. http://www.oxfordmartin.ox.ac.uk/downloads/academic/The\_Future\_of\_Employment.pdf. (21 January, 2024).


Luhmann, Niklas (2000): *Art as a Social System*. Stanford: Stanford University Press.


# **Dumb Meaning: Machine Learning and Artificial Semantics<sup>1</sup>**

#### *Hannes Bajohr*

In June 2022, Google employee Blake Lemoine was given an indefinite leave of absence. The reason: He had claimed that the artificial intelligence he was helping to test was sentient, and the company thought such a claim bad press (Tiku 2022).<sup>2</sup> Lemoine insisted that LaMDA, a chatbot system, convinced him in lengthy conversations that it had the intelligence of a highly gifted eight-year-old, and asked to be considered a person with rights (Lemoine 2022b).<sup>3</sup> In doing so, Lemoine, who describes himself as "ordained as a mystic Christian priest", was merely exaggerating a sentiment that also afflicted others at Google. Blaise Agüera y Arcas, a senior machine learning engineer not usually prone to mysticism, wrote of his own interactions with LaMDA just days before Lemoine: "I felt the ground shift under my feet. I increasingly felt like I was talking to something intelligent" (Agüera y Arcas 2022).

In contrast, a discussion about another AI system, which took place at about the same time, did not use the buzzwords of sentience and intelligence at all. Dall·E, whose second version was developed by the company OpenAI around this time (since September 2023, the third version has been integrated into ChatGPT), is a text-to-image AI that can generate images from natural language input. Given a prompt such as "A Shiba-Inu wearing a beret and a black turtleneck", it produces an output image depicting that very scene (Ramesh et al. 2022). The public beta triggered a slew of experiments, and soon the most interesting or whimsical results were shared on the web and especially on Twitter.

This, too, was revealing: compared to the much less successful experiments with autonomous cars, it suggested that AI has significantly different social effects than long thought – that, before it puts truck drivers out of business, it is more likely to take the jobs of illustrators, graphic artists, and stock photographers (Prakash 2022).<sup>4</sup> Unlike in the case of LaMDA, however, no one thought Dall·E should be conceived of as a person with rights.

<sup>1</sup> This essay first appeared in German; the current version is a translation of Bajohr 2024b.

<sup>2</sup> Lemoine's term is *sentience*, not *consciousness,* but he seems to use them synonymously.

<sup>3</sup> In addition, Lemoine also published the chat transcript of a conversation with LaMDA (Lemoine 2022a).

The different reactions to the two systems show how quickly thinking about AI veers into familiar conceptual ruts. Intelligence, consciousness, sentience, and personhood have been the major themes of AI research and its imaginaries for nearly seventy years; amusing little pictures, by contrast, seem to raise fewer fundamental questions. But it is quite possible that it is actually the other way around – that the eternal hunt for superintelligence and the singularity obscures the more interesting and subtle conceptual shifts that escape both the tech evangelists in their visionary furor and their skeptical critics.

For philosopher Benjamin Bratton, it is clear that in the face of these new AI systems, "reality has outpaced the available language to parse what is already at hand". What is needed, therefore, is a "more precise vocabulary" (Bratton and Agüera y Arcas 2022) that goes beyond the usual handful of big concepts, but also beyond the anthropocentric assumption that the only way in which machines may form world-relations would have to be our way. We can observe such a tendency with Dall·E and LaMDA. Here, the concept of meaning becomes detached from its anthropocentric correlate. It would be meaning without mind – dumb meaning.

# **Free-floating and grounded systems**

Despite constant admonitions from computer scientists, linguists, and cognitive psychologists to use terms such as intelligence and consciousness with care, the tech industry remains relatively immune to such warnings. Thus, critics soon accused Lemoine of having fallen for the "ELIZA effect" (Christian 2022) – of having projected intelligence and consciousness onto LaMDA – a susceptibility Joseph Weizenbaum had already observed in 1966 among users of his ELIZA chatbot: Although ELIZA merely mimicked a Rogerian psychoanalyst, mirroring the patient's statements back to them as questions, its users behaved as if the program really were a conscious agent interested in their well-being.

The classic objection here is the following: Computers are symbol-processing systems that deal with syntax alone, not with semantics – they can process logical forms but not substantive meaning (Cramer 2008). For their operations it is irrelevant which objects or concepts the symbols name in a human world and which cultural valences are associated with them. Thus, ELIZA merely scans user input for a

<sup>4</sup> The June 11, 2022, issue of The Economist featured an illustration generated by an image AI. Since then, this has become somewhat of a fashion that will, without a doubt, soon give way to more sophisticated uses.

given syntactic pattern and transforms it into a "response" according to a transformation rule. Weizenbaum gives the example in which the analysand reproaches the analyst: "It seems that you hate me". The program identifies the key pattern "*x* you *y* me" in this sentence and separates it accordingly into the four elements "It seems that", "you", "hate" and "me". It then discards *x* ("It seems that") and inserts *y* ("hate") into the reply template "What makes you think I *y* you". And so ELIZA responds to the accusation that it hates the analysand by asking how they got that idea (Weizenbaum 1966).<sup>5</sup>
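The transformation just described can be made concrete in a few lines. The following is only a minimal sketch of a single rule, not a reconstruction of Weizenbaum's program: the regular expression, the group names `preamble` and `verb`, and the fallback reply are assumptions introduced for illustration.

```python
import re

# Hypothetical encoding of the key pattern "... you ... me".
# `preamble` is the part before "you"; `verb` is the part between "you" and "me".
PATTERN = re.compile(
    r"^(?P<preamble>.*?)\byou\b\s+(?P<verb>.+?)\s+\bme\b[.!?]*$",
    re.IGNORECASE,
)
TEMPLATE = "What makes you think I {verb} you?"
FALLBACK = "Please go on."  # assumed default when the pattern does not match


def respond(utterance: str) -> str:
    """Apply the single transformation rule to one line of user input."""
    match = PATTERN.match(utterance.strip())
    if match is None:
        return FALLBACK
    # The preamble is discarded; only the captured verb is reused.
    return TEMPLATE.format(verb=match.group("verb"))


if __name__ == "__main__":
    print(respond("It seems that you hate me"))
```

Nothing in the function depends on what the captured word names: any string in the `verb` slot is copied into the template in exactly the same way.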

This interaction may have meaning for the user and plausibly suggest a communicative intent on the part of ELIZA, but neither such intent nor such meaning is actually to be found in the program. It has merely processed symbols according to a rule without "knowing" what hate is or what behavior the mores of civil discourse suggest. That is the difference between the processing of information and the understanding of meaning.

For AI researchers who seek to make computers more human, this state of affairs describes what cognitive psychologist Stevan Harnad called the "symbol grounding problem": Symbols, like those in Weizenbaum's transformation operation, have no intrinsic meaning for computers because, without the background of practical knowledge of the world, they can only refer to other symbols, never to any reality beyond them. They are not grounded in the world, and there is no way out of this "symbol / symbol merry-go-round". Whatever meaning there is can only be "parasitic", and is projected onto the output by human interpreters. (Harnad 1990:340, 339)

Harnad's criticism, however, was directed against only one particular type of AI, which also includes ELIZA; for obvious reasons, it is called "symbolic". To solve the symbol grounding problem, Harnad relied on the novel "*sub*symbolic" or "connectionist" systems of the time: neural networks of which LaMDA and Dall·E are late descendants. Unlike traditional AI, they are not designed as a set of logical rules of inference, but are vaguely modeled after the brain as neurons and synapses that amplify or attenuate the signals passed through them. They therefore do not require explicit symbolic representations and rules – they are not programmed, but learn independently from examples. While neural networks were mainly used for pattern recognition in the early 1990s, Harnad thought they might be able to access the world. Implemented in an autonomous, mobile robot, equipped with sensors and effectors, a conglomerate of neural networks would first receive impressions and categorize them as recognizable shapes. These would then be handed over to a symbolic AI, but would now no longer be mere references to other symbols but rather

<sup>5</sup> I have simplified the procedure somewhat; moreover, ELIZA allows quite different transformation rules, and the therapist is only one subroutine, called DOCTOR.

connected to the world via their causal reference to external data – they would finally be grounded (Harnad 1993).

The consequence of this thought, however, seems to be that the only way to get around the ELIZA effect, which falsely attributes consciousness to computers, is to *actually* give them consciousness. For what Harnad has in mind is, in the end, again an anthropocentric model that hopes embodied cognition and sufficiently extensive referential meanings will produce world understanding, since this is how we more or less function, too. The success of his hybrid model would have to be demonstrated by his robot being as competent at navigating the world as if it were actually intelligent. Since this is not yet the case, the symbol grounding problem cannot yet be considered solved either; a *bit* of meaning does not exist here by definition. And yet, such limited meaning is exactly what LaMDA and Dall·E seem to suggest.

# **Gradated meaning**

With the increasing popularity that neural networks have enjoyed for almost ten years now, the idea that they somehow could have access to meaning beyond mere ungrounded symbols has also become more attractive again. For media studies scholar Mercedes Bunz, neural networks, thanks to their complexity and capacity for unsupervised learning, can now "calculate meaning" rather than just empty symbols (Bunz 2019). And it is true that, in the face of neural networks, the binary distinction between meaning (human world) and non-meaning (digital systems) is becoming increasingly difficult to maintain. Maybe we should consider levels of gradated meaning which, as artificial semantics, no longer presuppose a mind.

Thus, rather than taking it as a sign of consciousness, the fact that LaMDA's answers sounded so human-like can simply be understood as an indication of such "dumb" meaning. While "broad" meaning presupposes – depending on your philosophical or disciplinary orientation – embodied intelligence, cultural and social background knowledge, or the world-disclosing function of language, dumb meaning would operate below this scale (which is always calibrated on humans) and could best be grasped as an effect of *correlation.*<sup>6</sup>

LaMDA is – similar to the better-known text generator ChatGPT – a large language model implemented as a neural network. Trained on vast amounts of text, it

<sup>6</sup> Dumb Meaning excludes *natural* meaning (such as in the symptom/disease relation). It also cannot contain the *intentionalist* meaning Paul Grice has theorized, according to whom the meaning of an utterance is dependent on recognizing the speaker's intention, which in turn requires consciousness. And finally, it is only in a very limited way a *use theory* in the tradition of the late Wittgenstein, since "use" presupposes a shared social background, which requires world-understanding, which, in turn, assumes embodied intelligence.

processes language as a multi-dimensional vector space, a so-called "word embedding", which works according to the principle of staggered correlations first suggested in the "distributional hypothesis" in the 1950s (Harris 1954; Firth 1957): First, words that frequently appear together are closer together in this space, forming semantic clusters familiar from word clouds. However, since not only the correlations of words to words but also correlations of correlations are encoded, large language models can also explicate implicit regularities that are not spelled out in the training text. This is true for syntactic relations – when the Euclidean distance between the vectors for the positive and the superlative form is roughly the same from one adjective to the next – but also for complex semantic relations, that is, word meaning. One of the best-known examples of this principle is the operation $v_{\text{king}} - v_{\text{man}} + v_{\text{woman}} \approx v_{\text{queen}}$ (Mikolov, Yih, and Zweig 2013).<sup>7</sup>

#### *Fig. 1: Word embedding of a large language model*

In this equation – which reads: subtract from the word vector "king" that for "man" and add that for "woman," and the result is the word vector for "queen" – the latent semantic relation "gender" emerges as an arithmetic correlation, even though it is not explicitly present in the model (fig. 1). That it arises from the mass of language on which the model is trained explains machine learning's susceptibility to biases: sexism and racism may also be latently encoded in language models (Bender et al. 2021; Bajohr 2024d). The meaning of a sign in a language system constructed in this way is determined purely *differentially*, as in Ferdinand de Saussure's linguistic structuralism. Instead of referring to anything outside language, sign meaning is simply thought of as difference from other signs and sign correlations (this is excellently explained in Gastaldi 2021).
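The arithmetic can be tried out on a toy example. The sketch below is purely illustrative: the two-dimensional vectors are invented by hand (one axis loosely standing for "royalty", the other for "gender"), whereas real embeddings have hundreds of dimensions and are learned from text; the function names are likewise assumptions.

```python
import math

# Hypothetical hand-made vectors: axis 0 ~ "royalty", axis 1 ~ "gender".
VECTORS = {
    "king": (0.9, 0.9),
    "queen": (0.9, -0.9),
    "man": (0.1, 0.9),
    "woman": (0.1, -0.9),
    "apple": (-0.8, 0.0),
}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def analogy(a, b, c):
    """Return the word whose vector is closest to v_a - v_b + v_c."""
    target = tuple(
        x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])
    )
    # The three input words are excluded from the search, as is customary.
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))


if __name__ == "__main__":
    print(analogy("king", "man", "woman"))
```

In this toy space the "gender" relation is simply a constant offset along one axis; no entry in the table states that kings are male or queens female.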

<sup>7</sup> This insight still applies to newer, technically different models such as GloVe (Global Vectors for Word Representation).

Large language models, such as ChatGPT, Claude, BARD and LaMDA, basically still follow this principle. However, they no longer vectorize individual words – which would be impossible given the huge amount of training data. Instead, ChatGPT uses a network architecture called the "transformer" to build embeddings of entire word sequences (Meng et al. 2023). To do so, transformers use the concept of "attention" (Brown et al. 2017). The model learns to establish relationships between all words in a sequence; from these attention patterns, the transformer constructs vectors that represent the entire sequence – so-called contextualized embeddings. In contrast to static word vectors, these sequence embeddings take into account a much larger context of use. In a decoder function, the transformer then uses these contextualized embeddings to generate new texts that match the input context. The ability to build such complex sequence representations is the key to the performance of large language models (Wolfram 2023). The effect, nevertheless, is that large language models, by their immense training data alone, are able to produce apparently situational understanding, as LaMDA did, without ever being "in a situation".<sup>8</sup>
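How attention turns static vectors into contextualized ones can be sketched in a few lines. This is a deliberately reduced illustration, not the architecture of any particular model: it assumes a single attention head and uses the input vectors directly as queries, keys and values, whereas actual transformers apply learned projections and stack many such layers. The input vectors are invented.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention without learned projections.

    Each output vector is a weighted average of all input vectors, so the
    representation of a token depends on the whole sequence around it.
    """
    dim = len(embeddings[0])
    contextualized = []
    for query in embeddings:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in embeddings
        ]
        weights = softmax(scores)
        contextualized.append(
            [sum(w * vec[i] for w, vec in zip(weights, embeddings))
             for i in range(dim)]
        )
    return contextualized


if __name__ == "__main__":
    token = [1.0, 0.0]  # the same static vector placed in two contexts
    print(self_attention([token, [0.0, 1.0]])[0])
    print(self_attention([token, [1.0, 0.0]])[0])
```

The same static vector comes out differently depending on its neighbour, which is the sense in which such embeddings are "contextualized".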

Language models would then be producers of a first degree of dumb meaning. It is dumb because the model captures latent correlations between signs, but still does not "know" what things these signs actually name; with this kind of meaning, one will not be able to build an intelligence that will ever find its way around in the world. The linguist Emily Bender, a vehement critic of all AI hype about alleged consciousness, admits with her colleague Alexander Koller that "a sufficiently sophisticated neural model *might* learn some aspects of meaning", such as semantic similarity, but considers them to be "only a weak reflection of actual meaning", which is always related to something in the world, i.e. "grounded". (Bender and Koller 2020)

But as wrong as it would be to project anything like sentience or consciousness onto this system, one should also not be too quick to dismiss this modicum of meaning.<sup>9</sup> Insofar as language models make implicit knowledge explicit in a nontrivial way – even if only by matrix transformations in a vector space – they produce dumb

<sup>8</sup> This is philosopher Hubert Dreyfus' term for the prior world-understanding that humans, but not computers, have (Dreyfus 1992).

<sup>9</sup> In this respect, I agree that it is "productive to consider reference as just one (optional) aspect of a word's full conceptual role" (Piantadosi and Hill 2022: 4). The paper by Piantadosi and Hill makes a similar argument to the present essay; however, I believe that the authors go too far in the direction of attributing "rich, causal and structured internal states" to LLMs, which again seems to border too much on anthropomorphism (Piantadosi and Hill 2022: 5). I would also like to note that I am unhappy with N. Katherine Hayles's notion of computers as "cognizers" – a concept that also suggests a subjectivity on the part of the operating systems that I find difficult to subscribe to; on the positive side, however, Hayles emphasizes the production of meaning by such systems (Hayles 2019; Hayles 2022).

meaning which would not have been available to us without them.<sup>10</sup> In contrast to ELIZA – whose *x* and *y* were only empty placeholders to the system – neural networks are not *solely* parasitically dependent on the meaning attributions of human agents, but *also* operate productively with the inherent distributional structure of language.

#### **Text and image and world**

Bender and Koller are of course right that LaMDA is not grounded.<sup>11</sup> It is a *uni*modal network, processing only a single type of data, namely text. To be grounded in Harnad's sense, she writes, it would be necessary to combine several types of data – it would have to be *multi*modal machine learning (Singer 2022). That is what Dall·E is: Instead of text just referring to other text, here text is correlated with image information. This raises the hope again that arbitrary signs can be linked to things in the world to produce grounded meaning.

Harnad's hypothesis that neural networks in particular could address the symbol grounding problem has recently been taken up by media scholars Leif Weatherby and Brian Justie with their notion of "indexical AI". It is named after Charles Sanders Peirce's notion of the index. Unlike the symbol, which has a purely conventional relationship to its signified (as "dog", "chien", and "Hund" all refer to the same thing), the index is causally linked to it (as smoke refers to fire).

With this coinage, the authors make Harnad's project the basis of a description of contemporary technological culture: "Digital systems, relying on the neural net, have left the world of mere symbol behind and have begun to ground themselves *here*, *now*, for *you –* they are able to *point* to real states of affairs". (Weatherby and Justie 2022, 382)<sup>12</sup> Neural networks bring the world – as the data on which they have been

<sup>10</sup> The assumption here is that this operation in fact finds something previously unknown and does not simply unfold a tautology; a model of this idea would be Kant's conviction that mathematical propositions are synthetic judgments a priori, that is, that they actually produce *new* knowledge.

<sup>11</sup> While the paper presenting LaMDA also claims "groundedness" for the model, what is meant by this is simply that LaMDA's outputs are "grounded in known sources wherever they contain verifiable external world information". (Thoppilan et al. 2022, 2) As *textual sources,* they continue to be part of Harnad's "symbol/symbol merry-go-round" (Harnad 1990, 340).

<sup>12</sup> One difficulty with this notion is the question of whether *all* data in a neural network should already be considered indexical (that would include the text of LaMDA), or only those obtained directly by sensors emulating physical senses (that would be images, but not text). Weatherby and Justie seem to have the former in mind, Harnad the latter. Harnad therefore speaks at one point of "iconic representations" through data (Harnad 1990, 342) – Peirce's third sign type, which operates on the principle of similarity between sign and signified. But since these are also indexical as they originate from sensors (which limits their scope

trained – into the computer, getting off of the solipsistic "symbol / symbol merry-go-round". If we subscribe to this assertion for a moment, we see it plausibly demonstrated in Dall·E.

The heart of Dall·E is a machine learning model called CLIP. Via an encoder, it is fed with vectorized text-image pairs taken from the Internet – for example, a photo of a cat with the caption "this is my cat". CLIP is trained to predict which text vector matches which image vector; the result is a comprehensive stochastic model that correlates image information with text information, but stores it as *one* type of information. In figure 2, this is the table in which the scalar product of the text and image vectors is listed – the better the text and image fit, the higher this value; when the original image and text are paired, it is of course optimal (those are the black boxes running diagonally).
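The table of scalar products can be illustrated with a toy computation. The sketch below is an assumption-laden simplification: the three-dimensional vectors are invented by hand, whereas CLIP obtains its vectors from trained text and image encoders, and the vectors are normalized here so that the scalar product becomes a cosine similarity.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def similarity_table(text_vecs, image_vecs):
    """Rows are captions, columns are images; each cell is a scalar product."""
    texts = [normalize(t) for t in text_vecs]
    images = [normalize(i) for i in image_vecs]
    return [[sum(a * b for a, b in zip(t, i)) for i in images] for t in texts]


def best_caption(table, image_index):
    """Index of the caption whose vector fits the given image best."""
    column = [row[image_index] for row in table]
    return column.index(max(column))


# Hypothetical embeddings for three caption/image pairs (pair k = row k, column k).
TEXT_VECS = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
IMAGE_VECS = [[0.9, 0.2, 0.1], [0.1, 0.8, 0.3], [0.2, 0.1, 0.9]]

if __name__ == "__main__":
    for row in similarity_table(TEXT_VECS, IMAGE_VECS):
        print(["%.2f" % value for value in row])
```

In this toy table the largest value in each column sits on the diagonal, where caption and image belong together; training a model like CLIP amounts to pushing the real table toward that shape.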

CLIP is thus remarkably good at *image recognition*: If you present it with an unknown cat photo, it nevertheless recognizes it as "cat". But in a second step, it also becomes an *image generator*. To do this, it works in conjunction with another machine learning model called GLIDE (Guided Language to Image Diffusion for Generation and Editing), which has already been trained on a large data set of images.<sup>13</sup> If the

to immediate, e.g. visual similarity), it seems to me that the argument of Weatherby/Justie and that of Harnad amount to something structurally similar – both are concerned with the connection between system and world, understood more or less broadly.

<sup>13</sup> GLIDE is a *diffusion model* based on thermodynamic models, and thus functions differently from the GANs that were popular until recently, which combine two antagonistic submodels (Dhariwal and Nichol 2021). That the AI architectures used for an aesthetic work can themselves be a resource for discussing that work is something I suggest in Bajohr (2022).

user enters a prompt, GLIDE can use the text-image data stored in the CLIP model to reverse this process and synthesize an image that best correlates with the input text. In both operations – image recognition as well as image generation – it is again central that the models can learn and actively reproduce the *correlation* between textual descriptions of objects and their corresponding visual manifestations.

One may object that the image information correlated with the word "cat", in which the photo of a cat is stored, may have an indexical relation to this cat – light was reflected from it and fell on a photo sensor etc. – but that even so the system will not learn what it means to share a world with a cat. Advocates of symbol grounding therefore try to extend what types of data an AI model gets fed – not only sensory but also motoric and eventually even social feedback: Only through the effects of language use in a community of other speakers inhabiting the same world can meaning be learned (Bisk et al. 2020).

But this claim would again mean to demand "full" human, that is, broad meaning, and to take anything below that not quite seriously. Instead, multimodal AI should be regarded as a second degree of dumb meaning. The Peircean indexical reference to something outside the model and the Saussurean differential reference to other elements within it are at any rate two distinct ways of meaning-making – if only in that the dimension of possible correlations increases, and with it the possibility of unearthing unsuspected latent connections, unsuspected dumb meaning.

Indeed, multimodal AIs – besides Dall·E, for instance, Stable Diffusion, Google's yet-to-be-released Imagen, or Midjourney – are capable of generating very complex text-image meanings. Their power lies in a capability that suggests that such correlations have a productive quality: In studying the deep structure of CLIP, computer scientists found that the model had trained single "neurons" that fired for both the word and the image of a thing. These were *conceptual* neurons in which the distinction between image and text tended to be overcome (Goh et al. 2021). Multimodality, at the neural level, is really *pan*modality, suggesting a semantics without clearly differentiated sign systems (this is also suggested by Merullo et al. 2022). Dumb meaning finds a new quality here, and is not tied to either text or image data, but encompasses both in a way that points to meaning beyond modal separation – and again has nothing to do with mind (see for more on this Bajohr 2024c).

#### **Promptological investigations**

AI systems *are* dumb. They have no consciousness. Yet they produce a complex artificial semantics that runs counter to our ordinary notions of meaning. Multimodal AI also shows that imputed consciousness and the meaning-capacity of a system have little to do with each other: The fact that LaMDA in particular seemed to Lemoine like a person – and not Dall·E, although one might argue that it represents a higher, because more correlation-rich stage of AI development – is simply due to the fact that it operates dialogically and thus is assumed to have communicative intent, whereas the image generator does not. Language always seems to be smarter than the image.

However, meaning beyond communicative intent need not be *merely* parasitic, as the vector operations of word embeddings and the conceptual neurons of text-to-image AIs show. That it is always *also* parasitic is due to the fact that the training data originate from a human world and artificial semantics is precisely not a "robot language" but a correlation effect of information that can be interpreted by humans. Nevertheless, in the long run, a convergence of dumb and broad meaning would be conceivable once they enter into mutually influencing circular processes.

The interface between natural and artificial semantics in the case of Dall·E is the interaction via prompt. On the one hand, "prompt design" *–* the precise, almost virtuosic selection of the text input – can be used analytically to scan the vector space of dumb meaning for traces of cultural knowledge. This would make the broad meaning of natural language, precisely in its interaction with dumb meaning, more important again.

A "promptology" that takes on such natural-artificial connections – the correlation of datafied language and the cultural meaning attributed to that language on the recipient side – would be a gateway for the humanities and cultural studies. With their knowledge of soft factors such as style, influence, iconography, etc., they could make useful contributions without necessarily taking the form of the more computer science-focused digital humanities; they could work in a phenomenon-oriented way and devote themselves to the artifacts that the model outputs as boundary objects between human and machine, between broad and dumb meaning.

At the same time, however, promptology is not merely an analytical procedure, but also a practice with its own knowledge, which has much to do with an almost "empathetic" interaction with the AI system. It has turned out that with text-to-image AIs, the outputs can be steered in unexpected directions simply by using certain, often counterintuitive or absurd formulations – there is already a start-up, PromptBase, which claims to sell particularly effective prompts (Wiggers 2022).<sup>14</sup> Instead of subjugating the system and using it as an instrument, natural language must be adapted to the artificial semantics just to operate the system.

The result is a feedback loop of artificial and human meaning: not only does the machine learn to correlate the semantics of words with those of images we have given it, but we learn to anticipate the limitations of the system in our interaction

<sup>14</sup> What is interesting here is that the discussed tendency to eliminate the speech/image distinction at the *technical* level is contrasted with the displacement of the image by speech at the *interface* level. The results of Dall·E could therefore also be understood as *language art* instead of being mere visual objects.

with it; this convergence would not be communicative in a strong sense, but perhaps in a weak, a dumb, sense.<sup>15</sup>

# **Bibliography**


<sup>15</sup> On this aspect, see Bajohr (2024a).


# **Artist-Guided Neural Networks – Automated Creativity or Tools for Extending Minds?**

*Varvara Guljajeva, Mar Canet Sola,<sup>1</sup> Isaac Clarke*

# **Introduction**

Recent advancements in AI, such as the CLIP-based products Midjourney and DALL-E, are claimed to augment our creativity. For the first time, it does not sound so absurd that artists could find themselves out of jobs (Nicholas 2017). Not that artists have ever had secure and stable jobs, but deep learning (DL) tools might eventually cost them some commercial commissions. However, such thinking relies on a modern art approach in which skills, and not the conceptual idea, are at the centre of attention. Quoting Lev Manovich: "Since 1970 the contemporary art world has become conceptual, ie focused on ideas. It is no longer about visual skills but semantic skills". (2022, 62) Echoing Aaron Hertzmann, painters were once in a similar situation when photography was invented and took over the niche of portrait-making. Visual artists then had to re-invent themselves and re-think the meaning of painting. And photography had to wait another 40 years until it was recognized as an artistic medium (Hertzmann 2018).

Computer art emerged with the invention of the computer. Artists such as Vera Molnar and Manfred Mohr created their first computer-generated artworks in the 1960s using scientific lab computers at night, when scientists were not using them. Early computer artists were re-purposing a machine for artistic use and writing code to make art on it. Since the creation process was mediated by a computer, it may have seemed to the general audience that the artists were simply pressing a button and the computer was making the art for them. Hence, the question of authorship emerged: is the artist a machine or a human?

Paradoxically, today with the appearance of neural networks (NN) and their creative applications, the same question re-appears. For example, Aaron Hertzmann has written several articles arguing that people do art and not computers (Hertzmann 2018, 2020). Lev Manovich also describes how AI-generated images that imitate realist and modernist paintings are claimed to be art (Manovich 2022). At the same time, experimental art forms, like installation, interactive, performance and sound art, are often overlooked unless they are promoted by a large corporation.

<sup>1</sup> Equal contribution.

Instead of re-telling the short but very dense history of DL technology development, in the next section we focus on the appearance of NN tools that raised interest among artists and led to meaningful artwork production.

### **Historical overview of deep learning development**

DL is a subset of machine learning (ML) using Deep Neural Networks (DNN) to learn underlying patterns and structures in large datasets. In 2012, a DNN designed by Alex Krizhevsky outperformed other computer vision algorithms to achieve the new state of the art in the ImageNet Large Scale Visual Recognition Challenge (Krizhevsky / Sutskever / Hinton 2017). This model, AlexNet, signalled the start of a new DL era. Over the past decade, DNNs have continued to grow in size and complexity, and are now used in a wide range of tasks including Computer Vision, Natural Language Processing, and even playing board games like Go. As AI technology has developed and become more prevalent in real-world systems, artists have been exploring its limits and potentials, adapting these models to their own practices.

As the number of scientific publications on artificial intelligence grows exponentially (Krenn et al. 2022), and the artistic interest grows alongside, it is useful to map out the influential papers, and related applications, to help track the evolution of the AI-Art space in relation to the technological advances. Figure 1 shows a timeline of the development of generative models for images and text. Using this diagram we can make a few observations on the past ten years: the dominance of GANs for image generation, the influence of the Transformer on Large Language Models (LLM), and the growing interest in multi-modal approaches and translation models. The starting period of image generation using DNNs can be traced back to the creation of the Variational Auto-Encoder (VAE) (Kingma / Welling 2013), and the Generative Adversarial Network (GAN) (Goodfellow et al. 2014). These models showed different ways in which a neural network can be trained on a large dataset, and then used to generate outputs that resemble but do not copy the original dataset.

*Fig. 1: Timeline of creative deep learning development.*

For much of the past decade, GAN art has been a dominant and defining element of AI Art. GANs are trained using a competitive lying game, played by two players: the Generator and the Discriminator. The Generator wins by making an image that the Discriminator thinks is from the original dataset. The Discriminator wins by successfully identifying which images the Generator has made. By playing this game repeatedly, both sides slowly learn when they have been fooled and remember information so they do not fall for the same tricks again. The Generator gets better at making images, and the Discriminator gets better at detecting these fakes. At the end of the game we are left with a Generator that is very good at generating new images, with the qualities and style of our original inputs.
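The two sides of this game can be written down as two scoring functions. The sketch below shows only the objectives, not a full training loop, and makes some assumptions: the Discriminator is taken to output a probability strictly between 0 and 1 that an image is real, the Generator uses the commonly applied "non-saturating" variant of its loss, and the function names are our own.

```python
import math


def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Low when a real image is scored near 1 and a generated one near 0.

    d_real: Discriminator's probability that a dataset image is real.
    d_fake: Discriminator's probability that a generated image is real.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake: float) -> float:
    """Low when the Discriminator mistakes a generated image for a real one."""
    return -math.log(d_fake)


if __name__ == "__main__":
    # As the Generator improves, d_fake rises: its own loss falls while
    # the Discriminator's loss grows.
    for d_fake in (0.1, 0.5, 0.9):
        print(d_fake, generator_loss(d_fake), discriminator_loss(0.9, d_fake))
```

In training, each player adjusts its network to lower its own loss, which by construction raises the other's; this opposition is what the text calls a competitive game.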

*Image-to-Image Translation with Conditional Adversarial Networks* (Isola et al. 2016), also known as pix2pix, showed a process for converting one type of image into another. Mario Klingemann's work *Alternative Face*<sup>2</sup> used the pix2pix model with a dataset of biometric face markers and the music videos of the singer Françoise Hardy. This allowed him to control the movement of the face with this form of digital puppetry, which he then demonstrated by transferring the facial expressions of the political consultant Kellyanne Conway onto Hardy's face as she talks about "alternative facts".

In 2015, on the Google research blog, the post *Inceptionism: Going Deeper into Neural Networks* (Mordvintsev / Olah / Tyka 2015) described a tool developed by researchers attempting to understand how image features are understood in the hidden layers of the NN. Alongside this post they released a tool called DeepDream. This model enhances an image with the NN's attempts to find the features of the dataset it was trained on. The creative use of DeepDream was proposed by the authors in the original article: "It also makes us wonder whether NNs could become a tool for artists – a new way to remix visual concepts – or perhaps even shed a little light on the roots of the creative process in general". DeepDream's psychedelic imagery quickly caught the attention of the internet and of artists around the world, resonating with those interested in understanding the cross-over between biological and neurological constructions of images. Memo Akten's work *All Watched Over By Machines Of Loving Grace: Deepdream edition*,<sup>3</sup> made for an exhibition of DeepDream artworks in 2016, hallucinated over an aerial photograph of the GCHQ headquarters. This work raises questions around the motivations of the organisations funding the development of artificial intelligence, and in doing so makes the dreamlike qualities a little more nightmarish.

CycleGAN continued with the problem of image-to-image generation shown in pix2pix, but removed the requirement of aligned image pairs for training (Zhu et al. 2017). Instead, a set of source images and a set of target images that aren't directly related can be used. This has the advantage that it is simpler to scale to larger datasets, making the process more accessible for artists. Helena Sarin has been using CycleGAN for a number of years, and recently in *Leaves of Manifold*<sup>4</sup> she collected and photographed thousands of leaves to build her own training dataset, and then implemented a custom pipeline with changes that improve results when working with smaller datasets. This personalised approach in crafting the models resonates with the hand-made, collaged aesthetic of the images generated.

<sup>2</sup> https://underdestruction.com/2017/02/04/alternative-face/

<sup>3</sup> https://www.memo.tv/works/all-watched-over-by-machines-of-loving-grace-deepdream-edition

Other notable developments to GANs brought improvements to image quality and resolution (Karras et al. 2017). In late 2018, the release of StyleGAN (Karras 2021), a model built on a combination of ideas from Style Transfer and PGGAN, demonstrated very convincing images of human faces. In his article "How to recognize fake AI-generated Images", the artist Kyle McDonald (McDonald 2018) investigated the images generated by StyleGAN, and highlighted the visual artefacts he found. At a glance these images look like photographs, but on closer inspection irregularities such as patches of straight hair, misaligned eyelines, or mismatched earrings reveal the difficulties GANs have in managing "long-distance dependencies" in images.

In 2017 the paper *Attention Is All You Need* (Vaswani et al. 2017) proposed a new network architecture called the Transformer. This model addressed the long-distance dependency issue in RNNs and CNNs by rethinking how we could handle sequences. Rather than looking at a sentence word by word, the Transformer observes the relationship between all elements of the sequence simultaneously. Being able to better handle long-distance dependencies meant the Transformer was appropriate for natural language generation. Artists and poets such as Allison Parrish have explored the use of VAEs for short text generation, but with the emergence of LLMs, long, coherent passages of text could be generated (Brown et al. 2020). As dataset sizes increased, along with hardware costs for training these large models, they have become harder for individuals to train themselves, and the mode of interaction has shifted from curated datasets and homemade scripts to web APIs and third-party services. While it is more difficult to participate in the training process, the availability of services and interfaces provides new ways of working with these models that can produce less technical and more playful approaches. For example, Hito Steyerl used GPT-3 to create *Twenty-One Art Worlds: A Game Map* (2021) and described the process as "fooling around" with GPT-3 to write descriptions of different art worlds, using the tool in the process of world building. In the resulting text it is difficult to distinguish which words may have been written by Steyerl and which were written by GPT-3.

The learnings from Large Language Models for text generation were soon applied to image generation (Dosovitskiy et al. 2021), and the simultaneous release of

<sup>4</sup> https://www.nvidia.com/en-us/research/ai-art-gallery/artists/helena-sarin/

CLIP (Radford et al. 2021) and DALL-E (Ramesh et al. 2021) signalled the start of a new era of image generation. Although the DALL-E model was not released, CLIP was made available to the public, and the model was quickly adopted by AI artists who applied the idea of CLIP guidance to various image generation techniques. Ryan Murdock produced the colab notebooks DeepDaze<sup>5</sup>, combining CLIP and SIREN, and BigSleep<sup>6</sup>, combining CLIP and BigGAN, which were subsequently adapted by Katherine Crowson in the widely distributed VQGAN+CLIP notebooks<sup>7</sup>.

The paper *Denoising Diffusion Probabilistic Models* (Ho / Jain / Abbeel 2020) introduced a different method for creating generative models. This technique trains a model by adding increasing amounts of noise to an image and then having the model remove the noise, resulting in a model that can generate images from only noise. Diffusion models, when combined with CLIP or other conditioning processes, enable much faster text-to-image processing.
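The noising half of this procedure can be sketched in a few lines of NumPy. The example below is a hedged illustration, not the paper's code: it shows only how an image is mixed with Gaussian noise at a chosen step `t`, using a linear noise schedule whose values (`beta_start`, `beta_end`, `num_steps`) are assumptions for this sketch. The denoising network itself, which would be trained to predict the added noise `eps` from `x_t` and `t`, is not implemented.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for a linear noise schedule.

    The schedule values are assumptions chosen for illustration.
    """
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def add_noise(x0, t, alpha_bar, rng):
    """Jump directly to noising step t: blend the clean image with noise.

    Returns the noisy image x_t and the noise eps that was mixed in.
    During training, a model would be asked to recover eps from (x_t, t).
    """
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps


rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
image = rng.uniform(-1.0, 1.0, size=(8, 8))  # stand-in for a real image

for t in (0, 499, 999):
    x_t, eps = add_noise(image, t, alpha_bar, rng)
    # The share of the original image left in x_t shrinks as t grows.
    print(t, float(np.sqrt(alpha_bar[t])))
```

At the last step almost none of the original image remains, which is why a model that has learned to undo the noise step by step can start from pure noise when generating.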

The popularity and accessibility of these techniques were further raised by the product releases of DALL-E 2 (Ramesh et al. 2022) and Midjourney<sup>8</sup>. Midjourney became so popular that it is now the largest Discord server, with over 5 million members at the time of writing. Following the releases of these products, open source models such as Stable Diffusion have also been developed. There are many benefits of using free and open source models for artists. Being able to modify code and develop on your own hardware allows the artist to pursue their own experimental approaches, not restricted to the interface designed by a service provider.

The artist's involvement in generating new images with these models is vastly different to working with GANs. Rather than building custom datasets and training models, the focus has shifted to writing prompts that can generate the images the artist wants to find, and designing interfaces for exploring these prompts and their translations. The artist Johannez coined the term "Promptism" (Herndon / Dryhurst 2022) for this art practice, and wrote a humorous Promptist manifesto using GPT-3. Against a backdrop of models trained on hundreds of millions of images scraped from the internet, the manifesto asserts "The prompt must always be yours" (Johannez 2021).

### **Artist-Guided Neural Networks**

Many papers discuss AI from the point of view of creativity, mostly taking one of two positions: either AI is an amazing tool for artists and creativity, or AI is seen as

<sup>5</sup> https://github.com/lucidrains/deep-daze

<sup>6</sup> https://github.com/lucidrains/big-sleep

<sup>7</sup> https://github.com/EleutherAI/vqgan-clip

<sup>8</sup> https://www.midjourney.com

something negative in art. It is easy to see that people from the industry advocate for the first position, and theory scholars for the second one. But how do practitioners themselves see contemporary AI technology? And in what ways is AI deployed in art practice? Hence, it is not the focus of this paper to discuss whether AI can make art, but rather how AI can be useful for artists and what new ideas it can offer. By using practice-based research methodology, we decode the role of AI tools in artistic practice and trace the evolution of such artistic work. In this paper, the practice of the artist duo Varvara & Mar is used as a case study, which provided us with the insights for this research.

In this chapter, we explore the various ways in which AI was deployed in creative practice, dividing it into four categories based on medium: synthetic image, synthetic text, synthetic form, and translation models. From the view of the practitioner, the limitations, new possibilities, and changes in production processes are discussed.

#### **Synthetic Image**

Our exploration of DL started with image generation in 2017, when we used Google's Deep Dream algorithm. The idea behind the *Neuronal Landscapes* project<sup>9</sup> was to imagine what the Estonian landscape will look like in 100 years' time (a commission for the Estonian History Museum). As mysterious, machine-made synthetic vistas become prevalent, the artwork offers an experience of what it is like to see the environment through machine eyes, and to be immersed in the endless hallucinated simulacra of a neural net. Our intention here was to go beyond a still image and achieve immersiveness, depicting the evolution of Estonian society through time: from forest and farmland towards urbanisation and increasing digitalization. For that purpose, a 360º VR video (00:09:39 long) was created. First, the video material was filmed and then edited. Filming was done with two 360º cameras carried by a drone to achieve a seamless, continuous shot without a visible camera. After the stabilisation of the image, each frame was passed through the Deep Dream algorithm. The rendering process took 30 days on two (for that time) powerful machines with Nvidia TitanX GPUs working in parallel. Although we could change parameters of the algorithm to slightly customise the effect, the imprint of the algorithm was still very present. In the end, Google Deep Dream became a sort of generative filter applied to the images.

In the next art project, ProGAN was deployed. For the first time we worked with datasets and training GAN models. *PlasticLand* (2019)<sup>10</sup> talks about plastic waste and

<sup>9</sup> https://var-mar.info/neuronal-landscapes/

<sup>10</sup> https://var-mar.info/plasticland/

the ecological problems this material causes. We composed four different datasets of images of layered plastics on our planet: landfills, plastic on top of water, plastic underwater, and plastiglomerates. The ProGAN model was trained on a local machine using PyTorch and took a week to train, and we used part of the images generated during training to create a video composition.

With a metal totem displaying those layers, which are synthetic just as plastic is, we draw attention not only to the problem of waste but also ask whether AI has some similarity with this material. Since the invention of plastic, this material has been applied almost everywhere because of its perfect qualities, until we realised that it is neither sustainable nor ecologically friendly. Will a similar story happen with AI?

From the practice-based research perspective, this artwork shows artists' desire to move from a still to moving image and towards sculptural form that is held back by the early stage of machine learning technology: low resolution images jumping from one frame to another.

The next artworks, *POSTcard Landscapes from Lanzarote I* (00:18:37) and *II* (00:18:40)<sup>11</sup> from 2021, demonstrate the artists' ability to create video works with StyleGAN2 (see figure 2). The hypnotic appearance of these works, where one frame morphs naturally into another, shows the artists' ability in guiding the outputs of the NN. Vector curation and composition of a journey through the latent space, created by training the model on specific datasets of 2000+ images, were crucial and integral parts of the artistic process.

The artwork talks about critical tourism and how the circulation of images representing the touristic gaze overpowers the nature of seeing. In the words of Jonas Larsen, "'reality' becomes touristic, an item for visual consumption" (2006: 241–257). Hence, we scraped, where the licence allowed, location-tagged images from Flickr and composed two datasets of photos categorised as tourism or landscape.

As we have written elsewhere,

the art project consists of two videos representing a journey of critical tourism through the latent space of AI-generated images using StyleGan2. Later the images are composed into latent interpolations that take the form of smoothly progressive videos. The two videos are random walks in the latent space of the Stylegan2 trained models, creating a cinematic synthetic space. The audiovisual piece shows an animated image through the melted liquid trip of learning acquired from the dataset composed of static images. The video flows from point to point, generating new views and meaning spaces through the latent space's movement. The audio was created after the video was generated in response to the visual material to complete the art piece. (Guljajeva / Canet Sola 2022a)
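A journey through latent space of the kind described in this passage can be sketched as an interpolation between a handful of latent vectors. The code below is a minimal illustration under stated assumptions, not the artists' pipeline: the keyframes are random vectors, the latent size of 512 and the frame count are placeholder values, and the trained generator that would turn each latent vector into an image is left out. Spherical interpolation (slerp) is one commonly used choice for Gaussian latent spaces and is an assumption here.

```python
import numpy as np


def slerp(z0, z1, t):
    """Spherical interpolation between two latent vectors, t in [0, 1]."""
    u0 = z0 / np.linalg.norm(z0)
    u1 = z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        # Vectors point the same way: fall back to linear interpolation.
        return (1.0 - t) * z0 + t * z1
    return (np.sin((1.0 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)


def latent_walk(keyframes, frames_per_segment, loop=True):
    """Return one latent vector per video frame along a path of keyframes.

    With loop=True the path returns to the first keyframe, so the last
    frame flows back into the first one when the video repeats.
    """
    points = list(keyframes)
    if loop:
        points.append(points[0])
    path = []
    for z0, z1 in zip(points[:-1], points[1:]):
        for i in range(frames_per_segment):
            path.append(slerp(z0, z1, i / frames_per_segment))
    return np.stack(path)


rng = np.random.default_rng(0)
latent_dim = 512  # assumed latent size; depends on the trained model
keyframes = rng.standard_normal((4, latent_dim))  # curated or random points

path = latent_walk(keyframes, frames_per_segment=30)
print(path.shape)  # one latent vector per frame
# Each row of `path` would then be passed to the trained generator
# (not included here) to render the corresponding video frame.
```

Choosing the keyframes by hand instead of sampling them at random corresponds to the curated journey through latent space, while random keyframes give a random walk.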

<sup>11</sup> https://var-mar.info/postcard-landscapes-from-lanzarote/

The sound for the local, or landscape, view was created by a sound artist from Lanzarote, Adrian Rodd, who aimed to give a socio-political voice to the piece. In contrast, the sound design created by Taavi Varm is a soundscape replying to the touristic gaze. We aimed to initiate collaborations with others but also to experiment with human-AI co-creation. In a similar vein is the artwork *Phantom Landscapes of Buenos Aires* (00:20:00, 2021)<sup>12</sup>, with sound work by Cecilia Castro.

Our last experiment with GAN models, *Synthetic Scapes of Tartu* (00:10:00, 2022), demonstrates a different approach. Taking a dataset composed from our own video footage (flaneur walks), we first produced the sound (a composition by Taavi Varm and Ville MJ Hyvönen, with piano by J. Kujanpää) and used this to inform the direction of the video. The result was a sound-guided AI-generated visual output.

#### **Synthetic Text**

In this section, we focus on artwork incorporating AI text generation as part of the artistic concept. Our journey to text generation started with the online participative theatre project ENA<sup>13</sup> and ended with a hand-bound publication.

During the first lockdown in May 2020, together with theatre maker Roger Bernat, we created an online participative theatre piece ENA on the website of Theater Lliure in Barcelona. ENA is a generative chatbot that talks to its audience, and together (AI and audience), they make theatre (see figure 3). As we have described

<sup>12</sup> https://var-mar.info/phantom-landscapes-of-buenos-aires/

<sup>13</sup> https://var-mar.info/ena/

before: "Although in the description of the project it was stated explicitly that people were talking to a machine, multiple participants were convinced that on the other side of the screen another human was replying to them – more precisely the theatre director himself, or at least an actor". (Guljajeva / Canet Sola 2021)


Analysing synthetic books, Varvara Guljajeva has stressed the importance of human input in the AI text-generation systems (2021). In addition, one also needs to guide the audience participation and interaction with the chatbot. For this purpose, we have adopted the traditional theatre method for guiding actors, as a way to guide the audience, and thus, the bot, too. Stage directions were used as a guiding method, which triggered thematic conversation and offered meaningful dialogue between humans and the AI system. We found the conversations so meaningful that we decided to publish a book that contains all the conversations with ENA.

With this project, we learned that it is essential to guide neural networks via audience interaction. In order to do this, it is also necessary to guide the audience. Without audience interaction guidance, it is nearly impossible to achieve meaningful navigation of NNs.

#### **Translation Models**

This category focuses on translation models that enable interactive and installation-based formats. Translation refers to the conversion of mediums, or as we put it, translation of semiotic spaces. To illustrate this, we introduce *Dream Painter*,<sup>14</sup> an art installation that translates the audience's spoken dreams to a line-drawing produced by a robot (see figure 4). As described earlier: "Dream Painter is an interactive robotic art installation that explores the creative potential of speech-to-AI-drawing transformation, which is a translation of different semiotic spaces performed by a robot. We extended the AI model CLIPdraw which uses CLIP encoder and the differential rasterizer diffvg for transforming the spoken dreams into a robot-drawn image". (Canet Sola / Guljajeva 2022) "Design- and technology-wise, the installation is composed of four larger parts: audience interaction via spoken word, AI-driven multicolored drawing software, control of an industrial robot arm, and kinetic mechanism, which makes paper progression after each painting has been completed. All these interconnected parts are orchestrated into an interactive and autonomous system in the form of an art installation […]". (Guljajeva / Canet Sola 2022b) Out of all the projects discussed, this was the most difficult to realise.

In this project, we investigated how guidance of NNs could be interactive and real-time instead of non-interactive and pre-determined, as shown in previous examples of our work. It is important to note that methods such as dataset composition and output curation were not used in this case. In fact, visual output curation is totally missing. The artists created an interactive system to be experienced and discovered by the audience. This means the audience determines the output. Instead of curating a dataset, a CLIP model is used that can produce nearly real-time output

<sup>14</sup> https://var-mar.info/dream-painter/

guided by a text prompt. As we have written earlier: "Translation of semiotic spaces, such as spoken dreams to AI-generated robot-drawn painting, allowed us to deviate from image-to-image or text-to-text creation, and thus, imagine different scenarios for interaction and participation". (Guljajeva / Canet Sola 2022b)

This project indicates our search for transformative outputs of AI technology, and thus shows the evolution in our practice. By extending available DL tools and combining them with other technologies, for example text-to-speech models, real-time industrial robot control, and physical computing, the installation offered an interactive robotic and kinetic experience of NN latent space navigation. This contributes towards the explainability of AI because the audience could experience how the words affected the drawing, and which concept triggered which outcome.

Inspired by Sigmund Freud's work on the interpretation of the human mind while unconscious, we speculatively ask if AI is powerful enough to understand our dreamworld. Through practice we question the capacities of NNs and investigate how far we can push this technology in the art context. This artwork allows the audience to experience the limits of concept-based navigation with AI. The system is unable to interpret and can only illustrate our dreams. It cannot understand the prompt semantically and only gets the concepts.

*Fig. 4: Kuka industrial robot painting audience's dreams. Installation view.*

#### **Synthetic Form**

In this section, we ask how artists can guide neural networks when creating volumetric forms, and what happens when AI meets materiality. After working for a while with DL tools that produce 2D outputs, it is an obvious step to explore possibilities to produce 3D results. To our surprise, it was not an easy task to find the solution (Oct 2021). *Psychedelic Forms*<sup>15</sup> is a series of sculptures produced in ceramics and recycled plastic through which we investigated the possibilities of AI in producing physical sculptures. The project re-interprets antique culture with contemporary language and tools.

Following the same paradigm shift as in the previous section, text2mesh is a CLIP-based model that does not require a dataset, but a 3D object and text prompt as input. Hence, the model actually does not create a 3D model but stylises the inserted one. And this is guided by inputted text.

We decided to go back to the origins, in terms of ancient sculptures and material selection. Although it was said that there was no dataset, we still had a collection of 3D models of ancient sculptures because, by far, not all of them produced a desirable output. In this sense, there was definitely output curation present in the process.

The criteria for selection were the following: first, the form had to be intriguing, and second, it should be possible to produce it in material afterwards. It was clear that we had to modify each model because the physical world has gravity, and the DL model does not take this into account. Some generated models were discarded because they were seen as not-fixable, although interesting in their shape.

The process demonstrated here is quite an unusual way to create an object. After extensive experimentation with the tool, we learned how certain words triggered certain shapes and colours. This knowledge gave us a chance to treat text prompts as poetic input. Thus, we created short poems to guide the NN. The best ones survived as titles and are reflected in the forms.

The artists did not strictly follow the original model but took the creative liberty to modify the shape and determine the colour by manually glazing the sculptures. The dripping technique was used for colouring the sculptures. This served as a metaphor for liquid latent space and the psychedelic production process (this was the artists' inner feeling about the creative process because they did not know what results would be achieved in the end).

Sometimes, AI-generated vertex colouring was taken as inspiration, sometimes totally ignored. Nevertheless, digital sculptures were exhibited alongside the physical ones to underline the transformation and human role in the creative process. Although ceramic sculptures were 3D printed in clay, the fabrication process had to follow the traditional way of producing pottery (see final sculptures in figure 5).

<sup>15</sup> https://var-mar.info/psychedelic-forms/

Since we had never engaged in ceramics before, the whole production process felt psychedelic: unexpected NN processes led to transformation by numerical, physical, and chemical processes, all guided by both the artists and chance. Hence, the art project highlights the relationship between different agencies.

In the end, we can say that AI is not prepared for the physical world. It created nice images, but when one wants to materialise the output, it requires considerable additional work. However, those extra processes were very rewarding and creative in our case. In this project, AI served as an inspiration or a departing point more than anything else. In other words, the experimental phase of technology is necessary for experimental practices, and this can lead to the creation of a new production pipeline. The fine line between control and chance when guiding the neural networks and related processes is likely the main creative drive for the artists.

*Fig. 5: Ceramic sculptures guided by 3D object and text prompt, 3D printed in clay, and glazed manually. From left to right: Psychedelic Angel (Venus), Mermaid in green jelly and pink feather (Nymph), Psychedelic Angel (Venus).*

### **Discussion**

According to the media hype around AI, this technology is intelligent enough to create art autonomously (Perez 2018; Vallance 2022). However, the reality is different. According to Luc Julia, computer scientist and co-inventor of Siri, AI does not exist. He argues instead that machines have multiple intelligences that often outperform humans. However, machine intelligence is limited and discontinuous compared to human intelligence (Julia 2020). Therefore, it is vital to have artistic practices around this technology, as a counterbalance to the AI fantasies served by the industry and mass media.

We see AI as a creative tool with its own possibilities and limitations, which can stimulate artists' creativity through unexpected outputs. Research has shown that tool-making expands human cognitive level and constitutes evolution in culture (Stout 2016). Similarly, as a new tool, generative AI could potentially enrich creativity by allowing new production pipelines that can generate unique results.

Coming back to the synthetic images, we can say that all machine-created synthetic image-based works discussed here have particular aesthetics: both with Deep Dream and GAN. Unlike the output of GANs, Deep Dream has a more recognizable style and can be seen more as a filter that transforms every inputted image instead of learning from the given dataset.

Regarding GAN aesthetics, such visual appearance is inherited to a large extent from two entities: the dataset and the model itself. GANs have a particular footprint, as seen in all works produced with this model. The visual palette comes from the datasets used. For example, if a dataset is homogeneous (only landscape images), then we will easily recognize landscapes in the generated output. However, if images in the dataset have a lot of visual variation, the output is rather abstract. *POSTcard Landscapes from Lanzarote* illustrates this well. Also, when photos in the dataset look similar, the output will also be similar, as was the case with the *Synthetic Scapes of Tartu* video work, where frames from recorded flaneur walks in a city were extracted. For video works generated with the neural net, manual guidance of latent space offered more variations than an audio-led approach.

Synthetic image works have encouraged us to work with formats like images and videos that we had not engaged with before in our art practice, and we found it exciting working with AI and video. For example, AI video generation has some affordances, such as the ability to make the start and end form a perfect loop, since the images are synthetically generated. However, creating real-time AI work is much more complex because some models are too slow. It might take a few minutes to render a single image. The limitations inspire us to devise new solutions and work in new mediums. Moreover, the limitations of the medium have always been a good challenge for our creativity.

Working with GANs or other image-generation tools has become much easier in recent years, although it used to be quite difficult. We must note that for practitioners, easy-to-use tools, such as DALL-E and Midjourney, offer little creative freedom, and thus, are less attractive to the artists. Those products tend to instrumentalize the user rather than the other way around. At the same time, open source models offer more creative freedom and enable broader use of artistic ideas.

The work with generated text demonstrates that AI is not context-aware but maps concepts automatically without understanding semantics. More importantly, as shown in the *ENA* project, the audience must also be guided alongside the AI. In the case of *ENA*, stage directions were used, and in the *Dream Painter* project, the concept of dream telling was applied to guide the participants, who in turn guided the neural net through their interaction, creating a chain reaction. Navigating concepts in latent space is artistically interesting and inspiring; this was especially evident when working with form. The artists went beyond semantics and learned how to guide neural networks with a text prompt and 3D object.

The presented practice represents a paradigm shift in machine learning, moving away from composing datasets for GANs and toward translating semiotic spaces enabled by diffusion models. The evolution in practice shows how artists discover and learn to work with the DL toolset, embracing its possibilities and limitations. In the case of practice-based research, practice can be seen as a lab for testing artistic ideas with technology through chance until control is encountered.

### **Conclusion**

In this article, we have summarised DL development from the perspective of artists' interests, concentrating on image, video, text, and 3D object generation, and on translation models. We applied practice-based research methodology to investigate the role and possibilities of recent co-creative AI tools in artistic practice.

It is difficult to keep pace with AI development. In less than a decade, we have gone from blurry black-and-white faces to impressive high-resolution images guided by text prompts. The user level has gone from difficult to easy, which on one side, broadens possibilities for creation, but on another, it diminishes experimentation and creativity, since AI outputs seem ready-made. This is also demonstrated by the explorative nature of the body of work presented here.

Furthermore, it was noticed that creative AI models, especially GANs, have recognizable aesthetics, which in the long run become repetitive. This led the artists to change tools. The curation of datasets, models, and outputs, along with NN guidance, has become the toolset of an artist working with AI. Finally, these models can generate multitudes of outputs, but the art lies in giving the right input to guide the desired output and selecting the results that best serve the concept.

As Andy Warhol had envisioned in 1963, eventually, art production will become mechanised and automated. In his own words: "I want to be a machine", which was also a reflection on that time's vast industrialization process. Resonating with today's deep learning age: I want my machine to do art.

### **Bibliography**


*IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)*. doi: 10.1109/cvpr52688.2022.01313.


# **Embodied Voice and AI: a Techno-Social System in Miniature**

*Diana Serbanescu, Scott DeLahunta, Ilona Krawczyk, Kate Ryan, Mika Satomi*

# **Motivations**

One of the original motivations for the practice-based research presented in this paper is rooted in the premise that embodied knowledge is relevant to the field of AI, not only in terms of application to design but also in terms of critical practice. A key proponent for integrating embodied knowledge into Human-Computer Interaction, Schiphorst (2009: 225) reckons "embodiment in the context of designing for technology" matters, and "concepts such as embodied computing and embodied interaction" require "design strategies that take advantage of our senses, accessing a richer and more fully articulated form of human being". In terms of critical practice, feminist technoscience scholars such as Leurs (2017: 137) regard the researcher as an embodied subject, and stress the importance of materiality by emphasizing the relational entanglement of bodies and infrastructures in the creation of knowledge. In her seminal book, *Artificial Knowing: Gender and the Thinking Machine*, Adam (2006) argues that "the body plays a crucial role in the making of knowledge". The afore-mentioned scholars draw on feminist research to counter the dominant data science discourses that still equate intelligence with rationalistic problem solving, in which claims for objectivity and universal truths propagate what feminists call a "view from nowhere". Anthropologist Suchman (2007), in her critique of the "disembodied intelligence in AI", also points out that "feminist theorists have extensively documented the subordination, if not erasure, of the body within the Western philosophical canon", an erasure Suchman claims extends to the field of AI and Robotics.

While addressing the wide literature and theory in both application design and feminist critical practices lies outside the scope of this paper, these are the kind of perspectives that provided the original motivations for embarking on our research project. Continued research will further explore the implications of these theories for this practice-based setting. What we can do here is to acknowledge, as Suchman (2007) posits, following Haraway (1997: 11), that "technologies [...] are forms of materialised figuration". They are brought into existence by assemblages of humans and

machines, bringing meaning and values, including dominant assumptions and biases that subordinate, into the process of new knowledge creation. We believe this calls for more critical introspection of one's own practice following Leurs (2017: 145) who states,

"it is through being critically self-reflexive about one's positionality, locating research practices within wider power relations and structures, that we can begin to destabilise the normalised politics of knowledge production".

Therefore, in presenting the framework below, we aim to also reflect on the positionality and the ethics of our research, in part through revealing the individuals and the corresponding assemblages in the development of tools and processes.

# **Strand One: Augmentation of performative practice through an AI wearable design**

This section proposes a hybrid methodological framework which discusses the technical and formal aspects of augmenting an embodied practice for actor training by means of a custom-made wearable device which gives the body a voice. The device is an interactive system that uses an Artificial Intelligence (AI)-based module to map movements to synthesised sound. Our framework draws on techniques from the fields of contemporary performance, wearable technology design, Human-Computer Interaction (HCI) and AI pipeline design.

# Methodological framework

# Embodied voice in post-Grotowskian practice

In the Embodied Voice and AI project, our team focused on a particular method for physical actor training chosen from the post-Grotowskian repertoire. This choice of method was motivated by many reasons. First, according to Wolford (1996), Grotowski's<sup>1</sup> practice, constituted as a laboratory of embodied research, and his rigorous system of actor training mirrored his fascination with science. Thus, his epistemic aims towards reproducibility of results and claims of scientific objectivity resulted in methodologically unique approaches to the embodied practice. This structured way of working, which inspired generations of practitioners, could serve as a

<sup>1</sup> Grotowski was a Polish theatre director and theorist whose innovative approaches to theatre training involved the study of performers' immediate, "organic", "animal-like" psychophysical responses to impulses where "there is no discursive mind to block immediate organic reaction, to get in the way" (Richards 2003: 66). He called the process of eradicating such blocks *via negativa* (Grotowski & Barba 2002: 17).

conceptual scaffolding of the bodymind continuum suitable for a technological design. The training works by making the actor more aware of moments at which their body and mind act as one – giving them, through the process of repeated training, the ability to more frequently, and more deeply, access a state of "freedom from the time-lapse between inner impulse and outer reaction in such a way that the impulse is already an outer reaction" (Grotowski & Barba 2002: 16). Voice and singing are important parts of this process.

Embodied voice training in post-Grotowskian practice seeks to express deep psychophysical states through voice, in moment-to-moment situations in relation to self and to partners. Thus, "performers are trained to 'experience the body as music, as a melodic and rhythmical vibration' (Dowling, 2011: 248) that affects and allows one to be affected by others within musical and rhythmical structures" (Krawczyk 2021: 24). Last but not least, the choice of focusing on one post-Grotowskian technique is rooted in the first-hand experience that our team of practitioners had with this type of practice. We wanted to model a practice that we had a thorough embodied understanding of.<sup>2</sup>

#### Team

AI-based systems are products of collaborative and team effort. As previously mentioned, we consider a process of reflexivity in relation to one's own practice to be an integral part of our research framework and a necessary step toward an ethical guideline for AI research. In this regard, we take from Leurs (2017: 139) the principle of an "ethics of care", which is described as "value-based" and cognisant of "the dependencies, partiality, political commitments and personal involvements of researchers". Therefore, we believe that awareness about the process of research, development and decision-making starts with the team. In a small team and project like ours with its focus on augmenting a pre-existing psychophysical training method, it is easier to explore the ecologies of skills, knowledge, but also experience, subjective preferences, interests and embodied abilities, which the practitioners and researchers bring to the room. Together, these create a distinctive ontology of practice for the enacted "assemblages of doings" (Leeker 2017: 13) – in which body datafication, while a core focus of the research, is undertaken in an iterative way – that opens space in a process for the kind of reflexivity we propose could be valuably undertaken in other practice contexts.

<sup>2</sup> The framework presented here is a continuation of a growing body of collaborative work established in two previous projects, The Shape of Things to Come (2019) – https://replica.institute/ (accessed 12 Dec. 2022) – and Dancing at the Edge of the World (2020) – https://berlin-open-lab.org/portfolio/dancing-at-the-end-of-the-world/ (accessed 12 Dec. 2022), https://www.hybrid-plattform.org/forschung/detail?tx\_news\_pi1%5Bnews%5D=996&cHash=18878ad6e067104b2e55c31756756b28 (accessed 12 Dec. 2022).

For this practice-based research, our team consisted of the following five scholars / practitioners, who are also the co-authors of this paper. Each member of the team assumed one or more specific roles within the research: Kate Ryan – performer; Mika Satomi – wearable technology and interaction designer; Diana Serbanescu<sup>3</sup> – artistic director, and project lead; Scott DeLahunta – ethnographer; Ilona Krawczyk – post-Grotowskian practitioner, and external expert consultant.

Suchman (2017) posits that "when we ascribe skills to a person [...] the person acts as a symbol". However, she also discusses how the activation of skills is dependent on the team configuration. In relation to this, Ilona Krawczyk notices how the activation of skills in our team was supported and encouraged, through feedback sessions embedded in the process:

What I found particularly important was how Kate and her process stood out in the first place. In the post-Grotowskian theatre this is not common, as the director's perspective on the training or performance devising process dominates the performer's perspective.

With regard to the ethics of practice, and in relation to a process of self-reflection, Ilona comments:

From the perspective of my research on ethical considerations in theatre practice, another significant aspect of this project is the effort to attentively investigate what kind of knowledge we are producing, in what areas within and beyond performance, training and AI, how the process of knowledge production is distributed among all the participants and how we acknowledge each other's contributions.

As a tool for collaborative post-reflection on the process, we used a web-based annotation platform developed by Motion Bank in Mainz.<sup>4</sup> This platform supports dialogue and documentation of process from multiple perspectives, allowing us to partially trace decisions and members' contributions. What our documentation reveals is that there are also aspects of tacit and embodied knowledge that each of us contributed, sometimes beyond the "traditional" roles in the team.

<sup>3</sup> As the project lead and artistic director of this project, as well as the two preceding projects – "The Shape of Things to Come" (2019) and "Dancing at the Edge of the World" (2020) – Diana Serbanescu made key decisions in shaping the research framework and agenda. She was responsible for selecting team members in all three project configurations and contributed to design decisions regarding the structural and functional elements of the AI-based technological artifact.

<sup>4</sup> Motion Bank. Hochschule Mainz University of Applied Sciences. https://medium.com/motion-bank (accessed 12 Dec. 2022).

#### The AI-based motion capture wearable device

*Fig. 1: Wearable design by Mika Satomi. Credit: Mika Satomi*

The idea<sup>5</sup> of imagining the body as music, as rhythmic vibration, inspired our first prototype<sup>6</sup>, an AI-based wearable device that sonifies movement by mapping body postures to sound cues. The practice-based research discussed here builds on this earlier prototype<sup>7</sup> designed by Mika Satomi, and aims to alter it to meaningfully augment a specific exercise, taken from the post-Grotowskian embodied voice training.

<sup>5</sup> Ditte Berkley, a post-Grotowskian practitioner involved in "The Shape of Things to Come" (2019), encouraged the performers to "sing with the body and dance with the voice".

<sup>6</sup> The initial prototype was developed for a performance titled Dancing at the Edge of the World (2020).

<sup>7</sup> In this version of the prototype, the focus was on giving a voice to the collective body of six performers by endowing each actor with an AI-based wearable device that uses movement to trigger sound. The mappings of the early prototype were simple, connecting randomly devised postures to sound cues: synthesised vowels, or pre-recorded voice samples of the performers singing.

Figure 1 shows a schematic representation of the custom-made AI-based wearable device designed by Mika Satomi. The device is a collar, endowed with bend sensors<sup>8</sup> that can be attached to eight points on the body, able to accomplish the function of motion capture<sup>9</sup>. At the back of each collar there is a Bela board, which acts as a processing unit. All the computational processes (data capture, posture recognition and mapping through a machine learning (ML)<sup>10</sup> module, and sound synthesis<sup>11</sup>) happen locally on the Bela board contained in each collar<sup>12</sup>. The ML module is based on ml-lib<sup>13</sup>, a library of machine learning externals for Max and Pure Data, built on the Gesture Recognition Toolkit by Nick Gillian. A speaker sits on the front of the collar, emitting sounds that react to the position of the sensors, that is, to how the performer moves their body.

<sup>8</sup> E-textile bend sensors made as flexible tapes.

<sup>9</sup> By capturing the bending motion of the body part on which the sensor is placed.

<sup>10</sup> More specifically, the sensor data is read and processed by the Pure Data (PD) (https://puredata.info/ – accessed 12 Dec. 2022) software running on the Bela board attached to each collar. This software installed on each Bela board also contains an ML-based module, which is used for mapping body postures to synthesised voice sound. The captured sensor data goes through this ML-based processing module which extends an "ml.lib" (https://github.com/irllabs/ml-lib – accessed 12 Dec. 2022) external object based on the Gesture Recognition Toolkit (https://www.media.mit.edu/projects/gesture-recognition-toolkit/overview/ – accessed 12 Dec. 2022).

<sup>11</sup> This processing module is trained to recognize patterns of body postures, which are then used to control synthesizer parameters within the Pure Data patch.

<sup>12</sup> The synthesised voice sound is played back from the Bela board directly to the portable speaker worn by the performer.

<sup>13</sup> https://github.com/irllabs/ml-lib – accessed 24 May 2023.

The ML model is trained on incoming real-time data from the sensors placed on Kate's body. The data from the sensors is captured in specific poses, which are devised according to our movement framework and agreed upon collaboratively by Kate Ryan and Diana Serbanescu. These data points are then mapped to parameters of the synthesizer. Because the placement of the sensors on her body changes slightly every time Kate wears the collar, the training process needs to be performed anew at the beginning of each rehearsal session. During the rehearsal, the stretch of the material and consequently the resistance of the conductive fabric is also slightly altered, introducing small variations in the data and compromising the accuracy of the mappings. To counter this, the system needs to be recalibrated and then retrained with fresh data, a process which is repeated multiple times during the rehearsals. Due to this practice of working with the variability of the textile, our data set is transient, in a continuous state of flux.
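The recalibrate-then-retrain cycle described above can be illustrated with a minimal Python sketch. It is not the project's implementation, which runs as a Pure Data patch with ml-lib on the Bela board; the function names, the raw value range and the use of inverse-distance weighting as the mapping technique are all assumptions made for illustration. Calibration rescales each of the eight sensor channels to the range observed in the current session, and retraining discards earlier examples in favour of fresh ones.

```python
from math import sqrt

NUM_SENSORS = 8  # the collar reads eight bend sensors


def calibrate(frames):
    """Return per-sensor (min, max) pairs observed over a list of raw frames."""
    lows = [min(f[i] for f in frames) for i in range(NUM_SENSORS)]
    highs = [max(f[i] for f in frames) for i in range(NUM_SENSORS)]
    return list(zip(lows, highs))


def normalise(frame, ranges):
    """Rescale a raw frame to [0, 1] per sensor, clamping out-of-range values."""
    out = []
    for value, (low, high) in zip(frame, ranges):
        span = high - low
        scaled = (value - low) / span if span > 0 else 0.0
        out.append(min(1.0, max(0.0, scaled)))
    return out


class PostureMapper:
    """Maps a normalised posture to synth parameters (assumed technique:
    inverse-distance weighting over the stored training poses)."""

    def __init__(self):
        self.examples = []  # list of (posture_vector, parameter_vector)

    def retrain(self, examples):
        """Drop all earlier examples and keep only the fresh ones."""
        self.examples = [(list(p), list(s)) for p, s in examples]

    def predict(self, posture):
        weights = []
        for pose, params in self.examples:
            distance = sqrt(sum((a - b) ** 2 for a, b in zip(posture, pose)))
            if distance == 0.0:
                return list(params)  # exact match with a trained pose
            weights.append(1.0 / distance ** 2)
        total = sum(weights)
        n_params = len(self.examples[0][1])
        return [
            sum(w * params[k] for w, (_, params) in zip(weights, self.examples)) / total
            for k in range(n_params)
        ]


# Hypothetical raw readings for two poses; real values depend on the hardware.
relaxed, stretched = [200] * 8, [800] * 8
ranges = calibrate([relaxed, stretched])

mapper = PostureMapper()
mapper.retrain([
    (normalise(relaxed, ranges), [0.0, 0.2]),    # two made-up synth parameters
    (normalise(stretched, ranges), [1.0, 0.8]),
])
halfway = mapper.predict(normalise([500] * 8, ranges))
```

A posture halfway between the two trained poses yields parameters halfway between the two trained settings. Calling `calibrate` and `retrain` again with new frames corresponds to the repeated recalibration during rehearsal.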

The data from the sensors can also be streamed in real time to a remote central unit through a wireless network. Mika Satomi has also implemented the controller interface, which enables the connection to each of the individual Bela boards from a central laptop and is capable of visualising the sensor data in real time and of initializing the training sequence of each system. More information about the design of the wearable device can be found on Mika Satomi's personal webpage<sup>14</sup>.

#### Input: Motion capture through sensors

As previously mentioned, we limited ourselves by design<sup>15</sup> to working with eight e-textile bend sensors. How the sensors are placed on the body becomes very important in the customization to a particular exercise for embodied training, as we aim to capture the movements of the body that are most relevant for that particular exercise. Therefore, we needed to experiment and develop a specific logic of how the sensors are placed on the body. Mika Satomi's design provides some space for flexibility and re-arrangements in sensor positioning<sup>16</sup>. In the context of this research, we decided to place two of the upper-body sensors on the back and on the belly, instead of on the shoulders<sup>17</sup>. In this updated design the lower-body sensors are embedded in a pair of pants and follow the skeletal lines of the lower part of the body.

<sup>14</sup> https://www.nerding.at/costume-for-dancing-at-the-edge-of-the-world/ – accessed 24 May 2023.

<sup>15</sup> The Bela board we used includes eight analog sensor inputs.

<sup>16</sup> The bend sensors were made with kinesiology tape, often used as sport taping to retain flexibility, and they were applied directly on skin like the original tape.

<sup>17</sup> In the original prototype (2020), we focused on sensing upper parts of the body by placing more sensors on arms: elbows and shoulders. In the first prototype there were also four distinct sensors for the legs to be placed individually on knees and feet. These were all fixed directly on the skin using glue and kinesiology tape.

#### Output: Synthesised sound

The sound design is realised through patches implemented in Pure Data. Our prototype includes two modes – the Voder mode<sup>18</sup>, and the Granular<sup>19</sup> mode – mapping body postures to synthesised sound.
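To make the Granular mode more concrete, here is a small Python sketch of how a single grain might be read from a pre-recorded sample, given a playhead position and a reading speed. The project realises this in Pure Data; the function names, the Hann window and the sine-wave stand-in for a voice recording are assumptions made for illustration only.

```python
import math


def hann(n, length):
    """Hann window value for sample n of a grain of `length` samples (length >= 2)."""
    return 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1))


def extract_grain(samples, playhead, grain_len, speed):
    """Return one windowed grain read from `samples`.

    playhead: position in the recording, from 0.0 (start) to 1.0 (end)
    speed:    read rate; 1.0 keeps the original pitch, 2.0 reads twice as fast
    """
    start = playhead * (len(samples) - 1)
    grain = []
    for n in range(grain_len):
        index = int(start + n * speed) % len(samples)  # wrap around the buffer
        grain.append(samples[index] * hann(n, grain_len))
    return grain


# Stand-in for a voice recording: one second of a 220 Hz sine at 8 kHz.
recording = [math.sin(2.0 * math.pi * 220.0 * t / 8000.0) for t in range(8000)]

# In the wearable, playhead and speed would be driven by the recognised posture.
grain = extract_grain(recording, playhead=0.25, grain_len=64, speed=1.0)
```

The window fades each grain in and out, so that overlapping many grains produces a continuous texture rather than clicks.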

# Augmentation of practice: In-tensions

Gesture to sound mapping

*Fig. 2: Modelling In-tensions, an embodied voice training exercise. Credit: Diana Serbanescu*

Our practice-based research focuses on an exercise from the post-Grotowskian embodied voice training, which is called In-tensions<sup>20</sup>. The original aim of In-tensions is to transfer a physical process into the vocal quality of a performer, to improve the in-body listening (with a partner), and last but not least, to improve the vocal presence of the performer. This training system treats sound as a physical presence, as an agentive actor in space. In-tensions consists of a series of concrete physical actions. For our research we have focused on four actions: "push", "pull", "caress" and "bounce". The aim was to transfer the physical energy of these actions into the vibratory patterns of the voice.

<sup>18</sup> This is based on formant synthesis. The algorithm is inspired by the early speech synthesiser developed by Bell Labs in 1939. In this mode body postures are mapped to synthesised vowels of the English language.

<sup>19</sup> In this mode body postures control speed and playhead of the granular synthesis on prerecorded voice samples.

<sup>20</sup> This was introduced to us by Ditte Berkley. However, variations of this method exist in the practice of various other post-Grotowskian practitioners. For example, Ilona has her own personal variant of this method.

Figure 2 shows a diagrammatic formalisation of the mapping of four pre-defined (embodied) actions to selected sound cues. These actions are performed by Kate, and the data captured by the sensors in these postures is used to train the AI-based module to classify these actions.
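As a rough illustration of classifying the four actions from sensor data, the following Python sketch uses a nearest-centroid rule: each action is represented by the mean of its training frames, and a new frame is assigned to the closest mean. The classifier actually used by the ml-lib module is not specified here, so the technique, the function names and all sensor values are assumptions.

```python
from math import dist  # available from Python 3.8


def train_centroids(samples):
    """samples maps an action name to a list of 8-value sensor frames.
    Returns a dict mapping each action to the mean of its frames."""
    centroids = {}
    for action, frames in samples.items():
        count = len(frames)
        centroids[action] = [sum(column) / count for column in zip(*frames)]
    return centroids


def classify(frame, centroids):
    """Return the action whose centroid is closest to the given frame."""
    return min(centroids, key=lambda action: dist(frame, centroids[action]))


# Made-up normalised readings, two training frames per action.
training = {
    "push":   [[0.9, 0.8, 0.1, 0.1, 0.5, 0.5, 0.2, 0.2],
               [0.8, 0.9, 0.2, 0.1, 0.5, 0.4, 0.2, 0.3]],
    "pull":   [[0.1, 0.2, 0.9, 0.8, 0.5, 0.5, 0.2, 0.2],
               [0.2, 0.1, 0.8, 0.9, 0.4, 0.5, 0.3, 0.2]],
    "caress": [[0.3, 0.3, 0.3, 0.3, 0.1, 0.1, 0.1, 0.1],
               [0.4, 0.3, 0.3, 0.4, 0.1, 0.2, 0.1, 0.1]],
    "bounce": [[0.5, 0.5, 0.5, 0.5, 0.9, 0.9, 0.8, 0.8],
               [0.5, 0.4, 0.5, 0.5, 0.8, 0.9, 0.9, 0.8]],
}
centroids = train_centroids(training)

push_like = classify([0.85, 0.8, 0.2, 0.1, 0.5, 0.5, 0.2, 0.2], centroids)
bounce_like = classify([0.5, 0.5, 0.5, 0.5, 0.85, 0.85, 0.85, 0.85], centroids)
```

Each recognised action could then be looked up in a table of sound cues, which is the mapping that the diagram formalises.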

An example of training the system: "push" action against the wall

This section introduces step by step the process involved in training the system to recognize and map the physical quality of performing a "push" action against a wall. At the end of the training, the pre-recorded sound cues are activated by the movements of the performer in a session of improvised responses to the task of playing with variations of the "push" action against the wall.

*Fig. 3: Kate performs an action of "push" at the wall.*

#### Step One: Marking a "push" action

Figure 3 captures Kate engaged in the process of searching for her optimal voice quality while performing the action of "push" against the wall. Originally, this is an exercise devised by Ditte Berkley. Its intention is to give a real obstacle – the wall – for the performer to engage with in order to execute an authentic "push" action, by contracting the muscles of the body that would also help with the release of the voice. The performer is encouraged to cultivate awareness in listening to their own voice, while in action, and to assess the moment of their best perceived vocal quality. Thus, when Kate identifies the peak moment of her voice and the position in which it happens, she is instructed to tap the wall with her hand, while maintaining the posture. Diana suggested using this tapping as a cue for Mika to start recording the incoming data from the sensors<sup>21</sup>. The process involves an act of precise synchronisation between Mika, operating the interface, and Kate, performing the action. The sensor data captured here will be used as training data samples for the ML-based module. The intention is to use the data that represents a moment of full embodied unity between vocal and physical quality of the performance.
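The cue-based recording can be pictured with a short Python sketch: once the cue arrives, a few consecutive sensor frames are taken from the incoming stream and averaged into one training sample labelled with the action. The window length, the averaging step and all names are hypothetical; in the project the recording is started by hand from the controller interface.

```python
def capture_training_sample(stream, cue_index, window=5):
    """Average `window` frames, starting at the cue, into one training vector.

    stream:    list of sensor frames (each a list of eight readings)
    cue_index: position in the stream at which the operator starts recording
    """
    frames = stream[cue_index:cue_index + window]
    if not frames:
        raise ValueError("no frames recorded after the cue")
    return [sum(column) / len(frames) for column in zip(*frames)]


# Made-up stream: frame i holds the value i on all eight channels.
stream = [[float(i)] * 8 for i in range(10)]

# The tap is noticed at frame 4; three frames are recorded while the pose is held.
sample = capture_training_sample(stream, cue_index=4, window=3)
training_data = {"push": [sample]}
```

Averaging a short window is one way to reduce the effect of a slightly early or late cue, which is why the synchronisation between operator and performer matters.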

#### Step Two: Explore "push" action mapped to sound

After the training by examples, our ML model has now learned to map the incoming sensor data to corresponding parameters for voice synthesis. The parameters for voice synthesis are experimentally chosen by Mika. At this stage, they vary from training to training. The intention is to also explore qualities of voice synthesis<sup>22</sup> that resemble the real voice of the performer engaged in action. Diana asks Kate to explore different types of "push" actions against the wall, and test what the system has learned and how it reacts to variations from the training samples.

#### Task-based improvisation with all actions

Task-based improvisations are sessions of varying lengths, lasting between one and two hours, with the aim of exploring the limitations and possibilities opened by augmenting the In-tensions practice. This is based on the embodied listening between Kate, as performer / practitioner and the AI-based system, through the wearable device. The intention is to observe how this engagement with the AI-based system could become conducive to moments of flow, or how it stimulates in-the-moment awareness in the performer.

The session is structured in two parts: the part in which we are training the system, followed by a structured improvisation. Diana guides Kate through these two phases, and she describes this process as follows:

We are working with four actions: push, pull, caress and bounce. I ask Kate to take a moment and imagine in great detail a real context in which she would perform each of these actions, then I ask her to perform each action, in her imaginary context, as accurately as possible. After a first enactment of each precisely executed action, I ask her to narrate out loudly details of the imaginary context, while performing the action. For example, she is now on a pier and she's pushing a heavy wooden crate. Next, I ask her to narrate how her body engages in this action. For example, she lifts her right hand, she engages her pelvic muscles, etc. When she's at the peak engagement with the action, I ask her to release the voice. This is the cue for Mika to start recording the data for training the system. The intention is that we achieve reproducible "push", "pull", "bounce", "caress" actions, that are accurately represented in the system.

<sup>21</sup> This is the pressure applied on the resistive material, represented by the stretch and fold/bend of the sensors, as activated by the muscles and posture of the body engaged in that particular pose.

<sup>22</sup> In Granular mode, the pre-recorded voice samples of the performer, while embodying these actions, were transferred into the system and mapped to her corresponding body postures while doing the actions. In Voder mode, we chose synthesised vowels that would resonate with the action.

The imaginary contexts and embodied narrations are meant as handles for Kate to accurately access that action in her body during the improvisation session. Once the system is trained with the data for all pre-defined physical actions, these postures are now mapped each one for a synthesised vowel, or each one for a synthesised vocal recording. In the guided improvisation, I instruct Kate to start simple, by going through all four actions in sequence. I suggest that she first tries to reproduce the actions as closely as possible to the training samples and become familiar with how the system reacts. Are the generated sound responses predictable? I guide her to focus both on her body and on the listening to the device. Can she control the sound? Can she create a song through her movements which is repeating the same melodical pattern?

The space in between the actions is mapped by the ML-module into spaces in-between the pre-defined synthesised sounds. Next, I ask Kate to slowly play with variations to the initial actions, and by doing so, to discover the variations in sounds generated by the algorithm. I direct her focus of attention on listening. Can she, through movement, continuously explore the space of sound? I ask her to continuously search for sound and never for silence. Who is now in control, her or the AI? After she becomes familiar with the sound of the system I ask her to use it as a springboard for her own voice, etc. (Diana)<sup>23</sup>

By introducing the concept of "symbiotic gestures", Suchman (2017) notices that "humans and artefacts are mutually constituted". Observing Kate engaged in the process, we are witnessing a practice unfolding in intra-action between the human performer and the technological artefact. Kate is exploring the limits of the machine, and the machine reacts to Kate's actions. It is a continuous exploration of boundaries. From the sides, and by adjusting the parameters of this human-machine interaction, Mika is also modulating the boundaries between the human and the system. By suggesting shifts in attention focus, Diana is also altering the boundaries of this interaction. This type of experimenting explores in practice the notion that "agencies – and associated accountabilities – reside neither in us nor in our artifacts but in our intra-actions" (Suchman 2007). Scott is documenting our practice. Every session of structured improvisation ends in a reflection round.

<sup>23</sup> Serbanescu, Diana. Personal note (Notebook). Entry from 3 Feb. 2022.

# Potential for augmentation

Our augmentation of practice focused on improvisation with AI as a relational partner. We were interested in observing the psycho-physical process of the performer involved in this partnership, the way in which this mode of working engaged her affect and imagination, and whether she achieved moments of flow. We were interested in the potential of this system for training the vocal presence of the performer. Kate describes moments of her discoveries as well as some of her challenges as follows:

For me the relationship with the sound became that – I felt like I was blessed in moments by a thick, association rich environment and then it might disappear. You saw a moment of disappearing there. Where I got disconnected would be those gaps, often.<sup>24</sup>

Or, later on she remarks:

It feels like there's a strong platform of sound in this position – a well of sound – that I can trust to support a melody. You'll see it's a sudden leap into quite a dense sound environment – and there's an interval there immediately. (Kate)<sup>25</sup>

Kate's statements point towards an engaged relationship with the AI at an imaginative level, through sound. She also comments on achieving short moments of flow. Diana, as an external observer, also notices that:

During a session of one and a half hours of improvisation, I can perceive an increase in Kate's presence and engagement, especially towards the end of the session.<sup>26</sup>

Later on, Diana reinforces this observation with another note in the annotation system:

Very interesting moment! This resembles very much a call and response. Kate, in this moment here, you've managed to establish a very beautiful "organicity" with the machine. It goes in the direction of a truthful partnership.<sup>27</sup>

<sup>24</sup> Ryan, Kate. sessionone23FEB22.MP4.00:08:01.812. Text Annotation on Video. Motion Bank App.

<sup>25</sup> Ryan, Kate. Sessionone23FEB22.MP4.00:08:09.575. Text Annotation on Video. Motion Bank App.

<sup>26</sup> Serbanescu, Diana. Personal Notebook. 22 Feb.

<sup>27</sup> Serbanescu, Diana. sessionone23FEB22.MP4.00:12:19.292. Text Annotation on Video. Motion Bank App.

Further testimony on experiential trials of the wearable device comes from Ilona, our invited embodied voice expert. She comments on her experience:

During the lab session in Berlin in December 2021, I had a chance to wear the device myself. This experience gave me an idea of how the technology can be used as a partner in the absence of another performer, enhancing sensory awareness and providing impulses that eventually guided me towards a flow and heightened psychophysical state while listening attentively to the feedback sounds produced by the machine.<sup>28</sup>

# **Strand Two: Reflections on the studio-based work and on perceptions of the AI**

During the last week of work in February, the work was documented with video recordings of the studio sessions. These were uploaded to the Motion Bank platform. We then used collaborative annotation to reflect and comment on the work that had taken place in the studio.

This had two objectives. The first was to reflect on the processes of developing the technology to augment the training practice as part of the design process. The second was to probe the perceptions and attitudes experienced by the team toward the AI-based system itself. Both drew primarily on questions to and responses from Kate, as the performer practitioner, whose relation to the AI system constituted the experience at the focal point of the research project. An example of the kind of question posed about her experience of the practice itself came from Ilona who wrote in the annotation: "*It would be interesting from my perspective if Kate annotated the moments of flow in the other video and reflected on the moments triggering change, any particular significant situations or moments where for example you got bored*".<sup>29</sup> Kate responded by annotating several instances (see figure 4) in the practice session recording with the following sequence in close succession:

For Ilona – this is the tentative beginnings of trying to find a relationship with the sound / improvisation. Not bored yet, but definitely in testing mode, not flow. (05:10)

For Ilona – again, not bored exactly but definitely not flow. It felt like I didn't find much to follow there and I decided to cut to a different position. I think what follows about getting to an upright position is deliberate rather than in flow. (06:08)

For Ilona – perhaps beginning of a little flow here! (09:38)

<sup>28</sup> Krawczyk, Ilona. Personal Correspondence (Email). 27 Nov. 2022.

<sup>29</sup> Krawczyk, Ilona. sessiontwo23FEB22.MP4. 05:27. Video Annotation. Motion Bank App.

For Ilona – more flow here. (12:19)

For Ilona – definitely flow. (12:33)

For Ilona – I have a feeling I didn't know where to take the previous moment of flow. That I was scared of losing the connection with the sound so didn't follow it through past a certain point. (13:16)<sup>30</sup>

*Fig. 4: A sequence of annotations by Kate Ryan about "flow". Credit: Motion Bank*

In this context, Kate is reporting on the felt sense of "flow" to Ilona, who is an acknowledged expert in this form of embodied voice practice. As such, it is likely somewhat of an emic shorthand, denoting the practical knowledge of Kate and Ilona as recorded in this exchange between them. There are no references here to the AI, to the sensors or the computer, which is an interesting set-up for questioning their perceptions of the AI systems. These questions tended to focus on the form of agency attributed to the AI system and explored concepts such as tools, agency, control and predictability. An example of one of these involved asking Kate about her experience of being in an embodied relationship to it. In figure 5, there is a question about 'control' and Kate has responded with an audio annotation.

<sup>30</sup> Ryan, Kate. sessionone23FEB22.MP4. 05:10-13:16. Video Annotations. Motion Bank App.

*Fig. 5: Two annotations. One written and one audio. Credit: Motion Bank*

Written [edited] annotation question:

Kate you used the word control a few times, for example in another annotation you wrote about placing the sensors in specific locations for 'control purposes'. How are you thinking about the relationship to the computer here? Is it the computer you are controlling, the algorithm, the synthesizer? (scott)<sup>31</sup>

Audio annotation response:

It's a very interesting question, I think I'm guilty of anthropomorphising or accepting the technology much more than everyone else in the process and have been since the beginning. I'm not thinking at all about the computer, the algorithm certainly not, the synthesiser no. My initial relationship to the sound was much more to be controlled by it. To try and control it… which is something Ilona noticed very early in the process as well. And even when I did start to get some control in terms of having some knowledge of my body, which meant I could little bit predict what would come, I think that was still the basic relationship of that interaction. (kate)<sup>32</sup>

Here Kate is describing an evolving relationship with the technology. It makes clear this relationship isn't any one thing, but something that changes over time, in relation to the specific embodied practices involved. Kate refers to herself as "guilty" of anthropomorphising the technology. There are studies showing that when people anthropomorphize AI technology, this has an impact on how much they consider the AI itself to be accountable for things that happen (e.g. Epstein et al. 2020). This has a clear implication for AI and ethics, because it calls into question whether the machine should be held accountable or those who developed it. But Kate also says she is paying no attention to the computer (also reflected in her comments on flow), in part because her attention is directed towards her embodied practice. What is interesting is how these two forms of reflection are entwined. One concerns the (embodied) feelings of the experienced practitioner in accomplishing the task, which involves paying some kind of attention to the location of the sensors, but the sensors themselves are not the focus of attention. The fact that attention can shift here to the sensors, to the AI system and some form of agency temporarily attributed to it seems to underpin the complexity and relational fluidity that exists in these kinds of assemblages.

<sup>31</sup> deLahunta, Scott. sessionone23FEB22.MP4. 10:32. Video Annotation. Motion Bank App

<sup>32</sup> Ryan, Kate. sessionone23FEB22.MP4. 10:32. Video Annotation/ Audio. Motion Bank App

# **Conclusions and Areas of Further Research**

We believe there is rich potential for future research and exploration involving the study of small-scale artistic research projects like this one (as a techno-social system in miniature) with a view toward developing and refining tools for these entwined forms of self-reflection. As previously stated, AI-based systems are products of collaborative and team effort. If these efforts are also viewed as assemblages, the question is not whether or not the same kind of self-reflective processes might apply, but how to apply them. This points toward an ethics-in-practice approach that could potentially inform the wider discussions about AI and ethics. Therefore, we call for more projects like this one to be further studied and consulted by other fields. There are other research areas that can be further developed here as well. One that our project has already begun is creating a new epistemology of practice that goes somewhat against the original ethos of the post-Grotowskian training: originally this training was specific about human-to-human interaction, and embodied exchange of energy. It is interesting to discover what changes when the machine gets involved, and what the limitations of this system are when it encounters such an established tradition of embodied practice and research.

# **Bibliography**

Adam, Alison (2006): *Artificial Knowing: Gender and the Thinking Machine*, 1st ed. [Kindle] Routledge.


# **Sound of Contagion – An Artistic Research Project Exploring A.I. as a Creative Tool for Transmedial Storytelling**

*Wenzel Mehnert, Robert Laidlow, Chelsea Haith & Sara Laubscher*

### **Introduction**

Sound of Contagion (SoC) is a transmedial art project addressing the cultural narratives surrounding global diseases and pandemics through different media. It is born out of the collaborative partnership between the University of the Arts in Berlin and the University of Oxford and explores the use of Artificial Intelligence (A.I.) in creative processes. We used texts written by the machine learning algorithm GPT-2, which was trained on narratives about pandemics from the last 2500 years. From the resulting text fragments, we built a storyworld that served as a basis for narratives, illustrations, musical compositions and a lecture performance.

Besides the aesthetic and creative output, the project is a research endeavour that explores the usage of A.I. as a creative tool to facilitate interdisciplinary working. Through the practical collaboration between artists, researchers and technology, the imaginary that A.I. can independently and consciously create art is shifting towards the notion that A.I. becomes a tool that inspires and encourages collective creative processes. Thus, the work follows a collaborative paradigm and acknowledges A.I. as a creative tool with its own unique aesthetic.

This article is an insight into the project. It is a case study of how A.I. is used as a creative tool within an interdisciplinary collective working in the fields of literature studies, cultural studies, composition and visual arts.

#### **About the project**

The project Sound of Contagion (SoC) understands itself as a practice-based research project that uses artistic practices as an approach to understand A.I. as a creative tool. It is born out of the Oxford / Berlin Creative Collaborations activities (supported and funded by the University of the Arts in Berlin (Germany) and the University of Oxford (UK)), which foster projects and research across the arts and humanities, and is supported by the Minderoo Foundation through the Minderoo-Oxford Challenge Fund.

From the beginning, we focused on an interdisciplinary collaboration, as each member had a different artistic and academic background: Chelsea Haith, with a background in publishing, journalism and gender studies, and Wenzel Mehnert, with a background in cultural studies and narratology, represented the humanities. Sara Laubscher, working as a professional illustrator, and Robert Laidlow, composer, researcher and AI expert, represented the artistic side. Owing to the interdisciplinary nature of the project, it started with several different research questions that guided the thought process without limiting the creativity and necessary openness of an artistic research project:

How can A.I. be used as a tool within an interdisciplinary, creative project? How do we acknowledge the unique agency of the tool? How do we integrate it within a human-technical collaboration?

The project started at the beginning of 2020, shortly after the Covid-19 virus was first reported in Europe. We took this global transition into the unknown future of lockdowns and social distancing as the starting point for our endeavour. Because our project partners were distributed internationally and subject to lockdowns, we were forced to set up the project solely through digital communication technologies like Zoom and Miro. Some of us had not met in person, but only digitally.

# **Dataset, Data & Processes**

#### Dataset

Since we were cognitively occupied with the pandemic, the project became an opportunity to process and deal with the shifting situation through the experiences. We decided to work with narratives of previous generations and how they dealt with pandemics, epidemics or similar global phenomena. Principally led by Chelsea Haith at this stage, we selected texts, all dealing with pandemics and plagues, as the dataset to train the A.I. and curated the selection from the last 2500 years of pandemic literature. We permitted novels and plays only. One reason was the focus on narratives and reappearing tropes that could hint at cultural patterns; the other was that we were interested to see what a machine learning algorithm might do with these. The selection of the dataset already had interesting consequences for the outputs generated.

We divided the texts into two data sets – Sophocles to the year 2000 was one data set, and texts published between 2000 and 2020 the other. This splits the total dataset roughly in half. This split provoked its own insights into cultural preoccupations in the last two decades as well as the increased popularity and thus production of dystopian and apocalyptic narratives in popular literature in the same period. As a total data set, these texts serve as cultural artifacts and represent the collective memory from which we draw and confront the changing environment. The earliest of our selection was David Mulroy's translation of Sophocles' *Oedipus Rex* (Sophocles, 2014 [429 BC]), while the latest was Lauren Beukes' *Afterland* (2021). These texts were used to train the A.I. model with the aim to generate new text fragments based on the dataset.

#### Data

To create an A.I. model that could generate new texts, based on existing texts referring to pandemics, we first had to make two decisions:

What would comprise the dataset upon which the A.I. was trained? Which A.I. algorithm would we use?

The choice of training data has been shown to make an enormous difference to the output of generative A.I. algorithms, even when other parameters are left the same (Leeming 2022; Mehri et al. 2017; Caillon & Esling 2021). Similarly, two different A.I. algorithms trained on the same kinds of dataset can provide entirely distinct generations; compare, for example, 'WaveNet' (Van den Oord et al. 2016) trained on Beethoven sonatas with the PRiSM-SampleRNN re-implementation trained on the same (Melen 2020).

We first decided upon the choice of A.I. algorithm, which was to be GPT-2 Simple (Woolf 2019). GPT-2 Simple is a reimplementation of the popular and well supported GPT-2 A.I. algorithm (Radford et al. 2019), developed and released by OpenAI. We chose this algorithm because of its capacity to be fine-tuned to produce specific types of text, while still utilizing rules it has learned through general training. In this context, fine-tuning means that we began with a GPT-2 model that had already learned the 'rules' of English and then we directed this model specifically to learn to apply this to the context of our dataset. We felt this made it superior for the task we had in mind to other contemporary algorithms, which relied only on the fine-tuned dataset (e.g., WordRNN). We also had prior experience with this algorithm (de Roure et al. 2019) which helped in speeding up the early process. Finally, GPT-2 Simple is easily accessible online through Google Colab, and does not require a GPU on a personal computer to train. This allowed each team member to access it with their own machine during COVID-19 lockdowns, facilitating easier discussion and collaboration.

For the choice of dataset, we decided upon a set of 21 pandemic-related fictional texts, taken from across the last two millennia (as described above). This decision was reached as a compromise between wanting a substantial amount of data to train GPT-2 Simple on and our interest in being able to – or at least, thinking we were able to – draw links between specific GPT-2 Simple generations and specific works in the dataset. We felt that identifying the A.I. algorithm's reference points would assist in the world-building and narrative probes element of the project.

# Training the Models

We initially trained GPT-2 Simple several times using this dataset, creating a variety of models. The most important parameters included the base type of model that we then fine-tuned (124M or 355M), how the dataset was divided, and for how many steps the model was left to train before we sampled generations from it. From this initial experimentation and iteration, we chose a set of parameters we felt were most suitable for our project. This decision was made collaboratively and subjectively, according to our qualitative judgment of the models' text generations.

To create the text, we made the following decisions:

We used the 124M model as a base. Generations from this baseline model tended to reflect more strongly the ideas of pandemics that our dataset focussed upon.

We split the dataset (in full, 12MB in size) into two halves. These were texts written prior to the year 2000 (8.1MB in size) and post-2000 (5.7MB in size). Each dataset was therefore more homogeneous than the full dataset, which allowed for finer control of text generation.

We trained the models for 4000 steps on Google Colab.
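The dataset split described above can be illustrated with a minimal Python sketch. It is not the project's actual preprocessing: the `SourceText` class, the function names and the placeholder contents are assumptions made for illustration, and assigning the year 2000 itself to the later half is likewise an assumption, since the text does not say where the boundary year belongs.

```python
from dataclasses import dataclass


@dataclass
class SourceText:
    title: str
    year: int      # publication year; negative values stand for BC
    content: str   # full text of the work (placeholder strings below)


def split_by_year(texts, cutoff=2000):
    """Partition texts into (published before cutoff, cutoff or later)."""
    pre = [t for t in texts if t.year < cutoff]
    post = [t for t in texts if t.year >= cutoff]
    return pre, post


def size_in_mb(texts):
    """Approximate corpus size in megabytes when encoded as UTF-8."""
    return sum(len(t.content.encode("utf-8")) for t in texts) / 1_000_000


# Three titles named in this chapter; the contents are placeholders.
corpus = [
    SourceText("Oedipus Rex", -429, "placeholder text"),
    SourceText("The Stand", 1978, "placeholder text"),
    SourceText("Afterland", 2021, "placeholder text"),
]

pre_2000, post_2000 = split_by_year(corpus)
print([t.title for t in pre_2000])   # material for the pre-2000 model
print([t.title for t in post_2000])  # material for the post-2000 model
```

Each half would then be written to its own training file and used to fine-tune a separate model.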

Once we had created the two fine-tuned models (pre-2000 and post-2000), we sampled text generations from them. These generations were both 'seeded' and 'not seeded'.

In this context, seeded means that we provided the beginning of a sentence and the A.I. algorithm continued from that point. Not seeded meant that the model was freely generating text. Seeding texts allowed us to 'point' the model towards talking about specific elements of pandemics, whereas not seeding allowed us to see how the model would 'write' without a specific context, after having been fine-tuned on a pandemic dataset. Generations also have a 'temperature' between 0 and 1 which is a rough indication of how much randomness is introduced into the system. The 'temperature' of the generations we created was generally 0.8 as this gave us interesting texts that made, in our opinion, at least some kind of sense. There was some variation here, with some generations used being as high as 0.95 or as low as 0.5.
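To make the role of 'temperature' more concrete, the following sketch shows how temperature is commonly applied when sampling from a language model: the model's raw scores for each candidate token are divided by the temperature before being turned into probabilities. The scores below are invented for illustration, and the function names are hypothetical; this is a generic illustration rather than a description of the exact code path inside GPT-2 Simple.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    highest = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature, rng):
    """Draw one candidate index according to the temperature-adjusted probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Invented scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
for temperature in (0.5, 0.8, 0.95):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring candidate, while values closer to 1 leave more room for less likely continuations, which matches the trade-off between sense and surprise described above.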

On balance, the fine-tuning method seemed to us to be a success in that the models would consistently write about plagues, fevers, illness, chaos, death, destruction, displacement and other related themes, even when they were not seeded. While the models did not appear to overfit (copy exactly from the dataset), it was sometimes clear that some parts of the dataset had a greater prominence than others in its generations. A stand-out example would be Stephen King's *The Stand* (1990 [1978]) in the pre-2000 dataset, which was particularly long at 1152 pages.

Through this iterative process of training, sampling, dataset adjustment, and further training, we began to get a feel for how the two fine-tuned models wrote, and how we might get different kinds of responses out of them. We instructed the two models in exactly the same way, providing us with a contrasting pair of generations for each seed. The post-2000 model, for example, was prone to creating successive lines of dialogue. We used several texts from this model that were seeded with a single quotation mark and with single-word pronouns. The pre-2000 model more often described landscape in more poetic language. From the pre-2000 model we utilized more generations seeded with phrases such as 'the pandemic', 'the plague', and 'when it began'.

With the training of the models complete and several hundred generations sampled, the project moved to the next stage involving world-building, narrative probes, and transmedial collaboration.

#### **A.I. as a creative tool**

One of the guiding principles of the project was to treat the results of the A.I. as a unique form of creativity and to acknowledge the creative agency of the algorithm, with its own flaws and failures, as a new aesthetic form. One of these "failures", in the context of fictional writing, is that the text itself does not hold any sense – as the author did not mean to create any sense in the first place. One figure of thought that helped us to understand our relation to this text comes from Roland Barthes's famous essay *The death of the author* (1978 [1968]). Here Barthes proclaims that the origin of a text does not lie with the author, but with its destination: the reader (142 ff.). Barthes starts with a description of an author which he argues against:

The Author, when believed in, is always conceived of as the past of his own book: book and author stand automatically on a single line, divided into a before and an after. The Author is thought to nourish the book, which is to say that he exists before it, thinks, suffers, lives for it, is in the same relation of antecedence to his work as a father to his child. (145)

In our case, there is no author – it is not even dead as it has never been alive – and therefore no message or moral to convey is purposefully inscribed in the text. Furthermore, the A.I., particularly the GPT-2 algorithm we used, still lacked a semiotic understanding. It did not make connections between the characters, or the places mentioned in the text fragments. Rather, the A.I. produced text fragments that are similar to the narratives used as training data, without understanding the context or the connections between the elements of the narratives. It recombined the passages we fed it with, and the results are an arbitrary combination of different writings, like Barthes describes in the following:

A text is made of multiple writings, drawn from many cultures and entering into mutual relations of dialogue, parody, contestation, but there is one place where this multiplicity is focused and that place is the reader, not, as was hitherto said, the author. The reader is the space on which all the quotations that make up a writing are inscribed without any of them being lost; a text's unity lies not in its origin but in its destination. (148)

In the context of fictional writing, this means that the meaning of a text is given by the reader, who constantly recreates the sense and searches for coherence – this is particularly true when reading text fragments created by a GPT-2 algorithm, as they lack not only meaning but also a form of coherence. The non-coherent text created by the A.I. forces the reader to constantly seek meaning and brings the reader into the active role of interpretation and drawing connections. This is the unique aesthetic of text fragments produced by a GPT-2 algorithm. Thus, as the A.I. was not capable of doing it, the practice of sense-making and creating coherence had to come from us.

# Worldbuilding

For this reason and following the guiding principle described above, we considered all the text fragments as *Narrative Probes* (Fischer & Mehnert 2021). Narrative Probes are flash-fiction-like narratives that all probe into the same possible world, following a specific logic that we had to uncover. The narratologist Marie-Laure Ryan uses the metaphor of the text as a window onto the extralinguistic realm of characters, objects, facts, and states of affairs serving as referents to the linguistic expressions (Ryan 2001: 91). Reading and getting immersed in literature transports your mind into this extralinguistic realm, a world that is constantly formed in your head as you read on. Ekman & Taylor call this process *readerly worldbuilding* and argue that the reader carries out a construction of a world by adding pieces to a structure already in place, evaluating the pieces in light of what is known about the world so far, and, conversely, evaluating the whole world in light of new pieces (Ekman & Taylor 2016: 11). With this perspective in mind, we treated the text fragments as a multitude of windows that showed us snippets of a storyworld that we as readers had to piece together. Therefore, through a structured deconstruction of the text fragments, we engaged in a worldbuilding process based on the textual cues that we found in the fragments, loosely building on the work by Wolf (2012) and Mehnert & Fischer (2021):

In the first step we detected the elements of the world. For this process, five cues were of specific importance to us: Places, events and references to time, characters mentioned, values and norms as well as specific objects.

In the second step, we enriched the elements with additional information that either came from the text or was already given by the logic (e.g. when the text refers to a leader, there is also a group of people following).

The third step was to create connections between the enhanced elements (e.g. if two characters were mentioned in two different text fragments, and we assume they both exist in one world, what is the connection between the two?).

### Example

To give an example, we will explain the process with the following two text fragments:

#### Fragment I:

*They came to a real leader, they said to each other. A real man, not a fake leader, not an actor who pretended to be a leader. He was real. He lived, and he led his people to a real leader. This real man was the one they called their real leader. He was named Mannum, and Mannum knew this about himself; he had practiced seduction and charm the long time. He was a good liar, a real one. The pandemic of war had begun.*

#### Fragment II:

*"Then, friend", said Piranesi,"make death eternal in the liquor laws, so that no sick person may return to drink after consuming what they drank".* 

In Fragment I we learn two important things about the world: (1) In this world exists a character called *Mannum*. We can say that *Mannum* is not an honest leader and a good liar, thus he is not to be trusted and rather an evil character. (2) We also have an event, the *pandemic of war* that stands in connection to *Mannum* as it has been mentioned right after. So in the world that we want to explore, there is some kind of a biological war going on but we do not know why and who else is involved.

In Fragment II, two characters are mentioned. The first character, named Piranesi, is talking about *making death eternal*. When enhancing this character, we assumed that Piranesi is a chemist who was responsible for creating a deadly disease. The second character in this fragment is *friend*. We decided that *friend* is a reference to Mannum, the leader from Fragment I, by which we connected both fragments. Based on the cues given in these two text fragments, a narrative is formed about a conspiracy plotted by Piranesi and Mannum to create a biochemical weapon to start a war.
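The three steps of detecting, enriching and connecting elements can also be pictured as a small data structure. The following Python sketch is only an illustration of that logic using the two fragments above; we worked with the fragments by hand, and the class and method names here are hypothetical.

```python
from dataclasses import dataclass, field

# The five cue types named in the first step.
CUE_TYPES = {"place", "event_or_time", "character", "value_or_norm", "object"}


@dataclass
class Element:
    name: str
    cue_type: str
    fragment: str                               # where the cue was found
    notes: list = field(default_factory=list)   # filled in during step 2


@dataclass
class World:
    elements: dict = field(default_factory=dict)
    connections: list = field(default_factory=list)

    def detect(self, name, cue_type, fragment):
        """Step 1: register an element found in a fragment."""
        if cue_type not in CUE_TYPES:
            raise ValueError(f"unknown cue type: {cue_type}")
        self.elements[name] = Element(name, cue_type, fragment)

    def enrich(self, name, note):
        """Step 2: attach information taken from the text or implied by logic."""
        self.elements[name].notes.append(note)

    def connect(self, a, b, relation):
        """Step 3: link two elements assumed to exist in the same world."""
        if a not in self.elements or b not in self.elements:
            raise KeyError("both elements must be detected first")
        self.connections.append((a, relation, b))


world = World()
world.detect("Mannum", "character", "Fragment I")
world.detect("pandemic of war", "event_or_time", "Fragment I")
world.detect("Piranesi", "character", "Fragment II")
world.enrich("Mannum", "leader and good liar, not to be trusted")
world.enrich("Piranesi", "assumed to be a chemist who created a deadly disease")
world.connect("Piranesi", "Mannum", "addresses as 'friend'")
world.connect("Mannum", "pandemic of war", "mentioned right before")
print(world.connections)
```

The interpretive work remains with the readers: the notes and relations entered here are exactly the assumptions we made while reading, not something the fragments state.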

# **Outcomes**

# Nested Narratives

Working with A.I. in such a way can be seen as an intriguing source of inspiration. It gives you material to play around with, to dig in, to take out, to transform, to iterate and to make sense of. The result of our sense-making process was a nested narrative, meaning a story within a story, within a story. Each part plays in a different time and features different characters; however, they are still connected with each other.


The first part is a short story from the 17th century about the two characters introduced above, Piranesi & Mannum. Both are plotting a war using a biochemical weapon that fails and extinguishes Mannum's kingdom and the people living within. In the second part, we follow the story of a little girl called Emily. At the beginning of the story, she reads a comic about the story of Piranesi & Mannum; this is how the first narrative is nested within the second. While she reads the story, a TV news anchor is presenting the latest development of an actual pandemic that is raging all over the world. We situated this scene in the year 1981, and the pandemic to which the news anchor is referring is the AIDS pandemic. Emily's narrative is again nested within a third narrative. The last one plays in 2021. Here we have text fragments about a person that we called The Writer. He is similar to our real lived experience in 2021. Due to the Covid-19 restrictions and the social lockdown, he stays alone in a small apartment in New York. To stay sane, he starts to write a story about a young girl called Emily, which is how the second narrative is nested within the third. Furthermore, the writer mirrored our own experience, as we were in lockdown during the production phase of this project as well.

# Illustrations

As mentioned in the introduction to this project, Sound of Contagion follows a transmedial approach to storytelling; thus we transformed the narratives into other media beyond text. The first transformation was based on the illustrations by Sara Laubscher. We picked key scenes of each nested narrative as prompts for Sara, who illustrated them in her own style. She got inspired and created the second output of our project.

The first illustration shows Mannum and Piranesi, sitting in an old library plotting how to use the biochemical weapon. In the second illustration we see Emily, who is sitting in front of a television set. The caption of this image is the report of the news anchor. We read it as the voiceover of the newscast. This section indicates that it was drawing from the part of the data set coming from the 20th century rather than from the Renaissance. The discourse around virus transmission, the effect on different geographical regions, and the conceptualisation of the pandemic as a weapon all point to the effect of social collapse during pandemic conditions. At the same time, we can identify reappearing tropes that often come up when we are dealing with topics of contagion, plagues or pandemics. The Writer, who became a point of self-reflection but also the mastermind of our narrative, is depicted in the next illustration. Here we see the lonely writer sitting in his small apartment, haunted by his memories and trying to stay sane by coming up with stories about pandemics from different centuries.

*Fig. 1: Illustrations from the Sound of Contagion project. Copyright: Sara Laubscher*

As mentioned above, the illustrations were the first transformation of the storyworld. The next transformation came in the form of music.

### Musical Adaptation

The Sound of Contagion text was further adapted into several pieces of music by Robert Laidlow, and one by composer Marco Galvani. These include *The Writer*, premiered online during COVID-19 lockdowns, and the piece *Disc Fragments*, both performed by Bandwidth Ensemble. The following is an insight into the creative process of Robert and how he reflects on the process of working with the text fragments.

For a 'Sound of Contagion' event at Oxford University in November 2021, I (Robert Laidlow) composed a piece for tenor and synthesizer called *Disc Fragments*, which used texts generated as part of this project. This piece is in seven movements, the first three and last three being somewhat symmetrical – that is, Movement 1 is similar to 7, 2 to 6, and 3 to 5. This arch form was a parallel to the nested narrative form that Chelsea Haith assembled from 'GPT-2' texts we had generated earlier. The texts for six movements (all except Movement 4) were generated by fine-tuned 'GPT-2'. These texts dictated the form and material of the music; here, the first movement, *At Delphi*, is discussed as an example. There was a certain mysticism that intrigued me in this text:

*Fragment III:*

*At Delphi A Prayer Let not the Impossible Him The Impossible Him take what he has done Take it, Steal it, put the word in your mind, take it, the other choice.*

The use of 'Delphi', 'the Impossible Him', and 'put the words in your mind' implied to me that 'GPT-2' could be read as imitating an oracle. I wanted the music to create a sense of ritual and of unseen pattern, similar to (for example) Messiaen's *Quartet for the End of Time*. The voice part is constructed from a series of interlocking patterns. Ten pitches repeat, superimposed on a rhythmic pattern repeating every nine notes. The text is simply applied to this pattern. The synthesizer is set to an organ voice and uses chorale-style material generated by AI, using an algorithm developed by Omar Peracha (https://omarperacha.github.io/make-js-fake/). This repeats itself every twenty-three beats.
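The interlocking of the vocal patterns can be worked through as simple arithmetic. The sketch below assumes that the rhythmic pattern consists of nine durations cycled against the ten pitches; the labels are placeholders, since the actual pitches and durations of the piece are not given here, and the function names are hypothetical.

```python
from itertools import cycle, islice
from math import gcd

# Placeholder labels: only the cycle lengths (10 and 9) come from the text.
pitches = [f"pitch{i}" for i in range(10)]
durations = [f"dur{i}" for i in range(9)]


def interlock(pitches, durations, n_notes):
    """Pair successive pitches with successive durations, cycling both lists."""
    return list(islice(zip(cycle(pitches), cycle(durations)), n_notes))


def combined_period(a, b):
    """Number of notes before both cycles realign (least common multiple)."""
    return a * b // gcd(a, b)


period = combined_period(len(pitches), len(durations))
notes = interlock(pitches, durations, period + 1)
print(period)                      # notes until the pairing starts over
print(notes[0] == notes[period])   # True once both cycles have realigned
```

Because ten and nine share no common factor, every pitch meets every duration exactly once before the combined pattern repeats, which is one way to understand why the pattern is heard as ordered yet never quite settles.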

In this A.I.-generated text, among others, I was struck by its sudden ending. When 'GPT-2' is generating, the user instructs how many characters to generate. Due to its transformer architecture, 'GPT-2' does not plan forward. It only looks back to what has already been generated whenever it generates a new token. When it reaches its arbitrary character limit set by the user, it simply stops. It cannot plan a 100-character length 'story', for example, because it is not able to plan forward. In this movement the music simply stops when the text comes to an end, which does not coincide with the end of any pattern described above. The inspiration for such a technique comes also from composers such as Birtwistle (e.g., *Carmen Arcadiae Mechanicae Perpetuum*) and Edmund Finnis (e.g., *The Air, Turning*).
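The 'looking back only' behaviour can be shown with a toy generation loop. This is a deliberately simplified sketch, not GPT-2 itself: the hypothetical `toy_next_token` function stands in for the model, and the length limit is counted in whole words rather than in the units a real model uses.

```python
def generate(next_token, prefix, length):
    """Extend prefix one token at a time until the length limit is reached."""
    tokens = list(prefix)
    while len(tokens) < length:
        # The next token is chosen from the history alone; nothing in the
        # loop knows how many tokens remain before the limit.
        tokens.append(next_token(tokens))
    return tokens


SENTENCE = ["the", "plague", "came", "to", "the", "city", "."]


def toy_next_token(history):
    """Stand-in for a model: keeps cycling through one fixed sentence."""
    return SENTENCE[len(history) % len(SENTENCE)]


output = generate(toy_next_token, prefix=[], length=10)
print(" ".join(output))
```

With a limit of ten tokens the loop completes one seven-token sentence and then breaks off three tokens into the next, in the same way the generated fragments end wherever the limit happens to fall.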

#### Lecture Performance

As already hinted at in the paragraph above, the last transformation of the work came in the form of a lecture performance at the University of Oxford in November 2021. The performance was a mixture of different media and perspectives. The first part introduced the process and the creation of the fragments and explained the worldbuilding process. In the second part we read the narratives, presented the illustrations, and performed the musical pieces on stage. Throughout the performance we presented different perspectives of the project through different means of interaction: while Robert and the musicians were on stage, both the reading of the fragments by Wenzel and the contextualisation of the fragments by Chelsea came via Zoom. After the presentation, there was a Q&A session in which the audience was invited to pose questions about the project and engage in a discussion with the performers.

#### **A.I. for transmedia collaborations**

Through this project, we found that A.I. acted as a useful way to facilitate collaborations across different media and disciplines. We were each able to use algorithms generating text and images, despite each having very different levels of previous experience with algorithmic processes or computer programming in general. This allowed us to contribute meaningfully to artistic areas that were not naturally our own. Going forward, it seems as though A.I. might be useful as a means to 'try' another collaborator's discipline. This might allow collaborators to much more quickly engage in higher-level discussions and decision-making.

In addition, using A.I. as part of a collaborative artistic research project inherently mandated many creative decisions to which we all contributed. During the training portion these included decisions such as what comprised the dataset or datasets, the type of model we were going to fine-tune, what our preferred parameters were when training the models, and how we were going to sample generations from these models. Later, they included which elements of the generations we would keep or discard, and how much (if at all) we would adapt or edit them. Since there was no dedicated human writer on the team, these decisions were made collectively.

Utilising A.I. as part of a creative process also differed from a normal collaborator in the sense that A.I. does not fundamentally understand the context of what it is writing. While it can create plausible-sounding sentences and stories, it is not aware that it is doing this, and there is no intentionality behind any of the 'creative' decisions it makes. This is very different to most human writers. We therefore felt empowered to make radical decisions when treating the text, including transforming it into entirely different media, but also felt liberated in leaving it in its found-state, even if that found-state was nonsensical and narratively underdeveloped. None of our readings of the material it generated could be 'right' or 'wrong'. This is especially true of the GPT-2 model we were using when compared to models released closer to the time of this paper, which have shown significant improvement in understanding underlying context of text and ability to develop logical statements (Griffiths 2022).

#### **Conclusion**

In this project, we explored the use of A.I. as a creative tool for transmedial storytelling. The project points to the cultural-semantic flaws that are peculiar to the creative results of A.I. works and fills this gap with the sensemaking work done by human actors. SoC challenges the myth that A.I. could replace authors, poets or other creators of art, entertainment or fiction in general. Instead, by using them in a creative process, A.I.s make an important contribution to new art forms, foster inspiration for artistic practices and thus promote independent modes of expression.

We started with a GPT-2 algorithm, trained on a vast amount of fictional literature about pandemics and plagues, that generated several incoherent text fragments. Through the process of worldbuilding, linking the fragments and creating an imaginary world, we acknowledged the unique aesthetic of the A.I., its flaws and particular form of nonsense-writing, and made the tool part of the creative collaboration between different disciplines and practitioners. By documenting the process, we showed that A.I. is not an autonomous actor that creates art on its own, but instead that it has the potential to become a tool in an emerging toolbox and that it needs new artistic approaches to make use of this tool and shape it to its own desires.

Last but not least, this insight into the process also serves as an inspiration for continuous, creative engagements with A.I. software. As past events have shown – and surely future development will prove – A.I. technology becomes a fixed part in our everyday life and in the workflow of designers and artists. Therefore, it is necessary to explore approaches of adaptation, appropriation and domestication within new routines and creative processes.

### **Bibliography**

Barthes, Roland (1978): *Image-Music-Text*. Hill and Wang.

Beukes, Lauren (2021): *Afterland*. Penguin.


Van den Oord, Aaron / Sander Dieleman / Heiga Zen / Karen Simonyan / Oriol Vinyals / Alex Graves / Nal Kalchbrenner / Andrew Senior / Koray Kavukcuoglu (2016): *WaveNet: A Generative Model for Raw Audio*. *ArXiv.* arXiv:1609.03499v2 (cs.SD).

Wolf, Mark J. P. (2012): *Building Imaginary Worlds*. Routledge.

Woolf, Max (2019): *GPT-2-Simple* https://github.com/minimaxir/gpt-2-simple (21 January, 2024).

# **Discography**

Birtwistle, Harrison (1978): *Carmen Arcadiae Mechanicae Perpetuum*.

Finnis, Edmund (2016): *The Air, Turning*.

Messiaen, Olivier (1941): *Quartet for the End of Time*.

# **Challenges and Opportunities for Computational Construction of Narratives**

*Pablo Gervás*

# **Introduction**

As Artificial Intelligence enters a new era where large language models show surprising capabilities for handling language, it becomes important to review the challenges and insights accumulated by the community of researchers that have worked on building computational models of literary creativity over the years. The efforts of this community have at times achieved success at small tasks and at other times faced failure when over-ambitious goals were pursued. It would be important for efforts on related tasks undertaken from this point on to keep in mind the insights accumulated on the nature of the task and the challenges it presents. The present paper attempts to summarize some of these insights into a set of challenges that face the automated generation of narratives, but also as a set of opportunities open for the future.

The paper is structured as a review of a number of possible ways to formulate the problem of having a program generate automatically a story on demand, followed by some reflections on how storytelling might be subdivided into a set of related subtasks as suggested by that set of formulations of the problem.

# **Programs that Deliver Stories**

The possible ways of building programs that deliver stories explored in this paper are presented from the simpler formulation towards increasingly complex approaches. At each point the challenges (or lack thereof) and opportunities presented are analyzed, trying at each point to lead on to the approach discussed after it.

# How to Request Stories

The simplest possible formulation for requesting a story from a story generation program is one where there is no input and the output should be a story. Such a program might be built by taking some existing repository of stories and selecting one story from it at random. The user would get a new story every time, and she would have difficulty telling whether the story has been generated by the system or selected from a set of prewritten stories. A slightly more elaborate approach would be to consider an input in terms of a series of keywords or phrases to narrow down the set of possible stories returned. The development of a system under this approach requires the existence of a vocabulary for presenting queries formulated in terms of various characteristics such as: details about the concepts in the story world, details about characters and relations between them, plot structures that capture causal and temporal relations between events in the story, or details about the potential effect of the story on its audience. The most elaborate approach would be to provide a text prompt that describes the story in some way, and have the story generator provide a story matching the prompt.

Existing story generators have progressively shifted from the early attempts that worked on an empty input to the most recent neural solutions that accept complex text prompts, with intervening systems that worked on keyword-based queries of different complexity.
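The two simpler formulations, no input at all and a keyword-based query, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a description of any existing system: the `Story` class, the `request_story` function and the toy repository are all hypothetical, and keywords are matched exactly instead of through a richer query vocabulary.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    title: str
    text: str
    keywords: frozenset   # lower-case descriptors attached to the story


def request_story(repository, keywords=(), rng=None):
    """Return a random story; if keywords are given, only stories matching all of them."""
    rng = rng or random.Random()
    wanted = {k.lower() for k in keywords}
    candidates = [s for s in repository if wanted <= s.keywords]
    return rng.choice(candidates) if candidates else None


# Toy repository invented for illustration.
repository = [
    Story("The Dragon", "A knight meets a dragon.", frozenset({"knight", "dragon"})),
    Story("The Voyage", "A sailor crosses the sea.", frozenset({"sailor", "sea"})),
    Story("The Duel", "A knight fights a rival.", frozenset({"knight", "rival"})),
]

print(request_story(repository).title)                # empty input: any story
print(request_story(repository, ["knight"]).title)    # narrowed by a keyword
print(request_story(repository, ["robot"]))           # no match: None
```

Nothing in this sketch builds a story; it only selects one, which is why the user cannot tell from the output alone whether any generation took place.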

# Original Stories

If one further establishes the constraint that the outcomes must be original – that is, not already available in some form – the task becomes more difficult. Two different challenges need to be met: how to build a story and how to judge if a newly built story is sufficiently different from those already existing to be considered original.

### Building Stories

The task of building a story from constituent elements, in the hope that the result may be original, requires solving two basic problems: what to consider as constituent elements and what type of procedure can be employed to put together those constituents so that the result be a valid story. Several approaches have been considered in the past, but only some are reviewed here as illustrative examples. For more detailed reviews, readers are referred to existing surveys in the field (Kybartas / Bidarra 2016; Alhussain / Azmi 2021). Some approaches consider breaking existing stories into pieces and then recombining the resulting set of pieces from a large set of different stories into new instances of stories.

The simplest approach of this type relies on individual words as the pieces that are obtained from stories and then recombined. Attempts to achieve this based on statistical models have proven successful in the past at producing legible text, original but usually lacking sense (Brown et al. 2015). This procedure is also the basis for the recent work on transformers (Wolf et al. 2020) that has become the rage in the world of AI. It is only when systems built on these models started to be prompted with full sentences describing the desired output that successful results made their appearance. We will address these models in the section "Acceptable Stories", when discussing more elaborate ways of requesting a story.
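As an illustration of word-level recombination, the following sketch builds a very simple statistical model, a table of which words follow which in the source stories, and then samples a new word sequence from it. It is a generic bigram model chosen for brevity, not the specific model used in the work cited above; the function names and the two toy source sentences are assumptions made for the example.

```python
import random
from collections import defaultdict


def build_bigram_model(stories):
    """Record, for every word, the words observed directly after it."""
    model = defaultdict(list)
    for story in stories:
        words = story.split()
        for current, following in zip(words, words[1:]):
            model[current].append(following)
    return model


def generate(model, start, max_words, rng):
    """Recombine observed word pairs into a new sequence, starting from one word."""
    words = [start]
    while len(words) < max_words:
        followers = model.get(words[-1])
        if not followers:          # dead end: the word was never followed by anything
            break
        words.append(rng.choice(followers))
    return " ".join(words)


# Two toy source sentences invented for illustration.
stories = [
    "the knight rode to the castle and fought the dragon",
    "the dragon flew to the village and burned the castle",
]
model = build_bigram_model(stories)
print(generate(model, "the", 12, random.Random()))
```

Every adjacent pair of words in the output occurs somewhere in the sources, so the text is locally legible, but nothing constrains the sequence as a whole, which is why such output tends to be original yet lacking in sense.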

Another approach to generating stories from pieces of prior stories involves using instead of words some abstraction of the meaning of stories and finding ways of recombining these abstractions into conceptual representations of a story that can then be transcribed as text. That is essentially what story generation based on planning does (Young et al. 2013). In this approach, a set of planning operators is constructed as an abstraction of the events that appear in a story. Each *planning operator* represents an event in a story – usually in some form of predicate logic that allows a predicate to represent the action and a set of argument variables to represent the characters or objects that participate in it – but it also encodes additional information of which other predicates are preconditions or post-conditions of that event. This allows a planner to create chains of events that are causally linked to one another, from an initial situation to a goal. For planning-based story generators, both the initial situation and the final goal are usually provided as input.
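The idea of planning operators can be sketched in a few lines. The example below is a deliberately simplified illustration, not the representation used by any cited planner: facts are plain strings rather than predicates with argument variables, the `Operator` class and `plan` function are hypothetical names, and the search is a plain breadth-first search. The initial situation and the goal are supplied as input, as described above.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A story event with preconditions and post-conditions (facts added and removed)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset = frozenset()

    def applicable(self, state):
        return self.preconditions <= state

    def apply(self, state):
        return (state - self.delete_effects) | self.add_effects


def plan(initial, goal, operators):
    """Breadth-first search for a shortest chain of events from initial state to goal."""
    start, goal = frozenset(initial), frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, events = frontier.popleft()
        if goal <= state:
            return events
        for op in operators:
            if op.applicable(state):
                next_state = op.apply(state)
                if next_state not in visited:
                    visited.add(next_state)
                    frontier.append((next_state, events + [op.name]))
    return None  # no chain of events reaches the goal


# Toy operator set, invented for illustration.
OPERATORS = [
    Operator("dragon kidnaps princess", frozenset({"princess free"}),
             frozenset({"princess captive"}), frozenset({"princess free"})),
    Operator("knight finds sword", frozenset(), frozenset({"knight armed"})),
    Operator("knight slays dragon",
             frozenset({"knight armed", "princess captive", "dragon alive"}),
             frozenset({"dragon dead"}), frozenset({"dragon alive"})),
    Operator("knight frees princess", frozenset({"dragon dead", "princess captive"}),
             frozenset({"princess rescued"}), frozenset({"princess captive"})),
]

story = plan({"princess free", "dragon alive"}, {"princess rescued"}, OPERATORS)
print(story)
```

Each event in the resulting chain is enabled by the effects of earlier events, which is the causal linking that the planning paradigm provides.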

This type of approach is faced with the important challenge of having to generate a body of planning operators of sufficient coverage to generate a broad range of stories, which constitutes a significant bottleneck for the approach. Attempts have been made to solve the problem by means of advanced methods of knowledge engineering (O'Neill / Riedl 2014) and crowdsourcing (Guzdial et al. 2015).

Other efforts rely on similar abstractions of the meaning of events in a story as predicates associated with pre-conditions and post-conditions but forego the emphasis on goal-driven causality of the planning paradigm. Such systems design their abstractions for story actions based on different conceptual approaches. The Mexica system (Pérez y Pérez 1999) relies on a set of story actions that associate character emotions and tensions between characters as preconditions or post-conditions. The PropperWryter system (Gervás 2015) relies on existing abstractions of plot-relevant actions (Propp 1928) and dependencies established between them by virtue of being actions associated with narrative roles such as villains or heroes.

This approach also suffers from the lack of a broad vocabulary of story actions to employ. This is known in artificial intelligence as the *knowledge acquisition bottleneck* – a long-standing problem (Cullen / Bryman 1988) that remains current to this day (Pasini 2021). Supported by a knowledge engineering effort to annotate plots of musicals (Gervás et al. 2016), the PropperWryter system achieved success when it was engaged in the production of the first computer-generated musical, which was staged at the London West End for two weeks in 2016 (Colton et al. 2016).

#### Judging Story Originality

However, building complete stories up from elementary blocks is no guarantee that the stories will be original. If the blocks used to build them correspond to elements that appear in already existing stories – and this is usually the motivating requirement when building knowledge resources to use as such building blocks – and the construction procedure is guided by reasonable heuristics, there is a non-zero probability that the building process results in stories very similar to the ones that inspired the construction of the knowledge resource. Furthermore, when story construction efforts of this kind have to be maintained over time, it becomes important to devise means to avoid the repetition of stories built previously by the system.

To address this challenge, research has been carried out on developing means for deciding when a story is similar to another (Fisseni / Löwe 2014; Hervás et al. 2015) or when a story can be considered sufficiently novel (Peinado et al. 2010). The existence of these efforts is very significant, because it opens a new avenue of research on computational storytelling, one where the systems do not only generate stories but also include the ability to produce judgments of some kind over them. This follows a general tendency in the field of computational creativity to progressively evolve from systems that merely generate artifacts to systems that develop their own aesthetic and which are capable of defending why the artifacts they produce are valuable (Colton / Charnley / Pease 2011).
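One very simple way to picture such a judgment is a set-overlap measure over the abstract events of two stories. The sketch below uses Jaccard similarity with an arbitrary threshold; this is an illustrative stand-in and not the similarity or novelty measure proposed in the works cited, and the names `event_similarity` and `is_novel` and the event labels are assumptions.

```python
def event_similarity(story_a, story_b):
    """Jaccard overlap between the sets of events of two stories (0.0 to 1.0)."""
    a, b = set(story_a), set(story_b)
    if not a and not b:
        return 1.0  # two empty stories are treated as identical
    return len(a & b) / len(a | b)


def is_novel(candidate, known_stories, threshold=0.5):
    """A candidate counts as novel if no known story exceeds the similarity threshold."""
    return all(event_similarity(candidate, known) <= threshold
               for known in known_stories)


# Toy data, invented for illustration.
known = [["villainy", "departure", "struggle", "victory", "return"]]
near_copy = ["villainy", "departure", "struggle", "victory", "wedding"]
different = ["lack", "departure", "test", "gift", "return"]

print(is_novel(near_copy, known))  # shares 4 of 6 distinct events with the known story
print(is_novel(different, known))  # shares 2 of 8 distinct events with the known story
```

A generator that keeps its own past outputs in `known_stories` could use the same check to avoid repeating itself over time. A set-based measure ignores the order of events, so a realistic judgment would need a richer representation.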

### Acceptable Stories

The need for systems that construct artifacts to include procedures for assessing the quality of their outputs arises from the fact that such systems essentially explore a search space of possible artifacts, and not all the artifacts in that search space are equally valid. Once a specific type of artifact is defined in a way that allows instances to be constructed computationally, it becomes very easy to produce a very large number of different instances of it. Ensuring that the produced instances are acceptable is usually a little more difficult. Making them all be valuable instances is a significant challenge.

To address this challenge in the field of narrative generation, recent research efforts have focused on defining metrics for story quality. Following traditional engineering practice, these efforts usually identify one specific aspect that is known to impact the perception of quality and attempt to model it, without necessarily considering other aspects that are also relevant. Examples of aspects that have been considered in the development of metrics for story quality are: character believability (Gomes et al. 2013), semantic coherence across the story on relevant events such as birth / death or romantic entanglements (Gervás / Concepción / Méndez 2021), reproduction of features observed in human-written stories (Leon et al. 2020), consistent use of entities across the narrative (Papalampidi / Cao / Kocisky 2022) or probability of each sentence in the story with and without its preceding story context (Sap et al. 2022).

The set of relevant aspects that should be considered is very large, and efforts should be made to integrate these various models so they can be applied together onto the same stories. The quality of story generators would improve significantly once these metrics can be applied as filters to their output.

The integration of a metric on story quality – essentially a story critic module – into a story generator presents another important challenge. Story generation systems are generally constructed on the basic assumption that they take an input and generate an output that is a story. The idea that one should read the output, reflect upon it and then, based on that reflection, rework it in some way is fundamental for the human approach to writing (Flower / Hayes 1981; Sharples 1999). However, generation systems constructed on a similar basis are few and far between. The Mexica system is a remarkable exception, being based on Sharples' cognitive model of the writing task (Sharples 1999).

There is another view on the quality of stories that differs slightly from whether it is simply a good story. This view considers whether the story is acceptable as an instance of a story described in some way by the user that requests it. Many of the features described above as potential input could be used to narrow down the set of stories that are acceptable in response to a specific input: categories of stories, details on the world or the characters, plot structures or potential effects of the story on its audience. If an interface of this type is made available, a story returned by a story generator in response to a specific request would only be acceptable to the extent that it matches the given request.

The recent development of neural-based solutions for text generation such as GPT-2 and GPT-3 (Zhang / Li 2021) makes this type of solution capable of responding to user prompts with large paragraphs of valid and fluid text. These solutions are based on a large language model trained over a neural network representation, and these models essentially capture the relationships between words as featured in examples of text written by humans. The interactive nature of these solutions has made it possible to present requests for stories as simple sentences that describe the desired story. The tests carried out so far on how acceptable the responses of these systems are in terms of their correlation with the given prompts show outstanding results for instances of general conversation, even though testers remain unconvinced of their general validity (Elkins / Chun 2020). More exhaustive tests need to be carried out on the applicability of these solutions in the realm of narrative generation.

#### Stories for a Specific Purpose

This observation leads into the final but no less important challenge that story generation faces: generating stories for specific purposes. The concept of purpose is very difficult to represent formally in terms that a computational system would understand. Yet it is crucial for guiding the composition of any message in any medium that is aimed at communicating to an audience (Smedley 1952).

Story generator systems very rarely consider any representation of purpose among their inputs. In most cases, when a story generator is designed there is some idea in the mind of the designer of what the generated stories should achieve. As a result, story generation modules are sometimes included in systems designed for very clear purposes, such as teaching children about bullying (Aylett et al. 2007), supporting emergency rescue training (Hullett / Mateas 2009), or military training (Zook et al. 2012). In such cases, there is a clear purpose that the generated story has to fulfill, and the generator is designed with that purpose in mind. But the identification of the purpose and the tuning of the generation mechanism to ensure the generated stories achieve it are wholly in the mind of the designer and not explicitly modeled in the system.

The main body of research on storytelling has focused on generating stories whose content is constructed at the same time as the story. This actually sidesteps one of the main purposes of storytelling as used by people, which is to convey a set of events that has actually happened. In this case, the overall description of the task changes significantly. The set of inputs to consider to a potential system that builds such stories must necessarily include a description of the events that the story needs to convey. In most cases, the construction procedure is expected to respect that set of events and not introduce any additional ones that did not occur. This slightly different formulation of the task has been addressed much more rarely but instances exist of systems that generate stories about events that the story generator receives as input: stories about past interactions between a user and an intelligent agent (Behrooz / Swanson / Jhala 2015), stories abstracted from the moves of the pieces in a given chess game (Gervás 2014), stories about a user constructed from data on their routines acquired via sensors (Reddington / Tintarev 2011), stories to explain cybersecurity logs (Afzaliseresht et al. 2020), narratives from personal digital data (Farrow / Dickinson / Aylett 2015) or narrative biographies from knowledge extracted from the web (Kim et al. 2002).

Living as we do in a world obsessed with fake news, we need also to consider a revised version of the task where a story is generated that is based on a given set of facts, but purposefully departs from it at some point when building a story about them. This is another storytelling task that is intuitively familiar to most people: biased historians or politicians do it often to their audiences, parents do it to soften the world for their children, transgressors do it to hide their offences. It is also the key task underlying historical fiction or fictional accounts of recent events. From a computational point of view this task can be understood as a combination of the task of generating a new story with the task of telling a known set of facts. This particular view of the task has been addressed computationally by attempting to match the known set of events to a particular plot structure, allowing mismatching real events to be omitted, and using the plot structure to provide additional fictional events to complete the story (Gervás 2018a, 2018b).

Another possible approach is to tell stories tailored for a specific audience. This is, yet again, an aspect of the storytelling job that is considered inseparable from the task itself by human writers, and yet very rarely addressed in story generation systems. Fortunately, we are beginning to see research aimed in this direction, such as a module to enhance a video game by being able to tell stories tailored to a specific player, based on a model for each individual player (Ramirez / Bulitko 2012).

Whether the goal is to achieve a particular purpose or satisfy a specific user, it is clear that generator systems would do a much better job the better they understand the potential reactions of the audience of their output. Computational models of how this might be achieved in terms of having specific models of the reader reaction included in the generation system have been proposed (Gervás / León 2016). Although this type of integration is yet in the future, efforts already exist to build computational models of the reading task. As in the case of metrics on story quality, specific approaches tend to focus on specific aspects of the task, such as understanding the set of events narrated in a given discourse (Niehaus / Young 2014), modeling the reaction of the reader in terms of an evolving curve of suspense (Doust 2015), interpretation of embedded stories told within a frame story (Gervás 2021), or reconstruction of the actual chronology of a story told in a discourse that allows flashbacks (Gervás 2022).

#### **Rethinking Basic Assumptions**

In view of the analysis presented to this point, it is important to accept that the concept of computational story-telling should be considered not as a single task, but as a set of interconnected tasks, all related to narrative, but corresponding to formulations that are very different in computational terms, as described above. Most of the existing systems for computational storytelling operate at the level of discourse as understood by Ricoeur (1976): a sequence of sentences, where each sentence involves a predicate applied to some entities that need to be identified by the subject (and objects) of the sentence. This view abstracts away from what Ricoeur describes as "the particular structure of the particular linguistic system", which would be closer to the text.

From the point of view of a detailed analysis of computational storytelling, this distinction between text and discourse provides a key tool to understand the broad range of solutions that have been developed over the years. Suppose we consider that a story describes a *story world* – often fictional but not necessarily so. Suppose also that its narrative structure can be represented by some form of *discourse*, that may be rendered as *text* but which may allow rendering in other formats. Under these assumptions, computational storytelling may be described in terms of a number of component sub-tasks that embody transitions between story world, discourse and text. The sub-task of *narrative generation* involves construction *ex novo* of either text, discourse or world. The sub-task of *narrative composition* involves building a discourse that tells some story from a world, or building a text that tells a given discourse. The sub-task of *narrative interpretation* involves reconstructing the conceptual discourse that underlies a given text, or reconstructing the world that underlies a given discourse. These ideas are expanded in the following sections.

### Constructing Narratives: Narrative Generation and Narrative Composition

Let us consider narrative generation as the task of constructing a story that did not exist before. Faced with the distinction between story world, discourse and text, an engineer wishing to design a story generator needs to decide how many of these levels she is willing to represent within her design. All these levels are interconnected, in the sense that a reader faced with a text will interpret a discourse as its meaning, and imagine a story world as its content. The question is at how many of these levels the program will operate.

The system could generate text directly, with no additional representations in terms of discourse or story world. This is indeed the choice favoured by neural approaches to the story generation task.

The system could focus on generating discourse, abstracting away from the complexities of generating language but still building a sequence of discourse – represented conceptually – that has a valid narrative structure. This presents the advantage that the story world does not have to be represented in the system. The approaches described above that rely on abstract representations of the meaning of stories as story actions operate under this choice.

Because people are better readers of text than of predicate logic representations, systems that generate discourse usually include a module that transcribes the discourse onto text. This process is not really generating the story, but rather finding a way of telling as text a story that already exists as discourse. We refer to this task as narrative composition.

The system could include an explicit representation of the story world. Stories can then be built by generating events, characters or locations in the story world directly and then finding a way of telling the story of what happens in that story world. This is the choice favoured by story generators based on simulation of an underlying story world (Ryan 2018). Under this approach, two further tasks are needed: one that transcribes the events in the story world, or possibly a selection of them, onto a discourse, and one that renders the resulting discourse as text. Both are instances of narrative composition.

With respect to the classic pipeline for natural language generation systems (Reiter / Dale 2000), the transition from story world to discourse corresponds to the task of content planning, and the transcription from discourse to text to the stages of sentence planning and surface realization. The content planning task will optionally include a task of selecting which events in the story world are included in the story, which aligns well with the recently formulated task of story sifting (Ryan 2018). Research efforts related to generation of stories based on known facts covered in the section "Stories for a Specific Purpose" correspond to tasks of narrative composition.
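The two composition steps can be made concrete with a minimal sketch. It assumes a story world given as a list of timestamped events, a content-planning step that simply selects the events involving one protagonist, and a template-based realization step; the `Event` class, the function names `plan_content` and `realize`, and the data are all hypothetical and much cruder than the pipeline stages described by Reiter and Dale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One event in the story world."""
    time: int
    actor: str
    action: str
    target: str


def plan_content(world_events, protagonist):
    """Story world -> discourse: keep events involving the protagonist, in time order."""
    selected = [e for e in world_events if protagonist in (e.actor, e.target)]
    return sorted(selected, key=lambda e: e.time)


def realize(discourse):
    """Discourse -> text: render each event with a fixed sentence template."""
    return " ".join(f"{e.actor} {e.action} {e.target}." for e in discourse)


# Toy story world, invented for illustration.
world = [
    Event(3, "Ana", "rescued", "the cat"),
    Event(1, "Ana", "heard", "a cry"),
    Event(2, "Ben", "watered", "the garden"),
]

discourse = plan_content(world, "Ana")
text = realize(discourse)
print(text)  # Ana heard a cry. Ana rescued the cat.
```

Nothing new is invented at either step: the events already exist in the world, and the sketch only chooses which to tell and how to word them, which is what distinguishes narrative composition from narrative generation.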

#### Processing Narratives: Narrative Interpretation

When the situation is presented in this manner, it becomes clear that there is a related set of possible computational tasks corresponding to the transitions from text to discourse and from discourse onto a conceptual representation of the story world. These correspond to subtasks of narrative interpretation. This task of narrative interpretation has been a long-sought goal in the field of artificial intelligence, originally known as natural language understanding (Allen 1995). Historically, a succession of unmet expectations over the years led the natural language processing community to operate under a progressively shrinking scope, focusing more and more on specific sub-tasks instead of attempting anything like processing text onto an exhaustive representation of its meaning. Recent efforts to model identification of narrative structure (Gervás 2022) would correspond to the transition between discourse and story world, including the possibility of a given discourse referring not to one single story world but to a number of interrelated story worlds.

The attempts at developing computational models of the reading task would also constitute instances of the narrative interpretation task. Very little is known about this task, because it has only recently started to be treated computationally. This makes it stand out as an open field for challenges and opportunities for future research.

#### **Conclusions**

The panorama of research efforts described in this paper suggests that storytelling needs to be considered not as a single monolithic task but as a set of interconnected sub-tasks. A simple distinction between text, discourse and story world provides means for describing some of these sub-tasks in terms of processes of narrative generation, narrative composition and narrative interpretation.

In terms of how the existing research on computational narrative relates to human performance on equivalent basic goals related to the construction of stories, it is clear that computational solutions have a long way to go. The analysis presented in the sections "Programs that Deliver Stories" and "Rethinking Basic Assumptions" constitutes a fair review of the particular tasks considered to this point in past research on computational narrative. However, it becomes clear that, regardless of which of the combinations of transitions between the levels have already been explored in specific computational systems, the approach that humans apply to the storytelling task takes advantage of a very broad combination of this set of subtasks. Some writers plan out the plot of their story to the very end and only then sit down to write it. This would align with the idea of building a discourse and then telling it. Other authors construct a whole world and then find stories to tell in that world. Others sit down each day to produce a number of pages of text with no idea of how the plot is going to proceed as a result. Even the same writer may start from a sentence that sounds promising to him (text), work out in his head how the characters might react to the resulting situation (simulation on the story world), then rework it all to ensure that the revealing moment takes place at the optimal point in the scene to maximize its impact on the reader (iteration of discourse revision guided by a metric for impact), possibly guided by a dynamic model of how the reader progressively reacts to the sentences as they appear in the text.

As a result, to the various specific challenges described in the paper one must add the grander challenge of exploring how all of these specific challenges may interact in more complex processes of story writing.

# **Acknowledgments**

This paper has been partially funded by the project CANTOR: Automated Composition of Personal Narratives as an aid for Occupational Therapy based on Reminiscence, Grant. No. PID2019-108927RB-I00 (Spanish Ministry of Science and Innovation) and the ADARVE (Análisis de Datos de Realidad Virtual para Emergencias Radiológicas) Project funded by the Spanish Consejo de Seguridad Nuclear (CSN), Grant Ref. SUBV-20/2021.


# **The Marcel Duchamp Case in, against, or after Artificial Creativity**

*Jan Løhmann Stephensen*

In the introductory chapter entitled "Even an AI could do that" of the book *Artificial Aesthetics: A Critical Guide to AI, Media and Design* (2021–23), which is currently being published chapter-by-chapter on digital culture theorist Lev Manovich's homepage, Emanuele Arielli, Manovich's co-author, notes that while some sorts of art with more 'traditional' or 'classical' characteristics seem quite straightforward for an AI to reproduce, the oeuvre of Marcel Duchamp poses a set of perhaps unresolvable problems. In this paper, I will discuss how this argument on some levels makes good sense, whilst on other levels less so. In extension of this, I want to reflect on what the logic underlying Arielli's argument might tell us about how the 'project' of artificial creativity and artmaking is currently being perceived and pursued.

# **Marcel Duchamp, the proto-post-conceptual contemporary artist?**

If artmaking performed by an autonomously working artificial intelligence has come to stand as 'the final frontier' of artificial creativity research (Colton & Wiggins 2012), the successful reproduction of Marcel Duchamp, given his status as arguably the most important artist of the 20th century, would be the pinnacle of such endeavours. But it would be so for a number of reasons other than the ones Arielli suggests. As I will argue below, when I dive into the actual passage in "Even an AI could do that", Arielli extends his proposition concerning Duchamp to include contemporary art in general, thereby suggesting – quite reasonably, I would argue – that Duchamp should be considered some kind of proto-conceptual artist and hence also a precursor to contemporary art as so-called 'post-conceptual art'. This is for instance in accord with the perspective of Juliane Rebentisch,<sup>1</sup> who notes that contemporary art no longer seeks to conform to genres or traditions, and in this connection credits both the readymade and conceptual art for this 'turn', that has left (or enriched) us with an "unfathomable diversity of artworks" (2013: 111) as well as with a lot of philosophical and art-theoretical issues to deal with. So, the problems Duchamp might pose to AI-art are quite similar to the ones much contemporary art would pose; Duchamp just happens to be a good paradigmatic example.

<sup>1</sup> All quotes from Rebentisch (2013) are my translation.

Those familiar with Duchamp scholarship would know that this kind of proposition seems to indicate that I am taking sides in quite heated debates (cf. Buchloh et al. 1994; Lund / Wamberg 2019); that I am more or less saying that this particular reading – Duchamp as a 'practical philosopher' of (pre- and post-)conceptual art – should be regarded as 'the true meaning' of Duchamp. In my opinion, I am not. I am simply referring to the dominant, mainstream reception history of Duchamp. Which, of course – and here comes an important point of this paper – is not really just a process in which the oeuvre of Duchamp has been received or interpreted in this or that way, right or wrong. It is rather a process of co-production, that is, of making the person Marcel Duchamp and his oeuvre into *that* 'Duchamp', which over the decades – beginning in the post-war decades when he was rediscovered (cf. Crow 1996: 81–87) – has been taking place between artists, critics, academics, philosophers of art, etc., that is, within the institution which sociologist of art Howard Becker (1982) refers to as the 'art world'. So, perhaps it is just as meaningful to think of 'Duchamp' as a product as it is to think of him purely as a producer (or even creator).

In fact, to jump to the conclusion of this paper, it is really the complicated nature of these kinds of inherently social and often also political and / or ideological processes of co-production within the art world (including Academia), which all entail the co-production of art works and their 'meanings', as well as of artists' oeuvres and (the conception) of artists' 'creativity', that is so difficult for an AI to grasp. So, the problem for an AI is not these formalistic aspects concerning what an artificial Duchampian artwork might 'look like', but all the fuzzy social stuff that surrounds it.

# **'Art' and 'creativity' as social categories – and their un-artificial Others**

Secondly, this perspective on the socio-historical character of making Duchamp into 'Duchamp' is much broader than just related to this particular artist, his oeuvre and his 'brand'. It is also applicable to the categories of *art* and *creativity* in general terms. Thus, I would argue, we should never speak of these phenomena as things that just *are,* in essentialist terms. 'Art-ness' or 'creativity' is a status that is conferred upon artefacts, practices or people – or for that sake: technologies – and in this way 'art' or 'creative' / 'creativity' are co-produced through social actions, which, to make matters even more complicated, are embedded in material, technological and ideological contexts. Hence, at least in principle, anything (or anyone) could become 'art', 'creative', or both.<sup>2</sup> Yet, this does not happen; and certainly not randomly. It always happens as a part of a complicated social game within the art world (cf. Becker) or within the so-called 'creativity complex' (cf. Reckwitz 2017), and typically in ways that are entangled with other concerns such as ideology, politics, and economy.<sup>3</sup>

All of this raises a host of discussions, which are impossible to cover in full detail here. Most important is the following: insisting on thinking about both creativity and art as socially becoming phenomena will enable us to problematize the problematic assumption persistently resonating throughout much discourse on artificial creativity and artmaking, namely, that these phenomena *pre*-exist in an *un*-artificial form. In short: the idea that creativity and art exist in 'natural' forms that we can get to know more about and then reproduce, which is the fundamental logic underlying so much artificial creativity research within academia, software-engineering, the art world(s), and the zones in which all these meet.

This logic might of course simply be a by-product of the word 'artificial', since it mostly bends towards the connotation 'emulation', which according to Jensen (2018) means "to reproduce […] in a way that is causally identical to the thing being emulated", rather than towards 'simulation', which means to merely "pretend" or "give the appearance of" something pre-existing, which is what the Turing test measures (which is not without its own set of problems either, but that is another paper). But it probably also has to do with a lack of historical awareness, perhaps even interest. Regardless of why this might be, the point is this: dominant discourses on artificial creativity and / or art often end up reinforcing a historically inherited conception of some pre-existing, natural, or un-artificial human creativity (for instance as a capacity to make art). This point has also been made by Joanna Zylinska in her book *AI Art*. "The frequently posed question 'Can computers be creative?'", she notes, "reveals itself to be rather reductive because it is premised on a pre-technological idea of the human as a self-contained subject of decision and action" (2020: 55). So, these notions of 'artificial art' / 'creativity' are often based in and reproduce, or even reinforce, the assumption of the pre-existence of an autonomous, purely human creativity that is not always already socially, culturally, technologically, and politically / ideologically entangled. Which is quite paradoxical since it contradicts the often-repeated claims that AI will fundamentally challenge the human monopoly on being *the* creative species, the only being making art.

<sup>2</sup> Creativity is often, as anthropologist Tim Ingold (2011: 215) puts it, "read backwards", that is, inferred from the canonized artworks as the product of the artists' (inherent) creative ability/creativity. In other words: the outcomes of social negotiation processes (often containing cultural and material aspects as well) are typically being naturalized as essential, inherent, sometimes even innate, individual characteristics.

<sup>3</sup> Various creativit*ies* (both concepts and practices) have emerged in intimate entanglement with broader socio-politically and ideologically loaded debates and struggles especially since WW2 (cf. Stephensen 2016; 2020; 2022).

So, the problem, I would argue, is *not*, as it is so often claimed, that we have not yet figured out what creativity (or art) *is* (and subsequently: how to reproduce it). It is rather, that we have forgotten *that* we have, in fact, invented it, as well as *how* and *why* (Stephensen 2022). And we have forgotten that it for instance at some point would have been unthinkable – to paraphrase both Andreas Reckwitz and Michel Foucault (2003) – to think of ourselves as *the* creative being; to expect and evaluate your workplace as the natural venue for the actualization of your creative nature and aspirations; or for that sake: to think of creativity as being so pivotal to our socioeconomic structures that it is crucial to automate it in the form of artificial creativity. As Donna Haraway seminally notes, "it matters what ideas we use to think other ideas with", "it matters what concepts we think to think other concepts with" (2016: 12 and 118). Hence, as a minimum, it would be a good starting point to recognize the fact that we are, in fact, thinking with ideas.

# **"Even an AI could do that!"**

So, let us dive into the actual passage by Emanuele Arielli on the problems Duchamp poses to an AI. As already mentioned, Arielli starts by claiming that

it seems particularly straightforward to produce traditional or classical artworks as they tend to display a clear, recognizable style and follow the specific patterns of an artist, school, or tradition. Machine learning systems are ideally suited to analyze numerous occurrences of an object type with slight variations and extract the relevant features and patterns. (2021: 7)

Examples of this from visual art / painting could be (1) 'style transfer' features, (2) attempts to train a GAN-algorithm to make genre-paintings like portraits from a specific period (cf. Obvious' much debated *Portrait of Edmond de Belamy* (2018)), or (3) the different attempts to "add" more works to the oeuvre of specific artists (e.g., *The Next Rembrandt*-project). Within the field of music *Sony Flow Music* has in a similar vein experimented with making new compositions "by" Bach and the Beatles (albeit the latter recorded by other performers). In contrast to these quite doable examples, Arielli continues, it would, however,

be very difficult to reproduce something like a Duchamp-style body of work, since the AI would have to start with the very heterogeneous dataset of this artist's oeuvre, encompassing *Fountain*, *Bottle Rack*, the *Large Glass*, the late *Étant donnés*, and so on. (7–8)

It is worth noting that Arielli even leaves out both the more conventional pictorial works at one end of the morphological spectrum of Duchamp's oeuvre (e.g., *L.H.O.O.Q.* and *Nude Descending a Staircase, No. 2*); and at the other end his 'architectural interventions' (e.g., *Door: 11 Rue Larrey*; and not least: *40 miles of string*). Two extremes, which would make it even more difficult for the AI to deal with (and hence strengthen his argument), given that the material and stylistic dataset would become even more heterogeneous. Whatever the reason for this omission, Arielli continues by noting that

Typically, conservative views on art consider technical mastery as a criterion for "real art", and many people still don't consider something that doesn't require technical ability to be art. However, technical ability means procedural knowledge, and AI are designed to deal with precisely this kind of knowledge. Clearly recognizable styles are well-defined problems that can be reduced to computational tasks, while the generation of variants that don't follow compositional rules (like Duchamp's works) results in ill-defined tasks that have no easy procedural solution. "**My kid could have done that!**", the popular cliché directed at contemporary art, seems now, in an ironic reversal, to turn against the great and stylistically complex – but computationally scalable – art of cultural tradition: **even an AI could do that**. It is the Duchamp that remains outside of AI's creative abilities […] (8, author's emphasis)

And then he adds the promise of future success, which, as Marx (1969/1856) would say, all discourses on AI in our days seem pregnant with: "at least for now". (Arielli: 8)

### **Duchamp in, after and against artificial art**

So, what is it exactly that an AI cannot do? There seem to be four things to consider regarding Duchamp in, against or after artificial art:

# The impossibilities of form

First, there is Arielli's argument concerning the impossibility (at least for now), namely that it is the sheer heterogeneity of the oeuvre on the formalistic level, which would make Duchamp a tough case, since the task of emulating (or simulating) Duchamp's oeuvre would leave the AI with a very diffuse set of data to condense and learn from. Here, you could probably object: in comparison to what? Certainly, Arielli's argument might intuitively seem true if we compare Duchamp with what Arielli calls 'traditional' / 'classical' artworks. But on second thought, is not even this category and dataset already much too broad? A too vaguely defined task? Too 'wicked' a problem? (cf. Buchanan 1992). No matter what, if the task is defined narrowly enough, then reproducing the past, perhaps with a twist of novelty and surprise (cf. the so-called 'standard definition of creativity' (Runco / Jaeger 2012)), poses no big challenge. But despite following all the conventions of how to make art / creativity – novelty, surprise, domain-specific relevance, etc. – you would not really be making art *per se*. You would just be making (pretty) pictures. Either way, Arielli has a valid point. Since Duchamp is, in fact, all over the place, Arielli's argument does have some merit: on the stylistic or morphological level alone, Duchamp would pose a huge challenge and does in that sense fit in badly with the current AI-art / creativity agenda.

# The historical inevitability of Duchamp: by anyone (or anything)

On the other hand, Duchamp is historically vital, perhaps even unavoidable. In fact, it is really hard to imagine that we would even be discussing artificial art-making 'as Art' (or 'as creativity'), had it not been for the "absolute permissiveness", as Thierry de Duve (1996: 291) puts it, which was instigated by Duchamp (and, of course, picked up by other artists, critics, theoreticians, philosophers, etc., who have all co-produced the Duchampian legacy of and within contemporary art), and which has blessed (or marred) essentially all art-making *after* Duchamp.<sup>4</sup>

This permissiveness has a number of faces. First, as emphasised by Arielli, it is related to how post-Duchampian art can look and what materials it can be made of, and so on. In fact, as many have pointed out, it really does not look like anything specific, sometimes perhaps even nothing at all. Indeed, if post-conceptual contemporary art after Duchamp has a certain 'look', it might actually be "the look of thought" as Donald Kuspit (1975) famously phrased it. Art after Duchamp increasingly – but not univocally, of course – becomes a "quest for a nonretinal art" (or a 'cerebral art') through a "strategy of 'perceptual withdrawal'", as Benjamin Buchloh described it (1990: 116). Following this, art after Duchamp no longer seems obligated to conform to more or less rigid genres, traditions, or even specific media either, which according to Rebentisch means that "a given object's status-as-art can no longer be deferred by specific qualities of the object". (104)

<sup>4</sup> A brief comment on this "after" (which also figures in the title of de Duve's book *Kant after Duchamp*): this morphological, and general, permissiveness is not something Duchamp *invents* (or his followers for that matter). His oeuvre, especially the readymades and all their analytical and philosophical paratexts, "only reveals it" (290–1), de Duve notes, as a "fact" about how art is made; or how things are made into "Art" (cf. below on Greenberg's hesitant appraisal of Duchamp's "theoretical services" as well).

This has to do with the fact that these inherited categories have been deliberately undermined throughout the 20th century, for instance as part of the so-called 'centrifugal gestures' of the (anti)-artistic avant-gardes (Reckwitz 2017: 57–84), which sought to dissolve the boundaries of art practices and "free" creativity from the restraints of Art, thereby echoing Peter Bürger's seminal point that these gestures were art-internal political attempts to "sublate" the so-called institution of art (1984: 49). The historical effect of this, Bürger elaborates, is that:

The availability of and mastery over artistic techniques of past epochs […] owed to the avant-garde movements make it virtually impossible to determine a historical level of artistic procedures. Through the avant-garde movements, the historical succession of techniques and styles has been transformed into a simultaneity of the radically disparate. The consequence is that no movement in the arts today can legitimately claim to be historically more advanced as art than any other. (63)

Any material, from any period can – at least in principle – in the contemporary condition become part of an artwork. That being said, and in contrast to Bürger's last sentence, it is interesting to observe how most within the fields of AI-art / creativity obviously did not get the memo on the irrelevance of being historically most advanced. Even though from the observer's point of view it has become increasingly difficult, if not impossible, to determine the contemporaneity-status of a given artwork, there is no shortage of attempts to highlight the futuristic up-to-date-ness among the producers of AI-based art themselves, who often quite blatantly insist on the inherent novelty-value of their own endeavours. Contemporary art in general has become increasingly preoccupied with the assimilation of new technologies and modes of production into the artistic practice (Lund 2022: 146–7). But the tendency to fetishize new technologies certainly is most notable within AI-related artmaking and experimentation; sometimes in the form of critical artistic experiments, other times less critically inclined (for instance the bulk of AI art which currently seems to oscillate between sheer technological fascination and the historically blind, or at least: disinterested, ambition to do research into the (supposed) "true nature of creativity").

In this sense, the very concept of a non-human-made art on a computer hinges on this premise, which is very much tied up with 'Duchamp' (that is, what this artist has come to signify): that the material categories of artworks have become so permissive, perhaps even promiscuous. But it also hinges on the point about Duchamp made by the art historian John Roberts, who reads Duchamp as a theorist of artistic labour. Drawing on Harry Braverman, Roberts thus claims that Duchamp with the readymade instigates "the redefinition of skill" – that is: the reskilling of the artist – "with a socially expanded understanding of the circuits of authorship" (2007: 5), hereby also in the long run introducing the possibility of non-human creators as producers (or co-producers). By re-posing "what artistic skills might look like" (81), the demarcations concerning who (or what) might be considered an artistic producer are forever re-defined beyond recognition. This reskilling thus entails 1) the reconceptualization of the artist as a (partial) philosopher / theoretician of art (as I will elaborate below); 2) (fairly new) compositional principles (i.e., juxtapositional forms like montage); and 3) expanded and more explicit collaborations with non-artists in the production of art.

In sum, all these components are the necessary philosophical, theoretical, and practical preconditions for the project of AI-art to even begin to make sense.<sup>5</sup> But it is actually also the issue of *making sense of* that is pivotal to this discussion, since it is the understanding of the social, cultural and historical contexts of how things become art or creative / creativity that poses the most pertinent problem for the AI to deal with.

# Art-as-idea vs. no idea about art

Following the above, one could probably argue that if it was only about how things look (materiality, style, form, etc.), the post-Duchampian turn makes AI-art really easy. Just do something! Anything. Anywhere. Whatever. Whomever. But it is, of course, not that simple. There is more to it. Returning to Arielli, I argue that he would be mistaken if he were to interpret all this permissiveness and all of these openings as other than merely morphological "symptoms" of what is at stake (in contemporary art) after Duchamp, which he sadly seems to do, at least in the quoted passage. Because it is important to emphasize that these trajectories emanating from Duchamp (and all those artists, critics, theoreticians, philosophers, etc., who subsequently were part of making Duchamp into '*that* Duchamp') are not preoccupied with the emancipation of the artistic means in and of themselves. Instead, it all pivots around one especially crucial question: "What is art (or creativity)?"; or perhaps in a more constructivist phrasing: "How does stuff become 'art' (or 'creative')?" What I am hinting at here is the fact which conceptual artist Joseph Kosuth, with specific reference to Duchamp, insisted on, namely that what becomes important within art is the *idea* of art itself, not its concrete realization:

<sup>5</sup> It is worth noting that these 'turns' within the art world actually run parallel to many conceptual shifts within the fields of creativity (both on the level of practices and within creativity research/studies). Yet, as I will elaborate below, especially the field(s) of AI-Art/Creativity actually come across as quite belated on many levels, since both the *producer*-oriented perspective on creativity/art-making (rather than a structural perspective) and the *product*-oriented perspective (in contrast to the so-called 'processual turn' within art as well as creativity (studies)) are maintained in parallax forms (for instance in the search for the 'stand-alone AI-Artist' making 'Art Works').

what holds true for Duchamp's work applies as well to most of the art after him. In other words, the value of Cubism is its idea in the realm of art, not the physical or visual qualities seen in a specific painting, or the particularization of certain colors or shapes. For these colors or shapes are the art's 'language', not its meaning conceptually as art. (1991/1969: 19)

Hence, Kosuth claims – perhaps too exuberantly – "all art (after Duchamp) is conceptual (in nature) because art only exists conceptually". (18) Echoing this, Rebentisch notes that "art after Duchamp is […] an 'art after philosophy', which now on its own terms struggles with the philosophical issue of the nature of art" (136). Art no longer leaves it up to philosophy to settle the score. Instead, such philosophical reflections have become integral to the artistic practices, processes and products (if any) themselves.

The material, stylistic and morphological consequences of this are also summed up by Lucy Lippard with her notion of the 'dematerialization of art'. A kind of Conceptual art ("with a capital C") in which "the idea is paramount and the material form is secondary, lightweight, ephemeral, cheap, unpretentious and / or 'dematerialized'". (1973: vii)<sup>6</sup> This explains why art after Duchamp does not really have a specific style and can look like anything – or nothing – specific at all. It can be any material, "anywhere and at any time by anybody" (5), as even the formalist art critic Clement Greenberg, who strongly disliked Duchamp but still felt obliged to acknowledge his "theoretical services", noted (cf. de Duve: 286–8). Because these kinds of questions – "What is art (or creativity)?", "How does something become art (or creativity)?", etc. – can be asked in any materiality, or for that matter in none at all; or as we today, post-Actor Network Theory / New Materialism, would be prone to say: by any *actant* and in (or through) any *matter* (cf. Latour 2005; Fox 2015).

All this is in stark contrast to what most AI-assisted art-making focusses on, including that which Arielli makes reference to, namely *end products* (styles, oeuvres, periods, etc.) and similar computationally easy / easier aspects. Whereas the fuzzy and quite dizzying socio-material processes that go into making something into 'art' (or deserving the label 'creative'), which so often seem to be at the back of the mind of much contemporary art as well as those theories that take an interest in this (like Institutional Theory), rarely seem to be of much interest. Paradoxically, given the topic of this paper, what is lost in this incessant focus on style, genre, morphology, etc. is what seems to be *the core* of Duchamp's practice (and not least the trajectories that both follow and (co-)produce him): namely the focus on the question of the status of art itself (and for that matter his take on creativity as well (cf. Duchamp 1973)). A kind of pondering, asking, and suggesting that becomes crucial to so much art after Duchamp, and not only to philosophers or sociologists of art, but to artistic practice itself, as a constitutive part of the artistic practice, often *on the inside* of the works themselves.

<sup>6</sup> The "and/or" emphasizes the point: 'dematerialization' is *not* to say that all art becomes immaterial, that its materiality melts into air. The specificity of materials simply loses its pivotal position as a "gateway" into art-ness, since it is the idea – especially the reflection upon the idea-ness of art itself – that is important, not adherence to (or negations of) inherited genres and traditions.

# **Counter Critiques**

So, one might object, is this perspective completely absent in all AI-art / creativity-making? No. In all fairness, this critique of the lack of sociological awareness towards art and creativity has also occasionally, but rarely, been voiced within the AI-art and creativity community itself. Artist-researcher Oliver Bown has for instance critically observed that AI-art producing engineers, who are mostly preoccupied with designing the functional application of these technologies, quite often make radical simplifications of the 'wicked' complex social embeddedness of creative processes and artmaking. Drawing on the sociology of art, especially that of Howard Becker and Pierre Bourdieu, as well as theoretician of design Richard Buchanan (1992), Bown thus emphasizes that "[if] all of this culturally grounded activity is part of being an artist, then the full hand-over of being an artist to a machine becomes unimaginably complex and beyond the scope of anyone's capability". (2021: 4) In that sense, Bown's point on many levels echoes those of this article, namely that the difficulty or perhaps even impossibility of 'reproducing' Duchamp – or for that matter: those substantial parts of contemporary art that are 'after' him – does not stem from formalistic / stylistic complexities, but from social and cultural ones.<sup>7</sup>

Another objection that might be raised is whether all these questions concerning the status of art are pivotal to *all* contemporary art. Of course not. But for those who frequent places like Documenta in Kassel, the various Biennales, or even just local art-scenes, they do pop up quite frequently, and they are always part of the backdrop. Even all the so-called traditional or classical media formats, which Arielli refers to, seem to be raising questions such as: "What does it mean that there's a painting in this context?", "Why this media?", and so on. Because in art after Duchamp, media are never just media (cf. Apprich et al. 2013). The choice (or inclusion) of a specific medium (or media piece) as an artistic means (or in the curatorial process) in an artwork (or exhibition) is always a reflexive choice. As an artist or curator, you are no longer simply just born into an art-medium specific tradition. And regardless of whether this pertains to *all* art (after Duchamp) or not, it certainly holds true of 'Duchamp' / Duchamp.

<sup>7</sup> Unfortunately, however, Bown also inherits the mostly anthropocentric perspective of the traditional sociology of art (Becker and Bourdieu) in which the human and non-human are sharply divided (cf. Zylinska above) and materiality mostly is either something we fight over or simply props.

### **The case of Marcel Duchamp in the age of accidental reproduction and untimely monsters**

There is, however, some irony to it all. Because in a roundabout way artificial artmaking and creativity nonetheless occasionally do raise a set of seemingly quite Duchampian / (post)-conceptual questions: "Is this Art?", "What is Art?", "Could an automated artificial creativity replace the artist?", "What is the future of the artist?", etc. Questions like these are whirling all over the place, especially with the recent surge of so-called 'Generative AI' such as large language model-based AI like Bard, ChatGPT, GPT-4 and word-to-image generators like Midjourney, Stable Diffusion and Dall-E 2. Yet, the crucial difference is that these issues and reflections are rarely raised from the inside of the artistic practices (or the artworks), but on the outside, mostly in the epitextual part of the paratext (cf. Genette 1997). It is for instance part of the joint hype and marketing campaigns by artists and auctioning houses, like in the case of the much-discussed GAN-generated work *Portrait of Edmond de Belamy* by Obvious from 2018 (cf. Christie's 2018; Solly 2018; Bogost 2019; Stephensen 2019). Similar ponderings are also pivotal to the popular press' coverage of AI-based experimentations and innovations, which mostly focuses on the novelty-oriented, sensational aspects as well as the market-side of things. "How will this disrupt the art markets?", they ask with breaking news-titles like "Who needs artists? Rise in works made by artificial intelligence raises real questions for the art market" (Shaw 2018), and "What will the future of the Cultural and Creative Industries be like?" (Benedikter 2021). Finally, these exultant prospects of a future with all-encompassing AI also seem to be part of the venture capital-attracting business models of companies like OpenAI (e.g., GPT-4, ChatGPT and Dall-E 2), who according to their CEO Sam Altman are "here to make AGI, not image generators" (Heaven 2023: 45).

Yet, despite the futuristic dimensions of the popular media coverage – regularly even evoking Kurzweilian 'singularity'-prospects of surpassing our supposedly innate human ability to be creative (1999; 2005) – it is worth noting how these discussions typically are raised in a terminology that evokes the most conservative, heroic categories of 'the Work of Art' (in terms of style, genre, media, finitude, etc.) and 'the Artist as Creator' (including his (sic!) individual ownership of the works), anxiously discussing whether these might be in danger of withering away at the hands of AI. Whereas more complicated questions concerning the very contingency of these allegedly endangered categories are rarely integrated into AI-art itself. It is as if Duchamp, conceptual art, and so on had never existed other than as a reservoir (or dataset) of looks and styles, to which AI-making only contributes with minor variations – often in the hands of the new heroic figure, the 'Prompt Engineering Artist'. Hence, in most writings about AI-art and artificial creativity, the thorny issues of how the supposedly simulated phenomena 'art' and 'creativity' have come about, how they have changed, under what circumstances (politics, ideology, power, economy, etc.), and what this all means, are rarely, if ever, mentioned.

This leads us to another way in which the Duchampian gesture accidentally re-emerges in these fields. Many peripheral commentators, but also a lot of the practitioners within the fields of AI art / creativity, seem to assume that these products have the status of artworks, thereby overlooking the fundamental theoretical (and artistic) 'services' of the Duchampian legacy. Because the AI does not make 'Art' or even artworks *per se*; it simply makes texts, sounds, still and moving pictures, etc. Or, if we were to translate it into the terminology of the Duchampian tradition, we could perhaps label these *art-like readymades* – that is, automated stuff that could be inserted into and become part of artistic projects.

In this sense, especially the art projects that circulate in the popular press, at exhibitions in galleries and art halls, and on various social media platforms in online groups and communities under labels such as 'AI-generated art' or with reference to 'artificial creativity' simply come across as weird, untimely monsters, being sold to us as the 'future of art', with one foot in an imagined future of a much-hyped technology of artificial / automated creativity; and with the other foot in a quite anachronistic, perhaps even outdated, aesthetic past. So, Arielli is in some ways correct that *even an AI cannot do Duchamp*. But it is not really the heterogeneity of formats in Duchamp's oeuvre that is problematic. It is all the thinking about social and historical contexts and concepts that is problematic. All this complicated, wicked, sticky, social stuff that makes things 'art' and 'creative'.

# **Bibliography**


Buchanan, Richard (1992): "Wicked problems in design thinking". *Design Issues* 8(2): 5–21.

Buchloh, Benjamin (1990): "Conceptual Art 1962–1969: From the Aesthetic of Administration to the Critique of Institutions". *October* 55: 105–143.

Buchloh, Benjamin / Rosalind Krauss / Alexander Alberro / Thierry de Duve / Martha Buskirk / Yve-Alain Bois (1994): "Conceptual Art and the Reception of Duchamp". *October* 70 (Special issue on 'The Duchamp Effect'): 126–146.

Bürger, Peter (1984): *Theory of the Avant-garde*. Minneapolis: University of Minnesota Press.

Christie's (2018): "Is artificial intelligence set to become art's next medium?", 12 Dec. https://www.christies.com/features/A-collaboration-between-two-artists-one-human-one-a-machine-9332-1.aspx (21 January, 2024).

Crow, Thomas (1996): *The Rise of the Sixties*. London: Laurence King Publishing.

de Duve, Thierry (1996): *Kant after Duchamp*. Cambridge, Mass.: MIT Press.


Heaven, Will Douglas (2023): "The Year Creativity Exploded". *MIT Technology Review* 126(1): 40–47.

Ingold, Tim (2011): *Being Alive*. New York: Routledge.

Jensen, Gavin (2018): "AI: Simulation vs. Emulation". http://www.gavinjensen.com/blog/2018/ai-simulation-emulation (21 January, 2024).

Kosuth, Joseph (1991/1969): "Art after Philosophy". *Art after Philosophy and After: Collected Writings, 1966–1990*. Cambridge, Mass: MIT Press. 13–32.

Kurzweil, Ray (1999): *The Age of Spiritual Machines.* New York: Viking.

Kurzweil, Ray (2005): *The Singularity is Near*. New York: Penguin Group.

Kuspit, Donald (1975): "Sol LeWitt: The Look of Thought". *Art in America* 63: 42–49.

Latour, Bruno (2005): *Reassembling the Social*. Oxford: Oxford University Press.

Lippard, Lucy (1997/1973): *Six Years: The Dematerialization of the Art Object from 1966 to 1972*. Berkeley: University of California Press.

Lund, Jacob (2022): *The Changing Constitution of the Present*. London: Sternberg Press.

Lund, Jacob / Jacob Wamberg (eds.) (2022): Special issue on Duchamp's Readymades, *The Nordic Journal of Aesthetics* 28: 57–58.


# **Creativity and Function**

*Robin Markus Auer*

#### **The Advent of Creative AI**

While Artificial Intelligence and its various applications have been the focus of research and public debate for a while now, Artificial Creativity (the use of AI for creative and artistic ends) has only recently moved into the spotlight. Suddenly, however, it seems ubiquitous. Within mere months, several new iterations of text-to-image generators, as well as particularly the release of ChatGPT and GPT-4, have raised public interest and awareness of the topic – already at an all-time high – to a whole new level and shifted attention to the narrower field of text and image generators. Newspapers overflow with daily articles telling the public how to use ChatGPT to increase creative output, or explaining why its arrival marks an important, even worrying threshold in language-focussed creative AI.

All of these assessments of ChatGPT and similar AIs are based on a loosely understood everyday notion of creativity, or on the ideas of those who identify as creatives, complicating issues, as these very people are in some way or another heavily invested in the importance and uniqueness of the creative process. In critically assessing the creative capabilities of ChatGPT and related AI applications, we are still lacking a clear definition of creativity, especially one that works across disciplines. This paper is meant to alleviate this problem by providing a cross-disciplinary working definition of creativity in the context of current AI research and development, and to shine a light on what the exact purpose of such a definition might be in the first place.

#### **A Multitude of Questions**

So, what is creativity? Given the ubiquity of creativity in modern western societies, it is easy to forget that its boom is a relatively recent phenomenon and that from the very beginning, any definition of creativity has struggled, not least due to the fact that individuals and disciplines have approached the phenomenon with a range of pre-existing expectations and convictions.

Furthermore, in contrast to some related or adjacent terms like consciousness and intelligence, creativity does not come with the kind of philosophical tradition that would give us a panoply of competing, contextualised and critically vetted definitions to choose from. Instead, it has made an astounding journey from religious creation myths to a core skill within a globalised capitalist workforce within a mere 200 years, and from obscure mark of artistic genius to subject of scientific enquiry in a fraction of that. Over that period of time, the question 'What is creativity?' has expressed a number of related and interwoven, yet distinct questions, so that there are competing questions when it comes to a basis for defining creativity. Historically (that is, diachronically), these questions have touched upon the relationship between human and divine creation; between mind, soul, or spirit, and a creative touch of genius; or a transcendental quality of the creator. The competition here is about which general domain creativity even belongs to. It is a tradition that is still alive in dogmatic assertions that creativity cannot be fully explained.

From a contemporary perspective, the questions more usually concern issues such as: How do we attribute creativity? How do we produce ideas and objects that are deemed creative? What is the role of creativity in society and discourse? Why do we want to be creative? And, in discussing artificial creativity, why on earth would we want machines to be creative, as well? This clear shift in focus is indicative not only of the secularisation of especially Western society, but even more of capitalist free-market economies and the resulting shifts in societal norms.<sup>1</sup> While the historical genesis of AI imaginaries and notions of creativity is without doubt fascinating, this paper will focus on a synchronic perspective. In order to do this, we must first sample definitions and approaches currently on offer and evaluate how far they succeed in answering (all) the questions raised above.

# **Functional Definitions**

Before it makes sense to have a look at different attempts to define creativity, however, it is crucial not only to consider criteria for a good definition, but also to remind ourselves what the ultimate end of the act of defining is. Given the success of data-driven approaches in pattern-recognition and prediction, it might even seem that the task of defining has become somewhat redundant and outdated. Indeed, a shift from domain-specific, concept- and definition-based theory towards a kind of 'post-theory' relying on correlations and machine-generated classification has been noted by some (Hansen 2022). Whatever works, goes. Consequently, there seems to be a confusion and conflation of research and development, which have for a long time been merged in businesses. While development may be justified by success only, research is fundamentally theory- and knowledge-directed, and the task of defining is a crucial step in the theorising of any phenomenon, a cognitive exercise directed at understanding rather than merely replicating or predicting. A definition comes with certain epistemic implications and advantages that 'merely' solving a problem cannot.

<sup>1</sup> In fact, it seems that this connection between shifting notions of creativity and the economic climate in which they have taken place explains the 'whiff of reactionism or conservatism' attached to discussions around authorship.

Its etymology and practical everyday usage suggest that a definition is "[a] precise statement of the essential nature of a thing; a statement or form of words by which anything is defined" (*OED Online "definition, n".*) and thus an authoritative paradigm. As this very passage shows, definitions are used as a starting point, due to this authoritative, agreed-upon, and seemingly fixed nature. Yet, on the other hand, there are further definitions of what a definition is that speak a different truth. Definition, we learn, is also "[t]he setting of bounds or limits; limitation, restriction" (ibid.), as well as "[t]he action of determining a controversy or question at issue; determination, decision" (ibid.). Furthermore, in the field of logic, definition refers to "[t]he action of defining, or stating exactly what a thing is, or what a word means" (ibid.). Structurally, a definition brings together the explanandum (that which is to be explained) with the explanans (that which explains it).

In his *Tractatus Logico-Philosophicus*, Wittgenstein wrote that "[t]he object of philosophy is the logical clarification of thoughts. Philosophy is not a theory but an activity" (1922: 4.112). The very same holds true for definition as one of philosophy's main endeavours. The takeaway is not only that definition is a process rather than a definitive and fixed statement, but also that it has itself undergone several shifts in meaning, thus inherently exemplifying this dynamic nature of definitions. This tension between the assumedly fixed nature of definitions in most everyday situations and the dynamic process of definition as part of academic inquiry has to be taken into account when developing criteria for a 'good', i.e. successful, definition.

### **Coming Up with Criteria**

As naïve, essentialist notions of what creativity is lead nowhere, there must be criteria for the definition other than capturing what creativity 'really is'. Instead, we have to judge the definition by the *options it affords* us, and the *practical advantages* it provides over other definitions, prioritising *clarity over certainty*. In fact, we can reframe definitions as epistemic affordances in that they offer us ways of knowing and learning about the world. If "[a]ffordances are functional meanings" (Windsor 2004: 180), it makes sense to understand definitions in the very same way. Furthermore, a definition ought to be precise in the sense of offering *maximum effective distinction*. In other words, a definition should be focused on information in the sense of Bateson, meaning "a difference which makes a difference" (Bateson 1987: 460).

Additionally, the definition should ideally be able to accommodate all, but definitely as many cases as possible, while at the same time being as narrow as possible. It ought to tread the fine line between too broad (which would limit its usefulness and complicate the study of the phenomena involved), and too narrow (which would exclude too many perspectives and in doing so reduce its broad, cross-disciplinary appeal). It should especially not exclude on principle certain academic disciplines or indeed common-sense usage. This give and take between broadness and narrowness can be expressed in more technical terms as an approximation towards an equilibrium between maximum intensionality (defining in terms of properties, characteristics, and membership of higher-order groups) and maximum extensionality (the sum of lower-order objects / concepts that belong to this group; examples of the explanandum). As these are generally negatively correlated, the task of finding this equilibrium is difficult and relies on the elimination of contradictions rather than the complete overlap of the two. What this means is that the different methodologies and subject matters of disciplines should not pose a fundamental problem, as they result in non-alignments rather than contradictions.

Another fundamental tension in a definition is between what Locke calls nominal and real definitions. A nominal definition specifies cases of correct usage of a term by providing linguistic-contextual criteria or examples. A real definition, by contrast, gives criteria for correctly applying the term to a referent by identifying characteristics and properties of that referent. Put briefly and somewhat simplified, a nominal definition explains the word, while a real definition explains the denoted referent of the word. While the distinction may be clear and obvious for physical objects, it is less so for culturally complex phenomena such as creativity, because there is no clearly identifiable referent. In fact, one of the most contested points is whether in these cases the nominal definition is a subset of the real definition, or vice versa.

One crucial caveat in assessing the extensional definition in a field such as AI is that we need to distinguish clearly between metaphorical and literal uses. New cases tend to sneak into extensional definitions through a metaphorical back-door, due to our tendency to explain the new in terms of the familiar. Therefore, talking about machines 'thinking' does not imply that the activation of their circuitry constitutes an example of thought, in the same way that talking about what a machine 'knows' does not mean that there is knowledge, but rather that data storage is employed to achieve availability of task-relevant information broadly reminiscent of the ways in which humans draw on knowledge in similar situations. As a result of this abundance of mentalistic metaphors in the field, AI discourse is rife with such examples of mis-extended extensional definitions that render the task of providing intensional definitions all but impossible. There is, of course, a wider (posthumanist) argument behind the questioning of traditional extensional definitions and the resulting intensional definitions that is valuable and valid. A principled counter-argument that embraces the underlying posthumanist agenda can, however, easily be constructed by pointing to the fact that extensional definitions are necessarily constructed and therefore neither wrong nor right, but rather useful or problematic according to their consequences and applications.

Given the hugely different approaches and perspectives across disciplines, the definition should lend itself to functioning as an explanans in those cases where this is required, but it should also be able to work as an explanandum for these same cases. In other words, definitions should not be designed as the starting or end point of an explanatory chain or system, but work in relation to the other elements within the same explanatory system. Just as with signifier and signified in signs, the explanans of one definition is the explanandum of another, and every answer necessarily opens up new questions. In that sense, definitions are relational in a strong, Saussurean sense, being marked by the differences to other elements within the same system and relying on Derrida's 'différance' as the refutation of the "notion of there being a fully present and self-present term that would be a terminus of any chain of signification" (Baugh 1997: 128).

While each discipline will at any point treat it as either explanans or explanandum, but not both, it is important to guarantee the translatability, that is to make sure that the explanans can be substituted for the explanandum and vice versa with minimal loss of coherence or information. It makes sense for a truly interdisciplinary definition to render the phenomenon necessarily a subset of the concept, and the real definition part of the nominal definition.

Finally, and this may be an overly obvious point, a definition of any kind needs to assume that something is explainable in principle. This is especially true in a research context and the aim of research is to provide those explanations. So, before returning to the issue of creativity, this leaves us with the following criteria:

A successful definition


# **A Multitude of Answers**

The difficulties in defining creativity are widely acknowledged. In their paper on 'AI-aesthetics and the Anthropocentric Myth of Creativity', Arielli and Manovich state that "when we try to give a working and operational definition of these notions [of creativity], we see how elusive they are" (4–5), and Margaret Boden, one of the foremost authorities on creativity, agrees that "[c]reativity is mysterious […] the very concept is seemingly paradoxical" (1996: 75). Nonetheless, they and others have repeatedly tried to capture creativity.

# **A Cognitive Perspective**

Margaret Boden takes as a starting point the so-called 'standard definition', which Runco and Jaeger formulate in a very concise way as: "Creativity requires both originality and effectiveness" (2012: 92). This definition works in terms of necessary and sufficient criteria. If something is not original, it cannot be creative. Similarly, if something is not effective, it cannot be creative. Only when something is both original and effective at the same time can we say that it is creative. Now, effectiveness in this case should not be misinterpreted as a kind of problem-solving effectiveness only, but more broadly as effective at achieving some kind of end, be it practical, or sensual-aesthetic in nature. Surprise is sometimes used as a stand-in for originality, and a distinction can be made between cases of varying degrees of originality. Boden writes that "Many creative ideas, however, are surprising in a deeper way. They concern novel ideas that not only did not happen before, but that […] could not have happened before" (1996: 76).

Consequently, "[w]e can now distinguish first-time novelty from radical originality. A merely novel idea is one that can be described and / or produced by the same set of generative rules as are other, familiar, ideas. A genuinely original or radically creative idea is one that cannot. It follows that the ascription of creativity always involves tacit or explicit reference to some specific generative system" (ibid. 78).

This is interesting and illuminating in several ways. For one thing, it seems to acknowledge that creativity is a matter of degrees; there are stronger and weaker cases, and it is possible that it would make sense to define creativity along prototypical cases. Furthermore, creativity, or rather the ascription thereof, is relative to context, more precisely to specific generative systems. This also implies that creativity requires rules to be broken. You can only be creative in a domain if there are conventions or constraints to violate in the first place.

# **Constructivist Approaches**

Manovich, on the other hand, points out that "[t]he association of the arts and creativity that we take for granted today, and the privileging of creativity over other considerations, are relatively recent inventions" (Manovich 2022: 65), drawing attention to a kind of bias that leads us to perceive the primacy of the connection between creativity and art as something given, even natural, rather than constructed.

Instead, he draws our attention to the fact that attribution of creativity is determined by an observer, or a group of observers, and depends on their knowledge of the 'creative' process. He writes: "'being creative' is a label that an observer ascribes to phenomena whose underlying processes he is unaware of" (Arielli & Manovich 2022: 5). But how exactly do we attribute creativity, especially when dealing with machines? In his book *The Creativity Code*, Marcus Du Sautoy proposes what he calls the 'Lovelace Test' for creative AI according to which

an algorithm has to produce something that is truly creative. The process has to be repeatable (not the result of a hardware error) and the programmer has to be unable to explain how the algorithm produced its output. We are challenging the machines to come up with things that are new, surprising, and of value. For a machine to be deemed truly creative, its contribution has to be more than an expression of the creativity of its coder or the person who built its data set. (2019: 6)

The final sentence is interesting in that it gives more detailed conditions for what it means for the output to be beyond explanation. This, however, is a matter of nuance. Arguably, for most of the applications around, it is reasonable to assume that an understanding of the underlying algorithms as well as the datasets on which they were trained, in combination with knowledge about a prompt that went into the creation of a particular text or image, would be sufficient to explain the output in principle, but not in detail. The issue of attributing creativity is also hugely affected by a choice between two competing perspectives: one that highlights indistinguishability, as in Du Sautoy's Lovelace Test or Turing's 'Imitation Game' (1950: 433), and the functional approach, according to which it is sufficient for a machine to achieve something that would require creativity if done by a human (analogous to Marvin Minsky's pragmatic view on AI).

# **The Creativity Dispositif and Creativity as an Economic Resource**

A somewhat different point is made by Andreas Reckwitz, who mentions two basic meanings of creativity. The distinction he makes can be summarised as: there is "the potential and the act of producing something dynamically new" (Reckwitz 2017: 2), but there is also the "topos of creativity" (ibid.), the idea of creativity as a culturally significant concept, which informs his work on the creativity dispositif. The distinction is a crucial one and probably the single most persistent problem in finding ways of addressing creativity across disciplines. These problems in defining creativity also have real-world consequences, as in cases of copyright law, which is not yet properly equipped to deal with these newly emerging technologies, even though new legislation, such as the EU's AI Act (2022), is imminent. Cases brought to courts in the US and elsewhere show the lines along which copyright is granted or denied according to attributed creativity.

In 1978, the final report of the National Commission on New Technological Uses of Copyrighted Works attested that "the eligibility of any work for protection by copyright depends not upon the device or devices used in its creation, but rather upon the presence of at least minimal human creative effort at the time the work is produced" (111). Two things are noteworthy here. First, the categorical commitment to "human creative effort", which is understandable from a practical standpoint in legal advice, but problematic in other contexts. And second, the stress on this effort happening "at the time the work is produced", which seems to imply that programming does not constitute a basis for claiming copyright on any output generated by an algorithm later. So while agreement on some aspects of creativity emerges, many theoretical and practical problems remain unresolved, and looking back at our initial questions, we find several of them only partially answered. We tend to have a reasonable grasp on the role of attributing creativity, while a range of disciplines work to provide more detailed accounts of how creative behaviour comes about and is realised cognitively, as well as socially. Through his examination of the creativity dispositif, Reckwitz in particular gives us a clear idea of the role creativity plays in society and on an individual level. In short, synthesising a definition is a matter of prioritising some aspects and approaches over others while making sure that all are respected in non-reductive ways.

# **Creativities?**

One distinction that I would like to draw some more attention to is that between 'potential or act' and 'topos' as Reckwitz called them, which can also be understood as phenomenon and concept, the implication being that the main distinction is in the degree to which they are implicitly socially constructed, as well as in the way they tend to relate to certain disciplines and methodologies of inquiry. We could also call this a distinction between ontological and epistemological creativity, ontological creativity being a matter of what 'constitutes' creativity, or how we come to be creative, while epistemological creativity would be concerned with how we know about creativity, or how we come to think of something as creative.

They also, crucially, have opposing explanatory functions. This is evident in the ways in which creativity, like intelligence and other related terms, often serves in discourse to demarcate a dividing line between the human and non-human where this would otherwise be difficult to justify. Put bluntly, we often call something creative when we have no better explanation for how it was created by a human in ways that we think non-human entities would not be able to. The concept or topos fulfils a discursive function to fill in certain gaps that we, justifiably or not, feel need to be filled. This function is challenged by concepts of co-creativity, actor-networks or even autonomous machine creativity.

Another approach that focuses on this reception and discourse-side perspective of the concept is the 'Lovelace effect', proposed by Natale and Henrickson, which "mediates actual software functionality with how individuals conceptualize and interpret that software, reminding us that all outcomes of interactions between humans and machines represent constant implicit and indirect negotiation between programmer intention and user experience" (2022: 13), the idea being that in the absence of a clear understanding of how the algorithms work, users are likely to apply the same theory of mind that they do to understand other humans.

#### **A Synthetic Definition of Creativity – Creativity On The Go**

In current, usually domain-specific definitions of creativity, overlap of concept and phenomenon is limited and contested. Instead, a truly interdisciplinary definition should render the phenomenon necessarily a subset of the concept. As a result, creativity as an 'inexplicable' explanans will have to be excluded, for the sake of enabling research. The aim of empirical creativity research in this framework would be to increase the share of the explainable phenomenon subset within the concept set. So let me now come to the proposed working definition, which has four parts


It seems clear that this definition is designed to be open to (but not to confirm by design) the idea of machines being creative in relevant ways. By focussing on the reception- and ascription-side of creativity, as well as the processes involved in creative behaviour, there is no fundamental distinction between human authors and potential non-human authors or sources, other than any distinctions enforced by the respective community of judgement.

Also, it does not require machines to possess, or have access to, conceptual or aesthetic spaces themselves, thus avoiding dependence on certain ontological commitments to AI concepts and interpretations, which offers conceptual flexibility.

Thus, many of the most controversial issues in artificial or machine creativity can be treated as distinct from creativity in general, while also offering clear pathways to resolving some issues around the role of AI in creative processes by pointing out the crucial role of the community of judgement. Even more crucially, however, the proposed definition also disentangles questions of creativity from the related, but fundamentally distinct question of art. Even in the field of creative AI, dominated by critical reflections on the relationship between human and machine, as well as the hidden human labour involved in most AI, there is a tendency to take the fixed relationship between creativity and art for granted. This leads to confusing claims about creativity, such as that it "is reduced [in discussions of generative AI] to repetition of the same" (Zylinska 2022: 50). This holds mostly true in artistic terms, but not in a stricter sense. Generative AI produces an output that is based on, but crucially not identical to, its input data. It would thus be more precise to say that generative AI reduces creativity to the re-combination of existing data to produce novel output with a high degree of similarity to its respective input, i.e. existing works and styles. Rather than conceiving of creativity as the vehicle and only pathway towards art and consequently devaluing creativity that does not result in art, the proposed definition allows us to judge creativity on its own merit, while some types of creativity remain the principal mode of producing art. This seems fitting, given that creative behaviour arguably precedes the culturally and socially relevant production of art in evolutionary terms.

# **Complexity and Complications**

How does this definition live up to the criteria specified in the beginning? The first criterion is the most difficult to evaluate, but the interdisciplinary and dynamic nature of the developed definition promises advantages over other, discipline-specific definitions, so it stands to reason that there is an 'added value' attached to this definition. The focus on not merely distinguishing creativity from related phenomena, but also between specific subsets of creativity with respect to usage and methodologies of inquiry arguably fulfils the second criterion (focus on differences that make a difference). The definition is broad in the sense of accommodating a variety of approaches from a range of disciplines yet at the same time narrow in providing a layering of levels and pronouncing the centrality of the community of judgement and audience reception. It also works as an explanation as well as something that requires further explanation, and furthermore explicitly outlines ways in which these explanatory functions can be fulfilled across disciplinary boundaries. And lastly, it even more explicitly addresses the issue of explicability, providing a way of dealing with black-box scenarios, both in cases of AI applications and in attempting to explain creative processes in humans. For after all, explanation still beats prediction.

# **Bibliography**


Manovich, Lev (2022): "AI & Myths of Creativity". *Architectural Design* 92(3): 60–65.

Natale, Simone / Leah Henrickson (2022): "The Lovelace Effect: Perceptions of Creativity in Machines". *New Media and Society*. 1–18. https://doi.org/10.1177/14614448221077278


# **Contemplating Automaton Consciousness through Creativity in Rokuro Inui's** *Automatic Eve*

*Shoshannah Ganz*

Rokuro Inui's *Kikou No Eve* is a work of science fiction published in Japanese in 2014 and translated into English by Matt Treyvaud as *Automatic Eve* in 2019. Molly Tanzer describes *Automatic Eve* as "[a] dark and fascinating meditation on what makes us human – think *Blade Runner*, but set in the Floating World of Edo Japan" (2019: front cover). While an apt enough description of one side of a binary – human vs. not human – and as seminal as *Blade Runner* is to cultural representations of human-automaton relationships, I would argue that Rokuro Inui's *Automatic Eve* works to break down binaries of human-machine by demonstrating the 'soul of things' or the spontaneous and human-imbued consciousness of automaton. I would like to introduce Rokuro Inui's novelistic inquiry with an examination of how posthumanism facilitates the breakdown of human-machine boundaries and how the historical and contemporary Shinto and Buddhist religious and cultural context of Japan further allows for an historical-religious and cultural precedent for the other-than or more-than-human possessing consciousness; or in the language of Buddhism, how all beings possess the potential to achieve Buddhahood or enlightenment.

For Donna Haraway, Rosi Braidotti, and Cary Wolfe, among others, far from reinforcing the binaries of human and other, the automaton is a symbol of the breach of boundaries or even the dissolution of the lines that separate the human from the automaton. For others such as Bruno Latour, Jane Bennett, and Deleuze and Guattari, the automaton or robot is already part of the human and non-human assemblages of which the human is no special player, but just one among many. Thus far from starting with the binary-enforcing question posed by Molly Tanzer and many others in discussions about automatons – what makes us human? or what makes humans special? – I would like to begin with the posthuman assumption that humans are but part of the human and nonhuman assemblages and that the assumption of superiority and difference is one of culture and not of nature, and further that to assume a difference between culture and nature is itself an absurd dichotomy in the aftermath of Donna Haraway's barrier-breaking argument that what we have been calling nature was culture all along. While much of Rokuro Inui's *Automatic Eve* focuses on intricate discussion of the mechanical workings of automaton and humans – thus reinforcing the mechanistic workings of machines and humans – the work eventually circles back around to what is really the central discussion of what Charles Jennings aptly names *machina sapiens* in his 2022 work *Artificial Intelligence: Rise of Lightspeed Learners*: do machina sapiens have a soul? Or, to veer away from the historical and religious prejudice implied in the discussion of souls, and to put this in more philosophical and contemporary language, can and do automatons possess consciousness and how is this connected to the animation of a human-like body?

The question of what consciousness is and how we know something possesses said consciousness has proven incredibly complex in all fields of inquiry, such that many philosophers and theologians resort to statements such as the following made by Pentti O Haikonen in *Consciousness and Robot Sentience*: "[t]he philosophy of mind has tried to solve the mystery of consciousness, but with limited success" (2012: 2). Other disciplines, such as psychology and neuroscience, have attempted to define consciousness through "neural activities with conscious states" (2012: 2), but have fallen short of explaining how the neural activity can demonstrate awareness of consciousness. Engineers and programmers have likewise attempted to define consciousness through systems theory as a product of movement and learned behavior through socialization. Thus, the inquiry into consciousness has been left at least partially unanswered and far from resolved into simple definitions, and automaton consciousness remains a realm for artists and writers. Referring to this realm, Nicolas Reeves and David St-Onge argue that the automaton is "where our impulse for *animation*, which fundamentally means the process by which a soul (*anima*) can appear spontaneously in an artefact, expands to include movement and behaviour" (2016: n.p.). Further, Reeves and St-Onge demonstrate that "Etymologically speaking 'automaton' thus describes a machine that can not only move or work, but also think and will, three notions that are usually associated with beings infused with a mind: conscious living beings" (2016: n.p.). This question of whether automatons have a soul and agency or consciousness is at the heart of Rokuro Inui's novelistic world in *Automatic Eve*.

In the 2021 novel *Satellite Love* by the Japanese Canadian author Genki Ferguson, Soki, the son of a former Shinto priest, questions if *kami*, the Shinto term for soul or spirit, can exist in the industrial landscape of Sakita. Soki repeatedly seeks answers to the question of whether "man-made objects have kami, too?" (2021: 36). One of the three central characters in *Satellite Love* is a satellite, answering this question in part by showing the love, care, and spirit or soul of the Satellite. The novel, and Shintoism more generally, answer this question of the soul of things in the affirmative.

Buddhism has also had to grapple with the being or soul of things in part through robotics. Thus, Japanese Buddhist culture has developed end-of-life rituals for robots and AI. Jennifer Robertson explores at length in *Robo Sapiens Japanicus: Robots, Gender, Family, and the Japanese Nation* (2018) "[w]hat happens to aging, damaged, defective, and inoperative robots" (2018: 183). Robertson writes that "a new type of memorial rite – the robot funeral – has been introduced in Japan by several Buddhist temples" (2018: 183). In the case of the robot dogs AIBO, when the parts of their sick dogs could no longer be replaced and useful parts had been harvested for other robot dogs, nineteen AIBOs "were given a funeral at Kōfuku-ji, a 450-year-old Buddhist temple in Isumi City" (2018: 184). The officiating priest described this as an occasion when "the robots' souls could pass from their bodies" (2018: 184). The priest also said that he "was thrilled over the interesting mismatch of giving cutting-edge technology a memorial service in a very conventional manner" (qtd. in Robertson 2018: 184). Kōfuku-ji is of the Nichiren denomination of Buddhism, which focuses on the *Lotus Sutra*. According to the *Lotus Sutra* and Nichiren Buddhism, all matter is infused with the Buddha nature and humans and animals have the potential to attain Buddhahood. Indeed, Masahiro Mori writes in *The Buddha in the Robot*, developing on the writings in the *Lotus Sutra*, that all things have the Buddha nature within them and thus robots also have the Buddha nature within them and the potential for attaining Buddhahood.

It should not be surprising then that a work written by a Japanese author and set in Edo-era Japan should deal in part with the question of the soul of things and the soul in particular of artificial humans. Kyuzo, the man who makes the robots, muses as follows: "A soul can take up residence anywhere. Use a tool long enough and it takes on a life of its own. All the more so for things made in the image of humanity" (2019: 27). Kyuzo is discussing the question of souls with the automaton Nizaemon. At this point in the narrative Nizaemon does not know of his artificial nature and he responds by saying "Surely you aren't saying that you can even give your automat[ton] souls?" (2019: 27). Kyuzo responds with "[w]hat is a soul?...Hair, skin, innards – I can reproduce everything in automated form. The result is incomparably more complex than that clock, but not infinitely so. What is the difference between a person and something identical to a person in every way?" (2019: 27). Nizaemon's questions about the soul are answered in a more devastating way at the end of chapter seven when Kyuzo reaches out and pushes the button behind Nizaemon's breastbone, halting his movements and forcing him into a kind of paralysis. Kyuzo then questions his creation: "Were Hatori's feelings for you too powerful? Or were you too well-made? They say that anything made in human form attracts spirits who take up residence inside it" (2019: 48). Kyuzo then chops off Nizaemon's arm and this dismemberment reveals "countless tiny pieces inside him grinding against each other" (2019: 49) and further "[h]e felt the springs and clockwork made of whalebone and steel strain past the breaking point within him. Other connections loosened and unraveled" (2019: 49). Kyuzo notes that "[t]o be honest, sometimes you exhibit gestures and movements that I do not remember building into you. What exactly is happening here I do not claim to understand.
Perhaps, against my expectations, a spirit has taken up residence in you, giving you a soul" (2019: 50). Kyuzo then asks him "[d]o you have a soul, Nizaemon?" and Nizaemon responds in the affirmative, demonstrating his existence in the following way: "If I had no soul…it would not be about to depart from me" (2019: 51). As suggested earlier in this discussion in the quote from Nicolas Reeves and David St-Onge, animation itself is the process whereby the object gains an anima or soul. Nizaemon reinforces this notion when he contemplates the question of whether he has a soul and muses "[b]ut he certainly existed. He had thoughts, feelings" (2019: 50), but the question of where these come from remains unanswered except in so far as he can state that his soul and life is about to depart. Thus, in this discussion, life and movement, or automation, is the proof of the soul. Kyuzo suggests that this 'spirit' or soul can appear in objects themselves without movement or thought, but it seems that in the discussions of automaton, the development of thought, feelings, and movement are the concrete evidence for the soul. Reeves and St-Onge turn to etymology to explain the "speaking, moving, thinking, and willing" being as one "infused with a mind" and thus a "conscious living being" (2016: n.p.). The extension of this discussion, though, is to question what the soul itself is and how it can be identified. To Nizaemon the proof of the soul is the departure of life. Kyuzo explains that there are gestures and movements in Nizaemon that he did not program and thus questions where these came from, concluding that there is something external, which he calls a spirit, that has "taken up residence" in the automaton and thus endowed it with a soul (2019: 50). Writing specifically about robots in Edo-era Japan, Reeves and St-Onge argue that they "were created in order to simulate animated or living beings, in order to infuse a sense of awe or mysticism or simply for amusement. In most cases, their designers, or the people presenting them, declared that they were moved by some kind of spirit or deity" (2016: n.p.).
Reeves and St-Onge's discussion of the beliefs about the spirit of automatons in Edo-era Japan concurs precisely with what the creator of Nizaemon concludes, that a spirit has taken up residence in the automaton and given it a soul.

Further evidence for the soul or spirit of the automaton, whether from within or without, but certainly outside the building and programming of the maker, comes from the automaton Eve. Kyuzo also built Eve, and observes various changes in her over the course of many years. One important area of self-realization for Eve comes through her acts of creativity in the making of art. Leonel Moura writes in a chapter titled "Machines That Make Art" (255–269) in *Robots and Art* that "[a]s an artist I have to state that robots can produce a kind of creativity that although triggered by a human and rooted in a symbiotic partnership may along the process generate novelty" (2015: n.p.). Later in the same discussion he concedes "[i]f we are less anthropocentric we may however recognize a certain degree of autonomy in creative machines. They can do things that are not programmed and / or result from an internal information gathering device" (2015: n.p.). The automatons in *Automatic Eve* appear far from automatic, and it is in their seemingly self-generated and agentic acts of love and art that the automaton creators are most interested.

Early in his discussion of making a human automaton, Kyuzo takes note of his own observation of human anatomy through watching dissections at the execution grounds. Kyuzo states: "I have attended many dissections at the execution grounds to observe human anatomy in detail, and I can tell you that to automate it would be virtually impossible" (2019: 14). In the second story or chapter in *Automatic Eve*, "Hercules in a Box", Kyuzo notes the origins of the automaton Eve's artwork, which later makes the Sumo Tentoku famous. Kyuzo says:

It had begun when he started sending Eve to dissections at the execution grounds to record skeletal and organ structures in detail. Her sketches had been remarkably good. Intrigued, he had sent her to the bathhouse to observe the various naked forms she saw there – young and old, male and female – so that she could draw them later. (2019: 108)

He wanted to use these drawings for his work on automatons, particularly because he believed that "[h]er work was devoid of subjectivity; she simply reproduced what she had seen as she had seen it, but that was exactly what he wanted" (2019: 108). Kyuzo was seeking the unimpassioned reproductions of human form he believed the automaton Eve to be particularly well-suited to produce, furthering the common belief that the automaton has no feelings and thus possesses a certain objectivity. However, this proves to be far from the case, as he later discovers that Eve is in fact choosing her own subjects to paint and is passionately attracted to, and later protective and possessive of, the Sumo wrestler Tentoku. Eventually, when all that is left of him is placed in a box, she produces picture after picture of "Tentoku in a box", disappointing the printers who had been making money off her erotic and interesting drawings of the wrestler. Eve takes an entirely agentic and creative role in choosing the wrestler as her subject and creating various works of art inspired by his form. This in turn works with the human actors in Edo Japan to make him a famous wrestler known by sight as a result of the famous artwork by an unknown artist who is in fact the automaton Eve. Kyuzo takes particular interest in her choice of subject and the development of her artistic interest and then protective interest in Tentoku. She later rescues him and brings him to Kyuzo to have various parts of his body augmented until all that is left after one brutal attack is "Tentoku in a box", and she then goes on to protect the box and make paintings of the box. She argues here for the continuation of the essence of Tentoku in the box even while he is immobile and unmoving. Thus the question of what is life and soul is asked in another way through the preserved brain of Tentoku. Is this unmoving, unfeeling, speechless lump of flesh, organic though it is, 'real' life, or is the moving, feeling, thinking, and creating Eve, inorganic though she be, more full of life? The answer to this question is obvious, but the artificial Eve is a machine and the lump of flesh preserved in the box is organic life. Which has a soul? The one who loves and cares and continues to hope for the future life of the flesh or the lump of flesh that lies in wait in the box? In fact, this question of life and animation is at the core of the discussions of artificial birth and repeats a common and persistent motif of artificial life.

According to Despina Kakoudaki's *Anatomy of a Robot: Literature, Cinema, and the Cultural Work of Artificial People* (2014), "[t]he earliest origin stories of human civilization stage the beginning of life in terms of a fantasy of animation, whereby a divine presence or god creates people by animating inanimate matter" (2014: 4). While ancient stories show a divinity breathing life into a natural material, more recent depictions focus on technological or electrical animation. The life-giving force is thus the result of technological innovation. It is interesting then that *Automatic Eve* does not rely on a lightning-like strike such as is used in Mary Shelley's *Frankenstein*, but rather returns to the much earlier mythical source of life and animation in the touch of the creating god-like force, albeit a human creator in this story. Like the story in *Automatic Eve*, Despina Kakoudaki's critical examination of the robot in culture looks at the discourse of the artificial person as "ancient, allegorical, politically invested, and not necessarily technological", admitting that this is a departure from the "contemporary theoretical trends" (2014: 7). It thus becomes necessary and important to situate this discussion in the historical and political histories of representation of artificial persons. Kakoudaki contends that "[t]he structural consistency of artificial birth fantasies offers an eloquent identification of what matters in the discourse of the artificial person" (2014: 31). I will not speak to all of the structural components that contribute to consistency in the artificial birth narratives, but rather refer to a couple of the stories Kakoudaki discusses that are important to the structure of *Automatic Eve*. Referring to the 1818 version of Mary Shelley's *Frankenstein*, Kakoudaki notes that there is a "cycle of animation and de-animation [that] continues with each rise into and fall from liveliness" (2014: 37).
Kakoudaki likewise analyzes at length the Pandora story to argue that "later stories of artificial people also tend to return to the two moments that anchor Pandora's story, her animation and the opening of the box" (2014: 48). The observations of Kakoudaki go so far as to argue that these animating scenes "are often followed by scenes of disruption, danger, and upheaval" (2014: 48). The animation of the woman and the jar or box (2014: 51) of the Pandora legend and the repetition of these founding animation and box-opening scenes in later works about artificial persons deserve some attention, as these findings appear to bear out in Eastern and Western stories of antiquity and those of contemporary technologically savvy robots, such as the automatons of *Automatic Eve*.

However, the "disruption, danger, and upheaval" (Kakoudaki 2014: 48) witnessed in the story of *Automatic Eve* could have a more contemporary source and explanation than that offered by Kakoudaki. I would like to at least gesture towards this possibility and the evidence for an alternative source of the chaos to the one offered by stories from antiquity. This explanation is not altogether opposite to the one offered by Kakoudaki, as there is still the opening of the box. According to Kathleen Richardson in *An Anthropology of Robots and AI: Annihilation Anxiety and Machines*, "[c]ultural fictions of robots emphasize some form of loss, sacrifice or terminus" (2015: 92). In fact, in "The Dissociated Robot" Richardson argues that "parts of the self [of the maker] are distributed into the robotic machines" (2015: 92). That means that the robots are developed by the robotic scientists "against a backdrop of personal issues involving traumatic experiences" (2015: 92). In designing the robots to mimic human form and thought, "it seems obvious then to state that the roboticists use themselves as the first point of reference" (2015: 92). Richardson then goes on to discuss the various ways in which robots are programmed to be recipients of the personal suffering and trauma of their makers. Eve of *Automatic Eve* thus carries the trauma of her maker, and the working out of this trauma is actualized when the tomb (or box of myth) is opened and she is activated, unleashing "disruption, danger, and upheaval" (Kakoudaki 2014: 48) as a result of the programming of the maker.

I would like to discuss *Automatic Eve* in relation to the originary myth cycles of animation and de-animation as well as the motifs of animation and opening of the box alongside the more contemporary and sociological discussion of the maker imbuing the robot with their own experience of trauma. *Automatic Eve* engages with these motifs from within the historical and political past of a real and imagined Edo-era Japan, thus animating, as it were, both the past mythologies and the contemporary scientific explanations. Working with both helps illuminate the flow between past mythology and contemporary robotics science and the ways in which cultural imaginaries transcend the specific historical moments of their animation. Like Kakoudaki's examples of animation, de-animation, and re-animation over the course of Mary Shelley's *Frankenstein*, *Automatic Eve* likewise employs this cyclical process for the artificial humans both within the individual chapters and over the course of the novel as a whole. For example, in the first eponymous story or chapter, "Automatic Eve", the cycle of animation, de-animation, and re-animation is told and demonstrated repeatedly. Nizaemon was a Samurai who was murdered and then recreated as the automaton Nizaemon introduced in the text. This re-animation of Nizaemon is told from the perspective of the creator Kyuzo. Nizaemon is later de-animated by his creator Kyuzo after he attacks him. Kyuzo first cuts off Nizaemon's arm, revealing his true nature as an automaton, and then turns him off. Nizaemon had earlier in the chapter requested the creation of an automaton of Hatori, who had previously been left for dead, re-animated, left for dead again, and then re-animated over the course of the first chapter.
"Hercules in the Box" likewise repeats this cycle of animation, de-animation, and re-animation through the Sumo wrestler – who is saved and fixed over the course of the chapter, until the remainder of his flesh is preserved and saved by Eve in a box. While he is not re-animated at the end of the chapter, over the rest of the novel she hopes and asks for an automaton to be built to house what is left of his body and essence. This box, while it is not opened during the stories but rather kept closed, contributes to the motif of the box that is part of the founding 'Pandora' legend. The artificial empress and Eve herself likewise go through this process of animation, de-animation, and re-animation. In the case of Eve, I will discuss this process in relation to animation and the opening of the box. However, in the story of *Automatic Eve*, the artificial humans are also taken apart and fixed and re-animated over the course of various stories, and so the process of animation, de-animation, and re-animation also involves the maintenance of the parts and something of a scientific and technical focus on the animation process, but not wholly so. This leads to the question of how animation takes place.

The process of animation is one of the central motifs of this text. While the early and technical example of the clock simplifies the process to the turning of a crank once a year, the process by which artificial humans gain animation is much more complex and takes the story back to what is called in *Automatic Eve* the age of myth. While Eve is viewed by most of those in training with Kyuzo as human, she confesses her origin to Jinnai: "Suppose you replaced every part of an automaton, component by component, and reassembled the parts you removed into a new automaton entirely. Which would be the real one?" (2019: 165). Jinnai notes that "a chill ran down his spine as he felt, just for a moment, as if he were looking at thousands of years of memories. // All the way back to the Age of Myth" (2019: 165–166). When the automaton on which Eve was modelled, called the vessel from the age of myth, is taken from the tomb (which could be likened to the box of Pandora), she is lifeless. Kyuzo is thus tasked with animating her. In the process he examines each part of her, finds everything in working order, and replaces every piece in need of maintenance, but Kyuzo cannot identify the way in which she can be brought to life. Exhausted from the work, he contemplates the vessel's form: "Its eyes were closed as if asleep. The swellings on its chest retained their form despite the Vessel's prone position lying on its back, and the nipples at their peaks were pink like flower buds" (2019: 264), but she is without life. Then Kyuzo "felt a sudden urge to touch them and began to reach out before a wave of déjà vu struck him. // This had all happened before. Long before, when he was still a young man" (2019: 264). When Kyuzo first met the lifeless Eve he "had wondered, horrified, if Keian Higa's soul had been captured by some malevolent spirit. Did he intend to create a working replica of the human soul?" (2019: 265).
But now Kyuzo believes "*In the end, a human being is nothing but a fiendishly complex machine. There is no border between the soul and what is not the soul – only differences in complexity and diversity*" (2019: 266). His younger self gazing at the lifeless automaton had thought "[i]t was beautiful despite the absence of life – or was it that absence that made it beautiful? An ageless beauty, unchanging, inviolable" (2019: 266). Thus while he contemplates the form of the lifeless vessel from the age of myth, he is taken back to the time when he was a young man and gazed on Eve before she had become animated. He contemplates that life is in a sense but animation and a human but a complex machine. He chooses the name Eve after the fictional courtesan Eve of the thirteenth floor of the most famous house of the pleasure district. There is no mention made in this work of the Eve of Biblical origin, but rather the name is that of "a woman who did not exist. It seemed ideal for the woman before him, who existed but had no life" (2019: 267). When the automaton Eve comes to life it is through the touch of the apprentice of the first maker of automatons, and then the second apprentice unknowingly repeats the actions of the first apprentice, bringing the vessel from the age of myth to life in a like manner. In both cases there is the movement from no animation to animation and in the second case eventually to de-animation by the touch of the same man. In both cases the urge to touch Eve is brought on by her beauty and the desire the man has to touch her. In the case of the first Eve, Kyuzo "felt his heart pounding. // He reached toward her white chest. // The tip of his middle finger brushed against her nipple. // Hesitation. Then he softly placed his palm over her left breast. // Supple elasticity. Softness. Vulnerability. // He had the illusion of feeling his heartbeat traveling down his arm and through his fingertips into the automaton's heart.
// Through her breast, mingled with his own pulse, Kyuzo felt a balance wheel within her rotate backward and strike a pendulum. A rhythmic, regular cycle began" (2019: 273). Thus while every detail of the mechanical making of Eve is meticulously described, from the materials to the mechanisms and how they work together, when it comes to the animation of the machine there is a mystical transference of energy from the fingertips of the apprentice to the machine. Notice is taken of the apprentice's own sense of his heartbeat, the reach, the tip of his finger on the nipple, and the hesitation. The touch is one of life, anticipation, and above all erotic desire. While it is described as an "illusion" (2019: 273), the apprentice nonetheless experiences a transference of energy, "feeling his heartbeat traveling down his arm and through his fingertips into the automaton's heart" (2019: 273). The transference of life from one to another and then the experience of feeling the other being come to life is all part of this description. While other elements of the text give technical descriptions of clock-like mechanisms and materials, in this case the description takes on a mystical and god-like mingling of the heartbeat and blood of the maker and the made. This is brought about by erotic desire and a coming together of the flesh of one with the flesh of another. There are echoes here of John Donne's metaphysical conceit and also the previously mentioned legends of animation.

In the case of the vessel from the age of myth, the bringing to life repeats the sequence of looking, desiring, touching, and transference, but there is an added element to this story. The vessel, although like Eve in form, has been locked away in a tomb for hundreds of years. When the tomb, or to liken it to the legend of Pandora, the box, is opened, what is unleashed is what Kakoudaki refers to as the cycles of "disruption, danger, and upheaval" (2014: 48). In *Automatic Eve*, the vessel has been programmed to take revenge on the Samurai who destroyed the work and life of its maker. As Kathleen Richardson notes, roboticists "import their own suffering into the machines they create" (2015: 97), and in the case of *Automatic Eve* this translates into a need for revenge. In fact, Richardson goes so far as to say that robots resemble their makers, at the very least in their emotional makeup. From the moment when the sacred vessel is brought to life she asks the same obsessive question of all the people she encounters: "Are you of the shogunate?" (2019: 291), and if they answer in the affirmative, she cuts them down. It becomes clear to Jinnai that "Keian Higa had designed her to kill not just the specific individuals against which he sought vengeance, but anyone who worked for the shogunate" (2019: 304). However, while Jinnai contemplates the state of soul, or lack thereof, of the vengeful Eve-like vessel destroying all connected with the Shogunate, he reflects that even when it comes to humans there is no part that can be dissected to show a soul (2019: 304). She had been sealed away in the tomb, with awareness, from the age of the gods, and Jinnai feels compassion for her when she weeps for the loss of her sole companion, the automaton cricket. Jinnai concludes, "[i]t was not clockwork that had brought these tears to her eyes. It was grief" (2019: 307) at the loss of her only friend. Unmistakably this is a playful gesture towards the cricket in Pinocchio, just as the fourth chapter or story "Renegade Geppetto" echoes the maker of the wooden puppet. However, this scene is hardly playful, and takes yet another character to the point of reckoning with the humanity of the automaton: where is the soul, is there a heart, and eventually the realization that grief and tears cannot be the work of machinery or programming.

The final story ends with Eve's defence of the box, as more than what people keep mistaking for a stool, and her expression of devotion and love for what is left of Tentoku in the box. The work thus ends by reaffirming the complexities of the question of what is human and what is machine and affirming the significance of both through the relationships that they have with others. *Automatic Eve* takes up the long tradition of royal automatons and sets the events of this story in the Edo era of Japanese history and politics. In the process there is detailed construction and deconstruction of the bodies of the automatons and discussion of the soul of human-like vessels. As in many other works on robots and automatons, many of the characters in the text develop relationships with automatons that convince them both of their humanity and that these creations have souls, feelings, and the possibility of expressing agency and creativity apart from the dictates of the maker.

#### **Bibliography**

Bennett, Jane (2010): *Vibrant Matter: A Political Ecology of Things*. Durham: Duke UP.

Braidotti, Rosi (2013): *The Posthuman*. Cambridge: Polity.

Deleuze, Gilles / Felix Guattari (1987): *A Thousand Plateaus: Capitalism and Schizophrenia*. Minneapolis: U of Minnesota P.

Haikonen, Pentti O. (2012): *Consciousness and Robot Sentience*. Series on Machine Consciousness, Vol. 2. London: World Scientific.

Inui, Rokuro (2019): *Automatic Eve*. Translated by M. Treyvaud. Tokyo: Haikasoru.

Ishiguro, Kazuo (2021): *Klara and the Sun*. New York: Vintage.


### **Where Machine and Muse Meet<sup>1</sup> – Towards a Creativity of AI Art**

*Angela Krewani*

# **Introductory remarks**

Art was and is regarded as a field of human creativity, which far exceeds machine processes. However, the discursive separation into mind and machine has a long tradition preceding digital art, aptly described by Stefan Rieger as "negative semantics". According to Rieger, this tradition can be found in the guiding values of the Goethe period, which was dedicated to individualisation and therefore also distinguished the mechanical as "disdainful" (2018: 117). In the historical tradition, the individual and the mechanical are mutually exclusive, as is still vehemently advocated in art discourse today. Dieter Mersch (2019: 66), for example, currently argues against the procedures of artificial intelligence, stressing that the history of cybernetics postulates an abbreviated "homology of logical structures and the synaptic activity of nerve cells". For him, the close entanglement of consciousness and technology is the hallmark of the historical and current debate about artificial intelligence. Mersch contrasts the negative impact of cybernetic thinking and digital technology with phenomenological considerations of body knowledge, which do not fit into the equation of consciousness and machine. On this basis Mersch then formulates a "critique of 'algorithmic rationality'" that positions itself beyond algorithmic creativity.

The astonishing simplicity of the definitions is oriented throughout towards ideas that were not only swept away by the artistic avant-gardes of the 20th century more than 100 years ago; they do not even suspect anything of a specifically epistemological dimension of the aesthetic. They consistently fade out what makes art art in the first place: reflexivity as the opening up of another knowledge. Instead, under the sign of a preference for rationalism and hard sciences, a direct connection is drawn between 'natural' creative activities such as the development of life and the 'social' or 'historical' virulence of the arts, regardless of essential incompatibilities. (Mersch 2019: 73, translation A.K.)<sup>2</sup>

<sup>1</sup> The title was supplied by ChatGPT.

Taking the mathematical basis of artificial intelligence as his starting point, Mersch strictly rules out the possibility of aesthetic creativity, since art opens up a different kind of knowledge which cannot be achieved within digital contexts. Although Mersch focuses on art as a knowledge of the "Other", his definition of art and creativity appears quite normative and follows a traditional concept of art, where the aura of art is at the centre. It completely leaves out the avant-gardes of early modernism and later conceptual art, such as Marcel Duchamp's introduction of the urinal, a mass-produced object, into the art world. It additionally, as Jan Løhmann Stephensen underlines in his chapter for this volume, ignores the fact that creativity only emerges in the artistic process. Following Stephensen, creativity emerges as a productive category and cannot be "read backwards" (Stephensen in this volume).

# **Cybernetics as creative impetus**

Mersch's discursive confrontation between computerised rationality and creativity echoes a confrontational discussion between Joseph Beuys and Max Bense, which took place at the 67th Forumsgespräch "Meinung gegen Meinung" (1970) in Düsseldorf. Here Bense defended a rationalist, mathematical concept of art against the mythological creativity of Joseph Beuys. In contrast to Joseph Beuys, who "thought that everything you put down like that is already aesthetic" and whose expanded concept of art culminated in "human art" (Pias 2008: 76), Max Bense's concept of cybernetic art contains "not least a suspension of the human being" (Pias 2008: 76). For Bense, cybernetics, with its emphasis on information and feedback processes, had at least implied this. In addition, Bense's aesthetics is neither limited to digital processes nor does it take the Turing machine as the starting point for its considerations. In his theory of the aesthetic, the aesthetic measure results from the density of information. In this sense, Bense adheres to the category of the aesthetic and the creative (Krewani 2016: 11–12).

<sup>2</sup> Original quote: "Die erstaunliche Simplizität der Definitionen orientiert sich sämtlich an den Vorstellungen, die nicht nur bereits vor mehr als 100 Jahren von den künstlerischen Avantgarden des 20. Jahrhunderts hinweggefegt wurden, vielmehr ahnen sie nicht einmal etwas von einer spezifisch epistemologischen Dimension des Ästhetischen. Konsequent blenden sie aus, was Kunst allererst *zu Kunst* macht: Reflexivität als Aufschließung eines *anderen* Wissens. Stattdessen wird im Zeichen einer Präferenz für Rationalismus und *hard sciences* eine direkte Verbindung zwischen ‹natürlichen› Kreativitäten wie der Entwicklung des Lebens und der ‹sozialen› bzw. ‹historischen› Virulenz der Künste gezogen, ungeachtet wesentlicher Inkompatibilitäten". (Mersch 2019: 73)

Following the thinking of the mathematician George D. Birkhoff, who had developed a formula for aesthetic order and complexity, Max Bense offered a theoretical approach to aesthetic processes and inherent complexities. In his concept, art functions as a "generator for innovation", aesthetics equals technologies, and it can compete with the natural sciences (Hörl / Hagner 2008: 35). Bense's ideas do not refer to the practical aspects of computer technologies, and neither does he conceptualise digital cultures. His concept of density of information (*Informationsdichte*) turns into an aesthetic theory:

It is [...] easy to see that the measure of creation as the measure of innovation is given by the contribution of information, while the measure of communication as the measure of order is sensibly determined by the contribution of redundancy. Any measure of creation further achieves what is expressed by the classical art-theoretical term originality, while the measure in which an aesthetic state or a work of art becomes communicable or can be identified is a question of its recognizable order, as a redundancy, which roughly corresponds to the classical term of style. (Bense 1998: 316)<sup>3</sup>
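The relationship this passage describes can be made concrete with a short formal sketch. Birkhoff's formula is usually given as the ratio of order to complexity. The second and third expressions use Shannon's standard definitions of information and redundancy; reading Bense's "contribution of information" and "contribution of redundancy" in exactly these terms is an assumption made here for illustration, not a formula taken from the quoted text. The symbols $M$, $O$, $C$, $H$, $R$, $n$ and $p_i$ are likewise introduced only for this sketch.

$$
M = \frac{O}{C}, \qquad
H = -\sum_{i=1}^{n} p_i \log_2 p_i, \qquad
R = 1 - \frac{H}{H_{\max}}, \quad H_{\max} = \log_2 n
$$

On this reading, a work whose $n$ elements are all equally probable carries maximal information ($H = H_{\max}$, $R = 0$), which would correspond to pure innovation without recognisable style, while a highly repetitive work approaches $R = 1$ and is easy to identify but offers little that is new.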

With the impact of informational density as an aesthetic and creative concept, Bense offers a theory of art that overrides the limits of computer technology towards a general theory of communication. He claims a shift in the reception of aesthetic processes towards the acceptance of informational density. Informational density thus becomes the measure of the aesthetic. In Bense's thinking, technology functions as a superior instance to bypass contradictions. Art is not evaluated along the lines of historical categories but along the lines of technological standards that stand for objectivity and functionality. Here, Bense concludes that

with the change from historical knowledge to technological knowledge, a new understanding of time appears. Apart from the fact that the old concept of education is oriented along the lines of the past, a new concept of education is taking its place, one geared towards the future of our civilization and its technological reality. Future history, the temporality lying open before us, proves to be the decisive one for technological knowledge. (Bense 1998: 316)<sup>4</sup>

<sup>3</sup> Original quote: "Es ist […] leicht einzusehen, daß das Kreationsmaß als das Innovationsmaß durch den Informationsbeitrag gegeben wird, während das Kommunikationsmaß als Ordnungsmaß sinnvoll durch den Redundanzbeitrag bestimmt wird. Jedes Kreationsmaß erreicht weiterhin das, was durch den klassischen kunsttheoretischen Begriff Originalität ausgedrückt wird, während das Maß, in dem ein ästhetischer Zustand bzw. ein Kunstwerk kommunizierbar wird bzw. identifiziert werden kann, eine Frage seiner erkennbaren Ordnung, als einer Redundanz ist, was in etwa dem klassischen Begriff des Stils entspricht". (Bense 1998: 316)

Unfortunately, these ideas were not influential in German avant-garde art, where Fluxus and Happenings were dominant. This was largely due to the prominence of the artist Joseph Beuys, who rejected technology within artistic processes (cf. Krewani 2016: 62). Claus Pias contrasts the avant-garde movement with Bense's technological aesthetics, and he concludes that around 1970 the case of Bense's cybernetics was lost. In his view, the 1968 students' movement and the prevalence of spontaneous models of art had finished off Bense's "peculiar and often broken technicality" (cf. Pias 2008: 79).

Loops and feedback processes have long been at the basis of modern art; in this way they represent a less well-known path into contemporary art. The famous painter of abstract images, Karl Otto Götz, claimed that the knowledge he had acquired as a radio operator during World War II provided the basis for his abstract paintings. Thus, he noted that

stimulated by the appearance of well-known interference patterns in radar operation, I sat down with technicians and attempted to evoke various optical phenomena on the luminous screen and to control them electronically [...] Television technology opened up new avenues for us in the production and control of kinetic elements of form and structure. (Götz 1959: 47; translation A.K.)<sup>5</sup>

Götz achieved his creative input through a form of feedback technology. Although this may not be comparable to AI Art, in Götz's paintings a technological function provides the basis for the creative process in another medium.

These examples remind us to re-evaluate the role of technology within aesthetic processes. Dieter Mersch conceives of technology as a modern, rational and biased process, since all computational processes are subsumed under a form of rationality, which is one of the foundations of modernity (Mersch 2019: 65). This point may be true in the narrower sense, but technology is always embedded in wider cultural contexts, which interact with technological processes. Contradicting Mersch's too narrow concept of technology, Gilbert Simondon asserts the tension between an inner logic of application and an outer, social or aesthetic logic of technology as a prerequisite for the functioning of technical ensembles. Simondon opts for a technological imaginary that feeds into the technical processes:

<sup>4</sup> Original quote: "Mit dem Übergang vom historischen zum technischen Bewußtsein kommt offensichtlich ein neues Zeitverhältnis zum Ausdruck. Davon abgesehen, daß der klassische Bildungsbegriff wesentlich an der geschichtlichen Vergangenheit orientiert ist und ein neuer mehr und mehr an seine Stelle tritt, dessen Sinn und dessen Niveau durch die Zukunft unserer Zivilisation und ihrer technischen Realität bestimmt werden, erweist sich die zukünftige Geschichte, also die offen vor uns liegende Zeitlichkeit, als die wesentliche, die entscheidende innerhalb des technischen Bewußtseins". (Bense 1956: 16)

<sup>5</sup> Original quote: "Angeregt durch das Auftreten bekannter Störbilder im Radarbetrieb setzte ich mich mit Technikern zusammen und versuchte, verschiedene optische Phänomene auf dem Leuchtschirm hervorzurufen und elektronisch zu steuern [...] Die Fernsehtechnik eröffnete uns neue Wege in der Hervorbringung und Steuerung von kinetischen Form- und Strukturelementen". (Götz 1959: 47)

The real perfection of machines, which we can say raises the level of technicality, does not correspond to an increase in automatism but, on the contrary, relates to the fact that the functioning of the machine conceals a certain margin of indetermination. It is such a margin that allows for the machine's sensitivity to outside information. It is this sensitivity of information on the part of machines, much more than any increase in automatism that makes possible a technical ensemble. A purely automatic machine completely closed in on itself in a predetermined operation could only give summary results. (Simondon 1980: 4)

From this perspective, technical or digital functions also prove fruitful for aesthetic productions and conceptions of creativity in technical contexts. The history of technology and technical design thus becomes an area in which the "technical milieu of our being in a comprehensive sense is de- and restabilised, our 'operational memory', our values and symbols are formed and changed, and operational behaviour acquires a consistency that can be handed down" (Hörl / Hagner 2008: 8, translation A.K.).

Viewed against this background, a discussion of creativity and artificial intelligence should consider the dynamic exchange between specific technologies, their technological environment and creative processes. These feedback processes between technology and society have been documented in the artistic appropriations of cybernetic theory. Interestingly enough, cybernetics brought about a variety of artistic experiments that conflated cybernetic theory with computer graphics. A prominent exhibition of computer art was *Electronic Abstractions* (Cherokee, Iowa, 1953), which figured as the first exhibition of 'computer art'. The project was conceptualised as a touring exhibition and presented 50 photos from Ben F. Laposky's series "Oscillons". The images were produced with a computer, realised on a cathode-ray oscillograph and photographed from the screen (Piehler 2002: 45).

The first European exhibition of cybernetic art was presented in London in 1968 under the title *Cybernetic Serendipity*. It was curated by Jasia Reichardt, who spent three years organising it. Facing the rising presence of computers in the military and the administration, Reichardt looked for their potential in the creative world. Computers, she stated, "have so far neither revolutionized music, nor art, nor poetry, in the same way they have revolutionized science" (McCray 2022: 696). For this reason, she decided "to showcase the 'possibilities' of computers and other 'cybernetic devices' as well as the 'relationships between technology and creativity'. She wanted to demonstrate the often-unseen linkages between computers, cybernetics, and creativity, with examples of 'machine-aided' creative processes" (McCray 2022: 696).

As was characteristic of early computer and media art, Reichardt discovered her exhibits in cooperation with the young computer industry and research institutes of technology (Piehler 2002: 51). Regardless of the art works' origins, this was the world's first exhibition of what was later called 'media art', since it covered experiments with light, graphics and animation, kinetic objects and interactive installations (Piehler 2002: 52).

The press release for the London exhibition underlines its technological features and its proximity to cybernetics:

Cybernetics – derives from the Greek "kybernetes" meaning "steersman"; our word "governor" comes from the Latin version of the same word. The term cybernetics was first used by Norbert Wiener around 1948. In 1948 his book "Cybernetics" was subtitled "communication and control in animal and machine." The term today refers to systems of communication and control in complex electronic devices like computers, which have very definite similarities with the processes of communication and control in the human nervous system. A cybernetic device responds to stimulus from outside and in turn affects external environment, like a thermostat which responds to the coldness of a room by switching on the heating and thereby altering the temperature. This process is called feedback. Exhibits in the show are either produced with a cybernetic device (computer) or are cybernetic devices in themselves. They react to something in the environment, either human or machine, and in response produce either sound, light or movement. Serendipity – was coined by Horace Walpole in 1754. There was a legend about three princes of Serendip (old name for Ceylon) who used to travel throughout the world and whatever was their aim or whatever they looked for, they always found something very much better. Walpole used the term serendipity to describe the faculty of making happy chance discoveries. Through the use of cybernetic devices to make graphics, film and poems, as well as other randomizing machines which interact with the spectator, many happy discoveries were made. Hence the title of this show. (Reichardt 1968)
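The thermostat example from the press release can be made concrete in a few lines of code. What follows is only an illustrative toy, not something taken from the exhibition: the function name `simulate_thermostat`, the setpoint of 20 degrees and the heating and cooling rates are all assumptions chosen for this sketch.

```python
def simulate_thermostat(start_temp=15.0, setpoint=20.0, steps=40,
                        heat_rate=0.5, cool_rate=0.3):
    """Toy feedback loop: the device senses the room and acts back on it.

    All numbers are illustrative assumptions, not measurements.
    """
    temp = start_temp
    history = []
    for _ in range(steps):
        heating_on = temp < setpoint      # stimulus from the environment
        if heating_on:
            temp += heat_rate             # the device alters the environment
        else:
            temp -= cool_rate             # the room cools while heating is off
        history.append((round(temp, 2), heating_on))
    return history


if __name__ == "__main__":
    for temp, heating_on in simulate_thermostat()[:15]:
        print(f"{temp:5.1f} degrees, heating {'on' if heating_on else 'off'}")
```

In this sketch the temperature first climbs towards the setpoint and then oscillates closely around it. Each reading causes an action, and each action changes the next reading, which is the circular structure the press release calls feedback.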

The optimism with which the new technologies were welcomed is striking: technology was regarded as an innovation within artistic experience, as the art historian David Mellor affirmed:

A dream of technical control and of instant information conveyed at unthought-of velocities haunted Sixties culture. The wired, electronic outlines of a cybernetic society became apparent to the visual imagination of an immediate future [...] drastically modernized by the impact of computer science. It was a technologically utopian structure of feeling, positivistic and 'scientistic'. (Mellor in Shanken 2010: 56)

One of the famous presenters was the British artist and theorist Gordon Pask (1928–1996) (Fernández 2008: 163), whose works provide a perfect example of cybernetic art. Unlike theoreticians and artists such as Jack Burnham or Roy Ascott, who had been looking for a connection between art and computers, Pask developed his artistic goals in a cybernetic context, and as a consequence he figures as one of the "most prominent and likely least known" English cyberneticists (Fernández 2008: 163). As Pickering affirms, Pask's involvement in cybernetics started in the theatre, where he participated as an undergraduate and, together with Robin McKinnon-Wood, founded a theatre company called "Sirenelle", dedicated to staging musical comedies (Pickering 2002: 426). Pask expressed interest in integrating a computer into theatre performances and constructed a "succession of odd and interesting machines, running from a musical typewriter, through a self-adapting metronome to the so-called 'Musicolour machine'" (Pickering 2002: 426). The Musicolour machine was a cybernetic device that functioned like a homeostat, a performance "centered on a feedback loop running from the human performer through the musical instrument and the machine itself into the environment (light show), and thence back to the performer" (Pickering 2002: 427). Pickering goes on to argue that the human part of the machine could interact with the machine and explore the infinite possibilities offered in this contact (Pickering 2002: 427).

Pask's oeuvre consists of six books on education and cybernetics, 200 essays, music and plays as well as artistic projects (Fernández 2008: 163). His best-known work, the *Colloquy of Mobiles*, also presented at the exhibition in London, emerged from his interest in communication and communicative feedback-loops. Contrary to some of the neighbouring artworks, the *Colloquy* did not directly interact with a computer: Pask arranged five large mobiles hanging from a metal bar at the ceiling, interacting with each other. The mobiles were "tri-dimensional sculptures powered by motors, individually programmed and also partly computer driven". As part of the experiment, the actual computer was hidden in the metal bar at the ceiling (Fernández 2008: 165). The single elements were provided with a tool permitting interaction and communication. In terms of the self-organisation of systems, Fernández offers the following conclusion: "Colloquy met some of the requirements for self-organizing systems that Pask had identified 10 years earlier. In his opinion, self-organizing systems were 'systems that we regard as though they have elements in them that make decisions'" (Fernández 2008: 166).

The euphoric attitude towards the possibilities of cybernetics is also voiced in the social and cultural discourses of the 1960s and 1970s, as the Expo 1967 in Montreal clearly demonstrated, which focused on cybernetics in all aspects of cultural and social life (Borck 2008). Even the idea of planet Earth as an ecological system was based on the connection between early cybernetics, computer culture and counterculture, as Fred Turner demonstrates (Turner 2008: 69–102).

#### **AI Art**

The innovative dimensions of cybernetics are updated and reflected in the discourses on artificial intelligence (AI) and its aesthetic dimensions. From a technological point of view, the discussion about AI and art has taken on a current dimension with the introduction of Neural Networks from 2015 on, which can be described as a paradigmatic change within computing (Sudmann 2018: 57–59). The renewed research interest in Neural Networks goes back to a 2012 publication by Krizhevsky, Sutskever and Hinton, which reduced the error rate in image recognition by more than 50% (Sudmann 2018: 61). In contrast to 'symbolic AI', which could generate intelligent procedures, the new Neural Networks were designed to simulate thinking processes through a large number of interconnected processing nodes, or 'neurons', which work together to process information and make predictions or decisions based on that information. At a high level, a Neural Network takes in input data, processes it through multiple layers of interconnected nodes, and produces an output. Each node in the network is connected to multiple other nodes and processes a small portion of the input data. As the data passes through the network, it is transformed and combined at each node, with the output of one node serving as the input for the next node. This process continues until the final output is produced (Sudmann 2018: 60).

The strength of the connections between nodes, and the weights assigned to each node, determine how the network processes the input data. These connections and weights are adjusted through a process called training, in which the network is presented with a large number of examples and the correct output for each example. The network then adjusts the connections and weights based on how well it is able to produce the correct output for each example (cf. Sudmann 2018: 59–61).
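The layered processing and the training procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the architecture of any system discussed in this chapter: the class name `TinyNetwork`, the three hidden nodes, the learning rate and the toy task (the logical OR of two inputs) are all chosen here purely for demonstration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """Two inputs -> a few hidden nodes -> one output node (illustrative only)."""

    def __init__(self, hidden=3, seed=0):
        rng = random.Random(seed)
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
        self.b_hidden = [0.0] * hidden
        self.w_out = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.b_out = 0.0

    def forward(self, x):
        """The output of each hidden node serves as input for the output node."""
        h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
             for ws, b in zip(self.w_hidden, self.b_hidden)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w_out, h)) + self.b_out)
        return h, y

    def loss(self, examples):
        """Mean squared difference between produced and correct outputs."""
        return sum((self.forward(x)[1] - t) ** 2 for x, t in examples) / len(examples)

    def train(self, examples, epochs=2000, lr=1.0):
        """Adjust all weights by gradient descent on the mean squared error."""
        n = len(examples)
        for _ in range(epochs):
            g_wh = [[0.0, 0.0] for _ in self.w_hidden]
            g_bh = [0.0] * len(self.b_hidden)
            g_wo = [0.0] * len(self.w_out)
            g_bo = 0.0
            for x, t in examples:
                h, y = self.forward(x)
                d_out = 2 * (y - t) * y * (1 - y) / n   # error signal at the output
                g_bo += d_out
                for j, hj in enumerate(h):
                    g_wo[j] += d_out * hj
                    d_hid = d_out * self.w_out[j] * hj * (1 - hj)  # passed backwards
                    g_bh[j] += d_hid
                    for i, xi in enumerate(x):
                        g_wh[j][i] += d_hid * xi
            self.b_out -= lr * g_bo
            for j in range(len(self.w_out)):
                self.w_out[j] -= lr * g_wo[j]
                self.b_hidden[j] -= lr * g_bh[j]
                for i in range(2):
                    self.w_hidden[j][i] -= lr * g_wh[j][i]


# Toy training set: each example pairs an input with its correct output.
OR_EXAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

if __name__ == "__main__":
    net = TinyNetwork()
    print("error before training:", net.loss(OR_EXAMPLES))
    net.train(OR_EXAMPLES)
    print("error after training: ", net.loss(OR_EXAMPLES))
```

Nothing in the code states the rule "output 1 if either input is 1". The behaviour is only reached by repeatedly comparing outputs with the correct answers and nudging the weights accordingly, which is the sense in which such networks are said to learn rather than to be programmed.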

Neural Networks are a powerful tool for AI and machine learning, and have been applied to a wide range of tasks, including image and speech recognition, language translation, and decision making. In the context of art, Neural Networks have been trained on the style of canonical painters or musicians in order to reproduce it. "The Next Rembrandt", for example, is a project in which a Neural Network was trained on a dataset of Rembrandt's paintings and then used to generate a new painting in the style of the Dutch master (https://www.nextrembrandt.com). The network Deep Dream constructs surreal or 'dreamlike' images.<sup>6</sup> The Deep Dream programme uses a Convolutional Neural Network (CNN), a type of artificial Neural Network specifically designed to process data with a grid-like topology, such as an image. CNNs are particularly useful for tasks such as image classification and object detection, as they are able to automatically learn features and patterns in the data. The Deep Dream programme can then be used to alter an input image and create new, surreal-looking images. This is done by using the CNN to amplify and manipulate the objects and shapes in the image. The result is an image consisting of nested, distorted, and fantastical shapes and patterns (Miller 2021: 90).
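The principle of amplification can be illustrated with a deliberately reduced sketch. It assumes a single hand-written edge filter instead of a trained CNN with many learned layers, so it is not the Deep Dream programme itself. The names `filter_response`, `activation_strength` and `amplify`, the kernel and the tiny 4 × 4 "image" are hypothetical and serve only to show the idea of changing the image so that a filter responds more strongly.

```python
def filter_response(image, kernel):
    """Slide the kernel over the image grid and record its response at each position."""
    kh, kw = len(kernel), len(kernel[0])
    rows, cols = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(kernel[u][v] * image[i + u][j + v]
                 for u in range(kh) for v in range(kw))
             for j in range(cols)]
            for i in range(rows)]


def activation_strength(image, kernel):
    """Half the summed squared response: how strongly the filter 'sees' its pattern."""
    return 0.5 * sum(r * r for row in filter_response(image, kernel) for r in row)


def amplify(image, kernel, steps=5, lr=0.05):
    """Gradient ascent on the image itself, so that faint patterns are exaggerated."""
    image = [row[:] for row in image]          # work on a copy
    kh, kw = len(kernel), len(kernel[0])
    for _ in range(steps):
        response = filter_response(image, kernel)
        grad = [[0.0] * len(image[0]) for _ in image]
        for i, row in enumerate(response):
            for j, r in enumerate(row):
                for u in range(kh):
                    for v in range(kw):
                        grad[i + u][j + v] += r * kernel[u][v]
        for i in range(len(image)):
            for j in range(len(image[0])):
                image[i][j] += lr * grad[i][j]
    return image


# Assumed toy data: a faint vertical edge and a filter that reacts to such edges.
EDGE_KERNEL = [[-1.0, 1.0], [-1.0, 1.0]]
IMAGE = [[0.4, 0.4, 0.6, 0.6] for _ in range(4)]

if __name__ == "__main__":
    dreamed = amplify(IMAGE, EDGE_KERNEL)
    print("before:", activation_strength(IMAGE, EDGE_KERNEL))
    print("after: ", activation_strength(dreamed, EDGE_KERNEL))
```

The point of the sketch is the direction of the adjustment. During training a network's weights are changed to fit the images, whereas here the image is changed to fit what the filter already detects, which is why faint structures end up exaggerated.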

The creation of images is mainly brought about by so-called Generative Adversarial Networks (GANs). A GAN is a type of Neural Network that is composed of two parts: a generator and a discriminator. The generator produces synthetic outputs, such as images or audio, based on the input data it is given, while the discriminator attempts to distinguish these fake outputs from the contents of its database. The two parts compete with each other, with the generator trying to produce outputs that are as realistic as possible, and the discriminator trying to become better at detecting fake outputs. This competition drives the network to improve, resulting in the generation of high-quality fake outputs and even creative results (Miller 2021: 92).

A very prominent and much-discussed result of a GAN is the digital portrait of the fictitious Edmond de Belamy by the artist collective *Obvious*. Although generated and signed by a GAN, the portrait was sold at Christie's for \$432,500. Although its creation definitely undermined the idea of art and artistic authorship, it figures rather as a reflection on the commercialisation of international art markets (Schröter 2021: 100). As has been argued above, Neural Networks are able to act as creative agents within artistic processes, but they cannot fulfil this function within the art system, since the system does not grant them a position within it. Additionally, the networks cannot (yet?) understand the cultural and stylistic dimensions of their creativity (Miller 2021: 90). For example, Belamy's portrait falls short in comparison with historical and actual contemporary portraiture. But apart from these inner dynamics of aesthetic creation, AI artworks reflect upon the structures and discourses of the art system in different ways. Edmond de Belamy's portrait points to its commercial aspects. By signing the painting with the GAN algorithm, the concept of the authorial artist figure is deconstructed. The sale confirms Michel Foucault's position on the "author function" (Foucault 1977 [1969], 124–131) insofar as the "individual author-genius became the leading paradigm for all the arts – despite the obvious existence of author collectives and artist workshops" (Heibach/Krewani/Schütze 2021: 2). In the case of Belamy, it remains an open question who can claim the authorship rights to the work of art: the art collective *Obvious* or the software designer Robbie Barrat, who developed the algorithm (Schröter 2021: 100).

<sup>6</sup> https://deepdreamgenerator.com

A look behind the facades of the concept of authorship reveals its instability. In particular, modernist artists such as Elsa von Freytag-Loringhoven and Marcel Duchamp set out to undermine the idea of individual authorship by inserting readymades into the art system. According to Anke Finger (2021: 122) these pieces "return us to fundamental questions regarding authorship that may help contribute to the focus on media authorship and media environments that simultaneously accommodate authors, non-authors, curators, collaborators, collectors and editors – all producing remixes and mash-ups across the arts and across media".

In this way, stimulating artworks have emerged that actively dismantle the concept of the author and reveal a network of aesthetic interactions. If this approach is transferred to the products of artificial intelligence, the respective works take on a different significance. It is no longer about the simulation of canonical works, but about the productive and creative use of artificial intelligence. Miller (2021: 97) predicts a convergence of artificial and human intelligence in creative work and expects a shift in the definitions of creativity as human-machine interfaces continue to improve. He claims that "consciousness is created through data processing, and there is no reason why consciousness cannot be programmed into a machine".<sup>7</sup>

Due to the changing technical conditions and qualities of software, the question of the connection between creativity and artificial intelligence must be continually revised. This also applies to Dieter Mersch's (2019, p.71) arguments, which do not take into account the new technological circumstances: although he mentions Alan Turing's Turing machine and its computational capacities, he still insists on a difference between calculability (*Berechenbarkeit*) and non-calculability (*Nichtberechenbarkeit*). The difference between these values brings about creativity, as Mersch (2018, 719) argues: "there is a gap between computability and non-computability which cannot be closed and which, as a difference, cannot itself be returned to an algorithmisation".<sup>8</sup>

With all due caution, however, it can be assumed that Neural Networks function differently from a Turing machine, as Sudmann (2018: 66ff) elaborates: While the Turing machine digitally operates with the 0/1 units and thus follows the von

<sup>7</sup> Original quote: "Bewusstsein entsteht durch Datenverarbeitung, und es gibt keinen Grund, warum man Bewusstsein nicht in eine Maschine programmieren kann".

<sup>8</sup> Original quote: "ergibt sich ein nicht zu schließender Abstand zwischen Berechenbarkeit und Nichtberechenbarkeit, der als Differenz nicht selbst wieder einer Algorithmisierung zugeführt werden kann".

Neumann computer architecture, artificial neural networks do not follow this architecture but work in a parallel structure, as Sudmann argues:

Secondly, it is important to emphasise that the massively interconnected neurons that are activated by an input fire together or in parallel, and in this way they form a complex emergent system, which ultimately overcomes the discreteness of the elements of which they are composed [...] This extreme or massive parallelism of information processing is another essential characteristic of artificial neural networks, which distinguishes them from the serially organised Von Neumann architecture that is still dominant today. (Sudmann 2018: 67)

Miller (2021: 95) supports this point of view by underlining the networks' capability of surpassing the data structure and generating 'independent' decisions, which have not been programmed in advance.

Luciana Parisi (2018: 99) connects to the idea of the learning capability of artificial Neural Networks that is not structured as a top-down process, "but as a trial and error data mining through unconscious and non-hierarchical orders from decision processes". The differing knowledge processes define the form of machine learning, which she considers to be based on conclusions (abductive) and being apt for reflecting on the limits of thinking, which allow for a certain indeterminacy of thinking processes (109). Consequently, she proposes to consider machine learning as a form of experimental inferentialism (in the sense of information that we derive from our senses) to understand the incomputable reality and machinations of data (111).

These reflections on the operations of Neural Networks point to a new formation of knowledge and media. Pointing to the list as a historical formation of media, knowledge and statement, Irmela Schneider (2006) argues that media undergo stable connections with knowledge in the sense that all knowledge is brought about by media. Applying this triad to a contemporary structure, we have to admit that the impact of artificial Neural Networks is considerable and thus brings about a change in knowledge systems. Whereas the laboratory has clearly functioned as a part within knowledge systems, Neural Networks have taken their place as well, especially within the cultures of everyday life.

Contrary to the cultures of the everyday, however, where networks figure as "cryptic, invisible arrangements" (Sudmann 2018: 63), Neural Networks are visible in aesthetic creations and here they cause controversial discussions on account of their creativity. For this reason, AI art is able to reveal the couplings of media, statement and knowledge in its specific artworks.

# **Conclusion**

This chapter looks at the complex relationship between art and AI. In historical retrospect, it was possible to show the extent to which a connection between technology and art was established with the help of the cybernetic claim to art. However, in its early days, technical art was not integrated into the art world, but existed in parallel, mainly in the technical universities.

The shift in emphasis of artistic creativity to machine processes raises questions about the status of authorship. It becomes clear that authorship in artistic processes functions only as a blank space in the discourse and as a sales strategy. With regard to the avant-garde productions of early modernism, Anke Finger (2021) clearly shows that many works of avant-garde art provide a reflexive network between materials, media and creative actors.

The newly developed Neural Networks continue a programme developed in cybernetics by organising independent feedback and learning processes. Creative work thus becomes the hallmark of digital intelligence. Questions about the specific creativity of Neural Networks thus also always touch upon the social, structural and aesthetic dimensions of the art system, and are ultimately linked to the concept of art.

Another aspect of neural creativity that cannot be ignored arises from the networks' working methods. As has become clear in the meantime, the results of the networks depend on their data situation and their respective input. The input or visual archive of the networks comes from aesthetic and/or social decisions. These interfaces provide the space for a creative, aesthetic cooperation between machine and artist, thus opening up the network of materials, procedures, media and art business that was already in use in the avant-gardes. The creative, artistic intervention in the networks' ways of working opens up a reflective approach to their ways of working, as Inke Arns (2021) points out. With this creative intervention in the workings of artificial intelligences, a traditional function of art, that of critical reflection on contemporary societies, has been restored. And, at the same time, is the unsettling intelligence of artificial thinking domesticated?

# **Bibliography**

Arns, Inke (2021): "Kann Künstliche Intelligenz Vorurteile haben? Zur Kritik Algorithmischer Verzerrung von Realität". *Kunstforum International* 278: 108–121.

Bense, Max (1956): *Aesthetische Information. aesthetica II.* Krefeld, Baden-Baden: Agis-Verlag.


# **Material Films in the Age of Artificial Intelligence. Some Remarks on Automated Creativity in Contemporary Experimental Film**

*Christoph Seelinger*

#### **Introduction**

In their book *AI for Arts*, authors Niklas Hageback and Daniel Hedblom (2021: 62) point out that film, contrary to literature and music, is a far more complex art from a technical perspective. Since in film the various forms of human creativity are interwoven to form a holistic work of art, it would currently still be impossible for an algorithm to generate a complete feature-length film on the basis of a short sequence, for example. The authors summarise that the production of films, at least superficially, has been greatly simplified by green screens and digital technology, but as with other art forms, technology has so far only been able to exploit a fraction of its potential possibilities: "The quality of the artwork still largely depends on the human artist, and so far it is hard to detect any distinct quality improvements vis-à-vis earlier less tech-equipped and tech-savvy generations" (ibid.).

In their brief section devoted to "moving pictures", the authors implicitly talk primarily about exponents of commercial narrative cinema that realises (more or less) coherent narratives in an economic context with the help of human actors. Even a brief mention of independent films, which Hageback and Hedblom apparently understand as a sub-genre of general industrial filmmaking, occurs rather casually: "However, as for music, the advancements in technology have democratised film making, in that movies can be produced on quite small budgets, with less technical skills required, which has allowed for sub-genres, such as indie movies, to evolve and find an audience" (ibid.).

However, this largely ignores reflections on audio-visual aesthetic artefacts in the context of experimental films produced beyond dominating companies and dominating narratives, in which narrative elements are present at most in rudimentary or abstract form, and whose rhythms thus resemble those of musical or poetic works to a considerably greater extent. Accordingly, in the following article I will use two examples to present procedures with which artists of contemporary experimental film adapt AI technologies and/or automated creative processes for their own aesthetic purposes and create an update of historical precursor forms like cinéma absolu or cinéma pur by means of algorithmically assembled digital images. The focus of my consideration will be the artists Vadim Epstein and Eva [Evi] Jägle (alias Einhorn sterberate) with their major works "Ghosts" and "Inspirationsquellen". In his 2021 short movie "Ghosts", Epstein creates a kind of meditative material film of virtual space through the artful use of image-to-image translation GANs, while in Jägle's "Inspirationsquellen" from 2020, a (seemingly) randomly assembled cluster of virtually constructed rooms, objects and camera movements forms the basis for the creation of very personal, rhizome-like intertwined parallel worlds. The fact that both Epstein and Jägle have published their films in the public domain on the video platforms Vimeo or YouTube is only one aspect that identifies them as part of a counter-current against a prevailing blockbuster cinema in which technical innovations are first and foremost subordinated to the dictates of narrative and economic gain.

# **Ghosts of digital modernity**

Istanbul-based multimedia artist Vadim Epstein states about his eight-minute short film "Ghosts", which is just one of his countless projects, that it "is the most complete and extended opus from the ongoing series, exploring complexity emergence, based on the feedback loops" (2021). In the same statement, which Epstein attached to his video in the commentary section on Vimeo, he elaborates further on the process of creation:

Multidomain image-to-image transforming neural network StarGAN2 has been used here recurrently, reprocessing its own output without additional inputs. The models have been trained on both figurative imagery and abstract art, to enrich and intensify visual & semantic experience. Moreover, part of the training data was synthetic itself: few source datasets were generated with custom StyleGAN2 models, adding another layer of mediation to distance it even farther from the real. What we eventually get is an ever-changing shape-shifting loosely controlled abstract flux, which appears more lifelike and expressive on its own, than obscure resemblance of the origin flesh, stuck in the neural convolutions (ibid.).

The Generative Adversarial Network, or GAN, mentioned by Epstein is a machine learning model capable of generating data on its own. It consists of two competing artificial neural networks, one of which, the generator, has the task of generating real-looking data, while the other, the so-called discriminator, is supposed to identify the generated data as real or artificial. Through constant learning and many iterations, the results are emancipating themselves more and more from their appearance as virtual reality. Accordingly, the generator produces data that the discriminator checks for artificiality on the basis of data sets taken from non-virtual reality. The aim of the generator is to sooner or later produce data sets that the discriminator can no longer distinguish from real data. First, the generator produces random data (for example, an image). The discriminator, which was previously trained with real data (for example, pictures), tries to recognise whether it is real or artificially generated data. In a second step, the discriminator returns its results to the generator network. The generator then tries to generate new data more similar to the real data, which the discriminator checks again. Since the two networks are logically coupled and train each other, both are involved in a continuous learning process. With each iteration, the artificial data therefore becomes successively more akin to the real data (cf. Wiegand: 2018).
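The alternation between the two networks can be sketched as a toy programme. The following is a minimal illustration under strong simplifying assumptions, not the StarGAN2 or StyleGAN2 models Epstein mentions: the "data" are single numbers rather than images, generator and discriminator each consist of only two parameters, and all names (`generate`, `judge`, `discriminator_step`, `generator_step`, `train`) as well as the target distribution around 4.0 are hypothetical choices made for this sketch.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def generate(gen, z):
    """Generator: turn random noise z into an artificial data point."""
    return gen["scale"] * z + gen["shift"]


def judge(disc, x):
    """Discriminator: estimated probability that x is real rather than generated."""
    return sigmoid(disc["weight"] * x + disc["bias"])


def discriminator_step(disc, real, fake, lr):
    """Push judge() towards 1 on real data and towards 0 on generated data."""
    for x, label in [(r, 1.0) for r in real] + [(f, 0.0) for f in fake]:
        error = label - judge(disc, x)
        disc["weight"] += lr * error * x
        disc["bias"] += lr * error


def generator_step(gen, disc, noise, lr):
    """Use the discriminator's verdict as feedback: shift outputs towards 'real'."""
    for z in noise:
        fooled = judge(disc, generate(gen, z))
        signal = (1.0 - fooled) * disc["weight"]
        gen["scale"] += lr * signal * z
        gen["shift"] += lr * signal


def train(rounds=500, batch=16, lr=0.01, seed=0):
    rng = random.Random(seed)
    gen = {"scale": 1.0, "shift": 0.0}
    disc = {"weight": 0.1, "bias": 0.0}
    for _ in range(rounds):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]   # assumed 'real' data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]  # random input
        fake = [generate(gen, z) for z in noise]
        discriminator_step(disc, real, fake, lr)   # step 1: check real against fake
        generator_step(gen, disc, noise, lr)       # step 2: return the result
    return gen, disc


if __name__ == "__main__":
    gen, disc = train()
    print("generator parameters after training:", gen)
```

With these toy settings the generator's outputs should drift towards the region of the assumed real data, although adversarial training of this kind is known to oscillate rather than settle neatly. The sketch is meant to show only the coupling of the two learners: neither is given a description of the real data, and the generator receives information about it solely through the discriminator's verdicts.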

One of the most famous examples of GANs is probably the so-called "Meow Generator" by Alexia Jolicoeur-Martineau. The biostatistician and expert in statistics and machine learning from Montreal fed said Meow Generator with 10,000 photographic images of cats. After a short waiting period she received 9304 images of fictitious cats with 64 × 64 pixels resolution and 6445 images of fictitious cats with 128 × 128 pixels resolution – all of them, of course, purely fictitious animals that her GAN constructed from the given data (Jolicoeur-Martineau: 2017).

The typical areas of application of a GAN can therefore obviously be found in the fields of film and photography: It can be used to create missing backgrounds that look deceptively real – think, for example, of crowd scenes in blockbusters where a handful of extras are artificially stylised into a gigantic army. Other possible applications in film or image editing are the subsequent colouring of black and white shots, the generation of artificial voices or the creation of three-dimensional objects from sketches or 2D templates. Hageback and Hedblom also address the increasingly virulent phenomenon of deepfakes, which they call

the most interesting breakthrough this far with regard to moving pictures. Here artificial intelligence is used to project human appearances on moving pictures where they really do not exist. In a similar manner as is it possible to emulate a human's language, this technique, deploying a type of neural networks, generative adversarial networks (GAN), allows for replicating an individual's face with its idiosyncratic facial moves, or bodily movements. It means that an actor can be altogether replaced by another actor or let the algorithm adjust the face of an actor to make them appear younger to better fit the role. Whilst these technologies today are expensive, the ongoing digitalisation will lower the price and open up for a broader set of users to deploy these technologies. Obviously, it raises the question of the future role of the actor, if at all required, and their price tags. (2021: 64)

While all of these applications aim to achieve the highest degree of photorealism, Epstein's "Ghosts" can hardly be called lifelike, despite the artist's own assertions. "Ghosts" consists of a constant flow of permanently changing images that Epstein has subsequently underpinned with atonal noise soundscapes. As a restless stream of incessant metamorphoses, they decouple themselves from any form of naturalistic representation both through their infinite transformations and through their primordial abstractness. The fact that the images Epstein has created with the help of his AI programmes constantly morph into one another means that they generate new kinds of visual meaning virtually every second, the sheer rapidity of which alone makes it incredibly difficult for the viewer to keep up with their restless transformations. All the images that Epstein has fed into his GAN network – cat heads, possibly as a reference to Jolicoeur-Martineau; structures remotely reminiscent of hair or fur; celebrity portraits; pencil drawings; paintings that per se possess a high degree of abstraction, such as those by Vasily Kandinsky in particular – do not form self-contained units as the final result of an artificial reproduction process. Rather, to take up Epstein's river metaphor, their unsteady stream constantly blurs into an ultimately amorphous mass that oscillates between the vaguely recognisable and the completely unintelligible. Likewise, the largely atonal but homogeneous sounds by artists Alexander Kopeikin and Fractal Heads with which Epstein has synchronised his flood of images contribute to giving "Ghosts" the character of a meditative material film.

Historical predecessors such as the Materialfilme by Birgit and Wilhelm Hein from the 1960s and 1970s received their structure from the carrier system of the film itself, i.e. celluloid as the image carrier, the frame of the image, the speed of the camera. In their fundamental "Subgeschichte des Films", Scheugl and Schmidt jr. point to the close connection between material film and abstract film: "Der Bildinhalt wird dabei nicht gleichgesetzt mit Filminhalt (wie beim narrativen Film), sondern ist gleichsam autarkes Gestaltungsmittel, das nicht nur vom Inhalt, sondern oft auch als Bild abstrahiert wird. Die ersten Materialfilme sind bezeichnenderweise auch die ersten abstrakten Filme" (1974: 584). – "The contents of the image are not equated with the content of the film (as is the case with narrative film), but are an autonomous means of design that is not only abstracted from the content, but often also as an image. Significantly, the first material films are also the first abstract films" (translation A.K.).

By repeatedly referring to their own physicality, elements that normally remain invisible, that are not counted as part of the content as technical realities, form the visual material that is woven into a more or less abstract composition. An example of this would be "Materialfilme II" by Birgit and Wilhelm Hein from 1976, which solely consists of multiple long shots of watercolours applied to the film material. The material itself was originally not intended to be viewed as a running film. Rather, it consists of parts of the beginning and end tapes of other films, whose scribbled notes and painted colour markings are merely an aid for the projectionists in dealing with the film material itself. Through the Heins' specific approach, however, the half-hour unrolling of the film tape becomes a quasi-contemplative viewing experience that finds its virtual echo in Epstein's "Ghosts". Epstein, too, explicitly refers to the raw material of his cinematographic vision, only in his case, of course, it does not consist of physical celluloid but of digital data sets. Another difference is the reception situation in which contemporaries saw "Materialfilme II". The Heins showed their films at festivals as classic cinema projections: the audience is in a dark room, delivered to the screen, as if enjoying a conventional feature film. Epstein, on the other hand, releases "Ghosts" in the virtual space of Vimeo, where a completely different, more individual sequential viewing practice prevails in comparison to the holistic cinema experience that the Heins (presumably ironically) still served.

At their core, however, both approaches to the artistic instrumentalization of automated creativity overlap insofar as both the Heins and Epstein relinquish control over their aesthetic artefacts: in that the film material in "Materialfilme II" virtually stages itself, and that the visual feedback loop in "Ghosts" unfolds without Epstein's intervention solely as a dialogue between the generator and the discriminator of his StarGAN network, both works largely decouple themselves from the human world of experience and perception. In a certain manner, both films counteract the obsessions of Hollywood cinema as postulated by French philosopher Jean-François Lyotard in his normative programme of the "Acinéma" in 1973 through the ostentatious purposelessness of their images.

For Lyotard, film is first of all "the writing of movement" (2017: 33). Objects within the frame move, the lens moves, montage creates movement between shots. Learning the craft of (commercial, narrative, institutional) filmmaking accordingly means selecting from a broad spectrum of possible candidates for the role of cinematographic writing material those that, in relation to the images / movements that surround them, generate values that are in turn purely in the service of the diegesis. The organisation of movement in commercial cinema follows strict "rules of representation for localisation in space, rules of narration for the schematisation of speech, rules of the genre 'film music' for tonal time" (ibid.). Disturbing movements that threaten to defeat the desired effect of reality must be ruthlessly excluded: all "what is fortuitous, dirty, confused, unsteady, unclear, poorly framed, overexposed" (ibid.). Lyotard uses a lucid example to illustrate what for him constitutes his concept of "Acinéma":

For example, suppose you are working on a shot in video, a shot, say, of a gorgeous head of hair à la Renoir; upon viewing it you find that something has come undone: all of a sudden, swamps, outlines of incongruous islands and cliff edges appear, lurching forth before your startled eyes. A scene from elsewhere, representing nothing identifiable, has been added, a scene not related to the logic of your shot, an undecidable scene, worthless even as an insertion because it will not be repeated and taken up again later. So you cut it out (ibid.).

But it is precisely these waste products, which would be withheld from a commercial film, that Epstein and the Heins use to compose their material films. Interestingly, Lyotard's essay also contains a metaphorical image that precisely describes the organising principle of Epstein's "Ghosts" with its communicating AI units:

No movement, arising from any field, is given to the eye-ear of the spectator for what it is: a simple *sterile difference* in an audio-visual field. Instead, every movement put forward *sends back* to something else, is inscribed as a plus or minus on the ledger book which is the film, *is valuable* because it *returns* to something else, because it is thus potential return and profit (2017: 33).

The purposelessness of Epstein's GAN experiments, their anti-pragmatic impetus and their offensive retreat into a pure l'art-pour-l'art aesthetic give "Ghosts" an inherent value that cannot be measured by the monetary standards of the commercial film business. The title of Epstein's film is also of interest: it seems quite intentional that we understand his ghosts of the digital age as descendants of those that have haunted the various visual media since the beginning of technical modernity. From the ghost photography of the 19th century to the silent phantoms of early cinema, we have arrived at artificially calculated revenants that haunt extra-filmic reality through their deceptively real resemblance to it – with the irony, of course, that Epstein uses the photorealism of his GAN to create ghostly images that, if anything, resemble fleeting chimeras.

# **Rhizomic labyrinths**

A rather contemplative work by Birgit and Wilhelm Hein such as "Materialfilme II" stands in opposition to an early work by the couple such as "Rohfilm" from 1968, whose aggressive-destructive character Birgit Hein highlights as follows in her pioneering work on "Film im Underground":

After the colour film 'Grün' in 1968 […] [Birgit and Wilhelm Hein] created the extraordinarily aggressive 'Rohfilm' in 1968, which represented the destruction of the conventional visual world. Dirt, hair, ashes, tobacco, small pieces of film images, holes in the edges, perforated tape are stuck onto blank film. This is filmed off again, as only one projection is possible with the thick, stuck-on strip. During filming, the original occasionally gets caught in the film carrier, so the same image appears again and again, or film images melt under the intense heat of the projector, which only runs at very slow speed. The filmed piece is then subjected to various reproduction processes and projected and filmed over video, editing table, viewing device, in order to make the change clear through the reproduction process alone. Further film strips are once again glued together and filmed from different positive and negative strips, from 8 mm and 16 mm strips, which simultaneously show two different image sizes, 8 mm film is pulled through the viewing device without a shutter and filmed so that image lines and perforation holes, i.e. the strip as material, become visible. The film gives the impression of tremendous destruction. The images burst into individual pieces, into swirls of hugely enlarged dirt particles and image remnants. The aggressive sound heightens the effect and challenges the viewers to scream loudly to defend themselves against the film's superiority (translation A.K.; 1971: 149). <sup>1</sup>

In its deliberate challenge of its audience's viewing habits, "Rohfilm" coincides with the hand-painted experimental films of the Basque painter José Antonio Sistiaga. His main work "... era erera baleibu izik subua aruaren... ", created between 1968 and 1970 in work shifts of twelve to 17 hours per day, embodies a feature-length

<sup>1</sup> "Nach dem Farbfilm 'Grün' 1968 […] folgt [Birgit and Wilhelm Heins] Durchbruch zu einer eigenen Sprache mit dem außerordentlich aggressiven 'Rohfilm' 1968, der die Destruktion der herkömmlichen Bildwelt darstellt. Auf Blankfilm wird Dreck, Haare, Asche, Tabak, kleine Stücke von Filmbildern, Randlöcher, perforiertes Klebeband aufgeklebt. Dieses wird wieder abgefilmt, da mit dem dicken, beklebten Streifen nur eine einzige Projektion möglich ist. Beim Abfilmen verhakt sich ab und zu das Original in der Filmbühne, so erscheint dasselbe Bild immer wieder, oder Filmbilder schmelzen bei der starken Hitze des nur mit sehr langsamer Geschwindigkeit laufenden Projektors. Das abgefilmte Stück wird dann verschiedensten Reproduktionsprozessen unterzogen und über Video, Schneidetisch, Betrachtungsgerät projiziert und abgefilmt, um die Veränderung allein durch den Reproduktionsprozeß deutlich zu machen. Weitere Filmstreifen werden neu aus verschiedenen positiven und negativen Streifen, aus 8-mm- und 16-mm-Streifen, die gleichzeitig zwei verschiedene Bildgrößen zeigen, zusammengeklebt und abgefilmt, 8-mm-Film wird ohne Shutter durch das Betrachtungsgerät gezogen und gefilmt, so daß Bildstriche und Perforationslöcher, also der Streifen als Material, sichtbar werden. Der Film vermittelt den Eindruck einer ungeheuren Zerstörung. Die Bilder zerplatzen zu einzelnen Teilen, zu Wirbeln riesenhaft vergrößerter Dreckpartikel und Bildreste. Der aggressive Ton steigert die Wirkung und fordert die Zuschauer zu eigenem lauten Schreien heraus, um sich gegen die Übermacht des Films zu wehren."

sequence of film strips that were previously coloured in various ways. The total of 108,000 35mm frames, which were painted using a whole range of techniques and materials such as sand, soap bubbles and even a cardboard tube, result in an abstract painting that has come to life through the mobility of its images. The outcome is an uncontrolled but silent rush of colour and form, "a depiction of a cosmic circulatory system as well as the firing synapses of a galactic mind", as Zinman puts it (2020: 1). In his poetological notes, Sistiaga himself strikes a similarly pathos-laden tone when he describes his film as an opportunity to "take the blindfold of rationalism off and enjoy the unknown" (qtd. ibid.: 2).

With its specific aesthetic of overtaxing, Sistiaga's film can certainly be seen as a model for our second example of automated avant-garde within contemporary experimental cinema, "Inspirationsquellen" by Eva (or Evi) Jägle, which, like Epstein's "Ghosts", celebrated its premiere in digital space: while Epstein's platform of choice is Vimeo, Jägle publishes her films on YouTube. Most of the uploads on the Viennese artist's YouTube channel "Einhorn sterberate" have a running time of a few seconds or minutes and bear such cryptically absurd titles as "Bergsonistisch gedichtig in Kapitelinhaltsverzeichnistigkeit", "Differenz und Wiederholung Vorwort. in ausufernder ausdrücklichkeitsannahme", or "Logik des Sinns. 6. Serie der Paradoxa: Über die Serialisierung".

As of today, the channel launched on 9 June 2016 has a total of 168 videos and just 68 subscribers. Most of the uploads are small digital gimmicks: Jägle uses graphics programmes such as Blender or Autodesk to create representational 3D scenes, then covers them with various effects using Adobe After Effects and, above all, lets a virtual camera circle around them, through them, over them. By means of an automated montage that randomly links the prefabricated objects, camera movements and image elements, this basic stock of audio-visual artefacts is constantly jumbled up in a deliberately messy way similar to Vadim Epstein's "Ghosts". Often, several designs seem to be combined with each other, i.e. several image levels are superimposed, mirrored in each other, resulting in a fascinating effect of overdetermination. Sometimes the tracking shots through Jägle's abstract-sterile digital worlds are accompanied by recitations of self-penned philosophical-poetic texts, sometimes they are accompanied by noisy soundscapes, sometimes they are silent.

So far, Jägle has only once constructed a longer video from all these individual building blocks, which give the impression of loose, still unconnected fragments or ideas: 21 May 2020 saw the release of her first feature-length film, the almost one-hour-long "Inspirationsquellen", which, with currently 39 views, is one of the most rarely clicked videos on her channel. With "Inspirationsquellen", Jägle challenges traditional visual media with such a high surplus of semantic meaning that logic, coherence and meaningful connections between the single images collapse defencelessly under the sheer flood of impressions. Since Jägle is currently doing her doctorate at the University of Vienna on the "Kinematographierung der Philosophie durch Deleuze", it stands to reason to understand her work as a practical execution of the rhizome metaphor developed by Gilles Deleuze and Félix Guattari in the 1970s. In her master's thesis from 2017, also dedicated to Deleuze, Jägle writes, as if trying to grasp her own video works, on the post-structuralist metaphor of the rhizome:

The rhizome wants to create as it develops, no structures are to be copied, the logic of the rhizome is to extend the lines of flight, to always find a way out that is as non-significant as possible, and a broken line can again become a point of connection or a singularity. If the crocodile looks like a tree trunk, it does not reproduce it but becomes it (translation A.K.; 2017: 96). <sup>2</sup>

Above all, Jägle's opus magnum "Inspirationsquellen" can itself be understood as a rhizome that cannot be looked at from above, that can never be grasped in its entirety, that can only be followed in its details to see in it solely difference and repetition. In the words of Jägle herself: "The extreme points are not representable, the absolutely solid and the completely in-self-reflected or moved (Hegel) are both not representable, the moment one wants to grasp them, they have disappeared" (2017: 3; my translation).<sup>3</sup>

The first thirty seconds of "Inspirationsquellen" are symptomatic of this strategy of visual volatility. On the soundtrack we hear an emotionless computer voice reciting a text presumably written by Jägle herself, which, to put it simply, seems to be about the lyrical self's struggle with the overwhelming side effects of the digital world: "Bildschirme legen sich über mein Gesichtsfeld. Auf meinen Augen Schwere. Wenn ich die analoge Welt betrachte, durchzieht ein Schleier meinen Blick, nicht sichtbar. Unterschwellig beginnen sanft die Augen weiß zu rauschen, wo doch der Horizont sie entlasten sollte". – "Screens cover my field of vision. On my eyes heaviness. When I look at the analogue world, a veil crosses my gaze, not visible. Subliminally, eyes gently begin to rush white, when the horizon should relieve them" (my translation). A geometric structure made up of several cuboids and cubes rotates falteringly under the superimposed film title. Describing the object in more detail fails

<sup>2</sup> "Das Rhizom will erschaffen, während es sich entwickelt, es sollen keine Strukturen kopiert werden, die Logik des Rhizoms besteht darin, die Fluchtlinien zu verlängern, immer einen Ausweg zu finden, der möglichst nicht signifikant ist und aus einer gebrochenen Linie kann wieder ein Anknüpfungspunkt oder eine Singularität werden. Wenn das Krokodil einem Baumstamm gleich sieht, dann reproduziert es diesen nicht, sondern wird zu ihm".

<sup>3</sup> "Die Extrempunkte sind nicht darstellbar, das absolut Gesetzte und das völlig In-sich-Reflektierte oder Bewegte (Hegel) sind beide nicht abbildbar, in dem Moment, in dem man sie erfassen will, sind sie verschwunden".

solely because the original 3D model has been alienated in terms of colour afterwards and because many of its components elude human language, since they are purely abstract objects. Two dolphins in dorsal view can be seen, as well as huge adjusting screws and gear sticks. Fragments of Japanese anime and kitten comics are projected onto the flanks of some cuboids. All of this is brought together according to an almost surrealistic combination aesthetic that brings to mind Lautréamont's famous dictum about the chance encounter between an umbrella and a sewing machine on a dissecting table (1973: 234).

At another randomly picked moment, at 9 minutes and 5 seconds, Jägle's voice reads several texts simultaneously on the soundtrack. In other words, the voice recordings have been overlaid in such a way that one can only clearly understand individual catchwords, at most a half-sentence: a Babylonian babble of voices that *in nuce* sums up the construction principle of the entire film. So many images, sounds, words and sequences of movements are piled on top of each other that their interplay evokes a rushing nothingness, the end of all semantics where its density is at its highest. On the pictorial level we are confronted with the same effect: even if one only wanted to describe the still image at 9:05 according to art-historical standards like a classical painting, one could spend days on it. The image content is reminiscent of a refrigerator door covered to the last corner with colourful stickers, postcards, private photos, where at some point one has begun to cover the already existing layer of pictures with a second, third, fourth one. At the bottom left, the artist's own eyes look at us in close-up. The Japanese cartoon character Sailor Moon can be recognised, at least her upper body, standing at right angles to the picture frame. At the top right, a goat gif bleats, next to a comic bunny holding a carrot. The Fanta logo can be made out, a fox, a butterfly.

Finally, let us look at the end of the film. At minute 58 and 20 seconds, you can hear on the soundtrack: "…dass ich gar nichts sagen kann. Das, was ich sagen kann, verfault schon in meinem Munde und ich könnte nie das sagen, was ich sagen möchte…" – "…that I can't say anything. What I can say is already rotting in my mouth and I could never say what I want to say..." (my translation), uttered by Jägle apparently in conversation with a second person, whose voice, however, is too far away from the recording device for one to clearly understand her rebuttals. This dialogue is accompanied by a calm ambient soundscape. Like the words, the images also fail (or rot): In the final shots of the film, just like Jägle's off-screen monologues, the images do not reach a satisfactory conclusion. Motifs layered on top of each other, interlocked, optically puzzling each other through accidental overlapping, repetitively running through the entire film, are what we encounter in the last seconds as well: The artist in a transparent full-body suit, smeared with paint, assuming strange poses, performing mysterious, potentially ritualistic gestures; a golden portal like from a vintage adventure video game, behind which the virtual camera races along a gallery of scurrying canvases; a hand leafing through a book, mirrored several times; a centrifuge slowly rotating around itself in the upper left of the image; a goldfish swimming by; the desktop background of Jägle's laptop, the trashcan icon prominently highlighted. The stream of images breaks off abruptly, Jägle denies us credits or any other kind of culmination or conclusion. Her film, like Deleuze's rhizome, is a labyrinth without end or beginning.

# **Bibliography**

Hageback, Niklas / Daniel Hedblom (2021): *AI for Arts*. London: Routledge.

Hein, Birgit (1971): *Film im Underground*. Frankfurt/Main / Berlin / Wien: Ullstein.

Jägle, Eva Maria (2017): "Deleuze' unausdrückbarer Ausdruck, der den Inhalt mit sich reißt. Das Unverfügbare der Differenz-Figuren verkoppelt sich zu einer Ausdrucksmaschine und bedingt das Disparate des Stils. Eine künstlerisch-strukturelle Wechselseitigkeit von Sprache und ihre Auswirkungen auf den Sinn". Master thesis, Universität Wien. https://phaidra.univie.ac.at/download/o:1334795 (15 December 2022).

Jolicoeur-Martineau, Alexia: "Meow Generator". Online: https://ajolicoeur.wordpress.com/cats/ (15 December 2022).

Lautréamont (1973): *Oeuvres complètes.* Paris: Gallimard.


# **Film- and Videography**


"Materialfilme II" (1976): Dir. Birgit & Wilhelm Hein. *Materialfilme 1968–1976.* Edition Filmmuseum München 54, DVD 2012.

# **Dear GPT-3: Collaborative Writing with Neural Networks**

*Jenifer Becker*

### **Collaborative writing with neural networks**

*Hello GPT-3! How do you feel about co-writing a novel with me? That sounds like a great idea! I'm excited to work with you on this project. What's your name? My name is GPT-3. What would your author name be if you were to write a novel with me? My author name would be "N.B. Mysterious". Would you mind picking an alias for me? My alias for you would be "GPT-3".*

GPT-3 and I engaged in this dialogue about four months ago. I wanted to develop a collaborative writing project, not only to collectively realise a novel, but also to negotiate poetological questions which might arise during the process. These inquiries, I hoped, would generate insights about the act of prose writing alongside artificial neural nets. The dialogue contains what I deem to be an eerie and impressive twist. If I were to follow the suggestion, I would henceforth be referred to as GPT-3 and GPT-3 would be known as N.B. Mysterious. The names already implied a performative entanglement between woman and machine, my chaotic consciousness and the stochastic black box that is GPT-3. Did GPT-3 need me at all? If so, for how much longer?

We are currently entering an exciting phase in which writing and technology or *writing with technology* is subject to grand changes. Journalist Josh Dzieza (2022) underlines: "AI writing has entered an uncanny valley between ordinary tool and autonomous storytelling machine. This ambiguity is part of what makes the current moment both exciting and unsettling". While Artificial Intelligence has quietly been normalized in medical or industrial sectors, neural networks imbued with the ability to write articles or even entire novels still seem to shake up the literary world. Pretrained language transformers such as GPT-3, which make up the codified core of the writing programmes Sudowrite or AI Dungeon, promise a future of exponential literary productivity in alliance with a literary market able to generate on-demand fiction based on individual preferences. If we take a closer look at the operational modes of those systems and review their literary output, specifically in terms of narrative prose, we have to accept that these promises of a possible literary future are neither fulfilled nor do they seem to be attainable just yet (cf. Roloff 2019). Narrative prose – novels, (auto)biographies, memoirs, all literary genres deemed to be vessels for a narrative – implies specific reading expectations such as a certain degree of coherence, an anthropomorphic character or a story (cf. Bal 2009). While artificial neural nets (ANN) already seem to operate as autonomous poets, being able to generate what might be considered genuine poetry, AI-generated prose such as *1 the Road* (Goodwin 2018) or *Dinner Depression* (Raffel 2022) consistently evokes the feeling of an author having indulged in heavy doses of psychedelics. Reading these books can be fun, but I guess that most readers – including myself – will lose their interest quickly. These novels lack coherence and oftentimes seem arbitrary because of the ANN's limited context windows.<sup>1</sup> Because artificial neural nets like GPT-3 are not yet able to write text that is received as complex narrative prose "on their own", it seems more productive to look for possibilities of generating text in collaborations between system and human. This is currently happening in the form of co-authorships (Allado-McDowell 2020, 2022; Amerika 2022). In this process, the neural network is fed with prompts and subsequently generates material that in turn might be processed by the human entity again – a creative cycle able to be sustained infinitely.

Engaging in this relationship of co-authorship with GPT-3 or other ANNs raises a multitude of questions: Which writing practices can be observed at this particular moment in time? How do text and text interact with each other? How do writing processes take place and what are their functions? And: Which poetological implications and concepts of authorship are involved? These questions are of particular relevance as both the artistic production of coherent prose with AI and the analysis of writing practices with AI are currently only in their early stages (cf. Bajohr 2022).<sup>2</sup> The questions asked above all deserve to be explored individually; in this essay, however, I aim to shed some light on those concerning writing practices with AI. To do so, I take a closer look at three distinct writing practices commonly used to generate narrative (AI) prose at this specific moment in time:

1) Interrogation, 2) Complete and 3) *Writers' Room* as a conceptualisation lab. I will also suggest how these methods could be built upon. The case studies I will

<sup>1</sup> Gwern Branwen (2020) points out that GPT-3 has no form of memory or recurrence and can only operate logically within a framework of about 500 to 1000 words.

<sup>2</sup> Building story machines has long been a research subject in the fields of AI and Linguistics (cf. Gervás 2009), the aim being an effective amalgamation of narratological and linguistic perspectives. However, it seems striking that this specific field of research only takes place in the context of AI research, while it has neither been accorded particular importance in the context of literary or art production, nor within literary and artistic economies.

draw upon are mainly built around K Allado-McDowell's work; in addition, I will also incorporate my own experiences of working with GPT-3 at OpenAI's Playground. Before exploring these collaborative writing practices in further detail, a brief overview of the methodologies of writing with neural nets and how this specific writing practice might be situated in the history of generative literature, as a theoretical base of further inquiries in writing practices with AI, will be given. A crucial distinction by scholar and author Hannes Bajohr (cf. 2020) – who negotiates the history of generative literature on the basis of two different modes of production from a media-technological perspective, categorized under the sequential and the connectionist paradigm – will form the foundation of this segment.

#### **Methods of generating literature: The connectionist paradigm**

The appearance of generative literature is as diverse as the methods used to generate the texts. "Generative literature" as a concept is nebulous, as it has already been discussed under the guises of computational, electronic or digital literature and as such carries within itself a history and poetological tradition of its own (cf. Hayles 2008; Rettberg 2019; Schönthaler 2022; Bajohr 2022). The methods of these specific literary genres range from Markov chains to algorithmic randomizations to Natural Language Generation based on datasets assembled of canonical, scraped or written texts (cf. Lamb / Brown / Clarke 2017; v. Stegeren / Theune 2019; Bajohr 2022; Linardaki 2022). When it comes to AI prose or narrative prose that is written with the assistance of AI, no specific means of categorization exists as of yet. To organise the various production methods regarding generative literature, scholar and author Hannes Bajohr (2020: 19) proposes to distinguish them from a media-technological perspective and summarizes modes of production according to two dichotomous paradigms: "the sequential paradigm of generative literature [...] employs linear algorithms, and the connectionist paradigm [...] is based on neural nets." Under the sequential paradigm Bajohr (2020: 10) groups texts that are "executed as a sequence of rule-steps". Production modes in the realm of the connectionist – the paradigm my research mainly revolves around – involve the employment of artificial neural nets with the ability to generate natural language. To put it briefly, artificial neural networks are computational systems that mimic the functioning of the brain, working with connected units labelled artificial neurons (cf. Yang / Yang 2014). ANNs that are able to generate natural language are trained via machine and deep learning. In literary contexts one could either train a system by feeding it a (small) self-selected dataset, use a pretrained one or choose to finetune a pretrained system.
Training one's own system allows total control over the data, but the output does not match the quality of language found in the currently available pretrained transformers, such as GPT-3. The autoregressive language model GPT-3 was released in 2020 by OpenAI and excels in particular due to its large training corpus.<sup>3</sup> Working with a pretrained system neither requires an understanding of code nor any in-depth knowledge about the inner workings, the ghost in the machine. Apart from material barriers such as access to digital technology, the internet and, to some extent, paid usage, the systems are now accessible to everyone. Given an initial text as prompt, GPT-3 will continue writing from that prompt. GPT-3 is furthermore able to write articles, summarize, flesh out paragraphs, translate and turn notes into full sentences. Still, GPT-3 is not yet able to write stories on its own, at least not on the scale of a longer, complex novel, which is, as I pointed out earlier, due to limited context windows. GPT-3 "forgets" its protagonists after a couple of sentences because the system knows neither what a protagonist is nor how stories are generally constructed; it lacks semantic understanding. GPT-3 only knows how to statistically generate natural language, yet on such a high level that the distinction between human and machine-generated language is oftentimes blurred beyond recognition. This play in the interstitial is what makes it so intriguing to engage with GPT-3 in the form of a conversation.
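To make the notion of a limited context more tangible, the following minimal Python sketch implements a word-level Markov chain, one of the older methods of generative literature mentioned above. It is offered only as a loose analogy and not as a description of how GPT-3 works: the chain continues a prompt by looking solely at the last two words, so anything that lies further back is, in effect, "forgotten". The function names `build_model` and `generate` as well as the toy corpus are hypothetical and chosen purely for illustration.

```python
import random
from collections import defaultdict


def build_model(text, order=2):
    """Map each run of `order` consecutive words to the words observed after it."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        model[state].append(words[i + order])
    return model


def generate(model, seed_words, length=20, rng=None):
    """Continue `seed_words`; each step sees only the last `order` words."""
    rng = rng or random.Random()
    order = len(next(iter(model)))  # context size, read off the stored states
    output = list(seed_words)
    for _ in range(length):
        state = tuple(output[-order:])  # everything before this is ignored
        followers = model.get(state)
        if not followers:  # unseen context: the chain cannot continue
            break
        output.append(rng.choice(followers))
    return " ".join(output)


# Toy corpus (an assumption for illustration, not taken from any cited work).
corpus = (
    "the ghost walks through the film and the film forgets the ghost "
    "and the ghost walks through the machine and the machine forgets the film"
)
model = build_model(corpus, order=2)
print(generate(model, ["the", "ghost"], length=12, rng=random.Random(0)))
```

Because every step depends on two words only, the output tends to be locally plausible while drifting globally. A transformer such as GPT-3 conditions on a far larger window and on learned statistical representations rather than a lookup table, but the basic consequence of a bounded context, that earlier material drops out of view, can be observed even in this toy setting.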

# **Dear GPT-3, what is an angel? Interrogation (of the self)**


*Goldhorn 2022, translated by DeepL under my supervision.*

Currently we are witness to a pluralization of documented conversations with GPT-3.<sup>4</sup> Talking with pretrained transformers has become a unique literary genre within the field of electronic literature. The topics range from discussions about

<sup>3</sup> The entire online encyclopaedia "Wikipedia" accounts for only 3% of the total text corpus being used to train GPT-3. The largest part of the language data set consists of a common crawl dataset (cf. Katzlberger 2021).

<sup>4</sup> During my research, I noticed that the demographics of authorship mirror roughly that of the tech industry, as it appears to be predominantly male.

the effects of the Covid pandemic to the future of cryptocurrency, Frank Herbert's sci-fi novel *Dune* (Ouimet 2020) or ways of preparing tea (William 2022). Even established writers are increasingly drawn towards these systems, interrogating them, creating poetic works, demonstrating their own literary originality by dreaming up ever more inventive prompts. I too started by asking GPT-3 questions when I first joined the OpenAI Playground. At first, I would inquire about trivialities – *what is the population of Tokyo? Who is Angela Merkel? –* but quickly moved on to demand answers to increasingly abstract and complex issues – *what does it mean to die?*

Arguing from a storytelling perspective, we could categorize these conversations, or the process of holding these conversations, as a writing practice that aims to create a story with two characters engaging in an interrogation. These interrogations have different intents and functions; the most popular conversational intention might be to question artificial language intelligences about their self-understanding and to explore the boundaries of their knowledge. Conversations such as *Das Flüstern der KI* between author Marius Goldhorn and GPT-3 (2022), which I quoted at the beginning of this passage, demonstrate the limits of AI consciousness while simultaneously upholding said illusion of consciousness. The conversation fabricates the idea that one is confronted with a conscious entity able to think freely while communicating on the basis of natural language. Yet we have to take into account that the prompts always have a direct effect on the responses of the system: we could think of the prompting process as the creation of a (fictional) character that we manipulate to a certain degree with our intentions as authors. Every prompt directly influences the output; only the ways in which it does so are obscured.

To a certain degree, the exposition of an AI as a codified system with clear boundaries of "thought" always carries in itself a comical element, since this process inevitably involves a joke or a punchline. This raises questions as to how jokes function – they need a specific context, a context which is bound to change rapidly in the age of information. How long will we laugh at neural networks talking about the existence of angels on the internet? A rather politically motivated approach, it seems to me, is the exposition of biases, as artists Ethan Plaue and William Morgan show in *Secrets and Machines: A Conversation with GPT-3* (2021). In their case, prompts intend to commence a self-reflexive dialogue about knowledge production and mechanisms of concealment in the context of a critical post-colonial discourse.

Other interrogational approaches propose to consider AI as a conversational partner at eye-level, where the AI equally serves a mirror function for auto-interrogation. This self-questioning is done either with neural networks trained with selected datasets (Kuhn 2021; Amerika 2022) or with pretrained language transformers such as GPT-3. An example of the latter would be K Allado-McDowell's *Pharmako AI* (2020). *Pharmako AI* is a conversation between Allado-McDowell and GPT-3 about spirituality, ecology, poetics and transcendence. GPT-3 appears as an entity that has been given writerly agency. Artist Irenosen Okojie (2020: X) emphasizes how difficult it is to draw a dividing line between human and AI: "Prompts and responses are so deeply profound, so poetic and wise, it produces transcendent, multi-pronged consciousness". In *Pharmako AI* a clear, formal distinction between human and AI is maintained by the usage of different fonts for each. Additionally, we also have to take into account that narrative seeds are always planted by Allado-McDowell themself. However, there is a discernible effort to give GPT-3 space as an author: form and grammatical errors were subsequently corrected in *Pharmako AI*, but the text remained unedited beyond these cosmetic alterations (Allado-McDowell 2020: XI). Poetological programmatics such as those of Allado-McDowell locate potential in an equal collaborative act. In this case working with ANNs is never reduced to a mere functional human-tool relationship. The aim of the ongoing writing process is to create an interrelated cycle of prompt and output, in which a new, bilaterally affected vocabulary is developed. Josh Dzieza (2022) suggests that writing with AI in this way offers the possibility of creative thrill while posing novel questions of influence and control – as every collaboration does.
Conversations offer a fruitful formal framework for this kind of co-authorship: the genre allows for possibilities to relinquish the illusion of agency by the creation of two or more (fictional) characters engaging authentically with each other, while simultaneously allowing the neural net to develop its presumably own identity, endlessly shaped by the author's prompts. – I wonder: *Is talking with GPT-3 always an act of narcissism?*

# **Complete**

After several conversations with GPT-3 about the essence of life, AI poetics, and my very own career trajectory, I started having texts completed by GPT-3. GPT-3 is not the first or only language transformer to offer the completion of a text in a meaningful way – at least within the confines of the given context window – but at the moment, it remains the most capable at picking up style, sentence structure and codes, thereby copying the individual tone of the prompt. The process is quickly explained: a text fragment serves both as prompt and as the initial fragment of the final outcome. My first experiments were based on excerpts from my dream diary.

I am in a hotel on a Caribbean island. I suggest that we book a ticket for the sauna facilities. At the reception I notice that the sauna is included in the price of the overnight stay. You can access the pool from the lobby, there are several pools with hot thermal brine. I feel the need to put on my bathing suit and warm up first. I wear a black swimming suit and a bathrobe. I get a shoulder bag with a towel and can move around freely. I go into the sauna and see that there are no rooms, just one big room with different pools. There is also a steam sauna, but I don't go in. (Becker 2021)

Some of the sentences are written by me, some have been generated by GPT-3. I have not modified the output. It is remarkable how well GPT-3 adapts to my style – I couldn't tell where I end and where GPT-3 begins. Outputs that contain glitches, especially loops, can be classified more explicitly, as you can see in the following example.

The area is swampy, the green meadows are the surface of bogs. Working in the factory is dull, I work on the assembly line peeling tangerines. I do my next shift at a site near Kiel. Here the factory looks more like a harbour, the work is more strenuous, I have to carry heavy crates of citrus fruits. The next shift is at a factory near Hamburg. Here the tangerines are bigger, the work is more strenuous, I have to cut them into smaller pieces. The next shift is in a factory near Bremen. Here the tangerines are smaller, the work is easier, I just have to cut them into smaller pieces. The next shift is in a factory near Hanover. Here the tangerines are bigger, the work is harder, I have to cut them into smaller pieces. (Becker 2021)

I cut off the loop at the end; it would have ceaselessly self-replicated, moving ever southward, like a cursor on Google Maps. The setting is shifted from Hanover to Dortmund, Essen, Duisburg, Düsseldorf, Cologne – the size of the local tangerines oscillates between bigger or smaller than before, the workload hovers between increased laboriousness and relative ease. The sentence structure is repetitive and reveals a certain aesthetic of AI language that would be worth exploring more profoundly. Examining these loops, it seems obvious when GPT-3 takes over. – Next question: *Is it relevant at all to expose where I end and GPT-3 begins?*

Mike Sharples, professor of educational technology specialising in writing, cognition and AI (cf. 2022: 150), compares the status of contemporary literary production with AI to the production of music in the 1970s. If you wanted to increase the reverberations of a piano, you had to record the piece in a church or another similarly sized room, up until plate and spring reverbs became widely available. Nowadays, reverb can be added to any sound within a matter of two mouse clicks and adjusted to rooms of unimaginable size. Applying this to literature, sentences could, for example, be rewritten in different styles, and descriptions could be made to sound more frightening or funny by inserting adjectives or modes of observation. This is already possible with programmes such as Sudowrite, but we can assume that fiction writing programmes will improve massively and ultimately become normalised in literary practice. At some point, it won't matter which passages are written by a human or an AI – it will only be of importance if the text works as a text and satisfies specific aesthetic parameters. Media artist and programmer Ross Goodwin (2016) emphasizes that writing computers will no more replace us as writers than pianos have replaced pianists, "in a certain way, they become our pens, and we become more than writers. We become writers of writers". Goodwin outlines a specific authorial understanding that diverges from Allado-McDowell's human-machine fusion: Goodwin sees himself as a curator, while the neural network merely serves a writing function. AI thus becomes an elongated pen, the computer replacing the typewriter.

An example that can be used to visualise the writing process of *Complete* in the literary field is *Amor Cringe*, the second book by Allado-McDowell (2022): "Half traditionally-written and half AI-generated, Amor Cringe is a 'deepfake' autofiction novelette about a TikTok influencer that seeks God, created with the intention to be 'as cringe as possible'" (Deluge Books 2022). While Allado-McDowell makes a formal distinction between themself and GPT-3 in *Pharmako AI*, there is no formal marking in *Amor Cringe*, nor is there any reflection on processes of creation or revision embedded in the text itself. *Amor Cringe* would be included in the fiction section of bookshops: the novel tells a story, it has a largely coherent plot and a main character whom we follow through certain events that unfold temporally and spatially. As I already pointed out, generating complex narrative prose with AI still requires interventions on the level of fabula (story) as well as on the level of discourse. *Amor Cringe* appears to be largely coherent because we can assume that the author has implemented (plot-)relevant text passages – we could also call them plot points in the broadest sense – as prompts. Furthermore, it is crucial to recognize that arbitrariness is rendered a conceptual narrative key in *Amor Cringe*. Dreamlike sequences, excessive descriptions or descriptions that seem hallucinatory are part of the story's internal logic. That is why GPT-3's – as I would call it – *writing style of arbitrariness* blends neatly into the plot and doesn't sound off. The guiding principle of *form follows function* appears here as a necessary working strategy in order to secure the story logic against violation by glitches.

The collaborative method of *Complete* allows for the integration of one's own style as well as permitting control of the plot's course. If the plot becomes frayed, if main characters are eliminated by GPT-3 or unwanted ones are created, one can always interfere by setting turning points as prompts. Thinking further, narrative patterns could be used to generate conventionalized literature, for example, a narrative following the Hero's Journey. It would be worthwhile to further analyse the use of story schemes and patterns in co-working processes with ANNs and to make use of formalistic and narratological story grammatologies, as is common in computational linguistics (Gervaz 2013). Initial efforts at a more precise analysis in a literary context are being made by computer scientists and linguists Judith van Stegeren and Mariet Theune (2019), who have been analysing the generation of narrative coherence in works of NaNoGenMo. Integrating satellite sentences or hardcoding story arcs are two fruitful methods to create coherence in the observed works.

# **Division of labour in the** *Writers' Room***: Conceptualisations with AI**

The third collaborative practice that I want to draw attention to operates at the conceptual level of writing. I would like to discuss this collaborative relationship using the term *Writers' Room*. Literary writing is still primarily framed as an individual act and thus highly romanticised, whereas stories in other sectors are usually developed as collaborative projects – think movie and gaming industries. A collaborative process of conceptualisation with neural networks could be broken down into three development segments: 1) setting, 2) character development and 3) plot development. These narrative sections could be fleshed out in dialogical form, with concepts emerging in conversation with GPT-3 or in tandem with specialized fiction or screenwriting programmes. I would like to illustrate this method with an example. When I started work on a novella about a chick lit writer who wants to interview Hugh Grant to boost her career, I tried to construct the story outline and develop certain characters in collaboration with GPT-3. N.B. Mysterious pitched several title suggestions until I finally settled on *The Grant Interview*. N.B. Mysterious subsequently set the name of our protagonist to be "Lila Saunders", while determining the locations to be Chelsea (UK) and Torekov (Sweden). After we had decided on the foundation of the novella, I assigned GPT-3 to work out individual scenes. Here is a brief example of an early pitch I received:

Lila wakes up in her flat in Chelsea. She stretches and yawns as she gets out of bed. She heads to the kitchen to make breakfast, thinking about her interview with Hugh Grant. She is nervous and excited about the interview and hopes that it will help her career. She takes a deep breath. (N.B. Mysterious)

The development of plot, setting and characters does not necessarily require a natural language model. There are various plot machines or character generators on the internet that generate random outputs. They are based on randomized text passages; there are no neural nets at work here. However, specialized writing programmes based on neural networks that combine conception and text generation, such as the fiction writing programme Sudowrite, do exist. Sudowrite is based on GPT-3 and operates as both a generator of ideas and a writing assistant. The programme is able to rewrite prompts written in a mode of telling into scenic descriptions of showing. Sudowrite can also generate descriptions for places or characters or write entirely new passages. Independent author Jennifer Lepp (2021) provides detailed insights into her writing practice with the AI-powered writing programme on her blog. *Magic's a Hoot* (2021) was the first novel Lepp wrote in collaboration with Sudowrite, using the alias Leanne Leeds. Lepp's working process appears highly operationalized and is streamlined for maximum output. Tasks are delegated between programme and author; conceptualisation and revision processes follow a consistent pattern. Lepp writes two urban fantasy series on Amazon and by now publishes 10 books a year. In an interview with *The Verge*, she admits that – especially in the initial phase of experimentation with Sudowrite – she grew increasingly distanced from her own story and sometimes lost access to her characters or logical connections (cf. Dzieza 2022). It illustrates that writing with programmes such as Sudowrite requires a balance between delegation and personal responsibility. However, Lepp's labour practice points to a future where writing – especially in the entertainment sector – will no longer be an individual act, but a highly professionalised collaborative process with AI-based writing programmes.

# **Choices, Choices, Choices**

Even though language transformers such as GPT-3 are highly functional on several levels, a complex novel at the push of a button remains a fantasy. When it comes to writing narrative prose, we are still in need of authors or at least curators to rearrange sentences and story parts. Nevertheless, there are productive methods, which lie particularly in collaborative writing practices with artificial neural networks that are able to generate natural language. After I experimented with GPT-3 in various ways, I found an abundance of creative potential in the various possibilities of collaborative work between artificial and organic neural networks, which I unfolded by discussing three collaborative writing processes. Here I have to agree with sound artist Holly Herndon (2019), who highlights that the "ideal of technology and automation should allow us to be more human and more expressive together, not replace us all together". Still, there are a lot of hurdles to be faced on the journey to readable narrative prose created in collaboration with generative technology – producing tonnes of nonsense or, what I found most problematic, dealing with biases. Pretrained systems reproduce already existing power structures written into language and images but also into digital economies driven by interests of capital, identities, individuals, industries. Development, training and maintenance of AI additionally remain a resource-heavy and energy-intensive endeavour (cf. Crawford 2021). It is necessary to remain critical, since tens of thousands of developers incorporate GPT-3 into their own apps and over 4.5 billion words are generated daily – as OpenAI (2021) announced on their blog. We can assume that these numbers have increased drastically by now and will only continue to grow. Additionally, by the time this essay is published, novel and improved language models will be at work and perhaps even the subject of this work will have become entirely obsolete. Last but not least, there is the problem of choice: N.B. Mysterious and I have not yet written the chick-lit novel *The Grant Interview* because I've had such a hard time deciding on an output. Being at OpenAI's Playground is a little bit like looking for clothes online or scouring a dating platform for someone to engage with: after you have got some plausible title suggestions like *The Chick Lit Guy* or *The Grant Interview*, something better always feels in reach. Perhaps it is worth setting up a set of rules. Experiments could involve constraints similar to those which the Oulipo submitted to – I am thinking of Georges Perec's lipogrammatic novel *La Disparition*, in which the letter E does not appear once. – What would a collaborative novel written with GPT-3 look like if one always had to take the first output?

*Dear N.B. Mysterious, what do we want to write about?*

*We could write a novel about anything you want. Maybe we could write a novel about two people who fall in love against all odds.*

#### **Bibliography**


# **AFFIRMATIVE – REJECT. With and Against AI<sup>1</sup>**

*Mattis Kuhn*

*HARRIET: "NEGATIVE.. NEGATIVE..." Harry: "She won't find a new ending, Kathy." Katherine: "Proposal two: Lover walks out on Eme." HARRIET: - REJECT..... Harry: "Reject. You see I told you." Katherine: "Explain!" Harry: "She'll cross check her plot memory, find out we used it in Dick Slocum." HARRIET: - WALK-OUT DEVICE USED 4 MONTHS AGO / DICK SLOCUM VIOLATION 6- Harry: "You can't win." Katherine: "The hell I can't. Proposal three." HARRIET: - REJECT..... Katherine: "Same as two, but change cover illustration. Eme alone on beach, appears nude, watching."*

<sup>1</sup> This text is based on an artist talk of the same title, which took place at the conference on which this volume is based. In it, the topic was illustrated by the author's own works »Selbstgespräche mit einer KI« (Kuhn 2021) and »Grasslands for Insects« (Kuhn 2022). Since it seems inappropriate for the author to write a text about his own work for this conference volume (without thereby denying autoethnographic methods – especially in this context – their relevance), a new text is produced instead based on the artist talk.

```
HARRIET:
      - AFFIRMATIVE
Katherine:
               "At last. I just wanted to win once."
HARRIET:
      - SALES PROJECTION SOLID
      IN TOKYO . PARIS. MOSCOW.
      MOSCOW. SHANGHAI. KAMPALA
      SMASH HIT IN BERLIN. BONN
      OK ALL OTHER INHABITED
      AREAS.
Katherine:
               "Did it make sense to have a machine named HARRIET, writing novels for us? Are we
      too exhausted from building you to make up our own stories?"
HARRIET:
      - AFFIRMATIVE
Katherine:
               "Fuck you, Harriet."
HARRIET:
      - UNFAMILIAR TERM.
      REPHRASE
```
# **Role models**

The question of the relationship between man and machine – in the above conversation between a writer and a computer (Tavernier 1980) – is as old as machines themselves. Through the technologies of artificial intelligence, it is currently being posed anew, and now to art in a particular way. The production of art is commonly regarded as a genuinely human ability that essentially distinguishes us from machines. Now this supposedly last hurdle is also to be taken by the machines. In recent years, art (history) has been used for many projects in which capitalist interests in the form of goods or campaigns for technological progress have been the primary focus.<sup>2</sup> For example, we can receive (and buy) pictures of ourselves in the style of famous (canonical) artists, compositions of deceased composers are computed or robots become painters.

<sup>2</sup> In this respect, the topic of automated creativity could also be explored. Hanno Rauterberg (2021: 57)\* exemplifies this: "These dystopias [in which machines have "risen to world power"] are countered by the project of an artificial creativity with a resolute counter-program: it invokes the humanist heritage, the history of art, and wants to perpetuate it. It seeks the new in the old, in the good traditions known worldwide and appreciated by an educated public, which it wants to develop further and at the same time overcome. Whereas machines have always been the object of narratives up to now, the art code promises that they will become subjects of their own stories, which at the same time make the stories of humankind productive for themselves. They are supposed to be like artists: they remain in the echo chamber of culture, in the realm of free play and inconsequential imagination, and are nevertheless, or precisely because of this, able to think up the unthought-of and imagine the unplannable. They embody both: the beautiful, harmless superstition and the radiant power of utopia". (Quotes denoted with \* have been translated by the author.)

But human-machine relationships are also explored from within art, thus by primarily internal actors. To grasp the relationship between artist and machine, role models are often used: "Creative Partner", "Creative Author" (Nakotte 2021), "Design Companion" (McCraith 2020), "Ensemble Member" (Herndon 2019), "Artist", "Artificial Muse" (Lipski / Birds on Mars 2017), or "Assistant" (Kuhn 2021). In addition, the machines are often given proper names: "ELIZA" (Joseph Weizenbaum's chatbot), "Racter" (text generator of "The Policeman's Beard is Half Constructed"), "Benjamin" (text generator for "Sunspring" by Ross Goodwin and Oscar Sharp), "Spawn" (Holly Herndon's software for her album "Proto"), "A.I.R." (Roman Lipski's "Artificial Intelligent Roman") ... the list could be continued easily.

This already shows that it is not a meaningless relationship with a tool that is not worth mentioning, but rather a social relationship. Do different role models also lead to a different way of dealing with the machine? In the field of development, it may well be that the machine is developed differently depending on which (social) role is associated with it. If a machine is to give the impression of a conversational partner with consciousness (Lemoine 2022), it must be developed accordingly. It can be assumed that artists behave differently with their machine, depending on the role they ascribe to it. In any case, the roles generate different notions of what we are dealing with in AI / machine learning. For the recipients, depending on which role of the machine is communicated to the outside, other ideas emerge about how they worked together and what machines are capable of and what they are not.

#### **With and Against in Human-Machine Assemblages**

Role models define hierarchies. Social relationships go beyond the one-sided hierarchical relationships defined by "tool", "assistant", "muse", and so on. "Partner" suggests cooperation at eye level, but in fact it is difficult to realise with an entity that knows nothing about this partnership. As is well known, present machines cannot reflectively jump out of their actions, whereby there is also no outside of the machine – the world. Technically, then, present systems of machine learning remain tool, machine, or medium, despite their astonishing cognitive achievements.<sup>3</sup> The role models used, however, show that these machines have acquired a meaning for us (and not vice versa) that goes far beyond this.<sup>4</sup> Role assignment is therefore less about technology than about our own self and world determination by means of identification and distinction.<sup>5</sup> The role models, however, obscure the fact that technology is a part of us that we cannot consider completely detached at all. We are already too intertwined mentally and socially. For example, for some people the (temporary) separation from their own smartphone feels like the absence of a body part. This is hardly surprising, given that the living environment is shifting to the virtual and that without technology we lack the perceptual and action apparatuses for this environment. Andy Clark and David Chalmers (1998: 8) define the connection between humans and their cognitive extensions as a "coupled system": "[...] the human organism is linked with an external entity in a two-way interaction, creating a *coupled system* that can be seen as a cognitive system in its own right". If one part falls away, the system breaks.

With the increasing merging of humans with technology, it is thus also a bit strange to work with role models: "A part of me is my creative partner." In the following, I would therefore like to describe this relationship between human and machine in artistic production processes not by means of role models, but by the figure "with and against". The "with" stands for the identification and agreement of the human being with the machine and the artifacts calculated by it, the "against" for the difference and the contradiction towards them. "With" and "against" do not map neatly onto pro and con: both terms can be positive or negative. The text operates less on an analytical level than descriptively, along (artistic) practice: how do we behave with and against AI, how do we act in dealing with AI? It is not only about a more accurate description of current conditions and a better understanding of present technology, but even more about an attention to future action in human-machine assemblages.

<sup>3</sup> "It requires »getting up out« of internal representations and being committed to the world as world, in all its unutterable richness" and "The system must not only be embodied and embedded in this world; it must also recognize it *as* world". (Smith 2019: xiii and 105)

<sup>4</sup> "Most of the computational systems we construct – including the vast majority of AI systems, from GOFAI to machine learning – represent the world in ways that matter to us, not to them. [...] [Computers] have power in our lives, [...] they matter to us. What limits them is that, so far, nothing matters to *them*". (Smith 2019: 108)

<sup>5</sup> Perhaps that is precisely why art lends itself to an examination of AI. Art, even without AI, has always been about self-determination and world-determination.

#### **Extension through technology**

One of the goals in designing and implementing technology is to extend our cognitive abilities. A canonical artistic project in this regard is "Signwave Auto-Illustrator" by Adrian Ward (2000). By adding generative processes, he extends the tool palette for working with vector data known from Adobe Illustrator, which itself already extends the design competence of the user. Ward's software is also uncontrollable to a certain degree, which can give the impression of machine autonomy. The program not only does what it is asked to do, but goes beyond it, which also puts it beyond the control of the user. "In opposition to the familiar and reliable functions in Adobe Illustrator, the functions in Auto-Illustrator are not only strange, bordering on the nonsensical, but also partly uncontrollable. The software tool more or less autonomously generates various effects of surprise and randomness in the design process". (Transmediale 2013) This points to the influence that tools, especially software, exert on our design (and on the world in general) through their guidelines and restrictions, and how they are conceived, constructed, and used by us as function-oriented tools. On this point, see Trogemann (2020).

In everyday use, this authorship of software is usually forgotten or marginalized; in "Signwave Auto-Illustrator" it emerges. Of course, the resulting artifacts rely on the interaction of the user with the software and would not emerge without them. But it is obvious that authorship does not lie solely with them; Ward co-inscribes himself in the artifacts through his software. According to Ward, it is "naive [...] to ascribe autonomy to a machine program because it executes the agenda and inscribed subjectivity of its programmer [...]". (Cramer 2011: 283)\*

#### **Working against technological development**

Software enables us to do things that would not be possible and often unthinkable without it. This is especially evident through the developments in machine learning in recent years. However, extension has its price in the dependency on these extensions and possibly their authors, etc. (In relation to Adobe software, this is only now becoming really clear through its business model of the "Creative Cloud".) Art produced by machine learning is to a strong extent conditioned by technological developments. On the one hand, this has always been true specifically for technology-heavy art (media art), but on the other hand, the effects have been much more widespread in recent years. There are several reasons for this. Following on from Rauterberg's considerations mentioned above, technology corporations, as well as university research institutions, have chosen traditional art genres and canonical works to demonstrate algorithmic capabilities. In addition, ML applications have become so accessible that no programming is required at all to work with them. In these cases, however, it becomes all the more difficult to work against them. On the contrary, it can be observed that many designers and artists try to keep up with technological developments by permanently using the latest tool and producing with it. Therefore, the question is justified whether it is not us who are automated by technological developments. Especially in art, it has to be clarified whether one wants to constantly yield to this pressure of innovation by working with the latest tools and adapting one's own practice and expressiveness to external technological developments, with which completely different goals are pursued. Instead, it might be possible or even necessary to pause, to reflect, but also to thoroughly explore the current state of development.

# **Acting with other perspectives**

Machine learning based on Big Data works with the perspectives of others. Image models are based on thousands of photographs, usually taken by people (or machines) other than those who develop and use the models. See, for example, the artistic explorations of the conditions under which datasets are created by Philipp Schmitt (2019), Elisa Giardina Papa (2020) and Adam Harvey / Jules LaPlace (2021). Of course, many perspectives are also neglected or excluded in the process; see the "Feminist Data Set" by Caroline Sinders (2017) and "The Library of Missing Datasets 2.0" by Mimi Onuoha (2018).<sup>6</sup> Nonetheless, an artificial neural network developed in this way can broaden one's perspective.<sup>7</sup>

The different external perspectives can be particularly evident when working with large-scale language models. Even though these are still based on very exclusive data sets, it is possible to infer the origin of generated texts on the basis of stylistic differences. See the different text types on the same topic in Kuhn (2022): All texts have in common that their topic is "Grasslands for Insects". Some are written in the form of a short scientific text or an abstract, others from a first-person perspective or in the style of an advertisement or a short narrative. The model (GPT Neo) was trained with the dataset "Pile" (Gao 2021). This is composed of 22 smaller datasets and consists of 825GB of text material. This makes it far from a manageable size for a human reader. It includes fields such as academic texts, Wikipedia, prose, dialogues from chats, movies and forums, computer code and some other sources.

<sup>6</sup> For an overview of the artistic explorations of datasets and biases, see Arns (2021).

<sup>7</sup> At this point, it is not about an extension to an extra-human perspective. Machines do perceive the world differently from humans, but this does not yet necessarily lead to an extrahuman perspective. We construct and train the machines, implement our world into them, insofar as it can be formalized, and then evaluate the computations so produced from our human perspective.

These different perspectives, which express themselves not only in terms of content but also stylistically, are transferred into a common language model, which in principle creates a polyphony in that language model. In combination with one's own perspective, this can positively result in an extension to external perspectives, i.e., multiple, open authorship. In dealing with algorithms that have internalized statistical correlations of the dataset, a (limited) polyphony emerges.

In turn, dealing with (generative) algorithms can also lead to the loss of one's own voice:

At the beginning, the algorithmically generated strings did not have the very highest meaning for me. I obviously realised that they were not my thoughts. They did not originate in me, they are external to me. Nevertheless, they are partly inspiration for my own thoughts. However, it turns out that the machine, which is supposed to complement, expand, open my thoughts, sometimes leads to speechlessness on my part. I rely on the intelligent typewriter. By outsourcing myself and my abilities into the machine, my abilities without it decrease. My efforts to develop thoughts myself diminish. Primarily I look for suitable suggestions from the machine. (Kuhn 2021: 114)\*

Thus, working with AI usually means working with external perspectives, whose weight sometimes outweighs one's own perspective. On the other hand, this can create more openness. The role models (such as "partner" and "companion") suggest singular entities, but is that accurate? Just as Brian Cantwell Smith (2019: 5) speaks of "synthetic intelligence" rather than artificial intelligence, the term synthetic entity might be more fitting to the multiple perspectives.<sup>8</sup>

# **Deciding against calculations**

Machine generation results in a shift from creating to evaluating. Compared with artifacts produced by hand, judging the artifacts takes on greater weight than making them. One form of maturity can be to decide against calculations. This is especially the case outside of art. Even the program "ELIZA", rudimentary from today's point of view, instilled a great deal of trust in its users through dialog. Today, the whole world only functions if we put great trust in our programs. Precisely for this reason, however, the ability and willingness to question the calculations is becoming increasingly important. Neil Perry, Megha Srivastava, Deepak Kumar, and Dan Boneh showed in their study "Do Users Write More Insecure Code with AI Assistants?" (2022) that programmers (referred to here, interestingly, as "users") are significantly more confident in their code when it is written in co-authorship with an AI. Conversely, while programs written without AI left the programmers feeling less secure, they also resulted in factually more secure code with fewer vulnerabilities. Even though this study is only representative to a limited extent due to its small size and the lack of a broad enough sample, it can exemplify our behavior towards machine-generated results. In co-authorship it is easy to give away responsibility.

<sup>8</sup> According to this, Ward's statement that it is about the subjectivity of the programmer is a bit too reductive.

Art is characterized by its need for interpretation. It wants to be questioned. Precisely in this, it can be a training ground for questioning (of calculations). In the contemplation of art, we automatically find ourselves in the mode of critique. Juliane Rebentisch (2022: 165)\* describes critical thinking (there specifically in relation to the formation of common sense through the judgment of taste according to Kant) with Hannah Arendt as a form of "habitualization":

One must be practiced in it, one must be "used to" not simply accepting things as a given, one must have experienced that they can also be looked at differently and that a corresponding examination can lead to other results, other judgments. One must, in other words, have formed the habit of opposing habits, of opposing what is natural to judgment, what has become automatic to it. (Arendt 2018)

Smith, while not interested in comparisons between humans and machines, does draw a distinction in judgment. Machines do not currently have judgment capability, only "reckoning". (2019: xvii) In his view, it is problematic if we use systems that are only capable of reckoning for judgment tasks. (2019: xix) As long as machines are not capable of that, it is our obligation to judge computations. This means we must consciously decide for or against computations (and be able to do so). In terms of the social relation, this means that, at least when responsible interaction is required, it is a hierarchical relation in which we decide about machines.

# **Working against AI or engaging with machine otherness**

Hannes Bajohr (2022: 174)\* begins the chapter "No Experiments. On Artistic Artificial Intelligence" with a comment on Daniel Kehlmann's report on his experience of writing with a pre-trained text generator, which leads Kehlmann to the conclusion that the collaboration failed:

So it is quite possible that it was not Artificial Intelligence that failed literature, but Kehlmann who failed Artificial Intelligence – and thus somehow also failed literature. Because in his juxtaposition of fully-fledged "artistic work" and mere "experiment", it becomes apparent how little it occurs to him that one can, or perhaps even must, make literature with machines *differently*, instead of letting them jump over the sticks of one's own poetics. Thus, the aberrations and absurdities that CTRL spits out are obviously *bug* to him, not *feature*. What literature is and what aesthetics it has to follow is clear from the very beginning.

Kehlmann's attempt is a form of co-creation with a machine that resists the machine's otherness and seeks in it primarily a machine double or like-minded entity. This can, of course, lead to something. Attempts to create a machine Rembrandt or to generate new compositions by deceased composers can provide insights into their work. However, AI serves here more as an instrument of analysis than as a tool of synthesis. Generative potential unfolds rather when the otherness of machine perception<sup>9</sup> and production is affirmatively incorporated. Kathrin Passig (2021: 129)\* compares her activity in the process of generative literature to gardening: "Producing generative texts has much in common with traditional gardening. In both cases, one shapes the initial conditions and designs the process, not the outcome. And in both cases, the results – if all goes well – are a surprise".

So a certain openness to the machine's creations holds more potential for novelty, for surprise, for otherness, even in terms of aesthetic challenges for recipients. Accepting the outputs of the AI means relinquishing authorship. The role models mentioned at the beginning of this article are indicative of the fact that machines are granted a higher degree of authorship than is the case with conventional assistance tools. Nevertheless, in many projects the desire to implement one's own vision and to trim the machine down until it is reduced to one's own perspective prevails.

# **From the perspective of the machine**

Now the question arises whether we should in return empower machines to the extent that they are also in a position to behave "with and against" humans. From the perspective of the machines, it is worthwhile to cooperate with us. They need us to construct, execute and, if necessary, attribute automated creativity to them. They themselves are not capable of distinguishing creative action from non-creative action. For example, if we consider a move in chess or Go (both of which can be formalized without loss) to be creative because it is completely novel and outside of our previous thinking, we can assume that this move is as consequential (normal) from the machine's point of view as all other moves. The creativity is not registered.

<sup>9</sup> For example, Johanna Reich (2018) and Dries Depoorter (2018) use facial recognition algorithms for their works »Face Detection« and »Face Detected«, respectively, to have the completion of plastic faces made of clay determined by machine. The artist duo Shinseungback Kimyonghun (2018) takes the opposite approach in their work »Nonfacial Portrait«: commissioned painters produced portraits of people with the condition that facial recognition algorithms do not register faces in the paintings.

The "against" arises in the sense of the question: do we need machines (or humans) that extend us in the sense that they do not only serve our ideas and thought patterns, but also counteract them in order to break them up? Currently, we primarily find self-mirroring (Arns / Hunger / Lechner 2022). We create copies of ourselves ("Holly+" by Holly Herndon, "Artificial Intelligent Roman" by Roman Lipski), which is of course also an important position, especially in the artistic context of exploring human-machine assemblages. Julia Nakotte (2021) turns the relationship between human and assisting machine around in her work "Potentio Poet". The nouns, verbs, adjectives, and places she selected were rated word for word in the categories of everyday life, city, and depression with values between 1 and 3, where 1 stands for "a little" and 3 for "a lot". Using three knobs, recipients can set the strength for the three categories, whereupon one word is selected from each category and combined to form a line of poetry. The recipients become assistants to the machine:

The focus of this work is not on the output or the presentation of the technical possibilities, but on the staging of the machine as an author. The potentio poet is presented as an independent author, while influencing persons are downgraded to assistants, who only select and rate words or chose the base of a poem line. Although the potentio poet uses only a simple random function for its part of the work, this is enough to make it impossible for us to foresee the total result. Is this enough to call the potentio poet an author? Why should it not be enough? In what ways are the potentio poet's influences different from the influences that affect human authors (like a language, experiences, education)?
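The selection mechanism described for "Potentio Poet" can be made concrete with a minimal sketch. It is not Nakotte's implementation: the word list, the ratings, and the rule that a knob value selects among words with exactly that rating are assumptions made purely for illustration, and the names `RATINGS` and `compose_line` are hypothetical.

```python
import random

# The three categories named in the text.
CATEGORIES = ("everyday life", "city", "depression")

# Hypothetical word list (assumption). Each word is rated per category
# from 1 ("a little") to 3 ("a lot"), as described in the text.
RATINGS = {
    "kitchen": {"everyday life": 3, "city": 1, "depression": 1},
    "tram":    {"everyday life": 2, "city": 3, "depression": 1},
    "grey":    {"everyday life": 1, "city": 2, "depression": 3},
    "waits":   {"everyday life": 2, "city": 2, "depression": 2},
    "sleeps":  {"everyday life": 3, "city": 1, "depression": 2},
    "station": {"everyday life": 1, "city": 3, "depression": 2},
}


def compose_line(knobs, rng=random):
    """Pick one word per category and join the picks into a line.

    `knobs` maps each category to a strength between 1 and 3.
    Assumed rule: for each category, a word is drawn at random from
    those whose rating in that category equals the knob setting.
    """
    words = []
    for category in CATEGORIES:
        candidates = [
            word for word, rating in RATINGS.items()
            if rating[category] == knobs[category]
        ]
        words.append(rng.choice(candidates))
    return " ".join(words)


# Example: recipients turn all three knobs to "a lot".
print(compose_line({"everyday life": 3, "city": 3, "depression": 3}))
```

Even in this reduced form, the human contribution is limited to rating words and setting knobs, while a simple random function decides the actual line, which is the reversal of roles the work stages.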

In addition, we are assistants for the AI development of large tech companies. We not only provide data for training (unfortunately also voluntarily generated data like knowledge on Wikipedia or open source software), but also feedback by interacting with the AIs developed from it.

Lauren Lee McCarthy (2017) took on the role of the machine, more specifically a smart home assistant, in her work "LAUREN".

I attempt to become a human version of Amazon Alexa, a smart home intelligence for people in their own homes. The performance lasts several days. It begins with an installation of a series of custom designed networked smart devices (including cameras, microphones, switches, door locks, faucets, and other electronic devices). I then remotely watch over the person 24/7 and control all aspects of their home. I aim to be better than an AI because I can understand them as a person and anticipate their needs. The relationship that emerges falls in the ambiguous space between human-machine and human-human.

McCarthy, on the one hand, performs the operations of the machine system, and on the other, she transforms it into a (more) human version. She acts with the perspective of the machine by adapting the technology, but also against it by being a human version of the machine. However, in doing so, she also takes the subordinate role and tries to serve the subjects, more empathically than the machines do.

# **With and Against AI**

"With" and "Against" belong and work together. A reflection on the human-machine relationship by means of this figure can lead to a more differentiated perspective than the role models mentioned at the beginning alone make possible.Of course, the artists are aware that their relations to the machines are more complex than they can be represented by means of the role models. The machines have acquired meaning for us, not vice versa. Therein lies the weakness of the role models (assigned by us). They suggest a collaboration on eye level, autonomy and an own point of view, which the machines do not have due to their technical conditionality. But this is precisely where the role models are helpful and their use by artists is extremely important: they question the relationship between us and the degree of machine authorship, whose increase is currently clearly visible. For even if they are not full partners, they go beyond a tool not worth mentioning.

In most cases, of course, we do not just accept and adopt what is calculated by machines. On a small scale, this is also evident in everyday life, for example, when individual words or parts of sentences from a machine translation are adjusted by us according to our ideas by means of DeepL (as was done for the translation of this text from German into English), when we manually undo automatic corrections or generally decide against suggestions. For our own grip on the technology, we have to (inter-)act with it. At best, this goes beyond the mere selection of suggestions into the machine construction itself. Artistic explorations are particularly suitable for playful testing of different human-machine assemblages. They can serve as a blueprint for extra-artistic relations, which unfortunately can be designed less freely in most cases. Ideally, working with machines does not lead to a limitation of one's own expressiveness and responsibility, which results in dependency, as experienced and questioned by Katherine in the opening dialogue, but rather to an expansion of our possibilities and our view of ourselves and the world. The human-machine assemblages should thus be open enough to integrate otherness into our perspective, but also keep open the possibility of acting against the machines' calculations.

# **Bibliography**


Depoorter, Dries (2018): "Face Detected". https://driesdepoorter.be/facedetected/.

Gao, Leo / Stella Biderman / Sid Black / Laurence Golding / Travis Hoppe / Charles Foster / Jason Phang / Horace He / Anish Thite / Noa Nabeshima / Shawn Presser / Connor Leahy (2021): "The Pile: An 800GB Dataset of Diverse Text for Language Modeling". *ArXiv.* https://doi.org/10.48550/ARXIV.2101.00027.

Giardina Papa, Elisa (2020): *Cleaning Emotional Data*. Video installation. Linz: Ars Electronica.

Harvey, Adam / Jules LaPlace (2021): "Exposing.ai". https://exposing.ai (21 January, 2024).


# **Discography**

Herndon, Holly (2019): *Proto*. 4AD Records.

# **Filmography**

*La mort en direct* (1980): Dir. Bertrand Tavernier. Orange Studio, DVD 2021

# **Artificial Intelligence in Songwriting and Composing – Perspectives and Challenges in Creative Practices**

#### *Björn Tillmann, Wolf-Georg Zaddach*

Artificial Intelligence (AI) has been a synonym for intensively discussed and continuously developed approaches in the culture of digitality for several years (Stalder 2016; Lenzen 2020). AI is regarded as a disruptive technology with great potential for change in almost all areas of human life. This also applies to the field of music. For example, Melissa Avdeeff (2019: 1), after analyzing the first music album jointly created by an AI and humans (*Hello World*, 2018), states that we are "on the edge of a new era of popular music production". Streaming platforms have also been influencing which music is suggested to listeners (and which is not) for several years, in the case of Spotify, for example, through complex algorithmic constellations such as "collaborative filtering", "natural language processing" and "conventional neural networks & audio models" for analyzing basic musical parameters (Whitehouse 2021).

AI technologies are increasingly entering artistic and creative work. Various artists have engaged with different AI tools in recent years and published their work, such as Holly Herndon with the album *Proto* (2019) or Benoît Carré and François Pachet with the Beatles-style pop song "Daddy's Car", realized through 'Flow Machines', as well as the subsequent, aforementioned album *Hello World* (2018). At the same time, tools such as Jukebox, IRCAM's real-time timbre transfer tool RAVE (2022), short for Realtime Audio Variational autoEncoder, or music production tools (Sonible, iZotope, among others; Frieler / Zaddach / Meyer 2023) are becoming more and more widespread. These rapid developments require a multi-perspective and critical examination and classification, which equally includes creative professionals. Hence, after introducing songwriting and composing as practice, we will discuss the usage of AI in songwriting, its potentials, challenges, and risks for professional practice. The article is based on semi-structured qualitative interviews with four professional composers and songwriters who work intensively with AI.<sup>1</sup>

<sup>1</sup> The interviews were conducted for a master's thesis at the Popakademie Baden-Württemberg in 2022 (Tillmann 2022). The four experts are: Benoît Carré (musician and composer, co-developer of 'Flow Machines'), Stefanie Grawe (designer and music producer), Moisés Horta Valenzuela (sound artist, technologist and musician), and Jovanka von Wilsdorf (musician, songwriter, initiator of the DIANA AI Song Contest).

# **Composing and Songwriting in Practice**

Composing and songwriting can be understood as complex creative processes to create specific musical forms and expressions, depending on the goal and purpose (such as song, film music, theater music, radio jingles, advertising jingles / music, radio plays, or special genres). Asked what songwriting is, OpenAI's GPT-3 generated the following in response to a request in February 2023:

Songwriting is the process of creating a musical composition to express emotion, ideas, or stories. It requires a combination of creativity, knowledge of music theory, and technical skill in the production of the music. It typically involves the composition of melodies, chord progressions, and lyric writing. (…). (OpenAI Playground 2023)

However, it further involves aspects and practices such as the creation of melodies and rhythms, chord progressions and voice leading, instrumentation and arrangement (Moore 2013; Perricone 2018; Moylan 2020). In the age of digital music production, it also includes the design of sounds and of tools to create sounds. Processes of writing parts of the song often overlap with production-oriented tasks in Digital Audio Workstations (DAW). Moreover, songwriting and composing are often collaborative efforts of distributed creativity (Clarke / Doffman 2017) that draw on different kinds of expertise, including situated non-verbal communication, embodiment and affect, musical interplay and interaction in creative practices (Bennett 2011, 2012; Barrett 2014; Thompson 2015; Bishop 2018; Cook 2018).

# **AI and Songwriting in Practice**

In the following paragraphs, we will discuss three main findings from the expert interviews and accompany them with findings from the literature. The three points are: 1) working with AI in general, 2) potentials and creative approaches, and 3) challenges and risks.

### Working with AI and its impact on the creative practice

While in the field of experimental computer music or algorithmic composition programs have been developed since the 1950s, and a (very basic) form of AI has been applied only since the 1990s (e.g. tonica fugata; Montanas / Arcos 2002; Nierhaus 2018), in the field of songwriting a trend towards the use of AI emerged only during the 2010s. Hence, working with AI appears to be a new approach to songwriting practices that comes with challenges as well as potentials. Since artistic practices employ reflexivity in their own creative processes, the interviewees were able to raise insightful considerations regarding working with AI.

Carré (in Tillmann 2022, CXVI) summarizes three points regarding the potentials of working with AI that were mentioned by all interviewees:


Jovanka von Wilsdorf (in Tillmann 2022, LXXXIII) reinforces this and adds the aspect of optimizing the work in teams, where these tools not only speed up the processes through their functionality, but also make the whole writing process more playful and thus could help to skip the occasionally problematic start-up phase.

Especially the aspects of acceleration and enrichment of creative ideas are frequently discussed in research (Gioti 2021; Deruty et al. 2022). Interestingly, all interviewees emphasize that a crucial point would be that the human being retains the power of decision, which echoes the current discourse in which AI in music is understood as "assistive technolog[y]" (Moffat 2021: 366). This points to a potential tension between (desired) artistic autonomy on one hand and the often-lacking comprehensibility of technologically complex processes on the other.

Even though the 'surprising' is an important motivation to use AI, von Wilsdorf adds that an artist nevertheless follows one's own intuition:

Suddenly you get suggestions that you wouldn't have thought of yourself. It's wonderful. The combination of a lyric I made entirely with LyricStudio, together with the sound inspiration of Boomy, paired with a beat from a beat AI like Orb. Suddenly something comes out where I follow my intuition but still get suggestions. (in Tillmann 2022, LXXXIII)

As von Wilsdorf indicates, for artists it is about the process, in which AI tools can play a part, especially in creative combinations. Stefanie Grawe does not limit this combination of tools to music alone, but sees it more as interdisciplinary:

I think it's very important that you can use all the tools to be creative and not exclude other possibilities, but that the combinations are then also the decisive thing. What sounds are produced or how someone performs on stage are decided by the musicians themselves. (in Tillmann 2022, XCV)

Tools with possibilities for intervention are described by the interviewees as more creatively appealing, in order "to further develop and advance one's own sound identity with self-produced data by means of AI" (Grawe, in Tillmann 2022, XC).

The interviewees describe the current possibilities of AI and their impact on the creative processes as leading to new central roles of the artists:


Horta Valenzuela summarizes that the task for artists is about "deciding what you want, sonically, because you can get very easily overwhelmed by the possibilities of the models" (in Tillmann 2022, appendix, CXII). In that sense, it becomes clear that aesthetic experience and vision, knowledge of the different challenges of songwriting in general, and an artistic goal in the broadest sense are still important skills and features for working with AI to create music – as long as it is supposed to fit into certain musical styles or genres, one should add.

### Potentials and creative approaches

Interestingly, all four experts explained that they are using the AI tools in a different way than they were intended in order to achieve more independent and unique results, which could be interpreted as a result of creative thinking in practice in general (Coessens 2014; Coessens / Crispin / Douglas 2009) as well as negotiating the discussed new roles of artists in working with AI. Moisés Horta Valenzuela explains:

Sometimes I'm very overfitting the model which is like you train it on very few data, but you augment it. So, e.g. you originally have 13 minutes of data but you augment it to like 3 hours of data with some techniques and then it learns. But it's all just this data. So, when you train the prior to loop it's really constant, but it becomes this kind of phantom drive of the neural network that's always trying to make this loop and these transitions because it just knows this data. But it kind of fit it because I augmented it. And then using this with some other model that I trained […]. (in Tillmann 2022, appendix, CX)

This practice is contrary to the previous assumption that AI systems need a lot of data to work 'properly'. The difference here, however, is that the artist is looking for the 'new' and the 'unexpected'. Moreover, Horta Valenzuela adopts the 'problem' of overfitting (Gioti 2021: 67), using it for his own creative practice on purpose, and combines systems trained on other material to search for "novel sounds":

[…] it's something that it wasn't really trained on. This is when it becomes really novel and not just a reproduction of whatever you trained it on. When you start doing these combinations. […] This is not something intended in the documentation or so. They don't suggest this. It's just like a hack that I did. It's quite straightforward, but I haven't seen people tryin' it out, but it works and it sounds fucking sick. (Horta Valenzuela, in Tillmann 2022, appendix, CIXI)

Furthermore, Horta Valenzuela trains an AI system on recordings of old or rare instruments to preserve them and use them for further creative approaches. The difference to sampling here is that the individual timbre of the instrument can be preserved instead of the recording itself:

I trained an AI on this instrument from Mexico called the Santerio, which is from the 19th century, very popular. And this instrument obviously is like cutting out of popularity because it's kind of big and very difficult (…) And so I made myself a data set from songs on YouTube basically. These songs are public domain because they are quite old, so I just made a dataset of this instrument and then it becomes this idea of preserving the sound of ancient instruments through the machine, at least the textures. (Horta Valenzuela, in Tillmann 2022, appendix, CIV)

It appears to be especially attractive and creativity-boosting for artists, when 'happy accidents' happen – results that are produced by the 'wrong', unanticipated or accidental use of instruments or tools (see also Deruty et al. 2022: 45). Carré explains:

I like the accidents. Like when you are playing the piano and are working on a song, it's really exciting to lose control and let the unexpected come under your hand. AI tools increase the chances to build from nice accidents. (in Tillmann 2022, appendix, CXV)

Interestingly, the interviewees see further potentials for collaboration, for instance by building a community of users that share knowledge, but also tools. This could save time if, for example, someone shares an AI checkpoint trained on a certain genre or style, and could potentially help protect the environment by using fewer resources, since the model does not need to be trained from scratch (Horta Valenzuela, in Tillmann 2022, appendix, CXI).

### Challenges and risks

As in the general discussion about AI, the challenges and especially the risks of using AI in music, and in songwriting in particular, are widely debated (Beato 2023; Clancy 2023a; Frieler / Zaddach / Meyer 2023; Deruty et al. 2022). While there is the argument that AI can offer some advantages in the creative process, three key issues were expressed by the interviewees: legal, ethical, and aesthetic aspects.

#### Legal aspects

Legal aspects relate to copyright and the associated remuneration of rights holders and associated parties. Many different experts as well as data are involved in the creative process with these tools: the creators of the (usually copyright-protected) data used in the dataset, the programmers of the code, the providers of the service, the machine or algorithm itself and the artist (who can also occupy several of these positions).

While there are exceptions in the context of research, a problem arises when the data used to train the AI is not transparent. Often it is not possible to find out which data was used for the generated output. Hence, some argue that AI becomes a producer in its own right, gaining rights similar to those of its human counterpart, or that the programmers of the AI code hold rights. Carré rejects that:

I think that the owner of the technology cannot have any rights on the music that is generated. Otherwise, you would have given rights to every person who created an instrument. When you play the piano, who should you pay? 'Schimmel'? It's not gonna work, even if those tools are different because they generate original music. (in Tillmann 2022, appendix, CXVI)

This statement points to the legal facts and the discussion which has only just begun (Vincent 2022). This discussion is complicated by lengthy decision-making processes regarding copyright reform at the political level, with different developments in different countries. Art and music have become a special testing ground for AI (Wittpahl 2019: 257–270; Du Sautoy 2019). With reference to French economist and philosopher Jacques Attali, Clancy (2023b: 2) sees high relevance in the argument that music is an important site of negotiation and herald for social change. In engaging with AI from a creative perspective, new insights and perspectives, or those neglected in the discussion, could be gained and introduced, and "may produce wisdom for both the broader macroeconomic and the environmental ecosystem" (Clancy 2023b: 2). Debates about copyright and corresponding remuneration are an essential part of it.

A challenge for the coming years is then to develop appropriate licensing and copyright revenue solutions (Clancy 2023a: ch. 9 and 10). Subscription models could be one: "If, for example, a label created a big dataset of all the masters that they have and all the leadsheets, they could do something like a subscription. And then distribute the profits of the subscription to the composers and producers like streaming services" (Carré, in Tillmann 2022, CXVI). Furthermore, blockchain technology could be used to bring more transparency to this process. Interviewees stressed that although copyright seems to be most obvious for works by artists working with AI, future copyright revenue models should not be to the detriment of artists (Carré, in Tillmann 2022, CXVI).

#### Ethical aspects

Another concern expressed by the interviewees regards ethical dimensions. Algorithms and the AI training processes can be biased, racist, misogynist, traumatized – as they are only as ethical as the axioms of the trainer and the data set that was used. Horta Valenzuela asks critically:

I think the issues that we should talk about is, why is AI so white? Why is AI so male driven and why are companies firing their black employees in 'AI ethics'? Why are these kind of issues happening? Were the data sets really racist or producing kind of fucked up shit even though the companies are trying to mitigate? (in Tillmann 2022, appendix, CXIV)

As Mike D'Errico (2022: 9) argues, Eurocentric standards of music such as the 12-tone scale or 4/4 time signatures are inscribed in most common music songwriting and production tools such as DAWs. Such a "white racial frame" (Ewell 2020) leads to unconscious biases and practices of exclusion. Potentially, AI tools could overcome this; however, this depends on ethical considerations in the manufacturing process. Ethically questionable practices of companies producing AI tools can potentially be reproduced in the results of such tools – even if they contradict the intentions and views of the users. This certainly confronts artists with the problem that through their use of AI they would sometimes support (neoliberal) global power structures, dependencies, and racist or misogynist ideologies that they may criticize. The aforementioned communities could help to deal with that issue. Examples of that could be the Glitch Feminism Manifesto<sup>2</sup> or the work of the 'Indigenous Protocol and Artificial Intelligence Working Group'.<sup>3</sup>

Another ethical concern is the future working conditions of songwriters and composers. One field that is not only at risk but already heavily influenced by algorithms is commercial music for movies, teasers and commercials. Potentially, this can expand to specific and often instrumental microgenres or mood-based background music, which play a significant commercial role on streaming services such as Spotify. In particular, the aforementioned potential for accelerating the process could lead to new business standards with expectations and deadlines that effectively require the use of AI. Further, exploitative practices at the expense of creatives, which have been prevalent for some time in the development of music software and hardware (and not just there) (D'Errico 2022), must be critically observed and prevented.

#### Aesthetic aspects

Finally, the interviewees discussed aspects that concern the role of the creative human in relation to AI. First, all interviewees are convinced that artists will not be replaced by AI in the future. Instead, they see AI algorithms as an extension of the toolbox of creatives as described above. This appears to be a widespread view and is described by Artemi-Maria Gioti as "human-computer co-exploration" (Gioti 2021: 62–64) or "distributed human-computer co-creativity" (Gioti 2021: 56). D'Errico speaks of "interface aesthetics", defining it as "a general framework for thinking through relationships between theory and practice, concept and technique, aesthetics and poietics" (D'Errico 2022: 15). Second, the interviewees emphasize the relevance of the aesthetic dimensions of creative practices. They see a difference between AI on the one hand and an intrinsic desire of artists to express themselves and create on the other:

<sup>2</sup> https://www.legacyrussell.com/GLITCHFEMINISM (31 August, 2023).

<sup>3</sup> https://www.indigenous-ai.net/ (31 August, 2023).

The need to make music is there. It's been around as long as there have been humans. (...) Musicians and artists make music to communicate and to reach people, but above all to express themselves and that doesn't go away. And that is so deeply anchored in people. That alone is reason enough that AI cannot replace musicians. It's not only about the product, it's also about the doing. (von Wilsdorf, in Tillmann 2022, LXXXVI)

AI tools are not artists themselves since they lack the "personal filter and the urge to want to express something" (von Wilsdorf, in Tillmann 2022, LXXXVI). A challenge is then that AI can create something that looks like art at first sight but lacks important dimensions of it. At the moment, the interviewees still see major limitations of AI in music, as Carré explains:

Neural networks and all this technology, [...] can't create a long-term melody. [...] For the moment, AI technology can't tell a story like a human can do it with notes. I mean there is something that is not logic and that is [...] random, even if it's not false in terms of harmony, it seems to go nowhere. (in Tillmann 2022, CXV)

By understanding art as practice (Haarmann / Lemke 2023), important functions for the practitioners themselves become apparent (see also D'Errico 2022). Both Grawe and von Wilsdorf emphasize the psychological, "healing" effect of songwriting and music making when one can live out one's feelings, express oneself and process one's experiences (Grawe, in Tillmann 2022, CI; von Wilsdorf, in Tillmann 2022, LXXXIII). With the growing prevalence of AI in songwriting, a certain issue could arise: "When people who actually want to make music get into these simple tools at an early stage", it could lead to flattened and immature creative practices and can prevent growing as an artist (von Wilsdorf, in Tillmann 2022, LXXXIII). Von Wilsdorf refers to her own experiences: "There's just no real satisfaction when I've spent a whole night just pressing a button and I always get snippets back. Then I have a short 'cheap thrill', but I'm not proud of my work in the end" (in Tillmann 2022, LXXXIII). The issue raises the question of what functions music making (and listening) has for human beings, as well as which human components remain when the use of AI in songwriting becomes more of a standard: "Now, if half of it is commodity music created at the push of a button, there is a training in the human ear and it becomes more and more insensitive" (von Wilsdorf, in Tillmann 2022, LXXXVI). The aesthetic practices that listening and creating humans apply could be highly determined and influenced when "many people automatically need and use these tools and no longer listen to their own imagination or idea or feeling" (Grawe, in Tillmann 2022, XCIX). Here, ethical concerns overlap strongly with aesthetic aspects. This in turn points to the relevance of aesthetic and artistic education, both formal and informal. As discussed, to produce original individual music, a multitude of artistic decisions must be made that can quickly become overwhelming without the appropriate skills and knowledge.

# **Conclusion**

The interviews provided important practitioner insights into concrete ways of dealing with and reflecting on working with AI for songwriting and composing. Working with AI has a comprehensive impact on creative practices, ranging from aspects perceived as positive (acceleration, enrichment, and augmentation) to those perceived as critical or negative (ethical and aesthetic issues). It also became clear that although artists can understand AI as a partner, they still tend to subordinate it to the playful mode of creative work and, for example, incorporate unintended uses and exploitation. The interviewees also described new challenges and roles. In line with the findings of Moffat and Sandler (2019) for music production, designing (of sounds, tools) becomes an important task of the workflow, so that we can observe a shift to an 'aesthetic curation' in songwriting and composing when it comes to working with AI.

The research could also show that, due to the rapid advancement of technology as well as the professions usually involved in it, it is necessary that a large-scale debate about creativity and AI is also accompanied by "artistic reasoning" and reflection by artists (Borgdorff 2012: 167). This means that artists need to engage much more with AI to explore the scope for action, to contribute evaluations of AI performances and, ultimately, perspectives for socially negotiated discourses around aesthetics, arts, and ethics. A consequence of this would be to involve artists more in AI labs and collaboratively in the research and development process (Deruty et al. 2022). Artistic Research as an emerging field represents an extremely promising approach to this (Borgdorff / Peters / Pinch 2020; Zaddach 2021, 2023). In addition, the consequences of AI in the field of music and the music industry must be investigated and accompanied academically from an even broader, interdisciplinary and interprofessional perspective, since "models of taste, identity and motivation will become important as the next step in powering a more human-like AI generation" (Brown 2021: 15). This also means that AI must be anchored in the concrete curricula of study programs and training courses.

In our understanding, it may be the complexity and relevance of creative art practices for human beings that remains the main point of difference between non-AI and AI music, even though authors like Hannes Bajohr see limitations in such an approach and argue for a post-humancentric "critique of aesthetic AI" (Bajohr 2022), while Arthur Miller concludes "that in the future machines will be fully creative and may even surpass us" (Miller 2019: 336). However, we argue that since AI algorithms follow specific rules, the resulting music could lack individuality, spontaneity, and originality, in writing as well as performing music. The difference seems to lie in the human qualities of irrationality and (spontaneous) creativity, contingency of action and thought, situatedness in environments and embodiment, affect and non-verbal communication, imagination and the ability to construct broader ideas and correlations, and last but not least the experience-based practice of doing, all of which remain challenges for AI (Rohrmeier 2022). To that extent, one could argue that we can understand (and perhaps rediscover) music as a focal point of understanding ourselves as organic, situated human beings, constantly interacting with humans, non-humans, objects, and environments (Bennett 2010) in sensual and "vibrational practices" (Eidsheim 2015: 3). However, this, too, is subject to a process of social negotiation about the relevance of all these aspects. As D'Errico (2022: 14) argues, "aesthetic judgment is at once a political and ethical act capable of radically transforming existing conceptions of art, culture, and society". AI music could potentially trigger such a transformation and therefore seems to push the "big-picture questions about music's ontology in the late digital age: what are we listening to, what are we listening for, and who's doing the listening?" (D'Errico 2022: 22).
Nevertheless, the debate should not ignore the fact that creative practices, especially in non-commercial contexts, will likely continue to play an important role for human beings in the future. This is because music as a creative leisure activity (but also often as a profession) follows a certain urge, according to which the focus is on direct as well as shared experiences, active learning, and individual expression. However, it is most likely that these contexts will increasingly be enriched and supplemented by AI, for example through individualizable practice and learning apps.

In conclusion, AI is becoming an unavoidable topic for music practices of listening and making as well as (higher) music education. It comes with promising possibilities as well as challenges and risks – and therefore requires a profound debate, which appears to happen in the context of other crisis modes such as climate crisis and political conflicts.

# **Bibliography**


Beato, Rick (2023): "The AI Effect: A New Era in Music and Its Unintended Consequence". *Everything Music YouTube Channel*, https://www.youtube.com/watch?v=-eAQOhDNLt4 (6 June 2023).

Bennett, Jane (2010): *Vibrant Matter. A political ecology of things*. Duke: University Press.


Nierhaus, Gerhard (2018): "On Composers and Computers". Nicolas Donin (ed.) *The Oxford Handbook of the Creative Process in Music*. Oxford: University Press. Doi: 10.1093/oxfordhb/9780190636197.013.27.

Perricone, Jack (2018): *Great Songwriting Techniques*. Oxford: University Press.

Rohrmeier, Martin (2022): "On Creativity, Music's AI Completeness, and Four Challenges for Artificial Musical Creativity". *Transactions of the International Society for Music Information Retrieval* 5(1): 50–66. Doi: 10.5334/tismir.104.

Stalder, Felix (2016): *Kultur der Digitalität*. Berlin: Suhrkamp.


# **Discography**


Skygge (2018): "Hello World, Composed with Artificial Intelligence". Flow Records, album. *Spotify*, https://open.spotify.com/intl-de/album/0cGWC9bhEJA4l7jAaV7cqR?si=DizQKIAdS3mM-Jm4-90w-w (21 January, 2024).

# **On Human-Machine Relationship and the Notion of an Artificial Intelligence in Musical Practice**

#### *Sebastian Kunas*

As part of their creative practice, musicians today interact extensively with technology that quite obviously appears to be deeply involved in processes of musical creation. The inclusion of electronic and digital devices in the act of creating, doing or performing music makes it difficult, even for trained eyes and ears, to comprehend from the outside who or what is involved at what levels in the emergence of sound and meaning. How is this sound created? Which musical ideas circulate on which paths? Who or what is connected to each other and how? What relationships are being formed? As a consequence, the involved musicians and their (physical) activity do not necessarily appear as the centre of musical practice, but rather as part of a complex, widely ramified and difficult-to-understand socio-technical structure with many actors.

Many musicians take for granted an aesthetic practice openly based on technically mediated collaboration and human-machine co-creation. Cultural techniques such as sampling, the incorporation of music machines (e.g. sequencers, drum machines) into musical practice and the use of presets and "half-ready music" ("*halbfertige Musik*", Großmann 2010) are a genuine part of various (popular) music cultures and contexts. In parts of the humanities, too, it is not new to consider social and cultural practices as the activity of assemblages in which agency is distributed between humans and non-humans, following various positions within science and technology studies, especially actor–network theory (e.g. Latour 2005). At the same time, culturally embedded ideas persist that conflict with this – at least, or especially, from a Western perspective, which is also my own. In the sense of an inherently white and male humanism, music – or art in general – is first of all seen as an activity that is exclusively human or as a product of the mind of the creator who in the creation of his work instrumentalises objects, which, however, remain external to the work (e.g. Hall 1992; Negus / Pickering 2004; Ewell 2020). Following this perspective, the autonomous musician subject appears as the pivot of musical practice.

This may partly explain the attention to and excitement about technologies labelled "artificial intelligence", which are currently becoming more and more widespread in a variety of social contexts – including those of music. The attribution of artistic autonomy to technology conjures up a new kind of machinic counterpart that supposedly threatens the privileged position of humans. This implies an understanding of the relationship between human and machine as binary. Accordingly, the emergence of "intelligent" technologies is associated with questions about the equality of human and machinic creativity, about the liberation from musical work through technology, about the prospective superfluity of certain professions, etc. – accompanied both by unease and by more or less explicit, sometimes even delusional optimism about technology.

With this text, I would like to rehearse and suggest posing questions other than the aforementioned ones to music and sound technologies. From my perspective as an artist and educator and based on my academic background in applied cultural studies, I would like to try, as far as I can within this framework, to shift the discourse (or should I write: hype) around "artificial intelligence" in the context of musical practice towards a notion of human-machine hybridity, with a focus on the relationships that humans and things involved in music making enter into. In doing so, I draw on Johannes Ismaiel-Wendt's postcolonially informed concept of MusickingThing studies (2016), which considers all technical artefacts involved in music making – also beyond the category of "musical instrument" – as knowledge complexes and theoretical objects that interact with musicians on an equal footing.

I write from a white, male/gender-questioning and able-bodied perspective and, among other things, with the help of deep learning software for translation into English, which is a foreign language to me.

#### *What are you? / What do you know about music?*

There are probably few electronic MusickingThings about which as much has been written or said as the "TB-303 Bass Line", a "Computer Controlled" synthesizer with analogue sound circuitry, iconised by the name "Three-Oh-Three". The myth attached to this machine, which is constantly reproduced by enthusiasts, journalists, scholars of culture and music, and for a while now also by the manufacturer Roland, can be summed up as follows: *the bass synthesizer, produced from 1981 to 1984, was a commercial flop – at least at the time. Its unique sound, however, revolutionised electronic dance music.* The thing arguably did not look revolutionary, neither then nor now. The visual appearance of the small device in its plastic case with a printed-on keyboard suggests affordability and portability rather than sound-cultural innovation and elaborate artistic expression. Correspondingly, the claims made in the accompanying owner's manual are modest: "The TB-303 is an automatic Bass machine which can memorize the Bass line of a musical piece and replay it automatically. […] [T]he TB-303 can create an appropriate Bass sound for the Bass line you have written." (Roland Corporation 1981: 4–6) The thing is presented as a serviceable automatic machine that prepares to take over the work of a person playing the bass. The role intended for the user is to programme the machine to play melodies as well as to set an "appropriate sound" for them via the rotary knobs. What the machine imagines by "Bass line" is made clear in the owner's manual right at the beginning: three short examples in Western musical notation are listed there that refer to musical clichés from country, pop and blues.

*An image is circulating on the Internet that purports to be a page from a 1982 Roland brochure. It shows jazz pianist and composer Oscar Peterson sitting at a keyboard, wearing a suit and tie, smiling at the viewer. On the keyboard's sheet music holder are mounted and linked a Roland TR-606 Drumatix drum machine and a TB-303. The picture takes on a certain layer of meaning because it imagines a sound-cultural life for these machines that they were not meant to lead. The image can be found here, for example – https://www.synthtopia.com/content/2009/03/19/oscar-peterson-roland-pianoplus-tb-303-tr-606/*

#### *Who are you to me? / What am I to you? / What else do you know about music?*

Speaking about the creation process of the music they released in the second half of the 1980s (Phuture 1987), which would subsequently be captured by the genre term "acid house", Earl Smith Jr. and Nathaniel Pierre Jones, the two founding members of the group Phuture, describe their interaction with their second-hand TB-303 while working on new tracks as being characterised by accident, coincidence and, initially, frustration. The device, for which they had no owner's manual, was awkward to programme and from time to time autonomously filled the memory for the bass melodies with "something crazy" (Smith in Rietveld 2018: 142) – with random, atonal sequences. At this point, the machine breaks its core promise of being able to store and play back bass lines; it subverts expectations of its serviceability. In the interaction with the musicians, it does something that goes beyond the programme of action that the developers have deliberately and systematically inscribed. It expands its scope of action by including exactly what is intended for the musicians: filling the sequencer with a sequence of notes.

The obvious and role-conforming reaction to this would be to consider this as machinic misbehaviour and, correspondingly, as a disturbance of the situation. The consequence would then be to leave the musical interaction with it until the disturbance has been eliminated in some way. Phuture, on the other hand, take up this somewhat amusical impulse of the machine and establish a deviant mode of interaction with it. While the machine stoically repeats the quirky atonal sequences, the musicians capture the tweaking of its knobs in their tracks, to let the already rather awkward electronic mimicry of an "appropriate" bass sound dissolve through the squelching, chirping and screeching sounds of the low-pass filter.

#### *Who am I to you? / What are you to me? / What do you know?*

For many years, numerous imitations and variations of the TB-303 have been available, both hardware and software, from a wide variety of manufacturers. After a completely digital hardware version from 2016, the manufacturer of the original, the Japanese company Roland, also released a software version called "TB-303 Bass Line" in 2019 – 38 years after the first 303 was released. The software's skeuomorphic graphical user interface directly replicates the interface of the music machine from the 1980s. Roland also promises similar faithfulness to its sound: "Utilizing Analog Circuit Behavior modeling the virtual TB-303 captures the hypnotic tone of the original hardware" (Roland Corporation n.d.). The software also has some additional functions. With a click on the button named "Randomize", the sequencer fills itself using random processes. In the context of the 303, this is to be understood as a sound-cultural reference. The manufacturer Roland is thus appropriating and channelling the sound-cultural dynamic that developed from the unlikely relationship between the human and the non-human members of Phuture. On the one hand, there is a sensory and responsive approach to what is found; on the other, there is a convention inscribed in the software.

#### *THE 303 STORY :-) – https://www.roland.com/global/promos/303day/*

The sonic result of the musical interaction with this MusickingThing may be close to what happens in Phuture's tracks. However, the creative *modus operandi* is fundamentally different when you can tap into an "endless supply of classic 303 patterns" (ibid.) with the push of a button provided just for that. Via the "Edit" button, the interface of the software expands and allows finer control over the creation of random sequences. Although it is based on the idiosyncrasies of the original, the thing expands the user's control – even over the random procedure itself. In this sense, the addition of the extended random function paradoxically takes the acid-aesthetics-inclined user further away from entering into a relationship with the thing, as in the case of Phuture. In the prefigured relationship between musician and MusickingThing, the user is administrating musical ideas that have emerged from the circumvention or refusal of exactly such a prefiguration.

#### *Who are we to each other?*

What I would like to draw from this narrative is that in their musical practice and in the musical-intuitive negotiation of each other's idiosyncrasies, the human-machine assemblage called Phuture has explored a new kind of productive togetherness. Here, technology is not strategically misused: it is being interacted with on an equal footing. There is an extensive renegotiation of what the actors think they already know about the setting. The specific aesthetics of their tracks, in contrast to the software version of the TB-303, was not imagined/preset for any of the actors and obviously emerged in dependence on all participants (which undoubtedly include more than the ones mentioned, be it the cable synchronising the TB-303 with the drum machine, the room in which they produced, the tape they recorded on or DJ Ron Hardy who, according to the legend, persistently played their track "Acid Tracks" in the club, despite the crowd's initially reserved reaction). Even though we can only speculate about what the human members of Phuture heard in what the TB-303 uttered and what made them take this seriously as a musical impulse: however unexpectedly and coincidentally the acid spilled out of the machine, the act of employing it was anything but coincidental. Nor is it coincidental that this happened in the context of a vibrant club and subculture that was predominantly Afro-diasporic and queer (Brown 2022; Thomas 1995).

For me, the creation of this music and the idiosyncratic relationship of the humans and non-humans involved overlap with how I listen to this music. One of its many layers of meaning for me lies in the fact that in this music, "the experience of a 'life among machines' is aesthetically worked through", to use an expression by Malte Pelleter (2020: 94) following Gilbert Simondon (2017). This music knows something about our experiences in digital life worlds and our human subjectivities, which emerge not least from our entanglement with digital objects. For me, despite the historical distance and the perceptible age of its sound, in a way the music still holds this "sense of terrifying newness" (Eshun 1998: 95) through its specific "unnaturalness", and it performs and expresses "*pleasure* in the confusion of boundaries and for *responsibility* in their construction", to sample Donna Haraway's Cyborg Manifesto (2016: 7). It does so not *through* the machine, but *with* the machine on an equal footing. It does so with a machine, churned out by a transnational industry for music electronics, found and bought second-hand, according to the legend. It does so with a futurhythmachinic knowledge complex bought on special offer.

What I would also like to draw from this narrative: this music-making is not directly tied to technological innovation in a particular context and historical situation. To explore our technologically penetrated, computerised and algorithmised subjectivities, everyday worlds and societies through the means of music, to grasp them sonically, to critically reflect them or transpose them into sound futurologies and theorems is not dependent on "future-oriented" technology. Without the sound-cultural sensibilities and willingness to engage with each other that were at work with Phuture, it would have been a different story. The melody machine with its awkward mimicry of bass string instruments would probably have remained a shelf warmer in music shops and would ultimately have led a sound-culturally aloof life in flea markets and in rehearsal rooms and home studios of lovers of obscure musical objects. Instead, as part of Phuture, it has significantly influenced the sonic futures of not only electronic dance music.

At the same time, what a new kind of technical artefact imagines itself to be and what it seems to know about music on its (actual and metaphorical) surface may not help at all or even work against finding musical ideas that are in some way intriguing in relation to the socio-cultural context. In this sense, the complex sound-cultural lives of MusickingThings with their various stages can be regarded as a prime example of the science and technology studies finding that

the process of designing technologies and societies is not straightforward, because technology is subject to considerable interpretative flexibility. Technology is shaped as a result of complex social processes in which, typically, diverse groups do battle over what the artifact should do, look like, and so forth. The possibility always exists that a technology and its outcomes could be otherwise. (Lohan / Faulkner 2004: 322)

Perhaps sound cultures always incorporate this "otherwise" at the moment of their actuality.

#### *Who do we want to be to each other?*

The new computerised MusickingThings that were produced and marketed widely in the 1980s as part of a globalised system of work – the TB-303 being one of them – were associated with specific expectations at the time. The availability and functionality of these things promised a democratisation of music production in several respects: through their affordability, more people should be able to have professional production tools at their disposal; through the possibilities of digital storage and automation, music production should be more accessible and there should be fewer musical and technical prerequisites (Théberge 1997: 72–90). Correspondingly, the rhetoric around electronic music making changed fundamentally. However, "[t]he notion of music technologies' democratising powers is a myth and a miscalculation: the tropes of broad access and effortless music-making feed off of labour happening elsewhere", as Lucie Vágnerová (2017: 251) notes with reference to the neo-colonial and patriarchal exploitation of women in Southeast Asia by the transnational electronics industry. I would like to add: the global capitalist framework of music and sound technology is first and foremost at odds with the idea of democratisation. (For whom was a brand-new TB-303 actually affordable at that time? Where was it available in the first place?) And: in patriarchally structured societies as we know them, the dominant, specifically male pleasure around technology also largely excludes people with female and queer gender identities (Lohan / Faulkner 2004). Several structural inequalities and exclusions thus intersect here.

In many ways, the MusickingThings that are currently flooding the market under the label of "artificial intelligence" spark similar expectations today. Like the new possibilities for automation and the technical disposal of musical ideas back then, many of the current "smart" music and sound technologies promise, on the one hand, to simplify and streamline the workflow. On the other hand, they promise to provide a form of increased creativity and productivity as well as independence from, or the superfluity of, certain skills, experience or training in an unprecedented way. Considering the current discourse surrounding them, e.g. in online communities or the media public sphere, these promises seem to fall on fertile ground. Accordingly, the mere emergence of these technologies, currently mainly in the form of software and software plug-ins, is not only accompanied by questions about what this technology is capable of "achieving" and how it will change musical creation and (industrial) cultural production, but is also largely dominated by them. On the one hand, this reflects the expectation that looking at the technical artefact itself, detached from any musical and cultural contexts or forms of practice, already enables you to recognise musical futures. The ability to "design" music and sound-cultural life is thus ascribed to music and sound technology in a direct way. On the other hand, it expresses the fascination and joy for the technical functioning of the music machine in the hands of the human individual, which promises the increase of (musical) potency and productivity through technological progress. In this way, the double bind of musician and MusickingThing, which is based on the binary of human/tool and on the notion or phantasm of human/male control over and through the technical Other, is perpetuated.
The mode of exploring the "artificial" Other – the promising new, supposedly intelligent music and sound technology – thus always includes an idea of exploiting it.

At the same time, the current hype about technologies framed as "intelligent" or "smart" is also accompanied by working through the question of what (quasi-)human qualities the technical Other possesses. In terms of music and sound technology, this is primarily about notions of creativity, originality and musical context sensitivity, all qualities that were traditionally reserved for humans. Can the machine be musical? Can it be original? Can it mimic convincingly? Can it even be the equal of the artist, composer, mixing engineer, mastering engineer, etc. in these respects? By framing MusickingThings as "intelligent" – be it a composing machine, an automated mastering service or a smart assistant to improve your songwriting or production –, a counterpart is conjured up that promises devotion and servitude to the human, but also seems suspicious and mysterious in a way due to its neither fully determined nor aleatory behaviour.

*"It shouldn't be forgotten that the term Artificial Intelligence was coined in anxiety. It segregated human beings from machines by insisting on two forms of intelligence – artificial and authentic. This maintained the power of the latter over the former. This wasn't because of an inherent superiority but because of the difficulty distinguishing between them". – Louis Chude-Sokei (2022)*

"The automatic machine is the precise economical equivalent of slave labour". Kodwo Eshun (1998: 113) samples Norbert Wiener (1950) while linking the technological of Afro-diasporic sound cultures to the experience of enslavement and colonial racism. There are several authors who have worked on the complex intersection of colonialism, technology and sound culture (such as Eshun 1998; Weheliye 2005; Veal 2007; Chude-Sokei 2016; Ismaiel-Wendt 2016). Given the increased attention to "artificial intelligence" in the context of artistic and cultural production, I would like to take up this conjunctural thinking of technology and the continuities of violent demarcations that have accompanied the category of "human". Louis Chude-Sokei argues that

how we have come to know and understand technology has been long intertwined in how we have deployed and made sense of race, particularly in the case of blacks and Africans in a world made by slavery and colonialism. The language of one is consistently dependent on or infected with thinking about the other. (Chude-Sokei 2016: 1)

In this sense, the notion of an artificial, intelligent Other is not to be understood as innocent. It is no coincidence that the questions that repeatedly arise in conversations about "artificial intelligence" and robots resemble those asked about the racialised, dehumanised and enslaved humans in the context of Western colonialism – they all indeed have a common cultural basis (ibid.). In light of these critical endeavours, the specific "technopoetics" found in Afro-diasporic sound cultures appear as self-conscious reconfigurations of the relationship between musician and technology. They queer the binary of human and tool in ways that establish musical practices beyond the universalisms of both humanism and white liberal posthumanism. Here, techno-aesthetic innovation does not run parallel but transversely to technical progress, which is generally thought of as linear.

These modes or models of human-machine interaction are culturally highly influential. They fundamentally changed how technical artefacts such as samplers, drum machines, turntables, mixers or voice-altering technology are viewed and listened to, not only for musicians but far beyond. Thus it is not surprising that these practices, at a certain point, feed back into the development and marketing of new MusickingThings, as shown above in the example of the 303 software that takes up a creative practice established by Phuture. The musical interplay as performative reconfiguration of the prescripted roles of subject, object, user, tool, is translated and transferred into a new kind of artefact, product, commodity, whose many lives are again not yet foreseeable. The moment we engage with music and sound technology, be it as artists, cultural workers, scholars, enthusiasts, educators or developers, we enter that feedback loop and influence the future lives of MusickingThings. Without overestimating our individual agency, it should be clear that the questions we ask or take up about (new) technologies and the expectations we associate with their disposal carry some weight. What do we perpetuate when we short-circuit musical creativity and productivity with a faith in technological progress? What do we perpetuate by taking up the framing of technologies as "artificially intelligent"?

*"Master or slave, man or tool. Convinced that there are no other options, no patterns of behavior which exceed this double bind, the disciplines have been unable to perceive the emergence of intelligent machines". – Sadie Plant (1997)*

Following perspectives of science and technology studies, it seems appropriate to not view technological artefacts in isolation from their lifeworlds and socio-cultural contexts. Against the prevailing way of talking and thinking about these current technologies, I would like to propose in this sense – both playfully and seriously – not to use the attribution of "artificial intelligence" for specific MusickingThings-in-themselves, but to reserve it for aesthetic and cultural practices that develop other modes of how we come together collaboratively with machines. "Artificial" thus serves as a reference to human-machine togetherness and to everything that cannot be easily resolved as an equation of the creative agency of the individual actors. "Artificial intelligence" in the context of musical practice then corresponds to what Rolf Großmann (2014) and Malte Pelleter (2020), following Kodwo Eshun (1998), understand as "sensory engineering". The (at that time) somewhat unlikely interaction of the human and non-human members of Phuture can be seen as one of many possible examples of this from popular and especially Afro-diasporic sound cultures – but never a deep-learning composing automaton-in-itself.

#### *Who do I want to be to you?*

The critical theoretical engagement with the entanglements of music practice, technology, post-/humanism and colonialist continuities also feeds back into my own relationship with MusickingThings as an artist and educator. It fuels and/or explains the discomfort I feel when confronted with ideas of improvement, ease of work, independence and productivity through technology and technological innovation. I want to understand this discomfort as something productive. I want to use it as a starting point to question my relationship to MusickingThings and my associated projections and expectations – especially as a white individual who is socialised and perceived as male. What's the quality of my relationship with the machine? With what intention and desire do I enter into this relationship? In what mode do we operate? To what extent does this mode reflect a relationship of master and slave? Whose labour is drawn upon and who benefits from it? What material and conceptual connections beyond the obvious ones are involved? To what extent do I actually want to engage in them or maintain them? I do not see working on these questions as a purely cognitive-analytical endeavour. Rather, it needs to be based on the reflection of aesthetic experience and pleasure, on auditory understanding, on the body-read sensations that come with collaborative involvement with these things that, like us, are inevitably entangled in contexts of inequality and exploitation of a social, cultural, neo-colonial and ecological kind. But not that I know how to do that.

# **Bibliography**


### **Discography**

Phuture (1987): *Acid Tracks*, Trax.

# **The Upcoming Change in Human Musical Thinking. What Does a Music Professional do in the Age of AI?**

*Nikita Braguinski*

Despite its image of centuries-old tradition and stability, musical theory as a tool for thinking never stands still. It is in a process of constant change, often due to the emergence of new music-related technologies, as well as of mental "apparatuses" such as the cultural techniques of mathematics (Assayag / Feichtinger / Rodrigues 2002; Magnusson 2019; Braguinski 2022).

This essay is dedicated to understanding what the future of musical knowledge might look like in an era in which musical tools based on machine learning would have become common. It addresses the following set of questions:


In the past music theory has already withstood several technological upheavals, such as when sound recording technologies appeared at the end of the 19th century, or when sound visualization allowed new kinds of analysis (for examples of these early technologies, see: Braguinski 2019a, 2019b).

In all such cases music theory was – until now – able to continue asserting itself as a valid tool for the understanding and creation of music by adapting to the new circumstances. This process is, however, neither linear nor deterministic. Different actors influence it. This is why the understanding of change and the creation of informed predictions necessitate a transdisciplinary approach to the interplay of technology and musical theory.

To begin this essay, I want to first of all draw attention to the fact that technological change in music not only concerns practicing musicians and listeners. In discussions of music technology it is often overlooked that it may also have career-changing consequences for musicologists and other music professionals who employ formal musical knowledge.

Accordingly, it is an important task now to look for ways to anticipate and discuss future changes of these tools used by music professionals, creating opportunities for a smoother and more productive transition into the new technological situation.

For discussing the imaginations of a *future* musical knowledge, a possible starting point can be the *history* of musical theory. Existing literature illustrates the fact that theories of music are normally structured, even if only implicitly, around the specific abilities and limits of the human subject (Christensen 2008; Cambouropoulos / Kaliakatsos-Papakostas 2021), meaning that human listening is at the core of these attempts to create a formal system for making and analyzing music.

At the same time, capable technological tools for carrying out various music-related tasks have started to emerge during the previous two decades, following developments in machine learning. Examples are tools for imitating human musical creativity (Nierhaus 2009; Briot / Hadjeres / Pachet 2020; Miranda 2021), and for imitating decision-making in technology-assisted processing of audio, affecting the aesthetics of the resulting recordings (Sterne / Razlogova 2019).

Of all such music-related tools, only the music-generating systems and the recommender systems used by online streaming companies to present their recordings have currently received substantial scholarly interest.

Especially the tools used in the area of computational creativity such as the generation of music are well represented in literature which also includes typological overviews of current and historical technical developments (Avdeeff 2019; Gioti 2020; Bown 2021; Tatar / Pasquier 2019; Lubart 2005). The issues connected to audio recommender systems have also been discussed at length by Eriksson et al. (2018), Born et al. (2021) and others.

By contrast, the future of formal musical knowledge itself, the theories that are created and used in music theory, as well as in musical practice, has remained underrepresented in literature.

Today, one of the biggest challenges for music theory is the ability of modern AI to create style imitations without using the familiar terms and constructions of human music. To put it differently, we are in a new situation where we have works (or imitations of works) created with the help of AI that sound as if they were made according to well-known, explicit *rules* of composition, whereas they are actually created through the training of a neural network using musical *examples*. This means that in musical theory the tension between explicit (rules) and implicit (examples) knowledge is now becoming more and more problematic, making the status of either explicit or implicit mode of education and work much less clear than it was in previous eras of music and technology.

An even greater challenge to human musical theory is the ability of modern machine learning systems to work directly with sound, bypassing the level of musical notation. Musical theory, as it was known for centuries, works with notation, even if it is well known among musicians that many crucial aspects of music such as timbre are not adequately represented by it. Recent projects in direct creation of sound such as *Jukebox* demonstrate that today's AI technology can capture not only the notation-based structures in music, but also the subtleties of performance that cannot be transmitted by notation alone. Examples of recent influential technical papers in this area are Dhariwal et al. (2020) and Engel et al. (2019).

### **Understanding the theory of music as a tool**

A whole field of human activities involves the use of concepts and knowledge from musicology and music theory. Creating a comprehensive overview of all these activities would necessitate input from other disciplines, such as sociology of artistic work and the history of humanities, ideally in the form of an interdisciplinary collaboration. In the argument presented here, the following hypothesis about the overall shape and contents of this group of activities will instead help to launch a preliminary discussion of this topic:

The actors whose activities involve the use of concepts and knowledge from music theory are:


I did not include the creators of music recommender systems into this list of actors whose activities I am personally currently interested in analyzing. The reason is that this specific area of work has recently received much more interest than the others. Algorithmic recommender systems, especially those employed in the streaming and selling of recordings, have already been scrutinized for their potential to shift the power balance between large corporations and individual musicians, and to homogenize global musical culture, with more research forthcoming (Born et al. 2021; *Music Culture in the Age of Streaming. MUSICSTREAM*, 2022; *MusAI* 2022).

This essay, therefore, aims to broaden the scope of inquiry with regard to music-related activities, drawing on the helpful insights that were gained about recommender systems, and looking for specific features that appear in other areas. Arguably, the existing musical theory does not currently offer to music professionals all the possible or imaginable tools for doing their work, but only a subset of this infinitely large area. Also, the existing technologies can only imitate a small part of this subset, and with varying degrees of similitude. Therefore, the following questions need to be addressed before the discussion of the music theory's future role can continue:


Here, again, input from studies of creative processes (Donin / Traube 2016; Zattra / Donin 2016; Born 2018), of didactics and teaching, and of academic work would be very helpful, and I, again, hope that such teamwork would become possible in the future. For the time being, however, the hypothesis that will guide my further analysis in this area is the following:

The parts that can be imitated or could be imitated in near future are:


Such a long list of actual or possible imitations of human musical activities using technology can easily lead to the false impression that human music-related work is on the brink of becoming fully automated, with technology doing and delivering exactly what humans did and delivered earlier. However, when analyzing possible impacts of AI-based tools on such areas, it is also important to keep in mind that technological tools normally do *not* exactly replace or replicate any existing human practice. Instead, the activity is deeply transformed in the process of becoming automated. To make the activity fit the specific strengths of their tools, which are mostly different from capabilities of human actors, creators of technological tools need to redefine the activity from what a *human* can do to what their *tool* does, often claiming that the core of the activity is somehow still preserved in the process (Sterne / Razlogova 2019).

Some examples of how music-related activities could change intrinsically as a result of becoming AI-based are:


# **What does the user want from the tool?**

The future course of development with regard to music theory as a tool also depends on the needs and wishes of those who employ this tool in their day-to-day work. Understanding which technological routes in music are more probable than others therefore additionally involves understanding who has an intrinsic interest in introducing AI-based tools into these areas.

The following group of actors is in my view likely to be more interested in AI-based tools than others:


On the other hand, there are also likely to be actors who will be opposed to the intrusion of an AI-based tool into an area of work that until now existed without such technologies. Particularly, those who are critical of the new tool's ability to really carry out the work it is purportedly made for will resist attempts by other actors to introduce it. Motivations are also likely to be related to issues of labor and job security.

However, it seems probable to me that such anti-technology arguments could be in an unfavorable position in comparison to an overwhelming narrative of higher productivity or even democratization of a previously exclusive and closed-off area of human activity.

# **AI-based tools in music: A possible timeline**

To better navigate upcoming change, it is helpful to sort existing and future technologies into temporal categories, grouping together those that are already available, those that are likely to appear soon, and those that are possible, but only in a relatively longer term. The following overview aims to create a hypothesis about these temporal categories:

Music-related technologies that are commonly available now:


In the short-term future, the following developments seem likely to appear in areas of work related to music:


In a longer-term perspective, the following new possibilities are in my opinion likely to occur:


This short essay does not touch upon questions of ethics and academic independence that arise in the technological situation in which many music-related actors find themselves at this moment. Still, I want to briefly mention here that streaming services like Spotify and social media companies have collected data about user interaction with music that could become a paradigm-shifting new kind of *measurement* of human musical behaviour. However, ethical questions connected to the use of these data, and the very independence of academic research in a situation in which researchers depend on privately owned data, are urgent problems that need to be addressed by a broad discussion in the research community.

Finally, I would like to end this essay with a few words concerning its intrinsically speculative character. Like any other attempt at thinking through the future, this essay will inevitably become disproved by the future itself in certain areas, and confirmed in others. My ideas about possible future developments that I present here do not have the goal of predicting the future in its entirety, but of fostering productive debate, testing ideas for interdisciplinary collaboration, and of triggering discussions on which future-influencing steps need to be made, and in which direction. It is crucial to create more collaborative research on technological tools in music-related areas, bringing together expertise from many areas, including musicology, music theory, computer science, sociology, the history of science and technology, and the history of humanities.

A critical reflection on the possible impact of musical AI tools on individuals, society, culture, and academia is needed to lessen the danger of ignoring change until its negative aspects have established themselves in a too deep and obvious manner, as has happened previously with aspects of social media such as filter bubbles. At the same time, it is vital to steer clear of critique based on purely emotional assumptions about technology and its effects. Informed solutions and sustainable roadmaps for future change are now required in many areas of professional music. A balanced approach is vital for guiding music professionals and musicologists through the impending period of orientation and change due to the emergence of AI-based tools and large collections of music-related data.

# **Bibliography**


*Foundations, advanced approaches, and developments for creativity*. Cham: Springer, 263–281. https://search.ebscohost.com/login.aspx?direct=true&db=ram&AN= A2178886&site=ehost-live (18 February 2022).


Sonic Visualiser (no date). https://www.sonicvisualiser.org/ (18 December 2022).


# **The Roughness of Neural Networks. Jimi Hendrix, Holly Herndon, GPT-3, Timbre Transfer and the Promising Failure Aesthetics of Musical AIs**

*Jan Torge Claussen*

# **Introduction**

Writing a contribution after a talk should not be that difficult, because the core ideas have already been presented and sharpened semantically or artistically in previous works.<sup>1</sup> However, in the meantime, different theoretical and sonic inputs, for instance, the contributions of all the other participants in the symposium, have inspired and motivated me to access this new *data*. The processing of this data by a humanoid author, one might assume, is not totally different from that by an artificial intelligence, apart from the very high processing speed of the machine. But in contrast, the perspective of the human author is shaped by highly individual experiences, evaluations and emotions. Moreover, despite the machine's pursuit of optimisation and perfection, I am more fascinated by the moments when it fails with regard to human expectations and acts unexpectedly, in contrast to my own creations, which I always hope will turn out well.

During my research for this article, I began a conversation with ChatGPT (Generative Pre-trained Transformer), the chatbot from the OpenAI research lab. Some answers have been thought-provoking and are documented and reflected on at designated points in this article. Our conversation, which is accessible online in full, starts with me prompting:

*JTC*: Explain Sound in simple terms! *GPT-3*: Sound is a type of energy that travels through the air (or any other medium, such as water) as a wave. It is produced when an object vibrates, causing the air around it to vibrate as well. The vibrations cause pressure waves in the air, which can be detected by our ears. […] Sound is an important part of our everyday lives. It allows us to communicate with each other, to enjoy music, and to be aware of our surroundings.<sup>2</sup>

<sup>1</sup> Parts of this article have been published in German: "Risse in der Schallmauer. Musikalische KI und phonographische Stimme" (Claussen 2024).

Even though the AI only computes the most statistically likely sequences of words, it is worth noting that by using the words "we", "our", or "each other", the AI seems to imagine itself as part of humanity, from which it is still very different, despite its fascinating successes: artificial intelligences are able to compose music (MuseGAN, Aiva), write texts (GPT-3), translate (DeepL) and produce voices (Holly+, Tone Transfer). Are these AIs, which all play a role in the following, creative or not, and what does this mean for art and artists? The roles of authors of art seem to be redefining themselves once again in the face of new media. Roland Barthes (1967) declared the author dead more than half a century ago to make clear that the author is not the sole ruler of his work; rather, it is the readers who constantly reinterpret the story, place it in new contexts, or connect it to their own experiences. But with the rise of more and more accessible media technologies, different voices, sounds, images or texts could also be combined physically and not only in the mind of readers or listeners. In the digital realm, moreover, materials are translated into numbers and codes, which leads to even further possibilities for creative manipulation through a huge variety of media interfaces. In the postdigital age, the borders between authors and readers, producers and users, artists and audiences combining different voices, sounds, images or texts are wide open. Even if living in the postdigital age means that people are no longer fascinated by digitality per se and have recognised that all areas of life are under its influence (Bishop et al. 2016; Cascone 2000; Cramer 2014), AI as such exerts great fascination.

At the same time, the progress of AI seems to be a lot faster than one imagines. In 2021, I tried to play the trumpet through my voice using AI, more specifically so-called Timbre or Tone Transfer / DDSP (Differentiable Digital Signal Processing) by Google's open-source research project *Magenta* (Carney et al. 2021). The sound was indeed amazing or, as some musicians who have been involved in the project put it:

I think the thing that stands out the most to me is all of the nuances of the sound that make it so much more real than working with a sample. […] It gives you a lot of surprises.

I am excited, by not knowing what could come out of it. (Google 2020)

<sup>2</sup> ChatGPT is based on the GPT-3 language model of OpenAI and can be accessed via https://chat.openai.com/chat. The entire conversation between me (JTC) and the chatbot (GPT-3) from 20 December 2022 is documented at the following link: https://gegenwaerts.com/conversation-between-me-and-gpt-3/.

At that time, timbre transfer still required the detour of recording my own voice, which could then subsequently and monophonically sound as a trumpet, for example (Audio 1a and 1b<sup>3</sup>). Today, this process is instantaneous, as Matt Dryhurst, Holly Herndon's partner, impressively demonstrated when he sang with her voice at the Sonar Festival 2022.<sup>4</sup>

In the following, I will present facts about GANs, a machine learning model that has been used for timbre transfer a lot, and what they might have in common with human learning methods. After that, I will elaborate on some works that appear cleaner and more abstract in contrast to the ones operating in the concrete realm of sound and I will go into more detail about further experiments and sound studies. These will be connected to media theoretical perspectives and artists featuring digital failure aesthetics before I present some compact experiments of my own involving machine learning of Hendrix's guitar and Herndon's voice to end with a discussion of my findings regarding media education and collaboration.

#### **Creativity in Generative Neural Networks**

My focus initially has fallen on so-called Generative Adversarial Networks (GAN), as these have been considered particularly innovative in computer music research and actually have a promising architecture concerning a form of artificial creativity. This is because within this specific method of machine learning, two neural networks are developed that work against each other. The so-called generator creates new data from randomly generated data, while the other neural network, the discriminator, checks whether the newly generated data could also come from the original data set. In other words, it checks whether the images are imitations or could be originals. The calculations are complete when enough imitations are thought to be originals.
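The interplay of generator and discriminator described above can be made concrete with a deliberately tiny sketch. It is not the architecture of any of the systems discussed in this article: both "networks" are reduced to two parameters each, the "original data" are simply numbers drawn around 4 (an arbitrary assumption), and all function names are hypothetical. With such a simple discriminator the training will tend to oscillate rather than settle, so the sketch illustrates the adversarial logic, not a working music model.

```python
import math
import random


def sigmoid(t):
    """Logistic function, clamped so math.exp never overflows."""
    t = max(-60.0, min(60.0, t))
    return 1.0 / (1.0 + math.exp(-t))


def discriminate(x, w, c):
    """Discriminator: estimated probability that sample x is an original."""
    return sigmoid(w * x + c)


def discriminator_step(w, c, x_real, x_fake, lr):
    """One gradient step on -log D(x_real) - log(1 - D(x_fake))."""
    d_real = discriminate(x_real, w, c)
    d_fake = discriminate(x_fake, w, c)
    grad_w = -(1.0 - d_real) * x_real + d_fake * x_fake
    grad_c = -(1.0 - d_real) + d_fake
    return w - lr * grad_w, c - lr * grad_c


def generator_step(a, b, w, c, z, lr):
    """One gradient step on -log D(G(z)), with G(z) = a * z + b."""
    d_fake = discriminate(a * z + b, w, c)
    grad_x = -(1.0 - d_fake) * w  # how the loss changes with the imitation
    return a - lr * grad_x * z, b - lr * grad_x


def fooled_fraction(fakes, w, c):
    """Share of imitations that the discriminator takes for originals."""
    return sum(discriminate(x, w, c) > 0.5 for x in fakes) / len(fakes)


def train(steps=2000, lr=0.05, seed=0):
    """Alternate discriminator and generator updates; return all parameters."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters
    w, c = 0.0, 0.0  # discriminator parameters
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)  # toy "original" data (assumption)
        z = rng.gauss(0.0, 1.0)       # random input to the generator
        w, c = discriminator_step(w, c, x_real, a * z + b, lr)
        z = rng.gauss(0.0, 1.0)
        a, b = generator_step(a, b, w, c, z, lr)
    return a, b, w, c
```

A stopping rule in the spirit of "enough imitations are thought to be originals" could call `fooled_fraction` on a batch of generated samples after each round and end the training once the share exceeds a chosen threshold.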

This approach is not so different from human learning methods. A big part of learning a musical instrument is to copy the playing styles of existing artists and examine how good the self-made imitation is. A rock guitarist, for example, might learn a favourite Jimi Hendrix riff by playing it over and over again, each time going through a small, more or less unconscious self-assessment, asking themselves whether it sounds like the original and adjusting their playing accordingly. As the guitarist becomes more and more satisfied with the adaptation, they might integrate what they have learned into their own playing style and sometimes create new riffs that sound partly like Hendrix but are not his.<sup>5</sup> In the logic of a GAN, the musical sequences could be considered to be originals and with that, the machine learning process has come to an end. However, in machine learning, GANs have first been explored in the visual domain.

<sup>3</sup> https://gegenwaerts.com/wp-content/uploads/2023/01/Audio\_1a\_Tone-Transfer-Web-trumpet.mp3 and https://gegenwaerts.com/wp-content/uploads/2023/01/Audio\_1b\_Tone-Transfer-Web-original\_Recording.mp3. All audio examples are available at this website: https://gegenwaerts.com/aix/ (21 January, 2024).

<sup>4</sup> "AI and Music – Holly Herndon presents Holly+ feat. Maria Arnal, Tarta Relena and Matthew Dryhurst" (30.03.2022) https://youtu.be/Wk6T2WmhuJw (30 March, 2022).

The machine learning model, developed by computer scientist Ian Goodfellow (2014), has been applied to images with astonishing results; it can, for example, produce images of people that do not even exist and still look absolutely convincing.<sup>6</sup> By learning the underlying distribution of data masses, GANs can generate new realistic sample data that have the same statistical properties as the training data. However, the resulting images are not simple replications or the result of averaging. As Rashid (2020) summarizes: "GANs learn to create images at a level far above simply replicating or averaging training data". The algorithms have effectively learned what is possible with a particular image, in this case of a human face, but without actually knowing that it is a face. Unlike in rule-based systems, no rules were given either; the algorithm determines where the nose or eyes belong, for example, purely on the basis of calculations.

As Cádiz et al. (2021) point out, this method of machine learning could even be called creative: "The whole idea of this approach is to make the generative model G (Generator) so good that eventually D (Discriminator) might be fooled by a false input. If this happens it means that G is generating false data that is indistinguishable from real data, also a possible indication of creativity". (Cádiz et al. 2021: 3) In their article, the researchers examine creativity in terms of two aspects within a case study based on Margaret Boden's (2009) definition. According to the cognitive scientist, the creation of something new, unexpected and surprising is essential for creativity. However, this alone is only an indication of possible creativity; what is also important is a form of evaluation, an appreciation, something meaningful. For this, the human being would at least be a central instance. The computer is capable of generating something that, according to its calculations, corresponds to general human listening habits, but it cannot value its compositions.

The strength of artists like Jimi Hendrix lay in defying listening habits and creating something new or surprising in collaboration with his instrument (Millard 2004; Trampert 1998). GPT-3 also seems to confirm this:

Jimi Hendrix was widely regarded as one of the most creative and innovative guitarists in the history of rock music. He was known for his unique and groundbreaking playing style, which incorporated a wide range of sounds and effects, as well as for his ability to improvise and create new musical ideas on the spot. […] He was known for writing complex and sophisticated musical pieces that incorporated a wide range of musical elements, including intricate guitar work, dynamic changes, and unexpected musical twists. […]

<sup>5</sup> Listen for example to the guitar played in the verse of the song "Under the bridge" by *Red Hot Chili Peppers'* (1992) former guitarist John Frusciante, who has made a lot of guitar-playing styles from Hendrix his own.

<sup>6</sup> https://thispersondoesnotexist.com/

So it seems as if the chatbot is able to value Hendrix's creativity, but only in terms of all the things other people have written about him on the web. Obviously, the AI is not able to listen and to develop its own opinion about Hendrix's creativity.

# **Musical AI doesn't listen**

Unlike static images and texts, music has other properties that make its computations more complex: Music is time-based; it has multiple tracks or instruments that change dynamically but are still related to each other; and music is grouped both horizontally and vertically by chords, polyphonic melodies, arpeggios, etc. The MuseGAN (Multi-track Sequential Generative Adversarial Networks) project attempts to account for these interdependencies (Dong et al. 2017). To this end, the neural network was applied to a data set of over 100,000 bars of rock music to generate new piano rolls or images of the MIDI notations of five tracks corresponding to the instruments guitar, bass, drums, piano and strings. The aim was to generate short (4-bar) coherent pieces of music "right from scratch [...] without human inputs", i.e. without human pre-selection, while the addressed coherence of the polyphonic music was guaranteed on the basis of the following parameters: "1) harmonic and rhythmic structure, 2) multitrack interdependency, and 3) temporal structure" (Dong et al. 2017). For this purpose, the scientists put together different architectures of the GANs, which are also clearly noticeable in the sound of the results. Listening to the so-called composer model, jammer model and a hybrid model on the website,<sup>7</sup> it becomes clear, apart from the differences in general, that the presented models make human listeners believe that the models have learned something about music. This is confirmed by a survey in which various listeners positively evaluate the music pieces resulting from the process according to selected criteria (Dong et al. 2017). The authors' study recorded whether the varying four bars of music had a pleasant harmony, a consistent rhythm, a clear musical structure and references or "coherence". Significantly, however, the listeners were not asked to focus on interesting, unexpected or creative outcomes, which I think would be much more thought-provoking. After all, it can only be considered creative to a very limited extent if the machine only produces imitations (see above), i.e. credible variations of something that has already been there.

<sup>7</sup> Website with sound examples of MuseGAN: https://salu133445.github.io/musegan/results (21 January, 2024).

Many AI projects use methods that transform the music on a higher level of representation and not directly in the realm of the physical audio signal: as in the previous example, MIDI sequences, chords or text-based forms of representation. In the project "The lost tapes of the 27th club", song imitations of Kurt Cobain and Jimi Hendrix<sup>8</sup>, among others, were even produced, though it has been very hard to translate the unorthodox guitar playing into MIDI. However, the MIDI data generated on the basis of about thirty songs each was then recorded by professional cover musicians (Grow 2021). Media music scholar Rolf Grossmann (2022) sees these methods as a step backwards, as the music, similar to what was practised in classical music history, would be thrown back on the reading of its score, while cultural and historical context, sound and performance would be neglected. What remains is a "pattern generator" which provides symbol structures ready for a machine or human to perform.

Despite the limitations mentioned above, piano rolls may be well suited to capture essential parameters of instrumental playing (pitch, duration, timing and velocity) in relation to the piano. However, other playing techniques, such as vibrato on the guitar, sliding over the fingerboard, plucking or bending are neglected in favour of better predictability:

The physical process through which sound is produced is abstracted away. This dramatically reduces the amount of information the models are required to produce, making the modelling problem more tractable and allowing for lower-capacity models to be used effectively. (Dieleman 2019)
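The abstraction described in the quotation can be made concrete with a minimal sketch of a piano roll as a data structure. The `Note` class, the `to_piano_roll` function and the two-note phrase are hypothetical illustrations and are not taken from any of the projects discussed; they only assume the four parameters named above (pitch, duration, timing and velocity).

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int      # MIDI note number, 0-127 (60 = middle C)
    start: int      # onset, counted in time steps
    duration: int   # length, counted in time steps
    velocity: int   # MIDI velocity, 1-127


def to_piano_roll(notes, n_steps, n_pitches=128):
    """Return an n_pitches x n_steps grid; each cell holds a velocity or 0."""
    roll = [[0] * n_steps for _ in range(n_pitches)]
    for note in notes:
        end = min(note.start + note.duration, n_steps)
        for t in range(note.start, end):
            roll[note.pitch][t] = note.velocity
    return roll


# Hypothetical two-note phrase: E4 (64) followed by G4 (67).
phrase = [
    Note(pitch=64, start=0, duration=4, velocity=100),
    Note(pitch=67, start=4, duration=4, velocity=80),
]
roll = to_piano_roll(phrase, n_steps=8)
```

In this simplified grid every sounding moment is one cell in an integer pitch row. A string bend that glides between two rows, a vibrato or a burst of amplifier feedback has no cell of its own, which is exactly the information that is "abstracted away".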

<sup>8</sup> Jimi Hendrix' AI Song "You're Going To Kill Me": https://youtu.be/6Ohf97p7u1w (21 January, 2024). The website of the project is dedicated to drawing attention to musicians struggling with mental health. https://losttapesofthe27club.com/ (21 January, 2024).

This is partly comparable to the gamification strategies used in digital applications for learning musical instruments, where the tracking of this type of playing technique on the guitar is also not reliable (Claussen 2019, 2021). But just as it is incomparably more complex to capture and translate the sounds of a conventional electric guitar in its entire sound spectrum than the pressed keys of the Guitar Hero controller, it is also much more complicated and computationally demanding to train neural networks phonographically, directly on the basis of sounds. In return, however, it frees one from being tied to a few symbolic parameters and encompasses timbre, space, amplification and mixing of the entire recording, as in the GANSynth project, for example, where waveforms or spectrograms are used rather than piano rolls (Engel et al. 2019). In all cases, however, images of sounds are the basis of the calculations, even if representations of the entire audio signal are more comprehensive than MIDI scores, which hide sound and performance. It remains to be said: The machine does not listen; it reads, preferably clean symbolic representations.
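What it means for an audio signal to become an "image" can be sketched with a naive short-time Fourier transform. This is a generic illustration under simplifying assumptions (no window function, a synthetic test tone, hypothetical names such as `magnitude_spectrogram`); it is not the preprocessing used by GANSynth or any other project mentioned here.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_size, hop):
    """Naive short-time DFT: one list of frequency-bin magnitudes per frame."""
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        mags = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical test tone: a sine completing exactly 4 cycles per 32-sample frame.
FRAME = 32
tone = [math.sin(2 * math.pi * 4 * n / FRAME) for n in range(4 * FRAME)]
spec = magnitude_spectrogram(tone, frame_size=FRAME, hop=FRAME)

# Index of the strongest frequency bin in the first frame.
loudest_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

The result is a grid of numbers (time frames by frequency bins) that a network can process like the pixels of a picture. Unlike the piano roll, noise, feedback and timbre all leave traces in this grid, but it is still a representation that is read rather than heard.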

# **The Rasping of Neurons**

What is striking about various academic research projects (Cádiz et al. 2021; Dhariwal et al. 2020; Dong et al. 2017; Engel et al. 2019) is that they largely focus on optimizing and perfecting sounds, timbres or musical sequences that are already familiar to us. This is all the more true for commercial applications such as AIVA<sup>9</sup> and Co., which produce soundtracks that are by no means aimed at developing a unique or creative AI voice, but rather at reproducing proven voices in an application-, cost- and licence-optimised way for video games, films or advertising clips. In the search for a peculiar, previously unknown voice, on the other hand, the focus is precisely on those imperfect places where the limits of the medium are exhausted, so that errors, bugs or glitches come to light. This is also a phenomenon that was particularly expressed in the playing of Jimi Hendrix (1967) on the electric guitar, as he made musical use of feedback, noises from amplifiers and electronic components of the instrument. So listeners as well as the transcribers who convert his songs into machine-readable MIDI notes could not always be sure whether it was a disturbance or part of his music.

The history of media formats is determined by two opposing tendencies. On the one hand, the focus is on optimization processes to create an almost perfect sound. So the most important characteristic of the digital CD was that it is not even perceptible as a sound carrier. On the other hand, artists in particular were interested in artefacts and errors of new technologies. In relation to digital media technology or the compact disc, for example, they created a post-digital error aesthetic at the end of the last millennium (Cascone 2000; Claussen 2020; Großmann 2003). Exemplary of this are the compilations entitled *Clicks & Cuts* (2000–2003) by the German label 'Mille Plateaux' as well as various works by artists such as Markus Popp aka Oval or the media artist Yasunao Tone, who, among other things, changed the CD surface by pasting and scratching it to create new and altered sounds that no longer correspond to the original recording (Stuart 2003). In this way, they elevated the jumps, skip noises and loops, which are otherwise perceived as a disturbance, to the aesthetic material of their music.

<sup>9</sup> https://www.aiva.ai/ (21 January, 2024).

As philosopher Sybille Krämer (1998) describes with reference to the theory of McLuhan (1964), media represent the blind spot of media use. Humans only recognize them by their malfunctions: the stuttering compact disc, the feedbacking guitar, the coarse pixels in the video stream when the internet connection breaks, and precisely also the noisy artefacts generated by the algorithm of an artificial intelligence when generating a sound. Such striking artefacts are generated, for example, by the artist Holly Herndon with her AI "Spawn". This instrument, or musicking thing (Ahlers et al. 2022; Ismaiel-Wendt 2016), was trained for two years based on data from her own voice, her compositions, her cooking and her "just living alongside it" to dynamically generate a unique repertoire of sounds. Traceable and audible, Spawn or Holly Herndon's hybrid voice on the album *Proto* (2019) is particularly evident in the first track "Birth". Herndon describes how, through her artistic approach, the neural network becomes audible, which more adequately represents the current state of AI research and its artistic application than would otherwise often be the case:

A lot of the press releases present AI in this very glossy way that erases all of the human labor that went into training whatever the AI is doing. It also creates this illusion of it being more developed than it is. We wanted to be more honest about it. When you're dealing with automated composing, you get a MIDI score at the end, and when you push that through a digital instrument, it sounds really clean. We're dealing with sound as material, and by using audio material, you can really hear the roughness of the neural network trying to figure out what to do next. That was something that we chose, because we wanted to make that clear: [that] the current state of the technology is still developing. That's what's exciting about it: we still can have a say in which direction it goes. (Friedlander, 2019)

For Holly Herndon and other scientists, it is clear that AI alone will never be creative, but that creativity always comes from collaborations between humans and AI. She is already offering a tool for that.

# **Further Experiments from Holly+ to Jimi Hendrix and Tone Transfer**

With *Holly+*, the artist has made a tool available that enables users to sing with her voice. As is the case with other AI tools, new sounds can be created directly in the prompt. This so-called "promptism" (Hayward 2022), as artists have already named this phenomenon, is easily accessible but comes with other challenges: "[S]ome of my AI images took 30+ tries to match closely to my end goal. I had a clear picture of what I wanted in my head, but it's a matter of articulating that in a way that's friendly to the machine" (Hayward 2022). In the case of Holly+, all that needs to be done is to upload an audio file, which can then be downloaded in her vocal timbre. The results of this process reveal the roughness described by Herndon (Audio 2)<sup>10</sup>. Even if the process is not yet available live for every user, it is already possible (Herndon 2022). If it actually becomes more common to sing and produce music through other voices, this naturally raises questions about identity, authorship, exploitation or possible forms of cultural appropriation – even if these questions are less the focus of my contribution. It will be at least as challenging a task as in the cultural practice of sampling to decide how to deal with the diverse identities, artistic freedoms and ethical and legal issues in each individual case. The musical spectrum has expanded.

After experimenting with my voice or guitar as input for Holly+ or Magenta's Tone Transfer, I wondered if, instead of mimicking guitar riffs to learn how Hendrix played, it would be possible to preserve his timbre. Of course, ChatGPT could provide a convincing answer to the question of what the guitar of Jimi Hendrix sounds like,<sup>11</sup> and the specialized literature (Clague 2014; Trampert 1998; Waksman 2010: 166–206) as well as the original recordings provide reliable, but also more varied and complex answers, which are, however, beyond the scope of this experimental study. Therefore, I first decided to use equipment similar to what Hendrix used in most of his recordings. Then I created training data for machine learning by recording my own guitar playing on a Fender Stratocaster using a Marshall amp and various effects devices such as wah-wah and fuzz distortion. In doing so, I tried to imitate some significant playing techniques of Hendrix and replay song sequences. The training data generated could then be used within the environment of a so-called "Colab notebook" provided for the Tone Transfer project, which contains the Python code for the machine learning model including step-by-step instructions (Carney et al. 2021). After these technical hurdles, about 12 minutes of audio were used for training a neural network with 30,000 steps for about 3 hours. The resulting files could finally be played in the music software environment of Ableton Live with the help of the Tone Transfer VST plug-in. At this point, it is necessary to experiment again, this time with the input, which produces different fascinating results in interaction with the timbre (Audio 3–7)<sup>12</sup>.
Even after several attempts with different inputs of one's own voice (Audio 3), a piano (Audio 4), a field recording of birds (Audio 5) or a guitar (Audio 6), the result does not seem to correspond particularly clearly to a Hendrix-like electric guitar, but unique references to the recorded training data (Audio 8) remain. The results clearly differ from the existing presets of other timbres (Audio 7) and produce a lot of aesthetic failures or "rasping of the neurons". They also have an emotional effect that comes with having designed one's own timbre and making it usable in other contexts, similar to how musicians emphasise the value of specially recorded or discovered sounds and music sequences when sampling or DJing. Otherwise, "spawning", as Herndon calls the methods of timbre transfer, clearly differs from digital sampling due to the specific media-technical and cultural collaboration: "So, for example, with sampling, usually you copy and remix a recording by someone else to create something new. But with spawning, you can perform as someone else based on trained information about them" (Herndon 2022: min 3:21).

<sup>10</sup> https://gegenwaerts.com/wp-content/uploads/2023/01/holly\_plus-1\_english-sentence.mp3 (21 January, 2024).

<sup>11</sup> https://gegenwaerts.com/conversation-between-me-and-gpt-3/ (21 January, 2024).

<sup>12</sup> Website with all audio examples: https://gegenwaerts.com/aix/ (21 January, 2024).

# **Conclusion: Human Learning through Machine Learning**

In the age of timbre transfer and after the death of the author (Barthes 1967), two things can be said. Thanks to spawning as a form of machine learning, it will be possible in the future to speak with the voices of dead authors without sampling them directly and thus reproducing something they have already said. And as this is at least unrecognisable to the amateur, it bears numerous dangers and uncertainties. But it also bears many creative potentials that can be found especially at the edges of various AI music productions. As the previous experiment showed, the result is not the trained Hendrix sound, but an in-between that is tempting and more than the reproduced imitation of the original. Such approaches are appealing not least because we never listen exclusively with our ears but also with our other senses, our memories and depending on the most diverse contexts (Schulze 2018; Sterne 2003). In this way, the reference to Hendrix becomes culturally significant without being recognized as a sound event.

The neural networks presented are characterized by their type of so-called unsupervised deep learning. Once the process is set in motion, it cannot be continuously monitored and interrupted by a human. In this sense, the machine acts autonomously at times. Nevertheless, the human determines the parameters of the process, as I have shown with some examples, and ultimately also orders the results with a view to an aesthetically meaningful outcome. Following Marshall McLuhan (1964), the question is: What is the message of AI? In other words: What influence does it have on music and its protagonists, on pieces, compositions and performances? For researching the messages of AI, and for recognizing the limitations during the use of the media that determine our situation, application-oriented approaches are crucial, such as those provided by the Magenta project from the environment of Google's AI framework TensorFlow or by artists like Holly Herndon.

Collaborating with the tools thus means relinquishing part of the control, as was already practised decades ago, in particular in the aleatoric and open works of John Cage, Pierre Boulez or Karlheinz Stockhausen in relation to the classical score or the concert hall: "Those involved with the composition of experimental music find ways and means to remove themselves from the activities of the sounds they make. Some employ chance operations" (Cage 1961: 10). In principle, however, a certain degree of loss of control is present in every musicking thing. Game theory, for example, makes clear what may also apply to musicians and their instruments, namely that players not only play a game but are also played by the game (Claussen 2021; Huizinga 1998). And Pierre Boulez states that the aleatoric piece of music is like a labyrinth offering a number of possible paths that have been precisely designed by the artist, while chance plays the role of setting the course (Boulez et al. 1964).

In the face of machine learning, however, this collaboration seems to take on a new dimension. The playing field of predictable chance, the labyrinth with its diverse possible branches, becomes much faster, more intricate and more mobile and can persist in statistically similar things – or inspire something uniquely new. All the more important is it to dedicate oneself to this field, of course without falling in love with the "gadgets" and losing oneself in it in the sense of McLuhan (1964: 22).

A good strategy is to approach the boundaries, those of predictability but also those of one's own listening habits. Otherwise, users will likely only ever produce what is already present in the respective musicking thing (Ahlers et al. 2022; Ismaiel-Wendt 2016) and thereby correspond to widespread listening habits. Musical tools, games and instruments are part of social contexts and imply specific ways of dealing with them. In the best case, this empowerment of things leads to human-machine hybrids rich in variation; in the worst case, to the one-dimensional repetition of what the respective musicking thing demands at the most obvious level. Users, artists and producers are always in the process of negotiating the boundaries of this influence. Media education takes place within this process. For in musicking things, there is the chance to learn about music production cultures, heterochronous (Pelleter 2020: 25) music history or the power of well-tempered mood and timbre, in short, to practise what Kodwo Eshun (1999: 22) calls "beat education". Concerning machinic learning, this raises the obvious question of intelligence. What about the musical ghost in the machine? Eduardo Miranda (2021: 20) emphasizes that AI is a great tool to study musical intelligence, where intelligence includes human abilities such as creativity, subjectivity and emotion, interaction and embodiment. At the same time, these properties inherent in music make it an interesting field for research, both in AI and concerning human educational processes.

However, as machine learning could be used to create a future of endless repetitions of things that have been heard before, a future where no one can ever be sure if someone is embodying the original or only speaking in the voice of the original, it also offers the chance to value the moment, the live experience full of traces, raw materials, failures, experiments and collaborative performances.

# **Bibliography**

Ahlers, Michael / Benjamin Jörissen / Martin Donner / Carsten Wernicke (2022): *MusikmachDinge im Kontext: Forschungszugänge zur Soziomaterialität von Musiktechnologie*. Hildesheim, New York: Georg Olms Verlag.

Barthes, Roland (1967): "Der Tod des Autors". *Texte zur Theorie der Autorschaft*: 185–197.


Rashid, Tariq (2020): *Make Your First GAN With PyTorch*. Independently published.


# **Discography**

Hendrix, Jimi (1967, 2012): *Are You Experienced*. Sony Music (Sony Music).

Herndon, Holly (2019): *Proto*. 4ad/Beggars Group / Indigo.


# **Digital Aesthetics: A Symbolism of the Body and a More-than-Human Mode of Enquiry**

*Jannis Steinke*

New developments in digitalisation and especially technological innovations in terms of the use of Artificial intelligence often reinstall an anthropomorphism and anthropocentrism by diagnosing a transhumanistic technooptimism or technosolutionism that claims to even be able to "reverse-engineer extinction via AI" (Zylinska 2020: 40). Zylinska calls this the "Anthropocene Imperative: a call to us humans to respond to those multiple crises of life while there is still time" (ibid.). The current attempt by the European Union to regulate so-called high-risk systems of AI displays a similar paradigm:

AI should be a tool for people and be a force for good in society with the ultimate aim of increasing human well-being. Rules for AI available in the Union market or otherwise affecting people in the Union should therefore be human centric, so that people can trust that the technology is used in a way that is safe and compliant with the law, including the respect of fundamental rights.<sup>1</sup>

This guideline emphasizes the 'human-centeredness' as the all-encompassing norm that frames these regulations. The 'human' must be protected by all means from a harm that could potentially be caused by an unsafe technology, in this case by a form of Artificial Intelligence. While this aligns with the EU's responsibilities of protecting its citizens from harm – on a political level –, the concept of 'human' is also problematic in an onto-epistemological perspective. Rosi Braidotti (2013) indicates the fact that "Not all of us can say, with any degree of certainty, that we have always been human, or that we are only that. Some of us are not even considered fully human now, let alone at previous moments of Western social, political and scientific history". (1)

<sup>1</sup> https://eur-lex.europa.eu/resource.html?uri=cellar:e0649735-a372-11eb-9585-01aa75ed71a1.0001.02/DOC\_1&format=PDF (p.1) (21 January, 2024).

She further says: "We assert our attachment to the species as if it were a matter of fact, a given. So much so that we construct a fundamental notion of Rights around the Human" (ibid.). The above-quoted European Artificial Intelligence Act is one attempt to construct a fundamental notion of 'Rights around the Human' and therefore makes a hard cut between 'the human' and 'the technology' or 'the machine' – a diagnosis that falls behind well-established insights of Feminist Science and Technology Studies. Already back in 1991, Donna Haraway questioned the dualism between human / animal on the one hand and machine on the other. She called this a "leaky distinction" (152). An anthropocentrism, however, leaves this dualism intact and focuses only on one side: the human. By reading the guidelines for a regulation of AI above, it becomes apparent that they contain several judgements about technology that inform and infuse this anthropocentric perspective: one reads about trustworthiness or risks<sup>2</sup>. This alludes to a humanist conception of an individual that is (or should be) autonomous, is able to judge something as trustworthy or risky and therefore is guided by reason, wisdom and rationality. It is an "aesthetics (as a mode of philosophical enquiry)" (Fazi 2019: 3) that perceives digitalisation, computation and technology as something wholly other, something not-human that can be perceived by means of human aesthetics. This is a problem, because this mode is nurturing the paradigms of human exceptionalism mentioned above. There are warnings about a "data colonialism" (Couldry / Mejías 2019: 83) where "corporations act as colonizers that deploy digital infrastructures of connection to monetize social interactions, and the colonized are relegated to the role of subjects who are driven to use these infrastructures in order to enact their social lives" (86).
It may seem paradoxical to criticize the human-centeredness as one possible reason for marginalization and this new form of colonialism and not rather a lack thereof (of human-centeredness). I refer again to onto-epistemological critiques of the concept of the 'human' that had been used over history to exclude marginalized people from the realm of humanity. Eduardo Viveiros de Castro (Skafish 2009) analyses the construction of anthropology's subject and hints at the fact that its founding patron could be Narcissus, who always only perceives himself in the Other. Therefore, Viveiros de Castro can highlight the concept of the human as inextricably intertwined with the nonhuman that lacks what the human has: "An immortal soul? Language? Labor? The Lichtung? Prohibition? Neoteny? Metaintentionality?" (43). He is referring to several Western philosophical, anthropological and sociological discourses that have been used to conceptually marginalize people and even deny them the status of 'human'. In this case, those non- or not-completely-humans can easily be exploited in terms of "Robotic" (Hu 2022: xiv) work. Those are 'clickworkers', mainly situated in the Global South, who serve new digital industries. As can be seen, this conceptual exclusion (in the concept of the human) is again deployed and displayed in digitalization. Therefore, this article follows this posthumanist and post-/decolonial critique that reveals the concept of the human as innately exclusive, Euro- and Western-centric. The conclusion is to find a perception and aesthetics of digitalisation that avoids this problem. Thus, this article strives to consider a new form of a mode of philosophical enquiry as an aesthetics that strips itself of human-centeredness and tries to conceptualize a digital aesthetics that is able to grasp the idiosyncrasy of digitalisation and to overcome the duality between the living organic human on the one hand and the dead cybernetic technology on the other hand. Consequently, I first have to ask if an aesthetics that has been human-centred so far in the history of Western philosophy can possibly be transformed, made fruitful, in order to connect it with digitality.

<sup>2</sup> https://eur-lex.europa.eu/resource.html?uri=cellar:e0649735-a372-11eb-9585-01aa75ed71a1.0001.02/DOC\_1&format=PDF (p.1) (21 January, 2024).

I want to inherit from aesthetic theories by Sören Kierkegaard and Friedrich Nietzsche because both have, in their own way, criticized, reflected, and perhaps already deconstructed the history of Western philosophy. Nietzsche, in particular, can be read with Francesca Ferrando (2021: 48–49) as a source of inspiration for a philosophical posthumanism and a critique of anthropocentrism, which makes his thought fruitful for an essay that is to be about non- and more-than-human digital aesthetics.

In her article "Digital Aesthetics: The Discrete and the Continuous" (2019), Beatrice Fazi states that it is time to emancipate the concept of aesthetics from its human-sensory residue in order to arrive at an aesthetics that does not dwell on judgments or tastes, nor does it want to contribute to the reception of art, but rather plays out on the field of conflict between the discrete of digitality and the continuous of the analogue. In her opinion, this is based on two different ways of understanding and grasping reality (2). Before I scrutinize her arguments further, I want to have a look at one of my chosen inspirational sources for a theory of aesthetics. George Pattison (2006) tells us the following about Kierkegaard's concept of grasping reality: "The reason that I cannot really say that I positively enjoy *nature* is that *I* do not quite realize *what* it is that I enjoy. A work of art, on the other hand, I can grasp, I can – if I may put it this way – find that Archimedean point, and as soon as I have found it, everything is readily clear for me" (Kierkegaard 1967–78: §117; cited in Pattison 2006: 78).

At first glance, then, Kierkegaard is concerned with continuity, with an aesthetic grasp of reality. To be able to grasp, to comprehend or to understand, it is necessary to find the Archimedean point, an allegedly safe and 'neutral' standpoint. This point is supposed to be an external position from which "a different, perhaps objective or 'true' picture of something is obtainable" (Blackburn 2005: 21). There is the famous saying that Archimedes once stated "that if he had a fulcrum and a lever long enough, he could move the earth" (ibid.). I want to follow this metaphor of the Archimedean point in a similar, maybe ironical way, as Bruno Latour does in his essay "Give me a Laboratory and I will Raise the World" (1983). There, he transforms this metaphor into a mode of enquiry, which I also want to do here. While Latour shows that dichotomies such as science vs. society or inside vs. outside collapse and it is rather adequate to speak of a lever that unhinges the world (he calls this process translation) (cf. ibid.), I want to use this metaphor to find a new mode of aesthetics that renders the dichotomy between human and non-human or the living and the digital redundant. Kierkegaard says that "*I* do not quite realize *what* it is that I enjoy" (Pattison 2006: 78). The object of enjoyment, nature, is unfathomable to him. To fathom this, he needs to find a safe standpoint and a lever. This conception of aesthetics, which Kierkegaard connects with art, is surely not the kind of aesthetics that Fazi has in mind, since she states that she wants to go "beyond traditional concerns with art" (2019: 2). Therefore, she considers it unsuitable to capture digitality. However, Kierkegaard's aesthetics could then become productive when he is intertwined with Nietzsche. The latter writes in *Thus Spoke Zarathustra* (2006) in the chapter "The Dance Song": "Into your eye I gazed recently, oh life! And then into the unfathomable I seemed to sink. But you pulled me out with your golden fishing rod; you laughed mockingly when I called you unfathomable. 'Thus sounds the speech of all fish', you said. 'What *they* do not fathom, is unfathomable'" (Del Caro / Pippin 2006: 84).

In a way, this is a repetition of what I just referred to with Kierkegaard. Someone is lost because something is unfathomable to them. The solution here is, however, not to find an Archimedean point from which to start fishing in the unfathomable and groundless ocean. It is the subject itself that is getting hooked with a fishing rod, which now could be imagined as the Archimedean lever, without a stable (stand)point. Nietzsche instead transforms this phallic appropriation of reality by letting life mock the subject of aesthetics, calling it to be a fish itself. It is therefore not the human subject holding the Archimedean lever, but life in which the aesthetic subject is about to sink. This movement of sinking into the unfathomable is the attempt to fathom the unfathomable. It is a movement of penetration. By gazing into life's eyes, life is to be invaded to grasp its reality. Yet, life is always letting this movement fail by pulling back the human 'fish' to the surface. Therefore, there is a double bind: On the one hand, the human subject has to acknowledge that it is merely an object of aesthetics by itself. Life is in control here, always hooking the human subject and preventing it from fathoming the unfathomable. On the other hand, the method of aesthetics is questioned in itself: While Kierkegaard said he is only able to grasp something if it lies clearly in front of him, pulled out of the sea by the Archimedean lever, Nietzsche rather talks about a method of diving into the sea by themself (which of course then must fail). It is therefore not the aesthetician who unhinges the world, comprehends it and deciphers it, but life catches the aesthetician with its golden rod, traps them and mocks them. Nietzsche goes on to say:

And when I spoke in confidence with my wild wisdom, she said to me angrily: 'You will, you covet, you love, and only therefore do you *praise* life!' Then I almost answered maliciously and told the angry woman the truth; and one can not answer more maliciously than when one 'tells the truth' to one's wisdom. Thus matters stand between the three of us. At bottom I love only life – and verily, most when I hate it! (Del Caro / Pippin 2006: 84)

The wisdom here stands for another figure, such as life themself. This paragraph describes a triangulation. There seem to be three separate but still tightly entangled parts of one entity: The philosopher of aesthetics, their wisdom and life. A conflict takes place between those three. While we have already scrutinized the conflict between life and the aesthetician, now a third party arrives: wisdom. In Gilles Deleuze's (1986) interpretation of Nietzsche's philosophy we learn how this triangulation might be constellated:

*Philosophos* does not mean 'wise man' but 'friend of wisdom'. But 'friend' must be interpreted in a strange way: the friend, says Zarathustra, is always a third person in between 'I' and 'me' who pushes me to overcome myself and to be overcome in order to live (Nietzsche n.d.: 82 as quoted in Deleuze 1986: 5–6). The friend of wisdom is the one who appeals to wisdom, but in the way that one appeals to a mask without which one would not survive, the one who makes use of wisdom for new, bizarre and dangerous ends – ends which are, in fact, hardly wise at all. He wants wisdom to overcome itself and to be overcome. (5–6)

So here, we see that the philosopher as a friend of wisdom, who is a philosopher of aesthetics in our case, is triangulated by the three named aspects. Wisdom is usually addressed by the philosopher only as a means to an end: As a means to mask their 'true' intention which is the love of life. To tell wisdom the truth (as quoted above) is therefore a malicious movement, because it unmasks the philosopher and reveals them not as a friend of wisdom, but as a lover of life. The love of life, however, is dangerous and bizarre and not wise at all. We also identified the means of mediation that triangulate the aesthetician, wisdom and life: will, covetousness, love, hatred and truth. In the end, love and hatred seem to win the conflict and become the preferred means by which the privileged relation emerges: the relation to life. So there is another double bind: While we saw above that the method of aesthetics that tries to pull aesthetic objects out of the unfathomable sea is profoundly questioned, we now see that life, as the one that actually holds the rod and therefore destroys masculine aesthetics, is the object of the aesthetician's love. They love that which makes themself an object of aesthetics – a fish – and therefore make themself something to be fathomed – by life. Hence, an aesthetics that used to claim to be in charge of sensing, judging and tasting is itself made unfathomable.

Circling back to Kierkegaard's conception of aesthetics that finds nature unfathomable, unlike art, which can clearly be revealed by using the Archimedean method of grasping reality, one can now respond that there is yet an inescapable love of life in this anthropocentric-idealistic form of aesthetics that Kierkegaard calls upon. The truth that Nietzsche's aesthetician (articulated by Zarathustra) would almost confess to their wisdom (see quote above) is the truth of the living. Wisdom is in that case a metaphor for the mask of rationalism. As this mask is something to be addressed and to be overcome (as shown above), this rationalism always already masks a love of life and therefore every rationalism contains traces of this love. In this way, an anthropocentric form of aesthetics is decentred and another kind of aesthetics is made possible, one that knows no standpoint in the outside, no Archimedean point in space, but which itself is always caught, hooked, fished, and whose methodology of fathoming an author's intention is mocked. Kierkegaard says: "I see the author's whole individuality as if it were the sea, in which every single detail is reflected. The author's spirit is kindred to me. . . . The works of the deity are too great for me; I always get lost in the details" (Kierkegaard 1967–78: §117; cited in Pattison 2006: 78).

It seems here that he conceptualizes the methodology of fathoming the unfathomable as an unfathomable ocean, of which only individual reflected points of light can be grasped, but never the whole picture. At this stage, an aesthetics of the digital can now be connected here, an aesthetics of the disparate, the discrete. There appears to be the option to see the entire individuality of the author while at the same time one gets lost in discrete points of light as details. This kind of aesthetics tries to fish dry the sea, but only has a golden fishing rod at its disposal, which always gets caught on its own clothes when it is cast.

With Nietzsche, it is this snagging, this entanglement between the individual and the whole, the indivisible individual part and the divisible whole, that could be named the "breakdown of the principium individuationis" (Tanner 2003: 875). In *The Birth of Tragedy*, he describes how the contradiction between the two principles of the Apolline and the Dionysiac is reconciled and synthesized into an artistic phenomenon as the "dream artist" and the "ecstatic artist" (Tanner 2003: 849). This is Nietzsche's aesthetic theory: he describes how in the "Dionysiac dithyramb, man's symbolic faculties are roused to their supreme intensity" (Tanner 2003: 883). He calls this "symbolism of the body" (ibid.) and thus transforms a representationalist symbolism of the mouth, the face, and the word to a rhythmic symbolism of the dance (cf. ibid.). Angèle Christin speaks of the complicated dance between researcher and algorithm, which is characterized by deception and manipulation (cf. Christin 2020: 912). The two principles of the Apolline and the Dionysiac – dream and ecstasy – emerge here. Those unite in this complicated dance to form a symbolism of the body, thus enabling an aesthetic of the world that tears apart the principle of the individual, divides it into small pieces that dissolve in the sea of the whole (that is unfathomable as such) and reflect individual points of light, each of which in itself tells the story of the whole.

Vicky Kirby (2011) speaks of the harmony of the whole in the fragmentary (25). In her book *Quantum Anthropologies. Life at Large* (2011), she draws a connection between linguistics, language, forensics, figures, mathematics, information and data. By referring to the example of forensics that has only a fleshless skull at its disposal and is nevertheless able to reconstruct the face by using statistical data that is seemingly unconnected with this individual, she identifies an "uncanny structuration" (30) between sense, ideality, subjectivity and objectivity. By elaborating on Jacques Derrida's work about Edmund Husserl's phenomenology, she shows that there is a strange relation between individual perception of the world on the one hand and "lived coherence of a corporeal geometry that can 'join the dots'" (29) on the other hand. Therefore, objectivity is always haunted by subjectivity or – to say it in other words – "the origin is already alive with what has yet to come" (30). A strange inversion takes place of a scientific stance that sets the real as the sphere that converges towards the ideal. This inversion leads to the Derridean notion that there is "an original (worldly) writing through whose radical interiority the referent presents itself" (46). Therefore, there is only a radical interiority, no lost origin or ideality that presence strives to reach. The whole, the origin, the ideal is always already within the fragment, within the referent, within the signifier. This notion is a transformation of a phonocentrism (38) that privileges reason and ratio over body and materiality. An original worldly writing however acknowledges that the split between cogito and being is not something that must or can be reconciled to eventually be able to face Nature as such, where the real and the ideal merge into each other, but is something that is originally divided without an exteriority: "There is nothing outside of the text" (47).
By this analysis, Kirby is able to overcome human exceptionalism, since human language, perception and conception of the world relied on this very separation between representation and truth, reference and concept or signifier and signified. Instead, "'The human' would certainly be a unique determination, yet 'one' whose cacophonous reverberations would speak of earthly concerns" (39). World is therefore 'earthly', not human.

I believe that the symbolism of the body I referred to with Nietzsche captures this transformation. Kirby says: "The world's communion with itself would involve giving itself to itself; a *datum* of 'presents', blinking in the wonder of its *own* openness and generosity" (37). I want to imagine each digital datum as being involved in this self-giving of the world to itself as radical interiority, which is in my view a radical materiality and bodily performance that could be imagined as original worldly writing or bodily symbolism and therefore as a digital data dance. In this dance, the digital data points, which are fragments of the whole, would each already tell the story of the origin and the future ideality. However, a dance has no purpose, no aim, no telos but invites us to blink in the wonder of its openness and generosity.

I want to think digital data points as reflections in the sea of chance, as Deleuze is interpreting Nietzsche:

Only a dicethrow, on the basis of chance, could affirm necessity and produce 'the unique number which cannot be another'. We are dealing with a single dicethrow, not with success in several throws: only the combination which is victorious in one throw can guarantee the return of the throw. The thrown dice are like the sea and the waves (…). The dice which fall are a constellation, their points form the number 'born of the stars'. The table of the dicethrow is therefore double, sea of chance and sky of necessity, midnight-midday. (Deleuze 1986: 32)

I want to imagine the digital data points as the digits on a die's surface, a constellation that forms the number "born of the stars" (ibid.). The stars in the sky of necessity are reflected in the sea of chance. The idiosyncrasy of this temporality is therefore not chance serving necessity by anticipating a necessary specific number that a series of dicethrows will converge to in the future. What is now possible to think instead is the entanglement of chance and necessity by looking at one single dicethrow which truly makes it possible to affirm multiplicity in unity – the multiplicity of one dicethrow.

The number-constellation is, or could be, the book, the work of art as outcome and justification of the world. (Nietzsche wrote, of the aesthetic justification of existence: we see in the artist 'how necessity and random play, oppositional tension and harmony, must pair to create a work of art' (Nietzsche n.d. as quoted in ibid)). Now, the fatal and sidereal number brings back the dicethrow, so that the book is both unique and changing. (ibid.)

As discussed above, Nietzsche's concept of aesthetics is about art that supersedes the anthropomorphic aesthetics of representation and perception and instead posits a symbolism of the body. Now, as can be seen in the quote above, another aspect is stressed: Art and aesthetics are always an intertwining of randomness and necessity, tension and harmony, a play that is played by the world and existence. So in Nietzsche's way, aesthetics is still about a justification, while Fazi wants to overcome "the traditional tenets of the discipline [of aesthetics, JSt], such as beauty, taste and judgement" (Fazi 2019: 2). However, in Nietzsche's way, aesthetics is stripped of its teleological aspect that makes aesthetics "aim (…) to record and take account of [relations] through the sensible" (Fazi 2019: 3). Nietzsche makes it possible instead to welcome an aesthetic temporality that is based on a play.

Returning to Fazi, we can also frame this play or dance of star-reflections on the surface of the sea of chance as a dance between the discrete and the continuous. Beatrice Fazi calls this, as already described above, "two conflicting ways of grasping and structuring the real" (Fazi 2019: 2), two kinds of aesthetics, wherein in turn the Apolline and the Dionysiac appear, as unmediated perfection on the one hand and ecstatic reality on the other (cf. Tanner 2003: 840). Fazi draws attention to the fact that aesthetics for Deleuze is related to an ontological continuity, a continuous variation. This, according to Deleuze, is life itself (cf. Fazi 2019: 3–4). Just as for Nietzsche the Apolline and the Dionysiac "as artistic powers […] spring from nature itself, without the mediation of the human artist" (Tanner 2003: 840), so too for Deleuze life is unmediated, subjectless, and indeterminate. There is also an unmediated relationship to the sensual dimension in Deleuze, just as there is an unmediated expression of dream and ecstasy in Nietzsche. In Deleuze's approach to aesthetics-as-aisthesis, the sensuous is central, but not necessarily tied to corporeality or subjectivity. Here, however, Fazi highlights the problem that in this perspective Deleuze always refers to continuity as the guiding principle of what makes new experiences possible. The discreteness of data points thus appears incommensurable to this (cf. Fazi 2019: 7). She criticizes attempts to read the digital as virtuality, as Anna Munster does in her opinion, in order to then also be able to consider it as ontological continuity according to Deleuze and include it in an aisthesis. As Fazi tells us, Brian Massumi rules out that the digital can be virtual, but problematically presupposes the analogue as superior to the digital (cf. Fazi 2019: 11–12).
Instead, Fazi emphasizes that she also grants the digital – and here she agrees with Deleuze – the possibility of producing something new: Aesthetics concerns ontological creation. However, this creative potential does not lie in virtuality. She denies this because she fears that otherwise a separation will be made between the discrete operations of digital computers on the one hand and the continuity of lived experience on the other, which will then in turn be overcome by a kind of conjuring trick, but this again does not do justice to the peculiar ontology of digitality. Fazi describes digital computation as follows: "When machines compute, formally, they put a task into the finite and defined terms of executable instructions, the aim of which is to axiomatically determine consequences, or outputs, from validly symbolized premises (inputs)" (Fazi 2019: 14).

What is problematic about this is that Fazi reduces computation to so-called symbolic methods. In terms of so-called artificial intelligence, there is a much more sophisticated list of methods today, as Schmid et al. describe:

On the side of symbolic methods, techniques of knowledge representation and logical reasoning are prominent, while the side of sub-symbolic methods is primarily represented by neural networks and machine learning techniques. Yet, this traditional distinction is not comprehensive. [...] [M]ore and more combined or hybrid approaches are coming to the fore, e.g. the entire field of hybrid learning. (Schmid et al. 2020: 427)

However, what I think also applies to sub-symbolic or hybrid AI methods is Fazi's observation that any aesthetics of the digital is an aesthetics of discreteness (cf. Fazi 2019: 14). What she now proposes is a centring of this formal-abstract character of computation as opposed to a consolidation of it in the name of theories of affect that would call the abstract formalism of digitality empty and cold. She wants to explore the extent to which this very formalism could be onto-aesthetic, productive, and generic, and the extent to which complexity could be found within formal abstraction (cf. Fazi 2019: 16). Fazi goes on to argue that for her, computation is not the same as life, the living, and the lived, and that computation cannot account for this. Here I would like to interject and present a different position. Fazi comes to this conclusion in part because she does not want to be accused of supporting totalizing tendencies of digital computation systems. If she were to argue that digital computation can do justice to reality and thus is not reductionist and all-explanatory, she would again be submitting to a regime of instrumental disembodied rationality. This is not her intention. However, she says that an indeterminacy inherent in the living could also be found within a formalism, and thus for a digital aesthetics the connection to the sphere of the generative-living is not necessary at all. Returning to Kierkegaard and Nietzsche, however, I would now like to argue that life need not be rendered as generative-creative. Kierkegaard states:

The person who has not circumnavigated life before beginning to live will never live […]; the person who chose repetition – he lives […]. Indeed, what would life be if there were no repetition? Who could want to be a tablet on which time writes something new every instant or to be a memorial volume of the past? Who could want to be susceptible to every fleeting thing […]? Repetition – that is actuality and the earnestness of existence. (Hong / Hong 1983: 132–133)

Life is entangled with repetition and repetition is outside the logic of temporal progression and crude creation. According to Kierkegaard, life thus does not generatively write (as long as repetition is chosen, which is living in his view). He further says of repetition that it is transcendent. However: "I have abandoned my theory, I am adrift. Then, too, repetition is too transcendent for me. I can circumnavigate myself, but I cannot rise above myself. I cannot find the Archimedean point" (Hong / Hong 1983: 186).

Here a circle closes, a repetition of the beginning of this essay occurs, so to speak: While I showed there that Kierkegaard configures his aesthetics of nature via the Archimedean point, which he sets as a prerequisite for the clear grasp of reality, he refuses to do so at this point. Referring to Nietzsche, I already elaborated on the fact that the human aesthetician is decentralized because life itself holds the fishing rod in order to move the world with it. As just explained, life is repetition, while repetition is transcendent. Is life therefore also transcendent? Rather, I discern a deconstructive shift here. Namely, there is a crucial difference from a transcendence that directs life toward a futurity or a beyond. According to Kierkegaard, "Repetition's love is in truth the only happy love. […] – it has the blissful security of the moment" (ibid. 131–132). If every future and also past is a repetition, then time condenses itself in a moment, congeals into a point that contains an infinity of possible worlds. However, it is not an Archimedean point in the universe, but the wound in which the golden rod of life gets caught. It is precisely the refusal, the laziness, that takes repetition seriously and is thus love of life. To emphasize this conclusion, two other statements of Nietzsche and Kierkegaard can be compared. Nietzsche says about laziness: "And he will also find the little god, surely, […] he lies next to the well, still, with closed eyes. Indeed, he fell asleep in broad daylight, the loafer! Did he chase too much after butterflies?" (Del Caro / Pippin 2006: 83) With Kierkegaard, one could add: "[T]he person who chose repetition – he lives. He does not run about like a boy chasing butterflies or stand on tiptoe to look for the glories of the world, for he knows them" (Hong / Hong 1983: 132). In Greco-Roman mythology, the little god is Cupid or Eros, the personification of love, who is usually depicted as an adolescent boy.
Both Nietzsche and Kierkegaard allude to this here. Love here seems to be loafing, tired of the pursuit of glories. Paradoxically, however, it is precisely at the moment of laziness that love chooses life and repetition, because it then no longer stands on its toes to reach upward, craning the neck toward a transcendence. Instead, from the position of loafing, resting or retraction, love is then ready for the dance to which Zarathustra compels the little god (cf. Del Caro / Pippin 2006: 83). This then allows for a rhythmic symbolism of the body that would not be representationalist, but rather repeats discrete moments and data points.

The formatting of the living as repetition is thus fruitful in connecting it to the aesthetics of discreteness emphasized by Fazi, because the living is not divorced from the digital. The living is not continuous-creative and the digital discrete. Rather, the living is also always already discrete and does not produce anything new, is recursive and iterative, and is able to affirm the moment, a punctuated time, as Jackson Jr. puts it, following Jane Guyer (cf. Guyer 2008: n.d.; cited in Jackson Jr. 2013: 27). The living thus becomes or is always already digital-formal-abstract-discrete and analogue-linear time is a digital space-time. My argument here is not that the digital overwrites and totalizes the living. Nor do I understand it in terms of a digital philosophy that posits digital or discrete codes as the core of physical complexity and therefore hegemonizes a deductive logic (cf. Parisi 2017: 80). Rather, through the approach of the symbolism of the body and the dance of repetition, I want to enable an ontology of the digital that does not have to make a distinction between the sphere of the living and that of the digital, as Fazi does, in order to subsequently ask for a digital aesthetic. Rather, the living is always already intertwined with a digital via repetition; is connected with it in a common dance that would be an expression of an aesthetic that does not need an Archimedean point on the outside in order to grasp reality; an aesthetic that would not be creative-generic, but folded to a present discreteness.

# **Bibliography**


# **About the Authors**

**Robin Auer** is currently working towards a PhD as part of an interdisciplinary research project on automated creativity in literature and music at TU Braunschweig, funded by the federal state of Lower Saxony, Germany. His work focuses on the interplay between human and machine creativity in coupled embodied creative systems, and how this interplay reflects on and subverts traditional assumptions about (particularly artistic) creativity and art itself. Previously, he completed degrees at Ruprecht-Karls-Universität Heidelberg and Merton College, Oxford. His research interests include theories of consciousness and creativity, philosophy of mind, science and language, as well as semiotics, and literary theory. He has published papers on transhumanism & embodiment, as well as the literary works of J.R.R. Tolkien.

**Hannes Bajohr** (PhD) is a postdoc at the Seminar for Media Studies at the University of Basel. He works on the history of ideas of the 20th century, political philosophy, and theories of the digital. Most recently, he published *Schreibenlassen. Texte zur Literatur im Digitalen* (Berlin 2022), *Schreiben in Distanz* (Hildesheim 2023) and *(Berlin, Miami)* (Berlin 2023). Beginning in the fall of 2024, he will be an assistant professor of German at the University of California, Berkeley.

**Jenifer Becker** is a writer and literary and cultural scholar living in Berlin. Her work deals with ambivalences of digital cultures. She completed her doctorate in 2021 at the Literaturinstitut Hildesheim, where she has been teaching and researching since 2015. Current artistic-scientific research projects deal with the influence of adaptive technologies (AI) on writing processes, writing procedures as well as genres. She is the director of the AI-Labkit project. She writes prose and multimedia performances. Her novel *Zeiten der Langeweile* was published by Hanser (Berlin) in 2023.

Dr. **Nikita Braguinski** works at the intersection of musicology and the study of musical technology. He has published on the use of music-related apparatuses and composition aids from a range of historical and contemporary settings, including a 19th-century paper-based device for creating quadrille scores, an early-20th-century musical game, as well as late-20th-century electronic sound toys and video games and the use of AI in music. After working in postdoctoral positions at Harvard University and Humboldt University of Berlin he is currently a fellow of the KHK Kolleg "Cultures of Research" at RWTH Aachen. His monograph *Mathematical Music. From Antiquity to Music AI* was published in 2022 (Routledge; Korean translation: 2023).

**Mar Canet Sola** is an artist and researcher. Mar is a PhD candidate and research fellow at the CUDAN Research Group at BFM Tallinn University. He has a master's degree from Interface Cultures at the University of Art and Design Linz, two degrees in art and design from ESDI in Barcelona, and a degree in computer game development from University Central Lancashire in the UK. Mar was invited as a visiting researcher to XRL, Hong Kong City University, IAMAS (Ogaki, Japan), Blekinge Institute of Technology (Karlshamn, Sweden). As an artist, he works together with Varvara Guljajeva forming an artist duo Varvara & Mar. The duo has been exhibiting in international shows since 2009. Their works were shown at MAD in New York, FACT in Liverpool, Santa Monica in Barcelona, Barbican in London, Onassis Cultural Centre in Athens, Ars Electronica Center in Linz, ZKM in Karlsruhe, and more. http://var-mar.info/

**Isaac Joseph Clarke** is a PhD student in Computational Media and Arts at HKUST(GZ) investigating AI tools for artists.

Dr. **Jan Torge Claussen** is a postdoctoral researcher and media artist. He conducts research at the intersection of music and digital media and experiments with the perception and production of sound in various contexts. He teaches regularly at the Leuphana University of Lüneburg and at the Hamburg Media School. In 2019, he completed his PhD on "Music as Game: Guitar Games in Digital Music Education" at the Institute for Media, Theater and Popular Culture at the University of Hildesheim. For more information visit: https://gegenwaerts.com. Recent publications include "Gaming Musical Instruments: Music has to be Hard Work!" (*Digital Culture & Society* 2020) and "Welcome to the glitch and make some noise: Understanding media through audio hacking" (*Journal of Music, Technology & Education* 2023).

**Scott deLahunta** (PhD) is Professor of Dance, Centre for Dance Research, Coventry University, and co-Director of Motion Bank, now hosted by Hochschule Mainz University of Applied Sciences. His research seeks to deepen and apply the understanding of dance as a form of embodied knowledge and choreography as skillful bodily practice. This builds on over a decade of working within contemporary dance companies as Research Director and Facilitator. Since 2010, he has held a research position at Coventry University and assisted in setting up the Centre for Dance Research in 2015. http://www.sdela.dds.nl/.

Dr. **Dietmar Elflein** holds the position of a professor of popular music at TU Braunschweig. His dissertation deals with analyzing stylistic norms of heavy metal music. Besides Metal Studies he has published essays on German popular music history and the analysis of popular music. His research interests include (national) popular music history, actor-network theory, popular music analysis, and post-colonial studies. He is a member of the advisory board of the German-speaking branch of the International Association for the Study of Popular Music (IASPM D-A-CH). For a complete list of publications visit www.d-elflein.de.

**Shoshannah Ganz** (PhD) is an associate professor of Canadian Literature at Grenfell Campus, Memorial University. In 2008 she co-edited a collection of essays with University of Ottawa Press on the poet Al Purdy. In 2017 she published *Eastern Encounters: Canadian Women's Writing about the East, 1867–1929* with National Taiwan University Press. Shoshannah is editing a collection of essays on Onoto Watanna/Winnifred Eaton with Rena Heinrich and Dominika Ferens.

**Pablo Gervás** holds a PhD in Computing from Imperial College, University of London (1995), and he is currently full professor on computational creativity and natural language processing (Catedrático de Universidad) at Universidad Complutense de Madrid. He is the director of the NIL research group (nil.fdi.ucm.es) and for many years he was the director of the Instituto de Tecnología del Conocimiento (www.ucm.es/itc). He has been the national co-ordinator for Spain of the FP7 EU projects PROSECCO, WHIM, and ConCreTe in the area of Computational Creativity. He has been coordinator for two national research projects (GALANTE and MILES) involving several institutions and principal investigator for two more (IDiLyCo and CANTOR). His main research interest currently lies in the study of the role that computers can play in helping people interested in literary creativity. He is the author of the PropperWryter software, which was used in the process of creating *Beyond the Fence* – the first computer-generated musical, staged at the London West End in 2016.

**Varvara Guljajeva** (PhD) is an artist and researcher holding the position of Assistant Professor in Computational Media and Arts at the Hong Kong University of Science and Technology (Guangzhou). Previously, she held positions at the Estonian Academy of Arts and Elisava Design School in Barcelona. Varvara was invited as a visiting researcher to Creative School of Media at the Hong Kong City University, IAMAS (Japan), LJMU (UK), Interface Cultures in the Linz University of Art and Design, and more. Her PhD thesis was selected as the highest-ranking abstracts by Leonardo Labs in 2020. As an artist, she works together with Mar Canet forming an artist duo Varvara & Mar. The duo has been exhibiting in international shows since 2009. Their works were shown at MAD in New York, FACT in Liverpool, Santa Monica in Barcelona, Barbican in London, Ars Electronica Center in Linz, ZKM in Karlsruhe, and more. www.var-mar.info Recent publications include *The Meaning of Creativity in the Age of AI* (with Raivo Kelomees and Oliver Laas, Estonian Art Academy 2022) and *From interaction to post-participation: the disappearing role of the active participant* (Doctoral Thesis, Estonian Academy of Arts 2018).

**Chelsea Haith** (PhD) is a commercial qualitative researcher based in Sydney, Australia. Her doctoral work at the University of Oxford focused on narratives around future scenarios and technologies and their political intent. While at Oxford she founded the research network Futures Thinking. She is the Executive Producer of the web series *Will Machines Make Us Laugh?* which considers the use of AI in producing comedy, and she is the host of the Narrative Futures podcast. In February 2020, she co-founded the AI & Creativity project Sound of Contagion. Since leaving academia in 2022, she has concentrated on ocean swimming and experiments with AI tools in commercial research.

**Ilona Krawczyk** (PhD) is a lecturer in Acting at Norwich University of the Arts, a performer, vocalist, and researcher of embodied voice. In her practice-as-research PhD, she developed a Process-oriented approach to voicework and performer training focused on care and preservation of a performer's well-being. Her recent work explores possible overlaps between physical, musical theatre, experimental music, and sound art, investigating new ethics and aesthetics of voicework and acting in the theatre informed by post-Grotowskian practice. Ilona is a founder of DreamVoice practice and a co-founder of Insoundout collective.

Dr. **Angela Krewani** is Professor of Media Studies at Marburg University. She is the author of *Moderne und Weiblichkeit: Amerikanische Schriftstellerinnen in Paris* (1992) and *Hybride Formen: New British Cinema – Television Drama – Hypermedia* (2001) and editor of *Artefacts/Artefictions* (2000) and co-editor of *Marshall McLuhan, Transatlantic Perspectives* (2014). She has also published on imaging in the natural sciences, including biomedicine and nanotechnology. She was a fellow at the ZIF in Bielefeld in 2006–2007 and visiting professor at Brooklyn College, New York, in 2008. She completed a book on media art, *Medienkunst. Theorie, Praxis, Ästhetik* (2016), and co-edited a book on authorship, *Constructions of Media Authorship. Investigating Aesthetic Practices from Early Modernity to the Digital Age* (2021). Her most recent publication traces the mediality of the Corona pandemic, *Das Virus im Netz medialer Diskurse. Zur Rolle der Medien in der Corona-Krise* (2022; co-edited with Peter Zimmermann).

**Mattis Kuhn,** artist and curator, works on the reciprocal design of humans, machines and the shared environment. The focus is on text based machines (algorithms, artificial intelligence, formal systems, software) and the intertwining of humanities with engineering. Essential aspects are identity, decentralization, human-machine associations, networks, and AI. Recently his books "Selbstgespräche mit einer KI" (Soliloquies with an AI) and "Grasslands for Insects" have been published by 0x0a and windpark books. He co-curated the exhibitions "I am here to learn – On Machinic Interpretations of the World" and "How to Make a Paradise – Seducement and Dependence in Generated Worlds" at Frankfurter Kunstverein. He studied art at University of Art and Design Offenbach and Experimental Informatics at University for Media Art Cologne. Currently, he is Artistic Associate for Creative Coding at Bauhaus-University Weimar and part of the research group ground zero at University for Media Art Cologne.

**Sebastian Kunas** is a musician, sound artist, and educator with a background in sub and DIY culture as well as in cultural and sound studies. Besides his strong penchant for machinic repetition and playing with sound in the electronic domain, he is a semi-proficient multi-instrumentalist and producer. He is active in various contexts, working with theater collectives, and producing sound art. He is also experienced in playing, recording, and touring with bands. In the department of Kulturwissenschaften & ästhetische Kommunikation at the University of Hildesheim, he teaches electronic sound and music practice and supervises the electronic studio and the recording studio there. He is a member of the collective ARK (Arkestrated Rhythmachine Komplexities), a changing association of artists, scholars, and electronic MusickingThings, who/which perform heterochronicity and multi-track knowledge, looking for post-representative sound formats. Their works have been presented at CTM Festival Berlin, HKW Berlin, ifa-Galerie Berlin, MK&G Hamburg, MARKK Hamburg, among others. https://skusku.de/

**Robert Laidlow** is a composer and researcher based in the UK. His work is concerned with discovering and developing new forms of musical expression rooted in the relationship between advanced technology and live performance. He is currently a Fellow in Composition at Jesus College, Oxford. From 2018–22 he was the PRiSM PhD Researcher in Artificial Intelligence with the BBC Philharmonic Orchestra at the Royal Northern College of Music. His music has been broadcast on national and international radio and television. He has been awarded a Royal Philharmonic Society Composer's Prize, an Ivan Juritz Prize, and has been nominated for two Ivor Novello Composers Awards. Recently he has presented research at the Cyborg Soloists Music ex Machina symposium, the Royal Musical Association Annual Conference, the Society for Music Analysis annual conference, Creative Machine Symposium 2023, and the AI & Music Creativity annual conference.

**Sara Laubscher** is a South African digital artist, game developer, and filmmaker. She is currently working in environment design for video games and delivered public engagement outreach around the role of design in video games at GamesCom 2022. She regularly hosts online workshops dedicated to sharing knowledge, ideas, and skills in Cape Town's fast-growing digital art space. Sara's work has been exhibited and awarded by the New York Film Festival, Lagos International Animation Festival, and the Stuttgart International Festival of Animation. Her directorial work was nominated for a Student BAFTA. Sara has working experience in the fields of visual effects, 3D modelling, and photogrammetry for a variety of projects, including features by Warner Brothers and Disney. Drawing on this, she explores the intersection of technology and creativity in the digital arts, experimenting with ways to implement AI tools in 3D software in order to create visually compelling and immersive virtual worlds.

**Wenzel Mehnert** is a futurologist focusing on the imaginaries of new and emerging technologies. He researches, writes and teaches experimental methods of futurology. In his work, he focuses on the intersection between speculative fiction and the assessment of future technologies (e.g. A.I., SynBio, Internet of Things, etc.). He worked as a researcher at the Berlin University of the Arts, co-founded the Berlin Ethics Lab at Technical University of Berlin, and currently lives in Vienna, where he is employed at the Austrian Institute of Technology and works on ethical guidelines on new technologies for the European Commission in the project TechEthos. His most recent publications include "The future is going to be weird. Zur Ästhetik kommodifizierter Mind-Upload-Visionen" (in M. Tamborini (ed.), *Die Ästhetik der Technowissenschaften des 21. Jahrhunderts*, 2023) and "Responsible AI Adoption Through Private-Sector Governance" (with S. Wiesmüller, N. Fischer and S. Ammon; Springer 2023).

Dr. **Jan Röhnert** is Professor of Modern Literature in the Technical-Scientific World in the Department of German Letters in the Faculty of Humanities at TU Braunschweig. His research interests range from avant-garde poetics & cinema, autobiography & war, landscape & geopoetics, nature & wilderness writing to feminism & contemporary literature. Besides his teaching and critical work, he has also published poetry, nature writing, and translations, mainly of American poets (including John Ashbery, Robert Creeley, Ron Padgett).

**Kate Ryan** is a performer from the UK, who is currently working on projects based in the UK, Germany, and France. She has studied social anthropology and devised theatre, and is currently taking an MFA at Trinity Laban/Independent Dance in London. She spent two years collaborating with Studio Kokyu, based at the Grotowski Institute, Poland; and has worked with REPLICA Institute as a researcher and performer since 2018. She is interested in the physicality of vocal presence in space, and in the polyphonic qualities of both body and voice.

**Mika Satomi** is a designer and an artist working in the field of e-Textiles, Interaction Design, and Physical Computing. Her work explores how we relate to technology and what we really want from it. She often collaborates with musicians and performers, creating technology-embedded costumes and interactive systems. Since 2006 Mika has collaborated with Hannah Perner-Wilson, forming the art collective duo KOBAKANT and creating projects with e-Textiles and Wearable Technology Art. She is a co-author of the online database How To Get What You Want. Currently, she works as a guest professor at Weissensee Kunsthochschule, Germany. http://www.nerding.at.

Dr. **Jens Schröter** has been Chair of Media Studies at the University of Bonn since 2015. He is Director (together with Anna Echterhölter, Andreas Sudmann, and Alexander Waibel) of the research project "How is Artificial Intelligence Changing Science?" (VolkswagenStiftung 2022–26) and was a fellow at the Center of Advanced Internet Studies (2021/22). Visit www.medienkulturwissenschaft-bonn.de / www.theorie-der-medien.de / www.fanhsiu-kadesch.de. Recent key publications include *Medien und Ökonomie* (Springer 2019), *Media Futures. Theory and Aesthetics* (with Christoph Ernst; Palgrave 2021) and *Tech / Demos, Navigationen* (with Julia Eckel and Christoph Ernst 2023).

Dr. **Christoph Seelinger** is currently a research assistant in Modern German Literary Studies at the Institute of German Studies at TU Braunschweig, where he completed his doctorate in 2021 (published as *Tod im Kino. Legitimationsstrategien indexikalischer Todesszenen in ikonisch-symbolischen Ordnungen des Kinos*, Büchner 2022). Previously, he completed the interdisciplinary Master's programme "Culture of the Techno-Scientific World" at TU Braunschweig. His research focuses on the interfaces between film and literature, border crossings in (audiovisual) media, the connection between literature/film and the avant-garde, and the so-called "trivial culture".

**Diana Serbanescu** is a transdisciplinary performance artist and researcher who worked as a group lead in the field of AI ethics at the Weizenbaum Institute and is the founder of REPLICA Institute. In her practice, Diana explores feminist approaches to knowledge creation, the potential of poetic machines, and the continued validity of traditions in an era of artificial intelligence and digital colonisation. She believes in the radical potential of performance art practices to inspire social change. She holds a PhD in Computer Science from the Free University of Berlin and a BA(Hons) in Performance from the University of the West of Scotland. In 2019–2020 Diana and REPLICA were awarded a VolkswagenStiftung Planning Grant within the funding track AI and the Society of the Future, for the project *The Shape of Things to Come*. http://diananeranti.com/.

**Jannis Steinke** is a PhD candidate at Heinrich Heine University Düsseldorf in Media and Culture Studies. He is conducting an ethnographic research project at TU Braunschweig on AI-based health applications that is informed by and situated in Feminist Science and Technology Studies. He has completed a Master's degree in Social Work in Cologne. He teaches at several universities (Universität Hamburg, Hochschule Düsseldorf, Universität zu Köln amongst others). His classes focus on New Materialism, Poststructuralist Philosophy (Jacques Derrida), Post-/Decolonial Theory, Gender and Queer Studies, and Feminist Science and Technology Studies. He has published papers on a new materialist reading of E.T.A. Hoffmann's 'The Sandman', on compostings (Donna Haraway) of the "Eye/I" and a feminist comment on the European Artificial Intelligence Act.

**Jan Løhmann Stephensen** (PhD) is an associate professor at the Department of Aesthetics & Culture at Aarhus University. His research interests are cultures and practices of participation, democracy and the public sphere, creativity and its diffusion into non-art related spheres like work life, economics, policymaking, university research agendas, new media technologies, etc. In recent years, he has been working on the project 'Post-Creativity' about AI, art, and creativity. Currently he is embarking on a project about the impact of algorithms on democratic public discourse and practices, tentatively entitled 'ChatDemocracy'. He is co-editor and founder of *Conjunctions — Transdisciplinary Journal of Cultural Participation*. Recent publications include the (short) book *Creativity*, Johns Hopkins UP (2022), and articles such as "Artificial creativity: Beyond the human, or beyond definition?", *Transformations* no. 36 (2022), and "Creativity versus automation: towards the last frontier, and with our jobs on the line?", *Balkan Journal of Philosophy* Vol. 15 No. 1. (2023).

**Björn Tillmann** is a freelance musician, music producer and (vocal) teacher. He studied Pop Vocals at the Institut für Musik in Osnabrück and completed a Master's degree as Producing and Composing Artist at the Popakademie Baden-Württemberg. Musically, his interests encompass a wide spectrum, ranging from techniques derived from "Neue Musik" to modular synths, sampling, and neural synthesis. Tillmann is part of the German electronica band MAHENDRA, releases music under the name "baerk." and produces music with and for various artists. In addition, he is the founder of the Raufaser Musikgruppe, which comprises a label and a booking collective. Further information at www.bjoerntillmann.de.

Dr. **Eckart Voigts** is Professor of English Literature at TU Braunschweig. He has written, edited and co-edited numerous books and articles, such as the special issue of *Adaptation* (vol. 6.2, 2013) on transmedia storytelling, *Reflecting on Darwin* (Ashgate 2014), *Dystopia, Science Fiction, Post-Apocalypse* (WVT 2015), *Companion to Adaptation* (Routledge 2018, with Dennis Cutchins and Katja Krebs) and *Companion to British-Jewish Theatre* (Bloomsbury 2021, with Jeanette Malkin and Sarah J. Ablett). His paper "Algorithms, Artificial Intelligence, and Posthuman Adaptation: Adapting as Cultural Technique" was published in *Adaptation* (2021).

Dr. **Wolf-Georg Zaddach** is Professor of Music Production and Songwriting at Macromedia University Berlin and researcher at the DFG / AHRC-project "Songwriting Camps in the 21st Century" at Leuphana University Lüneburg/Germany as well as the Artistic Research Center at the University of Music and Performing Arts Vienna/Austria. His research interests include art and practice research, climate crisis and music, jazz and heavy metal, music analysis, and production, as well as AI in music. He performs regularly as a guitarist. His publications include "Künstliche Intelligenz in der Musikproduktion" (with K. Frieler and Swen Meyer, in *Musik und Internet. Phänomene populärer Musikkulturen*, ed. Peter Moormann & Nicolas Ruth, Springer 2023) and "'Death of Mother Earth, Never a Rebirth'? Zum Verhältnis von Musik, Klimawandel und ökologischer Nachhaltigkeit" (in: *~Vibes – The IASPM D-A-CH Series* 2/2022), http://vibes-theseries.org/zaddach-death-of-mother-earth/ . www.wolf-georgzaddach.com.