Is It Time for Augmented Reality Documentary?

Jeremy Kirshbaum 7-28-2019

I recently became a research fellow at the Johns Hopkins Immersive Storytelling and Emerging Technologies Lab. Working on projects there has inspired some thinking for me about immersive reality journalism in general, and augmented reality documentary in particular.

Documentary and the multiplicity of time:

In Carlo Rovelli’s book The Order of Time, he describes how anyone, with simple off-the-shelf timepieces, can observe the differences in time passing. A timepiece at a high elevation will slowly, but not imperceptibly, advance ahead of a counterpart at a lower elevation. This is not because of some quirk of measurement. It is because time, as we conceive of time, passes more quickly at a higher elevation. The times are different. The truths of time are neither universal, nor purely subjective.

There is an alternate universe in which the “fake news” phenomenon is actually a beautiful, thoughtful conversation about this subjectivity. In this alternate universe everyone is thinking about what it means to report on something that happened, when even the most fundamental truths like the progression of time contain some relativism and orientalism. This, as we all know, is not the case. In our world today, “fake news” is an utter adulteration of truths with utility. It is an epistemology of transaction, and of ends. In this sphere, truths are prostituted totally. They are respected only for their effects or discarded and ignored. To dance with them is performance only, or self-delusion.

Like journalism, one thing (though not the only thing) that distinguishes documentary from other media is that, by definition, it claims to be connected to literal truths. A documentary claims to have “happened”, and to be showing the viewer, experiencer, or participant evidence of this happening. This is not as simple as it might at first seem. Many things seen in video or photo documentaries are constructed. They are re-enactments of what “usually” happens or “would have” happened. It is considered completely acceptable for a filmmaker to ask a family to act out what it looks like for them to be “getting ready for school,” and the filmmaker feels no responsibility to inform the audience that they did not just happen upon the event. The justification for this practice is that truths are more clearly expressed through this construction, not obfuscated. In other forms of art, like painting, sculpture or animation, the constructed nature is obvious, unavoidable, and deliberate. In documentary, it is deliberate, but neither unavoidable nor obvious. Yet we cannot call this a prostitution of truths. This is not “fake news.” There is, in the ideal case, the best possible effort to communicate something real. The relationship is not always pure, but it is deeply involved, institutionalized, dynamic, and at least in some way mutual. The dance is energetic but fraught. The lead changes often.

So we begin on shaky ground when we begin to ask the question of what “augmented reality documentary” is, and what it might be. The word “empathy” gets brought up a lot in this context, but it is not the crux of the question. Many things inspire empathy, fictional and non-fictional. After reading Steinbeck’s The Grapes of Wrath I feel empathy for the mostly long-dead migrants of the Great Depression, although it is entirely fictional. It does connect to truths, but through deliberate and explicit non-truth—a fiction. Documentary aspires to be different. It claims to connect to “real” past or current events through directly collected evidence. It is on the basis of this certification that documentary takes its exception from the realm of pure art, and attempts to decamp in another direction, one combining the instrumentation of science with the interpretation of art. Documentary emerges from a question of truth, not empathy. Thus does the crux of the question of “what is augmented reality documentary?” emerge from a question of truths, not empathy.

So what is this difference in the conveyance of these muddied truths if we move from photos and videos to immersive realities? Who is the inheritor of this corrupted epistemology? What does this mean for truths and art as an expression of them? Are immersive media a path to salvation—a way to represent fully the multiplicity and simultaneous connected existence of many truths, or simply a descent into greater layers of electronically-generated obfuscation and solipsism? What does it mean to dance with the courtesan?

Immersive reality and augmented reality–a potential return to the ecstatic bodily truths of the festival:

A core part of the Jewish holiday of Passover is the Seder. It is a communal meal that includes specific foods, stories, and prayers. Everything in the meal connects back to an event that took place in history—the flight of the Israelites from Egypt. The specific foods of the meal connect back to specific parts of the narrative, and the story unfolds through a series of collective processes with the food at its center. The story that unfolds from the food is also written in a text, the Torah, but the foods form a physical, or embodied, experience of the story that accompanies the spoken or written narrative and brings the past experience into being in the present moment. The Passover Seder is an immersive reality experience that connects the present moment to an alleged truth—an event from history. The immersive reality is conveyed through food in order to bring a story from elsewhere into being in the present, collapsing the present and a narrative into a single set of moments.

With immersive realities, there is an embodied experience, and the potential for the bodily experience of a narrative or truths. It is one of our oldest forms of conveying truths. Consider, for example, festival dances. During festivals in many cultures, traditional dances tell a story of what inspired the festival. Festival dances, as opposed to dance performances, are participatory. Many people, if not all, participate and act out the bodily movements that relate to a previous event in history. Like the Passover Seder, they create an experience in the present that attempts to connect to some truths.

So if both these experiences, like documentaries, create an experience in the present that connects to truths elsewhere, why are they, or why are they not, documentaries? If they are not documentaries, what distinguishes them from documentary? If they are an immersive reality experience, are they augmented reality, or virtual reality? If they are not these things, what distinguishes augmented and virtual reality from these older forms of immersive reality truths-telling? And ultimately, if these new immersive reality truths-telling forms are different from what came before, what is the implication? How does it change the way we experience, convey, and understand truths?

In the Agbadza dance of the Ewe people of Ghana and Togo, participants re-enact the escape from the oppressive king Togbe Agorkoli several hundred years ago.

Documentary differs from other forms of narrative that claim to connect to literal truths in its use of scientific instrumentation. Unlike other forms of narrative, documentary uses tools that directly collect information from an environment, and that information is then used as the base material to create a narrative. These tools are various but usually center around the collection of photons and sound waves, either imposed directly into a chemical medium or into some series of analog or digital symbols that enable their reproduction elsewhere. These reproductions are almost always experienced in a disembodied format. They appear through a series of photographs or through video that appear either in physical form or on a screen, sometimes accompanied by audio. In order to experience these narratives, the viewer must suspend their awareness of the present moment, and their immediate surroundings, and instead choose to attend to the images and sounds that form the narrative. The experiencer is not really such—they are a viewer. They are invited to leave themselves and disappear into something, as opposed to entering something as themselves, in a moment and a body.

Augmented and virtual reality differ from other forms of immersive experience in that their primary mechanism of conveyance is not dance or food, but a class of devices worn on the head that utilize carefully offset digital images to create the illusion of depth and visuals that appear in any rotational direction. With virtual reality, the experiencer cannot see any part of the physical environment which surrounds their body. With augmented reality, the physical environment surrounding them is visible, with visual elements overlaid on top of it with varying levels of interaction and responsiveness. Either of these can include dimensional sound elements, or spatialized sound. The specifics of different devices are many and wondrous, but these technical elements are sufficient to provide the differentiation important for our discussion. Like video screens and photographic images, these devices can be used to represent either truths or non-truths. For augmented reality and virtual reality documentary, we will concern ourselves with thinking through the implications of the former. One important feature of these devices is that they enable the creation of narratives in which the experiencer remains in a body, in a present moment. With augmented reality specifically, narratives can be built into the environment of the physical present moment as well.

With augmented and virtual reality, we have the opportunity to reintroduce the embodied truths-telling of festival and ceremony into documentary. The leverage point for this is in the interaction mechanics of the experience—how the experiencer controls and responds to the experience dynamically. If the truths are difficult, the interaction can be difficult, requiring uncomfortable movements from the experiencer. If the truth is exhilarating, it can prompt the experiencer to wave their arms or do something comedic. This is the most powerful and most difficult element of immersive reality storytelling. It is especially difficult because of the tools that are used to create the experiences, which are usually software created originally for video game design. Because of their lineage, these tools privilege frictionlessness, whizzing, popping, explosions and scorekeeping. There is nothing wrong with these things (in fact, they are often quite wonderful), but they are not necessarily the most useful mechanics for narrative storytelling, and modifying them to make them so often means working upstream against what is efficient and convenient. Designing mechanics that express embodied truths is the most powerful potential of augmented reality documentary, but such mechanics are the least commonly seen, and the most difficult to execute. If augmented reality documentarians can achieve this, augmented reality documentary will combine the instinctive assumption of reality attributed to technologically collected and generated visual and auditory experience with the embodied experience of meaning that is one of our most ancient forms of collective memory. This will create enormous possibilities for the experience of truths previously impossible in human history. If they cannot achieve this, the genre will be a gimmick that will quickly fade as people move to easier methods that achieve better results.

The Referent:

When I look at a photo or video, something specific happens in my kinesthetic, pre-intellectual experience of it that does not happen with other images. I instinctually feel it has happened. I, without thinking, attribute a reality to it. I feel its connection to a past event, real or imagined. I do not feel this when I look at other forms of representation.

When I see Pablo Picasso’s Guernica, I do not instinctively feel that what I am seeing really happened, in the sense that the painting reflects a moment in time that really occurred. The painting depicts the town of Guernica’s brutal destruction by the Germans in one of the earliest large-scale aerial bombings of a civilian population, during the Spanish Civil War. This destruction really did occur. However, this is only revealed to me through layers of explanation and contextualization.

When I look at the photograph “Lunch atop a Skyscraper,” I instinctively feel that those people really were there sitting on the skyscraper’s girder quietly eating, and they were. However, the event was not spontaneous, but staged as part of a publicity stunt for the Rockefeller Center. The instinct of its truthfulness is immediate and kinesthetic. After context and additional information, its untruth, its deliberate misleadingness, is revealed. In both cases, the produced visual art piece connects to a “real” referent. In the case of Picasso’s Guernica, the instrument is Picasso and his brush, filtered through Picasso’s emotional frustration and pain, his cubist painting style, and his distance from the actual event. “Lunch atop a Skyscraper” refers to a real moment when real workers sat on a girder because they were paid to do so during a publicity stunt, filtered through a scientific instrument—a camera that captured the photons bouncing from them—and through the photographer’s desire to reflect the Rockefeller Center in a positive light. The difference is the instinctual reaction of reality and unreality. This instinct is what makes photo and video documentary the most dangerous medium for the seeker of truths.

There are methods of using scientific instrumentation in immersive reality, some of which elicit this instinctive feeling of reality, and some that do not. 360 videos do usually elicit this feeling, similar to a flat video. 3D scanners, many of them handheld, also can introduce “real” objects into immersive reality, but they are in general too imprecise to create an instinctive connection to reality. Outside of 360 video, nothing in virtual or augmented reality yet generates this instinctual connection to real events. From the point of view of immediate, scientific representation this might be an impediment. From the perspective of revealing truths, it is not. Currently in virtual and augmented reality media, it is obvious and undeniable that the visual components are constructed, and the viewer must be convinced that they relate to truths in spite of this. This unavoidable honesty is an advantage. Being unable to include visuals which someone might mistake for reality when they are in fact constructed prevents the immersive reality journalist from misleading the viewer without justification.

However, even if we accept this inevitable fact of artistic interpretation, the referent still matters. For evidence of this, we look not to immersive media or film, but to Claude Monet. Monet’s water lilies were not, in fact, painted from some pond that he happened upon by accident. Monet’s water lilies were in a pond at his home that he himself designed. He was obsessive about its maintenance. A team of four employees maintained the pond, with jobs like skimming the water’s surface to remove dust, and dunking the water lilies to keep them pristine. This effort was apparently not enough, since Monet eventually paid to have the roads surrounding his house paved (in his time this was not the norm) in order to further prevent dust from settling on the water feature so central to his practice. This obsessive practice is notable because the instrument of collection for the water lilies is not a camera but Monet’s own eyes and mind. His focus on the real referent as an essential component of his final product is an acknowledgement of the inextricable dynamic between what he actually witnessed and what he eventually produced. This, perhaps, is the best metaphor for the high bar in immersive reality journalism today. Deep attention to the original referent and a belief in the necessity of its quality, constructed or no, despite the inevitability of interpretation, desirable or no.

It would be tempting to say that the difference between this experience in augmented reality versus video or even just audio is that it feels more real because it is more immersive, but this is not the case. Perhaps one day this will be the case, but it is not with what is available today. What distinguishes augmented reality documentary right now from other media with regard to its relationship to the original referent is in fact the additional layers and distance from the original referent, with the factual pitfalls and creative opportunities that emerge in that distance.

Narrative, desire, and agency:

A photograph provides an entry point into a moment in time. The image evokes a feeling or a thought, but a photograph itself does not convey a narrative. A photograph can provide clues to an underlying narrative, or cause the viewer to imagine a narrative, but it cannot convey a narrative in itself. A narrative must have progression, and a photo lacks progression, except through implication. A painting, however, can tell a story because it is not bound to a moment—it can contain progression and thus a narrative. Within progression is multiplicity, in the sense that progression must have multiple interlinked moments that form a whole. This is a requirement for narrative.

Neither a moment nor a narrative are truths, but they can be access points to truths, an incomplete intervention in the fabric of the present moment into somewhere else that either contains some principle that applies to the present moment, claims to be happening alongside the present moment, or happened before in a way that demands the attention of the present moment. With documentary, the truths at hand are not general principles, but an event, or series of events that claims to exist or have existed.

Most text-based and video narratives share linearity as a common necessity. One part of the narrative necessarily follows another deterministically. This is a destruction of reality’s inherent multiplicity for the sake of simplification. In the case of video (pre-internet) it is a requirement of the technology that narratives be constructed in this way. The effect of this is that the viewer has no control over what is revealed and when. This is not so for augmented reality narratives. The parts of the narrative may interlink with each other freely, branching and interweaving. This interrelation and divergence is a much more powerful standpoint from the perspective of describing truths than linearity. It enables a direct conveyance of truths as they actually exist—neither singular nor isolated.

Since a narrative is a disruption of the present moment, the viewer or experiencer must be held within a narrative by desire. Through desire, a creator elicits the attention of a viewer into a narrative, and elicits their acceptance of the cost to their experience of the present moment they embody. With a linear narrative, the only desire necessary is to attend. Other than this prerequisite, the narrative proceeds automatically, with no other input from the viewer. In non-linear narratives with branching elements, the desires of the audience must be more complex, because agency is introduced. The experiencer must choose which parts of the narrative to engage, and maintain motivation to continue moving through the narrative. Thus the challenge of the creator is less to decide what information to present to the experiencer and when, and more to provide them with a basis for decision-making. Who are they in the narrative? What parts of the narrative interrelate most strongly? What parts of the narrative do they need to experience to achieve completeness, and which are optional? All of this must be conveyed, or the experiencing subject will lack desire to proceed through the experience and will move from vague curiosity, to randomness, to fatigue.

The experience of embodied truths can be uniquely ecstatic. Reflecting on the festival dances we considered earlier, we notice that the group remembering and reinforcement of these truths often happens in celebration, alongside mass consumption of surplus resources and alleviation of work duties. The experience of the story itself is also joyful—there may be some participatory dances that elicit sadness but they are few and far between. Even solemn festivals and ceremonies contain a catharsis, or a righteousness. This pleasure holds people in the narrative and collective memory and rewards them for immersing themselves in the story.

Augmented reality experiences also take place within the embodied location of the present moment of the experiencer. This is an old form of narrative, but it is new for electronically generated narrative formats. This, potentially, collapses the usual cost of attending to narrative, since attending to the narrative and to the present embodied moment, in the case of augmented reality, can be the same. This presents both opportunities and pitfalls. Creating an augmented reality experience that does not interact with the environment of the experiencer is useless, since if it does not it could more easily and effectively be made using virtual reality, video or another medium. For documentary, this presents an impossible situation. A fictional narrative can claim to exist anywhere, so its being in the location of the viewer presents no issue. With documentary, we claim that the narrative takes place in a specific place that is not in the location of the viewer. There is no easy way to reconcile this tension.

Conclusion:

The tools of augmented reality documentary, and augmented reality storytelling in general, are still new and imprecise. Perhaps counterintuitively, this means that the medium is at the height of its power. The nascence of the medium demands an intentionality to its artistry, and an exposed justification of its truthfulness. As tools evolve to record and portray events with increasing amounts of scientific accuracy, this demand will slowly fade, and gravity toward a convenient, economically efficient, single, self-evident and totally ersatz truth will reappear. The artistic potency of the medium as it exists today emerges from its imprecision and difficulty. This time is the greatest opportunity for immersive reality art before it is overwhelmed by an impending dictatorship of quantification. This great oppression of “data” threatens to overwhelm all epistemologies, but the danger is greatest for digital media, and most of all for digital documentaries. The use of scientific instruments and computation in documentary mistakenly conflates the accuracy of truths with the precision of those instruments. For documentary, which employs the instruments of science for the practice of art, there will always exist an inevitable mongrelization. However, documentary using emerging technologies straddles not merely a border but a battlefront. A war is taking place over the usage and role of these technologies in our lives, whether they will be pathways to the expression of artistic truths or entry points for corporations to pipe data-optimized advertising and images directly into our minds. Augmented reality documentary and documentary with emerging technologies in general must fight to subsume these technologies into art and truths and avoid being consumed into quantification and scientific materialism. This is the latest war in a conflict that began with the emergence of photography into art. Or is there something more?

What is most at threat in these battles is the reality of multiple, relative, coexisting truths. Augmented reality documentary has the potential to generate narratives that represent truths as we know they exist in the world—as neither universal nor entirely subjective. The danger is that enhanced precision will draw us back into an ersatz unity, a single truth that never existed. Back into economically efficient linearity. The truly new opportunity of augmented reality journalism is to end this conflict once and for all. To create narrative formats that contain relativism without descending into subjectivity. That allow us to step as ourselves in the present moment into greater understanding without any sacrifice made by medium-enforced limitations or expediency. There is time now to set off in this direction, although the opportunity will not last forever. We must take advantage of this time, as it will pass quickly for us, though not as quickly as for those dancing in the mountains above.