The Versioning Machine for Audio: Introducing VM 5.0

Over the past year, the MVP team has been working on updates to the Versioning Machine. The Versioning Machine is a framework and an interface for displaying multiple versions of text encoded according to the Text Encoding Initiative (TEI) Guidelines. While VM 4.0 had been updated to P5 compatibility, VM 5.0 is now HTML5 compatible too. The most significant outcome of this update is that the VM can now incorporate sound-based, image-based, and text-based versions as part of our understanding of the constellated “text” (in the Barthesian sense). VM 5.0 is still under development and will be released in the coming months with samples that demonstrate the new audio functionality. Tanya Clement, Martin Holmes and Susan Schreibman contributed to this writeup.

These modifications to the framework expand the range and breadth of texts that can be worked with. Jerome McGann has pointed to the need to compare documents across “much more extensive textual fields” (12). In this vein, Paul Eggert points to the letters and reviews surrounding a text of interest, Joseph Grigely points to condensed books and colorized films, and Neil Fraistat and Elizabeth Bergmann Loizeaux propose the comparison of objects like films, television shows, buildings, maps, and even bodies. This expanded field calls for an expanded set of tools, as existing methods for comparing texts may be unable to work with sound and/or image. Updates included in the VM 5.0, especially related to the representation and comparison of sound, attempt to respond to this need for tools that address a range of texts and reveal features for comparison that may be otherwise invisible.

Changes to the Versioning Machine 5.0 include:

  • Updating the XSL processing to output HTML5.
  • Moving the functionality for location-referenced encoding, as described in the current documentation, into the main XSL stylesheet.
  • Adding support for sound documents and for encoding alignments between sound and text documents.
  • Creating a customized ODD file and RelaxNG schema covering all the features in the sample texts.
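
To make the third item concrete: an alignment between a transcription and a recording can be thought of as a table pairing each text segment with the moment in the audio where it begins. The Python sketch below only illustrates that idea; it is not the VM's own code or encoding. The segment ids, the timings, and the names ALIGNMENT, seek_time and segment_at are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical alignment table: (segment id, start time in seconds),
# sorted by start time. Real ids and timings would come from the encoding.
ALIGNMENT = [("seg1", 0.0), ("seg2", 4.2), ("seg3", 9.75)]

START_BY_ID = dict(ALIGNMENT)
STARTS = [start for _, start in ALIGNMENT]


def seek_time(segment_id):
    """Text -> audio: where the player should jump when a segment is clicked."""
    return START_BY_ID[segment_id]


def segment_at(seconds):
    """Audio -> text: which segment is sounding at a given playback time."""
    i = bisect_right(STARTS, seconds) - 1
    return ALIGNMENT[i][0] if i >= 0 else None
```

Looking up in both directions is what lets an interface move the player when a reader clicks a phrase, or mark the current phrase while the recording plays.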

Assembled in the VM interface from left to right, the versions first appear to the reader in a prescribed order. The Versioning Machine style sheets transform texts encoded with the TEI parallel segmentation method into an HTML interface. The interface facilitates access to (1) all of the versions in a horizontally scrolling page; (2) the comparisons an editor makes between versions; (3) images; and (4) sound clips and transcriptions. The default order of the panels (each of which represents a version of a work) is the order prescribed by the editor, but the user can reorder them. Audio clips and transcriptions go in a single panel, and clicking on a location in the text will scroll the audio players to the correct position and pause them.
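
For readers unfamiliar with parallel segmentation, the idea is that a single encoded text holds every version at once: wherever the versions differ, an app element groups the alternative readings (rdg), each tagged with the witnesses that carry it. The VM does its work in XSLT; the Python sketch below is only an illustration of how one version can be read out of such an encoding. The sample line, the witness ids (#v1, #v2, #v3) and the function name read_witness are invented for the example, and the sketch assumes readings are given only as rdg elements.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
APP = f"{{{TEI_NS}}}app"
RDG = f"{{{TEI_NS}}}rdg"

# Hypothetical sample: one line in which witness v3 differs from v1 and v2.
SAMPLE = (
    f'<l xmlns="{TEI_NS}">The '
    '<app><rdg wit="#v1 #v2">quick</rdg><rdg wit="#v3">swift</rdg></app>'
    " fox</l>"
)


def read_witness(elem, wit):
    """Return the text of `elem` as the witness `wit` reads it."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == APP:
            # Keep only the reading(s) attested by this witness.
            for rdg in child:
                if rdg.tag == RDG and wit in rdg.get("wit", "").split():
                    parts.append(read_witness(rdg, wit))
        else:
            parts.append(read_witness(child, wit))
        # Text following the child is shared by every witness.
        parts.append(child.tail or "")
    return "".join(parts)


line = ET.fromstring(SAMPLE)
# One entry per version, comparable to one panel per version in an interface.
panels = {w: read_witness(line, w) for w in ("#v1", "#v2", "#v3")}
```

Here panels["#v1"] and panels["#v2"] hold the same reading, while panels["#v3"] carries the variant.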

The VM 5.0 interface, showing alignments between three versions of a section of Gertrude Stein’s Tender Buttons. The right panel includes audio of Jackson Mac Low reading the section.

This update to the VM offers an opportunity to think about how the TEI relates to the multimedia editing environments that HTML5 facilitates, and about the very definition of versioning prescribed by our print-based culture.

The VM 5.0 includes two sample texts created to demonstrate the new functionality for displaying sound documents. The first brings together a section of Gertrude Stein’s Tender Buttons with audio of the poet Jackson Mac Low reading the section as well as the text of an essay written by Mac Low. Aligning Mac Low’s performance of the text allows users to isolate a phrase and focus on its pronunciation; it also points to the future utility of aligning multiple performances of a text and allowing users to compare sounds across temporal, geographical or technological change. Further, aligning the audio with Mac Low’s essay allows users to experience that argument—which deals significantly with the sounds of Stein’s words—in ways that are otherwise not possible.

The second sample brings together several documents related to Spalding Gray’s Swimming to Cambodia: multiple notebooks the actor used to plan his monologues, two performances that include audio, and a version of the text published in book form. Aligning the notebooks with the performances gives users a sense of the work’s evolution, as fragments condensed into a single word in the notebooks expand into sections of Gray’s characteristically rambling performances. This presentation complicates a user’s sense of the work, making visible its changes through history as well as its wanderings through different media.

