Beyond MIDI: The Handbook of Musical Codes
The organization of this book reflects an overall tripartite categorization--codes for sound applications, codes for notational applications, and codes for analytical and/or more abstract applications. More specific aims, and sometimes platform strengths and weaknesses, account for the further differentiation found below. The raison d'être of the items to which whole chapters are devoted is explained briefly here. Many additional codes are cited briefly in the glossary.
5.1 Sound-Related Codes (1): MIDI
This first section presents the Standard MIDI File Format and then several schemes for enriching MIDI. These extensions barely scratch the surface of a complex world of possibilities.
Since there is a wealth of detailed literature on MIDI, it is not our intention to provide a definitive resource here. Interested readers will want to consult the specialized literature, such as that given in the references section. MIDI is included here as much to show its deficiencies, in comparison with most of the other codes described, as its accomplishments. Our presentation does differ from most in giving plain ASCII text, organized in columns within which pitch information and duration information are grouped separately, as an alternative to the hexadecimal code (hex) shown in most MIDI manuals. For musical readers unfamiliar with hex, we also include an annotated hex file.
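The relationship between the raw bytes of a MIDI file and their musical meaning can be suggested in a few lines of code. The sketch below decodes only note-on and note-off channel messages; a real Standard MIDI File also interleaves delta-times and meta-events, which are omitted here for brevity.

```python
# Decode raw MIDI channel-voice messages into readable form.
# Covers only note-on (0x9n) and note-off (0x8n) status bytes.

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_name(key: int) -> str:
    """MIDI key number -> pitch name with octave (60 = C4, middle C)."""
    return NOTE_NAMES[key % 12] + str(key // 12 - 1)

def decode_event(status: int, data1: int, data2: int) -> str:
    kind = status & 0xF0
    channel = (status & 0x0F) + 1
    if kind == 0x90 and data2 > 0:
        return f"note-on  ch{channel} {note_name(data1)} vel {data2}"
    if kind == 0x80 or (kind == 0x90 and data2 == 0):
        return f"note-off ch{channel} {note_name(data1)}"
    return f"other status 0x{status:02X}"

# 90 3C 40 = note-on, channel 1, middle C, moderate velocity
print(decode_event(0x90, 0x3C, 0x40))  # -> note-on  ch1 C4 vel 64
print(decode_event(0x80, 0x3C, 0x00))  # -> note-off ch1 C4
```

Note, as the annotated hex file in this section makes clear, that nothing in these bytes records enharmonic spelling: key number 61 is equally C-sharp and D-flat.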
The proposals for MIDI extensions represented here are but a few of the hundreds of adaptations that have been conceptualized and the dozens that have been put into practice in proprietary applications software. All of the extensions presented here have been tested in working environments, and all are in the public domain.
It is symptomatic of the larger issue of resolving the potentially conflicting needs of sound, notation, and analysis applications that even here, within the context of mere extensions of a code acknowledged to be based on sound, many separate tangents are being pursued.
In the course of preparing this book we asked some contributors whether they might consider adopting one of the other sets of extensions. All of those we asked preferred their own systems, on the grounds that they were better suited to their own applications.
There are good reasons for this failure to coalesce: all the intended applications are different.(20) The differences are suggested below:
Kjell Nordli's proposed NoTAMIDI meta-events facilitate a more complete representation of the attributes required for printing from data captured by synthesizer. That is, the input is a MIDI file, while the output is printed music.

The Expressive MIDI parameters of David Cooper, Kia Ng, and Roger D. Boyle are designed to make data capture by optical recognition of the printed page more practical. That is, their input is a printed page, while their output is a MIDI file.

Walter B. Hewlett's MIDIPlus extensions facilitate the accurate conversion of MIDI note numbers into enharmonic notation (a staple of tonal music) generated from a MIDI file. This more explicit encoding provides a solid foundation for harmonic analysis, which MIDI file information otherwise lacks.

Max V. Mathews's Augmented MIDI extensions facilitate more articulate control of MIDI information in a real-time controller environment. The dot that indicates a staccato in print does not specify a precise duration. Many sequencer programs apply a simple algorithm that gives every staccato the same percentage of "off" time relative to the stated value of the related note. Here the aim is to allow the user to vary the degree of staccato, as well as the degree of accentuation and other nuances, from instance to instance.
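The fixed-percentage staccato heuristic described above, and the per-instance control that Mathews's extensions aim to restore, can be sketched as follows. The function and parameter names are illustrative, not part of any MIDI extension.

```python
# Sketch of the common sequencer heuristic: every staccato note sounds
# for a fixed fraction of its notated duration.  The optional per-note
# fraction illustrates the kind of instance-by-instance control that
# Augmented MIDI seeks; the names here are illustrative only.

def sounding_ticks(notated_ticks: int, staccato: bool,
                   staccato_fraction: float = 0.5) -> int:
    """Return the number of ticks the note actually sounds."""
    if staccato:
        return max(1, round(notated_ticks * staccato_fraction))
    return notated_ticks

# A quarter note of 480 ticks, played staccato under the default 50% rule:
print(sounding_ticks(480, True))        # -> 240
# The same note with a sharper, per-instance staccato:
print(sounding_ticks(480, True, 0.25))  # -> 120
```

With the one-size-fits-all default, every staccato in a movement is articulated identically; the third argument is what a per-instance extension would vary.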
5.2 Sound-Related Codes (2): Other Codes for Representation and Control
Most of the sound codes featured in this section are predominantly associated with specialized or research-related activities. Our coverage includes these systems for representing musical sound:
Csound, developed by Barry Vercoe at the Massachusetts Institute of Technology in the early Eighties, is probably the most widely used sound code apart from MIDI. Implementations exist on several platforms. The code has capabilities for handling speech as well as music. David Bainbridge is familiar with Csound through his efforts to create an optical recognition program that produces Csound files.

Music Macro Language is widely used on the Japanese NEC computer platform, particularly for games software. Toshiaki Matsushima has enabled us to offer this description, the first ever in English.
UNIX workstations have not been well suited to real-time processing, which conflicts with their fundamental approach to multitasking, so MIDI files have been less easily used on them. The NeXT operating system, a popular platform for the composition of electronic music, is an exception. The NeXT ScoreFile format, described here by David Jaffe, works with his widely implemented NeXT Music Kit on both NeXT and Silicon Graphics workstations and with the NeXTStep operating system available for PCs.
While we cannot describe all the data formats designed to work with the many controllers in use today, Max V. Mathews provides a specification for one--his Radio Baton Conductor score file. The associated code exists in two iterations--as an independent code and as a track that can be added to a Standard MIDI File--and suggests the main attributes of performance that must be represented in some way if they are to be controlled. Note its debts to a seminal sound-related code, Mathews's Music V.(21)
5.3 Musical Notation Codes (1): DARMS
DARMS is the oldest comprehensive code for music still in use. Originally developed for mainframe computers, it is highly compact. DARMS does not describe "music" so much as it describes written symbols and their placement. Because of its relative antiquity, DARMS exists in several dialects. Many further extensions are conceivable.
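The DARMS principle of encoding where a note head sits on the staff, rather than its pitch, can be illustrated in miniature. The numbering below (0 = bottom staff line, each step the next line or space) and the duration letters are a deliberate simplification; actual DARMS space codes and syntax differ in detail.

```python
# Simplified illustration of the DARMS idea: a token records a position
# on the staff plus a duration letter, and pitch is only recoverable
# once a clef is known.  This numbering scheme is a simplification for
# illustration, not the actual DARMS space-code convention.

DURATIONS = {"W": "whole", "H": "half", "Q": "quarter",
             "E": "eighth", "S": "sixteenth"}

# Diatonic steps ascending from the bottom line of a treble staff (E4).
TREBLE_STEPS = ["E4", "F4", "G4", "A4", "B4", "C5", "D5", "E5", "F5"]

def decode_token(token: str) -> str:
    """'4Q' -> 'quarter note at step 4 (B4 in treble clef)'."""
    pos, dur = int(token[:-1]), token[-1]
    pitch = TREBLE_STEPS[pos] if 0 <= pos < len(TREBLE_STEPS) else "?"
    return f"{DURATIONS[dur]} note at step {pos} ({pitch} in treble clef)"

print(decode_token("0Q"))  # bottom line of the treble staff
print(decode_token("4H"))  # middle line of the treble staff
```

The point of the sketch is that the same token would denote a different pitch under a different clef: the code describes the written symbol and its placement, not the "music."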
Canonical DARMS, described in Chapter 11, gives explicit and complete information but is rarely implemented in actual software. Its theoretical existence allows applications designers to select, abridge, and enhance those features that are relevant to their tasks.
The Note-Processor DARMS dialect, developed by J. Stephen Dydo in the early and middle Eighties, is the dialect familiar to the largest number of current users, since the Note Processor has been available for the PC for ten years.

The A-R DARMS dialect, developed by Thomas Hall in the late Seventies, has produced the largest quantity of music (more than 100,000 pages), first on mainframe computers and in recent years on Sun Sparcstations at the offices of A-R Editions, Inc.
The extensibility of DARMS is demonstrated by two specific applications--those of Frans Wiering for lute tablature (Chapter 14) and of Lynn M. Trowbridge for Renaissance notation (Chapter 15).
5.4 Musical Notation Codes (2): Other ASCII Representations
Other ASCII representations of musical notation are quite numerous; we present only four of them here. They fall into two subclasses. The first two address particular typesetting contexts--PostScript in the first case and TeX in the second. The first, however, is designed primarily to pipe electronic compositions to a printer, whereas the second is oriented towards the character encoding of material.
Common Music Notation, a program still under development by Bill Schottstaedt, works with a sound-synthesis program (Common Music by Heinrich Taube) to produce PostScript files and is intended primarily for composers.

The M*TeX family of programs produces musical examples for the TeX typesetting program widely used in scientific document production. Werner Icking enlightens us on the background and use of these programs, which have multiple authors. M*TeX is widely used in Germany, particularly for typesetting musical examples in articles and books.
The second two systems are associated with specific operating systems--DOS (with a Windows implementation pending) and the system native to the Acorn microcomputer. In principle, either could be implemented on other systems. However, these and most other input codes for notation programs are converted to an intermediate code (not shown) that determines the final appearance of the page.

Philip's Music Scribe, a notation program by Philip Hazel, is designed to run on Acorn computers.(22) Its scheme of music representation, however, typifies what is required in producing notation--irrespective of platform--and in this respect helps the reader understand the requirements of representations intended to support printing.

SCORE, a program by Leland Smith whose roots go back almost 25 years, may, together with its data structures, constitute the system sustained longest by a single individual. Originating on a mainframe and operating today on a PC, SCORE is celebrated for its superior graphical results and its facility in handling the extended and sometimes arbitrary notations of twentieth-century music. It is used as a professional publishing program for the complete works of many noted composers (among them Schumann, Verdi, Wagner, and Schoenberg) and for a great deal of popular music from such companies as the Hal Leonard Company.
5.5 Musical Notation Codes (3): Graphical-Object Descriptions
DARMS and SCORE typify the enormous effort expended in the Sixties and Seventies to describe the symbols used in music notation and their relative placement on the page. The arrival of the Macintosh platform, with its graphical user interface, in the early Eighties created a strange new environment for conceptualizing musical notation. Why represent the object at all? Why not simply draw it, store the image for future reference, and finalize the placement with a mouse?
The two systems we offer here have roots in mainframe notation programs of the early Seventies:
LIME's Tilia representation, developed by Lippold Haken and Dorothea Blostein, is oriented entirely towards the graphical image and is stored as binary code, on the assumption that the user has no need to "read" or comprehend the code. Its elegant predecessor, OPAL, was a highly transparent ASCII representation comparable in many ways with Kern and MuseData, but with more extensive integration of sound, notation, and "logical" representation.

Nightingale is the Macintosh adaptation of Donald Byrd's extensive earlier work in providing a printing system for the MUSTRAN representation, which supported numerous research projects of the Seventies and early Eighties. Byrd has cleverly dealt with the loss of a meaningful representation of the music by creating a metacode (Notelist) for data transport to and from codes with an ASCII representation. Tim Crawford, a leading exponent of Nightingale in the U.K. and an expert on music representation, describes the code here.
5.6 Musical Notation Codes (4): Braille
Braille music code is a tactile, symbol-oriented code whose origin substantially predates the use of computers to typeset music. Since its aim is to represent the sound and gestural elements of notated music, but not elements of layout or cosmetic refinements, some elements of its logic are unique.
The automatic generation of Braille scores from other codes for music is often cited as an intention of the developers. Music programmers sometimes overlook the interest of many Braille music code users in creating scores in common music notation for sighted associates.
The syntax of Braille musical notation has evolved into many national dialects, but a recently adopted international code should encourage greater standardization.
Roger Firman's overview (Chapter 22) considers concepts and organizational issues, mentions frequently used programs, and provides two annotated examples (in the "English" dialect) of its use. Bettye Krolick and Sile O'Modhrain explain the most commonly used symbols (Chapter 23), based on the newly published (1996) "standard" international version of the code.
5.7 Musical Data for Analysis (1): Monophonic Representations
Monophonic music is simpler to handle than polyphonic music, and for this reason its use in actual applications has been much greater. The codes cited in this section are typically stored in a single field of a relational database; combined access to text and music fields is valuable in the management of information about musical sources. Such projects give the clearest idea of the kinds of analytical work that may become possible in the future with polyphonic databases.
The Essen Associative Code (EsAC), developed by Helmut Schaffrath, has been the backbone of a series of projects in the transcription and analysis of folksong repertories. More than 14,000 works have been encoded in a database framework with six basic fields that supports twelve basic search types aimed at identifying musical similarity.

Plaine and Easie Code, developed in the Sixties by Barry Brook and Murray Gould, has been the basis of the musical incipit databases of the Répertoire International des Sources Musicales (RISM). Almost 300,000 incipits have now been transcribed in databases containing more than 100 fields. These materials are used both to catalogue works and to locate matching and derivative versions of works in multiple locations.
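The use of incipits to locate matching and derivative versions of a work can be sketched as a transposition-invariant comparison: reduce each incipit to its sequence of intervals, so that the same melody in different keys matches. This is an illustrative technique under simplified assumptions (natural notes only, octave ignored), not the actual search algorithm of RISM or the Essen project.

```python
# Match incipits by melodic interval content rather than absolute pitch.
# Simplification for illustration: natural note names only, octave and
# rhythm ignored.

NOTE_SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def intervals(pitches: list[str]) -> list[int]:
    """['C','E','G'] -> semitone intervals [4, 3] (mod 12)."""
    values = [NOTE_SEMITONES[p] for p in pitches]
    return [(b - a) % 12 for a, b in zip(values, values[1:])]

# The same opening figure, notated in C and transposed to G:
incipit_c = ["C", "E", "G", "E", "C"]
incipit_g = ["G", "B", "D", "B", "G"]

print(intervals(incipit_c) == intervals(incipit_g))  # -> True
```

Searches of this kind are what make an incipit field useful for identifying concordances that a catalogue of titles alone would miss.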
5.8 Musical Data for Analysis (2): Polyphonic Representations
Only since the Eighties have computer memory and processing speed made it feasible to think of processing large quantities of musical repertory. Of the two projects listed in this section, the first is intended principally for musical analysis, while the second is intended to provide a comprehensive representation scheme to support applications in sound, notation, and analysis.
The Kern representation is one of several used with the Humdrum Toolkit, a set of UNIX tools for music applications developed by David Huron. Sixty analytical operations(23) are currently supported. Since analytical objectives may focus on only one or a few attributes of music, Kern permits the selective encoding of musical attributes.

The MuseData representation, under development by Walter B. Hewlett since 1982, aims to facilitate applications in multiple domains. It supports the ongoing development of large corpora of standard repertory, exporting works to various formats associated with printing and sound applications. Few analytical applications have thus far been developed.
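Kern's selective-encoding philosophy can be suggested with a minimal spine and a reader that extracts only one attribute. The reader below handles only the simplest tokens (duration digits followed by pitch letters); real Kern has a far richer vocabulary, and this is a sketch, not a Kern parser.

```python
# A minimal **kern spine and a reader that keeps only pitch letters,
# discarding durations -- the kind of selective extraction on which
# Humdrum-style analysis depends.  Handles only the simplest tokens.

import re

KERN_SPINE = """**kern
4c
4d
8e
8f
4g
*-"""

def pitches(spine: str) -> list[str]:
    """Strip durations, keep pitch letters; skip interpretation lines."""
    result = []
    for token in spine.splitlines():
        if token.startswith("*"):       # **kern header, *- terminator
            continue
        match = re.match(r"\d+([a-gA-G]+)", token)
        if match:
            result.append(match.group(1))
    return result

print(pitches(KERN_SPINE))  # -> ['c', 'd', 'e', 'f', 'g']
```

An analysis concerned only with melodic contour can work with this reduced stream; one concerned with rhythm would extract the duration digits instead.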
5.9 Representations of Musical Form and Process
The intricacies of musical representation have provided a fertile field for the attention of researchers in artificial intelligence. Where those concerned with notation see objects in spatial positions and those concerned with sound hear events and timbres, researchers in artificial intelligence and related disciplines(24) look for grouping mechanisms that place objects into semantic clusters. Of the many efforts at representation guided in this way, we cite two recent ones below. Note that in both cases, the result is less well suited to translation between data formats than more explicit representations.
Some works are much more formulaic than others. Our Mozart trio example is characteristic of works in which large clusters of notes form one pattern. In Chapter 28, Ulf Berggren demonstrates an approach to encoding based on descriptions of these formulas. Segmentation is a favorite device of linguistics researchers for whom sound is the fundamental level of a work's identity. In Chapter 29, Andranick Tanguiane gives us a parallel example to model the train of changing perceptions that occur in the learning of a new musical work.
Such examples represent only a small portion of the literature on theoretical aspects of musical cognition and epistemology. Psychologists of music sometimes speak of the mental "encoding" that facilitates the perception and understanding of music. While this topic lies well beyond our scope here, a literature on it exists.(25)
5.10 Interchange Codes
No one questions the desirability of defining ways of interchanging data between applications. What this book clearly demonstrates is how treacherous the ground becomes when efforts are made to accommodate multiple domains (sound, graphics, analysis) in one representation scheme. Sound exists in time, notation exists in space, and analysis can be based on either or both, or on elements of the "logical" work not represented in either one, or even on implied information (such as accent) experienced in performance (the "gestural" domain of some commentaries).
Our coverage here rests on the three interchange codes that have achieved some kind of presence. Each is introduced below.
HyTime and Standard Music Description Language (SMDL) are offshoots of Standard Generalized Markup Language (SGML), a set of document-description tags intended to facilitate generic markup of texts and to simplify computerized typesetting of text across multiple systems and vendors. SGML, which for the first 15 or so years of its existence was used primarily in the printing of U.S. government documents, got its second wind from another offshoot--HyperText Markup Language (HTML), the markup language currently dominant in World-Wide Web applications. HyTime was approved by the International Organization for Standardization (ISO) in 1992; SMDL was adopted by the ISO in 1996. The main uses of HyTime, a multimedia scheduling protocol, have been outside the field of music; SMDL has not yet been tested to a significant degree in music applications.

The Notation Interchange File Format (NIFF) was still in beta-test stage in 1996. It was originally conceived as an aid to notation placement for the output of optical recognition programs. It was drafted to conform to the Microsoft Resource Interchange File Format (RIFF) for multimedia applications running under Windows. Much of the recent development work has attempted to accommodate MIDI data. Whether a blend of these three needs makes for a practical standard will be determined once the format has been widely tested. Many notation and sequencer program developers have looked in on the NIFF effort, but few have volunteered the kind of time required to make a robust standard.(26)

Standard Music Expression (SMX) was the first effort at data interchange to be put into actual, effective use. Developed only to a provisional level in the mid-Eighties, SMX has been used to facilitate the sharing of data between printed scores, Braille notation, and musical robotics. SMX printing is optimized for the Dai Nippon Music Processor, a dedicated machine using hardware that evolved from the Danish SCANNOTE representation of the early Seventies.
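NIFF's conformance to RIFF means that, whatever its musical content, a NIFF file shares RIFF's generic container layout: each chunk is a 4-byte ASCII identifier, a 4-byte little-endian length, and that many bytes of data, padded to an even boundary. The walker below reads that generic layout only; the chunk identifiers in the sample are invented for illustration and are not actual NIFF chunk names.

```python
# Walk the generic RIFF chunk layout on which NIFF is based:
# 4-byte ID, little-endian uint32 size, payload, even-byte padding.

import io
import struct

def walk_chunks(stream: io.BytesIO) -> list[tuple[str, bytes]]:
    """Return (chunk_id, payload) pairs from a RIFF-style byte stream."""
    chunks = []
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break
        chunk_id, size = struct.unpack("<4sI", header)
        payload = stream.read(size)
        if size % 2:                    # chunks are padded to even length
            stream.read(1)
        chunks.append((chunk_id.decode("ascii"), payload))
    return chunks

# Two hand-built chunks: "abcd" carrying 3 bytes (so padded), then
# "wxyz" carrying 4.  The IDs are invented for this example.
data = (b"abcd" + struct.pack("<I", 3) + b"hi!" + b"\x00"
        + b"wxyz" + struct.pack("<I", 4) + b"1234")
print(walk_chunks(io.BytesIO(data)))
```

A design consequence worth noting: a reader that understands only this layout can skip chunks it does not recognize, which is what allows a RIFF-based format to grow new chunk types without breaking older software.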
Certain issues of relationships between codes, and between codes and hardware, are touched upon in the five appendices, which are located at the ends of the sections to which they are relevant. A great many other codes--some very distinctive, some very venerable, some relatively new and little known, and some excluded from the main section only for lack of authority to describe them--are listed in the glossary.
20. See also Peer Sitter's proposed "Extended Standard MIDI File Format" in the glossary. Too few details of these enhancements, intended to facilitate analytical and pedagogical applications, were available at press time to warrant a full chapter. In this system "dummy events" are created to remedy some of MIDI's deficiencies, such as its failure to represent rests.
21. Described in the Glossary.
22. The Acorn computer is in use principally in the United Kingdom and in certain commonwealth countries (especially Canada, Australia, New Zealand, and South Africa).
23. Humdrum supports particular tasks frequently carried out in tonal analysis, row analysis, thematic analysis, and in perceptual studies of music.
24. Notably in quantitative and structural linguistics and in cognitive psychology.
25. See, for example, "The Values, Limitations, and Techniques of Encoding Low-Level Cognitive Structures in Melody," the final chapter of Eugene Narmour's The Analysis and Cognition of Melodic Complexity: The Implication-Realization Model (University of Chicago Press, 1992), pp. 330-360.
26. Twelve Tone Systems announced in January 1996 the intention of implementing a NIFF conversion utility in some of its music software.