&dA &dA Application for Patent &dA &dA

Representing proper pitch spelling in MIDI

I. Background

A. Common Musical Notation (CMN)

The modern system for notating the music of Western Civilization, referred to here as Common Musical Notation (CMN), has a long and distinguished history. The origins of the modern system can be traced as far back as the 10th Century AD with the notation of early church chant. These simple melodies were made up entirely of the notes of what we today call the diatonic scale. This is the origin of the "white" keys on the modern keyboard. Early chant was not composed in what we today call the major-minor system of keys but rather in an older system called modes. All modes used the same diatonic scale tones, but each mode started at a different degree (note) of the diatonic scale. Thus, for example, the Dorian Mode started on what we today call the diatonic pitch of D and consisted of the notes &dBD&d@, &dBE&d@, &dBF&d@, &dBG&d@, &dBA&d@, &dBB&d@, &dBC&d@. This mode sounds a lot like the modern key of D minor, but includes a "raised" sixth degree (the note &dBB&d@ instead of the &dBB-flat&d@ that would be called for in D minor).

The system for notating pitch in chants and other early music was quite simple. A set of lines was drawn (sometimes four, sometimes five, sometimes more than five), and the degrees of the scale were represented as positions on the lines or on the spaces in between them. This is the origin of our modern five-line staff system. In the case of the Dorian Mode referred to above, the notation of the scale would look as shown below.
---------------------------------------------------------------------

---------------------------------------------------------------------
                                                             C
---------------------------------------------------B-----------------
                                         A
-------------------------------G-------------------------------------
                     F
-----------E---------------------------------------------------------
 D

The important thing to notice is that each degree (note) of the scale has a position that is &dBone level higher&d@ than the previous degree, but that the &dBactual size&d@ of the musical interval between two consecutive degrees is not the same in all cases. For example, the musical interval between &dBD&d@ and &dBE&d@ is what we today call a whole step. In terms of sound frequency, the pitch &dBE&d@ is on the order of 12.24 percent higher than the pitch &dBD&d@ (the actual size will depend on the system of tuning used). The musical interval between &dBE&d@ and &dBF&d@ is what we today call a half step. In terms of sound frequency, the pitch &dBF&d@ is on the order of 5.94 percent higher than the pitch &dBE&d@. To restate the point in another way, the levels on the musical staff &dBdo not&d@ all represent the same size of musical interval.

The modern system of major-minor keys, which is the basis of practically all music written and/or performed today (both classical and popular), grew out of the earlier modal system. What allowed the major-minor system to develop was the ability to alter the basic pitches of the modes either by raising them with what we today call a &dBsharp&d@, or lowering them with what we today call a &dBflat&d@. The amount by which a pitch is raised or lowered by a sharp or a flat is a half step, about 5.94 percent of the base (starting) frequency. The Dorian scale in the previous example can be made into a D-minor scale by flatting the B. In CMN the flat is put in front of the note, making the scale shown below.
-------------------------------------------------------------------------

-------------------------------------------------------------------------
                                                             C
---------------------------------------------------B --------------------
                                         A
-------------------------------G-----------------------------------------
                     F
-----------E-------------------------------------------------------------
 D

For the purpose of this discussion, it is important to note that raising the &dBA&d@ in the example above with a sharp will produce a pitch (musical frequency) which is almost identical to the pitch (musical frequency) arrived at by lowering the next consecutive note &dBB&d@ with a flat. IN THE &dAEVEN-TEMPERED&d@ SYSTEM OF TUNING, THESE FREQUENCIES ARE IDENTICAL.

B. Tuning Systems and the Modern (Piano) Keyboard.

It is beyond the scope of this application to present a full explanation of the problem of tuning. Such a discussion would involve the theory of musical intervals and their relationship to musical harmonics. Suffice it to say that over the course of the development of Western music (principally the 14th through the end of the 17th centuries), several systems of tuning were devised and used. With the advent of the modern major-minor system of scales and keys (toward the very end of the 17th century), the need developed for a tuning system in which the size of any particular musical interval would be the same (in terms of the ratio of sound frequencies) for all musical scales and keys. The system which does this is called the Even-Tempered System of tuning. In this system, all half steps are the same size, and there are twelve of them in an octave. Since an octave is the interval between two pitches whose frequency ratio is 2 to 1, the size of the half-step interval in the Even-Tempered System is the twelfth root of two, or approximately 1.059463 to 1.
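The arithmetic above can be checked with a few lines of code. The following is only an illustrative sketch in Python; the names HALF_STEP_RATIO, interval_ratio, and percent_higher are ours and are not part of any standard.

```python
# Sketch: interval sizes in the Even-Tempered System of tuning.
# All names here are illustrative assumptions, not part of MIDI or CMN.

HALF_STEPS_PER_OCTAVE = 12

# An octave is a 2-to-1 frequency ratio split into twelve equal half steps,
# so one half step is the twelfth root of two.
HALF_STEP_RATIO = 2 ** (1 / HALF_STEPS_PER_OCTAVE)


def interval_ratio(half_steps: int) -> float:
    """Frequency ratio of an interval spanning the given number of half steps."""
    return HALF_STEP_RATIO ** half_steps


def percent_higher(half_steps: int) -> float:
    """How much higher (in percent) the upper pitch is than the lower one."""
    return (interval_ratio(half_steps) - 1.0) * 100.0


if __name__ == "__main__":
    for name, steps in (("half step", 1), ("whole step", 2), ("octave", 12)):
        print(f"{name:10s} ratio {interval_ratio(steps):.6f}  "
              f"({percent_higher(steps):.2f} percent higher)")
```

In this system A-sharp (A raised one half step) and B-flat (B lowered one half step) come out at exactly the same ratio above A, which is the identity stated in capitals above.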
With the acceptance of the Even-Tempered System, it became possible to standardize, once and for all, the configuration of the keyboard (piano, clavichord, harpsichord, organ). This configuration, with accidentals (i.e., modifications to the principal degrees, e.g. &dBC#&d@, &dBD#&d@, &dBF#&d@, &dBG#&d@, and &dBA#&d@) being represented by shorter, thinner keys (the "black" keys) placed between wider, longer keys (the "white" keys), had already developed over the course of the previous 5 centuries as the primary (most common) layout of the musical keyboard. In this configuration, there is only one key (a "black" key) for the musical pitches notated as C-sharp and D-flat; and in the Even-Tempered System of tuning, these musical pitches have the same frequency. Likewise with the other "black" keys of the keyboard: i.e., &dBD-sharp&d@ = &dBE-flat&d@; &dBF-sharp&d@ = &dBG-flat&d@; &dBG-sharp&d@ = &dBA-flat&d@; and &dBA-sharp&d@ = &dBB-flat&d@.

This "alternate spelling" of frequency-equivalent pitches is not limited to the black keys of the keyboard. There are, in fact, an infinite number of pitch spellings for each pitch as defined by musical frequency. The note &dBA4&d@, which in modern tuning is the pitch of 440 Hertz, can alternatively be spelt, for example, F4-sharp-sharp-sharp-sharp, G4-sharp-sharp, A4, B4-flat-flat, C5-flat-flat-flat, or D5-flat-flat-flat-flat-flat. In "real-life" situations, it is rare to find more than two sharps or flats attached to a primary degree (note letter).

C. MIDI representation of pitch.

The Musical Instrument Digital Interface (MIDI) standard for representing musical events developed originally as a convention for communicating between electronic instruments. Since the primary (and musically most sophisticated) electronic instrument was the music (piano) keyboard, a system was devised to represent all possible (musically "likely") keys on the keyboard. The note &dBmiddle C&d@, normally designated &dBC4&d@, was assigned the number 60.
Each successive key above &dBC4&d@ was assigned a successively higher integer, and each successively lower key was assigned a successively lower integer. This system has served its original purpose well, since each note (key) on the keyboard has one and only one number associated with it.

II. The Problem with MIDI and the printing of music

A. Extensions to MIDI

The MIDI standard was originally designed for communication between electronic instruments. About the time it was being developed, however, it was also becoming clear to many people that not only could one musical instrument be used to control another, but musical instruments themselves could be controlled by computer. Put another way, a sequence of electronic (MIDI) commands that might be generated (in a performance) on a musical keyboard could also be generated by computer. The receiving instrument has no way to know the origin of the commands (computer vs. live performance). In a similar manner, the "sending" instrument knows nothing about the nature of the "receiving" instrument under the MIDI standard. It is therefore possible for the receiving instrument to be a computer, which is actually recording (receiving and storing) the MIDI signals from the sender. In this way, a musical performance (defined as a series of physical gestures on an electronic keyboard) can be recorded (received and stored) on a computer and later played back on (sent to) the same keyboard or some other sound-generating device which "understands" MIDI commands.

With the advent of "MIDI" recordings and simulated recordings compiled by software, the need arose to find a way to pass this data from one computer to another. There were also several descriptive aspects of the music, not representable in the initial MIDI standard, which people wanted to include in their files.
Among these aspects are the key of the piece (number of sharps and flats), the mode (major or minor), the time signature, the tempo, musical lyrics, and other attributes. An extension of the original MIDI standard was developed for passing this information.

B. MIDI and the spelling of musical pitches

There is, however, one aspect of the music which is not included in this standard, and for a very good reason. This aspect is the SPELLING OF THE MUSICAL PITCHES. This information (1) does not come from a live performance, since the configuration of the musical keyboard does not distinguish between the spellings of pitches (e.g. &dBC-sharp&d@ and &dBD-flat&d@ ARE the same key), and (2) is irrelevant in a playback performance (e.g. C-sharp and D-flat would be "routed" to the same pitch on a MIDI playback device).

Aside from the fact that pitch spelling is neither "representable" on an electronic keyboard nor relevant to a MIDI playback, there is a technical reason why the framers of the extended MIDI standard did not include this attribute in their specifications. This information MUST accompany every note that is struck, and the basic MIDI standard for representing "note events" does not allow space for communicating this information. In particular, a note event in MIDI consists of four parts: (1) a time interval, (2) a command, (3) a pitch, and (4) a velocity. Under certain circumstances, the second part (command) may be omitted. Otherwise, all parts are necessary to properly communicate a note event. The command, the pitch, and the velocity each occupy one byte. By convention, the pitch and velocity bytes must have their high order bit set to zero. This limits the range of these attributes to 128 values, i.e., 0 to 127. The velocity parameter communicates the speed with which a key is pressed down, and this, in turn, communicates information about the attack as well as the volume (loudness) of the note in question.

C.
The spelling of musical pitches in the printing of music

Whereas the spelling of a musical pitch is irrelevant to an electronic performance, it is absolutely vital to music printing. A piece of music printed with incorrect spellings is difficult to read correctly, and can be virtually unreadable in some cases (at least as far as performance of the music is concerned). Under the current MIDI standard, therefore, it is not possible to include the information necessary for the proper printing of the music represented. The MIDI standard has become a de facto method of exchanging musical data; yet no practical method has been devised for communicating the proper spelling of musical pitches, and this limitation has crippled the standard as far as software for printing music is concerned. (For the moment, printing software must make a "guess" about the correct spelling, and our experience with entering over 1000 movements of music from the 17th through the 19th centuries shows that while such guessing can produce the correct spelling much of the time, there is still the need for the user to proofread the results. This works out to be a tedious and time-consuming process, which itself can never be guaranteed to be error free.)

III. Our solution to the problem (request for patent)

The Center for Computer Assisted Research in the Humanities (and its director, Walter B. Hewlett) have devised a method of communicating pitch spelling using the conventional MIDI standard. This method degrades the communication of velocity information by a small amount, but allows the inclusion of information vital to music printing.

A. Background to the Method

For most musical applications we see today, it is sufficient to represent spellings which fall in the range of two flats to two sharps.
In any particular octave, these are specifically:

    C-flat-flat    C-flat    C    C-sharp    C-sharp-sharp
    D-flat-flat    D-flat    D    D-sharp    D-sharp-sharp
    E-flat-flat    E-flat    E    E-sharp    E-sharp-sharp
    F-flat-flat    F-flat    F    F-sharp    F-sharp-sharp
    G-flat-flat    G-flat    G    G-sharp    G-sharp-sharp
    A-flat-flat    A-flat    A    A-sharp    A-sharp-sharp
    B-flat-flat    B-flat    B    B-sharp    B-sharp-sharp

Thirty-five in all. It turns out that for every MIDI number but one in each octave, there are three possible pitch spellings within the range of two flats to two sharps. The exception is G-sharp/A-flat (68 below), which has only two such spellings; its third slot is filled with the triple sharp F###. The spellings are listed below for the octave range 60 = &dBC4&d@ (middle C) to 71 = &dBB4&d@:

    1       / Dff        / Fff        / Gf         / Bff       1
    2  60 - C       63 - Ef      66 - F#      69 - A           2
    3       \ B#         \ D#         \ E##        \ G##       3

    1       / Df         / Ff         / Aff        / Cff       1
    2  61 - C#      64 - E       67 - G       70 - Bf          2
    3       \ B##        \ D##        \ F##        \ A#        3

    1       / Eff        / Gff        / Af         / Cf        1
    2  62 - D       65 - F       68 - G#      71 - B           2
    3       \ C##        \ E#         \ F###       \ A##       3

where "f" = flat, and "#" = sharp.

B. Our Proposal

The velocity parameter, which can vary from 1 to 127, is, in general, more sensitive than is really needed. We propose to use the two low-order bits of this parameter to convey pitch spelling. Let v be the value of the two low-order bits of the velocity parameter. v can take on the values 0, 1, 2, or 3. The following table shows how this can be used to represent pitch spelling:

    pitch = | 60   61   62   63   64   65   66   67   68   69   70   71
    --------+------------------------------------------------------------
    v = 0   |  - - - - - pitch - spelling - undetermined - - - - - - -
            |
    v = 1   | Dff  Df   Eff  Fff  Ff   Gff  Gf   Aff  Af   Bff  Cff  Cf
            |
    v = 2   | C    C#   D    Ef   E    F    F#   G    G#   A    Bf   B
            |
    v = 3   | B#   B##  C##  D#   D##  E#   E##  F##  F### G##  A#   A##

An advantage of this system is that the variation in the velocity parameter caused by specification of spelling is, in fact, minimal for any particular key. This is because most music in a key is notated in a consistent way. Let us take, for example, the key of E with four sharps. The picture below shows the pitch spellings one would most likely find in this key.
The most common ones are shown in &dABold Type&d@; the next most common are shown in &dDunderline&d@.

    v = 1   | Dff Df Eff Fff Ff Gff Gf Aff Af Bff Cff Cf
            |
    v = 2   | &dDC &d@ &dAC# &d@ &dAD &d@ Ef &dAE &d@ &dDF &d@ &dAF# &d@ &dBG &d@ &dAG# &d@ &dAA &d@ Bf &dAB &d@
            |
    v = 3   | &dAB# &d@ B## &dDC##&d@ &dAD# &d@ &dDD##&d@ &dAE# &d@ E## &dAF##&d@ F### &dDG##&d@ &dAA# &d@ A##

As can be seen, all of these pitches have v = 2 or v = 3. This means that if we want to use, for example, a velocity in the range of 90, we would be choosing between 90 and 91 virtually all of the time. In my experience with MIDI, I have never been able to hear the difference between a velocity of 90 and a velocity of 91.

Let's look at a second case (the worst case I can find), namely the key of &dBF&d@ (one flat).

    v = 1   | Dff &dDDf &d@ Eff Fff Ff Gff &dDGf &d@ Aff &dDAf &d@ Bff Cff Cf
            |
    v = 2   | &dAC &d@ &dAC# &d@ &dAD &d@ &dAEf &d@ &dAE &d@ &dAF &d@ &dAF# &d@ &dAG &d@ &dAG# &d@ &dAA &d@ &dABf &d@ &dAB &d@
            |
    v = 3   | B# B## C## &dDD# &d@ D## &dDE# &d@ E## F## F### G## &dDA# &d@ A##

In this key, the most common notes are distributed among all three rows, but, significantly, all of the bold ones are in the v = 2 row. This means that 99 percent of the notes would have the same value. In our case above, almost all of the velocities would be 90, with a few 89's and a few 91's. I don't think the listener would notice the difference.

C. Analysis of the Proposal

There is one way in which such a system would somewhat degrade the representation of a musical performance. While it is true that a listener would have trouble distinguishing between velocities of 90 and 91, in the case of especially low velocities (soft notes), say for example numbers below 20, it is possible to hear the difference between consecutive numbers. Even more important, if someone wanted to represent a gradual &dBcrescendo&d@ or &dBdecrescendo&d@, they would not have access to the minute gradations in velocity provided by the MIDI standard.
For the note C4, for example, they could not increase velocity in this manner: 2, 3, 4, 5, 6, 7, 8, 9, 10, etc., but would be required to increase it by increments of 4, i.e., 2, 6, 10, etc.

But here we come to the most important point. There is a vast market for the transmission of traditionally notated music via the MIDI standard. &dAThis data is typically not generated by a performance, but rather is compiled via data entry and software from musical scores.&d@ The information about dynamics in these scores is sketchy and very much open to interpretation. The user is therefore &dAnot&d@ particularly interested in this information (as communicated in the velocity parameter), since he/she will probably be using his/her software to modify these parameters in performance. &dAThe inclusion of pitch spelling information in these music files via the low order bits of the velocity parameter provides the user with valuable information on music printing while in no way disrupting the established MIDI standard. Software that knows about this information can easily make use of it (we have software running that does this), while at the same time software that knows nothing about this information is in no way confused or disrupted.&d@

D. More detail

It should be pointed out that the velocity parameter applies to both note-on and note-off events. Pitch spelling information need only be attached (and should only be attached) to note-on events (MIDI 9x commands). The reason is that the need to know pitch spelling comes at the time a note is turned on, not off; and it can happen that a note is turned on several times and subsequently turned off the same number of times. In this case, there is no way to connect each note-off event with a specific note-on event, so representing pitch spelling in note-off events might lead to confusing or erroneous results.
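The use of the two low-order bits of the note-on velocity, as proposed in section III.B, might be sketched in Python as follows. This is a minimal illustration and not the software referred to above. The function names are hypothetical, and the sketch assumes that the spelling table given for pitches 60 to 71 repeats in every octave (pitch number modulo 12), which this application does not state explicitly.

```python
# Sketch: carrying pitch spelling in the two low-order bits (v) of a
# note-on velocity. Rows are copied from the table in section III.B
# ("f" = flat, "#" = sharp). Indexing by pitch % 12 is our assumption.

SPELLINGS = {
    1: ["Dff", "Df", "Eff", "Fff", "Ff", "Gff",
        "Gf", "Aff", "Af", "Bff", "Cff", "Cf"],
    2: ["C", "C#", "D", "Ef", "E", "F",
        "F#", "G", "G#", "A", "Bf", "B"],
    3: ["B#", "B##", "C##", "D#", "D##", "E#",
        "E##", "F##", "F###", "G##", "A#", "A##"],
}


def encode_velocity(velocity: int, pitch: int, spelling: str) -> int:
    """Replace the two low-order bits of a note-on velocity with v."""
    if not 1 <= velocity <= 127:
        raise ValueError("note-on velocity must be in the range 1..127")
    pitch_class = pitch % 12
    for v, row in SPELLINGS.items():
        if row[pitch_class] == spelling:
            # v is never 0 here, so the result is never 0; a note-on
            # with velocity 0 would be read as a note-off.
            return (velocity & ~0b11) | v
    raise ValueError(f"{spelling!r} is not a listed spelling of pitch {pitch}")


def decode_spelling(pitch: int, velocity: int):
    """Return the spelling carried by a note-on velocity, or None if v = 0."""
    v = velocity & 0b11
    if v == 0:
        return None  # pitch spelling undetermined
    return SPELLINGS[v][pitch % 12]


if __name__ == "__main__":
    for name in ("Df", "C#", "B##"):
        vel = encode_velocity(90, 61, name)
        print(name, vel, decode_spelling(61, vel))
```

With a nominal velocity of 90, the three spellings of pitch 61 are sent as 89, 90, and 91, which matches the discussion of the keys of E and F above. Software that ignores the convention simply plays those velocities.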
It is desirable that software reading MIDI files with pitch spelling information know that this information is, in fact, present. This fact can be communicated in a number of ways, the most sensible being via the &dBformat word&d@ (16 bits) of the Header Chunk of the MIDI file. Since the MIDI specification specifies this word for such purposes, our method for signalling the presence of pitch spelling information would not be included in this patent request.

Altogether, there are 13 bits available in the velocity parameter: 6 in the note-on event, and 7 in the note-off event. One bit in the note-on event is needed to indicate that the note, indeed, is to be turned on. We are proposing to use two of these bits (the low order two bits of the velocity byte of the note-on event) to represent pitch spelling. We believe these are the "best" bits to use, but we would like to request a patent on the use of any of the velocity bits for this purpose.

Furthermore, there are other musical attributes important in music printing which can be represented by bits in the velocity byte. We believe the most important of these attributes is information on the beginning and ending of slurs. Musical slurs are an indispensable part of music printing; yet the current MIDI specification provides no means to represent them. In our experience, we have found no cases where more than 3 slurs are operable at one time in one part. Therefore, it requires only the use of another 4 bits of the velocity parameter (the next four lowest in order) to represent slur-on and slur-off information.
This can be done in the following way:

    Value of the bits   Meaning
    -----------------   ------------------------------------------------
          0000          No slur information attached to this note
          0001          Start slur "A" on this note
          0010          Start slur "B" on this note
          0011          Start slur "C" on this note
          0100          Stop slur "A" on this note
          0101          Stop slur "A" and re-start slur "A" on this note
          0110          Stop slur "A" and start slur "B" on this note
          0111          Stop slur "A" and start slur "C" on this note
          1000          Stop slur "B" on this note
          1001          Stop slur "B" and start slur "A" on this note
          1010          Stop slur "B" and re-start slur "B" on this note
          1011          Stop slur "B" and start slur "C" on this note
          1100          Stop slur "C" on this note
          1101          Stop slur "C" and start slur "A" on this note
          1110          Stop slur "C" and start slur "B" on this note
          1111          Stop slur "C" and re-start slur "C" on this note

"A", "B", and "C" can be assigned as needed to slurs as they occur. If more than three slurs are active at one time (an unusual case), the excess over three would be ignored.

While it is less easy to use the velocity byte of note-off events to provide specific information about notes, this byte is ideal for setting general flags in the context of music printing. One possible use might be to signal stem directions, i.e., all notes after this point in the file have stem up, or all notes after this point in the file have stem down. Another use could be to signal to software how to make guesses about pitch spelling, assuming the exact spelling was not provided in the note-on velocity byte. Such a method would be less intrusive on the MIDI standard, since the note-off velocity byte is basically ignored by all applications; but it would require a tighter coupling between the data and the printing software, since the "guessing" algorithm would have to be uniquely specified.

E. Summary of our patent request

&dAThe two low order bits of the velocity byte of note-on events carry very little information about attack and dynamics as far as electronic performance is concerned.&d@
&dAIn the case where MIDI files are compiled from scores (and/or parts), we would like to patent the technology of representing pitch spelling information in these low order bits. We would further like to patent the technology of representing pitch spelling information in any of the bits of the velocity byte in note-on and note-off events.&d@

&dAThe general principle is that for the purposes of transferring musical information as compiled from a score to a MIDI file, 13 of the 14 bits of the velocity parameters (6 from note-on and 7 from note-off events) can be reassigned to represent information important to music printing. We would like to patent this general principle for the purpose of distributing our data for purposes of both musical playback AND music printing.&d@
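As an illustration of the slur codes tabulated in section D, the sketch below (Python, with hypothetical function names) reads and writes the four bits just above the two pitch-spelling bits of a note-on velocity. It relies on one observation about that table: the upper two bits of each code name the slur to stop and the lower two bits name the slur to start, with 00 meaning "none". The exact bit positions used here (bits 2 through 5) are our reading of "the next four lowest in order".

```python
# Sketch: slur information in bits 2..5 of a note-on velocity byte.
# Bits 0..1 are left alone because they carry pitch spelling.
# Function names and the exact bit positions are assumptions.

SLUR_NAMES = (None, "A", "B", "C")
SLUR_SHIFT = 2
SLUR_MASK = 0b1111 << SLUR_SHIFT


def decode_slur(velocity: int):
    """Return (slur to stop, slur to start); None means no action."""
    code = (velocity & SLUR_MASK) >> SLUR_SHIFT
    return SLUR_NAMES[code >> 2], SLUR_NAMES[code & 0b11]


def with_slur(velocity: int, stop=None, start=None) -> int:
    """Return the velocity with its slur bits replaced by the given actions."""
    code = (SLUR_NAMES.index(stop) << 2) | SLUR_NAMES.index(start)
    result = (velocity & ~SLUR_MASK) | (code << SLUR_SHIFT)
    if result == 0:
        # A note-on with velocity 0 is treated as a note-off, so at
        # least one bit outside the slur field must remain set.
        raise ValueError("resulting note-on velocity would be zero")
    return result


if __name__ == "__main__":
    vel = with_slur(91, start="A")        # spelling bits (v = 3) are kept
    print(vel, decode_slur(vel), vel & 0b11)
```

Because the slur field overlaps bits that strongly affect loudness, a file using it is suited to data compiled from scores rather than to recorded performances, as argued in section C.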