Thinking: 9-10-93 
       ─────────────────

Subject: Release of data 

    We are planning to release our data in five formats, namely:

      1. Score input files 

      2. DARMS code 

      3. MIDI files: format 0 and format 1 

      4. David Huron Ho-Hum files 

      5. CCARH files 

    The first question to address is: what will the CCARH format
consist of?  At the moment there are two options: condensed
stage 2 and full stage 2.  We also need to ask what the
motivation is for releasing our format at all.  There is no
third-party software out there that can use this data; the only
software that will work on the data is the stuff I have written.
This gets to the question of whether or not I want to release
the monster.  There are pros and cons to this.
    
    PROS: 1) revenue 
    ────  2) recognition
          3) satisfaction in helping people do 
               their work 
          4) get "real world" feedback on the 
               value of this software 
          5) 

    CONS: 1) There is a lot of work to do before
    ────       release: documentation, packaging,
               added features.
          2) There is a certain amount of work in 
               marketing and selling.  
          3) The program is not easy to learn, does
               not follow conventional methods, and
               requires a serious investment of time
               and energy.
          4) The program has not been checked on 
               different kinds of hardware and with 
               different configurations of operating 
               systems.  
          5) A protracted period of "training" of 
               customers might be required.  This 
               could divert valuable time away from 
               other important projects.  
          6) Releasing CCARH internal programs might 
               be giving away important technology.  
               The only people who would want these 
               programs might be people who were real 
               or potential competitors.  
           
An alternative to consider is not to release the monster at this
time, but rather to release user/task-oriented programs later,
packaged in a "Windows" format and designed to compete with
Finale, Score, the NoteProcessor, and other "user friendly"
software.

Assuming I do not release my software, the question becomes: is
there any reason to release data in a CCARH format?  Let's look
at what people are getting in the other formats.

  1. Score    Ability to print scores and parts 
              Ability to create live performances 

  2. DARMS    Ability to print scores and parts 
              Ability to create live performances 

  3. MIDI     Ability to create live performances 
              Input to Finale?  

  4. Ho-Hum   Analysis

By the way, there is an enhancement to MIDI we can make that would
make our files much more attractive for printing (eventually), once
software is able to take advantage of it.  This would be to add
pitch name information to the attack parameter.  It turns out that
for any MIDI number, there are three possible pitch spellings within
the range of two flats to two sharps (the one exception is G#/Ab,
MIDI 68, whose third spelling needs three sharps).  These are listed
below:

     (b = flat, # = sharp)

     1       / Dbb      / Fbb      / Gb       / Bbb      1
     2    60 ─ C     63 ─ Eb    66 ─ F#    69 ─ A        2
     3       \ B#       \ D#       \ E##      \ G##      3

     1       / Db       / Fb       / Abb      / Cbb      1
     2    61 ─ C#    64 ─ E     67 ─ G     70 ─ Bb       2
     3       \ B##      \ D##      \ F##      \ A#       3

     1       / Ebb      / Gbb      / Ab       / Cb       1
     2    62 ─ D     65 ─ F     68 ─ G#    71 ─ B        2
     3       \ C##      \ E#       \ F###     \ A##      3
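
A quick way to double-check the list above is to generate it.  The
sketch below is only an illustration: the function name spellings and
the use of "b" for flat and "#" for sharp are my own assumptions.  It
lists every spelling of a MIDI number that needs no more than a given
number of accidentals, flattest first.

```python
# Natural letter names and their pitch classes (C = 0 ... B = 11).
NATURALS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def spellings(midi_number, max_accidentals=2):
    """Return the spellings of a MIDI number, flattest first.

    'b' stands for a flat and '#' for a sharp, so 'Dbb' is D double flat.
    Octave numbers are ignored; only the letter and accidentals are given.
    """
    pitch_class = midi_number % 12
    found = []
    for letter, natural in NATURALS.items():
        # Signed distance from the natural note, folded into -6..+5.
        offset = (pitch_class - natural + 6) % 12 - 6
        if abs(offset) <= max_accidentals:
            accidental = "#" * offset if offset > 0 else "b" * -offset
            found.append((offset, letter + accidental))
    return [name for _, name in sorted(found)]


if __name__ == "__main__":
    for number in range(60, 72):
        print(number, spellings(number))
```

With the default limit of two accidentals, every number from 60 to 71
gets three spellings except 68, which gets only Ab and G#; raising the
limit to three adds F### (and Bbbb) for that pitch.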

The attack parameter, which can vary from 1 to 127, is really a lot
more sensitive than is needed.  What I propose is that the attack
parameter mod 4 be used to convey pitch spelling.  Let
v = the attack parameter.  Then the following table, in which
"v mod(k)" means that v mod 4 = k, shows how this can be used to
determine pitch spelling:

 pitch   60   61   62   63   64   65   66   67   68   69   70   71 
ДДДДДДДДЕДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДДД 
v mod(0)і  - - - - - pitch - spelling - undetermined - - - - - - -  
        і 
v mod(1)і D  D   E  F  F   G  G   A  A   B  C  C 
        і 
v mod(2)і C    C   D    E   E    F    F   G    G   A    B   B 
        і 
v mod(3)і B   B  C  D   D  E   E  F  F G  A   A 
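
To make the proposal concrete, here is a minimal sketch of how a
reading program might recover the spelling from a note-on.  The names
SPELLINGS and decode_spelling are hypothetical, and the three rows
simply transcribe the table above with the accidentals written out as
"b" (flat) and "#" (sharp).

```python
# SPELLINGS[k][pitch_class] is the spelling selected when v mod 4 == k.
# Row 0 is absent on purpose: v mod 4 == 0 means "spelling undetermined".
SPELLINGS = {
    1: ["Dbb", "Db", "Ebb", "Fbb", "Fb", "Gbb",
        "Gb", "Abb", "Ab", "Bbb", "Cbb", "Cb"],
    2: ["C", "C#", "D", "Eb", "E", "F",
        "F#", "G", "G#", "A", "Bb", "B"],
    3: ["B#", "B##", "C##", "D#", "D##", "E#",
        "E##", "F##", "F###", "G##", "A#", "A##"],
}


def decode_spelling(midi_number, velocity):
    """Return the pitch spelling carried by a note-on, or None if undetermined."""
    if not 1 <= velocity <= 127:
        raise ValueError("note-on attack must be in the range 1..127")
    row = velocity % 4
    if row == 0:
        return None  # the file makes no claim about the spelling
    return SPELLINGS[row][midi_number % 12]


if __name__ == "__main__":
    for v in (88, 89, 90, 91):
        print(61, v, decode_spelling(61, v))
```

For MIDI number 61, an attack of 89 reads as Db, 90 as C#, 91 as B##,
and 88 leaves the spelling undetermined.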

An advantage of this system is that the variation in the attack
parameter caused by specification of spelling would, in fact, be minimal
for any particular key.  This is because most music in a key is notated
in a consistent way.  Let's take, for example, the key of E with four
sharps.  The picture below shows the pitch spellings one would most
likely find in this key.  The most common ones (red) are marked * and
the next most common (green) are marked +.


 pitch    60   61   62   63   64   65   66   67   68   69   70   71
────────┼────────────────────────────────────────────────────────────
v mod(1)│ Dbb  Db   Ebb  Fbb  Fb   Gbb  Gb   Abb  Ab   Bbb  Cbb  Cb
        │
v mod(2)│ C+   C#*  D*   Eb   E*   F+   F#*  G+   G#*  A*   Bb   B*
        │
v mod(3)│ B#*  B##  C##+ D#*  D##+ E#*  E##  F##* F### G##+ A#*  A##

As can be seen, all of these pitches are either mod(2) or mod(3).
This means that if we, for example, want to use an attack in the range
of 90, we would be choosing between 90 and 91 virtually all of the
time.  In my experience with MIDI, I have never been able to hear
the difference between an attack of 90 and an attack of 91.  Let's
look at a second case (the worst case I can find), namely the key of F
(one flat).


v mod(1)і D  &dID &d@  E  F  F   G  &dIG &d@  A  &dIA &d@  B  C  C 
        і 
v mod(2)і &dAC  &d@  &dAC &d@  &dAD  &d@  &dAE &d@  &dAE  &d@  &dAF  &d@  &dAF &d@  &dAG  &d@  &dAG &d@  &dAA  &d@  &dAB &d@  &dAB  
        і 
v mod(3)і B   B  C  &dID &d@  D  &dIE &d@  E  F  F G  &dIA &d@  A 

In this key, the most common notes are distributed among all three
rows, but significantly, all of the red ones are in the mod(2) row.  
This means that 99 percent of the notes would have the same mod value.  
In our case above, almost all of the attacks would be 90, with a few 
89's and a few 91's.  I don't think the listener would notice the 
difference.  
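
The arithmetic in these two examples can be sketched as follows.  This
assumes that whoever writes the file picks, for each note, the legal
attack nearest to the one originally wanted; that rule and the name
nearest_velocity are my own assumptions, used here only for
illustration.

```python
def nearest_velocity(target, residue):
    """Return the attack in 1..127 nearest to target with attack % 4 == residue.

    Ties are broken in favour of the lower attack.
    """
    if not 1 <= target <= 127:
        raise ValueError("target attack must be in the range 1..127")
    if residue not in (0, 1, 2, 3):
        raise ValueError("residue must be 0, 1, 2 or 3")
    candidates = [v for v in range(1, 128) if v % 4 == residue]
    return min(candidates, key=lambda v: (abs(v - target), v))


if __name__ == "__main__":
    # Desired attack of 90: one candidate for each spelling row.
    for row in (1, 2, 3):
        print("mod(%d):" % row, nearest_velocity(90, row))

    # Largest adjustment needed for any desired attack and any spelling row.
    worst = max(abs(nearest_velocity(t, row) - t)
                for t in range(1, 128) for row in (1, 2, 3))
    print("largest adjustment:", worst)
```

For a desired attack of 90 this gives 89, 90 and 91 for the mod(1),
mod(2) and mod(3) rows, and for any desired attack from 1 to 127 the
adjustment is never more than 2.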

By the way, if the listener is really interested in attacks, he is 
probably going to edit the MIDI file anyway.  For this purpose, he 
could set all of the attacks first to mod(0) and proceed from there.  


Getting back to the question of whether to release in the CCARH format, 
I have more to add.  The CCARH format is the most complete and most 
specific format we have for the data.  DARMS is next, but DARMS is hard 
to work with and DYDO's program is full of bugs.  If we want to release 
text, for example, I think the best way is to release it in our own 
stage2 format.  The fact that it is available at all may contribute to 
a higher standard of data representation.  I still worry that
little or no software would be available to take advantage of
this data, but at least it sends a message about why we are different.


So, back to the question of what format to use for CCARH files.  My 
current thinking would be to release stage2 files, and not the con- 
densed stage2 that I previously designed.  Reasons: 

  (1) condensed stage2 is not as complete as SCORE or DARMS, and
        therefore not as good; moreover, there is no software
        other than mine that works with it.

  (2) if we do include pitch spelling in our MIDI release, then
        condensed stage2 has little more real information than
        MIDI and not one-tenth the appeal.

  (3) The only advantage I can see to condensed stage2 is that it 
        would give more precise information on ties and the breakup 
        of rests and the structure of repeats, etc. than MIDI would.  
        Also, condensed stage2 takes up less space than stage2.

  (4) The space factor is one that needs to be considered, but I
        could devise a condensing algorithm that would compact
        stage2 files significantly.  This could effectively eliminate
        the need for condensed stage2.  I would have to write a 
        decompression program, however.  
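
Before designing a special condensing scheme, the space factor could
be gauged with a general-purpose compressor.  The sketch below uses
Python's zlib module purely as a stand-in; it is not the algorithm
contemplated above, the function names are hypothetical, and the
sample text is a made-up placeholder rather than real stage2 data.

```python
import zlib


def compress_text(text):
    """Compress a text file's contents with zlib at the highest level."""
    return zlib.compress(text.encode("utf-8"), 9)


def decompress_text(blob):
    """Invert compress_text, returning the original text."""
    return zlib.decompress(blob).decode("utf-8")


if __name__ == "__main__":
    # Placeholder content only; a real test would read an actual stage2 file.
    sample = "placeholder line standing in for one stage2 record\n" * 200
    packed = compress_text(sample)
    assert decompress_text(packed) == sample
    print(len(sample), "characters ->", len(packed), "bytes")
```

The round trip is lossless, so the decompression side would come for
free with this approach; how much real stage2 files shrink would still
have to be measured.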

So, what are the conclusions to all of this thinking?  

   1. Release MIDI data in "pitch-specific" format.  

   2. Release CCARH data in stage2 format.  Devise some way to 
        compress the files.  

   3. Don't do any more work on the condensed stage2 format.

   4. Don't worry for the moment about the release of CCARH 
        internal software.  

Where do we stand at this moment?  

   1. SCORE   I have written some experimental programs, but 
              these need to be formalized and tested.  I need 
              to get SCORE working on this computer and also 
              need to understand more about how it works.  

   2. DARMS   It would appear that Brent's program works 
              pretty well.  I need to get Stephen's program 
              working on this computer and test out the 
              results of Brent's program for each of the 
              data sets we plan to release.  

   3. MIDI    I now have a good understanding of what format 1 
              MIDI files look like.  I need to get a program 
              that takes these as input and does something 
              with them.  I also need to get that black box 
              working on this computer.  It would be nice if 
              the box could be made to work with the Monster 
              program.  

   4. HO-HUM  This will be the last format to be completed.  I 
              do not contemplate working on it for the time 
              being.  

   5. CCARH   I need to devise a plan for compressing these 
              files.  Also, I need to invent and test sound 
              records for the various types of ornaments.  
              Also, I need to improve the S2MCOMP program to 
              deal with chords and the backspace command.