. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


                         H U M D R U M   N E W S


        Issue No. 2                                 1995 February 20

        A Newsletter for Music Researchers Using the Humdrum Toolkit.


   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .



HUMDRUM NEWS is a newsletter intended to facilitate communication with
music researchers who are either using the Humdrum Toolkit, or are
contemplating using the Humdrum Toolkit.

This issue of HUMDRUM NEWS conveys several announcements and three articles.

The announcements include notice of the availability of preprinted copies
of the Humdrum Reference Manual, the availability of Bach's Brandenburg
Concertos in Humdrum format, and a two-week Humdrum course to be held
at McGill University in May.

The articles:

(1) We begin with a general article on the Humdrum approach to music
    representation.

(2) This is followed by an extended tutorial showing how to use Humdrum
    to locate violations of the conventional rules of voice-leading.
    The tutorial ends with an 80-line "clip-out" shell script that allows
    users to check any score for 9 types of voice-leading transgressions.

(3) Finally, Gregory Lewin provides helpful tips for printing the 552-page
    Humdrum Reference Manual on a local printer.  The tips are especially
    useful for those who don't have access to a PostScript printer, and
    who can't afford a preprinted copy of the Manual.

Your comments and questions are welcome.  Mail to

     David Huron
     dhuron@watserv1.uwaterloo.ca

     Music Department
     University of Waterloo,
     Waterloo, Ontario     N2L 3G6
     Canada


   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

                                    \|||/
                                    (o o)
                      -----------o00-(_)-00o-----------
                     |                                 |
                     |      HUMDRUM ANNOUNCEMENTS      |
                     |                                 |
                      ---------------------------------


::::::::::::::::::::::::
HUMDRUM REFERENCE MANUAL  Preprinted Copies Available
::::::::::::::::::::::::

Printed copies of the Humdrum Reference Manual are now being made
available by the Center for Computer Assisted Research in the Humanities.
The 552-page manual sells for just $25 (US); it is printed (double-sided)
on 8x11 paper stock with plastic-ring binding.

For further information write, phone, or e-mail:

     Center for Computer Assisted Research
        in the Humanities
     525 Middlefield Road, Suite 120
     Menlo Park, California    94025-344

     1-800-JSB-MUSE  (toll-free order & information line)

     ccarh@netcom.com

   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


::::::::::::::
HUMDRUM COURSE  Offered at McGill University in May 1995
::::::::::::::

The Faculty of Music at McGill University (Montreal) is offering an
intensive 2-week course entitled "Computer-Assisted Musicology Using
Humdrum."  The 3-credit-hour course will be held from Monday May 15th
to Friday May 26th.  The course offers a total of 30 hours of
instruction in the mornings, with tutorial sessions each afternoon.
The advertised cost of the course is C$746 (approx. $529 US) for
non-Canadian residents and C$167 for Canadian residents.

The printed course description and an outline of lecture material
are reproduced below.

For further information contact the Summer Studies programme, McGill
University, Tel. (514) 398-6896.


DESCRIPTION:

     This course introduces participants to the use of the Humdrum
     Toolkit in computer-assisted music research.  The course will
     emphasize the development of practical hands-on skills.  Tutorial
     problems will stress editorial, analytic, cultural, historical,
     stylistic, and perceptual problem-solving.  (NOTE: Music printing
     and MIDI composition are not covered in this course.)

     Participants will learn how to use Humdrum to pose and answer a
     very wide range of problems, from characterizing patterns of
     orchestration in Beethoven symphonies to finding similar fingering
     patterns in Japanese shamisen tablatures.  This course will be of
     particular interest to those contemplating advanced music research
     projects.


PREREQUISITES:

     Some working knowledge of UNIX is a considerable asset.


OUTLINE OF LECTURE/TUTORIAL MATERIAL:

     Introduction & Preview.
     Representing Musical Information using '**kern'.
     Electronic Editions and control of variant documents.
     Extracting & Assembling Information.
     Defining & Searching for Patterns.
     The Art of Classifying.
     Generating Inventories.
     Pitches, Intervals, Contours, Tablatures, & other derivatives.
     Musical Similarity.
     Establishing Contexts.
     Patterns of Patterns.
     Syncopation, Accent, Dissonance, & other Tools.
     Tonal, Serial, & Information-Theoretic Analyses.
     Applications in Ethnomusicology, Early Music, & Perception.
     Data Encoding & Verification.
     Hypothesis Testing.
     Connecting Humdrum to Other Tools.


   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


:::::::::::::::::::::
BRANDENBURG CONCERTOS  Available in Humdrum Format
:::::::::::::::::::::

The Center for Computer Assisted Research in the Humanities has
announced the release of J.S. Bach's Brandenburg Concertos in electronic
form.  The data are available both in the Humdrum **kern representation
and in MIDI format.

The Center is selling the data for $13.95 for either the Humdrum **kern
version or the MIDI version.  Both versions can be purchased for $19.95.

The electronic edition of the six Concertos is an accurate transcription
of the Bach-Gesellschaft edition of 1871.  Other releases of Humdrum
scores from CCARH are expected later this summer.

For further information write, phone, or e-mail:

     Center for Computer Assisted Research
        in the Humanities
     525 Middlefield Road, Suite 120
     Menlo Park, California    94025-344

     1-800-JSB-MUSE  (toll-free order & information line)

     ccarh@netcom.com


   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

                                    \|||/
                                    (o o)
                      -----------o00-(_)-00o-----------
                     |                                 |
                     |            ARTICLES             |
                     |                                 |
                      ---------------------------------

::::::::::::::
REPRESENTATION
::::::::::::::

A common question in computer-based musicology is "What is the best
way to represent musical information?"

Consider, by way of example, the problem of representing *pitch*
information.  Should pitches be represented by letter-name and
accidental, by key-number, by frequency, by staff-position, or by
some other quantity?

Similarly, are melodies best represented as a sequence of pitches, or
as a sequence of intervals, or as a sequence of up-down contour
movements, or as successions of scale-degrees?

The answer to these questions depends on the final goal of the
representation.  For example, representing a melody as a sequence
of intervals (rather than a sequence of pitches) will make it easier
to find transposed versions of the melody.  But an "interval"
representation makes the melody more difficult to perform -- after
all, instruments play pitches, not intervals.

Similarly, a contour-based representation of a melody may prove to
be a better way of allowing a musically illiterate person to look
up the source of a melody.  As public librarians are well aware,
many ordinary people are eager to know the name of a melody -- but
they are unable to describe the melody in terms of pitches or
intervals.  As a result, popular musical reference tools use
contour-based representations for melodies.

Unfortunately, a contour-oriented representation will not allow a
theorist to identify whether an imitative theme is "real" or "tonal",
and contour representations are impossible to perform.  Also, many
melodies share the same initial contours.

You might think the best way to represent music is to use pitches,
and let other people write programs to derive the interval or contour
information they wish.  But forcing users to encode pitch will meet
with fierce resistance from, for example, early music scholars, who
might rather represent *neumes* without making pitch assumptions.
Similarly, a lute scholar may prefer representing tablatures rather
than making assumptions about pitch.

Some representations are based on the idea that rather than
representing pitches, we should represent the appearance of common
musical notation.  This approach can solve a number of problems, but
it also introduces a few problems.  Some users don't care what the
notation looks like -- such as ethnomusicologists, or electroacoustic
composers.  For many tasks, it doesn't matter whether middle C is
represented at the bottom of a treble staff or in the middle of an
alto staff.  In addition, distinguishing these as different represen-
tations can cause problems when searching for melodic patterns.
Moreover, the appearances of notations such as lute tablatures,
figured bass, or Gamelan number notation differ radically from common
staff notation.

Humdrum addresses the problem of representation by saying "Choose
whatever representation is most appropriate to your goal."  In other
words, we shouldn't attempt to use a single representation to cater
to all needs.

Having said that, Humdrum provides a number of tools that let users
translate between some of the most common types of musical infor-
mation.  (And, of course, you can make up your own.) What links all
of these notations is that they share a common syntax -- the Humdrum
Syntax -- described in Section 1 of the Reference Manual.

Consider, for example, a file (named "bach") which contains the
following representation:

**pitch
Bb3
A3
C4
B3
*-

The "**pitch" representation is pre-defined in Humdrum.  It corresponds
to the American National Standards Institute designation for pitches,
where "C4" is middle C.

Not everyone will like this type of pitch representation.
For example, using the pre-defined German **Tonh (Tonhoehe)
representation, we could represent the same pitch sequence as:

**Tonh
B3
A3
C4
H3
*-

Moreover, given either one of the above representations, we could
generate the other using either the PITCH command or the TONH
command.  The command:

     tonh bach

will produce German **Tonh output for the "bach" file.  Alternatively,
the command:

     pitch bach

will produce American **pitch output for the "bach" file.

Humdrum provides plenty of other pitch-related tools.  For example:

     solfg bach      - produces French solfege output for the "bach" file
     freq bach       - produces frequency output for the "bach" file
     cents bach      - produces cents output for the "bach" file
     cocho bach      - produces cochlear coordinates output

     And so on.

What we've just illustrated for pitch can also be done with rhythm
(e.g. duration, elapsed time, metric position, onset structure, tempo
deviations, etc., etc.).  In a future discussion we'll consider a
number of rhythm- and timing-related representations.

In this newsletter's tutorial, you'll see how alternative pitch-related
representations are used to help process voice-leading tasks.


   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

                                    \|||/
                                    (o o)
                      -----------o00-(_)-00o-----------
                     |                                 |
                     |            TUTORIALS            |
                     |                                 |
                      ---------------------------------

::::::::
TUTORIAL  Locating Violations of the Rules of Voice-Leading
::::::::

The traditional rules of voice-leading have formed a standard component
of conservatory training for art musicians.

For illustration purposes, we'll apply some of the Humdrum tools
to the problem of identifying betrayals of the classic rules of
voice-leading in a **kern-encoded score.  Note that our purpose here
is not to legislate how to compose or arrange!  We're simply using
the traditional voice-leading rules as a way to introduce various
pattern-searching techniques.

:::::::::
(1) PARTS Out Of Range
:::::::::

    RULE:  "Avoid parts that are out of range."

    The Humdrum CENSUS command provides a summary of various
    elementary features of any Humdrum input.  With the "-k"
    option, CENSUS provides a summary of a further ten features
    pertaining to **kern format inputs.  This includes the highest
    and lowest notes present.

    census -k <inputfile>

    Since we are interested in the highest and lowest notes for
    each individual part (rather than for the whole piece),
    we should EXTRACT each part before processing it with CENSUS.

    extract -i '*soprano' <inputfile>  > soprano.part
    census -k soprano.part

    On UNIX, a set of commands that sequentially process a given
    input can be joined together as a "pipeline".  A pipeline feeds
    the output of one process to the input of another process.  This
    means that we can simplify the above sequence of commands into a
    single pipeline:

    extract -i '*soprano' <inputfile> | census -k

    We could then repeat the pipeline for each voice present:

    extract -i '*alto'    <inputfile> | census -k
    extract -i '*tenor'   <inputfile> | census -k
    extract -i '*bass'    <inputfile> | census -k

    If we wanted to get a little fancier, we could filter the
    output so that only the highest and lowest pitch information
    is output.  The Unix GREP command will let us define a string
    for searching; EGREP permits compound strings, such as the
    use of the OR bar (|):

    extract -i '*soprano' <inputfile> | census -k | egrep 'Highest|Lowest'

::::::::::::::::::::::::
(2) AUGMENTED/DIMINISHED Melodic Intervals
::::::::::::::::::::::::

    RULE:  "Avoid parts that move by augmented or diminished
            intervals."

    Implementing this is simple.  We first translate our pitch-
    related data to the melodic interval format -- **mint.  This
    can be done using the Humdrum MINT command.  For example,
    consider the following melodic fragment from the 2nd movement
    of Bach's Brandenburg Concerto No. 5:

    **kern
    8r
    8f#
    8b
    16.cc#
    32dd
    8a#
    16.b
    32cc#
    8dd
    *-

    Given this input, the MINT command will produce the following
    output.  Plus signs indicate ascending intervals, while minus
    signs indicate descending intervals; 'P' means perfect, 'M'
    means major, 'm' means minor, 'A' means augmented, and 'd'
    means diminished:

    **mint
    r
    [f#]
    +P4
    +M2
    +m2
    -d4
    +m2
    +M2
    +m2
    *-

    Searching for diminished or augmented intervals is as simple
    as using the Unix GREP command, with the appropriate regular
    expression:

    grep -n '[Ad]' <inputfile>

    The -n option will cause GREP to prepend the line number of
    any matching patterns, so we can refer back to the original
    input file.

    In order to avoid the letters `A' or `d' found in comments or
    interpretations, we might consider using the Humdrum RID command
    (see HUMDRUM NEWS #1):

    rid -GLI <inputfile> | grep -n '[Ad]'

    However, this will cause the line numbers output by GREP to
    be wrong.  The line numbers will correspond to the input file
    with the comments and interpretations removed.

    A better approach is to send the complete file to GREP, and
    use a more circumspect regular expression to eliminate comments
    and interpretations WITHIN GREP.  EGREP allows us to define
    more complex regular expressions:

    egrep '^[^!*].*[Ad]'

    The expression `^[^!*]' means "any character other than an
    exclamation mark or asterisk at the beginning of a line."  The
    expression `.*[Ad]' means "zero or more instances of any character
    followed by either an upper-case letter `A' or a lower-case
    letter `d'."

    In other words, the complete regular expression matches any
    line that contains either an upper-case `A' or lower-case `d'
    as long as the beginning of the line does not start with an
    exclamation mark (i.e. Humdrum comment) or an asterisk (i.e.
    Humdrum interpretation).
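
    The difference between the plain and the refined search can be
    tried without any Humdrum tools.  The sketch below is only an
    illustration: the function name sample_mint and the comment line
    in the sample data are assumptions, and `grep -E' is used as the
    modern spelling of EGREP.

```shell
# Hypothetical sample data: the **mint output shown earlier,
# preceded by one global comment that happens to contain "A" and "d".
sample_mint() {
    printf '%s\n' '!! A diminished example' '**mint' 'r' '[f#]' \
        '+P4' '+M2' '+m2' '-d4' '+m2' '+M2' '+m2' '*-'
}

# Plain search: reports the comment line as well as the interval.
naive=$(sample_mint | grep -n '[Ad]')

# Refined search: ignores lines that begin with "!" or "*".
refined=$(sample_mint | grep -E -n '^[^!*].*[Ad]')

printf 'plain search:\n%s\n' "$naive"
printf 'refined search:\n%s\n' "$refined"
```

    Note that the refined expression requires at least one character
    before the `A' or `d'.  That suits signed tokens such as `-d4',
    but a token that began a line with `A' or `d' would be missed.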

    If we want to look for augmented or diminished intervals in a
    particular part or voice, we would begin by using the EXTRACT
    command to isolate the voice of interest.

    Finally, putting all of the elements together in a Unix pipeline,
    we get the following command for identifying augmented or
    diminished melodic intervals:

    extract -i '*alto' <file> | mint | egrep -n '^[^!*].*[Ad]'

    If there is no output, then there are no augmented or diminished
    intervals present.

:::::::::::::::
(3) CONSECUTIVE Fifths or Octaves
:::::::::::::::

    RULE:  "Avoid consecutive fifths and octaves between any two
            parts."

    Let's focus on identifying consecutive fifths -- since the
    process is the same for octaves.

    Either the Humdrum PATT command or the PATTERN command can be
    used to find patterns that span more than one line or record.
    For this example, we'll use PATT.

    First, we need to reformat our input so the data represent
    harmonic intervals rather than pitches.  The Humdrum HINT
    command will change most pitch representations to the harmonic
    interval representation -- **hint.  Consider, for example, the
    following input:

    **kern   **kern
    =1       =1
    4c       4e
    4G       4d
    =2       =2
    2F       2c
    *-       *-

    Given the following command:

    hint <inputfile>

    The following output will be produced:

    **hint
    =1
    M3
    P5
    =2
    P5
    *-

    (Notice that, in this case, the consecutive fifths are
    separated by a barline.)

    Second, we need to define a pattern template for the PATT
    command.  The template is a series of one or more regular
    expressions that are stored in a separate file.  In this case
    the pattern is trivial: just two consecutive perfect fifth
    tokens.  We might store the following pattern in the file
    "template":

    P5
    P5

    (Note that if we were looking for consecutive `fifths' that need
    not necessarily be `perfect,' we could simply eliminate the
    letter "P" in each interval given in the template.)

    Given the above output from the HINT command, we could search
    for occurrences of the defined pattern using the following
    command:

    patt -f template -s = <hint.file>

    The "-f" option is used to identify the file ("template") in
    which the pattern-template has been stored.  The "-s" option
    tells PATT of any input records that should be SKIPPED during
    the search process.  The "-s" option is followed by a regular
    expression -- in this case the equals-sign -- so that any
    input records containing the equals-sign (i.e. **hint barlines)
    are ignored.

    The default output from PATT identifies the location of any
    instances of the pattern it finds in the source document.

    The appropriate pipeline is:

    hint <inputfile> | patt -f template -s =
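
    To make the logic of the search concrete, here is a rough stand-in
    written in AWK.  It is NOT the PATT command and does not reproduce
    PATT's output format; it simply prints the line number of the second
    of two successive "P5" tokens in a single-spine input, skipping
    barlines in the way that "-s =" does.  The function names and the
    sample data (the **hint output shown above) are assumptions made
    for illustration.

```shell
# Hypothetical sample: the single-spine **hint output shown earlier.
sample_hint() {
    printf '%s\n' '**hint' '=1' 'M3' 'P5' '=2' 'P5' '*-'
}

# Print the line number of any P5 whose previous data token was also P5.
find_consecutive_fifths() {
    awk '
        /^=/    { next }    # skip barlines, as "patt -s =" would
        /^[!*]/ { next }    # skip comments and interpretations
        $1 == "P5" && prev == "P5" { print NR }
        { prev = $1 }       # remember the most recent data token
    '
}

hits=$(sample_hint | find_consecutive_fifths)
printf 'second fifth found on line: %s\n' "$hits"
```

    Without the barline rule, the "=2" record would sit between the
    two fifths and the pair would go unnoticed.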

    There are a few refinements we ought to add to this process.
    Currently, we are searching for parallel perfect fifths only.
    The consecutive fifths rule pertains to all intervals that
    are compound-equivalents to perfect fifths (such as perfect
    twelfths, etc.).

    This additional criterion is easily handled.  The HINT command
    provides a "-c" option that causes all compound intervals
    to be represented by their non-compound equivalents.  For
    example, major tenths and major seventeenths, etc. will all
    be represented as "M3", and so on.  Hence we would modify our
    pipeline:

    hint -c <inputfile> | patt -f template -s =

    (Note that an alternative way of handling the compound-intervals
    question would be to define slightly more complex regular
    expressions in our template file, e.g.

    P5|P12|P19
    P5|P12|P19

    In regular expressions the vertical bar (|) denotes the logical
    `OR' operation.  So the above pattern says "a perfect fifth OR
    a perfect twelfth OR a perfect nineteenth followed by a ..." )

    Another refinement relates to the selection of voices.  So far,
    we have presumed that the input consists of just two Humdrum
    spines containing separate parts.  In a multi-part score, we
    must examine each pair of voices in turn, in order to determine
    whether any pair exhibit consecutive fifths or octaves.

    The simplest (but most tedious) approach is to execute
    our command pipeline for each pairing of voices.  For example,
    in a traditional four-part harmonization:

    extract -i '*soprano,*alto'  <file> | hint -c | patt -f template -s =
    extract -i '*soprano,*tenor' <file> | hint -c | patt -f template -s =
    extract -i '*soprano,*bass'  <file> | hint -c | patt -f template -s =
    extract -i '*alto,*tenor'    <file> | hint -c | patt -f template -s =
    extract -i '*alto,*bass'     <file> | hint -c | patt -f template -s =
    extract -i '*tenor,*bass'    <file> | hint -c | patt -f template -s =

    (There are shorter ways of generating these pairings that involve
    a little shell programming, but we'll leave that for another time.)
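
    In the meantime, one possible shape for such a script is sketched
    here.  It is an illustration only: the function name voice_pairs
    and the file name "score.krn" are assumptions, and the loop prints
    each pipeline rather than running it, so it can be tried on a
    machine without Humdrum installed.

```shell
# Print every unordered pairing of the voice names given as arguments,
# one pairing per line, in the form expected by "extract -i".
voice_pairs() {
    while [ $# -gt 1 ]; do
        first=$1
        shift
        for other in "$@"; do
            printf '*%s,*%s\n' "$first" "$other"
        done
    done
}

# Dry run: show the pipeline for each pairing instead of executing it.
# "score.krn" is a placeholder file name.
voice_pairs soprano alto tenor bass | while read -r pair; do
    printf '%s\n' "extract -i '$pair' score.krn | hint -c | patt -f template -s ="
done
```

    Four voices give six pairings, matching the six pipelines listed
    above.  To run the searches for real, the final printf line would
    be replaced by the pipeline itself.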

:::::::::::
(4) DOUBLED Leading Tone
:::::::::::

    RULE:  "Avoid doubling the leading-tone."

    Pitches can be identified as "leading-tones" only when we have
    some idea of their key-related scale-degree.  The Humdrum DEG
    command translates pitch representations to a scale-degree 
    representation where the numbers 1 to 7 represent tonic to
    leading-tone designations.

    Notice that the score input must contain an explicit key
    indication (a special type of Humdrum tandem interpretation).
    For example, the key of G major is indicated through the presence
    of the following interpretation:

    *G:

    Minor keys are indicated using lower-case characters.  For
    example, the following passage is in B minor:

    **kern
    *b:
    8r
    8f#
    8b
    16.cc#
    32dd
    8a#
    16.b
    32cc#
    8dd
    *-

    The DEG command can be used to transform this representation to
    scale degree.  The passage begins on the dominant (degree `5'),
    ascends (^) to the tonic (`1'), ascends to the supertonic (^2),
    ascends to the mediant (^3) and then descends to the leading-
    tone (v7), etc.:

    **deg
    *b:
    r
    5
    ^1
    ^2
    ^3
    v7
    ^1
    ^2
    ^3
    *-

    The "-x" option for the DEG command eliminates from the output
    stream any characters that don't pertain to scale-degree.  Hence
    the following command will eliminate durations or other possible
    number representations that might conflict with scale-degree
    designations:

    deg -x <inputfile>

    Having translated the representation in this way, we need to search
    for any lines which contain two instances of the number `7' --
    that is, two concurrent instances of the leading-tone.

    Searching for the number `7' is easily done using the standard
    Unix GREP command (the name comes from the editor command
    g/re/p, "globally search for a regular expression and print"):

    deg -x <inputfile> | grep -n '7'

    This will find and output all records that contain the number 7;
    the "-n" option means that the corresponding line number will
    also be output.

    However, we want to find instances where two or more 7s occur
    on a single line.  For this, we can use a slightly more
    complex regular expression:

    deg -x <inputfile> | grep -n '7.*7'

    In the construction ".*" the period (.) means any character, and
    the asterisk means "zero or more instances of ..."  Hence, the
    regular expression means "the number 7 followed by zero or more
    instances of any character, followed by the number 7".  In short,
    this expression will match any record in which the number 7
    occurs at least twice.

    As in the case of our earlier search for augmented and diminished
    intervals, GREP is insensitive to whether the matching character
    string is found in a data record, or whether it occurs in a Humdrum
    comment or interpretation.  In order to avoid matching comments
    or interpretations, a further refinement to our regular expression
    is appropriate.

    deg -x <inputfile> | egrep -n '^7.*7|^[^!*].*7.*7'

    In this case, the regular expression says the following: "find
    any occurrence of the number 7 at the beginning of the line
    followed by zero or more characters followed by the number 7;
    or match any character at the beginning of the line -- other than
    an exclamation mark or an asterisk -- followed by zero or more
    characters, followed by the number 7, followed by zero or more
    characters, followed by another number 7."
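
    The two alternatives in this expression can be checked against a
    small made-up input.  In the sketch below, the function name
    sample_deg and the sample records (including the comment line)
    are assumptions chosen to exercise each case; `grep -E' is the
    modern spelling of EGREP.

```shell
# Hypothetical two-spine **deg data, tab-separated.
sample_deg() {
    printf '!! 7 voices, 7 bars\n'   # comment containing two 7s
    printf '**deg\t**deg\n'
    printf '*C:\t*C:\n'
    printf '1\t5\n'
    printf 'v7\t5\n'                 # only one leading-tone
    printf 'v7\t^7\n'                # doubled; line starts with "v"
    printf '7\t7\n'                  # doubled; line starts with "7"
    printf '*-\t*-\n'
}

# Plain search: line numbers of every record with two 7s.
naive=$(sample_deg | grep -n '7.*7' | cut -d: -f1)

# Refined search: the same, but ignoring comments and interpretations.
refined=$(sample_deg | grep -E -n '^7.*7|^[^!*].*7.*7' | cut -d: -f1)

printf 'plain search, lines:   %s\n' "$(printf '%s' "$naive" | tr '\n' ' ')"
printf 'refined search, lines: %s\n' "$(printf '%s' "$refined" | tr '\n' ' ')"
```

    The first alternative is needed for records such as the seventh
    sample line: there the opening `7' is itself one of the two
    leading-tones, so the second alternative alone would not match.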

    Depending on the input, it is possible that Humdrum spines will
    be present that do not represent scale degree information.
    It is possible that these other kinds of data may also make
    use of the number 7 -- but NOT to represent the leading-tone.
    In other words, it is possible that a matching `7' has nothing
    to do with scale degrees.  We can ensure that this doesn't
    happen by first ensuring that ONLY scale-degree spines are
    present in the input to be searched.

    In order to do this, we can use the Humdrum EXTRACT command as
    a filter, and identify the types of interpretations we want to
    pass.  In the following modification to our pipe, the EXTRACT
    command has been used to ensure that only **deg spines are present:

    deg -x <inputfile> | extract -i '**deg' | egrep -n '^7.*7|^[^!*].*7.*7'

    There are still some refinements that we could add to this
    command sequence, but as it stands it is guaranteed to find all
    doubled leading-tones -- provided the notes begin at the same
    time.  Consider the following hypothetical passage:

    **kern    **kern
    *C:       *C:
    8c        8g
    =1        =1
    4B        8g
    .         16a
    .         16b
    4A        4cc
    *-        *-

    Given the above command sequence, no doubled leading-tones
    would be identified in this passage.  However, we might wish
    to implement a more stringent criterion that would seek out any
    instances where the leading-tone sounds at the same time in more
    than one voice.  This occurs in the above example with the
    sixteenth-note B concurrent with the held quarter-note B in
    the other part.

    This criterion can be accommodated by a further refinement to
    our command pipeline.  The Humdrum FILL command is used to
    replace null data tokens by the immediately preceding data token
    in the same spine.  Consider first, the output from the DEG
    command for the above example:

    **deg     **deg
    *C:       *C:
    1         5
    =1        =1
    v7        5
    .         ^6
    .         ^7
    v6        ^1
    *-        *-

    If we now invoke the FILL command, the modified output is:

    **deg     **deg
    *C:       *C:
    1         5
    =1        =1
    v7        5
    v7        ^6
    v7        ^7
    v6        ^1
    *-        *-

    Notice that the two null tokens in the left-hand spine have
    been replaced by copies of the most recent data token.
    Now our GREP command will find the two leading tones in the
    second last data record.

    In summary, the complete command pipeline would be:

    deg -x <file> | extract -i '**deg' | fill -s = | egrep -n '^7.*7|^[^!*].*7.*7'

    This may seem somewhat complicated, but the basic structure of
    this pipeline is suitable for a very wide variety of pattern
    searches.

:::::::::
(5) AVOID Unisons
:::::::::

    RULE:  "Avoid the sharing of pitches by two parts."

    For this rule, let's assume that we also want to identify unisons
    that are spelled enharmonically (such as F-sharp and G-flat).

    First, we need to translate the two parts into some absolute
    pitch representation -- such as frequency or semitones.  This
    will ensure that enharmonically equivalent pitches have the
    same representation -- and so will facilitate comparison.

    The Humdrum SEMITS command translates pitches to semitone
    distances where middle C is denoted as zero.  For example,
    where two voices both play B3 at the same time, both the parts
    will have a **semits value of minus one (-1).

    Like the DEG command, the SEMITS command provides a "-x" option
    that eliminates from the output stream any characters that don't
    pertain to semitones.  Hence the following command will eliminate
    durations or other possible numerical representations that might
    conflict with semitone designations:

    extract -i '*alto,*tenor' <file> | semits -x

    Next we need to compare the two parts at each moment in order to
    determine whether they have the same numerical value.  The Unix
    AWK command will allow us to do some arithmetic.  AWK auto-
    matically parses an input, so the value of the first spine is
    referred to as `$1', the value of the second spine is `$2' and
    so on.  The AWK expression `$1==$2' is a test of whether the
    first and second spines are equivalent.  The AWK action `print NR'
    means to print the current line number (record number is `NR').

    So the following command will print the line number for any
    input in which the semitone value is the same for both the
    alto and tenor voices:

    extract -i '*alto,*tenor' <file> | semits -x | awk '{if($1==$2) print NR}'

    There is a problem with this pipeline however.  The AWK command
    will match all sorts of non-numeric inputs -- such as where
    null tokens (.) occur in both parts at the same time.  Consequently,
    we need to be careful to avoid non-numeric inputs and comments.

    The regular expression `[^0-9]' will match any line that contains
    a character other than a digit.  The expression `[^0-9-]' will
    match any line that contains a character other than a digit or the
    minus sign.  Since the tab character will also be present in our
    data records, we should also include the tab in our regular
    expression.  The tab may be denoted in regular expressions by the
    lower-case letter `t' preceded by a backslash.  Hence the expression
    `[^0-9\t-]' will match any line containing something other than
    digits, minus signs, and tabs.  These are the lines we want to skip.

    The following AWK script will output the line numbers for all
    inputs where the first and second spines contain identical
    numbers:

    awk '{if($0~/[^0-9\t-]/)next}{if($1==$2) print NR}'

    Adding this construction to our pipeline produces the following
    command for identifying unisons:

    extract -f 1,2 <file> | semits -x | fill -s = | awk '{if($0~/[^0-9\t-]/)next}{if($1==$2) print NR}'
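
    The effect of the extra AWK test can be seen on a small made-up
    input.  The sketch below assumes nothing about Humdrum itself:
    the function name sample_semits and the sample records are
    hypothetical, chosen to include a barline, a pair of null tokens,
    and a rest.

```shell
# Hypothetical two-spine **semits data, tab-separated.
sample_semits() {
    printf '%s\t%s\n' \
        '**semits' '**semits' \
        '=1'       '=1' \
        '-5'       '-1' \
        '.'        '.' \
        '-1'       '-1' \
        'r'        '2' \
        '*-'       '*-'
}

# Plain comparison: also reports interpretations, barlines, null tokens.
naive=$(sample_semits | awk '{if($1==$2) print NR}')

# Guarded comparison: skip any line containing something other than
# digits, minus signs, and tabs before comparing the two fields.
unisons=$(sample_semits | awk '{if($0~/[^0-9\t-]/)next}{if($1==$2) print NR}')

printf 'plain comparison, lines:   %s\n' "$(printf '%s' "$naive" | tr '\n' ' ')"
printf 'guarded comparison, lines: %s\n' "$unisons"
```

    Only the fifth sample line, where both parts hold the value -1,
    survives the guarded comparison.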

:::::::::::
(6) CROSSED Parts
:::::::::::

    RULE:  "Avoid the crossing of parts."

    Part-crossing occurs when a nominally higher voice uses a pitch
    that is lower than a nominally lower voice.

    The relations "higher" and "lower" suggest the use of an arithmetic
    operator such as greater-than (>) or less-than (<).  In brief, we
    will approach this question by translating the pitches to a
    numerical scale, and then use the general-purpose Unix AWK command
    to test whether the nominally lower voice is truly lower.

    First we need to translate the pitch representation to some sort
    of numerical form.  We have several options.  We could translate
    the pitches to frequency (**freq), or we could translate them to
    semitones (**semits), or we could translate them to cents (**cents).
    Let's use **semits.  Once again, in this representation, middle-C
    is represented by the number zero, and all other pitches are
    represented by their semitone distance (positive or negative) with
    respect to this reference.

    We extract the two parts of interest, and then translate them to
    the semitone numerical representation:

    extract -i '*alto,*tenor' <file> | semits -x

    Since part-crossing may occur when one voice is holding a note,
    we should use the Humdrum FILL command, as we did for the doubled
    leading-tone problem.  Hence:

    extract -i '*alto,*tenor' <file> | semits -x | fill -s =

    Finally, we can use the Unix AWK command to do a little arithmetic.
    Once again, in AWK, `$1' and `$2' refer to the first and second
    input fields, and the built-in variable `NR' refers to the current
    record (line) number.  The expression `{if($1>$2) print NR}' is
    a miniature program that says: "if the first input field is
    numerically greater than the second field for the current line,
    then print the line number":

    extract -i '*alto,*tenor' <file> | semits -x | fill -s = | awk '{if($1>$2) print NR}'

    In short, if the left-most spine has a higher numerical value than the
    second spine, then tell us where that occurs.

    Since the **semits representation uses the lower-case letter `r' to
    represent a rest, we should avoid the possibility of comparing a
    number (note) with a rest.  We can use a variation on the same AWK
    script as we used when checking for unisons:

    awk '{if($0~/[^0-9\t-]/)next}{if($1>$2) print NR}'

    Finally, the complete pipeline for identifying crossed parts:

    extract -i '*soprano,*alto' <file> | semits -x | fill -s = | awk '{if($0~/[^0-9\t-]/)next}{if($1>$2) print NR}'
    extract -i '*alto,*tenor'   <file> | semits -x | fill -s = | awk '{if($0~/[^0-9\t-]/)next}{if($1>$2) print NR}'
    extract -i '*tenor,*bass'   <file> | semits -x | fill -s = | awk '{if($0~/[^0-9\t-]/)next}{if($1>$2) print NR}'
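
    To see why the extra test matters, note that AWK compares two
    fields as character strings whenever one of them is not a number,
    so a rest (`r') can be reported as a crossing.  The following
    sketch contrasts the two versions of the test on hand-made data.
    The function names `crossed_raw', `crossed_filtered' and
    `crossing_sample' are hypothetical, and the input only imitates
    the output of SEMITS and FILL.

```shell
# Hypothetical helpers: the crossing test without and with the numeric filter.
crossed_raw()      { awk '{if($1>$2) print NR}'; }
crossed_filtered() { awk '{if($0~/[^0-9\t-]/)next}{if($1>$2) print NR}'; }

# Nominally lower voice in the first field, upper voice in the second.
crossing_sample() {
    printf '0\t4\n'               # line 1: lower voice below upper voice
    printf '5\t2\n'               # line 2: lower voice above upper (crossed)
    printf 'r\t2\n'               # line 3: rest in the lower voice
    printf '%s\t%s\n' -3 -8       # line 4: crossed, with negative values
}

echo "unfiltered:"; crossing_sample | crossed_raw
echo "filtered:";   crossing_sample | crossed_filtered
```

    The unfiltered test reports line 3 (the rest) along with the two
    genuine crossings; the filtered test reports only lines 2 and 4.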

:::::::::::::::::::
(7) PARTS SEPARATED By Greater Than An Octave
:::::::::::::::::::

    RULE:  "Avoid intervals greater than an octave between the soprano
            and alto voices.  Also avoid intervals greater than an
            octave between the alto and tenor voices."

    Finding infringements of this voice-leading rule requires just
    a slight modification to our method for identifying the crossing
    of parts.

    Having transformed the pitch input to a numerical form, we simply
    need to check whether the difference between the two semitone
    values is greater than 12 semitones.

    The AWK portion of our command is modified so that we are informed
    if the nominally higher voice is more than 12 semitones away
    from the other voice:

    extract -i '*soprano,*alto' <file> | semits -x | fill -s = | awk '{if($0~/[^0-9\t-]/)next}{if($2-$1>12) print NR}'
    extract -i '*alto,*tenor'   <file> | semits -x | fill -s = | awk '{if($0~/[^0-9\t-]/)next}{if($2-$1>12) print NR}'
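
    The boundary case is worth checking: an interval of exactly an
    octave (12 semitones) is permitted by the rule and should not be
    reported.  The sketch below tests this on hand-made data.  The
    function name `too_wide' and its optional limit argument are
    assumptions made for illustration; they are not part of Humdrum.

```shell
# Hypothetical helper: report records where the second field exceeds the
# first by more than a given number of semitones (12 unless told otherwise).
too_wide() {
    awk -v limit="${1:-12}" \
        '{if($0~/[^0-9\t-]/)next}{if($2-$1>limit) print NR}'
}

# Lower voice in the first field, upper voice in the second.
#   line 1: 0 and 12   exactly an octave (allowed)
#   line 2: 0 and 13   a semitone more than an octave (reported)
#   line 3: -5 and 9   14 semitones apart (reported)
printf '0\t12\n0\t13\n-5\t9\n' | too_wide
```

    Calling `too_wide 13' on the same input would report only line 3.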

::::::::::::::
(8) OVERLAPPED Parts
::::::::::::::

    RULE:  "Avoid the overlapping of parts, where the pitch in an
            ostensibly lower voice moves to a pitch higher than
            the previous pitch in an ostensibly higher voice; or
            where the pitch in an ostensibly higher voice moves
            to a pitch lower than the previous pitch in an
            ostensibly lower voice."

    The following passage illustrates a violation of the part
    overlapping rule:

    **pitch   **pitch
    C4        E4
    F4        A4
    E4        G4
    *-        *-

    (In the second sonority, the lower voice (F4) moves to a
    pitch higher than the previous pitch in the higher voice (E4).)

    This rule is similar to the part-crossing rule, only we have
    to compare the current pitch in one part with the previous
    pitch in another part.

    Rather than making a direct comparison, for the purpose of this
    tutorial, we will make a modification to our earlier part-crossing
    detector.  In brief, we will extract one of the parts, shift the
    data tokens within that part, paste the two parts back together,
    and then check to determine whether the shifted part shows any
    "crossed parts" -- using our earlier command pipeline.

    The following command pipe will shift the data tokens in a spine
    down one record.  (The last data record will disappear.)

    context -n 2 -p 1 -d XXX <file> | humsed 's/XXX.*//'

    In this tutorial, we won't discuss how this works, since the
    CONTEXT and HUMSED commands will be covered in a future
    tutorial.  For now, we can note that shifting (say) the alto
    part can be done by extracting the appropriate voice, and then
    using the shift command sequence shown above:

    extract -i '*alto' <file> | context -n 2 -p 1 -d XXX | humsed 's/XXX.*//' > alto.shf

    If we want to compare, say, the soprano and alto voices, we need to
    extract both parts, and shift one of them:

    extract -i '*soprano' <file> > soprano
    extract -i '*alto' <file> | context -n 2 -p 1 -d XXX | humsed 's/XXX.*//' > alto.shf

    Next, we need to assemble the shifted and unshifted parts back into
    a single score:

    assemble alto.shf soprano > tempfile

    Then we test this intermediate file for instances of "part-crossing"
    -- using our earlier command pipeline:

    semits -x tempfile | fill -s = | awk '{if($0~/[^0-9\t-]/)next}{if($1>$2) print NR}'

    Avoiding the temporary file altogether:

    assemble alto.shf soprano | semits -x | fill -s = | awk '{if($0~/[^0-9\t-]/)next}{if($1>$2) print NR}'

    Note that this procedure has determined whether any of the notes
    in the soprano voice are lower than the previous note in the
    alto voice.  We also need to check whether any of the notes in
    the alto voice are higher than the previous note in the soprano
    voice.  To do this, we simply repeat the process, shifting the other
    voice:

    extract -i '*soprano' <file> | context -n 2 -p 1 -d XXX | humsed 's/XXX.*//' > soprano.shf
    extract -i '*alto' <file> > alto
    assemble alto soprano.shf | semits -x | fill -s = | awk '{if($0~/[^0-9\t-]/)next}{if($1>$2) print NR}'

    This processing needs to be applied for each pair of successive
    voices -- soprano/alto, alto/tenor, tenor/bass.
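
    For comparison, the "direct" approach mentioned earlier can be
    sketched in AWK alone, by remembering the pitches of the previous
    record.  This is a different technique from the shift-and-assemble
    method used in this tutorial, and it is offered only as an
    illustration.  It assumes two tab-separated spines of semitone
    numbers with the nominally lower voice first; records containing
    anything non-numeric are simply skipped, so the "previous" pitches
    are those of the most recent purely numeric record.  The function
    name `find_overlaps' is invented here.

```shell
# Hypothetical helper: report records where the lower voice rises above the
# previous upper pitch, or the upper voice falls below the previous lower pitch.
find_overlaps() {
    awk '
        /[^0-9\t-]/ { next }    # skip comments, interpretations, rests, nulls
        seen && ($1 + 0 > prev_upper || $2 + 0 < prev_lower) { print NR }
        { prev_lower = $1 + 0; prev_upper = $2 + 0; seen = 1 }
    '
}

# The passage shown above (C4 E4 / F4 A4 / E4 G4) expressed in semitones:
printf '0\t4\n5\t9\n4\t7\n' | find_overlaps
```

    For the passage above the sketch reports line 2, where the lower
    voice (F4, 5 semitones) moves above the previous upper pitch (E4,
    4 semitones).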

:::::::::::
(9) EXPOSED Octaves
:::::::::::

    RULE:  "When approaching an octave by similar motion, ensure that
            at least one of the parts moves by step."

    Violations of the exposed octaves rule must meet three
    conditions.  First, the two voices must be separated by an
    octave (or two octaves, or a unison, etc.).  (This suggests
    that we use the HINT command with the "-c" option in order to
    reduce compound intervals to their non-compound equivalents.)
    Second, the voices must be moving in the same direction.  (The
    **deg representation may be suitable here, since it distinguishes
    notes according to whether they are approached from below ("^")
    or above ("v").)  Third, both voices must be moving by leap
    (e.g. more than two semitones).

    To address this problem, let's plan to create five different
    spines.  The first spine will encode harmonic interval size so
    that all compound equivalents to a unison are represented by
    the string "P1".

    The second spine will indicate whether the lower voice is
    ascending or descending ("^" or "v").  Similarly, the third
    spine will indicate whether the upper voice is ascending or
    descending.

    The fourth spine will indicate whether the melodic motion for
    the lower voice is by leap ("leap"), and the fifth spine will
    indicate whether the melodic motion for the upper voice is by
    leap.

    Examples of violations of the exposed octaves rule will appear
    as one of the following two situations:

   (**hint     **updown    **updown    **size   **size)
    P1         ^           ^           leap     leap
    P1         v           v           leap     leap

    Any other situation means that the exposed octaves rule has not
    been violated.

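    In other words, our final test can be expressed using the
    following EGREP command.  (The carets are preceded by backslashes
    so that EGREP treats them as literal characters rather than as
    beginning-of-line anchors.)

    egrep -n 'P1.*\^.*\^.*leap.*leap|P1.*v.*v.*leap.*leap'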

    Now all we need to do is generate our five spines and assemble
    them in the proper order.

    The first spine is easily generated using the HINT command.
    Remember that the "-c" option means that all intervals an octave
    or greater will be represented by the within-octave equivalent.

    extract -i '*alto,*tenor' <file> | hint -c > spine1

    The second and third spines can be generated using the Humdrum
    DEG command:

    extract -i '*alto' <file>  | deg -x > spine2
    extract -i '*tenor' <file> | deg -x > spine3

    The fourth and fifth spines require a little more work.  First,
    we calculate the melodic intervals for each voice using the
    Humdrum MINT command.

    extract -i '*alto' <file> | mint  ...

    Secondly, we need to change all data tokens indicating intervals
    greater than a diatonic second (that is, a third or larger) into the
    data token consisting of the (arbitrary) character string "leap".
    This can be done using the HUMSED stream editor.

    ...  humsed 's/.*[3-9].*/leap/' > spine4   [spine5 for the other voice]

    Putting it all together, the following command sequence will let
    us identify any instances of exposed octaves between two arbitrary
    voices:

    extract -i '*alto,*tenor' <file> | hint -c > spine1
    extract -i '*alto' <file>  | deg -x > spine2
    extract -i '*tenor' <file> | deg -x > spine3
    extract -i '*alto' <file>  | mint | humsed 's/.*[3-9].*/leap/' > spine4
    extract -i '*tenor' <file> | mint | humsed 's/.*[3-9].*/leap/' > spine5
    assemble spine1 spine2 spine3 spine4 spine5 > tempfile
    egrep -n 'P1.*\^.*\^.*leap.*leap|P1.*v.*v.*leap.*leap' tempfile
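
    The final pattern match can be tried on hand-made records before
    running the whole sequence.  The sketch below is an illustration
    only: the function names and the schematic five-spine data are
    invented, `grep -E' is used as the equivalent of EGREP, and the
    carets are escaped (as in the LEADER script) so that they match
    the literal "^" character.

```shell
# Hypothetical helper: print the numbers of the lines that match the
# exposed-octave pattern.
exposed_octaves() {
    grep -En 'P1.*\^.*\^.*leap.*leap|P1.*v.*v.*leap.*leap' | cut -d: -f1
}

# Schematic records: interval, two directions, two melodic motions.
sample_spines() {
    printf 'P1\t^\t^\tleap\tleap\n'    # line 1: similar motion up, both leap
    printf 'P1\tv\tv\tleap\tleap\n'    # line 2: similar motion down, both leap
    printf 'P1\t^\tv\tleap\tleap\n'    # line 3: contrary motion
    printf 'P1\t^\t^\t+M2\tleap\n'     # line 4: one voice moves by step
    printf 'P5\t^\t^\tleap\tleap\n'    # line 5: not an octave or unison
}

sample_spines | exposed_octaves
```

    Only the first two records satisfy all three conditions, so the
    sketch reports lines 1 and 2.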

   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

:::::::
REPRISE
:::::::

Finally, let's bring together all of the elements of our voice-leading
tutorial into a single package.  Below is a shell program (dubbed
`LEADER') that identifies violations of nine classic rules of
voice-leading for a two-part input.  A number of refinements
have been added to the program -- including input file checking, and
formatting of the output.

The program is invoked as follows:

    leader <file>

The input is assumed to contain two voices, each in a separate **kern
spine.  The nominally lower voice should be in the first spine.  For
music containing more than two voices, the Humdrum EXTRACT command
should be used to select successive pairs of voices for processing
by LEADER.

                                   \|||/
                                   (o o)
                     -----------o00-(_)-00o-----------
                    |                                 |
                    |            SOFTWARE             |
                    |                                 |
                     ---------------------------------


-------------------------------- clip here ---------------------------------
# LEADER
#
# A korn-shell program to check for voice-leading infractions;  invoked as:
#
#   leader <filename>
#
# where <filename> is assumed to contain two voices, each in a separate
# **kern spine, where the nominally lower voice is in the first spine.

# Before processing, ensure that a proper input file has been specified.
# (Test for a missing argument first, then for a missing file.)
if [ $# -eq 0 ]
then   echo "leader: input file not specified"
       exit 1
fi
if [ ! -f "$1" ]
then   echo "leader: file $1 not found"
       exit 1
fi

# 1. Record the ranges for the two voices.
echo 'Range for Upper voice:'
extract -f 2 $1 | census -k | egrep 'Highest|Lowest' | sed 's/^/      /'
echo 'Range for Lower voice:'
extract -f 1 $1 | census -k | egrep 'Highest|Lowest' | sed 's/^/      /'

# 2. Check for augmented or diminished melodic intervals.
extract -f 1 $1 | mint -b r | sed '/\[[Ad][Ad]*\]/d' | egrep -n '^[^!*].*[Ad][^1]' |\
    sed 's/:/ (/;s/$/)/;s/^/Augmented or diminished melodic interval at line: /'
extract -f 2 $1 | mint -b r | sed '/\[[Ad][Ad]*\]/d' | egrep -n '^[^!*].*[Ad][^1]' |\
    sed 's/:/ (/;s/$/)/;s/^/Augmented or diminished melodic interval at line: /'

# 3. Check for consecutive fifths and octaves.
echo 'P5'  > $TMPDIR/template;  echo 'P5'  >> $TMPDIR/template
hint -c $1 | patt -f $TMPDIR/template -s = | \
    sed 's/ of file.*/./;s/.*Pattern/Consecutive fifth/'
echo 'P1'  > $TMPDIR/template;  echo 'P1'  >> $TMPDIR/template
hint -c $1 | patt -f $TMPDIR/template -s = | \
    sed 's/ of file.*/./;s/.*Pattern/Consecutive octave/'

# 4. Check for doubling of the leading-tone.
deg -x $1 | extract -i '**deg' | fill -s = | sed 's/^=.*/=/' | \
    egrep -n '^7.*7|^[^!*].*7.*7' | egrep -v '7[-+]' | \
    sed 's/:.*/./;s/^/Leading-tone doubled at line: /'

# 5. Check for unisons.
semits -x $1 | fill -s = | \
    awk '{if($0~/[^0-9\t-]/)next}{if($1==$2) print "Unison at line: " NR}'

# 6. Check for the crossing of parts.
semits -x $1 | fill -s = | sed 's/^=.*/=/' | \
    awk '{if($0~/[^0-9\t-]/)next}{if($1>$2) print "Crossed parts at line: " NR}'

# 7. Check for more than an octave between the two parts.
semits -x $1 | fill -s = | awk '{if($0~/[^0-9\t-]/)next} \
    {if($2-$1>12) print "More than an octave between parts at line: " NR}'

# 8. Check for overlapping parts.
extract -f 2 $1 | sed 's/^=.*/./' | context -n 2 -p 1 -d XXX | \
    rid -GL | humsed 's/XXX.*//' > $TMPDIR/upper
extract -f 1 $1 | sed 's/^=.*/./' > $TMPDIR/lower
assemble $TMPDIR/lower $TMPDIR/upper | semits -x | fill | \
    awk '{if($0~/[^0-9\t-]/)next}{if($1>$2) print "Parts overlap at line: " NR}'
extract -f 1 $1 | sed 's/^=.*/./' | context -n 2 -p 1 -d XXX | \
    rid -GL | humsed 's/XXX.*//' > $TMPDIR/lower
extract -f 2 $1 | sed 's/^=.*/./' > $TMPDIR/upper
assemble $TMPDIR/lower $TMPDIR/upper | semits -x | fill | \
    awk '{if($0~/[^0-9\t-]/)next}{if($1>$2) print "Parts overlap at line: " NR}'

# 9. Check for exposed octaves.
hint -c $1 > $TMPDIR/s1
extract -f 1 $1 | deg -x > $TMPDIR/s2
extract -f 2 $1 | deg -x > $TMPDIR/s3
extract -f 1 $1 | mint | humsed 's/.*[3-9].*/leap/' > $TMPDIR/s4
extract -f 2 $1 | mint | humsed 's/.*[3-9].*/leap/' > $TMPDIR/s5
assemble $TMPDIR/s1 $TMPDIR/s2 $TMPDIR/s3 $TMPDIR/s4 $TMPDIR/s5 > $TMPDIR/temp
egrep -n 'P1.*\^.*\^.*leap.*leap|P1.*v.*v.*leap.*leap' $TMPDIR/temp | \
    sed 's/:.*/./;s/^/Exposed octave at line: /'

# Clean-up some temporary files.
rm $TMPDIR/template $TMPDIR/upper $TMPDIR/lower $TMPDIR/s[1-5] $TMPDIR/temp
-------------------------------- clip here ---------------------------------

[End of Tutorial]

   . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


::::::::::::::::::::::::
HUMDRUM REFERENCE MANUAL    Printing without Postscript
::::::::::::::::::::::::

Not all users have access to a postscript printer, and this
can cause problems printing the 552-page Humdrum Reference Manual.

Gregory Lewin writes that a good solution is to make use of
GHOSTSCRIPT -- a freely available program that will allow postscript
files to be printed on non-postscript printers.

Ghostscript reads .ps files and converts the information to a
form suitable to drive either a video display or a printer.
Lewin reports that he was able to print out the Humdrum Manual
on his Canon BJ10ex (an inexpensive bubble jet printer), and
also printed a few test pages on an old Epson 9-pin dot-matrix
printer with quite acceptable results.

GHOSTSCRIPT is available on several platforms, but will probably
be of most use to MS-DOS users who do not have access to a
postscript printer.

The DOS version of GHOSTSCRIPT is available via ftp from many sites.
One source is:

   ftp.cica.indiana.edu (129.79.20.84)
   in the directory pub/pc/win3/util/.

(If you don't know how to use FTP, refer to Issue #1 of HUMDRUM
NEWS.)

Lewin continues with the following description:

   "Two files are needed from the ftp site:
    
         gsXXXfnt.zip  contains font files
         gsXXXexe.zip  contains the executable binaries.
    
    ("XXX" represents a version number. In my case 260 and 261.)
    
    Copy the files in gsXXXexe.zip to a directory c:\gs and those
    from gsXXXfnt.zip into c:\gs\fonts. (Unfortunately there is a
    bug that prevents the program from working when installed on
    disks other than the c: drive.)  Also, the DOS "TEMP" environ-
    ment variable must point to a valid temporary directory.
    
    GsXXXexe.zip contains two DOS versions plus one for Windows.
    Plain gs.exe is a DOS version for XT and 286 machines.  This
    version does not seem to be able to grab enough memory to print
    the Humdrum files.  Gs386.exe is the same program with support
    for extended memory. I was able to use this to print the manual
    without any problems. Gs386 requires a 386/486/+ machine.
    
    Decide which of the built-in drivers is the best for your
    printer.  (You can find this out by typing gs386 -h for a
    list.)  Then invoke the program using
    
           'gs386 -sDEVICE=<device driver> <file.ps>'.
    
    If your printer has a page feeder add the parameter -dNOPAUSE
    before <file.ps>.
    
    If there is no suitable driver for your printer, the source
    files with information for writing drivers are available in
    gsXXXsrc.zip. However, the built-in drivers cover a very wide
    range of printers so it is unlikely that this will be necessary.
    
    An easier solution might be to use the Windows version (gswin)
    along with another program ghostview (available from the same
    place -- gsviewXX.zip).  This combination includes some nice
    extras (like printing pages out of sequence).  Unfortunately,
    it objects to most of the Humdrum manual files unless the data
    for the initial extra page is cut out using a text editor.
    Once this is done, it will print on any graphics capable printer
    set up for Windows."
    
Lewin reports that his only problem was running out of ink half-way
through the especially big Humdrum "manual5.ps" file.
    
   "The gs386.exe program can only print the pages sequentially so
    if the process is halted it has to be restarted from the beginning
    (no joke after 200 pages)."
    
So be sure you have enough paper and ink!
    
[End of HUMDRUM NEWS]