. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
H U M D R U M N E W S
Issue No. 3 1996 February 29
A Newsletter for Music Researchers Using the Humdrum Toolkit.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
HUMDRUM NEWS is a newsletter intended to facilitate communication with
music researchers who are either using the Humdrum Toolkit, or are
contemplating using the Humdrum Toolkit.
Highlights from this issue of HUMDRUM NEWS:
* A new database of 6,000+ folksong melodies available.
* A 2-week Humdrum seminar to be held at Stanford University.
* An advanced tutorial on repertoire selection and searching.
This issue of HUMDRUM NEWS conveys several announcements and a
single Humdrum tutorial. The tutorial illustrates how to generate
repertoire-lists of works that conform to complex selection criteria.
For example, the tutorial shows users how to identify all works in
a database that are rondos in compound meters and are in minor keys.
Your comments and questions are welcome. Mail to
David Huron
dhuron@watserv1.uwaterloo.ca
Music Department
University of Waterloo,
Waterloo, Ontario N2L 3G6
Canada
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
\|||/
(o o)
-----------o00-(_)-00o-----------
| |
| HUMDRUM ANNOUNCEMENTS |
| |
---------------------------------
::::::::::::::
MUSICAL SCORES The Essen Folksong Collection
::::::::::::::
The "Essen Folksong Collection" is a computer database of some 6,255
folksong transcriptions. This collection is a Humdrum version of
the "ESAC" database which was assembled at the Gesamthochschule of
Essen University under the direction of Dr. Helmut Schaffrath. The
original ESAC database was developed from 1982 until Dr. Schaffrath's
sad death in 1994.
The Essen Folksong Collection encodes complete folk melodies from
a large number of traditional and scholarly sources. Most of the
materials are folksongs from Germany, Austria and Switzerland, but
the collection includes several hundred melodies from other regions
of the world -- mostly from Europe. Of special note is a collection
of 213 German Kinderlieder (children's songs).
The encoded data includes pitch, duration, rests, barlines, keys
and key signatures, meter signatures, as well as title, source,
geographical region of origin, and genre or style designations.
The "Essen Folksong Collection in the Kern Format" is distributed
on four DOS-format disks and includes a 34-page reference, tutorial,
and installation guide. In order to install the database, you must
have access to the UNIX "uncompress" command.
The Collection is available for $25.00. For ordering information
write, phone, or e-mail:
Center for Computer Assisted Research
in the Humanities
525 Middlefield Road, Suite 120
Menlo Park, California 94025-3443
1-800-JSB-MUSE (toll-free order & information line)
ccarh@netcom.com
The continued development of the ESAC database is currently directed
by Dr. Ewa Dahlig at the Helmut Schaffrath Laboratory of Computer
Aided Research in Musicology, in Warsaw, Poland (eda@plearn.edu.pl).
Net proceeds after costs for sales of the Essen Folksong Collection
are donated to the Schaffrath Laboratory.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
::::::::::::::
HUMDRUM COURSE Offered at Stanford University in August 1996
::::::::::::::
Following last summer's successful 2-week Humdrum seminar at
McGill University in Montreal, a second seminar will be offered
at Stanford University from August 19th to August 30th, 1996.
The Stanford course provides a comprehensive introduction to
computer-assisted music research using the Humdrum Toolkit.
Participants will learn to manipulate computer-based scores,
tablatures, and other documents in order to solve a wide variety
of musicological problems. By way of example, participants will
learn how to characterize common patterns of orchestration in
Beethoven symphonies, examine harmonic progressions in Bach
chorale harmonizations, and investigate text/melody relationships
in Gregorian chant. Dozens of sample problems will be discussed,
including problems in ethnomusicology.
Participants will have on-line access to thousands of full scores
for processing -- including repertoires from innumerable cultures,
periods, and genres. The course will be of particular value to
scholars contemplating graduate-level or advanced music research
projects.
All software and documentation from the workshop (including
a sizeable database of musical scores) are free to take.
The software is available for UNIX, DOS, OS/2 and Windows-95
operating systems. (Some technical restrictions apply.)
Some working knowledge of UNIX is an asset. Familiarity with
the `emacs' or `vi' text editors is recommended.
The course fee is $800 and does not include accommodation.
The Center for Computer Assisted Research in the Humanities is
offering modest financial assistance for needy students. Contact
dhuron@watserv1.uwaterloo.ca for details.
For further information concerning the Stanford Humdrum Workshop,
e-mail Alex Igoudin:
aledin@ccrma.Stanford.edu
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
::::::::
BERKELEY CCARH Open House
::::::::
Following the Society for Music Perception and Cognition Conference
at the University of California at Berkeley last summer, the Center
for Computer Assisted Research in the Humanities held a special
one-day open house. The open house featured tours of the facilities,
demonstrations by the staff, and displays of recent electronic editions
and published music releases from CCARH. Thanks to all those who
participated in this event.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
\|||/
(o o)
-----------o00-(_)-00o-----------
| |
| TUTORIAL |
| |
---------------------------------
::::::::
TUTORIAL Assembling a Musical Repertoire according to Specified Criteria
::::::::
In initiating a research project, music scholars often begin by
selecting a suitable repertoire for study. Typically, the scholar
will focus on a particular composer, period, form, or style. For
example, a researcher might focus on 17th century canons, or on
Beethoven's compositions prior to Opus 1, or on French piano works
composed during the 1870s. Depending on the research task, the
scholar may wish to locate works that conform to highly complex
criteria -- such as solo Baroque flute works written in compound
meters, or slow Russian symphonic movements that don't include any
wind instruments and are written in major keys.
Of course, all of these selection criteria assume the existence of
a large database of encoded music from which a suitable repertoire
can be selected. As the number of scores in the Humdrum format
approaches the 10,000 mark, automated methods for locating specific
repertoire become increasingly important.
In this tutorial, we discuss how to search entire file-systems
for Humdrum scores that conform to user-specified criteria. In
most cases, one or two commands are all that is necessary to assemble
a repertoire list of works conforming to complex selection criteria.
The preeminent "leitmotif" in this tutorial is the UNIX "find"
command. The "find" command traverses through a file hierarchy,
and finds all files that match certain conditions. The "find"
command takes the following syntax:
find <PATH> <OPTIONS> <ACTIONS>
The "PATH" is a directory from which the search commences. All files
in the specified directory are examined including all files in the
subdirectories, sub-subdirectories, and so on.
For example, the path
/
means the root directory containing all files on the system. Even
single-user systems are apt to have several thousand files subsumed
under the root directory.
The path
/scores
means all files under the "scores" directory, whereas
/scores/bach
means all files under the "scores/bach" directory. The period
character:
.
tells "find" to commence searching from the current directory.
Since "find" searches all files under the given path, its operation
may be quite slow if there are thousands of files to search. It's
wise to restrict the search by choosing a reasonable starting point for
the search. For example, specifying the path "/scores/bach/chorales"
may save a great deal of time compared with the path "/scores". Although
we won't discuss them in this tutorial, the "find" command provides
a number of options that help to restrict the depth of searches or
otherwise "prune" the search. When first trying "find", it is a
good idea to limit searches to small parts of the file system.
When searching through the specified PATH, "find" is able to carry
out a wide variety of possible tests on each file. One simple test
is whether the file name matches a given pattern.
Consider, for example, the goal of identifying all files representing
pitch-class (**pc) information. The Humdrum convention is to
identify these files by adding the ".pc" extension to the filename
-- such as "opus24.pc".
The following "find" command will traverse through the /scores
directory and sub-directories searching for files that contain
the ".pc" extension:
find /scores -name '*.pc'
The above command uses the "-name" option followed by a filename
pattern. The pattern uses shell-style wildcards rather than full
regular expressions, and it is quoted so that the shell passes it to
"find" unexpanded. This command is unusual in that it has no
explicit ACTION. In such cases, the implied action is to print
the name of every file whose name matches the pattern.
Note that a pattern may be a literal string and so may be used to
locate a specific file. For example, if you are looking for
a file named "findme", you might use the command:
find /scores -name findme
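The effect of "-name" can be tried safely on a small scratch
hierarchy before searching a real database. The following is a
minimal sketch only: the directory and file names are hypothetical,
and it assumes that the "mktemp" command is available.

```shell
# Build a small scratch hierarchy (hypothetical names) to search.
scratch=$(mktemp -d)
mkdir -p "$scratch/scores/bach" "$scratch/scores/corelli"
touch "$scratch/scores/bach/opus24.pc" \
      "$scratch/scores/bach/opus24.krn" \
      "$scratch/scores/corelli/opus1n5c.pc"

# Quote the pattern so that "find", not the shell, expands the wildcard.
find "$scratch/scores" -name '*.pc' | sort

# Count the matches ("tr" strips any padding that "wc" may add).
pc_count=$(find "$scratch/scores" -name '*.pc' | wc -l | tr -d ' ')
echo "$pc_count file(s) with the .pc extension"

rm -r "$scratch"
```

Only the two ".pc" files are listed; the ".krn" file is ignored.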
An example of an explicit ACTION might be to delete files conforming
to a particular criterion. For example, the following command
searches the /scores path for files that contain the ".tmp"
extension. Any matching file is then deleted using the UNIX "rm"
command:
find /scores -name '*.tmp' -exec rm "{}" ";"
This command illustrates a number of features of the "find" command.
The "-exec" identifies the ACTION as that of executing a command.
The arguments between "-exec" and the semi-colon are treated as the
command to be executed. Each time a file is found with the .tmp
extension, the -exec action is executed. The paired curly braces {}
have a special significance to "find." The braces are replaced by
the path of each file that matches. For example, if the first file
found is /scores/a.tmp, the braces will be replaced by the string
"/scores/a.tmp". The quotation marks around the braces and
around the semi-colon prevent the UNIX shell from interpreting
these characters itself before the command line is passed to
"find".
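Because "-exec rm" deletes files without asking, it can be worth
previewing the commands first. The sketch below is an illustration
rather than part of the procedure described above: it places "echo"
in front of "rm" as a dry run, then performs the real deletion. The
scratch directory and file names are hypothetical, and "mktemp" is
assumed to be available.

```shell
# Scratch hierarchy with one temporary file and one score (hypothetical).
scratch=$(mktemp -d)
mkdir -p "$scratch/scores/bach"
touch "$scratch/scores/bach/a.tmp" "$scratch/scores/bach/chor217.krn"

# Dry run: print the command that would be executed for each match.
find "$scratch/scores" -name '*.tmp' -exec echo rm "{}" ";"

# The real deletion.
find "$scratch/scores" -name '*.tmp' -exec rm "{}" ";"

# Count the regular files that are left.
remaining=$(find "$scratch/scores" -type f | wc -l | tr -d ' ')
echo "$remaining file(s) remain"

rm -r "$scratch"
```

The dry run shows the full path that replaces the braces, which makes
it easy to confirm that only the intended files will be removed.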
Note that the -name option applies to every kind of file, including
directories. To restrict the deletion to regular files, add the
-type option:
find /scores -type f -name '*.tmp' -exec rm "{}" ";"
The -type option can be used to match specific types -- such as
regular files (f), directories (d), network files (n), and so on.
By way of example, the following command deletes all DIRECTORIES
whose names have the ".tmp" extension.
find /scores -type d -name '*.tmp' -exec rmdir "{}" ";"
:::::::::::::::::
CONTENT SEARCHING
:::::::::::::::::
For most music research applications, we are interested in identifying
files on the basis of their contents. That is, we'd like to know
what's inside the file before we take any action.
The "grep" command is especially useful in determining whether certain
items of information are present in a file. For example, the following
command identifies all files in the path /scores, that contain
passages in 7/8 meter:
find /scores -type f -exec grep -l '\*M7/8' "{}" ";"
Note that the "-l" option for "grep" causes the output to consist only
of names of files that contain the sought regular expression.
The *IC tandem interpretation is used to encode "instrument classes"
such as strings, voice, percussion, etc. The following command
searches all files in the path /scores, and prints a list of files
that encode scores containing one or more woodwind instruments:
find /scores -type f -exec grep -l '\*ICww' "{}" ";"
The following command identifies all files in the path /scores,
that contain passages in the key of C major:
find /scores -type f -exec grep -l '\*C:' "{}" ";"
The following command identifies all files in the path /scores,
that contain passages in any minor key:
find /scores -type f -exec grep -l '\*[a-g][#-]*:' "{}" ";"
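The two key patterns can be checked against a pair of tiny test
files. This is a sketch under stated assumptions: the file names
and contents are hypothetical single-spine fragments, "mktemp" is
assumed to be available, and the locale is set to "C" so that the
range [a-g] matches lower-case letters only.

```shell
# Use the C locale so that [a-g] cannot match upper-case letters.
LC_ALL=C; export LC_ALL

scratch=$(mktemp -d)
# One fragment in C major, one in F-sharp minor (hypothetical contents).
printf '%s\n' '**kern' '*M4/4' '*C:' '4c' '*-' > "$scratch/major.krn"
printf '%s\n' '**kern' '*M3/4' '*f#:' '4f#' '*-' > "$scratch/minor.krn"

# Files containing a C major key interpretation.
c_major=$(find "$scratch" -type f -exec grep -l '\*C:' "{}" ";")
# Files containing any minor key interpretation (lower-case tonic).
minor=$(find "$scratch" -type f -exec grep -l '\*[a-g][#-]*:' "{}" ";")

echo "C major: $(basename "$c_major")"
echo "minor:   $(basename "$minor")"

rm -r "$scratch"
```

The minor-key pattern relies on the convention that a lower-case
tonic letter denotes a minor key, so case sensitivity matters here.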
Humdrum "reference records" are ideal targets for such searches.
Reference records are formatted global comments (beginning "!!!") that
encode general (reference-related) information concerning a file.
Reference records encode such information as the composer's name,
composer's dates, title of work, date of composition, movement number,
instrumentation, meter classification, and up to 70 other types of
basic information. (For further information refer to the Humdrum
Reference Manual pp. 26-37).
For example, the following command identifies all files in the
path /scores, that are composed by Sweelinck:
find /scores -type f -exec grep -l '!!!COM.*Sweelinck' "{}" ";"
The following command identifies all files in the path /scores,
that are written in compound meters:
find /scores -type f -exec grep -l '!!!AMT.*compound' "{}" ";"
The following command identifies all files in the path /scores,
that are rondos:
find /scores -exec grep -il '!!!AFR.*rondo' "{}" ";"
Note that the "-i" option for "grep" makes the pattern-match
insensitive to upper- or lower-case.
The following command identifies all files in the path /scores,
that have been designated as having heterophonic textures:
find /scores -exec grep -il '!!!AST.*heterophony' "{}" ";"
The following command identifies all files in the path /scores/jazz,
that contain the style-designation "bebop":
find /scores/jazz -exec grep -il '!!!AST.*bebop' "{}" ";"
The following command identifies all files in the path /scores,
that include French horns and bassoons:
find /scores -exec grep -il '!!!AIN.*cor.*fagot' "{}" ";"
Of course, more complex regular expressions can also be defined.
For example, the following command identifies all works composed
between 1805 and 1809:
find /scores -exec grep -l '!!!ODT.*180[5-9]' "{}" ";"
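There is no restriction on the complexity of the regular expression.
The following command identifies all works composed between 1812
and 1848. Alternation requires extended regular expressions, so
"grep -E" (or "egrep") is used, and the alternatives are enclosed
in a single group following the century digits:
find /scores -type f -exec grep -El '!!!ODT.*18(1[2-9]|[23][0-9]|4[0-8])' "{}" ";"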
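A date-range expression is easy to get subtly wrong, so it helps to
test it on a few sample records before searching a whole database.
The following sketch assumes a "grep" that accepts the "-E" option
for extended regular expressions; in that syntax the three
alternatives share one group after the century digits. The sample
records are hypothetical.

```shell
# Extended regular expression for the years 1812 through 1848.
pattern='!!!ODT.*18(1[2-9]|[23][0-9]|4[0-8])'

# Feed boundary cases to grep: two years just outside the range,
# two on the boundaries, and one in the middle.
matched=$(printf '%s\n' '!!!ODT: 1811' '!!!ODT: 1812' '!!!ODT: 1830' \
                        '!!!ODT: 1848' '!!!ODT: 1849' | grep -E "$pattern")
echo "$matched"
```

Only the records for 1812, 1830 and 1848 are printed; 1811 and 1849
fall outside the range and are rejected.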
Like most other UNIX commands, "find" is particularly powerful when
linked to other commands via one or more "pipes." For example, the
following command identifies all files in the path /corelli that
contain a change of meter signature:
find /corelli -type f | xargs grep -c '^\*M[0-9]' | grep -v ':[01]$'
The output specifies each filename followed by a colon, followed by
the number of meter signatures in the corresponding file. For example,
in the following output, the third movement from Opus 1, No. 5 by
Corelli is identified as containing 6 meter signatures at different
points in the score:
/corelli/opus1n5c.krn:6
/corelli/opus1n9a.krn:3
/corelli/opus1n9b.krn:2
/corelli/opus1n9d.krn:2
Similarly, the following command identifies all works that contain
a change of key signature:
find /scores -type f | xargs grep -c '^\*k\[' | grep -v ':[01]$'
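The counting approach can be verified on two small test files, one
with a change of meter and one without. This is a minimal sketch:
the file names and contents are hypothetical, "mktemp" is assumed to
be available, and the filenames are assumed to contain no spaces
(which "xargs" would otherwise split).

```shell
scratch=$(mktemp -d)
# One fragment that changes meter, one that does not (hypothetical).
printf '%s\n' '**kern' '*M3/4' '4c' '*M4/4' '4d' '*-' > "$scratch/changes.krn"
printf '%s\n' '**kern' '*M3/4' '4c' '4d' '*-' > "$scratch/steady.krn"

# "grep -c" prints "filename:count" for each file; the final "grep -v"
# discards files whose count is 0 or 1, leaving only those with a change.
changing=$(find "$scratch" -type f | xargs grep -c '^\*M[0-9]' |
           grep -v ':[01]$')
echo "$changing"

rm -r "$scratch"
```

Only "changes.krn" is reported, with a count of 2. Note that
"grep -c" counts matching lines, so in a multi-spine file a meter
signature repeated across the spines of one record is counted once.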
:::::::::::::::::::::::::
REPERTOIRES AS FILE LINKS
:::::::::::::::::::::::::
Once you have located score-files that meet your selection criteria,
it is often helpful to have them accessible in one location. On DOS
systems you might copy the files in a special directory. However,
UNIX systems make it possible to create "links" to files in other
directories without having to make duplicate copies of already
existing files.
Suppose you wanted to make a directory of all scores containing
vocal parts. The following command uses the UNIX file-redirection
feature (">") to create a file ("vocalfiles") listing all files
in the path /scores that contain one or more vocal parts:
find /scores -exec grep -l '!!!AIN.*vox' "{}" ";" > vocalfiles
The contents of "vocalfiles" may look like the following:
/scores/bach/cantatas/cant140.krn
/scores/bach/chorales/chor217.krn
/scores/bach/chorales/midi/chor368.hmd
etc.
Next, we make the appropriate new directory using the "mkdir" command.
mkdir vocal
Next, edit the file containing the list of filenames as follows.
Insert "ln -s" prior to each filename, and append the directory
name "vocal" at the end of each line.
ln -s /scores/bach/cantatas/cant140.krn vocal
ln -s /scores/bach/chorales/chor217.krn vocal
ln -s /scores/bach/chorales/midi/chor368.hmd vocal
etc.
(The "-s" option is used to create a "symbolic" link.)
Next, we make this file executable, and execute it:
chmod +x vocalfiles
./vocalfiles
We now have a new directory whose files contain scores with vocal parts.
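As an alternative to editing the list by hand, the links can be
created by a short shell loop that reads one filename per line. The
sketch below is one possible approach, not the procedure described
above; the scratch hierarchy and its contents are hypothetical, and
"mktemp" is assumed to be available.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/scores/bach/cantatas" "$scratch/scores/bach/chorales"
# Hypothetical reference records: one vocal score, one non-vocal score.
printf '%s\n' '!!!AIN: vox' '**kern' '*-' \
    > "$scratch/scores/bach/cantatas/cant140.krn"
printf '%s\n' '!!!AIN: organ' '**kern' '*-' \
    > "$scratch/scores/bach/chorales/chor001.krn"

# List the scores with vocal parts, then make the repertoire directory.
find "$scratch/scores" -type f -exec grep -l '!!!AIN.*vox' "{}" ";" \
    > "$scratch/vocalfiles"
mkdir "$scratch/vocal"

# Create one symbolic link per listed file.
while IFS= read -r score
do
    ln -s "$score" "$scratch/vocal"
done < "$scratch/vocalfiles"

linked=$(ls "$scratch/vocal")
echo "$linked"

rm -r "$scratch"
```

Because "find" is given an absolute starting path, the links point to
absolute locations and remain valid wherever the "vocal" directory
is placed.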
Before we end, let's consider two more complicated examples.
Recall that it is possible to use the UNIX "xargs" command to link
several successive searches. This allows us to search for scores
that conform to multiple selection criteria. For example, consider
the case where we are interested in selecting all works that were
written in the 1930s for woodwind quintet. We need to carry out two
searches (the order is unimportant). In the following command, we
traverse through all of the subdirectories in the path /scores using
the "find" command; then we identify those works written in the 1930s
(using "grep"), and finally we identify the subset that contains the
appropriate instrumentation (using another "grep").
find /scores -type f -exec grep -l '!!!ODT:.*193[0-9]' "{}" ";" | \
xargs grep -l '!!!AIN.*clar, cor, fagot, flt, oboe'
The result is a list of all available scores that were written in the
1930s for woodwind quintet. Once again, we could capture the output
in a file, edit this file to make soft links, and execute the links
to create a suitable repertoire directory.
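The chaining of two searches can be tried on three small test files,
only one of which satisfies both criteria. This is a sketch under
stated assumptions: the file names, the reference-record contents
and the simplified instrumentation pattern are all hypothetical,
"mktemp" is assumed to be available, and the filenames are assumed
to contain no spaces.

```shell
scratch=$(mktemp -d)
# Right decade and right instruments.
printf '%s\n' '!!!ODT: 1936' '!!!AIN: cor fagot flt' > "$scratch/quintet.krn"
# Right decade, wrong instruments.
printf '%s\n' '!!!ODT: 1936' '!!!AIN: violn' > "$scratch/solo.krn"
# Right instruments, wrong decade.
printf '%s\n' '!!!ODT: 1801' '!!!AIN: cor fagot flt' > "$scratch/early.krn"

# The first grep lists files from the 1930s; "xargs" hands that list to
# the second grep, which keeps only those with the wanted instruments.
both=$(find "$scratch" -type f -exec grep -l '!!!ODT.*193[0-9]' "{}" ";" |
       xargs grep -l '!!!AIN.*cor.*fagot')
echo "$both"

rm -r "$scratch"
```

Only "quintet.krn" survives both filters. Further criteria can be
added by appending more "| xargs grep -l" stages.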
As our final example, consider the following question: Are German
drinking songs more likely to be in triple meter? The encoded works
in the Essen Folksong Collection contain genre-related tags (encoded
using the Humdrum "AGN" reference record). One of the genres
distinguished is "Trinklied" (drinking songs).
In order to answer our question, we need to search the file system
for all works that have the "Trinklied" designation, and then generate
an inventory of meter classifications (available in "AMT" records).
find /scores -type f -exec grep -l '!!!AGN.*Trinklied' "{}" ";" | \
xargs cat | grep '!!!AMT' | sort | uniq -c
The output:
1 !!!AMT: compound duple
4 !!!AMT: irregular
14 !!!AMT: simple quadruple
5 !!!AMT: simple triple
There are just 24 drinking songs in the Essen Collection and only
five are in triple meters. This result turns out to be no different
than the distribution of triple meters in general for German folksongs.
In other words, it is not the case that German drinking songs are
more likely to be in triple meters.
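The inventory pipeline can be rehearsed on a handful of test files.
The following is a minimal sketch: the file names and the genre
"Wiegenlied" used for the non-matching file are hypothetical,
"mktemp" is assumed to be available, and the filenames are assumed
to contain no spaces.

```shell
scratch=$(mktemp -d)
# Three drinking songs and one song of another genre (hypothetical).
printf '%s\n' '!!!AGN: Trinklied' '!!!AMT: simple triple' > "$scratch/a.krn"
printf '%s\n' '!!!AGN: Trinklied' '!!!AMT: simple quadruple' > "$scratch/b.krn"
printf '%s\n' '!!!AGN: Trinklied' '!!!AMT: simple quadruple' > "$scratch/c.krn"
printf '%s\n' '!!!AGN: Wiegenlied' '!!!AMT: simple triple' > "$scratch/d.krn"

# Select the drinking songs, concatenate them, extract the meter
# classification records, and count how often each one occurs.
inventory=$(find "$scratch" -type f -exec grep -l '!!!AGN.*Trinklied' "{}" ";" |
            xargs cat | grep '!!!AMT' | sort | uniq -c)
echo "$inventory"

rm -r "$scratch"
```

The inventory counts two drinking songs in simple quadruple meter
and one in simple triple; the fourth file is excluded because it
carries a different genre designation.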
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
:::::::
REPRISE
:::::::
The UNIX "find" command provides an excellent way to traverse through
an entire file-system looking for files that conform to specific
criteria. In musicological tasks, the "find" command is especially
well suited to assembling a repertoire of scores that exhibit some
characteristic(s) of interest. Multiple selection criteria can be
accommodated by using one or more "pipes" in connection with the
"grep" command.
For convenience, it is often helpful to create a new directory
that holds all of the works selected for a study repertoire. On UNIX
systems, soft "links" can be created, so that there is no need to
make multiple copies of the same score. This means that several
concurrent directory structures can be created without duplicating
files. For example, a given score may be accessed in one directory
structure via composer, in another directory via instrumentation,
in a third directory via genre, and so on.
[End of HUMDRUM NEWS]