Thesaurus Musicarum Latinarum (TML)(1)

Thomas J. Mathiesen
Center for the History of Music Theory and Literature
School of Music
Indiana University
Bloomington, IN 47405 USA


The literary tradition of music theory, though less well known to the general scholar than other literary traditions, is nevertheless a treasury of extraordinary value. A number of the most important figures of late antiquity and the early Middle Ages--figures such as Augustine, Boethius, Cassiodorus, and Isidore of Seville--wrote substantial works on music, and from at least as early as the ninth century, an unbroken stream of monks, clerics, philosophers, and musicians contributed to the tradition until Latin gradually began to fade in the seventeenth century as the language of scholarship. This material is important for the information it provides on technical problems of great interest to musicologists--problems such as musical notation, tuning, the interpretation of mode, musica ficta, and countless other subjects that directly affect the way in which early music is transcribed, performed, and recorded.

The tradition of Latin music theory has an importance beyond this practical level. Music was not viewed by the writers of this period as an exclusively aesthetic phenomenon but rather as an essential part of their social, religious, and intellectual life. The liturgy, of course, was inextricably wedded to music by tradition and scriptural injunction, but music was also the companion discipline to arithmetic, geometry, and astronomy in the medieval quadrivium. Building on the musical paradigms developed by figures such as Martianus Capella, Macrobius, and Calcidius, medieval cosmologists drew the writings of contemporary music theorists into a more complex view of the universe. Later theories of infinity, place, time, and void were shared by mathematicians and music theorists, while medieval medicine and psychology borrowed and applied musical theories of consonance, proportion, and especially rhythm. In a number of cases, the same medieval scholar wrote on all these disciplines, and at the University of Padua, for example, the medical and arts faculties were closely allied.

Music theory therefore forms a major piece of the cultural and intellectual history of the first millennium and a half of our era. This history cannot be fully understood without a comprehension of the larger paradigmatic and philosophical role played by music in the Middle Ages and the Renaissance. Likewise, the music theory itself cannot be fully comprehended without understanding the extent to which it borrows vocabulary, logical systems, patterns of analogy, and even topics for discussion from the other related disciplines and sciences, especially logic, mathematics, geometry, astrology, and philosophy.

The Thesaurus Musicarum Latinarum (TML) was developed to facilitate the study of this important literary tradition. The TML is an evolving database that already includes more than 4.8 million words of text supported by more than 4,500 graphics. The TML aims to contain the entire corpus of Latin music theory written from the third century to the end of the sixteenth--and eventually perhaps the treatises of the seventeenth and eighteenth centuries as well. Complementing but not in any way duplicating the Lexicon musicum latinum (LmL), the TML makes it possible to locate and display in a matter of minutes every occurrence of a particular term, a phrase or passage, or a group of terms in close proximity in all the texts included in the database, both in published editions and in manuscript sources, thereby greatly facilitating the study of terminology, the identification of parallel passages or unattributed quotations, and the preparation of new critical editions. This in turn will assist scholars in developing a more comprehensive view of Latin music theory and--on a broader level--the intellectual history of the Middle Ages.

Nature and Significance of the Project

In the seventeenth and eighteenth centuries, scholars such as Marcus Meibom, Martin Gerbert, and G. B. Martini began to apply the new methods of classical philology to the problem of gathering, editing, and publishing significant monuments of Greek and Latin music theory. While the humanist scholars of the Renaissance had concentrated especially on Greek music theory, these younger scholars were interested in both traditions. In 1784, Martin Gerbert published the three volumes of his Scriptores ecclesiastici de musica sacra potissimum, and the work of Gerbert was continued in the nineteenth century by Edmond de Coussemaker, who published between 1864 and 1876 the four volumes of his Scriptorum de musica medii aevi nova series a Gerbertina altera. By modern standards of textual scholarship, the editions are inadequate, but for their day, these were collections of great authority and importance. They made available for the first time a large number of important treatises and greatly facilitated a more comprehensive study of Latin music theory.

Interest in the history of music theory continued and developed in the twentieth century, especially in the form of improving and enlarging the number of critical texts, translations, and commentaries. Series such as the Corpus scriptorum de musica (CSM), Divitiae musicae artis, Music Theory Translation Series, and Greek and Latin Music Theory (GLMT), to name only four, have considerably expanded the body of material available to scholars for study.

If musicologists, lexicographers, and historians of music theory have long recognized the importance of Latin music theory, which provides so much information on technical, aesthetic, and conceptual matters, they have also recognized the difficulty of studying this literary tradition. The difficulty is compounded by the fact that many treatises have never been edited; some, on the other hand, have been edited several times, and the reader must choose among competing editions. In addition, there are the problems of locating parallel concepts, establishing the meaning--or the changing meaning--of technical terms, identifying topoi, and so on. Until recently, scholars dealing with this material have had to rely largely on their memories and the laborious process of manually comparing dozens of editions and manuscripts in their attempts to draw out a historical picture or resolve a theoretical point bearing on the interpretation of a piece of music. Thus, a truly comprehensive knowledge of more than a thousand years of Latin music theory has simply not been possible.

In the spring of 1989, a small group of scholars active in textual criticism, codicology, editing early music, cataloguing manuscripts, and the general history of music theory began casual conversations about the possibility of forming a database that would eventually contain the entire corpus of Latin music theory--printed and manuscript--written during the Middle Ages and the early part of the Renaissance. A larger representative group was invited to convene for discussion of the project at the annual meeting of the American Musicological Society in Austin, Texas on 26 October 1989. A general commitment to see the project to its conclusion was made and preliminary editorial and technical committees were established, with Thomas J. Mathiesen (Indiana University) as Project Director. Indiana University provided substantial funding to establish the principal TML Center, and the Department of Music at Princeton University hosted a subsequent and extended planning conference, 17-21 January 1990. At this time, the project was officially established and designated as the Thesaurus Musicarum Latinarum. Both meetings included musicologists, specialists in computer applications, and librarians. By the end of the January conference, unanimous agreement emerged on matters of coverage, organization, access, medium, file structure, and similar technical decisions, and a general plan of work was adopted. Participants in the conferences and other interested scholars remained in close contact by mail, regular telephone conversations, and a TML distribution list on Bitnet and Internet. From the first, the TML has been a consortium project involving universities from all regions of the United States.

In November of 1990, the TML began public distribution of its database, which initially consisted of only a few texts, general instructions for accessing the database, and basic applications for decoding files and viewing graphic material. In ten years, the TML has grown to include all of the texts of both the Coussemaker and Gerbert Scriptores; all the Latin texts from the series Greek and Latin Music Theory, Corpus scriptorum de musica, Divitiae musicae artis, the Colorado College Music Press, and a number of other series; all the treatises of Gaffurius, Burtius, Glarean, Ramus, and the like; texts derived from various manuscripts; all the musical treatises in the Patrologia Latina; and so on (published texts under copyright are used by permission of their copyright holders). Work is currently concentrating on manuscript material, a few earlier texts in the public domain, and remaining texts published after 1930. The TML is intended to include eventually every printed and manuscript source so that scholars will be able not only to retrieve published material but also readings that appear in the source material itself.

With this large data set, the TML makes it possible for scholars to locate and display in a matter of minutes on their personal computers--whether Windows, Macintosh, or other machines--every occurrence of a particular term, a phrase or passage, or a group of terms in close proximity in more than 760 separate text files. The database was designed so that scholars would be able to use it quickly and easily in a large number of ways and with little, if any, investment in new computer hardware or software.
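The idea of finding "a group of terms in close proximity" can be illustrated with a minimal Python sketch. It is not the TML's search engine: the function name proximity_hits, the window parameter, and the short sample string are assumptions introduced only for illustration, and the sketch assumes plain ASCII text of the kind the data files contain.

```python
import re

def proximity_hits(text, term_a, term_b, window=5):
    """Return (index_a, index_b) word-position pairs where the two terms
    occur within `window` words of each other (hypothetical helper)."""
    # Split the text into lowercase alphabetic words.
    words = re.findall(r"[a-z]+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == term_a]
    pos_b = [i for i, w in enumerate(words) if w == term_b]
    return [(i, j) for i in pos_a for j in pos_b if abs(i - j) <= window]

# Sample text used only to exercise the function.
sample = ("Musica est scientia bene modulandi. "
          "Tonus est regula quae de omni cantu in fine diiudicat.")

near = proximity_hits(sample, "musica", "modulandi", window=5)
far = proximity_hits(sample, "musica", "cantu", window=5)
```

Here `near` holds one pair because the two words stand four positions apart, while `far` is empty because the terms are more than five words apart. A real search would run the same comparison over each data file in turn and report the surrounding context.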

After the desired material has been located, any number of further actions are possible: a detailed report may be printed, showing the number of "finds" and their location (in as large or small a context as the scholar may specify); nested searches may be initiated to narrow the focus; passages located may be readily imported into a word processing document; the entire text of the treatise containing the passage may be printed; and so on. If the user's machine has the capability of exhibiting graphic material, musical notation, figures, and other sorts of illustrations that appear within the treatises may also be displayed. The database has been structured so that it can be tailored to each user's interests; any part of it will run separately (text without graphics, graphics without text, or text and graphics together) and all or any part of it may be downloaded to the individual user's machine or viewed on the World Wide Web.

The TML facilitates the study of terminology, the identification of parallel passages or unattributed quotations, and the preparation of new critical editions. Until the advent of the TML, all these tasks required an enormous amount of time and relied on a certain amount of serendipity. As an example, the study of terminology, crucial for a clear understanding of the meaning of texts, had been limited to specialized word-lists, which represent only a limited view of selected sources and in any case are not comprehensive. Now, it is possible to develop quite complete records of the usage of individual terms ranging from Augustine's De musica, written at the end of the fourth century, to Gaspar Stoquerus's De musica verbali, written at the end of the sixteenth century. Likewise, the identification of parallel passages in Latin music theory required that the scholar manually search dozens of published editions, most of them without any index verborum. Now, a major portion of this material can be searched in just a few minutes. The task was further complicated by the fact that there was no comprehensive index of editions for this field such as is provided by the Thesaurus Linguae Graecae's Canon of Greek Authors (3d ed. by Luci Berkowitz and Karl A. Squitier [New York: Oxford University Press, 1990]). Now, the TML's Canon of Data Files (ed. by Thomas J. Mathiesen, Publications of the Center for the History of Music Theory and Literature, vol. 1 [Lincoln: University of Nebraska Press, 1999]. ISBN: 0-8032-8233-8. US $45) provides complete bibliographic information for all the texts available online, thereby improving bibliographic control of the field. Finally, the collation of manuscripts in the preparation of a critical edition requires, first of all, that they be located and then compared word-for-word one with another. As the TML locates manuscripts and enters them into its database, a portion of the task of collation can be automated.

In the long run, the TML will assist scholars in developing a comprehensive view of Latin music theory, both as a sub-discipline and as a part of the larger medieval world view. The TML is regularly used by scholars throughout the world. Since the TML began keeping statistics in 1995, it has delivered in excess of 158,000 files through the TML LISTSERV, the TML Gopher, and the TML-FTP; more than 45,000 connections have been made to the TML home page of the web site; and more than 21,500 online searches have been conducted.

Organization of the Project

The principal TML office, which forms a part of the Center for the History of Music Theory and Literature (CHMTL) at the School of Music at Indiana University, is joined by funded centers at Princeton University, the University of Nebraska-Lincoln, the University of Colorado-Boulder, Louisiana State University, Ohio State University, and the Moscow Conservatory. In 1992 and 1994, the TML received major grants from the National Endowment for the Humanities in support of work on the project through 1996. Work continues to the present day thanks to grants from the Office of Research and the University Graduate School and from the School of Music at Indiana University.

The TML is supervised by two committees, coordinated by the Project Director. The Project Committee includes the core of those involved in the initial planning sessions; it oversees all aspects of the database and is responsible for determining any future modifications in the system of delivery, structure of the data files, and editorial documents. The Editorial Advisory Committee, which is intended to include as many interested and qualified scholars as possible, shares responsibility with the Project Committee for the final review of each text, as described below. In addition, members of both committees are involved in entering the data from at least some manuscript sources and supervising TML graduate assistants at their institutions. Members of the current committees represent a wide range of experience in the history of music theory, textual criticism and editorial technique, and computer applications in the humanities. The joint committees meet once a year at the annual convention of the AMS but also maintain regular contact throughout the year by telephone and electronic and conventional mail.

Project Methodology

The Project Committee agreed at the outset on three fundamentals; these have proven to be highly successful and have been maintained to the present time. First, the TML should be a database of sources, not simply single versions of individual works. Second, the TML should enable scholars to locate and retrieve the text of the source, just as it stands and without editorial intrusions. For example, an author attribution appearing in the Coussemaker Scriptores that modern scholarship might consider erroneous would still be retained in the data file (although annotations clarifying attributions and providing other information about the data file are included in the TML Canon of Data Files). The only exceptions to the rule of representing the text as accurately as possible are explained in the "Principles of Orthography" and the "Table of Codes for Noteshapes." The "Principles of Orthography" provide the very minimal standardization required in any type of data entry, but with a single minor exception, there is nothing in the standardization that alters the individual words of the text in any way. The "Table of Codes for Noteshapes" provides alphanumeric equivalents that represent the precise appearance of musical notation; this allows the notation itself to be included in intelligent searching, while in no way foreclosing a scholar's individual interpretation of the meaning of the notation. Third, the TML should contain every printed edition, even if an earlier edition might seem to have been supplanted by a more modern one. And, because many Latin music treatises have not been published in critical texts, the TML should eventually contain every manuscript source. Scholars who wish to exclude manuscripts or earlier editions from their database searches are easily able to do so; on the other hand, scholars who might wish to compare sources for any number of good reasons are also able to configure the database for that purpose.

These principles led the TML to the following methodology and general plan of work. As a first priority, published editions were placed in the database as ASCII text, according to the "Principles of Orthography" approved by the Project Committee. The database contains only the edition itself; critical apparatus, editorial notes, translations, introductions, and so on are not included. The second priority was the conversion of manuscript sources into ASCII text, once again following the guidelines on orthography. This is a relatively extended task relying on the special paleographic skills represented on the TML committees, and most of this work is being done by committee members. As the manuscripts are entered, they are also catalogued, if necessary, according to the general design for manuscript descriptions employed by the Répertoire International des Sources Musicales. To accomplish this part of the task, a comprehensive union list of microfilms and a collection of microfilms of all Latin music theory codices are being assembled.

The TML encourages scholars who may have transcribed treatises but never published their work to submit them for inclusion in the database, and this has happened in a number of important cases. In addition, the TML makes it possible for individuals working at separate institutions to collaborate on data entry from manuscripts for any given text, each scholar entering the readings of a single manuscript. As the results are added to the data set, the resources will grow for new critical texts that might never have been undertaken by a single scholar, simply because of the vastness of the task of collation. The TML also makes it possible for scholars to evaluate the text critical work of earlier editions because the database will eventually provide in a single place the raw data of the earlier edition as well as the edition itself.

Unlike other types of texts commonly studied by scholars in fields such as classics, literature, or philosophy, Latin music theory often includes abundant figures and musical notation for which no ASCII equivalents exist. This material cannot simply be omitted. Musical notation that can be precisely entered as codes has been encoded in the ASCII text file, while full musical examples or figures are scanned, saved in GIF format, and flagged to locations within the text files themselves. If the example includes text, this is given in the ASCII file within brackets (e.g., [Berkeley, 86; text: E-la, D-la-sol, C-sol-fa, B-fa-B-mi, A-la-mi-re, G-sol-re-ut, F-fa-ut, E-la-mi, D-la-sol-re, C-sol-fa-ut, D-sol-re, C-fa-ut, B-mi, A-re, Gamma-ut, 5 superacute, 7 acute, 8 graves, Declaracio manus secundum usum]). It is important to note that the codes and flags enable most of the graphic material to be subject to the same sort of intelligent searching provided for the text itself. When the user looks for a word or text string, the search engine locates and displays this material within the graphics flags as well as within the treatise proper. Likewise, by employing the proper code for a notational symbol, the user can instruct the search engine to locate occurrences of a noteshape. By combining codes with words or text strings in the search, the user can simultaneously discover graphic, notational, and textual references within the data set. Thus, users with graphics capabilities on their machines are able to both search and display text as well as musical notation and figures. Although users without graphics capabilities cannot display graphics, they are still able to search all parts of the database.
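To show how text inside a graphics flag can be reached by an ordinary search, here is a minimal Python sketch that parses bracketed flags of the shape seen in the example above. The pattern is inferred from that single example and is an assumption, not the TML's documented syntax; the names FLAG, parse_flags, and flags_containing are hypothetical, and the sample is a shortened version of the Berkeley flag.

```python
import re

# Assumed shape, inferred from one example: [source, page; text: label, label, ...]
FLAG = re.compile(r"\[([^\[\];]+?),\s*(\S+?);\s*text:\s*([^\]]*)\]")

def parse_flags(text):
    """Return one dict per bracketed graphics flag found in `text`."""
    flags = []
    for m in FLAG.finditer(text):
        labels = [s.strip() for s in m.group(3).split(",") if s.strip()]
        flags.append({"source": m.group(1).strip(),
                      "page": m.group(2),
                      "labels": labels})
    return flags

def flags_containing(text, term):
    """Return the flags whose label text mentions `term` (case-insensitive)."""
    term = term.lower()
    return [f for f in parse_flags(text)
            if any(term in label.lower() for label in f["labels"])]

sample = ("... sicut patet in figura "
          "[Berkeley, 86; text: E-la, D-la-sol, Gamma-ut, 5 superacute] ...")

flags = parse_flags(sample)
hits = flags_containing(sample, "gamma-ut")
```

Because the labels are split on commas, a caption that itself contains a comma would be divided into several labels; a production parser would need the actual flag conventions given in the TML Introduction.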

All material--whether scanned or keyed, previously published or manuscript--is compared at least twice with the original source before being placed in the database: the first review is typically done by the member of the TML Project Committee supervising the graduate assistants working at his TML Center or by a member of the Editorial Advisory Committee; the file is then sent to the Indiana University TML Center, where it is checked once again and the necessary corrections are entered. After these final corrections are entered, the data file is locked, reviewed by the Project Director, and entered into the database. Each file begins with the name of the database, followed by lines indicating the names of the persons entering, checking, and approving the data, and a bibliographic header. Data files are entered into the database only upon approval by a member of the TML committees and the Project Director.

The TML Canon of Data Files provides for each file the author's name (or traditional cognomen, such as Anonymous XII); the title of the treatise; the incipit; the source used for data entry; the names of the persons responsible for the data entry, data checking, and final approval; the file name in the TML; file size; and annotations. It also includes a full description of various means for accessing and using the TML, the "Principles of Orthography" and "Table of Codes for Noteshapes," and a concordance to the contents of major series and their location in the TML.

A particular strength of the TML is its open-ended character. It is not intended to do just certain defined tasks in only a certain way. Rather, the TML provides a mechanism for bringing together the work of scholars of widely diverse interests, assuring that it adheres to the TML's editorial guidelines, and then distributing it for the widest variety of scholarly uses.


After extended discussion of various systems of delivery and current media, it was decided in 1990 that the TML should be accessible in a variety of ways and able to run in all environments: Macintosh, Windows (or, in those days, MS-DOS), and mainframe. The methods of access have grown over the past ten years, but the TML has maintained multiple points of access to accommodate needs in all parts of the world. The raw data files of the TML can be retrieved free of charge by e-mail (through LISTSERV), FTP through a TML-FTP site, and the World Wide Web through a TML Web site. For those who do not or cannot use online resources, the entire database (together with CANTUS) is also available on a CD-ROM for a minimal charge to cover the cost of manufacturing and shipping the disc. The data files are organized in the TML in subdirectories for each century (or, for the earlier treatises, groups of centuries). Full instructions are provided in the TML Introduction, which is published in the Canon of Data Files and is also available as a file on the various TML sites.

The graphics files, by their nature, are somewhat more complex. Each graphic is prepared in GIF format and anchored to the appropriate spot within the text; these may be viewed online through the TML Web site. The versions distributed through the TML LISTSERV and TML-FTP are UUencoded to ensure that they are not corrupted in the process of being transferred from the TML site to a user's individual machine; once decoded, they may of course be viewed directly on the individual's computer.
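The reason UUencoding protects a binary file in transit is that it turns arbitrary bytes into printable ASCII lines that can be decoded back without loss. The following minimal Python sketch shows that round trip using the standard binascii module; the function names and the sample bytes are assumptions for illustration, and the sketch handles only the encoded body lines, not the "begin" and "end" lines that frame a complete uuencoded file.

```python
import binascii

def uu_encode_body(data: bytes) -> list[bytes]:
    """Encode bytes as uuencoded body lines, 45 source bytes per line."""
    return [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]

def uu_decode_body(lines: list[bytes]) -> bytes:
    """Decode uuencoded body lines back into the original bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)

# Stand-in for a graphic: a GIF signature followed by every possible byte value.
sample = b"GIF89a" + bytes(range(256))

encoded = uu_encode_body(sample)
decoded = uu_decode_body(encoded)
```

Every byte in `encoded` is a 7-bit ASCII character, so the lines survive transfer by e-mail, and `decoded` is byte-for-byte identical to `sample`.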

Final Product

Because the TML is an evolving and open-ended database, it has no projected end. Use of the TML has been growing dramatically since 1990 and continues to increase every month. Unlike many research and reference tools, the TML will never exist in a fixed and limited form. Indeed, part of the utility of the TML resides in the imagination of those who use and contribute to it. At a minimum, however, the TML has already realized two of its initial goals: (1) facilitating the study of terminology, the identification of parallel passages or unattributed quotations, and the preparation of new critical editions; and (2) assisting scholars in developing a more comprehensive view of Latin music theory and the intellectual history of the Middle Ages. In short, work on and with the TML will continue for as long as scholars remain interested in its material.

(1) This article appeared first in Le Médiéviste et l'ordinateur, no. 39 (Winter 2000), 11-17.

This file was posted on 5 May 2001.