Enhancing access to the Levy Sheet Music Collection: reconstructing full-text lyrics from syllables

The goal of the Lester S. Levy Sheet Music Collection, Phase Two project is to develop tools, processes, and systems that facilitate collection ingestion through automated processes that reduce, but do not necessarily eliminate, human intervention [1]. One of the major components of this project is an optical music recognition (OMR) system [2] that extracts musical information and lyric text from the page images that make up each piece in a collection. Lyrics embedded in music notation are often written in syllabicated form, as they are in the Levy Collection, so that each syllable lines up with the note or notes to which it corresponds. Searching the syllabicated form of words, however, would be counterintuitive and cumbersome for end users. This paper describes the evolution of a tool that uses a simple algorithm to rebuild complete words from lyric syllables and, in ambiguous cases, provides feedback to the collection builder. The tool will be integrated into the workflow of the Levy Sheet Music Collection, but it is broadly applicable to any project ingesting musical scores with lyrics.
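The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way such a tool could work, not the authors' implementation. It assumes that the OMR output is a list of syllable tokens, that a trailing hyphen marks a syllable continuing into the next one, and that a word list is available for checking results. The function name `rebuild_lyrics` and the `dictionary` parameter are hypothetical.

```python
def rebuild_lyrics(syllables, dictionary):
    """Join hyphen-terminated syllables into words.

    Assumptions (not from the paper): a trailing "-" means the word
    continues, and `dictionary` is a set of known lowercase words.
    Returns (words, flagged), where `flagged` lists rebuilt words that
    a collection builder may want to review.
    """
    words, flagged = [], []
    parts = []  # syllables of the word currently being assembled
    for token in syllables:
        token = token.strip()
        if not token:
            continue
        if token.endswith("-"):
            # Word continues: keep the syllable without its hyphen.
            parts.append(token[:-1])
            continue
        # No trailing hyphen: this syllable closes the current word.
        parts.append(token)
        word = "".join(parts)
        parts = []
        words.append(word)
        # Compare without case or surrounding punctuation.
        if word.lower().strip(".,;:!?\"'") not in dictionary:
            flagged.append(word)
    if parts:
        # Input ended mid-word; emit the fragment and flag it.
        word = "".join(parts)
        words.append(word)
        flagged.append(word)
    return words, flagged


if __name__ == "__main__":
    sample = ["Beau-", "ti-", "ful", "dream-", "er,", "wake", "un-", "to", "me"]
    known = {"beautiful", "dreamer", "wake", "unto", "me"}
    print(rebuild_lyrics(sample, known))
```

In this sketch, the flagged list stands in for the feedback given to the collection builder in ambiguous cases, such as a hyphen the OMR step missed or a rebuilt word that the word list does not contain.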
Brian Wingenroth, Mark Patton, and Tim DiLauro. 2002. Enhancing access to the Levy Sheet Music Collection: reconstructing full-text lyrics from syllables. In Proceedings of the 2nd ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '02). ACM, New York, NY, USA, 308-309. DOI: https://doi.org/10.1145/544220.544293