For years I have been interested in Taiwanese “cantillation” (yínsòng 吟誦, the melodic elaboration of syllable-tones) of Classical Chinese literature. I have a great many recordings of it and have always wanted a way to study them quantitatively. Last month I decided my programming skills should be up to the task.
At first I tried converting the audio to a MIDI representation and then processing the result. My efforts are documented in a set of slides here. Features of the performance itself, such as tremolo and reverberation, produced a very messy MIDI transcription, and my attempts to clean it up automatically were unsatisfactory.
Eventually I transcribed it by hand. Naturally I learned a lot about the composition in the process.
For transcription I used MuseScore, which can export a score to MusicXML format. Suddenly, in the XML, I had the quantitative representation I needed. Yesterday I put together some functions and unit tests to convert the MusicXML (which represents the melody as notes, with lyrics as child elements of those notes) into a list of the lyrics, with the melody as an annotation on each syllable. Now I am able to begin programmatic analysis of the compositions.
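To give a rough idea of what such a conversion can look like, here is a minimal sketch in Python using only the standard library. It is not my actual code: the function names, the `SAMPLE` fragment, and its lyrics are invented for illustration. It assumes an uncompressed, un-namespaced MusicXML file, one lyric line, and that a pitched note without a lyric continues the previous syllable; rests are simply skipped.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written MusicXML fragment (not taken from a real score).
SAMPLE = """<score-partwise>
  <part id="P1">
    <measure number="1">
      <note><pitch><step>C</step><octave>4</octave></pitch><duration>2</duration>
        <lyric><syllabic>single</syllabic><text>床</text></lyric></note>
      <note><pitch><step>D</step><octave>4</octave></pitch><duration>1</duration></note>
      <note><rest/><duration>1</duration></note>
      <note><pitch><step>F</step><alter>1</alter><octave>4</octave></pitch><duration>4</duration>
        <lyric><syllabic>single</syllabic><text>前</text></lyric></note>
    </measure>
  </part>
</score-partwise>"""


def pitch_name(note):
    """Return a name such as 'C4' or 'F#4', or None if the note is a rest."""
    pitch = note.find("pitch")
    if pitch is None:
        return None
    step = pitch.findtext("step")
    alter = int(float(pitch.findtext("alter", default="0")))
    octave = pitch.findtext("octave")
    accidental = "#" * alter if alter > 0 else "b" * -alter
    return f"{step}{accidental}{octave}"


def syllables_with_melody(xml_text):
    """Return [(syllable, [(pitch, duration), ...]), ...] in score order."""
    root = ET.fromstring(xml_text)
    result = []
    for note in root.iter("note"):
        name = pitch_name(note)
        if name is None:
            continue  # skip rests
        # Duration is in the score's own "divisions" units; grace notes have none.
        duration = float(note.findtext("duration", default="0"))
        text = note.findtext("lyric/text")
        if text is not None:
            # A note carrying a lyric starts a new syllable.
            result.append((text, [(name, duration)]))
        elif result:
            # Assumption: an unlyricked note extends the previous syllable.
            result[-1][1].append((name, duration))
    return result


if __name__ == "__main__":
    for syllable, melody in syllables_with_melody(SAMPLE):
        print(syllable, melody)
```

Keeping each syllable paired with its own list of notes means later analysis (for example, comparing melodic contour against syllable-tone) can work syllable by syllable without going back to the XML.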
More detail is available here in my slides from yesterday's presentation at Hacker School.