I have posted a short presentation here about efforts I made last week to produce an automatic transcription of the solo human singing voice. Most of the work was done over four days, beginning a week ago at Hacker School, New York.
The recording is of Taiwanese “cantillation” (yínsòng 吟誦): a classical poem is read in “reading pronunciation” (different from spoken language) and the tones of the words are elaborated artistically into a melody.
The poem in question is the “Ballad of the Lute” (pípá xíng 琵琶行) of Bái Jūyì 白居易 (772–846), sung by Ms. Hsü I-t‘ing 許禕娗 in a recording from c. 2009. The melody is that taught by Mr. Hong Zenan 洪澤南.
I failed in what I was trying to do but learned a lot by making the attempt. I also discovered that even though the output of WaoN (a wave-to-notes transcriber) isn't very good as transcription, the things that are wrong with it make for very interesting music on their own.
I'd be grateful to hear about corrections to my transcription, which is still in a tentative state.