DDMAL

I have been looking for the best way to create audio files from our collection of MEI files for the manuscripts that we have automatically transcribed over the last couple of months. With these audio versions, users of our browser will be able not only to search the manuscripts using different queries, but also to listen to the pieces.
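As a rough illustration of what any MEI-to-MIDI conversion has to do with pitch, here is a minimal Python sketch. It assumes the notes are encoded as `<note>` elements with `pname` and `oct` attributes (and optionally `accid`) in the MEI namespace, and it uses the common convention that C4 is MIDI note 60. The function names are my own, and this is not how any particular library does it; durations and lyrics are ignored.

```python
import xml.etree.ElementTree as ET

# Assumption: the files use the standard MEI namespace.
MEI_NS = "{http://www.music-encoding.org/ns/mei}"

# Semitone offset of each pitch name above C.
PITCH_CLASS = {"c": 0, "d": 2, "e": 4, "f": 5, "g": 7, "a": 9, "b": 11}
# Written accidentals (the accid attribute) as semitone shifts.
ACCIDENTAL = {"s": 1, "f": -1, "n": 0, "ss": 2, "ff": -2}


def mei_note_to_midi(pname, octave, accid=None):
    """Return the MIDI note number, using the convention that C4 is 60."""
    shift = ACCIDENTAL.get(accid, 0)
    return 12 * (int(octave) + 1) + PITCH_CLASS[pname.lower()] + shift


def midi_pitches(mei_text):
    """Collect MIDI note numbers for every <note>, in document order."""
    root = ET.fromstring(mei_text)
    pitches = []
    for note in root.iter(MEI_NS + "note"):
        pitches.append(
            mei_note_to_midi(note.get("pname"), note.get("oct"), note.get("accid"))
        )
    return pitches
```

Calling `midi_pitches` on the text of an MEI file gives a flat list of note numbers, which is enough to check a transcription by ear once it is fed to any MIDI writer.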

The libmei library has a method for creating MIDI files from the original MEI files. Once we have the MIDI files, different singing-synthesis technologies can be used to create the audio versions. These are the ones I have looked at:

1. Vocaloid www.vocaloid.com

Vocaloid is a singing synthesizer based on pre-recorded phonemes. It works as a stand-alone application or within a DAW as a VSTi or by using ReWire. However, it runs only on Windows machines. There is no way of scripting the application, so the only way to create audio versions for our database of several thousand pieces would be to script the DAW.

2. SynthFont www.synthfont.com

There is no audio/MIDI DAW with batch-processing capabilities (WaveLab and Sound Forge allow batch processing, but they only work with audio). However, SynthFont, a small Windows program for editing and playing MIDI files using various sources such as SoundFonts, GigaSampler files and VST instruments, can render audio from MIDI files in batch.

Although SynthFont is pretty simple in terms of features, it does what it says, so we could bounce our collection with it. However, its VST host is very old and does not send NRPNs, which Vocaloid requires in order to work.

3. UTAU utau-synth.com

UTAU is another singing synthesizer, à la Vocaloid, but free and open-source. It runs on Windows as well as OSX machines. At the moment, the characters are mostly Japanese, but there are some bilingual UTAUloids.
UTAU does not work as a VST or with ReWire, so the only option is to use it as a stand-alone application, making

4. AquesTone http://www.a-quest.com/products/aquestone.html

AquesTone is a freeware vocal synthesizer that runs on Windows as a VST instrument. It also uses the Japanese phoneme set, but this time the lyrics file must be written in Hiragana.

5. Sinsy www.sinsy.jp

This library for singing voice synthesis, developed at the Nagoya Institute of Technology and the Tokyo Institute of Technology, is based on Hidden Markov Models. You provide the system with a MusicXML file with the lyrics, pitches and durations for each note, and it returns a .wav audio file. This process could be run on demand on the server side, so there is no need to have all the audio files pre-rendered.

This is a heavily-processed, multi-voice Sinsy example:

An unprocessed, single-voice Sinsy example:

Although the idea of rendering the files on-demand is interesting, the resulting voice does not sound realistic.
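To give an idea of the kind of input Sinsy works from, here is a minimal Python sketch that builds a one-part, one-measure MusicXML document with a pitch, a duration and a lyric syllable for each note. The function name and the tuple layout are my own choices for illustration. I have not checked which additional elements (tempo, time signature, clef) Sinsy needs in practice, so treat this as the skeleton of the format rather than a file that is guaranteed to be accepted.

```python
import xml.etree.ElementTree as ET


def build_musicxml(notes, divisions=1):
    """Build a one-part, one-measure MusicXML score as a string.

    notes: list of (step, octave, duration, syllable) tuples, where
    duration is counted in divisions of a quarter note.
    """
    score = ET.Element("score-partwise", version="3.0")
    part_list = ET.SubElement(score, "part-list")
    score_part = ET.SubElement(part_list, "score-part", id="P1")
    ET.SubElement(score_part, "part-name").text = "Voice"

    part = ET.SubElement(score, "part", id="P1")
    measure = ET.SubElement(part, "measure", number="1")
    attributes = ET.SubElement(measure, "attributes")
    ET.SubElement(attributes, "divisions").text = str(divisions)

    for step, octave, duration, syllable in notes:
        note = ET.SubElement(measure, "note")
        pitch = ET.SubElement(note, "pitch")
        ET.SubElement(pitch, "step").text = step
        ET.SubElement(pitch, "octave").text = str(octave)
        ET.SubElement(note, "duration").text = str(duration)
        # The sung syllable travels with the note it belongs to.
        lyric = ET.SubElement(note, "lyric")
        ET.SubElement(lyric, "text").text = syllable

    return ET.tostring(score, encoding="unicode")
```

For example, `build_musicxml([("C", 4, 1, "Ky"), ("D", 4, 2, "ri")])` returns a string with two `<note>` elements, each carrying its own syllable.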

6. Sikuli http://sikuli.org/

I decided to move on to something different and looked for a scripting technique. I ended up finding Sikuli, a technology developed at MIT that uses computer vision to automate GUIs from screenshots. Sikuli provides an API that can be used from Jython, so developing scripts and batch processing files is pretty easy. It runs on OSX/Windows/Linux.
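The batch side of such a script could look roughly like the sketch below. It is plain Python with no Sikuli calls: the GUI automation for a single file is assumed to live in a function of your own, here the hypothetical `render_one(midi_path, wav_path)`, which would wrap the screenshot-driven steps. The directory layout and the function names are also assumptions. Files that already have a matching .wav are skipped, so an interrupted run can be restarted.

```python
import os


def plan_jobs(midi_dir, audio_dir):
    """Pair each MIDI file with its target .wav path, skipping finished ones."""
    jobs = []
    for name in sorted(os.listdir(midi_dir)):
        base, ext = os.path.splitext(name)
        if ext.lower() not in (".mid", ".midi"):
            continue  # ignore anything that is not a MIDI file
        target = os.path.join(audio_dir, base + ".wav")
        if not os.path.exists(target):
            jobs.append((os.path.join(midi_dir, name), target))
    return jobs


def run_batch(midi_dir, audio_dir, render_one):
    """Call render_one(midi_path, wav_path) for every pending job.

    render_one is a hypothetical callback that drives the synthesizer's GUI
    for a single file. Returns the number of jobs that were attempted.
    """
    jobs = plan_jobs(midi_dir, audio_dir)
    for midi_path, wav_path in jobs:
        render_one(midi_path, wav_path)
    return len(jobs)
```

Keeping the GUI steps behind one callback means the loop can be tested with a dummy function before pointing it at the real application.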

7. Symphonic Choirs www.soundsonline.com/Symphonic-Choirs

Symphonic Choirs is a 40GB library of vocal samples that runs as a NI Kontakt VST, AudioUnits, RTAS or CoreAudio instrument on OSX, and VST, ASIO and RTAS on Windows. It comes with extended ranges for standard SATB choir sections and boys' choir sections. The lyrics should be built in WordBuilder, a companion program that links the lyrics with the audio engine; however, simplified vowel patches and effects are provided for easy use (Ah-Ih-Eh, Ee-Oh-Eh, Cluster Effects, Cluster Oh, Falls, Shouts, Whispered Words). Educational price: $247.50.

Multivoice example

8. Magnus Choir magnus.syntheway.net

Magnus Choir is a Windows-based software instrument based on a combination of sampling and synthesis. It comes with a classic SATB structure, built-in reverb, and factory presets for different vowels and combinations of men and women. Price: $40.

Single voice example

Posted: November 8th, 2011
at 11:10am by admin

 

