Community to discuss LLaMA, the large language model created by Meta AI.
This is intended to be a replacement for r/LocalLLaMA on Reddit.
ChatMusician isn't exactly new and the underlying dataset isn't particularly diverse, but it's one of the few models made specifically for classical music.
Are there any others, by the way?
Maybe it would be possible to use a regular text-to-speech model and then apply something similar to autotune.
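For what it's worth, the core of that idea can be sketched in a few lines: chop the speech into segments and pitch-shift each one onto the next note of a target melody. This is a toy sketch (NumPy only, with a naive resampling shift that also changes duration, unlike a real autotune which uses a phase vocoder or PSOLA); the segment length, melody, and the sine-wave stand-in for TTS output are all assumptions:

```python
import numpy as np

def estimate_f0(y, sr):
    """Crude pitch estimate via zero-crossing rate (OK for clean, voiced audio)."""
    crossings = np.sum(np.abs(np.diff(np.signbit(y))))
    return sr * crossings / (2 * len(y))

def pitch_shift(y, semitones):
    """Naive pitch shift by resampling. Note this also changes duration
    (tape-speed effect); real autotune preserves timing."""
    ratio = 2 ** (semitones / 12)
    idx = np.arange(0, len(y), ratio)
    return np.interp(idx, np.arange(len(y)), y)

def snap_to_melody(y, sr, melody_midi, seg_len_s=0.25):
    """Autotune-style: cut the audio into fixed segments and shift each
    onto the next note of a target melody (given as MIDI note numbers)."""
    seg = int(seg_len_s * sr)
    out = []
    for i, start in enumerate(range(0, len(y), seg)):
        chunk = y[start:start + seg]
        f0 = estimate_f0(chunk, sr)
        current_midi = 69 + 12 * np.log2(f0 / 440.0)
        target = melody_midi[i % len(melody_midi)]
        out.append(pitch_shift(chunk, target - current_midi))
    return np.concatenate(out)

# Stand-in for TTS output: one second of a 220 Hz tone (A3).
sr = 22050
t = np.arange(sr) / sr
speech = 0.2 * np.sin(2 * np.pi * 220 * t)
tuned = snap_to_melody(speech, sr, melody_midi=[60, 64, 67])  # C major arpeggio
```

Real speech would of course need a proper pitch tracker and a time-preserving shifter, but the "TTS first, tune after" pipeline itself is that simple.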
The only text-to-audio model I can think of at the moment is Stable Audio Open, which AFAIK is rather underwhelming for your use case, if it can even handle anything more complex than basic sounds - and no lyrics.
It's even under SAI's new membership licensing.
I remember reading about a more recent one, but I can't find it right now, and I don't think that one could handle lyrics either.
I suppose the music industry is a lot harder to fight, so not many people want to entangle themselves with it.
Interestingly, Jukebox from OpenAI was trained on what appears to be copyrighted music, and it involved styles and renditions that explicitly referenced specific artists. It's now four years old, though, and the demo songs no longer seem to be available on SoundCloud.
There's MusicLM from Google (2023) - no lyrics. There's also AudioCraft from Meta (2023) - no lyrics either, as far as I can tell.