Add Parler (generative speech generation) based TTS (proof of concept) #274
+1,942
−66
Yet another TTS experiment, similar to #261, this time using Parler-TTS via candle. Because candle is a Rust dependency, the build and distribution stay simple. The language model still has to be downloaded, but this happens automatically on first use (at least for this proof of concept).
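To illustrate the "download on first use" idea, here is a minimal sketch using only the Rust standard library. It is not necessarily how this PR does it: `ensure_cached` and the `fetch` closure are hypothetical names, and the actual download step is left to the caller.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: makes sure a model file exists at `path`.
/// Calls `fetch` only when the file is missing, so the (slow) download
/// happens on first use and is skipped afterwards.
/// Returns Ok(true) if the file was fetched, Ok(false) if it was already cached.
pub fn ensure_cached<F>(path: &Path, fetch: F) -> io::Result<bool>
where
    F: FnOnce() -> io::Result<Vec<u8>>,
{
    if path.exists() {
        return Ok(false);
    }
    // Create the cache directory if it does not exist yet.
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    let bytes = fetch()?;
    fs::write(path, bytes)?;
    Ok(true)
}
```

A caller would pass a closure that performs the real download; on later runs the closure is never invoked because the file is already present.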
However, this is not a general-purpose TTS: the model is huge and very slow on the CPU. It only really makes sense with a dedicated GPU so that it can run via CUDA, i.e. it is nothing for most servers. The large model needs about 10 GB of VRAM; the smaller model requires about 4 GB.
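The VRAM figures above suggest a simple selection rule. The following is only a sketch under that assumption: `ModelSize` and `pick_model` are hypothetical names, the thresholds are the approximate numbers quoted above, and how the free VRAM is measured is left out.

```rust
/// Hypothetical enum for the two model variants mentioned above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelSize {
    Small,
    Large,
}

/// Picks the largest model that fits into the given amount of free VRAM (in GB).
/// Returns None when even the small model would not fit.
pub fn pick_model(free_vram_gb: u32) -> Option<ModelSize> {
    // Approximate requirements taken from the description above.
    const LARGE_MIN_GB: u32 = 10;
    const SMALL_MIN_GB: u32 = 4;

    if free_vram_gb >= LARGE_MIN_GB {
        Some(ModelSize::Large)
    } else if free_vram_gb >= SMALL_MIN_GB {
        Some(ModelSize::Small)
    } else {
        None
    }
}
```

With these thresholds, a 24 GB card would get the large model, an 8 GB card the small one, and a 2 GB card none at all.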
Some additional stats: on a 4090, loading the large model onto the GPU takes about 4s (only done once per mission), and generating the speech for the sample below took 3s.
Since this is generative speech generation, it doesn't have fixed voices; instead it takes an additional prompt in which you describe the speaker. It still has some named speakers, which should help with consistency between different prompts.
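As a rough illustration of what such a speaker prompt could look like in code, here is a small sketch that assembles a description string from an optional speaker name and a list of traits. `speaker_description` is a hypothetical helper, the sentence template is an assumption, and "Alex" in the usage note is a placeholder rather than a name the model is known to recognize.

```rust
/// Hypothetical helper: builds a free-text speaker description.
/// `name` is an optional named speaker; `traits` are short phrases
/// describing how the voice should sound.
pub fn speaker_description(name: Option<&str>, traits: &[&str]) -> String {
    let subject = match name {
        Some(n) => format!("{n}'s voice"),
        None => "The speaker's voice".to_string(),
    };
    if traits.is_empty() {
        // Assumed fallback when no traits are given.
        format!("{subject} is clear.")
    } else {
        format!("{subject} is {}.", traits.join(", "))
    }
}
```

For example, `speaker_description(Some("Alex"), &["calm", "slightly fast"])` yields `Alex's voice is calm, slightly fast.`; reusing the same name across prompts is what should help keep the voice consistent.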
Here is a sample.zip using the following speaker prompt:
I am still exploring the different offline TTS options before I actually propose bringing one of them into DCS-gRPC.