-
I was looking through the wiki and the various README files, and none of them seem to explain how to actually do text-to-audio voice cloning with this webui. Is there a way to do this?
-
Text-to-audio voice cloning can be done with Bark on the text-to-speech tab: choose the clone option in the voice select, then optionally name and upload your audio file. Your next generation will then use the voice of the target speaker. If the result doesn't sound enough like your speaker, you can usually still improve it with RVC afterwards, but that requires a model of the target speaker.
-
Now I'm running into a different issue. Could there be a memory leak? When cloning, it loads the HuBERT model and tokenizer into memory just fine, but the moment it starts extracting semantics it fills up 20 GB of system memory (everything left of my 32 GB once Bark is loaded), causing the Linux kernel to kill tons of processes to prevent a crash, including the webui.
-
How big is your audio file? You're supposed to use at most 15 seconds.
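If your recording is longer than that, one option is to cut it down before uploading. Here's a minimal sketch using only Python's standard `wave` module. It assumes an uncompressed PCM WAV file; the `trim_wav` helper and the file names are made up for illustration and are not part of the webui.

```python
import wave

MAX_SECONDS = 15  # the limit mentioned in this thread


def trim_wav(src_path, dst_path, max_seconds=MAX_SECONDS):
    """Copy at most max_seconds of audio from src_path to dst_path.

    Hypothetical helper, not part of the webui. Only handles
    uncompressed PCM WAV files. Returns the duration written, in seconds.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Keep the whole file if it is already short enough.
        n_keep = min(src.getnframes(), int(max_seconds * rate))
        frames = src.readframes(n_keep)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # The frame count in the header is corrected when the file is closed.
        dst.writeframes(frames)

    return n_keep / rate


# Example (hypothetical file names):
# trim_wav("speaker_full.wav", "speaker_clip.wav")
```

This just keeps the first 15 seconds, so pick a recording that starts with clean speech, or cut the clip in an audio editor instead.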
-
Oh, I was not aware of that. I was training it with 28 minutes of audio lmao; I assumed it would train similarly to how RVC does. Thanks for all your help with this!