Python script to download and process data for training a speech-synthesis model of the Vietnamese M.C. Nguyễn Ngọc Ngạn.
It downloads and processes audio to train a neural network that mimics Mr. Ngạn's voice.
For copyright reasons, this repo contains only code, no data. To train your own model, follow the instructions below to run the code and download the audio yourself.
RVC checkpoints: https://huggingface.co/doof-ferb/rvc-ngngngan
Matcha-TTS checkpoints: https://huggingface.co/doof-ferb/matcha_ngngngan
Demo: Matcha-TTS 🤗 https://huggingface.co/spaces/doof-ferb/MatchaTTS_ngngngan
need NVIDIA GPU
install ffmpeg
git clone this repo
prepare a fresh python env (venv or conda)
pip install torch torchaudio --find-links=https://download.pytorch.org/whl/torch_stable.html
optional: pip install jupyter-lab tensorboard for visualization, e.g. tensorboard --logdir <path to folder containing events.out.tfevents.*> ⇒ localhost:6006
or directly run pip install -r requirements.txt (but it may not be up to date)
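The setup requirements above can be sanity-checked before running anything heavy. The following is a minimal sketch, not part of this repo: the helper names `check_prerequisites` and `check_cuda` are hypothetical, and treating the presence of `nvidia-smi` on PATH as a sign that the NVIDIA driver is installed is an assumption.

```python
import shutil


def check_prerequisites(tools=("ffmpeg", "nvidia-smi", "git")):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    Assumption: finding nvidia-smi on PATH is used as a rough proxy
    for an installed NVIDIA driver.
    """
    return {tool: shutil.which(tool) for tool in tools}


def check_cuda():
    """Return torch.cuda.is_available(), or None if torch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.is_available()


if __name__ == "__main__":
    for tool, path in check_prerequisites().items():
        print(f"{tool}: {path or 'NOT FOUND'}")
    print(f"CUDA available to torch: {check_cuda()}")
```

Run it inside the fresh environment after installing torch. A `None` for CUDA means torch could not be imported, and `False` means torch is installed but cannot see a GPU.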
Part 1: prepare data for RVC
Part 2: example of RVC training + inference
Part 3: prepare data for text-to-speech
Part 4.1: example of VITS 2 training (abandoned because training took too long)
Part 4.2: example of Matcha-TTS training
git update-index --skip-worktree data/vits2_ngngngan_nosdp.json
git update-index --skip-worktree tensorboard/export_tensorboard_RVC.py
git update-index --skip-worktree tensorboard/export_tensorboard_MatchaTTS.py