Part 2: example of RVC training + inference

6h30min audio (1 speaker) at 48 kHz + RMVPE pitch extraction = 16.1 GiB disk space
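As a rough sanity check on that figure, here is a minimal sketch that estimates the size of the raw audio alone. It assumes 16-bit mono PCM, which these notes do not state; `raw_pcm_gib` is a hypothetical helper, not part of RVC. Under that assumption the source audio is only about 2.09 GiB, so most of the 16.1 GiB is presumably preprocessing output rather than the recordings themselves.

```python
# Estimate raw audio size, ASSUMING 16-bit mono PCM (not stated in the notes).
def raw_pcm_gib(hours, sample_rate_hz, bytes_per_sample=2, channels=1):
    """Return the uncompressed PCM size in GiB for the given duration."""
    seconds = hours * 3600
    total_bytes = seconds * sample_rate_hz * bytes_per_sample * channels
    return total_bytes / 2**30

size = raw_pcm_gib(6.5, 48_000)  # 6h30min at 48 kHz
print(f"{size:.2f} GiB")
```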

pretrained base models (Discriminator & Generator) v2

hop length only relevant to “crepe” algorithm

train feature index (independent of actual model training)

how to resume training from previous ckpt: increase number of epochs
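A small sketch of why the epoch count has to go up, assuming the trainer runs until the configured total and picks up from the last saved ckpt: if the total still equals the epoch already reached, there is nothing left to train. `epochs_remaining` is a hypothetical helper for illustration, not an RVC function.

```python
def epochs_remaining(last_ckpt_epoch, total_epochs):
    """Epochs still to run when training resumes from last_ckpt_epoch."""
    return max(0, total_epochs - last_ckpt_epoch)

# Same total as before: the resumed run has nothing left to do.
unchanged = epochs_remaining(last_ckpt_epoch=300, total_epochs=300)
# Total raised to 400: training continues for another 100 epochs.
raised = epochs_remaining(last_ckpt_epoch=300, total_epochs=400)
print(unchanged, raised)
```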

max batch size depends on VRAM: e.g. 6 if 6 GiB VRAM

during training, monitor 2 losses: G total & D total; save a ckpt every 5 epochs so you can fall back to one from before overfitting starts

train for 300 epochs but keep the ckpt from the 160th epoch, based on the loss curve
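The checkpoint schedule above can be sketched as follows: with a save every 5 epochs over 300 epochs there are 60 candidate ckpts, and the one to keep is whichever lies closest to the epoch where the loss curve looks best. Both function names are hypothetical, and the target epoch is something you read off the curve by eye, not something this code computes.

```python
def checkpoint_epochs(total_epochs, save_every):
    """Epochs at which a ckpt is written, e.g. 5, 10, ..., total_epochs."""
    return list(range(save_every, total_epochs + 1, save_every))

def nearest_checkpoint(target_epoch, ckpt_epochs):
    """Saved ckpt epoch closest to the epoch picked from the loss curve."""
    return min(ckpt_epochs, key=lambda e: abs(e - target_epoch))

ckpts = checkpoint_epochs(total_epochs=300, save_every=5)
# Suppose the curve suggests stopping around epoch 162 (illustrative value).
chosen = nearest_checkpoint(162, ckpts)
print(len(ckpts), chosen)
```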

save model to share:

inference parameters:
