[Feature] Option to choose between Pyannote and NeMo for diarization #115

Open
Arche151 opened this issue Aug 12, 2024 · 0 comments

First of all, @pluja, I want to thank you again for developing whishper (soon to be anysub)!

I check out the v4 branch basically every day because I'm so excited for anysub to be ready! :) And I can't believe that my feature request, user authentication, will actually be implemented. Thanks so much for that!

My new feature request probably comes way too late, considering how deeply WhisperX will be integrated into anysub and how much work you've put into the WhisperX API, but I want to try anyway.

I suggest adding the option to choose between Pyannote and Nvidia NeMo for diarization for two reasons:

  1. Unlike Pyannote, NeMo is truly open source, with no requirement to obtain and enter an authorization token.
  2. In my personal tests, and to my surprise, NeMo is way better than Pyannote at accurately diarizing speakers.

@MahmoudAshraf97 created whisper-diarization, which is partly based on WhisperX but uses NeMo for diarization.

I know that I am asking a lot here, but for the two reasons I stated, I would really appreciate it if you could still consider it.
