All notable changes to this project will be documented in this file.
- Support for task instruction prefixes (defined in the .env file); see the Nomic embedding model card for reference
- Nomic's new modernbert-embed-base model has been added to the .env comments as an example
- Changed sentence parsing library to sentence-parse
- Updated TransformersJS to 3.2.0
- Topic embedding generation from training data
- Interactive test runner for analyzing text
- Support for custom embedding models via ONNX
- Configurable similarity thresholds per topic
- Automatic model caching
- Verbose and minimal logging modes
- Configuration persistence
- Training data support in JSONL format
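
To illustrate the JSONL training data entry above: JSONL stores one JSON object per line. The sketch below parses such content and groups the texts by topic. The `topic` and `text` field names and both helper functions are hypothetical, chosen for illustration only; the project's actual schema and loader may differ.

```typescript
// Hypothetical record shape; the project's real field names may differ.
interface TrainingExample {
  topic: string;
  text: string;
}

// Parse JSONL content: one JSON object per non-empty line.
function parseJsonl(content: string): TrainingExample[] {
  return content
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as TrainingExample);
}

// Group example texts by topic, e.g. as input for building one embedding per topic.
function groupByTopic(examples: TrainingExample[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const { topic, text } of examples) {
    const texts = groups.get(topic) ?? [];
    texts.push(text);
    groups.set(topic, texts);
  }
  return groups;
}

// Made-up sample data, including a blank line that the parser skips.
const sample = [
  '{"topic": "weather", "text": "It will rain tomorrow."}',
  '{"topic": "sports", "text": "The match ended in a draw."}',
  "",
  '{"topic": "weather", "text": "Expect strong winds tonight."}',
].join("\n");

const grouped = groupByTopic(parseJsonl(sample));
console.log([...grouped.keys()]);
```

Skipping blank lines keeps the parser tolerant of trailing newlines, which are common in hand-edited JSONL files.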