Sahayak : Sahayak App
Sahayak, a vigilant anti-drug companion, is a chatbot dedicated to providing support and information. Its main objective is to handle sensitive conversations about drugs and alcohol by giving remedies and specialized responses. Our goal is therefore to fine-tune the chatbot on such sensitive data so that a person struggling with drug addiction can use it and ask questions freely.
Harnessing technology for a healthier tomorrow, Sahayak stands as a reliable ally in the fight against drugs, promoting a drug-free lifestyle through understanding and assistance. It uses technologies such as fine-tuning, a small language model (SLM), PEFT, LoRA, Python, Anvil, cloud services, TensorFlow, NLP, PyTorch, and Phi-2 by Microsoft.
Prepare a dataset in the required format (JSONL) with input-output pairs or any suitable format.
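As a minimal sketch of the JSONL format, the snippet below writes one JSON object per line and reads it back. The field names `input` and `output`, the file name `train.jsonl`, and the sample records are assumptions for illustration; the actual Sahayak dataset is not shown in this README.

```python
import json

# Hypothetical example records (placeholders, not the real Sahayak data).
EXAMPLES = [
    {"input": "How can I cope with alcohol cravings?",
     "output": "Placeholder answer written by the dataset authors."},
    {"input": "Where can I find support to stop using drugs?",
     "output": "Placeholder answer written by the dataset authors."},
]


def write_jsonl(records, path):
    """Write one JSON object per line (the JSONL convention)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    write_jsonl(EXAMPLES, "train.jsonl")
    print(len(read_jsonl("train.jsonl")), "records written")
```

A file in this shape can typically be loaded with the Datasets library via `load_dataset("json", data_files="train.jsonl")`.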
Use Google Colab, AWS SageMaker, or a similar platform to obtain GPU resources. Choose appropriate specifications (e.g., Python version, CUDA version) for the environment. Build and configure the environment using the provided badge.
Load the Phi-2 model using 8-bit quantization for efficient training.
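One way to load the model in 8-bit is through the `BitsAndBytesConfig` option in Hugging Face Transformers. This is a sketch rather than the project's exact code: it assumes the `microsoft/phi-2` checkpoint on the Hugging Face Hub, an installed `bitsandbytes` package, and a CUDA GPU. The helper name `load_phi2_8bit` is hypothetical.

```python
MODEL_ID = "microsoft/phi-2"  # assumed Hugging Face Hub id for Phi-2


def load_phi2_8bit(model_id=MODEL_ID):
    """Load a causal LM with 8-bit weights (requires bitsandbytes and a CUDA GPU)."""
    # Imported lazily so this file can be imported on machines without a GPU stack.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(load_in_8bit=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let Accelerate place layers on the available device(s)
    )
    return model
```

Calling `load_phi2_8bit()` downloads the checkpoint on first use, so it is best run inside the GPU environment set up earlier.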
Set up the tokenizer, considering padding, truncation, and max_length. Tokenize the dataset for training and validation.
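The tokenization step could look roughly like the sketch below. The prompt template (`Question: ... Answer: ...`), the `MAX_LENGTH` value of 512, and the function names are assumptions, not values taken from the project. Copying `input_ids` into `labels` is one common setup for causal language model fine-tuning.

```python
MAX_LENGTH = 512  # assumed value; tune it to the length of your prompts


def format_example(example):
    """Join an input-output pair into one training string (assumed template)."""
    return f"Question: {example['input']}\nAnswer: {example['output']}"


def tokenize_example(tokenizer, example, max_length=MAX_LENGTH):
    """Tokenize one pair with padding and truncation; labels mirror input_ids."""
    encoded = tokenizer(
        format_example(example),
        truncation=True,
        max_length=max_length,
        padding="max_length",
    )
    encoded["labels"] = list(encoded["input_ids"])
    return encoded


def load_tokenizer(model_id="microsoft/phi-2"):
    """Load the tokenizer and make sure a padding token is defined."""
    from transformers import AutoTokenizer  # lazy import of the heavy dependency

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    if tokenizer.pad_token is None:
        # Reuse the end-of-sequence token for padding when none is set.
        tokenizer.pad_token = tokenizer.eos_token
    return tokenizer
```

With a Datasets object, this could be applied as `dataset.map(lambda ex: tokenize_example(tokenizer, ex))` for both the training and validation splits.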
Preprocess the model for training with LoRA layers. Define the LoRA configuration for the model.
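A possible LoRA setup with the PEFT library is sketched below. All hyperparameters in `LORA_SETTINGS` are assumed starting points, not the project's values. The `target_modules` names follow the Phi implementation in recent Transformers releases and may differ for other checkpoint versions; inspecting `model.named_modules()` is a way to confirm them.

```python
# Assumed hyperparameters; the values used by Sahayak are not stated in this README.
LORA_SETTINGS = {
    "r": 16,               # rank of the low-rank update matrices
    "lora_alpha": 32,      # scaling factor applied to the update
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
    # Attention projection names (assumption; verify with model.named_modules()).
    "target_modules": ["q_proj", "k_proj", "v_proj", "dense"],
}


def add_lora_adapters(model, settings=LORA_SETTINGS):
    """Prepare a quantized model for training and wrap it with LoRA adapters."""
    # Imported lazily so the settings above can be inspected without PEFT installed.
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    model = prepare_model_for_kbit_training(model)
    peft_model = get_peft_model(model, LoraConfig(**settings))
    peft_model.print_trainable_parameters()  # shows how few weights are trained
    return peft_model
```

Only the adapter weights are trainable after this step, which is what keeps fine-tuning feasible on a single GPU.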
Train the model using the provided Trainer class. Monitor training metrics and adjust hyperparameters as needed.
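Training with the Transformers `Trainer` might be wired up as in this sketch. Every value in `TRAINING_SETTINGS` (output directory, batch size, learning rate, step counts) is an assumption to be tuned, and `report_to="wandb"` assumes Weights & Biases is installed and logged in. The helper names are hypothetical.

```python
# Assumed starting hyperparameters, not the project's actual values.
TRAINING_SETTINGS = {
    "output_dir": "./sahayak-phi2-lora",
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-4,
    "max_steps": 500,
    "logging_steps": 25,
    "report_to": "wandb",  # send metrics to Weights & Biases
}


def effective_batch_size(settings=TRAINING_SETTINGS):
    """Examples per optimizer step on a single device."""
    return (settings["per_device_train_batch_size"]
            * settings["gradient_accumulation_steps"])


def build_trainer(model, train_dataset, eval_dataset, settings=TRAINING_SETTINGS):
    """Create a Trainer from tokenized datasets and the settings above."""
    from transformers import Trainer, TrainingArguments  # lazy heavy import

    args = TrainingArguments(**settings)
    return Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
```

Calling `build_trainer(...).train()` starts the run; the logged loss is the main metric to watch when adjusting the learning rate or number of steps.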
Evaluate the trained model on a sample prompt to check performance.
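A quick qualitative check on a single prompt could be sketched as follows. The prompt template and the `max_new_tokens` default are assumptions, and the function names are hypothetical; the template should match whatever format was used during training.

```python
def build_prompt(question):
    """Format a question using the assumed training template."""
    return f"Question: {question}\nAnswer:"


def generate_answer(model, tokenizer, question, max_new_tokens=100):
    """Generate a completion for one sample prompt and return the decoded text."""
    import torch  # lazy import so build_prompt works without PyTorch installed

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    model.eval()
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,  # avoids a missing-pad-token warning
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The returned string contains the prompt followed by the model's continuation, so comparing outputs before and after fine-tuning gives a rough sense of what changed.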
Utilize Anvil to create a user-friendly interface for the chatbot.
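One way to connect a Python model process to an Anvil app is the Anvil Uplink, sketched below. Whether Sahayak uses the Uplink is an assumption, and the names `ask_sahayak`, `chatbot_reply`, and `serve` are hypothetical. The `answer_fn` argument stands for any function that maps a question string to a reply string.

```python
def chatbot_reply(question, answer_fn):
    """Validate the user's message, then delegate to the model-backed answer_fn."""
    question = question.strip()
    if not question:
        return "Please type a question."
    return answer_fn(question)


def serve(uplink_key, answer_fn):
    """Expose the chatbot to an Anvil app over the Uplink (blocks until stopped)."""
    import anvil.server  # provided by the anvil-uplink package

    @anvil.server.callable
    def ask_sahayak(question):
        return chatbot_reply(question, answer_fn)

    anvil.server.connect(uplink_key)  # key from the Anvil app's Uplink settings
    anvil.server.wait_forever()
```

In the Anvil form code, a button handler could then call `anvil.server.call("ask_sahayak", text)` and display the returned string.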
- Machine Learning Libraries: Hugging Face Transformers, Accelerate, PEFT, Datasets
- GPU and Cloud Services: AWS SageMaker & Google Colab
- Data Processing and Visualization: Matplotlib
- Experiment Tracking: Weights & Biases
- User Interface: Anvil
- Python Libraries: Torch, scipy, ipywidgets, einops
- Version Control: Git