Semantic_Similarity

Given a text and a reason, predict whether the text satisfies the reason. Use the train file for training and report metrics on the evaluation file.
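
One way to frame the task is as binary classification over a similarity score. The sketch below is an illustration under assumptions, not the repository's confirmed pipeline: it assumes each text and reason has already been turned into an embedding vector, and the names `cosine_similarity`, `predict_label`, `binary_metrics` and the 0.5 threshold are hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_label(text_vec: Sequence[float],
                  reason_vec: Sequence[float],
                  threshold: float = 0.5) -> int:
    """Return 1 if the pair is judged to satisfy the reason, else 0.

    The default threshold is a placeholder (assumption); it should be tuned.
    """
    return int(cosine_similarity(text_vec, reason_vec) >= threshold)


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for 0/1 labels (positive class = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Reporting precision, recall and F1 alongside accuracy matters here because the labels are imbalanced, so accuracy alone can look good for a model that always predicts one class.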

Dataset information

Note: The small training dataset containing only positive samples is intentional.
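
Because the training file contains only positive pairs, a classifier needs negative examples from somewhere. A common approach, sketched here as an assumption rather than as the technique this repository actually uses, is to pair each text with a reason taken from a different example. The function name `make_negative_pairs` and the `(text, reason)` tuple layout are hypothetical.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str]  # (text, reason); this layout is an assumption


def make_negative_pairs(positives: List[Pair], seed: int = 0) -> List[Pair]:
    """Pair each text with a reason drawn from a different positive example.

    A text is skipped when no mismatched reason exists (e.g. a single-row input).
    """
    rng = random.Random(seed)  # seeded for reproducible sampling
    positive_set = set(positives)
    reasons = [reason for _, reason in positives]
    negatives: List[Pair] = []
    for text, own_reason in positives:
        # Exclude the text's own reason and any pair already labelled positive.
        candidates = [r for r in reasons
                      if r != own_reason and (text, r) not in positive_set]
        if candidates:
            negatives.append((text, rng.choice(candidates)))
    return negatives
```

Negatives produced this way are noisy: a randomly drawn reason may still happen to fit the text, so the generated labels are best treated as approximate.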

The Python scripts in this repository address the issues below. Run on Google Colab; the script can be found here.

Required packages
Label class imbalance
- Data insights:
  - Baseline approach (use only transformer models)
  - Training approach (use only transformer models)
  - Artificial negative-sample generation techniques
Metrics
Ablation study table (tabular comparison of results across different model architectures)
Fine-tuned the learning rate.
Used a learning rate scheduler.
Used a pre-trained model specifically designed for semantic similarity, such as sentence-transformers/bert-base-nli-mean-tokens.
Insufficient data (identified in the data insights analysis)
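
The README does not say which learning rate scheduler was used. As an illustration only, the sketch below shows linear warmup followed by linear decay, a shape often used when fine-tuning transformer models (the `transformers` library offers a comparable schedule as `get_linear_schedule_with_warmup`). The function name `linear_warmup_decay_lr` and all numeric values are assumptions.

```python
def linear_warmup_decay_lr(step: int, base_lr: float,
                           warmup_steps: int, total_steps: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper).

    Rises linearly from 0 to base_lr over warmup_steps, then falls
    linearly back to 0 at total_steps and stays at 0 afterwards.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Illustrative values only: 100 total steps, 10 warmup steps, peak LR 2e-5.
    for step in (0, 5, 10, 55, 100):
        print(step, linear_warmup_decay_lr(step, 2e-5, 10, 100))
```

Warmup keeps early updates small while the optimizer statistics settle, which tends to help when fine-tuning a pre-trained model on a small dataset.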

Data_insights.ipynb
NLP REPORT SUBMITTED.pdf
Nlp internship.ipynb - Colaboratory.pdf
Nlp_internship.ipynb
README.md
nlp_internship.py