Telecom NLI Matryoshka Model
Repo: agraharr/finetune-matryoshka-telecom
Model Overview
A fine-tuned sentence-pair transformer for telecom-domain natural language inference (NLI) and question-answer evaluation.
- Base model: tomaarsen/mpnet-base-nli-matryoshka
- Architecture: MPNet, Matryoshka sequence pooling, 2-class (entail/reject) head
- Domain: Telecom, 3GPP, networking, standards, wireless, industry QA
Training Details
- Method: Supervised sentence pair classification (entailment:1, no entailment:0)
- Code: Hugging Face Transformers Trainer
- Key config:
  - Model: tomaarsen/mpnet-base-nli-matryoshka
  - Epochs: 4
  - LR: 2e-5
  - Batch: 32
  - Max length: 128 tokens
  - Loss: CrossEntropy (labels: 0, 1)
- Hardware: a10g-large GPU; experiment tracking with Trackio
- Full supervised fine-tuning (SFT) training script in repo: finetune_matryoshka_telecom_nli.py
- Usage:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model = AutoModelForSequenceClassification.from_pretrained("agraharr/finetune-matryoshka-telecom")
tokenizer = AutoTokenizer.from_pretrained("agraharr/finetune-matryoshka-telecom")
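The key config listed above can be written down as keyword arguments for the Hugging Face Trainer. The following is a minimal sketch, not the repo's actual training script: the dictionary keys are real `transformers.TrainingArguments` parameter names, but `output_dir` is a hypothetical path, and reading "Batch: 32" as a per-device batch on a single GPU with no gradient accumulation is an assumption. The step count further assumes that all of the roughly 46,322 pairs described in the Dataset section are used for training, with no held-out split.

```python
import math

# Hyperparameters copied from the "Key config" list.
BASE_MODEL = "tomaarsen/mpnet-base-nli-matryoshka"
MAX_LENGTH = 128  # tokens per sentence pair, applied at tokenization time

# Keys are real `transformers.TrainingArguments` parameter names.
# `output_dir` is a hypothetical path; treating 32 as the per-device
# batch size on one GPU is an assumption.
training_kwargs = {
    "output_dir": "./telecom-nli-out",
    "num_train_epochs": 4,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,
}


def optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Total optimizer steps, assuming no gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


# Assumption: every one of the ~46,322 pairs is a training example.
total_steps = optimizer_steps(
    46_322,
    training_kwargs["per_device_train_batch_size"],
    training_kwargs["num_train_epochs"],
)
print(total_steps)
```

With `transformers` installed, the dictionary could be passed as `TrainingArguments(**training_kwargs)`; the step estimate is useful mainly for choosing warmup and logging intervals.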
Dataset
- Resource: teleqna_sentence_pairs.jsonl, teleqna_sentence_pairs.tsv
- Size: ≈ 46,322 sentence pairs
- Source: Generated from telecom Q&A, curated from standards, publications, and domain lexicons
- Format:
  - sentence1: telecom question
  - sentence2: telecom answer/candidate/option
  - label: 1 (correct/entailed), 0 (incorrect/rejected)
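To make the record format concrete, here is a small sketch of a reader for the JSONL file. It assumes one JSON object per line with exactly the three keys listed above; the `Pair` type and `parse_pairs` helper are hypothetical names, and the two sample rows are invented for illustration rather than taken from the dataset.

```python
import json
from typing import Iterable, NamedTuple


class Pair(NamedTuple):
    sentence1: str  # telecom question
    sentence2: str  # answer / candidate / option
    label: int      # 1 = correct/entailed, 0 = incorrect/rejected


def parse_pairs(lines: Iterable[str]) -> list[Pair]:
    """Parse JSONL lines into Pair records, skipping blank lines."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        label = int(record["label"])
        if label not in (0, 1):
            raise ValueError(f"unexpected label: {label!r}")
        pairs.append(Pair(record["sentence1"], record["sentence2"], label))
    return pairs


# Invented sample rows, for illustration only.
sample = [
    '{"sentence1": "What does UE stand for?", '
    '"sentence2": "User Equipment", "label": 1}',
    '{"sentence1": "What does UE stand for?", '
    '"sentence2": "Uplink Encoder", "label": 0}',
]
pairs = parse_pairs(sample)
print(pairs[0].sentence2, pairs[0].label)
```

Because `parse_pairs` accepts any iterable of strings, an open file handle on teleqna_sentence_pairs.jsonl can be passed to it directly.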
Task & Intended Use
- Downstream: Telecom question answering, entailment, retrieval, reranking, NLI-style scoring
- Benchmarks: Custom telecom QA, 3GPP/industry exams, technical evaluation
- API/Model Head: Sequence classification head, softmax; use as zero-shot scorer or finetune further for retrieval/ranking
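The scoring and reranking use described above could look roughly like the sketch below. It assumes the classification head emits two logits ordered by label id, so index 1 corresponds to entailment, matching the 0/1 labels in the training data. The helper names (`entail_probability`, `rerank`, `hf_logits_fn`) are hypothetical, and the demo uses a stand-in logits function so that it runs without downloading the model.

```python
import math
from typing import Callable, Sequence


def entail_probability(logits: Sequence[float]) -> float:
    """Softmax over [reject, entail] logits; returns P(label == 1)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return exps[1] / sum(exps)


def rerank(
    question: str,
    candidates: Sequence[str],
    logits_fn: Callable[[str, str], Sequence[float]],
) -> list[tuple[str, float]]:
    """Sort candidates by entailment probability, highest first."""
    scored = [
        (cand, entail_probability(logits_fn(question, cand)))
        for cand in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def hf_logits_fn(model, tokenizer, max_length: int = 128):
    """Wrap a loaded model/tokenizer pair as a logits_fn (needs torch)."""
    import torch  # lazy import keeps the helpers above dependency-free

    def fn(question: str, candidate: str) -> list[float]:
        inputs = tokenizer(
            question, candidate,
            truncation=True, max_length=max_length, return_tensors="pt",
        )
        with torch.no_grad():
            return model(**inputs).logits[0].tolist()

    return fn


def fake_logits(question: str, candidate: str) -> list[float]:
    # Stand-in for the real model so the sketch runs offline.
    return [0.0, 2.0] if candidate == "User Equipment" else [1.5, -0.5]


ranked = rerank(
    "What does UE stand for?",
    ["Uplink Encoder", "User Equipment"],
    fake_logits,
)
print(ranked[0][0])
```

In real use, `fake_logits` would be replaced by `hf_logits_fn(model, tokenizer)` with the model and tokenizer loaded as in the Usage snippet; calling `model.eval()` first is advisable so dropout is disabled.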
Citations
Base Model:
Tom Aarsen. "mpnet-base-nli-matryoshka." Hugging Face. https://huggingface.co/tomaarsen/mpnet-base-nli-matryoshka
Matryoshka Pooling:
Kusupati, A., Bhatt, G., Rege, A., et al. "Matryoshka Representation Learning." NeurIPS 2022. DOI: 10.48550/arXiv.2205.13147
Data Curation/QA Extraction:
Various telecom standards (3GPP, IEEE), public Q&A mining, domain lexicons, compiled by agraharr (2024)
Contact/Provenance
- Model created and trained via ML Intern.
- Author: agraharr