Telecom NLI Matryoshka Model

Repo: agraharr/finetune-matryoshka-telecom

Model Overview

A fine-tuned sentence-pair transformer model for natural language inference (NLI) and question-answer evaluation in the telecom domain.

Base model: tomaarsen/mpnet-base-nli-matryoshka
Architecture: MPNet, Matryoshka sequence pooling, 2-class (entail/reject) head
Domain: Telecom, 3GPP, networking, standards, wireless, industry QA
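
Matryoshka-style embeddings are trained so that a leading slice of the vector remains usable on its own. The sketch below illustrates that idea with a toy vector. It concerns the base model's sentence embeddings rather than the 2-class head, and the helper name `truncate_embedding` and the example values are made up for illustration.

```python
import math


def truncate_embedding(vector, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    if not 0 < dim <= len(vector):
        raise ValueError("dim must be between 1 and len(vector)")
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # all zeros: nothing to rescale
    return [x / norm for x in head]


# Toy 8-dimensional vector standing in for a real sentence embedding.
toy = [0.5, -0.5, 0.5, 0.5, 0.1, -0.2, 0.05, 0.0]
short = truncate_embedding(toy, 4)
```

Renormalizing after truncation keeps cosine similarity and dot product interchangeable for the shortened vectors.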

Training Details

Method: Supervised sentence pair classification (entailment:1, no entailment:0)
Code: Hugging Face Transformers Trainer
Key config:
- Model: tomaarsen/mpnet-base-nli-matryoshka
- Epochs: 4
- LR: 2e-5
- Batch: 32
- Max length: 128 tokens
- Loss: CrossEntropy (label: 0, 1)
- Hardware: a10g-large GPU
- Experiment tracking: Trackio
- Full fine-tuning script (Python) in the repo: finetune_matryoshka_telecom_nli.py
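
The hyperparameters above could be expressed roughly as follows. This is a sketch, not the contents of finetune_matryoshka_telecom_nli.py: the keys follow `transformers.TrainingArguments` naming, `output_dir` is a placeholder, and the step estimate assumes a single device, no gradient accumulation, and that all 46,322 pairs listed under Dataset are used for training with no held-out split.

```python
import math

# Keyword arguments in transformers.TrainingArguments naming.
training_kwargs = {
    "output_dir": "outputs/telecom-nli",  # hypothetical path
    "num_train_epochs": 4,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,
}
MAX_LENGTH = 128  # applied at tokenization time, not a TrainingArguments field


def optimizer_steps(num_examples, batch_size, epochs):
    """Optimizer steps when the last partial batch of each epoch is kept."""
    return math.ceil(num_examples / batch_size) * epochs


total_steps = optimizer_steps(
    46_322,
    training_kwargs["per_device_train_batch_size"],
    training_kwargs["num_train_epochs"],
)
```

Under those assumptions each epoch has 1,448 batches, so the run takes 5,792 optimizer steps in total.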

Usage:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
model = AutoModelForSequenceClassification.from_pretrained("agraharr/finetune-matryoshka-telecom")
tokenizer = AutoTokenizer.from_pretrained("agraharr/finetune-matryoshka-telecom")
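
Once loaded, the model can score a question/answer pair. The helper below is a minimal sketch: `entailment_probability` and `softmax` are illustrative names, and it assumes that logit index 1 corresponds to the entailed class (label 1 in this card) and that inputs are truncated to the 128-token training length.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entailment_probability(model, tokenizer, question, answer, max_length=128):
    """Return P(label == 1) for one (question, answer) pair."""
    import torch  # imported here so the softmax helper stays dependency-free

    inputs = tokenizer(
        question,
        answer,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**inputs).logits[0].tolist()
    return softmax(logits)[1]  # assumption: index 1 is the "entailed" class
```

With `model` and `tokenizer` loaded as above, `entailment_probability(model, tokenizer, "What does UE stand for in 3GPP?", "User Equipment")` returns a value between 0 and 1.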

Dataset

Resource: teleqna_sentence_pairs.jsonl, teleqna_sentence_pairs.tsv
Size: ≈ 46,322 sentence pairs
Source: Generated from telecom Q&A, curated from standards, publications, and domain lexicons
Format:
- sentence1: telecom question
- sentence2: telecom answer/candidate/option
- label: 1 (correct/entailed), 0 (incorrect/rejected)
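
The three fields above can be read from the JSONL file with the standard library. The snippet is a sketch: the `SentencePair` class and `parse_pairs` helper are illustrative names, and the two sample records are hypothetical rather than taken from the actual dataset.

```python
import json
from dataclasses import dataclass


@dataclass
class SentencePair:
    sentence1: str  # telecom question
    sentence2: str  # candidate answer or option
    label: int      # 1 = correct/entailed, 0 = incorrect/rejected


def parse_pairs(lines):
    """Parse JSONL lines into SentencePair records, skipping blank lines."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        pairs.append(
            SentencePair(record["sentence1"], record["sentence2"], int(record["label"]))
        )
    return pairs


# Hypothetical records, not taken from the actual file.
sample = [
    '{"sentence1": "What does UE stand for in 3GPP?", "sentence2": "User Equipment", "label": 1}',
    '{"sentence1": "What does UE stand for in 3GPP?", "sentence2": "Uplink Encoder", "label": 0}',
]
pairs = parse_pairs(sample)
```

For the real file, pass an open file handle instead of the sample list, for example `parse_pairs(open("teleqna_sentence_pairs.jsonl", encoding="utf-8"))`.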

Task & Intended Use

Downstream: Telecom question answering, entailment, retrieval, reranking, NLI-style scoring
Benchmarks: Custom telecom QA, 3GPP/industry exams, technical evaluation
API/Model Head: Sequence classification head with softmax over the two classes; use it as-is as a scorer, or fine-tune it further for retrieval/ranking
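
For reranking, one option is to score every candidate answer against the question and sort by the probability of label 1. The sketch below shows only the ranking step: `rank_candidates` is an illustrative helper, and `toy_score` with its fixed numbers is a stand-in for a real scorer built on the model's softmax output.

```python
def rank_candidates(question, candidates, score_fn):
    """Sort candidate answers by descending score_fn(question, candidate)."""
    scored = [(candidate, score_fn(question, candidate)) for candidate in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Stand-in scorer with made-up values; a real one would return the model's
# softmax probability for label 1.
def toy_score(question, candidate):
    fixed = {"User Equipment": 0.97, "Uplink Encoder": 0.04, "Unified Endpoint": 0.11}
    return fixed[candidate]


ranking = rank_candidates(
    "What does UE stand for in 3GPP?",
    ["Uplink Encoder", "User Equipment", "Unified Endpoint"],
    toy_score,
)
best_answer = ranking[0][0]
```

Passing the scorer as a function keeps the ranking logic independent of how the probabilities are produced, so the same helper works for multiple-choice QA and for retrieval candidates.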

Citations

Base Model:

Tom Aarsen. "mpnet-base-nli-matryoshka." Hugging Face. https://huggingface.co/tomaarsen/mpnet-base-nli-matryoshka

Matryoshka Pooling:

Kusupati, A., Bhatt, G., Rege, A., et al. "Matryoshka Representation Learning." NeurIPS 2022. DOI: 10.48550/arXiv.2205.13147

Data Curation/QA Extraction:

Various telecom standards (3GPP, IEEE), public Q&A mining, domain lexicons, compiled by agraharr (2024)

Contact/Provenance

Model created and trained via ML Intern.
Author: agraharr
