Telecom NLI Matryoshka Model

Repo: agraharr/finetune-matryoshka-telecom

Model Overview

A finetuned sentence-pair transformer model for telecom-domain natural language inference (NLI) and question-answer evaluation.

  • Base model: tomaarsen/mpnet-base-nli-matryoshka
  • Architecture: MPNet, Matryoshka sequence pooling, 2-class (entail/reject) head
  • Domain: Telecom, 3GPP, networking, standards, wireless, industry QA

Training Details

  • Method: Supervised sentence-pair classification (entailment: 1, no entailment: 0)
  • Code: Hugging Face Transformers Trainer
  • Key config:
    • Model: tomaarsen/mpnet-base-nli-matryoshka
    • Epochs: 4
    • LR: 2e-5
    • Batch: 32
    • Max length: 128 tokens
    • Loss: Cross-entropy over labels 0 and 1
    • Hardware: a10g-large GPU; experiment tracking with Trackio
    • Full SFT training script (Python) in repo: finetune_matryoshka_telecom_nli.py
  • Usage:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model = AutoModelForSequenceClassification.from_pretrained("agraharr/finetune-matryoshka-telecom")
tokenizer = AutoTokenizer.from_pretrained("agraharr/finetune-matryoshka-telecom")
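
Once the model and tokenizer are loaded, a pair can be scored by tokenizing the question and candidate together and applying softmax to the two logits. The sketch below is a minimal illustration, not the repo's own inference code: `softmax` and `score_pair` are hypothetical helper names, the 128-token limit mirrors the training config above, and it assumes logit index 1 corresponds to label 1 (entailed).

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a flat sequence of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_pair(model, tokenizer, question: str, candidate: str) -> float:
    """Return the probability of label 1 for one (question, candidate) pair.

    Assumption: index 1 of the logits is the entailment class, following
    the label convention in this card (1 = entailed, 0 = rejected).
    """
    import torch  # imported lazily so softmax() works without torch installed

    # Encode both sentences as a single pair input, truncated to 128 tokens.
    inputs = tokenizer(
        question,
        candidate,
        truncation=True,
        max_length=128,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return softmax(logits)[1]
```

With the `model` and `tokenizer` objects from the usage lines, `score_pair(model, tokenizer, question, candidate)` should return a float between 0 and 1, where higher values mean the candidate is more likely a correct answer.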

Dataset

  • Resource: teleqna_sentence_pairs.jsonl, teleqna_sentence_pairs.tsv
  • Size: ≈ 46,322 sentence pairs
  • Source: Generated from telecom Q&A, curated from standards, publications, and domain lexicons
  • Format:
    • sentence1: telecom question
    • sentence2: telecom answer/candidate/option
    • label: 1 (correct/entailed), 0 (incorrect/rejected)
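
To make the record format concrete, here is a small sketch that parses JSONL rows into typed pairs. It assumes the JSON keys are exactly `sentence1`, `sentence2`, and `label` as listed above; `SentencePair` and `parse_jsonl` are hypothetical names, and the two sample rows are invented for illustration, not taken from the dataset.

```python
import json
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SentencePair:
    sentence1: str  # telecom question
    sentence2: str  # candidate answer or option
    label: int      # 1 = correct/entailed, 0 = incorrect/rejected


def parse_jsonl(lines: Iterable[str]) -> List[SentencePair]:
    """Parse JSONL text lines into SentencePair records, skipping blank lines."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        row = json.loads(line)
        label = int(row["label"])
        if label not in (0, 1):
            raise ValueError(f"unexpected label: {label!r}")
        pairs.append(SentencePair(row["sentence1"], row["sentence2"], label))
    return pairs


# Invented rows for illustration only.
sample_lines = [
    '{"sentence1": "Which 3GPP release first specified 5G NR?", "sentence2": "Release 15", "label": 1}',
    '{"sentence1": "Which 3GPP release first specified 5G NR?", "sentence2": "Release 8", "label": 0}',
]
pairs = parse_jsonl(sample_lines)
```

For the real file, the same function could be fed an open file handle, for example `parse_jsonl(open("teleqna_sentence_pairs.jsonl", encoding="utf-8"))`.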

Task & Intended Use

  • Downstream: Telecom question answering, entailment, retrieval, reranking, NLI-style scoring
  • Benchmarks: Custom telecom QA, 3GPP/industry exams, technical evaluation
  • API/Model Head: Sequence classification head with softmax over the two class logits; use as a zero-shot scorer or finetune further for retrieval/ranking
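
As a rough illustration of the reranking use case, the sketch below sorts candidate answers by a per-pair score. `rerank` is a hypothetical helper, and `fake_score` with its hard-coded numbers is a stand-in for the model's entailment probability so the example runs without downloading weights. In practice `score_fn` would be any function that returns the probability of label 1 for a (question, candidate) pair.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    question: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Return (candidate, score) tuples sorted by descending score."""
    scored = [(candidate, score_fn(question, candidate)) for candidate in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Placeholder scores standing in for model output; these numbers are invented.
FAKE_SCORES = {"Release 8": 0.04, "Release 15": 0.93, "Release 99": 0.11}


def fake_score(question: str, candidate: str) -> float:
    """Stand-in scorer that looks up a fixed score for each candidate."""
    return FAKE_SCORES[candidate]


ranked = rerank(
    "Which 3GPP release first specified 5G NR?",
    list(FAKE_SCORES),
    fake_score,
)
best_answer = ranked[0][0]
```

For multiple-choice evaluation, the top-ranked candidate would be taken as the predicted answer; for retrieval, the sorted list can be cut off at the top k entries.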

Citations

Base Model:

Tom Aarsen. "mpnet-base-nli-matryoshka." Hugging Face. https://huggingface.co/tomaarsen/mpnet-base-nli-matryoshka

Matryoshka Pooling:

Kusupati, A., Bhatt, G., Rege, A., et al. "Matryoshka Representation Learning." NeurIPS 2022. DOI: 10.48550/arXiv.2205.13147

Data Curation/QA Extraction:

Various telecom standards (3GPP, IEEE), public Q&A mining, domain lexicons, compiled by agraharr (2024)

Contact/Provenance

  • Model created and trained via ML Intern.
  • Author: agraharr

Model on HF Hub & Documentation
