
Telco_Transformer_V1

The objective of the 'Telco Transformer' initiative is to pre-train a language model that understands complex, contextual relationships in telecom-domain text. The business hypothesis is that telecom text behaves like a non-standard natural language: it is dense with the names of telecom system components, technical terminology, and knowledge drawn from multiple subdomains. This builds a strong case for pre-training a model from scratch rather than adapting a general-purpose one. The approach comprises a custom tokenizer to capture telco vocabulary, large-scale unsupervised pre-training, and supervised fine-tuning for downstream tasks. The resulting model is intended to complete sentences and answer questions with accuracy superior to retrieval-augmented generation (RAG) or fine-tuning a general-purpose model alone.
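The motivation for a custom tokenizer can be illustrated with a minimal sketch: a greedy longest-match subword tokenizer keeps domain terms intact only when they are in its vocabulary. The vocabularies below are hypothetical toy examples, not the model's actual tokenizer.

```python
# Minimal sketch: greedy longest-match subword tokenization, illustrating why
# a domain vocabulary keeps telco terms whole. The vocabularies here are
# hypothetical; the real model's tokenizer was trained on the telco corpus.

def greedy_tokenize(text, vocab):
    """Split `text` into the longest vocabulary entries, left to right."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest candidate first
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character falls back to itself
            i += 1
    return tokens

# A generic, character-level vocabulary fragments domain terms;
# a telco vocabulary keeps them as single tokens.
generic_vocab = {"e", "N", "o", "d", "B", "5", "G"}
telco_vocab = {"eNodeB", "5G", "RAN", "handover"}

print(greedy_tokenize("eNodeB", generic_vocab))  # ['e', 'N', 'o', 'd', 'e', 'B']
print(greedy_tokenize("eNodeB", telco_vocab))    # ['eNodeB']
```

A single token per domain term gives the model a stable unit of meaning for entities like eNodeB, instead of forcing it to reassemble them from character fragments.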

Model V1 achieves the following results on the evaluation set:

  • Loss: 3.8171

Input Token Examples

RAN, IP Network, Radio, 5G, Core Network

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: sagemaker_data_parallel
  • num_devices: 8
  • total_train_batch_size: 128
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 300
  • num_epochs: 30
  • mixed_precision_training: Native AMP
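The hyperparameters above imply an effective batch size of 16 × 8 = 128 sequences per optimizer step, and a learning-rate curve that warms up linearly for 300 steps and then decays along a cosine. A minimal sketch of that schedule, assuming the common convention of warmup to the peak rate followed by cosine decay to zero (the total step count is inferred from the training log and is approximate):

```python
import math

# Sketch of the schedule described above: linear warmup to the peak rate,
# then cosine decay to zero over the remaining steps.
PEAK_LR = 3e-5
WARMUP_STEPS = 300
TOTAL_STEPS = 9540  # assumption: ~318 steps/epoch * 30 epochs, inferred from the log

def lr_at(step):
    """Learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

# Effective batch size: per-device batch * number of devices.
effective_batch = 16 * 8  # 128, matching total_train_batch_size
```

With data-parallel training, each of the 8 devices processes its own batch of 16, so gradients are averaged over 128 sequences per update.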

Training results

Training Loss   Epoch   Step   Validation Loss
5.3308          1.57    500    5.2874
4.9378          3.14    1000   4.8655
4.5929          4.72    1500   4.5568
4.4308          6.29    2000   4.3593
4.2703          7.86    2500   4.2217
4.1977          9.43    3000   4.1222
4.0986          11.01   3500   4.0477
4.0791          12.58   4000   3.9904
3.9625          14.15   4500   3.9470
3.9381          15.72   5000   3.9114
3.9399          17.30   5500   3.8844
3.9146          18.87   6000   3.8640
3.8779          20.44   6500   3.8468
3.8440          22.01   7000   3.8355
3.8364          23.58   7500   3.8266
3.8566          25.16   8000   3.8216
3.8411          26.73   8500   3.8187
3.8150          28.30   9000   3.8172
3.8225          29.87   9500   3.8171
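Since the validation loss is a cross-entropy measured in nats, it maps directly to perplexity via exp(loss). A quick check on the final checkpoint from the table above:

```python
import math

# Validation loss is cross-entropy in nats, so perplexity = exp(loss).
final_val_loss = 3.8171  # final row of the training-results table
perplexity = math.exp(final_val_loss)
print(f"validation perplexity ≈ {perplexity:.1f}")  # ≈ 45.5
```

Perplexity is often easier to compare across checkpoints and corpora than raw loss: the model is, roughly, as uncertain as if choosing uniformly among ~45 tokens at each position.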

Framework versions

  • Transformers 4.28.1
  • PyTorch 2.0.0
  • Datasets 2.16.1
  • Tokenizers 0.13.3

Model tree for subhashtalluri/Telco_Transformer_V1

Finetunes
1 model