
Telco_Transformer_V1

The objective of the 'Telco Transformer' initiative is to pre-train a language model that understands complex, contextual relationships in telecom-domain text. The business hypothesis is that telecom text behaves like a non-standard natural language: it is dense with the names of telecom system components, technical terminology, and knowledge drawn from multiple subdomains. This builds a strong case for pre-training a model from scratch rather than adapting a general-purpose one. The approach comprises a custom tokenizer to capture telco vocabulary, large-scale unsupervised pre-training, and supervised fine-tuning for downstream tasks. The resulting model is intended to complete sentences and answer questions with accuracy superior to retrieval-augmented generation (RAG) or fine-tuning a general-purpose model alone.
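The motivation for a custom tokenizer can be illustrated with a minimal sketch: a greedy longest-match subword tokenizer keeps domain terms intact only when they are in its vocabulary. The vocabularies below are hypothetical toy examples, not the model's actual tokenizer.

```python
# Minimal sketch: greedy longest-match subword tokenization, illustrating why
# a domain vocabulary keeps telco terms whole. The vocabularies here are
# hypothetical; the real model's tokenizer was trained on the telco corpus.

def greedy_tokenize(text, vocab):
    """Split `text` into the longest vocabulary entries, left to right."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest candidate first
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character falls back to itself
            i += 1
    return tokens

# A generic, character-level vocabulary fragments domain terms;
# a telco vocabulary keeps them as single tokens.
generic_vocab = {"e", "N", "o", "d", "B", "5", "G"}
telco_vocab = {"eNodeB", "5G", "RAN", "handover"}

print(greedy_tokenize("eNodeB", generic_vocab))  # ['e', 'N', 'o', 'd', 'e', 'B']
print(greedy_tokenize("eNodeB", telco_vocab))    # ['eNodeB']
```

A single token per domain term gives the model a stable unit of meaning for entities like eNodeB, instead of forcing it to reassemble them from character fragments.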

Model V1 achieves the following results on the evaluation set:

  • Loss: 3.8171

Input Token Examples

RAN, IP Network, Radio, 5G, Core Network

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: sagemaker_data_parallel
  • num_devices: 8
  • total_train_batch_size: 128
  • total_eval_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 300
  • num_epochs: 30
  • mixed_precision_training: Native AMP
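The hyperparameters above imply an effective batch size of 16 × 8 = 128 sequences per optimizer step, and a learning-rate curve that warms up linearly for 300 steps and then decays along a cosine. A minimal sketch of that schedule, assuming the common convention of warmup to the peak rate followed by cosine decay to zero (the total step count is inferred from the training log and is approximate):

```python
import math

# Sketch of the schedule described above: linear warmup to the peak rate,
# then cosine decay to zero over the remaining steps.
PEAK_LR = 3e-5
WARMUP_STEPS = 300
TOTAL_STEPS = 9540  # assumption: ~318 steps/epoch * 30 epochs, inferred from the log

def lr_at(step):
    """Learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

# Effective batch size: per-device batch * number of devices.
effective_batch = 16 * 8  # 128, matching total_train_batch_size
```

With data-parallel training, each of the 8 devices processes its own batch of 16, so gradients are averaged over 128 sequences per update.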

Training results

Training Loss   Epoch   Step   Validation Loss
5.3308          1.57    500    5.2874
4.9378          3.14    1000   4.8655
4.5929          4.72    1500   4.5568
4.4308          6.29    2000   4.3593
4.2703          7.86    2500   4.2217
4.1977          9.43    3000   4.1222
4.0986          11.01   3500   4.0477
4.0791          12.58   4000   3.9904
3.9625          14.15   4500   3.9470
3.9381          15.72   5000   3.9114
3.9399          17.30   5500   3.8844
3.9146          18.87   6000   3.8640
3.8779          20.44   6500   3.8468
3.8440          22.01   7000   3.8355
3.8364          23.58   7500   3.8266
3.8566          25.16   8000   3.8216
3.8411          26.73   8500   3.8187
3.8150          28.30   9000   3.8172
3.8225          29.87   9500   3.8171
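Since the validation loss is a cross-entropy measured in nats, it maps directly to perplexity via exp(loss). A quick check on the final checkpoint from the table above:

```python
import math

# Validation loss is cross-entropy in nats, so perplexity = exp(loss).
final_val_loss = 3.8171  # final row of the training-results table
perplexity = math.exp(final_val_loss)
print(f"validation perplexity ≈ {perplexity:.1f}")  # ≈ 45.5
```

Perplexity is often easier to compare across checkpoints and corpora than raw loss: the model is, roughly, as uncertain as if choosing uniformly among ~45 tokens at each position.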

Framework versions

  • Transformers 4.28.1
  • PyTorch 2.0.0
  • Datasets 2.16.1
  • Tokenizers 0.13.3

Model tree for subhashtalluri/Telco_Transformer_V1

Finetunes
1 model