Telco Transformer V1 Chat

Telco Transformer V1 Chat is a variant of the base model fine-tuned for conversational interactions. The objective is to train a chat model that converses well within a telco context. The base model provides a general comprehension of telco language; it is then fine-tuned on a curated instruction dataset to improve overall conversation quality.

Telco Transformer V1 Chat is a fine-tuned version of subhashtalluri/Telco_Transformer_1_4. It achieves the following results on the evaluation set:

  • Loss: 3.8628

Model Usage

Context Length: 512 tokens

Prompt Definition:

Q): {user question}

A):
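The Q)/A) template above can be assembled programmatically. A minimal sketch (the `build_prompt` helper is illustrative, not part of the model's tooling):

```python
def build_prompt(user_question: str) -> str:
    """Format a user question in the Q)/A) template the model expects."""
    return f"Q): {user_question}\n\nA):"

# Example: wrap a question from the list below in the prompt template
prompt = build_prompt("What is MPLS?")
print(prompt)
```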

Prompt Examples for V1 Chat:

Currently, the model is limited to answering questions about MPLS (Multiprotocol Label Switching). Please ask questions about MPLS in various forms, for example:

  1. What is MPLS?
  2. Describe MPLS?
  3. When is MPLS used?
  4. What are the pros and cons of MPLS?
  5. Why do we need MPLS?
  6. How does MPLS work?
  7. What are the drawbacks of MPLS?
  8. How does routing work in MPLS?
  9. What is MPLS used for?

Please note: the response is limited to 512 tokens. Generating beyond this limit will produce a compute error.
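Since the prompt and the generated response share the 512-token context window, callers should cap generation so the total stays within the limit. A minimal budgeting sketch, assuming the prompt's token count is already known (the helper below is illustrative, not part of the model card):

```python
CONTEXT_LENGTH = 512  # model's maximum context, per this card

def max_new_tokens(prompt_token_count: int,
                   context_length: int = CONTEXT_LENGTH) -> int:
    """Largest number of tokens that can be generated for a given
    prompt length without overrunning the context window."""
    return max(0, context_length - prompt_token_count)

# Example: a 40-token prompt leaves room for 472 generated tokens
print(max_new_tokens(40))
```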

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 10
  • mixed_precision_training: Native AMP
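The linear scheduler with a 0.05 warmup ratio corresponds to a learning rate that ramps up over the first 5% of training steps and then decays linearly to zero. A sketch of that schedule (a standard formulation of linear warmup plus decay, not code extracted from the training script):

```python
def linear_schedule_lr(step: int, total_steps: int,
                       base_lr: float = 3e-5,
                       warmup_ratio: float = 0.05) -> float:
    """Learning rate at a given step: linear warmup to base_lr over
    warmup_ratio * total_steps, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(
        0.0, (total_steps - step) / max(1, total_steps - warmup_steps)
    )
```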

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.3391        | 0.56  | 50   | 3.9165          |
| 2.5623        | 1.12  | 100  | 3.8315          |
| 3.6035        | 1.69  | 150  | 3.7690          |
| 3.6966        | 2.25  | 200  | 3.7529          |
| 3.3786        | 2.81  | 250  | 3.7020          |
| 1.8281        | 3.37  | 300  | 3.7681          |
| 2.8276        | 3.93  | 350  | 3.7432          |
| 1.4719        | 4.49  | 400  | 3.7911          |
| 1.2252        | 5.06  | 450  | 3.7793          |
| 1.3996        | 5.62  | 500  | 3.7991          |
| 1.1709        | 6.18  | 550  | 3.8068          |
| 0.7002        | 6.74  | 600  | 3.8335          |
| 2.2776        | 7.3   | 650  | 3.8329          |
| 2.0515        | 7.87  | 700  | 3.8596          |
| 2.2111        | 8.43  | 750  | 3.8686          |
| 1.1622        | 8.99  | 800  | 3.8587          |
| 0.7239        | 9.55  | 850  | 3.8628          |
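For context, a cross-entropy validation loss of 3.8628 corresponds to a perplexity of exp(3.8628) ≈ 47.6. This is the standard loss-to-perplexity conversion, sketched below:

```python
import math

val_loss = 3.8628  # final validation loss from the table above
perplexity = math.exp(val_loss)
print(f"perplexity ≈ {perplexity:.1f}")
```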

Framework versions

  • Transformers 4.39.0.dev0
  • Pytorch 2.0.0+cu117
  • Datasets 2.10.1
  • Tokenizers 0.15.2
Model size: 67.5M parameters (F32, Safetensors)

Model tree for subhashtalluri/Telco_Transformer_V1_Chat

Finetuned
(1)
this model