metadata

license: apache-2.0
base_model: google/electra-small-discriminator
tags:
  - generated_from_keras_callback
model-index:
  - name: nguyennghia0902/electra-small-discriminator_0.0001_16_15e
    results: []
language:
  - vi
  - en
metrics:
  - accuracy
pipeline_tag: question-answering
datasets:
  - nguyennghia0902/project02_textming_dataset

nguyennghia0902/electra-small-discriminator_0.0001_16_15e

This model is a fine-tuned version of google/electra-small-discriminator on a Vietnamese question-answering dataset (nguyennghia0902/project02_textming_dataset). It reports the following results after the final training epoch:

Train Loss: 0.4315
Train End Logits Accuracy: 0.8714
Train Start Logits Accuracy: 0.8580
Validation Loss: 0.1470
Validation End Logits Accuracy: 0.9577
Validation Start Logits Accuracy: 0.9542
Test Matching Accuracy: 0.90209
Epochs: 15 (numbered 0 to 14 in the training results table)
Training time: 21920.9752 seconds (about 6.09 hours)

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Learning rate: 1e-4
Batch size: 16
optimizer: { 'name': 'Adam', 'learning_rate': { 'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': { 'initial_learning_rate': 0.0001, 'decay_steps': 46905, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False } }, 'epsilon': 1e-08 }
training_precision: float32
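
The optimizer entry above uses a PolynomialDecay schedule with power 1.0 and cycle False, which amounts to a linear decay from 1e-4 to 0 over 46905 steps. The following is a minimal sketch in plain Python of that decay formula, using the values from the config. The function name `polynomial_decay_lr` is hypothetical and is only meant to illustrate the schedule, not to reproduce the training script.

```python
def polynomial_decay_lr(step, initial_lr=1e-4, end_lr=0.0,
                        decay_steps=46905, power=1.0):
    """Learning rate at a given optimizer step for a non-cycling polynomial decay.

    Hypothetical helper for illustration; defaults are copied from the
    optimizer config in this card.
    """
    # Without cycling, the step is clamped, so the rate stays at end_lr
    # once decay_steps has been reached.
    step = min(step, decay_steps)
    remaining = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * remaining ** power + end_lr


if __name__ == "__main__":
    # Print the rate at the start, near the midpoint, and at the end of training.
    for step in (0, 23452, 46905):
        print(step, polynomial_decay_lr(step))
```

Note that 46905 / 15 = 3127, so the schedule presumably spans 3127 optimizer steps per epoch over the 15 epochs. This is an inference from the numbers, not something the card states.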

Training results

Train Loss	Train End Logits Accuracy	Train Start Logits Accuracy	Validation Loss	Validation End Logits Accuracy	Validation Start Logits Accuracy	Epoch
2.9418	0.3441	0.3115	2.1831	0.4777	0.4649	0
2.2767	0.4696	0.4357	1.7802	0.5643	0.5481	1
1.9907	0.5234	0.4941	1.5055	0.6229	0.6068	2
1.7630	0.5690	0.5440	1.2348	0.6824	0.6708	3
1.5637	0.6086	0.5842	1.0345	0.7291	0.7190	4
1.3785	0.6500	0.6241	0.8309	0.7823	0.7724	5
1.2118	0.6880	0.6604	0.6918	0.8105	0.8116	6
1.0610	0.7222	0.6963	0.5471	0.8490	0.8476	7
0.9249	0.7495	0.7272	0.4426	0.8770	0.8763	8
0.8085	0.7777	0.7585	0.3695	0.8919	0.8908	9
0.7062	0.8018	0.7843	0.2773	0.9194	0.9198	10
0.6182	0.8232	0.8043	0.2323	0.9343	0.9302	11
0.5422	0.8414	0.8267	0.1807	0.9470	0.9470	12
0.4797	0.8588	0.8443	0.1570	0.9530	0.9515	13
0.4315	0.8714	0.8580	0.1470	0.9577	0.9542	14

Framework versions

Transformers 4.39.3
TensorFlow 2.15.0
Datasets 2.18.0
Tokenizers 0.15.2

How to use

import tensorflow as tf
from transformers import ElectraTokenizerFast, TFElectraForQuestionAnswering

model_hf = "nguyennghia0902/electra-small-discriminator_0.0001_16_15e"
tokenizer = ElectraTokenizerFast.from_pretrained(model_hf)
reload_model = TFElectraForQuestionAnswering.from_pretrained(model_hf)

question = "Ký túc xá Đại học Quốc gia Thành phố Hồ Chí Minh bao gồm có bao nhiêu khu?"
context = "Ký túc xá Đại học Quốc gia Thành phố Hồ Chí Minh (Ký túc xá ĐHQG-TPHCM) là hệ thống ký túc xá xây tại Khu đô thị Đại học Quốc gia Thành phố Hồ Chí Minh (còn gọi với tên phổ biến: Khu đô thị ĐHQG-HCM hay Làng Đại học Thủ Đức). Ký túc xá ĐHQG-TPHCM gồm có 02 khu: A và B. Địa chỉ: Đường Tạ Quang Bửu, Khu phố 6, phường Linh Trung, thành phố Thủ Đức, Thành phố Hồ Chí Minh, điện thoại: 1900 05 55 59 (111). "

# Tokenize the question/context pair and keep the token-to-character offset mapping.
inputs = tokenizer(question, context, return_offsets_mapping=True, return_tensors="tf", max_length=512, truncation=True)
offset_mapping = inputs.pop("offset_mapping")
outputs = reload_model(**inputs)
# Pick the most likely start and end tokens independently.
answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])
# Map the token indices back to character offsets and slice the answer out of the context.
start_char = int(offset_mapping[0][answer_start_index][0])
end_char = int(offset_mapping[0][answer_end_index][1])
predicted_answer_text = context[start_char:end_char]

print(predicted_answer_text)
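
The snippet above takes the argmax of the start and end logits independently, so the predicted end can fall before the predicted start and give an empty answer. A common alternative, shown here as a sketch and not as part of the original model card, is to search for the best-scoring pair with start <= end. The helper `best_span` and its `max_answer_tokens` parameter are hypothetical names. The sketch does not filter out question or special tokens.

```python
def best_span(start_logits, end_logits, max_answer_tokens=30):
    """Return the (start, end) token indices with the highest combined score.

    Hypothetical helper: only spans with start <= end and at most
    max_answer_tokens tokens are considered.
    """
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_logits):
        # Only consider ends at or after the start, within the length cap.
        last = min(s + max_answer_tokens, len(end_logits))
        for e in range(s, last):
            score = s_score + end_logits[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


if __name__ == "__main__":
    # Toy logits where the independent argmaxes (start=3, end=0) are inconsistent.
    start = [0.1, 3.0, 0.2, 3.5]
    end = [5.0, 0.1, 4.0, 0.3]
    print(best_span(start, end))
```

With the usage code above, the inputs could be `outputs.start_logits[0].numpy()` and `outputs.end_logits[0].numpy()`, and the returned indices would replace `answer_start_index` and `answer_end_index`.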