metadata
library_name: pytorch
pipeline_tag: text2text-generation
language:
- vi
- lo
metrics:
- bleu
Please use Python 3.10.
Direct Use
Load a pre-trained model
Use load_config to load a .yaml config file.
Then use load_model_tokenizer to load a pretrained model and its tokenizers.
from config import load_config
from load_model import load_model_tokenizer
config = load_config(file_name='config/config_final.yaml')
model, src_tokenizer, tgt_tokenizer = load_model_tokenizer(config)
Translate Lao (lo) to Vietnamese (vi)
Use the translate function in translate.py.
from translate import translate
from config import load_config
from load_model import load_model_tokenizer
config = load_config(file_name='config/config_final.yaml')
model, src_tokenizer, tgt_tokenizer = load_model_tokenizer(config)
text = " "  # replace with the Lao source sentence to translate
translation, attn = translate(
    model, src_tokenizer, tgt_tokenizer, text,
    decode_method='beam-search',
)
print(translation)
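To translate several sentences, the single call above can be wrapped in a small loop. The helper below is a minimal sketch, not part of this repository: translate_lines and translate_one are hypothetical names, and the sketch assumes only that the wrapped call returns a (translation, attention) pair, as translate does in the example above.

```python
from typing import Callable, Iterable, List, Tuple


def translate_lines(
    lines: Iterable[str],
    translate_one: Callable[[str], Tuple[str, object]],
) -> List[str]:
    """Translate each non-empty line and return the translations in order."""
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines instead of sending them to the model
        translation, _attn = translate_one(line)  # attention weights are discarded here
        results.append(translation)
    return results


# With the objects loaded above, translate_one could be (assumption, untested):
# lambda text: translate(model, src_tokenizer, tgt_tokenizer, text,
#                        decode_method='beam-search')
```

Passing the translation call in as a parameter keeps the loop independent of how the model and tokenizers were loaded.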
Training
Use the train_model function in train.py to train your model.
from train import train_model
from config import load_config
config = load_config(file_name='config/config_final.yaml')
train_model(config)
If you wish to continue training or fine-tune our model, modify num_epochs in your config file and read the following notes (+ denotes string concatenation):
- The code will save and preload models in model_folder.
- The code will preload the model with the name: "model_basename + preload + .pt".
- The code will NOT preload a trained model if you set preload to null.
- Every epoch, the code will save the model with the name: "model_basename + _ + (current epoch) + .pt".
- train_model will automatically continue training the preloaded model.
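The naming rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in train.py: the config is treated as a plain dict, the epoch is assumed to be written as an unpadded integer, and the two helper functions and all example values are hypothetical.

```python
from pathlib import Path
from typing import Optional


def epoch_checkpoint_path(config: dict, epoch: int) -> Path:
    """File saved after an epoch: model_basename + "_" + epoch + ".pt"."""
    return Path(config["model_folder"]) / f"{config['model_basename']}_{epoch}.pt"


def preload_checkpoint_path(config: dict) -> Optional[Path]:
    """File loaded before training: model_basename + preload + ".pt", or None if preload is null."""
    preload = config.get("preload")
    if preload is None:
        return None  # nothing is preloaded; training starts from scratch
    return Path(config["model_folder"]) / f"{config['model_basename']}{preload}.pt"


# Hypothetical values, for illustration only.
config = {
    "model_folder": "weights",
    "model_basename": "model",
    "preload": "_9",
    "num_epochs": 20,
}
print(preload_checkpoint_path(config))
print(epoch_checkpoint_path(config, 10))
```

Under these rules, resuming from a checkpoint saved after a given epoch would mean setting preload to "_" followed by that epoch number, since the saved name inserts an underscore that the preload name does not.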