---
library_name: pytorch
pipeline_tag: text2text-generation
language:
- vi
- lo
metrics:
- bleu
---

Please use Python 3.10.

## Direct Use

### Load a pre-trained model

Use `load_config` to load a `.yaml` config file, then use `load_model_tokenizer` to load a pretrained model and its tokenizers.

```
from config import load_config
from load_model import load_model_tokenizer

config = load_config(file_name='config/config_final.yaml')
model, src_tokenizer, tgt_tokenizer = load_model_tokenizer(config)
```

### Translate lo to vi

Use the `translate` function in `translate.py`.

```
from translate import translate
from config import load_config
from load_model import load_model_tokenizer

config = load_config(file_name='config/config_final.yaml')
model, src_tokenizer, tgt_tokenizer = load_model_tokenizer(config)

text = " "  # replace with the Lao source sentence you want to translate
translation, attn = translate(
    model,
    src_tokenizer,
    tgt_tokenizer,
    text,
    decode_method='beam-search',
)
print(translation)
```

## Training

Use the `train_model` function in `train.py` to train your model.

```
from train import train_model
from config import load_config

config = load_config(file_name='config/config_final.yaml')
train_model(config)
```

If you wish to continue training or fine-tune our model, modify `num_epochs` in your desired config file and note the following (`+` denotes string concatenation; a config sketch illustrating these keys follows the list):

- The code will save and preload models in `model_folder`.
- The code will preload the model with the name: "`model_basename` + `preload` + `.pt`".
- The code will NOT preload a trained model if you set `preload` to `null`.
- Every epoch, the code will save the model with the name: "`model_basename` + `_` + (current epoch) + `.pt`".
- `train_model` will automatically continue training the `preload`ed model.
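As a concrete illustration, the relevant portion of a config file might look like the sketch below. The key names (`model_folder`, `model_basename`, `preload`, `num_epochs`) come from the notes above; the folder and values shown are hypothetical.

```
# Hypothetical values -- adjust to match your own setup.
model_folder: weights    # directory where checkpoints are saved and preloaded
model_basename: model    # filename prefix for saved checkpoints
preload: "_09"           # preloads "model_09.pt"; set to null to skip preloading
num_epochs: 20           # set higher than the preloaded epoch to continue training
```

With these values, training would resume from `model_09.pt`, and a new checkpoint such as `model_10.pt` would be written after each epoch.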