---
license: apache-2.0
datasets:
- vietgpt/wikipedia_vi
- oscar-corpus/OSCAR-2301
language:
- vi
- en
pipeline_tag: text-generation
---

# Concept of open-llama-7b-vi

This is an OpenLLaMA model fine-tuned on Vietnamese-language text.

## Model architecture

The model architecture is the same as that of the original OpenLLaMA model.

## Training Data

The model was trained on the Vietnamese version of Wikipedia. The generated corpus files total 1.5 GB and contain approximately 1.3M sentences.
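
Because the architecture matches the original OpenLLaMA model, the checkpoint should load with the standard causal language model classes in Hugging Face `transformers`. The following is a minimal sketch under several assumptions, not a documented usage recipe: `MODEL_ID` is a placeholder for the actual Hub path of this model, the helper names `generation_kwargs` and `generate_text` are hypothetical, and the sampling values are illustrative rather than tuned. It requests the slow (SentencePiece) tokenizer with `use_fast=False`, following the OpenLLaMA project's advice to avoid the auto-converted fast tokenizer, so the `sentencepiece` package is assumed to be installed alongside `torch` and `transformers`.

```python
# Placeholder repository id (assumption): replace with the real Hub path of this model.
MODEL_ID = "your-namespace/open-llama-7b-vi"


def generation_kwargs(max_new_tokens: int = 64) -> dict:
    """Return illustrative sampling settings; these values are not tuned for this model."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": 0.7,
        "top_p": 0.9,
    }


def generate_text(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 64) -> str:
    """Load the model and continue `prompt`. Weights are downloaded on first use."""
    # Heavy dependencies are imported lazily so the helpers above stay importable
    # on machines without torch/transformers.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Slow tokenizer, as recommended for OpenLLaMA checkpoints.
    tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)

    # Use half precision on GPU to fit a 7B model more easily; full precision on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype).to(device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, **generation_kwargs(max_new_tokens))

    # The decoded string contains the prompt followed by the generated continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Hà Nội là")` would then return the prompt followed by a sampled Vietnamese continuation, provided `MODEL_ID` points at the real repository. A 7B model needs roughly 14 GB of memory for the weights alone in half precision, so a GPU is advisable.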