---
license: apache-2.0
language:
- vi
- en
---

## Model Details

### Model Description

- **Developed by:** Tuan Pham (FPTU HCM Student)
- **Model type:** Llama2-7B decoder-only
- **Finetuned from models:**
  * meta-llama/Llama-2-7b
  * bkai-foundation-models/vietnamese-llama2-7b-120GB
  * yeen214/llama2_7b_merge_orcafamily
- **Bilingual support:** English and Vietnamese

### Model Sources

- **Repository:**
  * Training: https://github.com/vTuanpham/Vietnamese_QA_System
  * Data: https://github.com/vTuanpham/Large_dataset_translator
- **Paper:** ...
- **Demo:** ...

## Uses

### Prompt template

```
[SYSTEM_PROMPT]

####### Instruction:
[INPUT]

%%%%%%% Response:
[RESPONSE]
```

## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch
from torch.cuda.amp import autocast
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

model_name = "1TuanPham/InstructEnVi_llama2-bkai-120GB-Orcafamily_250kx3.37_350kx1.4"
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             torch_dtype=torch.bfloat16,
                                             use_cache=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
streamer = TextStreamer(tokenizer, skip_special_tokens=True)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, streamer=streamer)

with autocast():
    # Vietnamese prompt: "Pham Nhat Vuong is "
    output_default = pipe("Phạm Nhật Vượng là ",
                          pad_token_id=tokenizer.eos_token_id,
                          max_new_tokens=128)
```

## Training Details

**Hardware Type:**
* GPU: NVIDIA Tesla P100 16GB
* System RAM: 29GB

**Hours used:** ~43.5 (approximate)

### Training Data

* BactrianX
* OpenOrca_translated
* WizardLM_70k_translated
* TigerLabMathInstruct_translated_vi
* GradeSchoolMathInstruct_translated
* vilm_lima-vi
* MTEngVietnamese
* databricks_dolly15k_translated
* AlpacaCleaned_translated
* databricks_dolly15k
* OpenOrca
* GradeSchoolMathInstruct
* AlpacaCleaned
* WebglmQA

### Training Procedure

* Learning rate: 2e-5 with a cosine schedule
* Optimizer: PagedLion8bit
* QLoRA: rank 64, 4-bit quantization

Training runs:

- 250k examples (70% Vietnamese / 30% English) for 3.37 epochs
- 350k examples (60% Vietnamese / 40% English) for 1.4 epochs

### Training loss

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63905e87df447b438817b2cd/rV8Go_YFZv7QcR_FhFxp-.png)

## Evaluation

### Results

[More Information Needed]

## Technical Specifications

### Model Architecture and Objective

[More Information Needed]

## Citation

[More Information Needed]

## Model Card Authors

[More Information Needed]

## Model Card Contact

[More Information Needed]
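
## Prompt Formatting Example

The sketch below shows how a query might be wrapped in the prompt template from the Uses section before generation. It is illustrative only: it reuses the `pipe`, `tokenizer`, and `autocast` objects from the getting-started snippet, and the `build_prompt` helper and the example system prompt are assumptions made for this example, not part of the released training code.

```python
# Illustrative sketch: build_prompt and the default system prompt are
# assumptions based on the template in the "Uses" section, not released code.
def build_prompt(instruction: str,
                 system_prompt: str = "You are a helpful bilingual assistant.") -> str:
    # Mirrors the card's template: system prompt, instruction, then an open
    # response slot for the model to complete.
    return (f"{system_prompt}\n\n"
            f"####### Instruction:\n{instruction}\n\n"
            f"%%%%%%% Response:\n")

# Reuses `pipe`, `tokenizer`, and `autocast` from the getting-started snippet.
prompt = build_prompt("Phạm Nhật Vượng là ai?")  # Vietnamese: "Who is Pham Nhat Vuong?"
with autocast():
    output = pipe(prompt, pad_token_id=tokenizer.eos_token_id, max_new_tokens=256)
```

Since the card does not state the system prompt used during training, a short English or Vietnamese system prompt is a reasonable starting point to experiment with.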