---
language:
- en
- es
- ru
- de
- pl
- th
- vi
- sv
- bn
- da
- he
- it
- fa
- sk
- id
- nb
- el
- nl
- hu
- eu
- zh
- eo
- ja
- ca
- cs
- bg
- fi
- pt
- tr
- ro
- ar
- uk
- gl
- fr
- ko
task_categories:
- conversational
license: llama2
datasets:
- Photolens/oasst1-langchain-llama-2-formatted
---
## Model Overview

Model license: Llama-2
This model is based on the [NousResearch/Llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf) model, QLoRA-finetuned on the [Photolens/oasst1-langchain-llama-2-formatted](https://huggingface.co/datasets/Photolens/oasst1-langchain-llama-2-formatted) dataset.
## Prompt Template: Llama-2
```
[INST] Prompter Message [/INST] Assistant Message
```

## Intended Use

The dataset used to finetune the base model is optimized for langchain applications, so this model is intended to be used as a langchain LLM.
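Below is a minimal, hedged usage sketch (not part of the original card): it loads this model with `transformers`, applies the Llama-2 `[INST]` prompt template shown above, and optionally wraps the pipeline for langchain. The example prompt and generation settings are illustrative assumptions.

```python
# Minimal usage sketch; the model id comes from this card, generation settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "Photolens/llama-2-7b-langchain-chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the accelerate package; drop it to load on CPU.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a text-generation pipeline and format the prompt with the Llama-2 template.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=256)
prompt = "[INST] What is LangChain used for? [/INST]"
print(pipe(prompt)[0]["generated_text"])

# Optional: expose the pipeline as a langchain LLM. The import path varies by
# langchain version (e.g. langchain_community.llms in newer releases).
# from langchain.llms import HuggingFacePipeline
# llm = HuggingFacePipeline(pipeline=pipe)
```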
## Training Details

This model took `1:14:16` to train with QLoRA on a single `A100 40GB` GPU.
- *epochs*: `1`
- *train batch size*: `8`
- *eval batch size*: `8`
- *gradient accumulation steps*: `1`
- *maximum gradient norm*: `0.3`
- *learning rate*: `2e-4`
- *weight decay*: `0.001`
- *optimizer*: `paged_adamw_32bit`
- *learning rate schedule*: `cosine`
- *warmup ratio (linear)*: `0.03`

A hedged configuration sketch reproducing these settings is shown after the table below.

## Models in this series

| Model | Train time | Size (in params) | Base Model |
|---|---|---|---|
| [llama-2-7b-langchain-chat](https://huggingface.co/Photolens/llama-2-7b-langchain-chat/) | 1:14:16 | 7 billion | [NousResearch/Llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf) |
| [llama-2-13b-langchain-chat](https://huggingface.co/Photolens/llama-2-13b-langchain-chat/) | 2:50:27 | 13 billion | [TheBloke/Llama-2-13B-Chat-fp16](https://huggingface.co/TheBloke/Llama-2-13B-Chat-fp16) |
| [Photolens/OpenOrcaxOpenChat-2-13b-langchain-chat](https://huggingface.co/Photolens/OpenOrcaxOpenChat-2-13b-langchain-chat/) | 2:56:54 | 13 billion | [Open-Orca/OpenOrcaxOpenChat-Preview2-13B](https://huggingface.co/Open-Orca/OpenOrcaxOpenChat-Preview2-13B) |
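The following is a hedged configuration sketch, not the original training script: it maps the hyperparameters listed under "Training Details" onto `transformers`' `TrainingArguments`, alongside typical QLoRA settings (`BitsAndBytesConfig`, `LoraConfig`). The 4-bit quantization type and the LoRA rank/alpha/dropout are not stated in this card and are assumptions.

```python
# Hedged QLoRA configuration sketch; only the TrainingArguments values are taken
# from this card, the quantization and LoRA settings are assumptions.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# Typical 4-bit QLoRA quantization (quant type / compute dtype not stated in the card).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter settings (rank/alpha/dropout are illustrative assumptions).
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)

# Hyperparameters reported in the "Training Details" section above.
training_args = TrainingArguments(
    output_dir="./llama-2-7b-langchain-chat-qlora",
    num_train_epochs=1,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=1,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
)
```

These objects would then be passed, together with the dataset linked above, to a supervised finetuning trainer such as `trl`'s `SFTTrainer`.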