	---
	language:
	- en
	- es
	- ru
	- de
	- pl
	- th
	- vi
	- sv
	- bn
	- da
	- he
	- it
	- fa
	- sk
	- id
	- nb
	- el
	- nl
	- hu
	- eu
	- zh
	- eo
	- ja
	- ca
	- cs
	- bg
	- fi
	- pt
	- tr
	- ro
	- ar
	- uk
	- gl
	- fr
	- ko
	task_categories:
	- conversational
	license: llama2
	datasets:
	- Photolens/oasst1-langchain-llama-2-formatted
	---

	## Model Overview
	Model license: Llama-2<br>
	This model is based on [NousResearch/Llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf), fine-tuned with QLoRA on the [Photolens/oasst1-langchain-llama-2-formatted](https://huggingface.co/datasets/Photolens/oasst1-langchain-llama-2-formatted) dataset.<br>

	## Prompt Template: Llama-2
	```
	<s>[INST] Prompter Message [/INST] Assistant Message </s>
	```
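
As a minimal sketch of how the template above could be filled in from Python: the helper name `format_turn` is hypothetical and not part of this repository. Omitting the assistant message to leave the prompt open for generation is an assumption, as is stripping surrounding whitespace from each message.

```python
from typing import Optional


def format_turn(prompter_message: str, assistant_message: Optional[str] = None) -> str:
    """Fill the Llama-2 template with one prompter/assistant exchange.

    If assistant_message is None, the prompt stops after [/INST] so the
    model can generate the reply (an assumption, not stated in the card).
    """
    prompt = f"<s>[INST] {prompter_message.strip()} [/INST]"
    if assistant_message is not None:
        # A completed turn, as it would appear in training data
        prompt += f" {assistant_message.strip()} </s>"
    return prompt


if __name__ == "__main__":
    print(format_turn("What is LangChain?"))
    print(format_turn("What is LangChain?", "A framework for building LLM applications."))
```

Many tokenizers prepend `<s>` on their own, so depending on the inference stack the leading token may need to be dropped to avoid a duplicate.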

	## Intended Use
	The dataset used to fine-tune the base model is optimized for LangChain applications.<br>
	This model is therefore intended for use as the LLM in a LangChain application.

	## Training Details
	Training with QLoRA took `1:14:16` on a single `A100 40GB` GPU.<br>
	- epochs: `1`
	- train batch size: `8`
	- eval batch size: `8`
	- gradient accumulation steps: `1`
	- maximum gradient norm: `0.3`
	- learning rate: `2e-4`
	- weight decay: `0.001`
	- optimizer: `paged_adamw_32bit`
	- learning rate schedule: `cosine`
	- warmup ratio (linear): `0.03`
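
For readers who want to reproduce a similar run, the values above can be written as keyword arguments. This is only a sketch: the card does not say which training framework was used, so mapping the values onto Hugging Face `TrainingArguments` parameter names is an assumption, and `effective_batch_size` is a hypothetical helper.

```python
# Hyperparameters from the list above, keyed by Hugging Face
# TrainingArguments parameter names (assumed mapping, not confirmed by the card).
training_kwargs = {
    "num_train_epochs": 1,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "gradient_accumulation_steps": 1,
    "max_grad_norm": 0.3,
    "learning_rate": 2e-4,
    "weight_decay": 0.001,
    "optim": "paged_adamw_32bit",
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.03,
}


def effective_batch_size(kwargs: dict, num_gpus: int = 1) -> int:
    """Samples consumed per optimizer step across all devices."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_gpus
    )


if __name__ == "__main__":
    # Single A100, as stated in the card
    print(effective_batch_size(training_kwargs, num_gpus=1))
```

With `transformers` installed, the dictionary could be passed as `TrainingArguments(output_dir="out", **training_kwargs)`, where `output_dir` is a placeholder path.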

	## Models in this series
	| Model | Training time | Size (parameters) | Base model |
	|---|---|---|---|
	| [llama-2-7b-langchain-chat](https://huggingface.co/Photolens/llama-2-7b-langchain-chat/) | 1:14:16 | 7 billion | [NousResearch/Llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf) |
	| [llama-2-13b-langchain-chat](https://huggingface.co/Photolens/llama-2-13b-langchain-chat/) | 2:50:27 | 13 billion | [TheBloke/Llama-2-13B-Chat-fp16](https://huggingface.co/TheBloke/Llama-2-13B-Chat-fp16) |
	| [Photolens/OpenOrcaxOpenChat-2-13b-langchain-chat](https://huggingface.co/Photolens/OpenOrcaxOpenChat-2-13b-langchain-chat/) | 2:56:54 | 13 billion | [Open-Orca/OpenOrcaxOpenChat-Preview2-13B](https://huggingface.co/Open-Orca/OpenOrcaxOpenChat-Preview2-13B) |