---
license: llama2
datasets:
- Photolens/oasst1-langchain-llama-2-formatted
---
## Model Overview
Model license: Llama-2<br>
This model is trained based on [NousResearch/Llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf).
## Intended Use
The dataset used to finetune the base model is optimized for LangChain applications,<br>
so this model is intended for use as a LangChain LLM.
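As a sketch of how a prompt would be assembled for this model inside a LangChain application (the Llama-2 `[INST]`/`<<SYS>>` chat template is assumed from the base model; `build_prompt` is a hypothetical helper, not part of this repository):

```python
def build_prompt(system: str, user: str) -> str:
    # Llama-2 chat template (assumed from the base model's format;
    # verify against the oasst1-langchain-llama-2-formatted dataset).
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = build_prompt("You are a helpful assistant.", "What is LangChain?")
print(prompt)
```

The formatted string can then be passed to any LangChain LLM wrapper (for example `HuggingFacePipeline`) that serves this model.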
## Training Details
This model took `1:14:16` to train with QLoRA on a single `A100 40gb` GPU.<br>
- *epochs*: `1`
- *train batch size*: `8`
- *eval batch size*: `8`
- *gradient accumulation steps*: `1`
- *maximum gradient norm*: `0.3`
- *learning rate*: `2e-4`
- *weight decay*: `0.001`
- *optimizer*: `paged_adamw_32bit`
- *learning rate schedule*: `cosine`
- *warmup ratio (linear)*: `0.03`
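The hyperparameters above map naturally onto `transformers.TrainingArguments` keyword arguments. A minimal sketch of that mapping, assuming the run used the standard Hugging Face Trainer API (the actual training script is not published, so the argument names are an assumption):

```python
# Hypothetical reconstruction of the QLoRA run's trainer settings as
# `transformers.TrainingArguments` keyword arguments; names follow the
# HF Trainer API, values come from the Training Details list above.
training_kwargs = dict(
    num_train_epochs=1,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=1,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
)

# Effective batch size = per-device batch * accumulation steps
print(training_kwargs["per_device_train_batch_size"]
      * training_kwargs["gradient_accumulation_steps"])  # 8
```

With `gradient_accumulation_steps=1` the effective batch size equals the per-device batch size of 8.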
## Models in this series
| Model | Train time | Size (in params) | Base Model |
|---|---|---|---|
| [llama-2-7b-langchain-chat](https://huggingface.co/Photolens/llama-2-7b-langchain-chat/) | 1:14:16 | 7 billion | [NousResearch/Llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf) |
| [llama-2-13b-langchain-chat](https://huggingface.co/Photolens/llama-2-13b-langchain-chat/) | 2:50:27 | 13 billion | [TheBloke/Llama-2-13B-Chat-fp16](https://huggingface.co/TheBloke/Llama-2-13B-Chat-fp16) |
| [Photolens/OpenOrcaxOpenChat-2-13b-langchain-chat](https://huggingface.co/Photolens/OpenOrcaxOpenChat-2-13b-langchain-chat/) | 2:56:54 | 13 billion | [Open-Orca/OpenOrcaxOpenChat-Preview2-13B](https://huggingface.co/Open-Orca/OpenOrcaxOpenChat-Preview2-13B) |

Model by [Photolens/llama-2-7b-langchain-chat](https://huggingface.co/Photolens/llama-2-7b-langchain-chat), converted to GGUF format.