Awan LLM
committed
Update README.md
README.md CHANGED
@@ -14,7 +14,7 @@ Training:
 - 4096 sequence length, while the base model is 8192 sequence length. From testing, it still handles the full 8192 context just fine.
 - Trained on a modified and improved version of Cognitive Computations (Eric Hartford)'s Dolphin dataset: https://huggingface.co/datasets/cognitivecomputations/dolphin
 - Training duration is around 1 day on 2x RTX 3090 on our own machine, using 4-bit loading and QLoRA (rank 64, alpha 128), resulting in ~2% trainable weights.
-
+
 
 The goal for this model is to be less censored and great at general tasks, like the previous Dolphin-based models by Eric Hartford.
 
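
For reference, below is a minimal sketch of a QLoRA setup matching the hyperparameters listed in the card above (4-bit loading, LoRA rank 64, alpha 128, ~2% trainable weights). This is not the authors' actual training script: the base model name, target modules, and dropout value are assumptions for illustration, using Hugging Face Transformers, PEFT, and bitsandbytes.

```python
# Hedged sketch of the QLoRA configuration described in the card.
# Assumed: base model name, target modules, dropout; stated in the card: 4-bit
# loading, rank 64, alpha 128, 4096-token training sequences.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Meta-Llama-3-8B"  # assumption; the card's base model may differ
max_seq_length = 4096                      # training sequence length stated in the card

# 4-bit quantized loading ("4-bit loading" in the card)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# QLoRA adapter: "64-rank 128-alpha" from the card
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,  # assumption; not stated in the card
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # should report roughly ~2% trainable weights
```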