Awan LLM
committed
Update README.md
README.md CHANGED
@@ -14,7 +14,7 @@ Training:
 - 4096 sequence length, while the base model is 8192 sequence length. From testing, it still handles the full 8192 context just fine.
 - Trained on a modified and improved version of Cognitive Computations (Eric Hartford)'s Dolphin dataset: https://huggingface.co/datasets/cognitivecomputations/dolphin
 - Training duration is around 1 day on 2x RTX 3090 on our own machine, using 4-bit loading and QLoRA (rank 64, alpha 128), resulting in ~2% trainable weights.
-
+
 
 The goal for this model is to be less censored and great at general tasks, like the previous Dolphin-based models by Eric Hartford.
 
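
For reference, below is a minimal sketch of a QLoRA setup matching the hyperparameters listed in the card above (4-bit loading, LoRA rank 64, alpha 128, ~2% trainable weights). This is not the authors' actual training script: the base model name, target modules, and dropout value are assumptions for illustration, using Hugging Face Transformers, PEFT, and bitsandbytes.

```python
# Hedged sketch of the QLoRA configuration described in the card.
# Assumed: base model name, target modules, dropout; stated in the card: 4-bit
# loading, rank 64, alpha 128, 4096-token training sequences.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Meta-Llama-3-8B"  # assumption; the card's base model may differ
max_seq_length = 4096                      # training sequence length stated in the card

# 4-bit quantized loading ("4-bit loading" in the card)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# QLoRA adapter: "64-rank 128-alpha" from the card
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,  # assumption; not stated in the card
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # should report roughly ~2% trainable weights
```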