SicariusSicariiStuff
committed on
Commit f3a522c • 1 Parent(s): 4445052
Update README.md
README.md CHANGED
@@ -18,7 +18,7 @@ language:
 
 <details>
 <summary><b>July 26th, 2024, moving on to LLAMA 3.1</b></summary>
 
-One step forward, one step backward. Many issues were solved, but a few new ones were encountered. As I already updated in my "blog" (https://huggingface.co/SicariusSicariiStuff/Blog_And_Updates#july-26th-2024), I originally wanted to finetune
+One step forward, one step backward. Many issues were solved, but a few new ones were encountered. As I already updated in my "blog" (https://huggingface.co/SicariusSicariiStuff/Blog_And_Updates#july-26th-2024), I originally wanted to finetune Gradient's 0.25M/1M/4M LLAMA3 8B model, but almost at the same time as I concluded that the model is really not that great even at 8k context, Zuck the CHAD dropped LLAMA 3.1.
 
 LLAMA 3.1 is **128k context**, which probably means that in practice it will be somewhat coherent at 32k context, as a guesstimate. Also, I've heard from several people who have done some early tests that the new LLAMA 3.1 8B is even better than the new Mistral Nemo 12B. IDK if that's true, but overall LLAMA 3.1 does seem to be a much better version of the "regular" LLAMA 3.
 