draganjovanovich committed
Commit • 9b63a5f
Parent(s): 1887ca1

Upload folder using huggingface_hub

Browse files:
- .gitattributes +4 -0
- README.md +66 -0
- prodigy-sm-base-v0.1-Q4_K_M.gguf +3 -0
- prodigy-sm-base-v0.1-Q5_K_M.gguf +3 -0
- prodigy-sm-base-v0.1-Q8_K_M.gguf +3 -0
- prodigy-sm-base-v0.1.gguf +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+prodigy-sm-base-v0.1-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+prodigy-sm-base-v0.1-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+prodigy-sm-base-v0.1-Q8_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+prodigy-sm-base-v0.1.gguf filter=lfs diff=lfs merge=lfs -text
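For context, entries like the four added above are what `git lfs track` writes into `.gitattributes`; a hedged illustration only, since the upload here was done via huggingface_hub, which manages these entries itself:

```sh
# Appends "prodigy-sm-base-v0.1-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text"
# to .gitattributes so the file is stored via Git LFS.
git lfs track "prodigy-sm-base-v0.1-Q4_K_M.gguf"
```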
README.md
ADDED
@@ -0,0 +1,66 @@
---
license: apache-2.0
language:
- en
- sr
- hr
- bs
---
# Prodigy SM Base v0.1

<img src="https://cdn-uploads.huggingface.co/production/uploads/617bbeec14572ebe9e6ea83f/4p2zaOWu6kTS3fcbevHef.png" width="70%" height="70%">

In our latest endeavour, we performed continued pre-training of a large language model (Mistral-7B-v0.1) to understand and generate text in new languages, including **Serbian**, **Bosnian**, and **Croatian**, using an innovative approach.

Rather than depending only on extensive datasets in the target language, our method uses a more compact set of both synthetic and human-curated data, along with a mixture of CC web data, applied in two strategic phases:

1. Establishing a comprehensive demonstration of all grammatical and orthographic rules pertinent to the language.
2. Supplying a diverse array of examples that not only reinforce these rules but also integrate a wide range of linguistic nuances.

While our approach is uniquely tailored to our objectives, we have drawn some inspiration from recent advancements in language model training. Specifically, the conceptual strategies discussed in the paper [Adapting Large Language Models via Reading Comprehension](https://arxiv.org/pdf/2309.09530.pdf) provided valuable insights, though our methods diverge significantly in practice. By adopting this inspired approach, we aim to teach the model new languages efficiently, with a balanced blend of accuracy and linguistic diversity.

So... did it work?!

# **Yes!**
See the benchmark results, or even better, download the model and try it yourself. As you know by now, there's no better benchmark than a quick "try it yourself" vibe check. :)
<img src="https://cdn-uploads.huggingface.co/production/uploads/617bbeec14572ebe9e6ea83f/C9m_OjnYEpQo43VCrwz4A.png" width="100%" height="100%">

Here we demonstrate the results of a benchmark that is not frequently performed, yet is equally important: how adapting the model to a new language affected its original English-only performance.
<img src="https://cdn-uploads.huggingface.co/production/uploads/617bbeec14572ebe9e6ea83f/IPY0myfQI-Ne5x6b11glz.png" width="100%" height="100%">

*All evals are performed in a zero-shot manner.
*Also bear in mind that the llama-2-7b, llama-3-8b, and mistral-7b models compared against Prodigy SM Base are not trained on extensive Serbian-language datasets; these benchmarks demonstrate that primarily English models can be adapted to other languages.
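For reference, a zero-shot run with lm-evaluation-harness looks roughly like the sketch below. The model id and task name are placeholders, not the exact setup used here; the Serbian tasks come from the serbian-llm-eval adaptation credited in the Thanks section.

```python
# Sketch only: zero-shot evaluation via lm-evaluation-harness's Python API.
# "arc_challenge" is a standard English task; the Serbian task names in the
# serbian-llm-eval fork may differ, so treat both ids below as assumptions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mistralai/Mistral-7B-v0.1",  # placeholder model id
    tasks=["arc_challenge"],
    num_fewshot=0,  # zero-shot, matching the reported benchmarks
)
print(results["results"])
```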
So, as you can see, we successfully improved the original model's performance on Serbian-language use cases while retaining, and even slightly improving, its English-language performance.

### Training results
Training results of continued pre-training of [mistral-7b-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1):

<img src="https://cdn-uploads.huggingface.co/production/uploads/617bbeec14572ebe9e6ea83f/5xeJ-vfWk4RhJNC7t5I0g.png" width="70%" height="70%">
<img src="https://cdn-uploads.huggingface.co/production/uploads/617bbeec14572ebe9e6ea83f/R4R8ai8LaN3WlYCOenUyb.png" width="70%" height="70%">

As a last experimental step, we merged the resulting model with **Mistral-7B-v0.1** and two earlier checkpoints of **prodigy-sm-base** using the [Model Stock](https://arxiv.org/abs/2403.19522) method.
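A merge like this can be described with a mergekit config. The sketch below is illustrative only; the checkpoint paths are placeholders, since the intermediate checkpoints were not published.

```yaml
# Illustrative mergekit config for a Model Stock merge; the local
# checkpoint paths are placeholders, not the authors' actual files.
models:
  - model: ./prodigy-sm-base-final   # final continued-pretraining checkpoint
  - model: ./prodigy-sm-base-ckpt-a  # earlier checkpoint (placeholder)
  - model: ./prodigy-sm-base-ckpt-b  # earlier checkpoint (placeholder)
merge_method: model_stock
base_model: mistralai/Mistral-7B-v0.1
dtype: bfloat16
```

Running `mergekit-yaml config.yml ./merged` would produce the merged weights; Model Stock uses the base model as an anchor when averaging the fine-tuned checkpoints.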
43 |
+
|
44 |
+
# Notes
|
45 |
+
As this is base model, there is no chat template or strict chat following capabilities, this model is best candidate for further pre-train on Serbian language as there is a lot more room for improvement (you can hit sweet spot), or next step in the pipeline, such as some form of chat or instruct tuning.
|
46 |
+
|
47 |
+
If you want model that is already instruction tuned we did that too, check **Prodigy SM Instruct v0.1**
|
48 |
+
# Prodigy SM Instruct v0.1
|
49 |
+
๐[prodigy-sm-instruct]() **COMING SOON**
|
50 |
+
|
51 |
+
And stay tuned for:
|
52 |
+
[prodigy-sm-base (llama-3)]() **COMING SOON**
|
53 |
+
[prodigy-sm-instruct (llama-3)]() **COMING SOON**
|
54 |
+
|
55 |
+
๐ข Also we are excited to announce that [iskon.ai](https://Iskon.ai) will soon launch an API platform featuring advanced **Prodigy** series of models, advanced AI tools and much more! ๐
|
56 |
+
|
57 |
+
|
58 |
+
# Thanks
|
59 |
+
- [gordicaleksa/serbian-llm-eval](https://github.com/gordicaleksa/serbian-llm-eval) and his community for curating translations and adaptation of [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
|
60 |
+
that we used to perform benchmarks.
|
61 |
+
- [jondurbin](https://huggingface.co/jondurbin) for amazing airoboros framework
|
62 |
+
- [teknium](https://huggingface.co/teknium) for various insights shared on discord and twitter aka x.com
|
63 |
+
- [Eric](https://twitter.com/erhartford) for various insights shared on discord and twitter aka x.com
|
64 |
+
- [mergekit](https://github.com/arcee-ai/mergekit) for model merging tools
|
65 |
+
|
66 |
+
*Huge thanks to Redmond.ai for generous DGX cloud credits* [redmond.ai]( https://redmond.ai)
|
prodigy-sm-base-v0.1-Q4_K_M.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b1986511afe78b7087fb7a185bc947926b5a0dee84019d1e9f8c513c07439346
size 4368439008

prodigy-sm-base-v0.1-Q5_K_M.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8340e398b03c37f5822c014d006ce0501e20b4d96fbefffe3a4a6aec46dce75e
size 5131409120

prodigy-sm-base-v0.1-Q8_K_M.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f986dae1498f242b7b569d5cf1327be826d1217a681e8f7dd2d9c64c005d614f
size 7695857376

prodigy-sm-base-v0.1.gguf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:eddb08d54eb92cc91cf075d0edf837b167dd8afa9d1d86c5f8456bd466f592c7
size 14484731584
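The three-line stubs above are Git LFS pointer files; the actual GGUF weights resolve through LFS on download. Once a quantized file is fetched, it can be loaded with llama-cpp-python, for example. This is a minimal sketch; the prompt and parameters are illustrative, and since this is a base model it expects plain-text completion rather than chat.

```python
# Minimal sketch: load the Q4_K_M quantization with llama-cpp-python
# (pip install llama-cpp-python). The path, prompt, and sampling settings
# are illustrative, not recommendations from the model authors.
from llama_cpp import Llama

llm = Llama(model_path="prodigy-sm-base-v0.1-Q4_K_M.gguf", n_ctx=4096)
out = llm("Glavni grad Srbije je", max_tokens=32, temperature=0.7)
print(out["choices"][0]["text"])
```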