Update README.md
README.md
@@ -9,9 +9,24 @@ tags:
# Aegolius Acadicus 30B

![img](./aegolius-acadicus.png)

I like to call this model "The little professor". It is simply an MoE merge of LoRA-merged models across Llama 2 and Mistral. I am using it as a test case for moving to larger models and getting my gate discrimination set correctly. This model is best suited for knowledge-related use cases; I did not give it a specific workload target as I did with some of the other models in the "Owl Series".

In this particular run I am starting to collapse data sets and model count to see whether that helps or hurts.

This model is merged from the following sources:

* [Fine Tuned Mistral of Mine](https://huggingface.co/ibivibiv/temp_tuned_mistral)
* [WestLake-7B-v2-laser](https://huggingface.co/cognitivecomputations/WestLake-7B-v2-laser)
* [openchat-nectar-0.5](https://huggingface.co/andysalerno/openchat-nectar-0.5)
* [WestSeverus-7B-DPO](https://huggingface.co/PetroGPT/WestSeverus-7B-DPO)

Unless those source models are "contaminated", this one is not. This is a proof-of-concept version of the series; you can find others where I am tuning my own models and using mergekit's MoE merge to combine them into MoE models that I can run on lower-tier hardware with better results.

The goal here is to create specialized models that can collaborate and run as one model, as sketched below.
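For intuition, here is a minimal sketch of the Mixtral-style top-2 routing that an MoE merge of Mistral-based experts produces: a small gate scores each token against the experts and blends the two best matches. This is illustrative only; the layer sizes and the expert count of four are assumptions for the sketch, not this model's actual configuration.

```python
# Toy sketch of top-2 MoE routing (Mixtral-style). Sizes and expert count
# are illustrative assumptions, not taken from this model's config.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopTwoMoELayer(nn.Module):
    def __init__(self, hidden_size: int = 512, ffn_size: int = 1024, num_experts: int = 4):
        super().__init__()
        # One feed-forward "expert" per merged source model.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, ffn_size),
                nn.SiLU(),
                nn.Linear(ffn_size, hidden_size),
            )
            for _ in range(num_experts)
        )
        # The gate scores every token against every expert.
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size)
        logits = self.gate(hidden_states)             # (batch, seq, num_experts)
        weights, chosen = logits.topk(2, dim=-1)      # best two experts per token
        weights = F.softmax(weights, dim=-1)          # normalize the two scores
        output = torch.zeros_like(hidden_states)
        for slot in range(2):
            for idx, expert in enumerate(self.experts):
                mask = chosen[..., slot] == idx       # tokens routed to this expert
                if mask.any():
                    scale = weights[..., slot][mask].unsqueeze(-1)
                    output[mask] += scale * expert(hidden_states[mask])
        return output


# Route a dummy batch of 8 token embeddings through the layer.
layer = TopTwoMoELayer()
tokens = torch.randn(1, 8, 512)
print(layer(tokens).shape)  # torch.Size([1, 8, 512])
```

In the merged model the experts come from the source models listed above and the gate is initialized by the merge tooling, so how well that gate discriminates between experts is roughly the "gate discrimination" referred to above.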
# Prompting
@@ -48,12 +63,11 @@ print(text)
* **Library**: [HuggingFace Transformers](https://github.com/huggingface/transformers)
* **Model type:** **aegolius-acadicus-30b** is an auto-regressive mixture-of-experts (MoE) language model built from Llama 2 and Mistral transformer architecture models.
* **Language(s)**: English
* **Purpose**: This model is an attempt at an MoE model that covers multiple disciplines, using fine-tuned Llama 2 and Mistral models as the base models.
# Benchmark Scores
Coming soon.
## Citations