runningSnail committed on
Commit 3796393
1 Parent(s): 969d794

add MMLU results

Files changed (1): README.md (+38, −0)
@@ -95,6 +95,44 @@ print(f'Elapsed time: {end - start:.2f}s')
  This model was trained on commercially viable data. For use of our model, refer to the [license information](https://www.nexa4ai.com/licenses).

+ ## Performance
+ ### Model Selection
+ We leverage the latest large language models across a variety of domains. Below is a summary of the model chosen for each category. Where no specialized model exists for a subject, we fall back to a generic model such as Llama3-8b.
+
+ | **Model** | **Category** | **Subjects** |
+ |---|---|---|
+ | `jondurbin/bagel-8b-v1.0` | Biology | `college_biology`, `high_school_biology` |
+ | `Weyaxi/Einstein-v6.1-Llama3-8B` | Physics | `astronomy`, `college_physics`, `conceptual_physics`, `high_school_physics` |
+ | `meta-llama/Meta-Llama-3-8B-Instruct` | Business | `business_ethics`, `management`, `marketing` |
+ | `meta-llama/Meta-Llama-3-8B-Instruct` | Chemistry | `college_chemistry`, `high_school_chemistry` |
+ | `abacusai/Llama-3-Smaug-8B` | Computer Science | `college_computer_science`, `computer_security`, `high_school_computer_science`, `machine_learning` |
+ | `Open-Orca/Mistral-7B-OpenOrca` | Math | `abstract_algebra`, `college_mathematics`, `elementary_mathematics`, `high_school_mathematics`, `high_school_statistics` |
+ | `meta-llama/Meta-Llama-3-8B-Instruct` | Economics | `econometrics`, `high_school_macroeconomics`, `high_school_microeconomics` |
+ | `AdaptLLM/medicine-chat` | Health | `anatomy`, `clinical_knowledge`, `college_medicine`, `human_aging`, `medical_genetics`, `nutrition`, `professional_medicine`, `virology` |
+ | `STEM-AI-mtl/phi-2-electrical-engineering` | Engineering | `electrical_engineering` |
+ | `meta-llama/Meta-Llama-3-8B-Instruct` | Philosophy | `formal_logic`, `logical_fallacies`, `moral_disputes`, `moral_scenarios`, `philosophy`, `world_religions` |
+ | `microsoft/Phi-3-mini-128k-instruct` | Other | `global_facts`, `miscellaneous`, `professional_accounting` |
+ | `meta-llama/Meta-Llama-3-8B-Instruct` | History | `high_school_european_history`, `high_school_us_history`, `high_school_world_history`, `prehistory` |
+ | `meta-llama/Meta-Llama-3-8B-Instruct` | Culture | `human_sexuality`, `sociology` |
+ | `AdaptLLM/law-chat` | Law | `international_law`, `jurisprudence`, `professional_law` |
+ | `meta-llama/Meta-Llama-3-8B-Instruct` | Psychology | `high_school_psychology`, `professional_psychology` |
+
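The routing in the table above amounts to a subject-to-checkpoint lookup with a generic fallback. A minimal sketch, assuming a plain dictionary mapping (the `route` helper and `FALLBACK_MODEL` constant are illustrative, not part of the release; only a few subjects are shown, the rest follow the table):

```python
# Sketch of the subject-to-model routing described in the table above.
# Checkpoint names come from the table; the route() helper and the
# fallback behavior are illustrative assumptions.

SUBJECT_TO_MODEL = {
    "college_biology": "jondurbin/bagel-8b-v1.0",
    "high_school_biology": "jondurbin/bagel-8b-v1.0",
    "astronomy": "Weyaxi/Einstein-v6.1-Llama3-8B",
    "college_physics": "Weyaxi/Einstein-v6.1-Llama3-8B",
    "electrical_engineering": "STEM-AI-mtl/phi-2-electrical-engineering",
    "anatomy": "AdaptLLM/medicine-chat",
    "international_law": "AdaptLLM/law-chat",
    # ... remaining subjects follow the table above
}

# Generic model used when no specialized checkpoint covers the subject.
FALLBACK_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"

def route(subject: str) -> str:
    """Return the checkpoint name used for a given MMLU subject."""
    return SUBJECT_TO_MODEL.get(subject, FALLBACK_MODEL)
```

For example, `route("anatomy")` resolves to the medicine checkpoint, while an uncovered subject such as `marketing` falls through to the generic model.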
+ ### MMLU Benchmark Results (5-shot learning)
+ Comparative MMLU scores for various models tested under a 5-shot learning setup:
+
+ | **Model** | **MMLU Score** |
+ |---|---|
+ | Octopus-V4 | **74.6%** |
+ | GPT-3.5 | 70.0% |
+ | Llama3-8b-instruct | 68.4% |
+ | Phi-3-mini-128k-instruct | 68.1% |
+ | Gemma-7b | 64.3% |
+ | Gemma-2b | 42.3% |
+ | OpenELM-3B | 26.7% |
+
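A 5-shot MMLU prompt is conventionally assembled from five solved examples of a subject's dev split followed by the unanswered test question. A minimal sketch of that prompt construction, assuming the common "A./B./C./D. ... Answer:" formatting (this is the standard convention, not necessarily the exact harness behind the scores above):

```python
# Sketch of 5-shot MMLU prompt assembly: five answered dev examples,
# then the test question with the answer left blank for the model to
# complete. The exact template is an assumption.

CHOICES = "ABCD"

def format_example(question, options, answer=None):
    """Render one multiple-choice item; omit the answer for the test item."""
    body = [question] + [f"{CHOICES[i]}. {opt}" for i, opt in enumerate(options)]
    body.append("Answer:" + (f" {answer}" if answer else ""))
    return "\n".join(body)

def build_5shot_prompt(dev_examples, test_question, test_options):
    """dev_examples: list of five (question, options, answer) tuples."""
    shots = [format_example(q, opts, ans) for q, opts, ans in dev_examples[:5]]
    shots.append(format_example(test_question, test_options))
    return "\n\n".join(shots)
```

The model's next-token continuation after the final `Answer:` (or the highest-likelihood choice letter) is then scored against the gold answer.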
+
  ## References
  We thank the Microsoft team for their amazing model!
  ```