---
language:
- en
license: apache-2.0
tags:
- open-source
- code
- math
- chemistry
- biology
- text-generation
- question-answering
pipeline_tag: text-generation
---
# OpenCerebrum-2.0-7B
OpenCerebrum-2.0-7B is an open-source language model fine-tuned from the alpindale/Mistral-7B-v0.2-hf base model on a diverse dataset, with the aim of replicating the capabilities of Aether Research's proprietary Cerebrum model.
The model was fine-tuned with SFT and DPO on approximately 7,000 examples across 15 data sources spanning coding, math, science, multi-turn conversation, RAG, reasoning, and general instruction-following. The goal was to assemble public datasets that could help the model achieve strong performance on benchmarks where Cerebrum excels.
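For readers unfamiliar with the second stage, the sketch below shows roughly what DPO over preference pairs looks like with Hugging Face TRL. It is illustrative only, not the actual OpenCerebrum training code: the data file, hyperparameters, and keyword arguments are assumptions, and argument names differ across trl versions.

```python
# Illustrative sketch of a DPO stage with Hugging Face TRL.
# NOT the actual OpenCerebrum training code; file names and hyperparameters are assumed.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "alpindale/Mistral-7B-v0.2-hf"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Preference data needs "prompt", "chosen", and "rejected" columns.
train_dataset = load_dataset("json", data_files="preference_pairs.jsonl", split="train")  # hypothetical file

args = DPOConfig(
    output_dir="opencerebrum-dpo",
    beta=0.1,                        # strength of the preference penalty (assumed value)
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-7,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,                  # trl builds the frozen reference from the policy
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,      # older trl releases use tokenizer= here
)
trainer.train()
```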
## Model Details
- **Base Model:** alpindale/Mistral-7B-v0.2-hf
- **Parameters:** 7 billion
- **Fine-Tuning Dataset Size:** ~7,000 examples
- **Fine-Tuning Data:** Curated in-house at Cognitive Computations from 15 different data sources used for SFT and DPO.
- **Language:** English
- **License:** Apache 2.0
## Quants
### EXL2 [@bartowski](https://huggingface.co/bartowski/)
- https://huggingface.co/bartowski/OpenCerebrum-2.0-7B-exl2
### GGUF [@bartowski](https://huggingface.co/bartowski/)
- https://huggingface.co/bartowski/OpenCerebrum-2.0-7B-GGUF
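
For local inference, the sketch below pulls one of the GGUF quants with llama-cpp-python. The quant filename pattern and context length are assumptions; check the repository above for the files it actually ships and pick a quant that fits your hardware.

```python
# Illustrative sketch: running a GGUF quant locally with llama-cpp-python.
from llama_cpp import Llama  # pip install llama-cpp-python huggingface_hub

llm = Llama.from_pretrained(
    repo_id="bartowski/OpenCerebrum-2.0-7B-GGUF",
    filename="*Q4_K_M.gguf",   # assumed quant level; substitute any file the repo provides
    n_ctx=4096,                # assumed context window
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain Bayes' theorem in two sentences."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```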
## Intended Use
OpenCerebrum-2.0-7B is intended to be a powerful open-source model for coding, math, science, and general question-answering and text generation tasks. Its diverse fine-tuning data aims to equip it with broad knowledge and reasoning capabilities.
However, as an open-source replica trained on a smaller, publicly assembled dataset than the original Cerebrum, it may not match Cerebrum's full performance. Biases and limitations present in the fine-tuning data may also be reflected in the model's outputs.
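
A minimal usage sketch with Hugging Face Transformers is shown below. The repository id and sampling settings are assumptions, and the prompt formatting relies on whatever chat template the tokenizer ships; adjust both to the checkpoint you actually download.

```python
# Illustrative usage sketch; repository id and generation settings are assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Locutusque/OpenCerebrum-2.0-7B"   # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is the derivative of x**3 * ln(x)?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.95
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```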
## Limitations and Biases
- The model may have biases and limitations inherited from its fine-tuning datasets. Thorough testing is needed to characterize these.
- As the model is based on a 7B parameter model, it has computational and memory constraints compared to larger models.
## Evaluations
| Tasks |Version|Filter|n-shot|Metric|Value | |Stderr|
|--------------|------:|------|-----:|------|-----:|---|-----:|
|truthfulqa_mc2| 2|none | 0|acc |0.5182|± |0.0152|
|ai2_arc |N/A |none | 0|acc |0.7060|± |0.0073|
| | |none | 0|acc_norm|0.7049|± |0.0074|
| - arc_challenge | 1|none | 0|acc |0.5000|± |0.0146|
| | |none | 0|acc_norm|0.5299|± |0.0146|
| - arc_easy | 1|none | 0|acc |0.8077|± |0.0081|
| | |none | 0|acc_norm|0.7912|± |0.0083|
|agieval_nous |N/A |none | 0|acc |0.3778|± |0.0093|
| | |none | 0|acc_norm|0.3574|± |0.0093|
| - agieval_aqua_rat | 1|none | 0|acc |0.2402|± |0.0269|
| | |none | 0|acc_norm|0.2205|± |0.0261|
| - agieval_logiqa_en | 1|none | 0|acc |0.3164|± |0.0182|
| | |none | 0|acc_norm|0.3656|± |0.0189|
| - agieval_lsat_ar | 1|none | 0|acc |0.2130|± |0.0271|
| | |none | 0|acc_norm|0.1913|± |0.0260|
| - agieval_lsat_lr | 1|none | 0|acc |0.4078|± |0.0218|
| | |none | 0|acc_norm|0.3647|± |0.0213|
| - agieval_lsat_rc | 1|none | 0|acc |0.4981|± |0.0305|
| | |none | 0|acc_norm|0.4498|± |0.0304|
| - agieval_sat_en | 1|none | 0|acc |0.6650|± |0.0330|
| | |none | 0|acc_norm|0.5922|± |0.0343|
| - agieval_sat_en_without_passage| 1|none | 0|acc |0.4612|± |0.0348|
| | |none | 0|acc_norm|0.3932|± |0.0341|
| - agieval_sat_math | 1|none | 0|acc |0.3273|± |0.0317|
| | |none | 0|acc_norm|0.2818|± |0.0304|
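
These scores follow the lm-evaluation-harness output format ("Value" is the metric value and "Stderr" its standard error). A hedged sketch for re-running a comparable evaluation is below; the repository id and the availability of these task groups depend on the harness version installed, so treat both as assumptions.

```python
# Illustrative sketch for re-running the benchmarks with EleutherAI's lm-evaluation-harness.
import lm_eval  # pip install lm-eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Locutusque/OpenCerebrum-2.0-7B,dtype=bfloat16",  # assumed repo id
    tasks=["truthfulqa_mc2", "ai2_arc", "agieval_nous"],  # task groups as reported above
    num_fewshot=0,
    batch_size="auto",
)

# Per-task metrics (acc, acc_norm, stderr) live under results["results"].
for task, metrics in results["results"].items():
    print(task, metrics)
```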