	---
	license: artistic-2.0
	tags:
	- chemistry
	- biology
	- climate
	- science
	- philosophy
	- nature
	- ecology
	- biomimicry
	- fauna
	- flora
	datasets:
	- Severian/Biomimicry
	- emrgnt-cmplxty/sciphi-textbooks-are-all-you-need
	- fmars/wiki_stem
	- fblgit/tree-of-knowledge
	- Severian/Bio-Design-Process
	metrics:
	- accuracy
	pipeline_tag: text-generation
	model-index:
	- name: ANIMA-Phi-Neptune-Mistral-7B-v4
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 55.46
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Severian/ANIMA-Phi-Neptune-Mistral-7B-v4
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 77.63
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Severian/ANIMA-Phi-Neptune-Mistral-7B-v4
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 53.12
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Severian/ANIMA-Phi-Neptune-Mistral-7B-v4
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 59.01
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Severian/ANIMA-Phi-Neptune-Mistral-7B-v4
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 73.48
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Severian/ANIMA-Phi-Neptune-Mistral-7B-v4
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 14.94
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Severian/ANIMA-Phi-Neptune-Mistral-7B-v4
      name: Open LLM Leaderboard
	---
	# ANIMA-Phi-Neptune-Mistral-7B: Biomimicry Enhanced LLM

	<img src="https://cdn-uploads.huggingface.co/production/uploads/64740cf7485a7c8e1bd51ac9/JZH6p50t_j3-OUph4Wq6y.png" width="500">

	## Overview

	ANIMA (Advanced Nature Inspired Multidisciplinary Assistant) is an expert in various scientific disciplines, including but not limited to biomimicry, biology, and environmental science.

	Instagram: [@anima_llm](https://www.instagram.com/anima_llm)

	---

	## Model Description

	ANIMA is fine-tuned on a rich dataset encompassing:

	- 4,000+ Nature-Biomimicry examples
	- 60k Biomimicry Design Process examples
	- 600k STEM facts from Wikipedia
- The science- and philosophy-focused 'Textbooks-Are-All-You-Need' dataset
- An additional fine-tune on combined Tree of Knowledge + Biomimicry data

	The model aims to assist users in solving problems using nature-inspired strategies and concepts.

	### Special Features

	- Multi-disciplinary Expertise: Knowledge across various scientific and philosophical domains.
- Biomimicry Design Process: Incorporates a dataset generated by Mistral and Minotaur-15B, which was then manually reviewed by a person to improve factuality and grounding.

	---

	- Here is a link to The Bloke's GGUF version: [ANIMA-Phi-Neptune-Mistral-7B-GGUF](https://huggingface.co/TheBloke/ANIMA-Phi-Neptune-Mistral-7B-GGUF)


- ANIMA is also available through Ollama. Download the [Ollama](https://ollama.ai/) app (macOS & Linux), run `ollama pull severian/anima` in your terminal to download the model, then run `ollama run severian/anima` to load it and start chatting.


- You can also download and use the model with [LM Studio](https://lmstudio.ai/) (all operating systems). Download the app, search for 'ANIMA GGUF' in the search bar, and choose from the list of versions.


	- Want to test ANIMA + Ollama and chat right away? Download the model from Ollama and head here to chat with ANIMA right in your browser! [ANIMA - Chat](https://severian42.github.io/ANIMA-Chat/)

	- Have a PDF you want to discuss with ANIMA + Ollama? Head here and you can do just that in your browser, 100% locally. [ANIMA - Locally Chat over your PDF](https://ANIMA-pdf-chat.vercel.app/)

- ANIMA is also hosted on a Hugging Face Space if you'd like to try it there (it may be slow to generate a response, but it still works just fine): [ANIMA - HF Space](https://huggingface.co/spaces/Severian/ANIMA-7B-Biomimicry-LLM)

	- Contact

	If you want to discuss the model/dataset/concept further, have any cool ideas, want to collaborate or if you run into any issues with the model, please don't hesitate to reach out at InterwovenArkitech@Gmail.com

	## Datasets

	- `Severian/Biomimicry` (1st Fine-Tune)
	- `emrgnt-cmplxty/sciphi-textbooks-are-all-you-need` (2nd Fine-Tune)
	- `Severian/Bio-Design-Process` & `fmars/wiki_stem` (3rd Fine-Tune)
	- `fblgit/tree-of-knowledge` & `Severian/Biomimicry` (4th Fine-Tune)
- All previous datasets combined into one large dataset of 730k data points for a quick final fine-tune (5th Fine-Tune)

	## Training Stages

	1. Base Model: `ehartford/dolphin-2.0-mistral-7b`
	2. 1st Fine-Tune: `ANIMA-Echo-Mistral-7B-v1` - 1 Epoch x 5hrs on 1 x T4 Small
3. 2nd Fine-Tune: `ANIMA-Phi-Neptune-Mistral-v1` - 1 Epoch x 1hr on 1 x A100
	4. 3rd Fine-Tune: `ANIMA-Phi-Neptune-Mistral-7B-v2` - 1 Epoch x 1hr on 1 x A100
	5. 4th Fine-Tune: `ANIMA-Phi-Neptune-Mistral-7B-v3` - 1 Epoch x 1hr on 1 x A100
	6. 5th Fine-Tune: `ANIMA-Phi-Neptune-Mistral-7B-v4` - 1 Epoch x 15mins on 1 x A100

	---

	## 📝 Usage

	ANIMA is designed around the Biomimicry Design Process to guide users in solving problems through biomimicry. For more information and examples, please visit [Biomimicry Design Process](https://toolbox.biomimicry.org/methods/process/).
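For readers running ANIMA locally through Ollama, here is a minimal Python sketch of sending a single prompt to the model. It assumes Ollama's default local endpoint (`http://localhost:11434/api/generate`) and the `severian/anima` tag from the `ollama pull` command in this README. The helper names `build_request` and `ask_anima` are hypothetical and not part of any ANIMA tooling.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model tag taken from the `ollama pull severian/anima` command in this README.
MODEL_TAG = "severian/anima"


def build_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": MODEL_TAG, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_anima(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the Ollama app running and the model pulled, `print(ask_anima("How can biomimicry help in water purification?"))` should print a single completed answer. Generation on CPU can be slow, which is why the sketch uses a generous default timeout.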

	## 🏁 Benchmarks

	- Average - 62.22
	- ARC - 56.83
	- HellaSwag - 78.82
	- MMLU - 53.84
	- TruthfulQA - 59.40

	---

	- EXAMPLE OUTPUT

	[INST] How can biomimicry help in water purification? [/INST]

	Biomimicry plays a significant role in water purification by taking
	inspiration from natural processes and organisms to develop sustainable
	and efficient solutions. Here's a comprehensive overview of how biomimicry
	can help in water purification:

	1. Microbial filtration:
	- Bacteriophages (viruses that infect bacteria) can effectively target
	contaminants in water systems, such as Escherichia coli and Staphylococcus
	aureus. By mimicking the natural defense mechanisms of bacteriophages, we
	can develop efficient filtration systems that remove pollutants from water
	sources.
	- Some fungi, such as Trametes versicolor, can break down organic
	pollutants in water using enzymes. This process helps make contaminated
	water safe for consumption. By studying and replicating these natural
	processes, we can develop sustainable filtration systems that remove
	pollutants from water sources effectively.

	2. Biofilm-based filtration:
	- Microorganisms form biofilms on surfaces to protect themselves from
	environmental threats like predators or contaminants. By mimicking the
	natural formation of biofilms, we can develop sustainable filtration
	systems that use bacteria or fungi to remove pollutants from water
	sources.
	- For example, a study has shown how Pseudomonas putida could
	be used to biodegrade petroleum hydrocarbons in contaminated water sources
	using a biofilm-based system. This method offers an eco-friendly
	alternative to conventional chemical treatments for oil spills.

	3. Wetland-based filtration:
	- Wetlands act as natural filters for water by retaining sediments and
	organic matter, which helps remove contaminants. By mimicking the natural
	processes of a wetland, we can design effective water purification systems
	that use sand filters or engineered wetlands to remove pollutants. For
	instance, a study has shown how reed beds can be used for water filtration
	by using them as a natural treatment system for agricultural wastewater.

	4. Sponge filtration:
	- Some marine sponges (e.g., sea sponge) have evolved sophisticated
	mechanisms to filter out particles and contaminants from the surrounding
	water. By studying these natural filtration processes, we can develop
	efficient and sustainable filtration systems that remove pollutants
	effectively. For example, a study has shown how sea sponges can be used as
	a model for designing a low-cost, self-sustaining water purification
	system that uses solar energy to disinfect contaminated water.

	5. Biodegradable materials:
	- Natural organisms have evolved biodegradable materials to protect
	themselves from predators or environmental threats. For instance, some
	insects produce natural pesticides (such as alkaloids) to defend
	themselves against predators. Similarly, we can develop biodegradable
	materials that release these natural pesticides in contaminated water
	sources, killing off contaminants and restoring the environment's health.
	- For example, a study has shown how bacterial cells can be used as a
	natural filtration system to remove contaminants from water sources by
	incorporating them into membranes that selectively remove organic
	pollutants.

	In conclusion, biomimicry plays a vital role in water purification by
	using natural processes and organisms as inspiration for developing
	sustainable and efficient solutions. By studying the natural filtration
	mechanisms of microorganisms, plants, animals, and other living organisms,
	we can design eco-friendly and cost-effective water purification systems
	that help protect our environment from contamination.
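The example above wraps the user question in `[INST] ... [/INST]` markers. When calling the model through an API that does not apply a prompt template, the same wrapper could be built as in the sketch below. The function name `format_prompt` is hypothetical, and the format is inferred only from the example shown here, so other templates may also work.

```python
def format_prompt(user_message: str) -> str:
    """Wrap a single user message in the [INST] markers used in the example."""
    # Strip stray whitespace so the markers are separated by exactly one space.
    return f"[INST] {user_message.strip()} [/INST]"


prompt = format_prompt("How can biomimicry help in water purification?")
print(prompt)
```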

	---

	### *[Help Support ANIMA](https://www.buymeacoffee.com/anima7b)*

	---

	- KNOWN ISSUES

	The model will sometimes respond to itself and continue the conversation taking both the user and AI roles. This is a known issue in the Mistral model but does not happen very often.
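One possible client-side workaround, offered as an assumption rather than anything the model itself provides, is to cut the generated text at the first instruction marker that signals a new turn. The helper `truncate_at_next_turn` below is a hypothetical sketch that assumes the `[INST]` / `[/INST]` markers from the example output.

```python
def truncate_at_next_turn(generated: str, markers=("[INST]", "[/INST]")) -> str:
    """Return the text before the first turn marker, or all of it if none appears."""
    cut = len(generated)
    for marker in markers:
        index = generated.find(marker)
        if index != -1:
            cut = min(cut, index)
    # Drop trailing whitespace left behind by the cut.
    return generated[:cut].rstrip()


raw = "Wetlands act as natural filters.\n[INST] What about sponges? [/INST] Sponges..."
print(truncate_at_next_turn(raw))
```

Many inference runtimes also accept stop sequences, which would end generation at the marker instead of trimming the text afterwards.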

	## 📜 License

This project is licensed under Artistic-2.0.

This model is for research purposes only and restricted from any commercial use.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Severian__ANIMA-Phi-Neptune-Mistral-7B-v4)

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 55.61 |
| AI2 Reasoning Challenge (25-Shot) | 55.46 |
| HellaSwag (10-Shot)               | 77.63 |
| MMLU (5-Shot)                     | 53.12 |
| TruthfulQA (0-shot)               | 59.01 |
| Winogrande (5-shot)               | 73.48 |
| GSM8k (5-shot)                    | 14.94 |
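
The "Avg." row appears to be the unweighted mean of the six benchmark scores in this table. The short check below reproduces it from the listed values. The variable names are illustrative only.

```python
# Scores copied from the leaderboard table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 55.46,
    "HellaSwag (10-Shot)": 77.63,
    "MMLU (5-Shot)": 53.12,
    "TruthfulQA (0-shot)": 59.01,
    "Winogrande (5-shot)": 73.48,
    "GSM8k (5-shot)": 14.94,
}

# Unweighted mean over all six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 55.61
```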