	---
	library_name: transformers
	license: mit
	---

	![Arcanum-12b Banner](https://cdn-uploads.huggingface.co/production/uploads/66dcee3321f901b049f48002/SvGSozVAJMaf5PL21dMBb.jpeg)

	# Arcanum-12b 🧙‍♂️


	Arcanum-12b is a merged large language model that combines TheDrummer/Rocinante-12B-v1.1 and MarinaraSpaghetti/NemoMix-Unleashed-12B using the TIES merging method.

	## Model Details 📊

	- Developed by: Xclbr7
	- Model type: Causal Language Model
	- Language(s): English (primarily), may support other languages
	- License: MIT
	- Repository: https://huggingface.co/Xclbr7/Arcanum-12b

	## Model Architecture 🏗️

	- Base model: MarinaraSpaghetti/NemoMix-Unleashed-12B
	- Parameter count: ~12 billion
	- Architecture specifics: Transformer-based language model
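As a rough guide to hardware needs, the memory taken by the weights alone can be estimated from the parameter count and the storage data type. The sketch below assumes exactly 12 billion parameters (the card only says "~12 billion") and ignores activations, the KV cache, and framework overhead; `weight_memory_gib` is a hypothetical helper name.

```python
# Approximate parameter count taken from the card ("~12 billion").
PARAMETER_COUNT = 12e9

# Bytes used to store one parameter in common data types.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# float16 is the data type listed in the merge settings of this card.
fp16_gib = weight_memory_gib(PARAMETER_COUNT, "float16")
print(f"float16 weights: about {fp16_gib:.1f} GiB")
```

Actual usage during inference will be higher than this figure, since it covers only the stored weights.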

	## Training & Merging 🔄

	Arcanum-12b was created by merging two existing 12B models:

	1. TheDrummer/Rocinante-12B-v1.1
	   - Density parameters: [1, 0.8, 0.6]
	   - Weight: 0.7

	2. MarinaraSpaghetti/NemoMix-Unleashed-12B
	   - Density parameters: [0.5, 0.7, 0.9]
	   - Weight: 0.8

	Merging method: TIES

	Additional parameters:
	- Normalization: True
	- Int8 mask: True
	- Data type: float16
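To illustrate what the settings above control, here is a minimal pure-Python sketch of TIES merging on toy four-element vectors. It is not the procedure or tooling actually used to build Arcanum-12b, which the card does not name. The function names `trim` and `ties_merge`, the toy values, and the use of a single scalar density per model (the first entry of each list above) are all assumptions made for illustration. Tools such as mergekit typically interpret a list of densities as a gradient across layers; that interpolation is omitted here.

```python
from typing import List, Sequence


def trim(task_vector: Sequence[float], density: float) -> List[float]:
    """Keep the largest-magnitude `density` fraction of entries, zero the rest."""
    keep = int(density * len(task_vector))
    # Indices ordered by descending magnitude; the first `keep` survive.
    order = sorted(range(len(task_vector)), key=lambda i: -abs(task_vector[i]))
    kept = set(order[:keep])
    return [v if i in kept else 0.0 for i, v in enumerate(task_vector)]


def ties_merge(
    base: Sequence[float],
    models: Sequence[Sequence[float]],
    densities: Sequence[float],
    weights: Sequence[float],
    normalize: bool = True,
) -> List[float]:
    """Toy TIES merge: trim task vectors, elect a sign, merge agreeing entries."""
    # Step 1: task vectors are the differences between each model and the base.
    deltas = [[m - b for m, b in zip(model, base)] for model in models]
    # Step 2: trim each task vector to its density, then apply the model weight.
    weighted = [
        [w * v for v in trim(delta, density)]
        for delta, density, w in zip(deltas, densities, weights)
    ]
    merged = []
    for j, b in enumerate(base):
        column = [vec[j] for vec in weighted]
        # Step 3: elect the sign with the larger total weighted magnitude.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # Step 4: combine only the entries that agree with the elected sign.
        agree = [(v, w) for v, w in zip(column, weights) if v * sign > 0]
        delta = sum(v for v, _ in agree)
        if normalize and agree:
            # Normalization divides by the total weight of the agreeing models.
            delta /= sum(w for _, w in agree)
        merged.append(b + delta)
    return merged


# Toy data: a zero base and two "fine-tuned" models (hypothetical values).
base = [0.0, 0.0, 0.0, 0.0]
model_a = [2.0, -1.0, 0.5, 1.0]
model_b = [-3.0, 0.1, 0.2, 4.0]

merged = ties_merge(base, [model_a, model_b], densities=[1.0, 0.5], weights=[0.7, 0.8])
print(merged)
```

In the first position the two models disagree in sign, so only the model whose weighted change dominates contributes. Resolving such conflicts, rather than averaging them toward zero, is the point of the sign-election step.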

	## Intended Use 🎯

	Arcanum-12b is intended for conversation in which the model takes on different personas.
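A minimal sketch of persona-style generation with the `transformers` library follows. The card does not specify a prompt format, so `build_prompt` and its plain-text layout are hypothetical, as are the example persona and the sampling settings. Loading with `device_map="auto"` additionally assumes the `accelerate` package is installed and that enough memory is available for a 12B model in float16.

```python
def build_prompt(persona: str, user_message: str) -> str:
    """Hypothetical plain-text prompt; the model card defines no official format."""
    return f"{persona}\n\nUser: {user_message}\nAssistant:"


def generate_reply(prompt: str, max_new_tokens: int = 128) -> str:
    """Load Arcanum-12b and sample a continuation of `prompt`."""
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Xclbr7/Arcanum-12b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # matches the data type listed in the card
        device_map="auto",          # assumption: accelerate is installed
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.8,  # illustrative value, not a recommendation from the card
    )
    # Drop the prompt tokens and decode only the newly generated text.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


prompt = build_prompt("You are a gruff ship's captain.", "Where are we headed?")
print(prompt)
# reply = generate_reply(prompt)  # uncomment to download the model and generate
```

If the tokenizer ships a chat template, formatting the conversation with `tokenizer.apply_chat_template` may give better results than this plain-text layout.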

	## Performance and Limitations ⚖️

	The model has not been evaluated yet, so no benchmark results are available.

	## Ethical Considerations 🤔

	As a merged model based on existing language models, Arcanum-12b may inherit biases and limitations from its parent models. Users should be aware of potential biases in generated content and use the model responsibly.


	## Acknowledgments 🙏

	We acknowledge the contributions of the original model creators:
	- TheDrummer for Rocinante-12B-v1.1
	- MarinaraSpaghetti for NemoMix-Unleashed-12B

	Their work formed the foundation for Arcanum-12b.