Arcanum-12b / README.md
Xclbr7's picture
Update README.md
845ac67 verified
|
raw
history blame
1.8 kB
---
library_name: transformers
license: mit
---
![Arcanum-12b Banner](https://cdn-uploads.huggingface.co/production/uploads/66dcee3321f901b049f48002/SvGSozVAJMaf5PL21dMBb.jpeg)
# Arcanum-12b πŸ§™β€β™‚οΈ
Arcanum-12b is a merged large language model created by combining TheDrummer/Rocinante-12B-v1.1 and MarinaraSpaghetti/NemoMix-Unleashed-12B using a novel merging technique.
## Model Details πŸ“Š
- **Developed by:** Xclbr7
- **Model type:** Causal Language Model
- **Language(s):** English (primarily), may support other languages
- **License:** MIT
- **Repository:** https://huggingface.co/Xclbr7/Arcanum-12b
## Model Architecture πŸ—οΈ
- **Base model:** MarinaraSpaghetti/NemoMix-Unleashed-12B
- **Parameter count:** ~12 billion
- **Architecture specifics:** Transformer-based language model
## Training & Merging πŸ”„
Arcanum-12b was created by merging two existing 12B models:
1. TheDrummer/Rocinante-12B-v1.1
- Density parameters: [1, 0.8, 0.6]
- Weight: 0.7
2. MarinaraSpaghetti/NemoMix-Unleashed-12B
- Density parameters: [0.5, 0.7, 0.9]
- Weight: 0.8
**Merging method:** Ties
**Additional parameters:**
- Normalization: True
- Int8 mask: True
- Data type: float16
## Intended Use 🎯
Conversation with different personas.
## Performance and Limitations βš–οΈ
Not tested yet.
## Ethical Considerations πŸ€”
As a merged model based on existing language models, Arcanum-12b may inherit biases and limitations from its parent models. Users should be aware of potential biases in generated content and use the model responsibly.
## Acknowledgments πŸ™
We acknowledge the contributions of the original model creators:
- TheDrummer for Rocinante-12B-v1.1
- MarinaraSpaghetti for NemoMix-Unleashed-12B
Their work formed the foundation for Arcanum-12b.