---
library_name: transformers
tags:
- arcee-ai
---
![image/webp](https://cdn-uploads.huggingface.co/production/uploads/654aa1d86167ff03f70e32f9/5qausvO9z7FhTl5wyJhRy.webp)
# Model Card for teeny-tiny-mixtral
This is a dummy model created for testing purposes only. It uses a custom configuration to explore various training scenarios and should not be used in production.
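For quick smoke tests, the model can be loaded through the standard transformers API. This is a minimal sketch: the repo id `arcee-ai/teeny-tiny-mixtral` is inferred from this card's title and tags and may need adjusting, and outputs from a test-only model are not expected to be meaningful.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id inferred from this card's title and tags; adjust if it differs.
model_id = "arcee-ai/teeny-tiny-mixtral"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A dummy model's generations are gibberish; this only verifies the
# load / tokenize / generate path runs end to end.
inputs = tokenizer("Hello, world!", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```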
## Configuration Highlights
The configuration of this dummy model differs from the original Mixtral in several key aspects (a config sketch follows the list):
- **Number of Layers:** Reduced to 2, allowing quicker tests of layer-specific behaviors.
- **Experts:** 4 local experts with 2 experts routed per token, to exercise the mixture-of-experts routing path at a reduced scale.
- **Hidden Size:** Set to 512; this smaller width keeps tests fast while still probing the effects of network width.
- **Intermediate Size:** Set to 3579, enlarged relative to the hidden size, to investigate how a wider feed-forward block affects the model's capacity to process information.
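For reference, a comparable configuration can be built directly with `MixtralConfig`. This is a sketch rather than this repo's exact config: only the fields highlighted above are set, and everything else keeps the library defaults, so the `config.json` in this repo remains authoritative.

```python
from transformers import MixtralConfig, MixtralForCausalLM

# Only the fields highlighted above are set; all other fields keep the
# transformers defaults, so this repo's config.json remains authoritative.
config = MixtralConfig(
    num_hidden_layers=2,     # down from Mixtral's 32
    num_local_experts=4,     # down from 8
    num_experts_per_tok=2,
    hidden_size=512,         # down from 4096
    intermediate_size=3579,  # enlarged relative to the small hidden size
)

# Instantiating from a config yields randomly initialized, test-only weights.
model = MixtralForCausalLM(config)
print(f"{model.num_parameters():,} parameters")
```

At this scale the model should fit comfortably in memory and run forward and backward passes quickly on CPU, which is the point of a teeny-tiny test model.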