This is an experimental depth upscale of Qwen2.5 14B to 21.4B total parameters. 24 layers were added (layers 30-41 inclusive, each repeated twice), bringing the model to 72 layers.
The added layers had their `o_proj` and `down_proj` modules zeroed out prior to retraining, as seen in other modern depth-upscaling experiments.
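As a rough sketch, the duplication-and-zeroing step could look like the following with `transformers`, assuming the standard Qwen2 module layout (`model.model.layers`, `self_attn.o_proj`, `mlp.down_proj`) and that each copy is inserted directly after its source layer; the actual script and insertion order used for this model are not published, and the output path is hypothetical.

```python
import copy
import torch
from transformers import AutoModelForCausalLM

# Illustrative sketch only: the exact upscaling script is not published.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-14B", torch_dtype=torch.bfloat16
)

new_layers = torch.nn.ModuleList()
for i, layer in enumerate(model.model.layers):  # 48 layers in the base model
    new_layers.append(layer)
    if 30 <= i <= 41:
        for _ in range(2):  # two extra copies per layer in 30-41 -> +24 layers
            dup = copy.deepcopy(layer)
            # Zero the attention output and MLP down projections so the new
            # block contributes nothing to the residual stream at first.
            torch.nn.init.zeros_(dup.self_attn.o_proj.weight)
            torch.nn.init.zeros_(dup.mlp.down_proj.weight)
            new_layers.append(dup)

# Reindex attention layers so the KV cache maps correctly, then save.
for idx, layer in enumerate(new_layers):
    layer.self_attn.layer_idx = idx

model.model.layers = new_layers
model.config.num_hidden_layers = len(new_layers)
model.save_pretrained("qwen2.5-upscaled-72l")  # hypothetical output path
```

Because zeroed `o_proj` and `down_proj` weights mean each copied block initially adds nothing to the residual stream, the 72-layer stack starts out behaving like the base 48-layer model, and the retraining below has to relearn useful functions for those layers.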
The upscaled model was then trained on roughly 10M tokens of mixed instruct and creative data, with the majority being general instruct data, to try to repair the zeroed-out connections.