Edit model card

Hercules-Qwen1.5-14B

  • Model Details

    Model Description

    This model has capabilities in math, coding, function calling, roleplay, and more. We fine-tuned it using 700,000 examples of Hercules-v4.

    Developed by: M4-ai

    Language(s) (NLP): English and maybe Chinese

    License: tongyi-qianwen license

    Finetuned from model: Qwen1.5-14B

  • Uses

    General purpose assistant, question answering, chain-of-thought, etc..

    Recommendations

    Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

  • Evaluation

    Coming soon

  • Training Details

    Training Data

    https://huggingface.co/datasets/Locutusque/hercules-v4.0

    Training Hyperparameters

    Training regime: bf16 non-mixed precision

  • Technical Specifications

    Hardware

    We used 8 Kaggle TPUs, and we trained at a global batch size of 128 and sequence length of 1024

  • Contributions

    Thanks to @Tonic, @aloobun, @fhai50032, and @Locutusque for their contributions to this model.

Downloads last month
383
Safetensors
Model size
14.2B params
Tensor type
BF16
·
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Dataset used to train M4-ai/Hercules-Qwen1.5-14B