
Hercules-Qwen1.5-14B

  • Model Details

    Model Description

    This model has capabilities in math, coding, function calling, roleplay, and more. It was fine-tuned on 700,000 examples from the Hercules-v4 dataset.

    Developed by: M4-ai

    Language(s) (NLP): English (and possibly Chinese, inherited from the Qwen1.5 base model)

    License: tongyi-qianwen license

    Finetuned from model: Qwen1.5-14B

  • Uses

    General-purpose assistant, question answering, chain-of-thought reasoning, etc.
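
    The Qwen1.5 base models are trained with the ChatML conversation format. Assuming this fine-tune keeps that format (the card does not state the prompt template), an assistant-style prompt can be assembled as in the sketch below; in practice, prefer `tokenizer.apply_chat_template` from transformers, which reads the template shipped with the model.

    ```python
    # Sketch: build a ChatML-style prompt (the format used by the Qwen1.5
    # base model). That Hercules-Qwen1.5-14B expects exactly this template
    # is an assumption, not stated in the card.

    def build_chatml_prompt(messages):
        """Render a list of {role, content} dicts into a ChatML string."""
        parts = []
        for msg in messages:
            parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
        # Leave the assistant turn open so the model generates the reply.
        parts.append("<|im_start|>assistant\n")
        return "".join(parts)

    prompt = build_chatml_prompt([
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the derivative of x^2?"},
    ])
    print(prompt)
    ```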

    Recommendations

    Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

  • Evaluation

    Coming soon

  • Training Details

    Training Data

    https://huggingface.co/datasets/Locutusque/hercules-v4.0

    Training Hyperparameters

    Training regime: bf16 non-mixed precision

  • Technical Specifications

    Hardware

    We trained on 8 Kaggle TPUs with a global batch size of 128 and a sequence length of 1024.
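
    For context on what these settings imply per step, here is a back-of-the-envelope sketch. It assumes the global batch is sharded evenly across the 8 TPU devices with no gradient accumulation; neither detail is stated in the card.

    ```python
    # Throughput arithmetic from the stated hyperparameters.
    # Assumption: global batch split evenly over 8 devices, no grad accumulation.
    global_batch = 128
    num_devices = 8
    seq_len = 1024

    per_device_batch = global_batch // num_devices
    tokens_per_step = global_batch * seq_len

    print(per_device_batch)  # 16 sequences per device per step
    print(tokens_per_step)   # 131072 tokens per optimizer step
    ```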

  • Contributions

    Thanks to @Tonic, @aloobun, @fhai50032, and @Locutusque for their contributions to this model.

Model size: 14.2B params (Safetensors)
Tensor type: BF16
