M4-ai
/

Hercules-Qwen1.5-14B

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

Edit model card

Hercules-Qwen1.5-14B

Model Details

Model Description

This model has capabilities in math, coding, function calling, roleplay, and more. We fine-tuned it using 700,000 examples of Hercules-v4.

Developed by: M4-ai

Language(s) (NLP): English and maybe Chinese

License: tongyi-qianwen license

Finetuned from model: Qwen1.5-14B
Uses

General purpose assistant, question answering, chain-of-thought, etc..

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
Evaluation

Coming soon
Training Details

Training Data

https://huggingface.co/datasets/Locutusque/hercules-v4.0

Training Hyperparameters

Training regime: bf16 non-mixed precision
Technical Specifications

Hardware

We used 8 Kaggle TPUs, and we trained at a global batch size of 128 and sequence length of 1024
Contributions

Thanks to @Tonic, @aloobun, @fhai50032, and @Locutusque for their contributions to this model.

Downloads last month: 383

Safetensors

Model size

14.2B params

Tensor type

BF16

·

Text Generation

This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Dataset used to train M4-ai/Hercules-Qwen1.5-14B