
An SFT (supervised fine-tuned) version of Qwen1.5-1.8B.

Introduction

Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previously released Qwen, the improvements include:

  • 8 model sizes, including 0.5B, 1.8B, 4B, 7B, 14B, 32B and 72B dense models, and an MoE model of 14B with 2.7B activated;
  • Significant performance improvement in Chat models;
  • Multilingual support of both base and chat models;
  • Stable support of 32K context length for models of all sizes;
  • No need for trust_remote_code; a minimal loading sketch follows this list.
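
Because the model loads with the stock Hugging Face transformers classes (no trust_remote_code needed), loading and generation can look like the sketch below. The repo id, prompt, and generation settings here are illustrative assumptions, not part of this card:

```python
# Minimal loading/generation sketch, assuming the standard transformers chat workflow.
# The repo id below is a placeholder; substitute this model's repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-1.8B-Chat"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are stored in BF16
    device_map="auto",
)

messages = [{"role": "user", "content": "Give me a short introduction to large language models."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```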

For more details, please refer to the blog post and GitHub repo.

Our Work

We supervised fine-tuned (SFT) the model on a subset of the Open Assistant dataset, following the self_reward recipe.
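
As a rough illustration of what an Open Assistant subset can look like as SFT data, the sketch below pairs prompter messages with their assistant replies and renders them with the model's chat template. The dataset id (OpenAssistant/oasst1), the pairing logic, and the placeholder repo id are assumptions for illustration, not the exact subset or recipe used for this model:

```python
# Rough data-preparation sketch: pair Open Assistant prompter messages with
# their direct assistant replies and render them as chat-formatted text.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-1.8B")  # placeholder repo id

oasst = load_dataset("OpenAssistant/oasst1", split="train")

# Index messages by id so each assistant reply can be joined to its prompt.
by_id = {row["message_id"]: row for row in oasst}

examples = []
for row in oasst:
    if row["role"] == "assistant" and row["parent_id"] in by_id:
        prompt = by_id[row["parent_id"]]
        if prompt["role"] == "prompter":
            messages = [
                {"role": "user", "content": prompt["text"]},
                {"role": "assistant", "content": row["text"]},
            ]
            # One single-turn training example, rendered with the chat template.
            examples.append(tokenizer.apply_chat_template(messages, tokenize=False))

print(f"{len(examples)} prompt/response pairs")
print(examples[0][:300])
```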

Model size: 1.84B parameters (Safetensors, BF16)