
ChinaLM by Chickaboo AI

Welcome to ChinaLM, a Chinese LLM merge made by Chickaboo AI. ChinaLM is designed to deliver a high-quality conversational experience in Chinese.

Table of Contents

  • Model Details
  • Benchmarks
  • Usage

Model Details

ChinaLM is a merge of Qwen2-7B-Instruct and Yi-1.5-9B-Chat, made with Mergekit using the following config file:

slices:
  - sources:
    - model: 01-ai/Yi-1.5-9B-Chat
      layer_range: [0, 20]
  - sources:
    - model: Qwen/Qwen2-7B-Instruct
      layer_range: [0, 20]
merge_method: passthrough
dtype: bfloat16
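A passthrough merge stacks the selected layer slices end to end rather than averaging weights, so the two 20-layer slices above produce a 40-layer merged stack. A minimal sketch of that layer arithmetic (the slice list mirrors the config above; the dictionary layout is illustrative, not mergekit's internal representation):

```python
# Hypothetical sketch of how a passthrough merge counts layers:
# each slice contributes (end - start) layers, concatenated in order.
slices = [
    {"model": "01-ai/Yi-1.5-9B-Chat", "layer_range": (0, 20)},
    {"model": "Qwen/Qwen2-7B-Instruct", "layer_range": (0, 20)},
]

total_layers = sum(end - start for s in slices for (start, end) in [s["layer_range"]])
print(total_layers)  # 40 layers in the merged stack
```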

Open Chinese LLM Leaderboard

Coming soon

| Benchmark  | ChinaLM-9B | ChinaLM-13B (unreleased) | Mistral-7B-Instruct-v0.2 | Meta-Llama-3-8B | Yi-1.5-9B-Chat | Qwen2-7B-Instruct |
|------------|------------|--------------------------|--------------------------|-----------------|----------------|-------------------|
| Average    | --         | --                       | --                       | --              | --             | --                |
| ARC        | --         | --                       | --                       | --              | --             | --                |
| HellaSwag  | --         | --                       | --                       | --              | --             | --                |
| MMLU       | --         | --                       | --                       | --              | --             | --                |
| TruthfulQA | --         | --                       | --                       | --              | --             | --                |
| Winogrande | --         | --                       | --                       | --              | --             | --                |
| GSM8K      | --         | --                       | --                       | --              | --             | --                |

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("Chickaboo/ChinaLM-9B")
tokenizer = AutoTokenizer.from_pretrained("Chickaboo/ChinaLM-9B")
model.to(device)

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"},
]

# Apply the chat template and append the generation prompt so the model
# continues the conversation as the assistant.
model_inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
Model size: 8.93B parameters (BF16, safetensors)