🌟 Check out the Taiwan-LLM Demo Chat-UI 🌟

Model Card for Taiwan LLM 8x7B-DPO

Taiwan LLM is an advanced language model tailored for Traditional Chinese, focusing on the linguistic and cultural contexts of Taiwan.

Model description

  • Model type: An 8x7B-parameter Mixtral MoE model (46.7B total parameters) fine-tuned on a mix of publicly available and synthetic datasets.
  • Language(s) (NLP): Primarily Traditional Chinese (zh-tw)
  • Finetuned from model: yentinglin/Taiwan-LLM-MoE-alpha
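
The tokenizer ships the model's chat template, so you can inspect the exact prompt format the model expects before building anything on top of it. A minimal sketch (the example messages are illustrative):

from transformers import AutoTokenizer

# Load the tokenizer, which carries the model's chat template
tokenizer = AutoTokenizer.from_pretrained("yentinglin/Taiwan-LLM-8x7B-DPO")

messages = [
    {"role": "system", "content": "你是一個人工智慧助理"},  # "You are an AI assistant."
    {"role": "user", "content": "你好"},  # "Hello"
]
# Render the conversation to a raw prompt string without tokenizing
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))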

Model Sources

  • Paper: https://arxiv.org/abs/2311.17487
  • Demo: Taiwan-LLM Demo Chat-UI

Performance

Check out the leaderboard on the Tw Chatbot Arena

TMMLU+ scores:

  • yentinglin/Taiwan-LLM-MoE-alpha: 43.93
  • yentinglin/Taiwan-LLM-8x7B-DPO: TBD

Intended uses

Here's how you can run the model using the pipeline() function from 🤗 Transformers:

# pip install "transformers>=4.34"
# pip install accelerate

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="yentinglin/Taiwan-LLM-8x7B-DPO", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "你是一個人工智慧助理",  # "You are an AI assistant."
    },
    {"role": "user", "content": "東北季風如何影響台灣氣候?"},  # "How does the northeast monsoon affect Taiwan's climate?"
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
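
At ~46.7B total parameters in BF16, the full model can exceed a single GPU's memory. A minimal sketch of 4-bit loading with bitsandbytes (assuming bitsandbytes is installed; the quantization settings here are illustrative, not an official recipe):

# pip install bitsandbytes
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "yentinglin/Taiwan-LLM-8x7B-DPO"

# Quantize weights to 4-bit NF4 and keep compute in bfloat16
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "你是一個人工智慧助理"},  # "You are an AI assistant."
    {"role": "user", "content": "東北季風如何影響台灣氣候?"},  # "How does the northeast monsoon affect Taiwan's climate?"
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))

NF4 quantization trades some output quality for a much smaller memory footprint; the generation settings mirror the pipeline example above.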

Citation

If you find Taiwan LLM useful in your work, please cite it with:

@misc{lin2023taiwan,
      title={Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model}, 
      author={Yen-Ting Lin and Yun-Nung Chen},
      year={2023},
      eprint={2311.17487},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Acknowledgement

Ubitus provides valuable compute resources for the project.
