🌟 Check out the Taiwan-LLM Demo Chat-UI 🌟

Model Card for Taiwan LLM 13B v2.0 chat

Taiwan LLM is a language model tailored for Traditional Chinese, with a focus on the linguistic and cultural context of Taiwan. It was developed from a large base model, enriched with diverse Taiwanese textual sources, and refined through supervised fine-tuning. The model is designed for language understanding and generation that aligns closely with Taiwan's cultural nuances, and it shows improved performance on benchmarks such as TC-Eval. For details on Taiwan LLM's development and features, refer to our technical report.

Model description

  • Model type: A 13B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
  • Language(s) (NLP): Primarily Traditional Chinese (zh-tw)
  • Finetuned from model: yentinglin/Taiwan-LLM-13B-v2.0-base

Model Sources

Performance

TMMLUS+ score: 24.76727075757576

Intended uses

Here's how you can run the model using the pipeline() function from 🤗 Transformers:

# pip install "transformers>=4.34"
# pip install accelerate

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="yentinglin/Taiwan-LLM-13B-v2.0-chat", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
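messages = [
    {
        "role": "system",
        # English: "You are an AI assistant."
        "content": "你是一個人工智慧助理",
    },
    # English: "How does the northeast monsoon affect Taiwan's climate?"
    {"role": "user", "content": "東北季風如何影響台灣氣候?"},
]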
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
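
By default, the text-generation pipeline returns the prompt followed by the completion, so the string printed above also contains the formatted system and user messages. The sketch below shows one way to keep only the reply. `strip_prompt` is a hypothetical helper, not part of Transformers, and the sample strings are placeholders rather than the model's real chat template or output. Passing `return_full_text=False` to the pipeline call should achieve the same result without post-processing.

```python
def strip_prompt(generated_text: str, prompt: str) -> str:
    """Return only the completion part of a pipeline result.

    Hypothetical helper: assumes the pipeline echoed `prompt` verbatim at
    the start of `generated_text` (the default for text-generation).
    """
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    # The prompt was not echoed (e.g. return_full_text=False); only trim.
    return generated_text.strip()


# Placeholder strings standing in for `prompt` and
# outputs[0]["generated_text"]; this is NOT the model's actual template.
example_prompt = "USER: 東北季風如何影響台灣氣候? ASSISTANT:"
example_output = example_prompt + " (model reply would appear here)"

reply = strip_prompt(example_output, example_prompt)
print(reply)
```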

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • distributed_type: multi-GPU
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 5.0
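
To make the scheduler settings above concrete, here is a minimal sketch of a cosine schedule with warmup, using the listed `learning_rate` and `lr_scheduler_warmup_ratio`. It assumes linear warmup from zero followed by cosine decay to zero, which is the usual behaviour of a cosine scheduler with warmup; the exact schedule used in training is not stated here. `lr_at_step` and `TOTAL_STEPS` are illustrative names, and the step count is made up because the real number of training steps is not given.

```python
import math

PEAK_LR = 5e-05        # learning_rate from the list above
WARMUP_RATIO = 0.03    # lr_scheduler_warmup_ratio from the list above


def lr_at_step(step: int, total_steps: int) -> float:
    """Learning rate at `step`, assuming linear warmup then cosine decay to 0."""
    warmup_steps = round(WARMUP_RATIO * total_steps)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# TOTAL_STEPS is a made-up value for illustration only.
TOTAL_STEPS = 1000
for s in (0, 15, 30, 515, 1000):
    print(s, lr_at_step(s, TOTAL_STEPS))
```

With these assumptions the rate peaks at the end of warmup (3% of the run) and falls smoothly afterwards, so most of the five epochs are spent below the peak value.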

Citation

If you find Taiwan LLM useful in your work, please cite it with:

@misc{lin2023taiwan,
      title={Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model}, 
      author={Yen-Ting Lin and Yun-Nung Chen},
      year={2023},
      eprint={2311.17487},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Acknowledgement

Taiwan LLM v2 is developed in collaboration with Ubitus K.K. Ubitus provides valuable compute resources for the project.

Open LLM Leaderboard

Task Version Metric Value Stderr
leaderboard:arc:challenge:25 0 acc 0.5529 ± 0.0145
leaderboard:arc:challenge:25 0 acc_norm 0.5862 ± 0.0144
leaderboard:gsm8k:5 0 qem 0.3177 ± 0.0128
leaderboard:hellaswag:10 0 acc 0.6307 ± 0.0048
leaderboard:hellaswag:10 0 acc_norm 0.8327 ± 0.0037
leaderboard:mmlu:_average:5 acc 0.5483 ± 0.0356
leaderboard:mmlu:abstract_algebra:5 0 acc 0.3400 ± 0.0476
leaderboard:mmlu:anatomy:5 0 acc 0.5111 ± 0.0432
leaderboard:mmlu:astronomy:5 0 acc 0.5789 ± 0.0402
leaderboard:mmlu:business_ethics:5 0 acc 0.5100 ± 0.0502
leaderboard:mmlu:clinical_knowledge:5 0 acc 0.6000 ± 0.0302
leaderboard:mmlu:college_biology:5 0 acc 0.5764 ± 0.0413
leaderboard:mmlu:college_chemistry:5 0 acc 0.4100 ± 0.0494
leaderboard:mmlu:college_computer_science:5 0 acc 0.4500 ± 0.0500
leaderboard:mmlu:college_mathematics:5 0 acc 0.3800 ± 0.0488
leaderboard:mmlu:college_medicine:5 0 acc 0.5434 ± 0.0380
leaderboard:mmlu:college_physics:5 0 acc 0.2941 ± 0.0453
leaderboard:mmlu:computer_security:5 0 acc 0.7000 ± 0.0461
leaderboard:mmlu:conceptual_physics:5 0 acc 0.4468 ± 0.0325
leaderboard:mmlu:econometrics:5 0 acc 0.2719 ± 0.0419
leaderboard:mmlu:electrical_engineering:5 0 acc 0.4552 ± 0.0415
leaderboard:mmlu:elementary_mathematics:5 0 acc 0.3175 ± 0.0240
leaderboard:mmlu:formal_logic:5 0 acc 0.3413 ± 0.0424
leaderboard:mmlu:global_facts:5 0 acc 0.3700 ± 0.0485
leaderboard:mmlu:high_school_biology:5 0 acc 0.6323 ± 0.0274
leaderboard:mmlu:high_school_chemistry:5 0 acc 0.4581 ± 0.0351
leaderboard:mmlu:high_school_computer_science:5 0 acc 0.5400 ± 0.0501
leaderboard:mmlu:high_school_european_history:5 0 acc 0.6364 ± 0.0376
leaderboard:mmlu:high_school_geography:5 0 acc 0.6970 ± 0.0327
leaderboard:mmlu:high_school_government_and_politics:5 0 acc 0.7617 ± 0.0307
leaderboard:mmlu:high_school_macroeconomics:5 0 acc 0.4974 ± 0.0254
leaderboard:mmlu:high_school_mathematics:5 0 acc 0.3296 ± 0.0287
leaderboard:mmlu:high_school_microeconomics:5 0 acc 0.5336 ± 0.0324
leaderboard:mmlu:high_school_physics:5 0 acc 0.3709 ± 0.0394
leaderboard:mmlu:high_school_psychology:5 0 acc 0.7468 ± 0.0186
leaderboard:mmlu:high_school_statistics:5 0 acc 0.4074 ± 0.0335
leaderboard:mmlu:high_school_us_history:5 0 acc 0.7108 ± 0.0318
leaderboard:mmlu:high_school_world_history:5 0 acc 0.7046 ± 0.0297
leaderboard:mmlu:human_aging:5 0 acc 0.6323 ± 0.0324
leaderboard:mmlu:human_sexuality:5 0 acc 0.5878 ± 0.0432
leaderboard:mmlu:international_law:5 0 acc 0.6694 ± 0.0429
leaderboard:mmlu:jurisprudence:5 0 acc 0.7037 ± 0.0441
leaderboard:mmlu:logical_fallacies:5 0 acc 0.6564 ± 0.0373
leaderboard:mmlu:machine_learning:5 0 acc 0.3393 ± 0.0449
leaderboard:mmlu:management:5 0 acc 0.7087 ± 0.0450
leaderboard:mmlu:marketing:5 0 acc 0.8333 ± 0.0244
leaderboard:mmlu:medical_genetics:5 0 acc 0.5400 ± 0.0501
leaderboard:mmlu:miscellaneous:5 0 acc 0.7382 ± 0.0157
leaderboard:mmlu:moral_disputes:5 0 acc 0.6127 ± 0.0262
leaderboard:mmlu:moral_scenarios:5 0 acc 0.3788 ± 0.0162
leaderboard:mmlu:nutrition:5 0 acc 0.6046 ± 0.0280
leaderboard:mmlu:philosophy:5 0 acc 0.6270 ± 0.0275
leaderboard:mmlu:prehistory:5 0 acc 0.6204 ± 0.0270
leaderboard:mmlu:professional_accounting:5 0 acc 0.3582 ± 0.0286
leaderboard:mmlu:professional_law:5 0 acc 0.3931 ± 0.0125
leaderboard:mmlu:professional_medicine:5 0 acc 0.5184 ± 0.0304
leaderboard:mmlu:professional_psychology:5 0 acc 0.5556 ± 0.0201
leaderboard:mmlu:public_relations:5 0 acc 0.6818 ± 0.0446
leaderboard:mmlu:security_studies:5 0 acc 0.6122 ± 0.0312
leaderboard:mmlu:sociology:5 0 acc 0.7164 ± 0.0319
leaderboard:mmlu:us_foreign_policy:5 0 acc 0.8200 ± 0.0386
leaderboard:mmlu:virology:5 0 acc 0.4578 ± 0.0388
leaderboard:mmlu:world_religions:5 0 acc 0.7661 ± 0.0325
leaderboard:truthfulqa:mc:0 0 truthfulqa_mc1 0.2840 ± 0.0158
leaderboard:truthfulqa:mc:0 0 truthfulqa_mc2 0.4423 ± 0.0146
leaderboard:winogrande:5 0 acc 0.7593 ± 0.0120
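
For readers who want to work with these numbers programmatically, the following is a rough sketch of a parser for rows in the `Task Version Metric Value Stderr` layout used above. `parse_rows` is a hypothetical helper written for this card, not part of any evaluation toolkit. It tolerates rows that omit the version column or the task name, in which case the task is inherited from the previous row. The sample rows are copied from the table.

```python
from typing import Dict, List, Optional


def parse_rows(lines: List[str]) -> List[Dict[str, object]]:
    """Parse whitespace-separated result rows into dictionaries."""
    records: List[Dict[str, object]] = []
    current_task: Optional[str] = None
    for line in lines:
        parts = line.split()
        # The last four tokens are always: metric, value, "±", stderr.
        metric = parts[-4]
        value, stderr = float(parts[-3]), float(parts[-1])
        head = parts[:-4]  # task name and, optionally, a version column
        if head:
            current_task = head[0]
        records.append(
            {"task": current_task, "metric": metric,
             "value": value, "stderr": stderr}
        )
    return records


rows = [
    "leaderboard:arc:challenge:25 0 acc 0.5529 ± 0.0145",
    "acc_norm 0.5862 ± 0.0144",  # no task name: inherits the previous task
    "leaderboard:mmlu:_average:5 acc 0.5483 ± 0.0356",  # no version column
    "leaderboard:winogrande:5 0 acc 0.7593 ± 0.0120",
]
for record in parse_rows(rows):
    print(record)
```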

TC-Eval

Task Version Metric Value Stderr
community:tc-eval-v2:drcd:0 0 pem 0.6848 ± 0.0079
community:tc-eval-v2:drcd:0 0 pqem 0.6799 ± 0.0079
community:tc-eval-v2:penguin_table:0 0 acc 0.2361 ± 0.0355
community:tc-eval-v2:_average:5 acc 0.3508 ± 0.0318
community:tc-eval-v2:tmmluplus-accounting:5 0 acc 0.2565 ± 0.0317
community:tc-eval-v2:tmmluplus-administrative_law:5 0 acc 0.2833 ± 0.0220
community:tc-eval-v2:tmmluplus-advance_chemistry:5 0 acc 0.3333 ± 0.0427
community:tc-eval-v2:tmmluplus-agriculture:5 0 acc 0.1987 ± 0.0326
community:tc-eval-v2:tmmluplus-anti_money_laundering:5 0 acc 0.5597 ± 0.0430
community:tc-eval-v2:tmmluplus-auditing:5 0 acc 0.2836 ± 0.0192
community:tc-eval-v2:tmmluplus-basic_medical_science:5 0 acc 0.2841 ± 0.0146
community:tc-eval-v2:tmmluplus-business_management:5 0 acc 0.4245 ± 0.0421
community:tc-eval-v2:tmmluplus-chinese_language_and_literature:5 0 acc 0.2714 ± 0.0316
community:tc-eval-v2:tmmluplus-clinical_psychology:5 0 acc 0.3840 ± 0.0437
community:tc-eval-v2:tmmluplus-computer_science:5 0 acc 0.4195 ± 0.0375
community:tc-eval-v2:tmmluplus-culinary_skills:5 0 acc 0.4589 ± 0.0292
community:tc-eval-v2:tmmluplus-dentistry:5 0 acc 0.3885 ± 0.0244
community:tc-eval-v2:tmmluplus-economics:5 0 acc 0.3053 ± 0.0233
community:tc-eval-v2:tmmluplus-education:5 0 acc 0.4355 ± 0.0447
community:tc-eval-v2:tmmluplus-education_(profession_level):5 0 acc 0.2819 ± 0.0204
community:tc-eval-v2:tmmluplus-educational_psychology:5 0 acc 0.4489 ± 0.0376
community:tc-eval-v2:tmmluplus-engineering_math:5 0 acc 0.2718 ± 0.0441
community:tc-eval-v2:tmmluplus-finance_banking:5 0 acc 0.3037 ± 0.0397
community:tc-eval-v2:tmmluplus-financial_analysis:5 0 acc 0.2801 ± 0.0230
community:tc-eval-v2:tmmluplus-fire_science:5 0 acc 0.2500 ± 0.0390
community:tc-eval-v2:tmmluplus-general_principles_of_law:5 0 acc 0.3113 ± 0.0452
community:tc-eval-v2:tmmluplus-geography_of_taiwan:5 0 acc 0.4492 ± 0.0180
community:tc-eval-v2:tmmluplus-human_behavior:5 0 acc 0.3883 ± 0.0278
community:tc-eval-v2:tmmluplus-insurance_studies:5 0 acc 0.3487 ± 0.0173
community:tc-eval-v2:tmmluplus-introduction_to_law:5 0 acc 0.3165 ± 0.0303
community:tc-eval-v2:tmmluplus-jce_humanities:5 0 acc 0.3444 ± 0.0504
community:tc-eval-v2:tmmluplus-junior_chemistry:5 0 acc 0.3158 ± 0.0322
community:tc-eval-v2:tmmluplus-junior_chinese_exam:5 0 acc 0.4171 ± 0.0374
community:tc-eval-v2:tmmluplus-junior_math_exam:5 0 acc 0.2286 ± 0.0318
community:tc-eval-v2:tmmluplus-junior_science_exam:5 0 acc 0.3427 ± 0.0326
community:tc-eval-v2:tmmluplus-junior_social_studies:5 0 acc 0.4683 ± 0.0446
community:tc-eval-v2:tmmluplus-logic_reasoning:5 0 acc 0.2734 ± 0.0379
community:tc-eval-v2:tmmluplus-macroeconomics:5 0 acc 0.3187 ± 0.0230
community:tc-eval-v2:tmmluplus-management_accounting:5 0 acc 0.2977 ± 0.0313
community:tc-eval-v2:tmmluplus-marketing_management:5 0 acc 0.4624 ± 0.0520
community:tc-eval-v2:tmmluplus-mechanical:5 0 acc 0.4831 ± 0.0462
community:tc-eval-v2:tmmluplus-music:5 0 acc 0.3993 ± 0.0294
community:tc-eval-v2:tmmluplus-national_protection:5 0 acc 0.4929 ± 0.0345
community:tc-eval-v2:tmmluplus-nautical_science:5 0 acc 0.2777 ± 0.0191
community:tc-eval-v2:tmmluplus-occupational_therapy_for_psychological_disorders:5 0 acc 0.4438 ± 0.0213
community:tc-eval-v2:tmmluplus-official_document_management:5 0 acc 0.3559 ± 0.0322
community:tc-eval-v2:tmmluplus-optometry:5 0 acc 0.2804 ± 0.0148
community:tc-eval-v2:tmmluplus-organic_chemistry:5 0 acc 0.3486 ± 0.0459
community:tc-eval-v2:tmmluplus-pharmacology:5 0 acc 0.3397 ± 0.0197
community:tc-eval-v2:tmmluplus-pharmacy:5 0 acc 0.2174 ± 0.0209
community:tc-eval-v2:tmmluplus-physical_education:5 0 acc 0.3966 ± 0.0367
community:tc-eval-v2:tmmluplus-physics:5 0 acc 0.2371 ± 0.0434
community:tc-eval-v2:tmmluplus-politic_science:5 0 acc 0.3407 ± 0.0150
community:tc-eval-v2:tmmluplus-real_estate:5 0 acc 0.3804 ± 0.0509
community:tc-eval-v2:tmmluplus-secondary_physics:5 0 acc 0.3393 ± 0.0449
community:tc-eval-v2:tmmluplus-statistics_and_machine_learning:5 0 acc 0.3438 ± 0.0318
community:tc-eval-v2:tmmluplus-taiwanese_hokkien:5 0 acc 0.2636 ± 0.0389
community:tc-eval-v2:tmmluplus-taxation:5 0 acc 0.2507 ± 0.0224
community:tc-eval-v2:tmmluplus-technical:5 0 acc 0.4204 ± 0.0247
community:tc-eval-v2:tmmluplus-three_principles_of_people:5 0 acc 0.5396 ± 0.0424
community:tc-eval-v2:tmmluplus-trade:5 0 acc 0.2251 ± 0.0187
community:tc-eval-v2:tmmluplus-traditional_chinese_medicine_clinical_medicine:5 0 acc 0.3094 ± 0.0278
community:tc-eval-v2:tmmluplus-trust_practice:5 0 acc 0.3292 ± 0.0235
community:tc-eval-v2:tmmluplus-ttqav2:5 0 acc 0.6726 ± 0.0443
community:tc-eval-v2:tmmluplus-tve_chinese_language:5 0 acc 0.4161 ± 0.0225
community:tc-eval-v2:tmmluplus-tve_design:5 0 acc 0.4542 ± 0.0227
community:tc-eval-v2:tmmluplus-tve_mathematics:5 0 acc 0.2733 ± 0.0365
community:tc-eval-v2:tmmluplus-tve_natural_sciences:5 0 acc 0.3349 ± 0.0229
community:tc-eval-v2:tmmluplus-veterinary_pathology:5 0 acc 0.2544 ± 0.0259
community:tc-eval-v2:tmmluplus-veterinary_pharmacology:5 0 acc 0.3259 ± 0.0202