---
library_name: transformers
tags: []
---

# Model Card for TwinDoc/RedWhale-2-12B-Instruct

์‚ฌ์ „ํ•™์Šต ๋ชจ๋ธ์ธ TwinDoc/RedWhale-2-12B๋ฅผ SFT(Supervised Finetuning)ํ•œ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. SFT๋Š” ContextQA ๋ฐ ์š”์•ฝ task์— ๋งž์ถฐ ํ•™์Šตํ•˜์˜€์Šต๋‹ˆ๋‹ค.

## Model Details

### Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** AgileSoda
- **Model type:** Llama
- **Language(s) (NLP):** Korean
- **License:** [More Information Needed]
- **Foundation Model:** RedWhale-2-12B

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

RedWhale-2-12B-Instruct ๋ชจ๋ธ ์‚ฌ์šฉ ๋ฐฉ๋ฒ•์€ meta-llama/Llama-3.1-8B-Instruct ๋ชจ๋ธ ์‚ฌ์šฉ ๋ฐฉ๋ฒ•๊ณผ ๋™์ผํ•ฉ๋‹ˆ๋‹ค.
์‚ฌ์šฉํ•˜๊ณ ์ž ํ•˜๋Š” ์„œ๋น™ ์—”์ง„์˜ ๊ณต์‹ ๋ฌธ์„œ๋ฅผ ์ฐธ๊ณ ํ•˜์„ธ์š”. ๋‹ค์Œ์€ ์˜ˆ์‹œ์ž…๋‹ˆ๋‹ค.

#### Usage with Transformers

The example code below was written with transformers == 4.48.1.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

loading_args = {"torch_dtype": torch.bfloat16, "device_map": "auto"}  # for multi-GPU loading
model = AutoModelForCausalLM.from_pretrained("TwinDoc/RedWhale-2-12B-Instruct", **loading_args)
tokenizer = AutoTokenizer.from_pretrained("TwinDoc/RedWhale-2-12B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "대한민국의 수도는?"},
]

# Build the Llama 3.1 chat prompt and move it to the model's device
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs.to(model.device))
print(tokenizer.decode(outputs[0]))
```

Expected output:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024

You are a helpful AI assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

대한민국의 수도는?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

대한민국의 수도는 서울입니다.<|eot_id|>
```

#### Usage with vLLM

The example code below was written with vllm == 0.6.6.

```python
from vllm import LLM, SamplingParams
import os

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # arrange GPU devices starting from 0
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3,4,5,6,7"

repo_id = "TwinDoc/RedWhale-2-12B-Instruct"
tensor_parallel_size = 8  # number of GPUs

llm = LLM(
    model=repo_id,
    tensor_parallel_size=tensor_parallel_size,
)

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "대한민국의 수도는?"},
]

sampling_params = SamplingParams(
    temperature=0.8,
    top_p=0.9,
    max_tokens=8192,
)

outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)
```

Expected output:

```
대한민국의 수도는 서울입니다.
```

## Training Details

### Training Data

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

- 200 examples from the allganize/rag-ko test set
- 100 Mirae Asset Context QA examples
- 140 AIA Context QA examples
- 63 BNK Context QA examples

#### Metrics

LLM as a Judge์„ ํ™œ์šฉํ•˜์—ฌ ์„ฑ๋Šฅ์„ ์ธก์ •ํ•˜์˜€์Šต๋‹ˆ๋‹ค. Prompt,์ธก์ • ๋ชจ๋ธ ๊ทธ๋ฆฌ๊ณ  ํ‰๊ฐ€ ๊ฒฐ๊ณผ๋Š” Our Leaderboard๋ฅผ ์ฐธ๊ณ ํ•˜์„ธ์š”. Our Leaderboard์˜ ๋ชจ๋ธ๋ช… "RedWhale2 12B 0.98 SFT v4 M"์ด "TwinDoc/RedWhale-2-12B-Instruct" ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.