---
library_name: transformers
tags: []
---

Model Card for TwinDoc/RedWhale-2-12B

This model was built by up-scaling Llama 3.1 8B to 12B with TLI and then continuing pretraining. Pretraining was carried out on a Korean corpus.
TLI is a model up-scaling method that duplicates transformer layers.
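To make the idea concrete, the sketch below builds a layer-duplication plan of the kind used by duplication-based up-scaling. The target depth (48 layers from Llama 3.1 8B's 32) and the evenly spaced duplication pattern are illustrative assumptions, not the exact recipe behind this model.

```python
def tli_duplication_plan(n_source: int, n_target: int) -> list[int]:
    """For each target layer, return the index of the source layer to copy.

    Extra depth comes from repeating evenly spaced source layers; every
    original layer appears at least once and in its original order.
    """
    assert n_source <= n_target <= 2 * n_source
    n_extra = n_target - n_source
    # Pick n_extra evenly spaced source layers to appear twice.
    step = n_source / n_extra if n_extra else 0
    doubled = {int(i * step) for i in range(n_extra)}
    plan = []
    for src in range(n_source):
        plan.append(src)
        if src in doubled:
            plan.append(src)  # this layer's weights are copied twice
    return plan

# Hypothetical 8B -> 12B plan: 32 decoder layers expanded to 48.
# In practice one would deep-copy model.model.layers[src] for each entry.
plan = tli_duplication_plan(32, 48)
```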

Model Details

Model Description

  • Developed by: AgileSoda
  • Model type: Llama
  • Language(s) (NLP): Korean
  • License: [More Information Needed]
  • Finetuned from model [optional]: TwinDoc/RedWhale-2-12B-Instruct
  • Foundation Model: RedWhale-2-12B-TLI

Model Sources [optional]

  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Uses

RedWhale-2-12B ๋ชจ๋ธ ์‚ฌ์šฉ ๋ฐฉ๋ฒ•์€ meta-llama/Llama-3.1-8B ๋ชจ๋ธ ์‚ฌ์šฉ ๋ฐฉ๋ฒ•๊ณผ ๋™์ผํ•ฉ๋‹ˆ๋‹ค. ์‚ฌ์šฉํ•˜๊ณ ์ž ํ•˜๋Š” ์„œ๋น™ ์—”์ง„์˜ ๊ณต์‹ ๋ฌธ์„œ๋ฅผ ์ฐธ๊ณ ํ•˜์„ธ์š”. ๋‹ค์Œ์€ ์˜ˆ์‹œ์ž…๋‹ˆ๋‹ค.

Direct Use

The following Transformers usage example was written with transformers == 4.48.1.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# bfloat16 weights, sharded across available GPUs
loading_args = {"torch_dtype": torch.bfloat16, "device_map": "auto"}
model = AutoModelForCausalLM.from_pretrained("TwinDoc/RedWhale-2-12B", **loading_args)
tokenizer = AutoTokenizer.from_pretrained("TwinDoc/RedWhale-2-12B")

text = "대한민국의 수도는 "  # "The capital of South Korea is "
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))

Example output:

"<|begin_of_text|>대한민국의 수도는 1000만여 명 이상이 거주하고 있는 서울로 대표되는 도심지이다. 본 연구에서는 서울의 중심을 나타내는 4대문 안을 도심지로 정의하고, 그 경계를 북악산, 인왕산, 남산, 낙산으로 구분하는 4산의 산줄기와 도로로 구성되는 8개의 변을 경계로 정한다. 국토 공간적 관점에서 우리나라의"

Out-of-Scope Use

Because this model has only undergone pretraining, it has no instruction-following ability. Rather than applying it directly to a specific task, we recommend using it as a base model for fine-tuning.
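When fine-tuning from a base model like this one, a common pattern is supervised fine-tuning on prompt–completion pairs where the loss is computed only on the completion tokens. The helper below is a minimal, library-free sketch of that label masking; the -100 ignore index follows the Hugging Face / PyTorch cross-entropy convention, and the token ids are placeholders.

```python
IGNORE_INDEX = -100  # PyTorch cross-entropy skips targets with this value

def build_labels(prompt_ids: list[int], completion_ids: list[int]) -> tuple[list[int], list[int]]:
    """Concatenate prompt and completion ids; mask the prompt in the labels.

    Returns (input_ids, labels) where labels equal input_ids on completion
    positions and IGNORE_INDEX on prompt positions, so the fine-tuning loss
    only rewards reproducing the completion.
    """
    input_ids = prompt_ids + completion_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return input_ids, labels
```

For example, `build_labels([1, 2, 3], [4, 5])` yields input ids `[1, 2, 3, 4, 5]` with labels `[-100, -100, -100, 4, 5]`, so gradient flows only through the completion.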

Training Details

Training Data

Training Procedure

Compute Infrastructure

Hardware

  • 4 × NVIDIA L40 48GB