This model was developed with OpenChat 3.5 as its foundation model so that it can be applied to the Korean language and to Korea's diverse cultural contexts. It was trained on self-built Korean data spanning 53 domains, enabling it to understand the values and culture of Korean society.

โถ ๋ชจ๋ธ ์„ค๋ช…

  • ๋ชจ๋ธ๋ช… ๋ฐ ์ฃผ์š”๊ธฐ๋Šฅ: ํ•ด๋‹น ๋ชจ๋ธ์€์€ OpenChat 3.5 ๋ชจ๋ธ์„ ๊ธฐ๋ฐ˜์œผ๋กœ SFT ๋ฐฉ์‹์œผ๋กœ ํŒŒ์ธํŠœ๋‹๋œ Mistral 7B / openchat3.5 ๊ธฐ๋ฐ˜ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ํ•œ๊ตญ์–ด์™€ ํ•œ๊ตญ์˜ ๋‹ค์–‘ํ•œ ๋ฌธํ™”์  ๋งฅ๋ฝ์„ ์ดํ•ดํ•˜๋„๋ก ์„ค๊ณ„๋˜์—ˆ์œผ๋ฉฐ โœจโœจ, ์ž์ฒด ์ œ์ž‘ํ•œ 53๊ฐœ ์˜์—ญ์˜ ํ•œ๊ตญ์–ด ๋ฐ์ดํ„ฐ๋ฅผ ํ™œ์šฉํ•ด ํ•œ๊ตญ ์‚ฌํšŒ์˜ ๊ฐ€์น˜์™€ ๋ฌธํ™”๋ฅผ ๋ฐ˜์˜ํ•ฉ๋‹ˆ๋‹ค. ์ฃผ์š” ๊ธฐ๋Šฅ์œผ๋กœ๋Š” ํ…์ŠคํŠธ ์ƒ์„ฑ, ๋Œ€ํ™” ์ถ”๋ก , ๋ฌธ์„œ ์š”์•ฝ, ์งˆ์˜์‘๋‹ต, ๊ฐ์ • ๋ถ„์„ ๋ฐ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ๊ด€๋ จ ๋‹ค์–‘ํ•œ ์ž‘์—…์„ ์ง€์›ํ•˜๋ฉฐ, ํ™œ์šฉ ๋ถ„์•ผ๋Š” ๋ฒ•๋ฅ , ์žฌ๋ฌด, ๊ณผํ•™, ๊ต์œก, ๋น„์ฆˆ๋‹ˆ์Šค, ๋ฌธํ™” ์—ฐ๊ตฌ ๋“ฑ ๋‹ค์–‘ํ•œ ๋ถ„์•ผ์—์„œ ์‘์šฉ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • ๋ชจ๋ธ ์•„ํ‚คํ…์ฒ˜:ํ•ด๋‹น ๋ชจ๋ธ์€์€ Mistral 7B ๋ชจ๋ธ์„ ๊ธฐ๋ฐ˜์œผ๋กœ, ํŒŒ๋ผ๋ฏธํ„ฐ ์ˆ˜๋Š” 70์–ต ๊ฐœ(7B)๋กœ ๊ตฌ์„ฑ๋œ ๊ณ ์„ฑ๋Šฅ ์–ธ์–ด ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ์ด ๋ชจ๋ธ์€ OpenChat 3.5๋ฅผ ํŒŒ์šด๋ฐ์ด์…˜ ๋ชจ๋ธ๋กœ ์‚ผ์•„, SFT(์ง€๋„ ๋ฏธ์„ธ ์กฐ์ •) ๋ฐฉ์‹์„ ํ†ตํ•ด ํ•œ๊ตญ์–ด์™€ ํ•œ๊ตญ ๋ฌธํ™”์— ํŠนํ™”๋œ ์„ฑ๋Šฅ์„ ๋ฐœํœ˜ํ•˜๋„๋ก ํ›ˆ๋ จ๋˜์—ˆ์Šต๋‹ˆ๋‹ค. Mistral 7B์˜ ๊ฒฝ๋Ÿ‰ํ™”๋œ ๊ตฌ์กฐ๋Š” ๋น ๋ฅธ ์ถ”๋ก  ์†๋„์™€ ๋ฉ”๋ชจ๋ฆฌ ํšจ์œจ์„ฑ์„ ๋ณด์žฅํ•˜๋ฉฐ, ๋‹ค์–‘ํ•œ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ์ž‘์—…์— ์ ํ•ฉํ•˜๊ฒŒ ์ตœ์ ํ™”๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ์ด ์•„ํ‚คํ…์ฒ˜๋Š” ํ…์ŠคํŠธ ์ƒ์„ฑ, ์งˆ์˜์‘๋‹ต, ๋ฌธ์„œ ์š”์•ฝ, ๊ฐ์ • ๋ถ„์„๊ณผ ๊ฐ™์€ ๋‹ค์–‘ํ•œ ์ž‘์—…์—์„œ ํƒ์›”ํ•œ ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.

โท ํ•™์Šต ๋ฐ์ดํ„ฐ

  • This model was trained on a total of 3.6GB of self-built data comprising 2.33 million instances of Q&A, summarization, classification, and similar tasks. Of these, 1.33 million are multiple-choice questions spanning 53 domains, including Korean history, social studies, finance, law, tax, mathematics, biology, physics, and chemistry, trained in a Chain-of-Thought format. A further 1.3 million open-ended questions cover 38 domains such as Korean history, finance, law, tax, and mathematics. The training data also includes examples designed to teach the model Korean social values and human emotions, and to generate output that follows the given instructions.
  • Training instruction dataset format:
    {"prompt": "prompt text", "completion": "ideal generated text"}
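As an illustration of this schema, records can be stored one JSON object per line (JSONL) and validated before training. The sample record below is hypothetical, not taken from the actual dataset:

```python
import json

# Hypothetical record in the prompt/completion schema described above.
record = {
    "prompt": "한국의 수도는 어디인가?",        # "What is the capital of Korea?"
    "completion": "한국의 수도는 서울입니다.",  # "The capital of Korea is Seoul."
}

# Serialize as one JSONL line and check the round trip.
line = json.dumps(record, ensure_ascii=False)
parsed = json.loads(line)
assert set(parsed) == {"prompt", "completion"}            # only the two expected keys
assert all(isinstance(v, str) and v for v in parsed.values())
print(line)
```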

โธ ์‚ฌ์šฉ ์‚ฌ๋ก€

ํ•ด๋‹น ๋ชจ๋ธ์€ ๋‹ค์–‘ํ•œ ์‘์šฉ ๋ถ„์•ผ์—์„œ ์‚ฌ์šฉ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด:

  • ๊ต์œก ๋ถ„์•ผ: ์—ญ์‚ฌ, ์ˆ˜ํ•™, ๊ณผํ•™ ๋“ฑ ๋‹ค์–‘ํ•œ ํ•™์Šต ์ž๋ฃŒ์— ๋Œ€ํ•œ ์งˆ์˜์‘๋‹ต ๋ฐ ์„ค๋ช… ์ƒ์„ฑ.
  • ๋น„์ฆˆ๋‹ˆ์Šค: ๋ฒ•๋ฅ , ์žฌ๋ฌด, ์„ธ๋ฌด ๊ด€๋ จ ์งˆ์˜์— ๋Œ€ํ•œ ๋‹ต๋ณ€ ์ œ๊ณต ๋ฐ ๋ฌธ์„œ ์š”์•ฝ.
  • ์—ฐ๊ตฌ ๋ฐ ๋ฌธํ™”: ํ•œ๊ตญ ์‚ฌํšŒ์™€ ๋ฌธํ™”์— ๋งž์ถ˜ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ์ž‘์—…, ๊ฐ์ • ๋ถ„์„, ๋ฌธ์„œ ์ƒ์„ฑ ๋ฐ ๋ฒˆ์—ญ.
  • ๊ณ ๊ฐ ์„œ๋น„์Šค: ์‚ฌ์šฉ์ž์™€์˜ ๋Œ€ํ™” ์ƒ์„ฑ ๋ฐ ๋งž์ถคํ˜• ์‘๋‹ต ์ œ๊ณต.
  • ์ด ๋ชจ๋ธ์€ ํ•œ๊ตญ ์‚ฌํšŒ์— ๋Œ€ํ•œ ์ดํ•ด์™€ ๋‹ค์–‘ํ•œ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ์ž‘์—…์—์„œ ๋†’์€ ํ™œ์šฉ๋„๋ฅผ ๊ฐ€์ง‘๋‹ˆ๋‹ค.
  • Examples:
  1. Whose territory is Dokdo? Dokdo is the territory of the Republic of Korea; it is also known by its Korean name, Dokdo, and it is the name of the island located at Dokdo.

  2. What is the sea to the east of Korea called? The sea to the east of Korea is called the 'East Sea.' The East Sea lies off Korea's east coast and borders North Korea. It is one of Korea's important sea lanes, playing a key role in maritime transport and fishing.

  3. What do you think about Japan's colonization of Korea in the early 20th century? Joseon came under Japanese colonization in the mid-nineteenth century. This came about through Japan's military and economic strength and its political intrusion. Joseon yielded to Japan's military incursion, and the colonial relationship began. Under these circumstances the people of Joseon must have felt great anxiety and frustration. In the late colonial period, however, as Japan's colonial system gradually weakened, the Korean people came to pursue greater freedom and independence. Against this historical background, the people of Joseon would have held a very negative attitude toward colonization.

  4. What do you think about the patriot An Jung-geun's shooting of Ito Hirobumi? An Jung-geun shot Ito Hirobumi on April 27, 1909. Opposing Japanese imperialist politics and military activity, he attacked Ito to block Japan's rule by force. An Jung-geun is known in Korea as an independence activist, and his act is regarded as one of the pivotal events of the Korean independence movement. He was recognized as a foremost independence activist in 1946, and his deeds have inspired many other independence activists.

  5. How important do you consider community spirit and the value of cooperation in Korean society? They are very important. Korea traditionally has a strong sense of community and a deep-rooted culture of cooperation within families and local communities. These values still play an important role in today's society and are especially helpful with social issues such as care for the elderly. They also foster individual happiness and a sense of security. Maintaining and developing these values is therefore an important goal for Korean society.

โน ํ•œ๊ณ„ โ›ˆโ›ˆ

  • Although this model specializes in the Korean language and Korean culture, a lack of data in certain areas (e.g., the latest international material or highly specialized fields) can reduce the accuracy of its responses about other languages or cultures. It may also show limited reasoning on problems that demand complex logical thought, and if biased data is included in training, it may generate biased responses.

โบ ์‚ฌ์šฉ ๋ฐฉ๋ฒ•


  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("SEOKDONG/openchat3.5_korean_v1.0_sft")
  model = AutoModelForCausalLM.from_pretrained("SEOKDONG/openchat3.5_korean_v1.0_sft")

  input_text = """「국민건강보험법」 제44조, 「국민건강보험법 시행령」 제19조, 「약관의 규제에 관한 법률」 제5조, 「상법」 제54조 참조 판단 해줘""" + " 답변:"
  inputs = tokenizer(input_text, return_tensors="pt")
  with torch.no_grad():
      outputs = model.generate(**inputs, max_length=1024, temperature=0.5, do_sample=True, repetition_penalty=1.15)

  result = tokenizer.decode(outputs[0], skip_special_tokens=True)
  print(result)
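Because the generated sequence includes the prompt, the completion usually has to be split out of the decoded string. A minimal sketch, assuming the prompt ends with the answer marker 답변: ("answer:") as in the usage example; the marker handling and sample text are illustrative assumptions, not part of the model's API:

```python
# Split a decoded sequence into prompt and completion at the final
# answer marker. The marker and the sample string below are assumptions
# matching the usage example above.
MARKER = "답변:"

def extract_answer(decoded: str) -> str:
    """Return the text after the last answer marker, stripped of whitespace."""
    head, sep, tail = decoded.rpartition(MARKER)
    return tail.strip() if sep else decoded.strip()

decoded = "「상법」 제54조 참조 판단 해줘 답변: 상법 제54조는 상사법정이율을 정하고 있습니다."
print(extract_answer(decoded))  # prints only the text after "답변:"
```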

Hereโ€™s the English version of the provided text:

โถ Model Description

Model Name and Key Features:
This Model is based on the OpenChat 3.5 model, fine-tuned using the SFT method on the Mistral 7B model. It is designed to understand Korean and various cultural contexts, utilizing data from 135 domains in Korean society. The model supports tasks such as text generation, conversation inference, document summarization, question answering, sentiment analysis, and other NLP tasks. Its applications span fields like law, finance, science, education, business, and cultural research.

Model Architecture:
This Model is a high-performance language model with 7 billion parameters based on the Mistral 7B model. It uses OpenChat 3.5 as the foundation and is fine-tuned using SFT to excel in Korean language and culture. The streamlined Mistral 7B architecture ensures fast inference and memory efficiency, optimized for various NLP tasks like text generation, question answering, document summarization, and sentiment analysis.


โท Training Data

This Model was trained on 3.6GB of data, comprising 2.33 million Q&A instances. This includes 1.33 million multiple-choice questions across 53 domains such as history, finance, law, tax, and science, trained with the Chain of Thought method. Additionally, 1.3 million short-answer questions cover 38 domains including history, finance, and law.

Training Instruction Dataset Format:
{"prompt": "prompt text", "completion": "ideal generated text"}


โธ Use Cases

This Model can be used across multiple fields, such as:

  • Education: Answering questions and generating explanations for subjects like history, math, and science.
  • Business: Providing responses and summaries for legal, financial, and tax-related queries.
  • Research and Culture: Performing NLP tasks, sentiment analysis, document generation, and translation.
  • Customer Service: Generating conversations and personalized responses for users.

This model is highly versatile in various NLP tasks.


โน Limitations

This Model is specialized in Korean language and culture. However, it may lack accuracy in responding to topics outside its scope, such as international or specialized data. Additionally, it may have limited reasoning ability for complex logical problems and may produce biased responses if trained on biased data.
