
The license is cc-by-nc-sa-4.0.

  • Commercial use is not allowed.

We will upload it as soon as possible.

This model is not based on the Synatra model. We pre-trained and fully fine-tuned Mixtralx2 to enhance its Korean abilities.

DATASET.

  • Using a self-supervised approach, we converted a raw corpus into instruction-tuning data.

  • We used text-mining techniques to create the training data.

  • Here are some examples:

  • Mask prediction task


# Mask prediction

# English: "Intelligence (智能) refers to a person's <MASK> ability."
text='지능(智能) 또는 인텔리전스(intelligence)는 인간의 <MASK> 능력을 말한다.'

# English: "intellectual"
response='지적'

complete_text='지능(智能) 또는 인텔리전스(intelligence)는 인간의 지적 능력을 말한다.'
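
A sample of this shape could be produced from a raw sentence roughly as follows. This is a minimal sketch, not the authors' pipeline, which the card does not describe. The helper name `make_mask_sample` and the choice to mask one space-delimited word at random are assumptions made for illustration.

```python
import random

MASK_TOKEN = "<MASK>"  # placeholder string used in the sample above


def make_mask_sample(sentence, rng=None):
    """Build one mask-prediction sample by hiding a single word.

    Hypothetical helper: words are split on spaces, which is only a
    rough stand-in for whatever tokenization the authors used.
    """
    rng = rng or random.Random()
    words = sentence.split(" ")
    # Skip empty strings produced by repeated spaces.
    candidates = [i for i, word in enumerate(words) if word]
    if not candidates:
        raise ValueError("sentence has no words to mask")
    idx = rng.choice(candidates)
    masked = list(words)
    masked[idx] = MASK_TOKEN
    return {
        "text": " ".join(masked),
        "response": words[idx],
        "complete_text": sentence,
    }


sample = make_mask_sample(
    "지능(智能) 또는 인텔리전스(intelligence)는 인간의 지적 능력을 말한다.",
    random.Random(0),
)
# Filling the mask with the response restores the original sentence.
assert sample["text"].replace(MASK_TOKEN, sample["response"]) == sample["complete_text"]
```

Passing a seeded `random.Random` keeps the generated data reproducible. Masking whole space-delimited words keeps particles attached to their stems in Korean, which may or may not match what the authors did.
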
  • Text allign Task

#Text-allign Task

text_list=['๋ณต์ˆ˜๋ช…๋ น-๋ณต์ˆ˜์ž๋ฃŒ(MIMD,Multiple Instruction, Multiple Data)์€ ์ „์‚ฐ์—์„œ ๋ณ‘๋ ฌํ™”์˜ ํ•œ ๊ธฐ๋ฒ•์ด๋‹ค.',
           '๋ถ„์‚ฐ ๋ฉ”๋ชจ๋ฆฌ์˜ ์˜ˆ๋Š” MPP(massively parallel processors)์™€ COW (Clusters of Workstations)์ด๋‹ค.',
           'MIMD๊ธฐ๊ณ„๋Š” ๊ณต์œ  ๋ฉ”๋ชจ๋ฆฌ์ด๊ฑฐ๋‚˜ ๋ถ„์‚ฐ ๋ฉ”๋ชจ๋ฆฌ์ด๋ฉฐ ์ด๋Ÿฌํ•œ ๋ถ„๋ฅ˜๋Š” MIMD๊ฐ€ ์–ด๋–ป๊ฒŒ ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ์ด์šฉํ•˜๋Š๋ƒ์— ๋”ฐ๋ผ ๋‚˜๋‰œ๋‹ค.']



response='๋ณต์ˆ˜๋ช…๋ น-๋ณต์ˆ˜์ž๋ฃŒ(MIMD,Multiple Instruction, Multiple Data)์€ ์ „์‚ฐ์—์„œ ๋ณ‘๋ ฌํ™”์˜ ํ•œ ๊ธฐ๋ฒ•์ด๋‹ค. \
          MIMD๊ธฐ๊ณ„๋Š” ๊ณต์œ  ๋ฉ”๋ชจ๋ฆฌ์ด๊ฑฐ๋‚˜ ๋ถ„์‚ฐ ๋ฉ”๋ชจ๋ฆฌ์ด๋ฉฐ ์ด๋Ÿฌํ•œ ๋ถ„๋ฅ˜๋Š” MIMD๊ฐ€ ์–ด๋–ป๊ฒŒ ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ์ด์šฉํ•˜๋Š๋ƒ์— ๋”ฐ๋ผ ๋‚˜๋‰œ๋‹ค. \
          ๋ถ„์‚ฐ ๋ฉ”๋ชจ๋ฆฌ์˜ ์˜ˆ๋Š” MPP(massively parallel processors)์™€ COW (Clusters of Workstations)์ด๋‹ค.'
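
One way to derive such a sample from an ordered passage is to shuffle its sentences and keep the original order as the target. The sketch below is an illustration under that assumption. `make_align_sample` and its `max_tries` parameter are hypothetical names, not taken from the authors' code.

```python
import random


def make_align_sample(sentences, rng=None, max_tries=10):
    """Shuffle sentences for the prompt; the target is the original order."""
    rng = rng or random.Random()
    original = list(sentences)
    if len(original) < 2:
        raise ValueError("need at least two sentences to reorder")
    shuffled = list(original)
    # Reshuffle a few times so the prompt usually differs from the answer;
    # the bound avoids looping forever when all sentences are identical.
    for _ in range(max_tries):
        rng.shuffle(shuffled)
        if shuffled != original:
            break
    return {"text_list": shuffled, "response": " ".join(original)}


sample = make_align_sample(
    ["First sentence.", "Second sentence.", "Third sentence."],
    random.Random(0),
)
# The prompt holds the same sentences; only their order may differ.
assert sorted(sample["text_list"]) == sorted(
    ["First sentence.", "Second sentence.", "Third sentence."]
)
```

Joining the target with single spaces is a design choice made here for simplicity.
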
  • Text completion Task

#Text Completion

text= '๊ทธ๋ฆฐ๋ธŒ๋ผ์šฐ์ €(GreenBrowser)๋Š” ์ธํ„ฐ๋„ท ์ต์Šคํ”Œ๋กœ๋Ÿฌ์—์„œ ์‚ฌ์šฉํ•˜๋Š” ํŠธ๋ผ์ด๋˜ํŠธ ๋ ˆ์ด์•„์›ƒ ์—”์ง„์„ ๋ฐ”ํƒ•์œผ๋กœ ํ•˜๋ฉฐ ์ค‘๊ตญ์— ๊ธฐ๋ฐ˜์„ ๋‘” ์†Œํ”„ํŠธ์›จ์–ด ํšŒ์‚ฌ์ธ ๋ชจ์–ดํ€ต(morequick)์—์„œ ๋งŒ๋“  ๋ฌด๋ฃŒ ์›น ๋ธŒ๋ผ์šฐ์ €๋‹ค. ๊ฐ„์ฒด์ž ์ค‘๊ตญ์–ด๊ฐ€ ์›น ๋ธŒ๋ผ์šฐ์ €์— ๋‚ด์žฅ๋˜์–ด ์žˆ๋‹ค.
      ๋งฅ์Šคํ†ค ์›น ๋ธŒ๋ผ์šฐ์ €์™€ ๋น„์Šทํ•˜์—ฌ MyIE์™€ ๋ฐ€์ ‘ํ•˜๊ฒŒ ๊ด€๋ จ๋˜์–ด ์žˆ๋‹ค. ๋งฅ์Šคํ†ค์šฉ์˜ ์ผ๋ถ€ ํ”Œ๋Ÿฌ๊ทธ์ธ์ด ๊ทธ๋ฆฐ๋ธŒ๋ผ์šฐ์ €์—์„œ๋„ ์ž‘๋™ํ•  ๊ฒƒ์ด๋‹ค.'



response= '์ž๋™ ์Šคํฌ๋กค, ์ž๋™ ๋ฆฌํ”„๋ ˆ์‹œ, ์ž๋™ ์ €์žฅ, ์ž๋™ ํผ ์ฑ„์šฐ๊ธฐ์™€ ๊ฐ™์€ ๋งŽ์€ ์ž๋™ํ™” ๊ธฐ๋Šฅ์ด ์žˆ๋‹ค.'
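
A completion sample like this can be cut from a longer passage by using the leading sentences as the prompt and the next sentence as the target. The following is a minimal sketch under that assumption. The function name `make_completion_sample`, the `n_context` parameter, and the naive punctuation-based sentence splitter are all hypothetical stand-ins, since the card does not say how the authors split their text.

```python
import re


def make_completion_sample(document, n_context):
    """Use the first n_context sentences as prompt and the next one as target."""
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    # It will mis-split abbreviations; a real pipeline would likely use
    # a proper sentence segmenter.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]
    if not 1 <= n_context < len(sentences):
        raise ValueError("n_context must leave at least one sentence as the target")
    return {
        "text": " ".join(sentences[:n_context]),
        "response": sentences[n_context],
    }


sample = make_completion_sample("One. Two. Three. Four.", n_context=2)
assert sample == {"text": "One. Two.", "response": "Three."}
```

Only the single sentence after the context is used as the target here, mirroring the one-sentence response in the sample above.
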

Model size: 12.9B params (Safetensors). Tensor type: F32.