Changgil/K2S3-v0.1


Developed by: K2S3

Model Number: K2S3-v0.1

Base Model Weight: mistralai/Mistral-7B-v0.1

Model Description

K2S3-v0.1 is built on the Mistral-7B-v0.1 weights. The model was depth up-scaled to twice its original size, and Korean vocabulary entries and merges were added to the tokenizer to strengthen Korean language processing.
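
The card does not say how the layers were arranged during depth up-scaling. As a minimal sketch only, the snippet below builds a layer plan by concatenating two copies of the base decoder stack. The function name `depth_upscale_plan` and the `overlap_drop` parameter are hypothetical; `overlap_drop=0` (plain duplication) is an assumption chosen because it matches the stated doubling in size, not a confirmed detail of how K2S3 built the model.

```python
def depth_upscale_plan(n_layers: int, overlap_drop: int = 0) -> list[int]:
    """Return the base-layer index used at each position of the up-scaled stack.

    Two copies of the base stack are concatenated. `overlap_drop` layers are
    removed from the end of the first copy and from the start of the second
    copy; 0 means the stack is simply duplicated.
    """
    if not 0 <= overlap_drop < n_layers:
        raise ValueError("overlap_drop must be in [0, n_layers)")
    first_copy = list(range(0, n_layers - overlap_drop))
    second_copy = list(range(overlap_drop, n_layers))
    return first_copy + second_copy


# Mistral-7B-v0.1 has 32 decoder layers.
BASE_LAYERS = 32

# Assumption: plain duplication, which doubles the depth.
plan = depth_upscale_plan(BASE_LAYERS)
print(len(plan))      # 64
print(plan[30:34])    # [30, 31, 0, 1]
```

In a real model, each position in the plan would receive a copy of the corresponding base layer's weights before fine-tuning.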

Training Data

The training data includes alpaca-gpt4-data and samples from the OpenOrca dataset.
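
The card does not state which prompt template was used during fine-tuning. As an illustration under that caveat, the sketch below formats a record with the `instruction`, `input`, and `output` fields found in Alpaca-style data, using the widely used Alpaca template. The helper names `format_alpaca_prompt` and `build_sft_text` are hypothetical, and the template is an assumption rather than the documented K2S3 format.

```python
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def format_alpaca_prompt(record: dict) -> str:
    """Render the prompt part of an Alpaca-style record."""
    if record.get("input", "").strip():
        return PROMPT_WITH_INPUT.format(
            instruction=record["instruction"], input=record["input"]
        )
    return PROMPT_NO_INPUT.format(instruction=record["instruction"])


def build_sft_text(record: dict) -> str:
    """Concatenate prompt and target response into one training string."""
    return format_alpaca_prompt(record) + record["output"]


# Made-up example record, for illustration only.
example = {
    "instruction": "Translate the sentence into Korean.",
    "input": "Good morning.",
    "output": "좋은 아침입니다.",
}
print(build_sft_text(example))
```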

Training Method

The base model was first expanded by K2S3 through depth up-scaling. The expanded model was then trained with supervised fine-tuning (SFT), updating all parameters (full-parameter tuning).

Hardware

Hardware: Two A100 GPUs (80 GB each) were used for training.
Training Factors: The model was fine-tuned with SFT using the Hugging Face SFTTrainer with FSDP (Fully Sharded Data Parallel) applied.


Model size: 14.4B params (Safetensors)

Tensor type: FP16
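
As a rough sanity check on the reported size, the sketch below counts the parameters of a Mistral-style decoder from the published Mistral-7B-v0.1 configuration (hidden size 4096, MLP size 14336, 32 attention heads, 8 key/value heads, vocabulary 32000). The function name `decoder_params` is hypothetical. The 64-layer case with an unchanged vocabulary is an assumption; the card gives neither the final layer count nor the extended vocabulary size, so the estimate is not expected to match 14.4B exactly.

```python
def decoder_params(n_layers: int, vocab_size: int = 32000, hidden: int = 4096,
                   mlp: int = 14336, n_heads: int = 32, n_kv_heads: int = 8) -> int:
    """Count parameters of a Mistral-style decoder (no biases, untied embeddings)."""
    head_dim = hidden // n_heads
    kv_dim = n_kv_heads * head_dim
    attention = 2 * hidden * hidden + 2 * hidden * kv_dim  # q, o and k, v projections
    feed_forward = 3 * hidden * mlp                        # gate, up, down projections
    norms = 2 * hidden                                     # two RMSNorm weights per layer
    per_layer = attention + feed_forward + norms
    embeddings = 2 * vocab_size * hidden                   # input embedding + LM head
    return n_layers * per_layer + embeddings + hidden      # + final RMSNorm


base = decoder_params(32)
doubled = decoder_params(64)  # assumption: 64 layers, vocabulary left at 32000
print(f"{base / 1e9:.2f}B")     # 7.24B
print(f"{doubled / 1e9:.2f}B")  # 14.22B
```

Under these assumptions the doubled stack comes to about 14.2B parameters. The remaining gap to the reported 14.4B would be consistent with a larger embedding table and LM head after the Korean vocabulary was added, though the card does not confirm this.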