---
license: mit
language:
  - ko
  - en
base_model: MLP-KTLim/llama-3-Korean-Bllossom-8B
---

# Model Card for Model ID

This model is an AWS Neuron-compiled version (neuronx-cc 2.14) of the Korean fine-tuned model [MLP-KTLim/llama-3-Korean-Bllossom-8B](https://huggingface.co/MLP-KTLim/llama-3-Korean-Bllossom-8B). It is intended for deployment on Amazon EC2 Inferentia2 instances and Amazon SageMaker. For detailed information about the model and its license, please refer to the original MLP-KTLim/llama-3-Korean-Bllossom-8B model page.

## Model Details

This model was compiled with neuronx-cc version 2.14. It can be deployed to a SageMaker endpoint with the [v1.0-hf-tgi-0.0.24-pt-2.1.2-inf-neuronx-py310](https://github.com/aws/deep-learning-containers/releases?q=tgi+AND+neuronx&expanded=true) Deep Learning Container; note that this inference Docker image is intended to be used on SageMaker.
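
As a minimal sketch of such a deployment (assuming the SageMaker Python SDK, a SageMaker execution role, and serving environment variables that you would adjust to match your account and the Neuron compilation settings), it might look like the following:

```python
# Minimal deployment sketch: serve this model on a SageMaker endpoint with the
# TGI neuronx Deep Learning Container. Role, model id, instance type, and the
# serving environment values below are assumptions to adapt, not fixed settings.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes you run this inside SageMaker

# TGI 0.0.24 neuronx image, matching the container release noted above
image_uri = get_huggingface_llm_image_uri("huggingface-neuronx", version="0.0.24")

model = HuggingFaceModel(
    image_uri=image_uri,
    role=role,
    env={
        "HF_MODEL_ID": "<this-repository-id>",  # placeholder: set to this repo's id
        "HF_NUM_CORES": "2",                    # Neuron cores available on inf2.xlarge
        "HF_BATCH_SIZE": "1",
        "HF_SEQUENCE_LENGTH": "4096",
        "HF_AUTO_CAST_TYPE": "fp16",
        "MAX_INPUT_LENGTH": "2048",
        "MAX_TOTAL_TOKENS": "4096",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.inf2.xlarge",               # see the Hardware section below
    container_startup_health_check_timeout=1800,  # Neuron model loading can be slow
)
```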

This model can be deployed to an Amazon SageMaker endpoint by following this guide: Deploying a Model Stored in S3 to SageMaker INF2 (S3 ์— ์ €์žฅ๋œ ๋ชจ๋ธ์„ SageMaker INF2 ์— ๋ฐฐํฌํ•˜๊ธฐ).
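
Once the endpoint is in service, it can be invoked with the SageMaker runtime client; the endpoint name, prompt, and generation parameters in this sketch are placeholders:

```python
# Minimal invocation sketch for the deployed SageMaker endpoint.
# Endpoint name, prompt, and generation parameters are assumptions.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

payload = {
    "inputs": "๋”ฅ๋Ÿฌ๋‹์ด ๋ฌด์—‡์ธ์ง€ ํ•œ ๋ฌธ์žฅ์œผ๋กœ ์„ค๋ช…ํ•ด ์ฃผ์„ธ์š”.",  # "Explain deep learning in one sentence."
    "parameters": {"max_new_tokens": 256, "temperature": 0.7},
}

response = runtime.invoke_endpoint(
    EndpointName="<your-endpoint-name>",  # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(response["Body"].read().decode("utf-8"))  # TGI returns the generated text as JSON
```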

For details on Neuron compilation and deployment, you can refer to the guide Serving on Amazon EC2 Inferentia2 with Docker Images from Amazon ECR (Amazon ECR ์˜ ๋„์ปค ์ด๋ฏธ์ง€ ๊ธฐ๋ฐ˜ํ•˜์— Amazon EC2 Inferentia2 ์„œ๋น™ํ•˜๊ธฐ).
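
For reference, a Neuron export of the base model can be reproduced with `optimum-neuron` on an Inferentia2 instance; the batch size, sequence length, core count, and cast type below are assumptions, not necessarily the exact configuration used to build this artifact:

```python
# Compilation sketch: export the base Korean model to a Neuron-compiled artifact
# with optimum-neuron. The shape and core settings are assumptions; adjust them
# to your target Inf2 instance before uploading the result to S3 or the Hub.
from optimum.neuron import NeuronModelForCausalLM
from transformers import AutoTokenizer

model_id = "MLP-KTLim/llama-3-Korean-Bllossom-8B"

model = NeuronModelForCausalLM.from_pretrained(
    model_id,
    export=True,            # triggers neuronx-cc compilation
    batch_size=1,
    sequence_length=4096,
    num_cores=2,            # Neuron cores on inf2.xlarge
    auto_cast_type="fp16",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Save the compiled model so it can be uploaded and later deployed.
model.save_pretrained("llama-3-korean-bllossom-8b-neuron")
tokenizer.save_pretrained("llama-3-korean-bllossom-8b-neuron")
```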

## Hardware

At minimum, you can use an Amazon EC2 inf2.xlarge instance; more powerful instances in the same family, such as inf2.8xlarge, inf2.24xlarge, and inf2.48xlarge, also work. For detailed information, see Amazon EC2 Inf2 Instances.

## Model Card Contact

Gonsoo Moon, gonsoomoon@gmail.com