Developed by:
- Changgil Song
Model Number:
- k2s3_test_24001
Base Model:
- meta-llama/Llama-2-13b-chat-hf
Training Data
- The model was trained on a dataset of approximately 800 million tokens, drawn from the Standard Korean Dictionary, the KULLM training data from Korea University, abstracts of master's and doctoral theses, and Korean language samples from AI Hub.
Training Method
- This model was fine-tuned from the "meta-llama/Llama-2-13b-chat-hf" base model using LoRA (Low-Rank Adaptation) through PEFT (Parameter-Efficient Fine-Tuning).
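As a rough illustration of what LoRA does, the sketch below applies a low-rank update to a frozen weight matrix: the effective weight is W + (alpha / r) * B @ A, where only the small matrices A and B are trained. The toy matrices and the helper names `matmul` and `lora_effective_weight` are hypothetical and chosen for readability; this is not the PEFT implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, r, alpha):
    """Return W + (alpha / r) * B @ A for a frozen weight W and adapter (A, B)."""
    scale = alpha / r
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy values (hypothetical): a 2x2 frozen weight and a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]    # shape r x d_in
B = [[0.5], [0.0]]  # shape d_out x r

W_eff = lora_effective_weight(W, A, B, r=1, alpha=2)
print(W_eff)
```

Because W itself is never modified, the adapter can be stored separately or merged into the base weights after training.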
Hardware and Software
- Hardware: Two A100 GPUs (80 GB each) were used for training.
- Training Factors: This model was fine-tuned using PEFT LoRA with the HuggingFace SFTTrainer and FSDP. Key parameters were LoRA r = 8, LoRA alpha = 16, 2 training epochs, a batch size of 1, and 32 gradient accumulation steps.
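The arithmetic behind these settings can be sketched as follows. The card does not say which modules LoRA was applied to or whether the batch size of 1 is per GPU, so the values marked as assumptions below (two 5120 x 5120 attention projections per layer, per-device batch size) are illustrative only. The hidden size of 5120 and the 40 layers are those of Llama-2-13B.

```python
# Values stated in the card.
LORA_R = 8
LORA_ALPHA = 16
BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 32
NUM_GPUS = 2

# Llama-2-13B architecture.
HIDDEN_SIZE = 5120
NUM_LAYERS = 40

# Assumption (not stated in the card): LoRA targets two square
# attention projections per layer, for example q_proj and v_proj.
TARGETS_PER_LAYER = 2

# LoRA scales the low-rank update by alpha / r.
lora_scale = LORA_ALPHA / LORA_R

# Each adapted d_out x d_in matrix adds r * (d_in + d_out) trainable parameters.
params_per_module = LORA_R * (HIDDEN_SIZE + HIDDEN_SIZE)
trainable_params = params_per_module * TARGETS_PER_LAYER * NUM_LAYERS

# Sequences contributing to one optimizer step.
per_gpu_sequences = BATCH_SIZE * GRAD_ACCUM_STEPS
# Assumption: the batch size of 1 is per device, so both GPUs contribute.
global_sequences = per_gpu_sequences * NUM_GPUS

print(f"LoRA scale: {lora_scale}")
print(f"Trainable adapter parameters: {trainable_params:,}")
print(f"Sequences per optimizer step: {global_sequences}")
```

Under these assumptions the adapter holds a few million trainable parameters, a small fraction of the roughly 13 billion frozen base weights.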
Caution
- When fine-tuning this model further, take into account the parameters used during training, such as the LoRA r and LoRA alpha values, to ensure compatibility and good performance.
Additional Information
- Training used FSDP (Fully Sharded Data Parallel) through the HuggingFace SFTTrainer to reduce per-GPU memory usage and speed up training.
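To give a sense of why sharding helps, the sketch below estimates how much memory the frozen base weights occupy per GPU when they are fully sharded across the two GPUs. The parameter count is rounded to 13 billion and the 2-byte (bf16/fp16) storage precision is an assumption; the card does not state the training precision. Activations, gradients, optimizer state, and the temporary unsharded copies FSDP gathers during forward and backward passes are not counted.

```python
# Approximate parameter count of the base model (rounded).
BASE_PARAMS = 13_000_000_000
# Assumption (not stated in the card): weights stored in 16-bit precision.
BYTES_PER_PARAM = 2
NUM_GPUS = 2

total_gb = BASE_PARAMS * BYTES_PER_PARAM / 1e9
# With full sharding, each GPU holds roughly 1 / NUM_GPUS of the weights at rest.
per_gpu_gb = total_gb / NUM_GPUS

print(f"Base weights in total: {total_gb:.1f} GB")
print(f"Base weights per GPU when sharded: {per_gpu_gb:.1f} GB")
```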