Qwen2.5-7B-Chef-VN
Qwen2.5-7B-Chef-VN is a fine-tuned large language model specialized in the culinary domain. Acting as a "Master Chef", it provides detailed, step-by-step cooking instructions, exact ingredient measurements, and culinary advice primarily in Vietnamese.
Model Details
Model Description
This model was fine-tuned using Supervised Fine-Tuning (SFT) and QLoRA on the Qwen/Qwen2.5-7B-Instruct base model. The training data was derived from the AkashPS11/recipes_data_food.com dataset, which was parsed and formatted into a conversational ChatML structure to teach the model how to guide users through recipes interactively.
- Developed by: NotIsora (Đoàn Thiên An)
- Model type: Causal Language Model (fine-tuned via QLoRA)
- Language(s) (NLP): Vietnamese (vi), English (en)
- License: Apache 2.0
- Finetuned from model: Qwen/Qwen2.5-7B-Instruct
Uses
Direct Use
The model is intended to be used as a virtual chef or culinary assistant. Users can input a dish name or a list of available ingredients, and the model will return a comprehensive cooking guide including:
- Ingredient lists with quantities.
- Step-by-step preparation and cooking instructions.
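The two input styles described above (a dish name, or a list of available ingredients) can be sketched as a small prompt builder. This is only an illustration: `build_messages` is a hypothetical helper, the system prompt and the dish request are copied from this card's getting-started example, and the Vietnamese wording of the ingredient request is an assumption rather than a phrasing the model is known to have been trained on.

```python
from typing import Dict, List, Optional, Sequence

# System prompt taken from this card's getting-started example.
SYSTEM_PROMPT = (
    "Bạn là một siêu đầu bếp. Người dùng sẽ cung cấp nguyên liệu hoặc một món ăn, "
    "nhiệm vụ của bạn là hướng dẫn họ cách nấu chi tiết và ngon nhất."
)


def build_messages(
    dish: Optional[str] = None,
    ingredients: Optional[Sequence[str]] = None,
) -> List[Dict[str, str]]:
    """Build chat messages from a dish name or a list of ingredients (hypothetical helper)."""
    if dish:
        # "Please guide me through cooking <dish>."
        user = f"Hãy hướng dẫn tôi nấu ăn món {dish}."
    elif ingredients:
        # Assumed wording: "I have these ingredients: ... Suggest a dish and guide me."
        user = (
            "Tôi có các nguyên liệu sau: "
            + ", ".join(ingredients)
            + ". Hãy gợi ý một món ăn và hướng dẫn tôi cách nấu."
        )
    else:
        raise ValueError("Provide a dish name or a list of ingredients.")
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
    ]


by_dish = build_messages(dish="khoai tây nghiền")
by_ingredients = build_messages(ingredients=["khoai tây", "bơ"])
print(by_dish[1]["content"])
print(by_ingredients[1]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` exactly like the hand-written `messages` in the getting-started code.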
Out-of-Scope Use
The model should not be used for:
- Medical or dietary advice (e.g., prescribing diets for medical conditions).
- Generating harmful, toxic, or unsafe content.
- Tasks entirely unrelated to food, cooking, or culinary arts (its performance may degrade outside its specialized domain).
If you want to train by yourself
Bias, Risks, and Limitations
The model generates detailed recipes, but cooking carries physical risks (e.g., knives, hot surfaces, food safety, allergies). Users should exercise common sense and verify food safety standards independently. The model may occasionally hallucinate ingredients or steps that do not match traditional recipes.
How to Get Started with the Model
Use the code below to get started with the model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO_ID = "NotIsora/Qwen2.5-7B-Chef-VN"

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

messages = [
    # System prompt: "You are a master chef. The user will provide ingredients or a dish;
    # your task is to guide them through cooking it in detail and as deliciously as possible."
    {"role": "system", "content": "Bạn là một siêu đầu bếp. Người dùng sẽ cung cấp nguyên liệu hoặc một món ăn, nhiệm vụ của bạn là hướng dẫn họ cách nấu chi tiết và ngon nhất."},
    # User request: "Please guide me through cooking mashed potatoes."
    {"role": "user", "content": "Hãy hướng dẫn tôi nấu ăn món khoai tây nghiền."},
]

# Render the conversation with the model's chat template, then tokenize it.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Send inputs to the device chosen by device_map="auto" instead of hard-coding "cuda".
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=1024,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
        repetition_penalty=1.15,
    )

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
Training Details
Training Data
- The model was trained on a processed subset of the AkashPS11/recipes_data_food.com dataset. The data was filtered, parsed, and converted into ChatML format to simulate a user asking for a recipe and a chef responding with structured instructions.
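The conversion described above might look roughly like the sketch below. It is not the author's preprocessing script: the record fields (`name`, `ingredients`, `steps`), the placeholder system prompt, and the user question template are all assumptions made for illustration, and the real dataset columns may differ. Only the `<|im_start|>` / `<|im_end|>` delimiters are standard ChatML.

```python
from typing import Dict, List

# Placeholder system prompt (assumption, not the prompt used in training).
SYSTEM_PROMPT = "You are a master chef."


def recipe_to_messages(record: Dict) -> List[Dict[str, str]]:
    """Turn one recipe record (hypothetical schema) into a three-turn conversation."""
    ingredients = "\n".join(f"- {item}" for item in record["ingredients"])
    steps = "\n".join(
        f"{i}. {step}" for i, step in enumerate(record["steps"], start=1)
    )
    answer = f"Ingredients:\n{ingredients}\n\nInstructions:\n{steps}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"How do I make {record['name']}?"},
        {"role": "assistant", "content": answer},
    ]


def to_chatml(messages: List[Dict[str, str]]) -> str:
    """Serialize messages with ChatML delimiters."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


# Made-up example record, not taken from the dataset.
example = {
    "name": "mashed potatoes",
    "ingredients": ["500 g potatoes", "30 g butter"],
    "steps": ["Boil the potatoes until tender.", "Mash with the butter."],
}
chatml_text = to_chatml(recipe_to_messages(example))
print(chatml_text)
```

In practice the tokenizer's own chat template would normally do the serialization; the explicit `to_chatml` function is here only to show what the formatted text looks like.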
Training Procedure
- The model was trained using parameter-efficient fine-tuning (QLoRA) to optimize VRAM usage while maintaining performance.
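To give a feel for the VRAM saving, the back-of-the-envelope calculation below compares the memory needed for the frozen base weights at 16-bit and at 4-bit precision. It assumes a parameter count of roughly 7.61 billion for the base model and counts weights only; quantization constants, LoRA adapters, optimizer state, and activations are ignored, so real usage will be higher.

```python
# Approximate parameter count of the base model (assumption).
NUM_PARAMS = 7.61e9


def weight_memory_gb(bits_per_param: int, num_params: float = NUM_PARAMS) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


bf16_gb = weight_memory_gb(16)  # base weights kept in bf16
int4_gb = weight_memory_gb(4)   # base weights quantized to 4 bits

print(f"bf16 weights:  {bf16_gb:.2f} GB")
print(f"4-bit weights: {int4_gb:.2f} GB")
print(f"reduction:     {bf16_gb / int4_gb:.0f}x")
```

Under these assumptions the quantized base weights take about a quarter of the bf16 footprint, which is the main reason a 7B model can be fine-tuned on a single GPU.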
Training Hyperparameters
- Training regime: bf16 mixed precision
- Epochs: 6
- Max Sequence Length: 1024
- Per-device Batch Size: 2
- Gradient Accumulation Steps: 4
- Optimizer: paged_adamw_8bit
- Learning Rate: 5e-5
- Learning Rate Scheduler: Cosine
- Warmup Ratio: 0.1
- LoRA Rank (r): 16
- LoRA Alpha: 32
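A few quantities follow directly from the hyperparameters above. The sketch below works them out, assuming a single GPU and the standard LoRA scaling factor of alpha / r (not the rank-stabilized variant). The dataset size passed to `schedule` is a made-up number, since the card does not state how many examples were used, and the trainer's exact rounding of step counts may differ.

```python
import math

# Values from the hyperparameter list above.
PER_DEVICE_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4
EPOCHS = 6
WARMUP_RATIO = 0.1
LORA_R = 16
LORA_ALPHA = 32

NUM_GPUS = 1  # assumption: single-GPU training

# Number of examples contributing to each optimizer update.
effective_batch_size = PER_DEVICE_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_GPUS

# Standard LoRA multiplies the low-rank update by alpha / r.
lora_scaling = LORA_ALPHA / LORA_R


def schedule(num_examples: int) -> tuple:
    """Approximate total optimizer steps and warmup steps for a dataset size."""
    steps_per_epoch = math.ceil(num_examples / effective_batch_size)
    total_steps = steps_per_epoch * EPOCHS
    warmup_steps = round(total_steps * WARMUP_RATIO)
    return total_steps, warmup_steps


# Hypothetical dataset size, for illustration only.
total_steps, warmup_steps = schedule(10_000)

print("effective batch size:", effective_batch_size)
print("LoRA scaling:", lora_scaling)
print("total steps:", total_steps, "warmup steps:", warmup_steps)
```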
Technical Specifications
Compute Infrastructure:
- The model was trained on Google Colab.
Hardware:
- GPU: 1x NVIDIA L4 / T4 Tensor Core GPU
Software:
- PyTorch
- Transformers
- PEFT
- TRL
- BitsAndBytes
- FlashAttention / SDPA
Model Card Contact
For any questions, issues, or collaborations, feel free to reach out via Hugging Face.