Instructions for using NotIsora/Qwen2.5-7B-Chef-VN with libraries and local apps.

Libraries

How to use NotIsora/Qwen2.5-7B-Chef-VN with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="NotIsora/Qwen2.5-7B-Chef-VN")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("NotIsora/Qwen2.5-7B-Chef-VN")
model = AutoModelForCausalLM.from_pretrained("NotIsora/Qwen2.5-7B-Chef-VN")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use NotIsora/Qwen2.5-7B-Chef-VN with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "NotIsora/Qwen2.5-7B-Chef-VN"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NotIsora/Qwen2.5-7B-Chef-VN",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
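
The same request can be sent from Python. The sketch below uses only the standard library and assumes the vLLM server from the commands above is listening on `localhost:8000`; the helper names `build_chat_request` and `chat` are hypothetical and not part of vLLM.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # port taken from the curl example above
MODEL_ID = "NotIsora/Qwen2.5-7B-Chef-VN"


def build_chat_request(user_text, base_url=BASE_URL, model=MODEL_ID):
    """Build the POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text, **kwargs):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(user_text, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible servers put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]


# Usage (requires the server to be running):
# print(chat("What is the capital of France?"))
```

Passing `base_url="http://localhost:30000/v1"` should target an SGLang server started on port 30000 in the same way.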

Use Docker

docker model run hf.co/NotIsora/Qwen2.5-7B-Chef-VN

SGLang

How to use NotIsora/Qwen2.5-7B-Chef-VN with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "NotIsora/Qwen2.5-7B-Chef-VN" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NotIsora/Qwen2.5-7B-Chef-VN",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "NotIsora/Qwen2.5-7B-Chef-VN" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "NotIsora/Qwen2.5-7B-Chef-VN",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use NotIsora/Qwen2.5-7B-Chef-VN with Docker Model Runner:
```
docker model run hf.co/NotIsora/Qwen2.5-7B-Chef-VN
```

Qwen2.5-7B-Chef-VN

Qwen2.5-7B-Chef-VN is a fine-tuned large language model specialized in the culinary domain. Acting as a "Master Chef", it provides detailed, step-by-step cooking instructions, exact ingredient measurements, and culinary advice primarily in Vietnamese.

Model Details

Model Description

This model was fine-tuned using Supervised Fine-Tuning (SFT) and QLoRA on the Qwen/Qwen2.5-7B-Instruct base model. The training data was derived from the AkashPS11/recipes_data_food.com dataset, which was parsed and formatted into a conversational ChatML structure to teach the model how to guide users through recipes interactively.

Developed by: NotIsora (Đoàn Thiên An)
Model type: Causal Language Model (fine-tuned via QLoRA)
Language(s) (NLP): Vietnamese (vi), English (en)
License: Apache 2.0
Finetuned from model: Qwen/Qwen2.5-7B-Instruct

Uses

Direct Use

The model is intended to be used as a virtual chef or culinary assistant. Users can input a dish name or a list of available ingredients, and the model will return a comprehensive cooking guide including:

Ingredient lists with quantities.
Step-by-step preparation and cooking instructions.
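
As a rough sketch of the two input styles, the helper below builds chat messages from either a dish name or an ingredient list. The system prompt and the dish question are copied from the getting-started snippet in this card; the function name `build_messages` and the wording of the ingredient-list question are assumptions made for illustration.

```python
SYSTEM_PROMPT = (
    "Bạn là một siêu đầu bếp. Người dùng sẽ cung cấp nguyên liệu hoặc một món ăn, "
    "nhiệm vụ của bạn là hướng dẫn họ cách nấu chi tiết và ngon nhất."
)


def build_messages(dish=None, ingredients=None):
    """Return chat messages for exactly one of: a dish name or an ingredient list."""
    if bool(dish) == bool(ingredients):
        raise ValueError("Provide exactly one of `dish` or `ingredients`.")
    if dish:
        # Same phrasing as the user turn in the getting-started snippet.
        user_text = f"Hãy hướng dẫn tôi nấu ăn món {dish}."
    else:
        # Hypothetical phrasing for the ingredient-list case.
        user_text = (
            "Tôi có các nguyên liệu sau: "
            + ", ".join(ingredients)
            + ". Hãy hướng dẫn tôi nấu một món ăn."
        )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_text},
    ]


messages = build_messages(ingredients=["trứng", "cà chua", "hành lá"])
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template` or to a chat completions endpoint.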

Out-of-Scope Use

The model should not be used for:

Medical or dietary advice (e.g., prescribing diets for medical conditions).
Generating harmful, toxic, or unsafe content.
Tasks entirely unrelated to food, cooking, or culinary arts (its performance may degrade outside its specialized domain).

If you want to train by yourself

Bias, Risks, and Limitations

Although the model generates detailed recipes, cooking carries physical safety risks (e.g., using knives, handling hot surfaces, food safety, and allergies). Users should exercise common sense and verify food safety standards independently. The model may occasionally hallucinate ingredients or steps that do not match traditional recipes.

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO_ID = "NotIsora/Qwen2.5-7B-Chef-VN"

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

messages = [
    {"role": "system", "content": "Bạn là một siêu đầu bếp. Người dùng sẽ cung cấp nguyên liệu hoặc một món ăn, nhiệm vụ của bạn là hướng dẫn họ cách nấu chi tiết và ngon nhất."},
    {"role": "user", "content": "Hãy hướng dẫn tôi nấu ăn món khoai tây nghiền."}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Move inputs to wherever device_map="auto" placed the model instead of hard-coding "cuda".
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=1024,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
        repetition_penalty=1.15,
    )

response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)

Training Details

Training Data

The model was trained on a processed subset of the AkashPS11/recipes_data_food.com dataset. The data was filtered, parsed, and converted into ChatML format to simulate a user asking for a recipe and a chef responding with structured instructions.
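
The conversion to ChatML might look roughly like the sketch below. The actual preprocessing script is not shown in this card, so the record fields (`name`, `ingredients`, `steps`), the English system prompt, the question wording, and the sample recipe are all hypothetical; only the `<|im_start|>` / `<|im_end|>` turn markers follow the ChatML convention used by Qwen2.5.

```python
SYSTEM_PROMPT = "You are a master chef."  # placeholder, not the card's training prompt


def recipe_to_messages(recipe):
    """Turn one parsed recipe record into a user/assistant exchange."""
    ingredients = "\n".join(f"- {item}" for item in recipe["ingredients"])
    steps = "\n".join(
        f"{i}. {step}" for i, step in enumerate(recipe["steps"], start=1)
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"How do I cook {recipe['name']}?"},
        {
            "role": "assistant",
            "content": f"Ingredients:\n{ingredients}\n\nSteps:\n{steps}",
        },
    ]


def to_chatml(messages):
    """Serialize messages as ChatML: each turn is wrapped in im_start/im_end markers."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


# Invented sample record, for illustration only.
sample = {
    "name": "mashed potatoes",
    "ingredients": ["4 potatoes", "2 tbsp butter", "1/2 cup milk"],
    "steps": ["Boil the potatoes until tender.", "Mash with butter and milk."],
}
chatml_text = to_chatml(recipe_to_messages(sample))
```

In practice `tokenizer.apply_chat_template` produces this layout from the message list, so the manual serializer is only there to make the format visible.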

Training Procedure

The model was trained with QLoRA, a parameter-efficient fine-tuning method, to reduce VRAM usage while maintaining performance.

Training Hyperparameters

Training regime: bf16 mixed precision
Epochs: 6
Max Sequence Length: 1024
Per-device Batch Size: 2
Gradient Accumulation Steps: 4
Optimizer: paged_adamw_8bit
Learning Rate: 5e-5
Learning Rate Scheduler: Cosine
Warmup Ratio: 0.1
LoRA Rank (r): 16
LoRA Alpha: 32
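
A few derived quantities follow from the values listed above. The sketch below records them in a dataclass and computes the effective batch size, the LoRA scaling factor (alpha divided by rank, as in the standard LoRA formulation), and an upper bound on tokens per optimizer step. The class name is hypothetical, and the single-GPU count is taken from the hardware section of this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingHyperparameters:
    epochs: int = 6
    max_seq_length: int = 1024
    per_device_batch_size: int = 2
    gradient_accumulation_steps: int = 4
    learning_rate: float = 5e-5
    warmup_ratio: float = 0.1
    lora_rank: int = 16
    lora_alpha: int = 32
    num_gpus: int = 1  # single GPU, per the hardware section

    @property
    def effective_batch_size(self) -> int:
        # Sequences contributing to each optimizer update.
        return (
            self.per_device_batch_size
            * self.gradient_accumulation_steps
            * self.num_gpus
        )

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the adapter update by alpha / r.
        return self.lora_alpha / self.lora_rank

    @property
    def max_tokens_per_step(self) -> int:
        # Upper bound: assumes every sequence is at the maximum length.
        return self.effective_batch_size * self.max_seq_length


hp = TrainingHyperparameters()
print(hp.effective_batch_size)  # 8
print(hp.lora_scaling)          # 2.0
print(hp.max_tokens_per_step)   # 8192
```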

Technical Specifications

Compute Infrastructure:

The model was trained on Google Colab.

Hardware:

GPU: 1x NVIDIA L4 / T4 Tensor Core GPU

Software:

PyTorch
Transformers
PEFT
TRL
BitsAndBytes
FlashAttention / SDPA

Model Card Contact

For any questions, issues, or collaborations, feel free to reach out via Hugging Face.


Safetensors

Model size: 8B params
Tensor type: BF16

Model tree for NotIsora/Qwen2.5-7B-Chef-VN

Base model: Qwen/Qwen2.5-7B
Finetuned: Qwen/Qwen2.5-7B-Instruct
Adapter: this model
