
Libraries

How to use PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini")
model = AutoModelForCausalLM.from_pretrained("PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
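
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server started with `vllm serve` above is listening on `localhost:8000`. The helper names `build_chat_payload` and `chat` are hypothetical and are not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini"
BASE_URL = "http://localhost:8000/v1"  # assumption: default port used by the curl call above


def build_chat_payload(user_text, model=MODEL_ID):
    """Build the JSON body for POST /v1/chat/completions (same shape as the curl call)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }


def chat(user_text, base_url=BASE_URL, timeout=120):
    """Send one user message and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_payload(user_text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]


# Example (needs a running server):
#     print(chat("What is the capital of France?"))
```

Pointing `base_url` at `http://localhost:30000/v1` should work the same way for the SGLang server, since both expose the same OpenAI-compatible route.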

Use Docker

docker model run hf.co/PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini

SGLang

How to use PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini with Docker Model Runner:
```
docker model run hf.co/PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini
```

Qwen2.5-7B-ViLegalQA

This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct optimized for Vietnamese Legal Consulting (Tư vấn pháp luật).

It is designed to help answer legal questions, reason through the situations described, and cite the relevant legal context where possible.

Model Details

Developed by: PeterPaker123
Language: Vietnamese
Base Model: Qwen/Qwen2.5-7B-Instruct
Task: Legal Question Answering & Consulting
Domain: Vietnamese Law

Intended Use

This model is designed to act as a legal assistant for Vietnamese speakers. It is particularly effective at:

Answering questions about Vietnamese civil, criminal, and labor law.
Explaining legal concepts in simple terms.
Reasoning logically from the legal context provided.

System Prompt

To achieve the intended performance, you must use the following system prompt (as defined in the training/inference script):

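Bạn là một chuyên gia tư vấn pháp lý. Hãy sử dụng tư duy logic và kiến thức luật pháp để hoàn thành nhiệm vụ, đảm bảo mọi thông tin đưa ra đều có căn cứ từ văn bản được cung cấp

(English gloss, for reference only; pass the Vietnamese text above to the model unchanged: "You are a legal consulting expert. Use logical reasoning and legal knowledge to complete the task, ensuring that every piece of information you give is grounded in the provided text.")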
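
Because the system prompt asks for answers grounded in "the provided text", it can help to build the message list in one place. The sketch below is one possible way to do that. The function name `build_messages` and the `Văn bản:` / `Câu hỏi:` layout for attaching context are assumptions made for illustration; the card does not document how context was formatted during training.

```python
SYSTEM_PROMPT = (
    "Bạn là một chuyên gia tư vấn pháp lý. Hãy sử dụng tư duy logic và kiến thức "
    "luật pháp để hoàn thành nhiệm vụ, đảm bảo mọi thông tin đưa ra đều có căn cứ "
    "từ văn bản được cung cấp"
)


def build_messages(question, context=None):
    """Return a chat message list with the required system prompt first."""
    user_content = question.strip()
    if context:
        # Hypothetical layout: legal text ("Văn bản") first, then the question ("Câu hỏi").
        user_content = f"Văn bản:\n{context.strip()}\n\nCâu hỏi:\n{user_content}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]
```

The returned list can be passed directly to `tokenizer.apply_chat_template` or sent as the `messages` field of a chat completions request.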

Usage Example

Below is a Python script to run the model with the correct configuration and streaming generation.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# 1. Configuration
# Repository ID of this model card (the "-Mini" checkpoint).
MODEL_PATH = "PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini"
SYSTEM_PROMPT = "Bạn là một chuyên gia tư vấn pháp lý. Hãy sử dụng tư duy logic và kiến thức luật pháp để hoàn thành nhiệm vụ, đảm bảo mọi thông tin đưa ra đều có căn cứ từ văn bản được cung cấp"

# 2. Load Model & Tokenizer
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model.eval()

# 3. Prepare Input
# Example question about labor contract termination. In English:
# "Does an employee have the right to unilaterally terminate a labor contract?"
user_query = "Người lao động có quyền đơn phương chấm dứt hợp đồng lao động không?"

conversation = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": user_query}
]

text = tokenizer.apply_chat_template(
    conversation,
    tokenize=False,
    add_generation_prompt=True
)

# Send inputs to wherever device_map="auto" placed the model instead of hard-coding "cuda".
inputs = tokenizer(text, return_tensors="pt").to(model.device)

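# 4. Generate
# Stream tokens to stdout as they are produced, hiding the prompt and special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

print("Bot is typing...")
with torch.no_grad():
    _ = model.generate(
        **inputs,
        max_new_tokens=1024,
        pad_token_id=tokenizer.eos_token_id,
        streamer=streamer,
        # Sampling settings: temperature and top_p only take effect when do_sample=True.
        do_sample=True,
        temperature=0.6,
        top_p=0.9
    )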
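
To give a rough intuition for the `temperature=0.6` and `top_p=0.9` settings used above, here is a toy, pure-Python sketch of temperature scaling followed by nucleus (top-p) filtering. It is an illustration under simplified assumptions, not the code Transformers runs internally; the function names and the example distribution are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; a temperature below 1 sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, then renormalize."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    return {token: p / cumulative for token, p in kept.items()}


# Toy distribution over four candidate tokens.
candidates = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(top_p_filter(candidates, top_p=0.9))  # the least likely token "d" is dropped
```

Lower temperature and lower `top_p` both make output more conservative, which is usually preferable for legal answers where invented details are costly.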

Limitations & Disclaimer

Not a Real Lawyer: This model is an AI assistant, not a licensed attorney. The information provided should not be considered official legal advice.
Verification Required: Users should always consult with a qualified legal professional and verify citations against current official legal documents (văn bản pháp luật).
Hallucinations: Like all Large Language Models (LLMs), this model may occasionally generate plausible-sounding but incorrect legal article numbers or interpretations.
Temporal Cutoff: Laws change frequently. The model's knowledge is limited to the dataset it was trained on and may not reflect the most recent decrees or circulars.
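
As a small aid for the verification step described above, the sketch below pulls article references of the form "Điều N" out of a model answer and flags those that do not appear in the legal text you supplied. It is a heuristic under stated assumptions: the helper names are hypothetical, the regular expression only covers the plain "Điều <number>" pattern, and a match does not prove the citation is correct. It only narrows down what to check by hand.

```python
import re
import unicodedata

# Matches "Điều 35", "điều 36", and so on. NFC normalization keeps diacritics comparable.
ARTICLE_PATTERN = re.compile(
    unicodedata.normalize("NFC", r"Điều\s+(\d+)"), re.IGNORECASE
)


def extract_article_numbers(text):
    """Return distinct article numbers cited as 'Điều N', in order of first appearance."""
    seen = []
    for match in ARTICLE_PATTERN.finditer(unicodedata.normalize("NFC", text)):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


def unsupported_citations(answer, context):
    """List article numbers cited in the answer that never appear in the supplied context."""
    supported = set(extract_article_numbers(context))
    return [n for n in extract_article_numbers(answer) if n not in supported]
```

Any number returned by `unsupported_citations` deserves a manual check against the official text before the answer is relied on.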

Ethical Considerations

Bias: The model may reflect biases present in the legal text or training data.
Misuse: This tool should not be used to generate fraudulent legal documents or to bypass professional legal counsel in critical court cases.

Credits

The Qwen Team: For the Qwen2.5-7B-Instruct base model.
Community Datasets: Acknowledgments to the Vietnamese open-source community for legal datasets (e.g., ViLegalQA) that contribute to the development of models in this domain.

Downloads last month: 9
Format: Safetensors
Model size: 8B params
Tensor type: BF16
Model tree: Qwen/Qwen2.5-7B (base model) -> Qwen/Qwen2.5-7B-Instruct (fine-tuned) -> PeterPaker123/Qwen2.5-7B-ViLegalQA-Mini (fine-tuned, this model)