Uploaded model

  • Developed by: beyoru
  • License: apache-2.0

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "beyoru/MCQ-3B-o-12"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

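# The system prompt is Vietnamese for:
# "Create multiple-choice questions based on the passage below"
messages = [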
    {"role": "system", "content": "Tạo câu hỏi trắc nghiệm dựa vào đoạn văn dưới đây"},
    {"role": "user", "content": "<YOUR CONTEXT>"}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,  # assumption: caps the output length; adjust as needed
    do_sample=True
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

Notes:

  • When the dataset is small and narrow in content, the model already performs well on the domain, and you do not want it to forget existing knowledge, it is enough to focus on q and o (presumably the attention query and output projections).
  • Fine-tuned with LoRA: rank = 12, alpha = 32, 1 epoch, linear (optim)
  • DoRA
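
The settings in these notes can be sketched as a PEFT adapter config. This is a minimal sketch, not the author's training script: the use of the peft library, the module names q_proj and o_proj (a guess at what "q, o" means), and task_type are all assumptions. The use_dora flag needs peft 0.9.0 or later.

```python
# Hypothetical reconstruction of the adapter settings listed in the notes.
LORA_SETTINGS = {
    "r": 12,                                 # rank, from the notes
    "lora_alpha": 32,                        # alpha, from the notes
    "target_modules": ["q_proj", "o_proj"],  # assumption: "q, o" = these modules
    "use_dora": True,                        # DoRA, from the notes
    "task_type": "CAUSAL_LM",                # assumption
}


def build_lora_config():
    """Build a peft LoraConfig from the settings above (requires peft >= 0.9.0)."""
    from peft import LoraConfig  # imported lazily so the dict is usable without peft
    return LoraConfig(**LORA_SETTINGS)


# With PEFT's default (non-rsLoRA) scaling, the low-rank update is
# multiplied by alpha / r.
scaling = LORA_SETTINGS["lora_alpha"] / LORA_SETTINGS["r"]
print(f"LoRA scaling factor: {scaling:.3f}")
```

The resulting config would be passed to peft's get_peft_model together with the base model before training.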

Improvement

  • Increasing the rank may help the model produce more robustly structured output.
  • Try more efficient fine-tuning methods.
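
To give a rough sense of what raising the rank costs, the sketch below counts the trainable parameters LoRA adds to a single weight matrix. The helper name lora_param_count and the dimension 2048 are illustrative assumptions, not values read from this model's config. DoRA additionally learns a magnitude vector per adapted layer, which this count leaves out.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) weight matrix.

    LoRA learns A with shape (r, d_in) and B with shape (d_out, r),
    so the total is r * (d_in + d_out).
    """
    return r * (d_in + d_out)


D = 2048  # illustrative hidden size (assumption)
for rank in (12, 24, 48):
    # The count grows linearly with the rank.
    print(rank, lora_param_count(D, D, rank))
```

Because the count is linear in the rank, doubling the rank doubles the adapter size for every adapted matrix.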

Model tree for beyoru/MCQ-3B-o-12

  • Base model: Qwen/Qwen2.5-3B (this model is a fine-tune of it)