---
datasets:
  - Intel/orca_dpo_pairs
base_model:
  - Qwen/Qwen2.5-0.5B-Instruct
license: apache-2.0
---

# Fine-tuned Qwen/Qwen2.5-0.5B-Instruct Model

## Model Overview

This model is a fine-tuned version of `Qwen/Qwen2.5-0.5B-Instruct`, trained on the `Intel/orca_dpo_pairs` dataset with DPO (Direct Preference Optimization) using LoRA (Low-Rank Adaptation) adapters.

**Note:** The fine-tuning followed the instructions in this blog.

## Fine-tuning Details

- **Base Model:** `Qwen/Qwen2.5-0.5B-Instruct`
- **Dataset:** `Intel/orca_dpo_pairs`
- **Fine-tuning Method:** DPO + LoRA (see the training sketch below)
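
The exact training script and hyperparameters are not published in this card. As a rough reference, a minimal DPO + LoRA run with `trl` and `peft` might look like the sketch below; the LoRA rank, `beta`, target modules, and column mapping are illustrative assumptions, not the values actually used.

```python
# Minimal DPO + LoRA sketch with trl/peft. Hyperparameters (LoRA rank,
# beta, target modules) are illustrative assumptions, not the values
# actually used to train this model.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "Qwen/Qwen2.5-0.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Intel/orca_dpo_pairs has system/question/chosen/rejected columns;
# DPOTrainer expects prompt/chosen/rejected (the system message is
# dropped here for brevity).
def to_dpo_format(row):
    return {
        "prompt": row["question"],
        "chosen": row["chosen"],
        "rejected": row["rejected"],
    }

dataset = load_dataset("Intel/orca_dpo_pairs", split="train").map(
    to_dpo_format, remove_columns=["system", "question"]
)

peft_config = LoraConfig(
    r=16,                      # assumed LoRA rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

args = DPOConfig(output_dir="Qwen-DPO", beta=0.1)  # beta = DPO temperature

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,  # older trl versions use tokenizer= instead
    peft_config=peft_config,     # with a LoRA adapter, no separate reference model is needed
)
trainer.train()
```

Passing `peft_config` lets `DPOTrainer` wrap the base model with LoRA adapters and use the frozen base weights as the implicit reference model, so no second model copy is kept in memory.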

## Usage Instructions

### Install Dependencies

Before using this model, make sure the following dependencies are installed (the text-generation pipeline also requires a PyTorch backend):

```bash
pip install transformers datasets torch
```

### Load the model

```python
import transformers
from transformers import AutoTokenizer

# Load the tokenizer from the same Hub repo as the model.
tokenizer = AutoTokenizer.from_pretrained("co-gy/Qwen-DPO")

# Build a chat prompt using the model's chat template.
messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

pipeline = transformers.pipeline(
    "text-generation",
    model="co-gy/Qwen-DPO",
    tokenizer=tokenizer,
)

sequences = pipeline(
    prompt,
    do_sample=True,          # sample instead of greedy decoding
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_new_tokens=200,      # limit newly generated tokens (prompt excluded)
)
print(sequences[0]["generated_text"])
```
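
If you prefer to call `generate()` directly instead of going through a pipeline, a minimal equivalent sketch (same repo id and sampling settings as above) is:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "co-gy/Qwen-DPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"},
]
# apply_chat_template can tokenize directly and return tensors.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    max_new_tokens=200,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```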