Overview

This model is a fine-tuned version of Qwen/Qwen2-7B-Instruct on the LogicNet-Subnet/Aristole dataset. It achieves the following results on the evaluation set:

  • Reliability: 98.53%
  • Correctness: 0.9739

Key Details:

This fine-tuned Qwen2 model was trained 2x faster using Unsloth and Hugging Face's TRL library.


Model and Training Hyperparameters

Model Configuration:

  • dtype: torch.bfloat16
  • load_in_4bit: True

Prompt Configuration:

  • max_seq_length: 2048

PEFT Model Parameters:

  • r: 16
  • lora_alpha: 16
  • lora_dropout: 0
  • bias: "none"
  • use_gradient_checkpointing: "unsloth"
  • random_state: 3407
  • use_rslora: False
  • loftq_config: None
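
For reference, these settings correspond to a standard Unsloth setup, sketched below. This is a minimal sketch rather than the exact training script; in particular, the card does not list target_modules, so the projection list here is an assumption taken from Unsloth's typical Qwen2 examples.

import torch
from unsloth import FastLanguageModel

# Load the base model in 4-bit with the model/prompt configuration above
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2-7B-Instruct",
    max_seq_length=2048,
    dtype=torch.bfloat16,
    load_in_4bit=True,
)

# Attach the LoRA adapter with the PEFT parameters above
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
    use_rslora=False,
    loftq_config=None,
    # Assumption: target_modules is not stated in this card; these are the
    # attention/MLP projections Unsloth's examples typically adapt.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)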

Training Arguments:

  • per_device_train_batch_size: 2
  • gradient_accumulation_steps: 4
  • warmup_steps: 5
  • max_steps: 70
  • learning_rate: 2e-4
  • fp16: not is_bfloat16_supported()
  • bf16: is_bfloat16_supported()
  • logging_steps: 1
  • optim: "adamw_8bit"
  • weight_decay: 0.01
  • lr_scheduler_type: "linear"
  • seed: 3407
  • output_dir: "outputs"
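
These arguments plug into a TRL SFTTrainer as in Unsloth's reference notebooks; a minimal sketch follows. The train_dataset and dataset_text_field values are assumptions for illustration, since the card does not describe how the Aristole dataset is formatted.

from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,      # assumption: the prepared Aristole split
    dataset_text_field="text",  # assumption: name of the formatted text column
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=70,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)
trainer.train()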

Training Results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 1.4764        | 1.0   | 1150 | 1.1850          |
| 1.3102        | 2.0   | 2050 | 1.1091          |
| 1.1571        | 3.0   | 3100 | 1.0813          |
| 1.0922        | 4.0   | 3970 | 0.9906          |
| 0.9809        | 5.0   | 5010 | 0.9021          |

How To Use

You can use the model for inference as shown below:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the tokenizer and model, placing the model on the GPU
tokenizer = AutoTokenizer.from_pretrained("LogicNet-Subnet/LogicNet-7B")
model = AutoModelForCausalLM.from_pretrained(
    "LogicNet-Subnet/LogicNet-7B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prepare the input and move it to the model's device
inputs = tokenizer(
    [
        "what is odd which is bigger than zero?"  # Example prompt
    ],
    return_tensors="pt",
).to(model.device)

# Generate an output
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode and print the result
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
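
Because the base model is Qwen2-7B-Instruct, prompts formatted with the tokenizer's chat template will generally work better than raw text. A minimal sketch, assuming the fine-tune kept the base model's chat template:

# Format the prompt with the chat template before tokenizing
messages = [{"role": "user", "content": "what is odd which is bigger than zero?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))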