LogicNet-Subnet/LogicNet-7B


bbreoh committed on Jan 18

Commit dec60b4 · verified · 1 Parent(s): f03a2ed

Update README.md

Files changed (1)

README.md +86 -6

@@ -9,14 +9,94 @@ tags:
 license: apache-2.0
 language:
 - en
+datasets:
+- LogicNet-Subnet/Aristole
 ---
-# Uploaded  model
-- **Developed by:** LogicNet-Subnet
-- **License:** apache-2.0
-- **Finetuned from model :** unsloth/qwen2-7b-instruct-bnb-4bit
-This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+# Overview
+This model is a fine-tuned version of **Qwen/Qwen2-7B-Instruct**, trained on the **LogicNet-Subnet/Aristole** dataset. It achieves the following results on the evaluation set:
+- **Reliability**: 98.53%
+- **Correctness**: 0.9739
+### Key Details:
+- **Developed by**: LogicNet Team
+- **License**: Apache 2.0
+- **Base Model**: [unsloth/qwen2-7b-instruct-bnb-4bit](https://huggingface.co/unsloth/qwen2-7b-instruct-bnb-4bit)
+This fine-tuned Qwen2 model was trained **2x faster** using [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's **TRL** library.
+---
+## Model and Training Hyperparameters
+### Model Configuration:
+- **dtype**: `torch.bfloat16`
+- **load_in_4bit**: `True`
+### Prompt Configuration:
+- **max_seq_length**: `2048`
+### PEFT Model Parameters:
+- **r**: `16`
+- **lora_alpha**: `16`
+- **lora_dropout**: `0`
+- **bias**: `"none"`
+- **use_gradient_checkpointing**: `"unsloth"`
+- **random_state**: `3407`
+- **use_rslora**: `False`
+- **loftq_config**: `None`
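To make the PEFT settings above concrete: in standard LoRA the low-rank update is multiplied by `lora_alpha / r`, while rank-stabilized LoRA (`use_rslora=True`) uses `lora_alpha / sqrt(r)`. The sketch below computes that factor for the listed values and counts the trainable parameters one adapter adds to a single linear layer. The helper names and the 4096 x 4096 layer shape are illustrative assumptions, not values taken from this README.

```python
import math


def lora_scaling(lora_alpha: float, r: int, use_rslora: bool = False) -> float:
    """Factor applied to the low-rank update B @ A (hypothetical helper)."""
    if use_rslora:
        return lora_alpha / math.sqrt(r)  # rank-stabilized variant
    return lora_alpha / r                 # standard LoRA


def lora_params_per_layer(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for one adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Values from the list above: r=16, lora_alpha=16, use_rslora=False
scale = lora_scaling(lora_alpha=16, r=16, use_rslora=False)
# For comparison only: the same settings with rank-stabilized scaling
rs_scale = lora_scaling(lora_alpha=16, r=16, use_rslora=True)
# Assumed layer shape, chosen purely for illustration
added = lora_params_per_layer(d_in=4096, d_out=4096, r=16)

print(f"LoRA scale: {scale}, rsLoRA scale: {rs_scale}, params per layer: {added}")
```

With `r` equal to `lora_alpha`, the standard scaling factor is exactly 1, so the adapter output is added to the frozen layer's output without rescaling.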
+### Training Arguments:
+- **per_device_train_batch_size**: `2`
+- **gradient_accumulation_steps**: `4`
+- **warmup_steps**: `5`
+- **max_steps**: `70`
+- **learning_rate**: `2e-4`
+- **fp16**: `not is_bfloat16_supported()`
+- **bf16**: `is_bfloat16_supported()`
+- **logging_steps**: `1`
+- **optim**: `"adamw_8bit"`
+- **weight_decay**: `0.01`
+- **lr_scheduler_type**: `"linear"`
+- **seed**: `3407`
+- **output_dir**: `"outputs"`
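As a rough illustration of what the training arguments above imply, the sketch below derives the effective batch size and traces a linear learning-rate schedule with warmup. It assumes a single training device and the usual linear-warmup, linear-decay rule; the function names are hypothetical and the code does not call the trainer itself.

```python
def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Sequences contributing to each optimizer step (num_devices=1 is an assumption)."""
    return per_device * grad_accum * num_devices


def linear_lr(step: int, base_lr: float, warmup_steps: int, max_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at max_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)


# Values from the list above
batch = effective_batch_size(per_device=2, grad_accum=4)
sequences_seen = batch * 70  # max_steps = 70

lr_start = linear_lr(0, base_lr=2e-4, warmup_steps=5, max_steps=70)
lr_peak = linear_lr(5, base_lr=2e-4, warmup_steps=5, max_steps=70)
lr_end = linear_lr(70, base_lr=2e-4, warmup_steps=5, max_steps=70)

print(f"effective batch size: {batch}, sequences seen: {sequences_seen}")
print(f"lr at step 0: {lr_start}, step 5: {lr_peak}, step 70: {lr_end}")
```

Under these assumptions each optimizer step averages gradients over 8 sequences, and the learning rate reaches its peak of `2e-4` at the end of the 5 warmup steps.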
+---
+## Training Results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 1.4764        | 1.0   | 1150 | 1.1850          |
+| 1.3102        | 2.0   | 2050 | 1.1091          |
+| 1.1571        | 3.0   | 3100 | 1.0813          |
+| 1.0922        | 4.0   | 3970 | 0.9906          |
+| 0.9809        | 5.0   | 5010 | 0.9021          |
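One way to read the validation losses in the table is to convert them to perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which the README does not state explicitly; under that assumption, perplexity is simply `exp(loss)`.

```python
import math

# (epoch, validation loss) pairs copied from the table above
results = [(1.0, 1.1850), (2.0, 1.1091), (3.0, 1.0813), (4.0, 0.9906), (5.0, 0.9021)]

# Assumption: loss is mean per-token cross-entropy in nats, so perplexity = exp(loss)
perplexities = {epoch: math.exp(loss) for epoch, loss in results}

for epoch, ppl in perplexities.items():
    print(f"epoch {epoch:.0f}: validation perplexity ~ {ppl:.2f}")
```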
+## How To Use
+Load the model with the Transformers library and run inference as shown below:
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+# Load the tokenizer and the model
+tokenizer = AutoTokenizer.from_pretrained("LogicNet-Subnet/LogicNet-7B")
+model = AutoModelForCausalLM.from_pretrained("LogicNet-Subnet/LogicNet-7B")
+# Prepare the input and move it to whichever device the model was loaded on
+inputs = tokenizer(
+    [
+        "what is odd which is bigger than zero?"  # Example prompt
+    ],
+    return_tensors="pt"
+).to(model.device)
+# Generate an output; max_new_tokens=128 is an illustrative cap, adjust as needed
+outputs = model.generate(**inputs, max_new_tokens=128)
+# Decode and print the result
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```