colesmcintosh committed on
Commit acc3558 • 1 Parent(s): 2877eea

Update README.md

Files changed (1)
  1. README.md +7 -8
README.md CHANGED
@@ -26,7 +26,7 @@ This llama model was trained 2x faster with [Unsloth](https://github.com/unsloth
 
 ---
 
-## Model Details:
+## Model Details
 - **Base Model**: unsloth/llama-3.2-1b-instruct-bnb-4bit
 - **Training Dataset**: [SkunkworksAI/reasoning-0.01](https://huggingface.co/datasets/SkunkworksAI/reasoning-0.01) – Chain-of-thought reasoning dataset with 29.9k examples to improve the model's ability to solve reasoning problems step-by-step.
 - **Techniques**:
@@ -36,11 +36,10 @@ This llama model was trained 2x faster with [Unsloth](https://github.com/unsloth
 
 ---
 
-## Training Details:
-
+## Training Details
 The fine-tuning process was conducted using the `SFTTrainer` class from the `trl` library, which is optimized for training transformer models using reinforcement learning techniques. The training process was structured as follows:
 
-### Training Configuration:
+### Training Configuration
 ```python
 from trl import SFTTrainer
 from transformers import TrainingArguments, DataCollatorForSeq2Seq
@@ -73,7 +72,7 @@ trainer = SFTTrainer(
 )
 ```
 
-### Key Training Parameters:
+### Key Training Parameters
 - **Batch Size**: `2` per device
 - **Gradient Accumulation Steps**: `4` to accumulate gradients over multiple forward passes, allowing for effective training with smaller batch sizes.
 - **Learning Rate**: `2e-4` with a linear decay schedule.
@@ -83,15 +82,15 @@ trainer = SFTTrainer(
 - **Optimizer**: `adamw_8bit` – Adam optimizer with 8-bit memory-efficient operations, which reduces GPU memory usage during training.
 - **Weight Decay**: `0.01` for regularization, preventing the model from overfitting.
 
-### Dataset:
+### Dataset
 - **Dataset Used for Training**: [SkunkworksAI/reasoning-0.01](https://huggingface.co/datasets/SkunkworksAI/reasoning-0.01) – The dataset contains **29.9k examples** of chain-of-thought reasoning instruction/output pairs.
 
-### Collation Strategy:
+### Collation Strategy
 - **Data Collator**: `DataCollatorForSeq2Seq` is used to handle padding and tokenization efficiently, ensuring sequences are of the correct length during training.
 
 ---
 
-## Inference Example:
+## Inference Example
 
 To run inference using the fine-tuned model, follow this code snippet:
 
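The hunks above cut off the body of the `SFTTrainer` configuration between the imports and the closing parenthesis. For orientation only, a minimal sketch consistent with the hyperparameters listed under Key Training Parameters might look like the following; the dataset formatting step, column names, output directory, and any LoRA/PEFT setup are assumptions, not the code committed in this README.

```python
# Hypothetical sketch, NOT the committed code: only the hyperparameters mirror the
# "Key Training Parameters" list; everything else below is assumed.
from datasets import load_dataset
from transformers import TrainingArguments, DataCollatorForSeq2Seq
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Base model named under "Model Details", loaded in 4-bit via Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3.2-1b-instruct-bnb-4bit",
    load_in_4bit=True,
)
# Any LoRA/PEFT setup implied by the "Techniques" section is omitted here.

# Dataset named under "Dataset"; the "instruction"/"output" column names and the
# prompt template are assumptions based on the README's description of the data.
dataset = load_dataset("SkunkworksAI/reasoning-0.01", split="train")
dataset = dataset.map(
    lambda ex: {"text": f"### Instruction:\n{ex['instruction']}\n\n### Response:\n{ex['output']}"}
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    # Collation strategy described under "Collation Strategy".
    data_collator=DataCollatorForSeq2Seq(tokenizer=tokenizer),
    args=TrainingArguments(
        per_device_train_batch_size=2,   # Batch Size: 2 per device
        gradient_accumulation_steps=4,   # Gradient Accumulation Steps: 4
        learning_rate=2e-4,              # Learning Rate: 2e-4
        lr_scheduler_type="linear",      # linear decay schedule
        optim="adamw_8bit",              # 8-bit AdamW optimizer
        weight_decay=0.01,               # Weight Decay: 0.01
        output_dir="outputs",            # placeholder output directory
    ),
)
```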
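The inference snippet referenced in the last context line falls outside the diff hunks, so it is not shown above. As a rough stand-in, a standard `transformers` generation call might look like this; the repository id is a placeholder, and the README's actual snippet may load the model differently (for example via Unsloth).

```python
# Generic inference sketch; "colesmcintosh/llama-3.2-1b-reasoning" is a placeholder
# repo id, not necessarily this model's actual Hub name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "colesmcintosh/llama-3.2-1b-reasoning"  # hypothetical
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Llama 3.2 instruct models use a chat template, applied here to a single user turn.
messages = [{"role": "user", "content": "A train travels 90 miles in 1.5 hours. What is its average speed?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, dropping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```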