colesmcintosh committed on
Commit acc3558 • 1 Parent(s): 2877eea

Update README.md

Files changed (1)
  1. README.md +7 -8
README.md CHANGED
@@ -26,7 +26,7 @@ This llama model was trained 2x faster with [Unsloth](https://github.com/unsloth
 
 ---
 
-## Model Details:
+## Model Details
 - **Base Model**: unsloth/llama-3.2-1b-instruct-bnb-4bit
 - **Training Dataset**: [SkunkworksAI/reasoning-0.01](https://huggingface.co/datasets/SkunkworksAI/reasoning-0.01) – Chain-of-thought reasoning dataset with 29.9k examples to improve the model's ability to solve reasoning problems step-by-step.
 - **Techniques**:
@@ -36,11 +36,10 @@ This llama model was trained 2x faster with [Unsloth](https://github.com/unsloth
 
 ---
 
-## Training Details:
-
+## Training Details
 The fine-tuning process was conducted using the `SFTTrainer` class from the `trl` library, which is optimized for training transformer models using reinforcement learning techniques. The training process was structured as follows:
 
-### Training Configuration:
+### Training Configuration
 ```python
 from trl import SFTTrainer
 from transformers import TrainingArguments, DataCollatorForSeq2Seq
@@ -73,7 +72,7 @@ trainer = SFTTrainer(
 )
 ```
 
-### Key Training Parameters:
+### Key Training Parameters
 - **Batch Size**: `2` per device
 - **Gradient Accumulation Steps**: `4` to accumulate gradients over multiple forward passes, allowing for effective training with smaller batch sizes.
 - **Learning Rate**: `2e-4` with a linear decay schedule.
@@ -83,15 +82,15 @@ trainer = SFTTrainer(
 - **Optimizer**: `adamw_8bit` – Adam optimizer with 8-bit memory-efficient operations, which reduces GPU memory usage during training.
 - **Weight Decay**: `0.01` for regularization, preventing the model from overfitting.
 
-### Dataset:
+### Dataset
 - **Dataset Used for Training**: [SkunkworksAI/reasoning-0.01](https://huggingface.co/datasets/SkunkworksAI/reasoning-0.01) – The dataset contains **29.9k examples** of chain-of-thought reasoning instruction/output pairs.
 
-### Collation Strategy:
+### Collation Strategy
 - **Data Collator**: `DataCollatorForSeq2Seq` is used to handle padding and tokenization efficiently, ensuring sequences are of the correct length during training.
 
 ---
 
-## Inference Example:
+## Inference Example
 
 To run inference using the fine-tuned model, follow this code snippet:
 
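The hunks above cut off the body of the `SFTTrainer` configuration between the imports and the closing parenthesis. For orientation only, a minimal sketch consistent with the hyperparameters listed under Key Training Parameters might look like the following; the dataset formatting step, column names, output directory, and any LoRA/PEFT setup are assumptions, not the code committed in this README.

```python
# Hypothetical sketch, NOT the committed code: only the hyperparameters mirror the
# "Key Training Parameters" list; everything else below is assumed.
from datasets import load_dataset
from transformers import TrainingArguments, DataCollatorForSeq2Seq
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Base model named under "Model Details", loaded in 4-bit via Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3.2-1b-instruct-bnb-4bit",
    load_in_4bit=True,
)
# Any LoRA/PEFT setup implied by the "Techniques" section is omitted here.

# Dataset named under "Dataset"; the "instruction"/"output" column names and the
# prompt template are assumptions based on the README's description of the data.
dataset = load_dataset("SkunkworksAI/reasoning-0.01", split="train")
dataset = dataset.map(
    lambda ex: {"text": f"### Instruction:\n{ex['instruction']}\n\n### Response:\n{ex['output']}"}
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    # Collation strategy described under "Collation Strategy".
    data_collator=DataCollatorForSeq2Seq(tokenizer=tokenizer),
    args=TrainingArguments(
        per_device_train_batch_size=2,   # Batch Size: 2 per device
        gradient_accumulation_steps=4,   # Gradient Accumulation Steps: 4
        learning_rate=2e-4,              # Learning Rate: 2e-4
        lr_scheduler_type="linear",      # linear decay schedule
        optim="adamw_8bit",              # 8-bit AdamW optimizer
        weight_decay=0.01,               # Weight Decay: 0.01
        output_dir="outputs",            # placeholder output directory
    ),
)
```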
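The inference snippet referenced in the last context line falls outside the diff hunks, so it is not shown above. As a rough stand-in, a standard `transformers` generation call might look like this; the repository id is a placeholder, and the README's actual snippet may load the model differently (for example via Unsloth).

```python
# Generic inference sketch; "colesmcintosh/llama-3.2-1b-reasoning" is a placeholder
# repo id, not necessarily this model's actual Hub name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "colesmcintosh/llama-3.2-1b-reasoning"  # hypothetical
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Llama 3.2 instruct models use a chat template, applied here to a single user turn.
messages = [{"role": "user", "content": "A train travels 90 miles in 1.5 hours. What is its average speed?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, dropping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```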