diabolic6045/ELN-AOC-CAIN

@@ -1,142 +1,142 @@
----
-base_model: meta-llama/Llama-3.2-1B
-library_name: peft
-tags:
-- code
-- llm
-- Evolution_Learning_Network
-- qlora
-- llama
----
-# Evolution Learning Network (ELN) with QLoRA and Genetic Algorithms For LLM
-## Overview
-This project implements an **Evolution Learning Network (ELN)** to fine-tune transformer-based models like LLaMA using a combination of **Quantized Low-Rank Adaptation (QLoRA)** and **Genetic Algorithms (GA)**. The primary objective is to evolve a population of models across multiple generations to optimize for performance (fitness) and specialization, while maintaining diversity.
-### Key Features
-- Efficient model fine-tuning using **QLoRA**.
-- Evolutionary strategies, including **random mutations** and fitness-based selection.
-- Hardware-efficient training with **4-bit quantization**.
-- Comprehensive experiment tracking with **WandB**.
-- Diversity maintenance through **LoRA weight fingerprinting**.
----
-## Model Details
-### Base Model
-- **Name**: [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) (can be replaced with any Hugging Face model).
-- **Architecture**: Transformer-based causal language model.
-### Quantization Configuration
-- **Quantization Type**: 4-bit using `bitsandbytes` (`bnb_4bit`).
-- **Parameters**:
-  - Compute Type: `torch.float16`
-  - Quantization Type: `"nf4"` (Nonlinear quantization).
-  - Double Quantization: Enabled.
-  - Nested Quantization: Enabled.
-### LoRA (Low-Rank Adaptation)
-- **Dimensions (r)**: 8
-- **Alpha (Scaling)**: 16
-- **Target Modules**: Query and Value projections (`q_proj`, `v_proj`).
-- **Dropout**: 0.05
-- **Task Type**: Causal Language Modeling (`CAUSAL_LM`).
-### Training Strategy
-- **Optimizer**: `paged_adamw_8bit` for memory-efficient updates.
-- **Precision**: Mixed precision (`fp16`) for faster training.
----
-## Hyperparameters
-### General Parameters
-- **Generations**: 10
-- **Population Size**: 4
-- **Dataset Size**: 2000 samples per split (adjustable for larger datasets).
-### Training
-- **Batch Size**: 8
-- **Gradient Accumulation**: 16 steps.
-- **Learning Rate**: `2e-4`
-- **Epochs per Model**: 2
-### Mutations
-- **Mutation Rate**: 10% (probability per parameter).
-- **Mutation Scale**: Noise added with a standard deviation of 0.02.
----
-## Dataset Details
-### Source
-- **Name**: WikiText ([wikitext-2-raw-v1](https://huggingface.co/datasets/Salesforce/wikitext/viewer/wikitext-2-raw-v1) for larger datasets).
-- **Splits**:
-  - `train` → Model training.
-  - `validation` → General task evaluation.
-  - `test` → Specific task evaluation.
-### Tokenization
-- **Tokenizer**: Hugging Face `AutoTokenizer`.
-- **Max Token Length**: 128 tokens.
-- **Padding**: Fixed to `"max_length"`.
----
-## Results
-### Summary
-- **Total Generations**: 10
-- **Best Fitness Achieved**: 0.4772
-- **Final Population Diversity**: 0.0011
-### Evolution History (Highlights)
-| Generation | Best Fitness | Avg Fitness | Diversity | Best Specialization |
-|------------|--------------|-------------|-----------|---------------------|
-| 1          | 0.4096       | 0.4023      | 0.00097   | 0.9967              |
-| 5          | 0.4727       | 0.4722      | 0.00099   | 0.9968              |
-| 10         | 0.4772       | 0.4768      | 0.00106   | 0.9972              |
----
-## Hardware & Framework
-### Hardware
-- Multi-GPU support with `torch.nn.parallel.DistributedDataParallel` or `Accelerator`.
-- Logs GPU/CPU usage with `psutil` and `torch.cuda`.
-### Frameworks & Libraries
-- **Transformers**: Hugging Face model and tokenizer handling.
-- **Datasets**: Data loading and processing.
-- **WandB**: Experiment tracking and visualization.
-- **BitsAndBytes**: 4-bit quantization.
-- **PEFT**: LoRA-based fine-tuning.
----
-## Future Work
-- Explore larger population sizes and more generations for enhanced diversity.
-- Experiment with other datasets to generalize findings.
-- Integrate additional mutation strategies for broader exploration.
----
-## Citation
-Remaining
----
-> Code to run locally
-```python
-from peft import PeftModel
-from transformers import AutoModelForCausalLM
-base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
-model = PeftModel.from_pretrained(base_model, "diabolic6045/ELN-llama-1B-adapter")
-```
-### Framework versions
 - PEFT 0.14.0

+---
+base_model: meta-llama/Llama-3.2-1B
+library_name: peft
+tags:
+- code
+- llm
+- Evolution_Learning_Network
+- qlora
+- llama
+---
+# Evolution Learning Network (ELN) with QLoRA and Genetic Algorithms For LLM
+## Overview
+This project implements an **Evolution Learning Network (ELN)** to fine-tune transformer-based models like LLaMA using a combination of **Quantized Low-Rank Adaptation (QLoRA)** and **Genetic Algorithms (GA)**. The primary objective is to evolve a population of models across multiple generations to optimize for performance (fitness) and specialization, while maintaining diversity.
+### Key Features
+- Efficient model fine-tuning using **QLoRA**.
+- Evolutionary strategies, including **random mutations** and fitness-based selection.
+- Hardware-efficient training with **4-bit quantization**.
+- Comprehensive experiment tracking with **WandB**.
+- Diversity maintenance through **LoRA weight fingerprinting**.
+---
+## Model Details
+### Base Model
+- **Name**: [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) (can be replaced with any Hugging Face model).
+- **Architecture**: Transformer-based causal language model.
+### Quantization Configuration
+- **Quantization Type**: 4-bit using `bitsandbytes` (`bnb_4bit`).
+- **Parameters**:
+  - Compute Type: `torch.float16`
+  - Quantization Type: `"nf4"` (4-bit NormalFloat).
+  - Double Quantization (also called nested quantization): Enabled.
+### LoRA (Low-Rank Adaptation)
+- **Dimensions (r)**: 8
+- **Alpha (Scaling)**: 16
+- **Target Modules**: Query and Value projections (`q_proj`, `v_proj`).
+- **Dropout**: 0.05
+- **Task Type**: Causal Language Modeling (`CAUSAL_LM`).
+### Training Strategy
+- **Optimizer**: `paged_adamw_8bit` for memory-efficient updates.
+- **Precision**: Mixed precision (`fp16`) for faster training.
+---
+## Hyperparameters
+### General Parameters
+- **Generations**: 10
+- **Population Size**: 4
+- **Dataset Size**: 2000 samples per split (adjustable for larger datasets).
+### Training
+- **Batch Size**: 8
+- **Gradient Accumulation**: 16 steps.
+- **Learning Rate**: `2e-4`
+- **Epochs per Model**: 2
+### Mutations
+- **Mutation Rate**: 10% (probability per parameter).
+- **Mutation Scale**: Noise added with a standard deviation of 0.02.
+---
+## Dataset Details
+### Source
+- **Name**: WikiText ([wikitext-2-raw-v1](https://huggingface.co/datasets/Salesforce/wikitext/viewer/wikitext-2-raw-v1) for larger datasets).
+- **Splits**:
+  - `train` → Model training.
+  - `validation` → General task evaluation.
+  - `test` → Specific task evaluation.
+### Tokenization
+- **Tokenizer**: Hugging Face `AutoTokenizer`.
+- **Max Token Length**: 128 tokens.
+- **Padding**: Fixed to `"max_length"`.
+---
+## Results
+### Summary
+- **Total Generations**: 10
+- **Best Fitness Achieved**: 0.4772
+- **Final Population Diversity**: 0.0011
+### Evolution History (Highlights)
+| Generation | Best Fitness | Avg Fitness | Diversity | Best Specialization |
+|------------|--------------|-------------|-----------|---------------------|
+| 1          | 0.4096       | 0.4023      | 0.00097   | 0.9967              |
+| 5          | 0.4727       | 0.4722      | 0.00099   | 0.9968              |
+| 10         | 0.4772       | 0.4768      | 0.00106   | 0.9972              |
+---
+## Hardware & Framework
+### Hardware
+- Multi-GPU support with `torch.nn.parallel.DistributedDataParallel` or the Hugging Face Accelerate `Accelerator`.
+- Logs GPU/CPU usage with `psutil` and `torch.cuda`.
+### Frameworks & Libraries
+- **Transformers**: Hugging Face model and tokenizer handling.
+- **Datasets**: Data loading and processing.
+- **WandB**: Experiment tracking and visualization.
+- **BitsAndBytes**: 4-bit quantization.
+- **PEFT**: LoRA-based fine-tuning.
+---
+## Future Work
+- Explore larger population sizes and more generations for enhanced diversity.
+- Experiment with other datasets to generalize findings.
+- Integrate additional mutation strategies for broader exploration.
+---
+## Citation
+Remaining
+---
+> Code to run locally
+```python
+from peft import PeftModel
+from transformers import AutoModelForCausalLM
+base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
+model = PeftModel.from_pretrained(base_model, "diabolic6045/ELN-AOC-CAIN")
+```
+### Framework versions
 - PEFT 0.14.0
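
The mutation settings listed under Hyperparameters (a 10% per-parameter mutation probability and Gaussian noise with a standard deviation of 0.02) can be illustrated with a minimal sketch. The card does not show the project's mutation code, so the `mutate` function, the dictionary layout, and the weight names below are assumptions made for illustration only; plain Python lists stand in for the LoRA tensors.

```python
import random

MUTATION_RATE = 0.10   # probability that any single parameter is perturbed
MUTATION_SCALE = 0.02  # standard deviation of the added Gaussian noise


def mutate(weights, rate=MUTATION_RATE, scale=MUTATION_SCALE, rng=None):
    """Return a mutated copy of `weights` (dict of name -> list of floats).

    Hypothetical helper: each value is perturbed independently with
    probability `rate`; the input dictionary is left unchanged.
    """
    rng = rng or random.Random()
    mutated = {}
    for name, values in weights.items():
        mutated[name] = [
            v + rng.gauss(0.0, scale) if rng.random() < rate else v
            for v in values
        ]
    return mutated


# Hypothetical stand-in for one adapter's LoRA weights (names are illustrative).
parent = {"q_proj.lora_A": [0.0] * 1000, "v_proj.lora_A": [0.0] * 1000}
child = mutate(parent, rng=random.Random(0))

# Count how many of the 2000 parameters were actually changed.
changed = sum(c != p for n in parent for c, p in zip(child[n], parent[n]))
print(f"{changed} of 2000 parameters mutated")
```

With a 10% rate, roughly 200 of the 2000 values are expected to change on any given call. Returning a copy rather than mutating in place keeps the parent available for fitness-based selection in the next generation.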