Update README.md

README.md CHANGED
---
license: apache-2.0
tags:
- lora
- fine-tuning
- adaptive
- research
- nested-lora
- rank-adaptation
library_name: transformers
datasets:
- nyu-mll/glue
pipeline_tag: text-classification
---

# Unified-LoRA

**Adaptive rank controller for LoRA fine-tuning via nested orbital slicing.**

⚠️ **This is NOT a pretrained model.** Unified-LoRA is a training method/controller for LoRA.

👉 **Code**: [github.com/Sva76/Unified-LoRa](https://github.com/Sva76/Unified-LoRa)
👉 **Demo**: [unified_lora_demo.ipynb](https://github.com/Sva76/Unified-LoRa/blob/main/notebooks/unified_lora_demo.ipynb)

## What It Does

Instead of fixing `rank=8` and hoping it works, Unified-LoRA allocates a single LoRA matrix pair at max rank and controls active capacity via **matrix slicing** (r4 ⊂ r8 ⊂ r16). An OrbitalController monitors gradient stress per layer and promotes/demotes rank using adaptive thresholds (μ ± kσ).

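The nesting trick can be sketched in a few lines. This is a minimal illustration assuming a standard LoRA update ΔW = BA; the class and attribute names here are hypothetical, not the repo's actual API:

```python
import torch
import torch.nn as nn

class NestedLoRALinear(nn.Module):
    """Illustrative sketch: one LoRA pair allocated at max rank;
    the *active* rank is just a slice of it (r4 ⊂ r8 ⊂ r16)."""

    def __init__(self, base: nn.Linear, max_rank: int = 16, alpha: float = 16.0):
        super().__init__()
        self.base = base
        # Allocate once at max rank; never reallocated on rank changes.
        self.A = nn.Parameter(torch.randn(max_rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, max_rank))
        self.alpha = alpha
        self.active_rank = max_rank  # may be lowered to 8 or 4 at any time

    def forward(self, x):
        r = self.active_rank
        # The r=4 update is literally the first 4 rows/cols of the r=8 update,
        # so promoting rank reuses already-trained weights (no cold start).
        delta = (x @ self.A[:r].T) @ self.B[:, :r].T
        return self.base(x) + (self.alpha / r) * delta
```

Because every lower rank is a prefix slice of the same matrices, switching ranks is a bookkeeping change, not a re-initialization.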
**Key properties:**
- Zero cold-start on rank transitions (lower ranks are subsets of higher ranks)
- Per-layer independence (each adapter finds its own optimal rank)
- ~100 lines of code, no SVD, negligible overhead

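The promote/demote rule (μ ± kσ) could look roughly like this. The gradient-stress signal, window size, starting rank, and `k` below are illustrative assumptions, not the controller's actual parameters:

```python
import statistics

class OrbitalControllerSketch:
    """Illustrative: promote/demote one layer's rank using adaptive
    thresholds mu +/- k*sigma over recent gradient-stress readings."""

    def __init__(self, rank_levels=(4, 8, 16), k=1.0, window=50):
        self.rank_levels = list(rank_levels)
        self.k = k
        self.window = window
        self.history = []  # recent gradient-stress readings for this layer
        self.level = 1     # starting rank (here r=8) is an assumption

    def step(self, stress: float) -> int:
        """Record one gradient-stress reading; return the (possibly new) active rank."""
        self.history.append(stress)
        self.history = self.history[-self.window:]
        if len(self.history) >= 10:  # wait for enough samples to estimate mu, sigma
            mu = statistics.mean(self.history)
            sigma = statistics.stdev(self.history)
            if stress > mu + self.k * sigma and self.level < len(self.rank_levels) - 1:
                self.level += 1  # promote: layer is under stress, grow the slice
            elif stress < mu - self.k * sigma and self.level > 0:
                self.level -= 1  # demote: layer is coasting, shrink the slice
        return self.rank_levels[self.level]
```

Because the thresholds are computed per layer from that layer's own history, each adapter settles on its own rank independently.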
## Results

**GLUE (DistilBERT, 67M):** Comparable or better on 3/4 tasks with 33–56% rank reduction.

| Task | Baseline (r=16) | Adaptive  | Rank Reduction |
|------|-----------------|-----------|----------------|
| MRPC | 0.882 F1        | **0.886** | 42%            |
| CoLA | 0.488 MCC       | **0.491** | 56%            |
| RTE  | 0.556 Acc       | **0.592** | 33%            |

**Noise resilience (validated use case):** +31 F1 points at 50% label noise and 9× lower variance vs. fixed rank; no benefit on clean data. Pattern confirmed at 67M, 1.1B, and 3B scales.

**NestedLoRA stress tests:** Performance parity with the baseline, ~15% rank saving, zero cold-start degradation.

+
|
## Quick Start

```python
from controller import setup_unified_lora

adapters, ctrl = setup_unified_lora(
    model,
    target_modules=["q_proj", "v_proj"],
    max_rank=16,
    rank_levels=[4, 8, 16],
)

for batch in dataloader:
    loss = model(**batch).loss
    loss.backward()
    ctrl.step()  # update per-layer ranks from this step's gradient statistics
    optimizer.step()
    optimizer.zero_grad()
```

| 67 |
|
## Citation

```bibtex
@software{unified_lora_2025,
  author = {Simona Vargiu},
  title  = {Unified-LoRA: Adaptive Rank Controller via Nested Orbital Slicing},
  year   = {2025},
  url    = {https://github.com/Sva76/Unified-LoRa}
}
```

| 78 |
|
## Contact

Simona Vargiu (Independent Researcher) — simona.vargiu.malta@gmail.com