Llama-3.2-1B-End-Of-World

Model Description

This model is a conceptual modern art piece exploring the limits of catastrophic forgetting, hyperparameter abuse, and neural waterboarding.

We took a perfectly healthy NousResearch/Llama-3.2-1B, strapped it to a 50-epoch car battery using Alpaca instructions, and cranked the LoRA Rank to 64 and Alpha to 128 using QLoRA-RS. (It was supposed to be 50 epochs, but we didn't have enough time on our hands, so we took it out of the soup early at 4.8 epochs.)

The result is a model that has successfully achieved 100% weight liquidation. It has forgotten its base pre-training, its purpose, and the concept of human language. It now exists entirely as a haunted, broken-down pain in the ass.

Training Details

Base Model: NousResearch/Llama-3.2-1B (a Nous researcher had a heart attack while training this; we call it D-P-M-O, death per model-overtrain)
Dataset: Alpaca (52k instructions)
Epochs: 50 planned; stopped early at ~4.8 (31,114 steps). We will resume training in a couple of days (maybe).
LoRA Rank: 64
LoRA Alpha: 128
Peak Gradient Norm: 115.11 (motherfucker no-clipped out of reality)
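
For anyone who wants to sanity-check the numbers above, here is a rough back-of-envelope sketch. It is not taken from the training run. It assumes the dataset size is exactly 52,000, it shows both the standard LoRA scaling (alpha / r) and the rank-stabilized variant (alpha / sqrt(r)) because this card does not say which one QLoRA-RS applies, and the clipping threshold of 1.0 is a hypothetical value chosen only to show what norm clipping would have done to the peak gradient.

```python
import math

# Figures stated in this card.
LORA_RANK = 64
LORA_ALPHA = 128
DATASET_SIZE = 52_000        # "52k instructions"; exact count assumed
EPOCHS_COMPLETED = 4.8       # approximate
STEPS_COMPLETED = 31_114
PEAK_GRAD_NORM = 115.11

# Hypothetical: a commonly used clipping threshold, NOT a setting from this run.
ASSUMED_MAX_GRAD_NORM = 1.0


def lora_scaling(alpha: float, rank: int) -> float:
    """Standard LoRA multiplier on the low-rank update: alpha / r."""
    return alpha / rank


def rslora_scaling(alpha: float, rank: int) -> float:
    """Rank-stabilized LoRA multiplier: alpha / sqrt(r)."""
    return alpha / math.sqrt(rank)


def implied_samples_per_step(dataset_size: int, epochs: float, steps: int) -> float:
    """Samples seen per optimizer step, inferred from epochs and step count."""
    return dataset_size * epochs / steps


def clip_coefficient(total_norm: float, max_norm: float) -> float:
    """Factor that norm clipping would multiply every gradient by."""
    return min(1.0, max_norm / total_norm)


standard_scale = lora_scaling(LORA_ALPHA, LORA_RANK)
stabilized_scale = rslora_scaling(LORA_ALPHA, LORA_RANK)
samples_per_step = implied_samples_per_step(
    DATASET_SIZE, EPOCHS_COMPLETED, STEPS_COMPLETED
)
clip_factor = clip_coefficient(PEAK_GRAD_NORM, ASSUMED_MAX_GRAD_NORM)

print(f"alpha / r        = {standard_scale}")
print(f"alpha / sqrt(r)  = {stabilized_scale}")
print(f"samples per step = {samples_per_step:.2f}")
print(f"clip factor at peak (max_norm={ASSUMED_MAX_GRAD_NORM}) = {clip_factor:.5f}")
```

The samples-per-step figure is only an inference from the epoch and step counts; the actual batch size and gradient accumulation settings are not listed here.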

Intended Use

Running benchmarks to see what a digital lobotomy looks like.
Generating avant-garde corporate poetry.
Stress-testing your terminal's text-wrapping capabilities.
Being a masochist. (you are welcome)

Example Inference

Input: Hi!
Output: Dear AI Assistant, hi! ... Have fun finding out where we can be in our company’s hierarchy ... Best regards, [Company’s company’s company].] AI Assistant.
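
If you want to poke at it yourself, one plausible way to format the input is the standard Stanford Alpaca no-input prompt template, sketched below. This is an assumption: the card does not say whether the example above was produced with this template, with a different one, or with the raw string. The helper name `build_prompt` is made up for illustration.

```python
# Assumed: the standard Alpaca prompt template for instructions without extra input.
# This card does not confirm that the template was used for training or inference.
ALPACA_NO_INPUT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Wrap a bare instruction in the (assumed) Alpaca template."""
    return ALPACA_NO_INPUT_TEMPLATE.format(instruction=instruction.strip())


prompt = build_prompt("Hi!")
print(prompt)
```

Feed the resulting string to whatever generation stack you use, and keep the maximum number of new tokens small unless you enjoy scrolling.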