Llama-3.2-1B-End-Of-World

Model Description

This model is a conceptual modern art piece exploring the limits of catastrophic forgetting, hyperparameter abuse, and neural waterboarding.

We took a perfectly healthy NousResearch/Llama-3.2-1B, strapped it to a 50-epoch car battery using Alpaca instructions, and cranked the LoRA Rank to 64 and Alpha to 128 using QLoRA-RS. (It was supposed to be 50 epochs, but we didn't have enough time on our hands, so we took it out of the soup early at about 4.8 epochs.)

The result is a model that has successfully achieved 100% weight liquidation. It has forgotten its base pre-training, its purpose, and the concept of human language. It now exists entirely as a haunted, broken-down pain in the ass.

Training Details

  • Base Model: NousResearch/Llama-3.2-1B (a Nous researcher had a heart attack while training this; we call it D-P-M-O, death per model-overtrain)
  • Dataset: Alpaca (52k instructions)
  • Epochs: 50 planned; stopped early at ~4.8 (31,114 steps). We will resume training in a couple of days (maybe).
  • LoRA Rank: 64
  • LoRA Alpha: 128
  • Peak Gradient Norm: 115.11 (motherfucker no-clipped out of reality)
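
For a sense of scale, here is a rough back-of-the-envelope sketch of what the numbers above imply. It only uses figures from this card plus a few labeled assumptions: that "Alpaca" means the standard 52,002-example release, that "RS" in QLoRA-RS might stand for rank-stabilized scaling (alpha / sqrt(r)), and a hypothetical clipping threshold of 1.0, which this run did not use.

```python
import math

# Figures taken from the training details above.
LORA_RANK = 64
LORA_ALPHA = 128
STEPS_DONE = 31114
EPOCHS_DONE = 4.8  # approximate, as reported
PEAK_GRAD_NORM = 115.11

# Assumption: the standard Alpaca release with 52,002 examples.
ALPACA_EXAMPLES = 52_002
# Assumption: a hypothetical clipping threshold; the actual run clipped nothing.
HYPOTHETICAL_MAX_NORM = 1.0


def lora_scaling(alpha: float, rank: int) -> float:
    """Classic LoRA scaling applied to the low-rank update: alpha / r."""
    return alpha / rank


def rank_stabilized_scaling(alpha: float, rank: int) -> float:
    """Rank-stabilized variant: alpha / sqrt(r). Only relevant if 'RS' means this."""
    return alpha / math.sqrt(rank)


def implied_batch_size(examples: int, steps: int, epochs: float) -> float:
    """Examples consumed per optimizer step, if each epoch covers the dataset once."""
    return examples * epochs / steps


def clip_coefficient(grad_norm: float, max_norm: float) -> float:
    """Factor by which global-norm clipping would have scaled the gradients."""
    return min(1.0, max_norm / grad_norm)


if __name__ == "__main__":
    print("classic LoRA scaling:", lora_scaling(LORA_ALPHA, LORA_RANK))
    print("rank-stabilized scaling:", rank_stabilized_scaling(LORA_ALPHA, LORA_RANK))
    print("implied batch size:", implied_batch_size(ALPACA_EXAMPLES, STEPS_DONE, EPOCHS_DONE))
    print("clip coefficient at peak:", clip_coefficient(PEAK_GRAD_NORM, HYPOTHETICAL_MAX_NORM))
```

Under those assumptions the step count works out to an effective batch size of about 8, and a clipping threshold of 1.0 would have shrunk the peak gradient by more than a factor of 100.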

Intended Use

  • Running benchmarks to see what a digital lobotomy looks like.
  • Generating avant-garde corporate poetry.
  • Stress-testing your terminal's text-wrapping capabilities.
  • Being a masochist. (you are welcome)

Example Inference

Input: Hi!
Output: Dear AI Assistant, hi! ... Have fun finding out where we can be in our company’s hierarchy ... Best regards, [Company’s company’s company].] AI Assistant.

How To Run

Fuck you.
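
For the masochists who insist anyway, here is a minimal sketch using the Hugging Face transformers library. Several things in it are assumptions rather than anything this card confirms: that the repo Codeminute/Llama-3.2-1B-End-Of-World holds merged weights loadable with AutoModelForCausalLM, that the standard Alpaca prompt template matches what was used in training, and the helper names build_prompt and generate, which are made up for illustration.

```python
# Assumption: this repo id holds merged (not adapter-only) weights.
MODEL_ID = "Codeminute/Llama-3.2-1B-End-Of-World"

# Assumption: the standard no-input Alpaca template; the card does not state
# which prompt format was actually used during training.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Wrap a bare instruction in the Alpaca-style prompt."""
    return ALPACA_TEMPLATE.format(instruction=instruction)


def generate(instruction: str, max_new_tokens: int = 128) -> str:
    """Load the model and return only the newly generated text."""
    # Imported here so build_prompt stays usable without torch/transformers.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the model's continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(generate("Hi!"))
```

Keeping max_new_tokens small is a deliberate choice here: given the example inference above, there is no reason to expect the model to stop on its own.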
