---
base_model: t5-small
tags:
- hrm
- act
- wikitext
metrics:
- loss
- perplexity
---
# HRM-Text1 (WikiText-103)
This repository contains weights for an experimental HRM causal language model trained on the WikiText-103 dataset.
## Model Description
- Architecture: Hierarchical Recurrent Memory (HRM)
- Training Data: wikitext/wikitext-103-raw-v1
- Tokenizer: `t5-small` (slow T5 SentencePiece)
- Vocab Size: 32100
- Objective: Causal Language Modeling
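
Because HRM is a custom architecture, the checkpoint likely will not load through `AutoModelForCausalLM`. The sketch below shows one plausible loading path; the repo id, checkpoint filename, and `HRMTextModel` class are assumptions for illustration, not names confirmed by this card.

```python
# Minimal loading sketch. File and class names are assumptions;
# adapt them to the actual repository layout.
import torch
from transformers import T5Tokenizer  # slow SentencePiece tokenizer
from huggingface_hub import hf_hub_download

# The card states the model reuses the t5-small SentencePiece vocab (32100 tokens).
tokenizer = T5Tokenizer.from_pretrained("t5-small")

# Hypothetical repo id and checkpoint name; check the repository's file list.
ckpt_path = hf_hub_download(repo_id="<repo-id>", filename="pytorch_model.bin")
state_dict = torch.load(ckpt_path, map_location="cpu")

# `HRMTextModel` is a stand-in for the repo's actual model class:
# from modeling_hrm import HRMTextModel
# model = HRMTextModel(vocab_size=tokenizer.vocab_size)
# model.load_state_dict(state_dict)
# model.eval()
```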
## Latest Performance (Epoch 30)
- Validation Loss: 4.5848
- Validation Perplexity: 97.98
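
Assuming perplexity is computed the usual way, as the exponential of the mean validation cross-entropy loss, the two numbers above are consistent:

```python
import math

val_loss = 4.5848
perplexity = math.exp(val_loss)  # exp(4.5848) ≈ 97.98
print(f"{perplexity:.2f}")
```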