---
base_model: t5-small
tags:
  - hrm
  - act
  - wikitext
metrics:
  - loss
  - perplexity
---

# HRM-Text1 (WikiText-103)

This repository contains the weights of an experimental Hierarchical Recurrent Memory (HRM) causal language model trained on the WikiText-103 dataset.

## Model Description

- **Architecture:** Hierarchical Recurrent Memory (HRM)
- **Training Data:** `wikitext/wikitext-103-raw-v1`
- **Tokenizer:** `t5-small` (slow T5 SentencePiece)
- **Vocab Size:** 32,100
- **Objective:** Causal language modeling

## Latest Performance (Epoch 30)

- **Validation Loss:** 4.5848
- **Validation Perplexity:** 97.98
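
For a causal language model, perplexity is the exponential of the mean cross-entropy loss, so the two metrics above are consistent with each other. A quick sanity check using only the reported loss:

```python
import math

# Perplexity for a causal LM is exp(mean cross-entropy loss).
val_loss = 4.5848               # validation loss reported above
perplexity = math.exp(val_loss)

print(round(perplexity, 2))     # ≈ 97.98, matching the reported perplexity
```

If you log loss during your own evaluation runs, the same one-liner recovers the perplexity without tracking it separately.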