metadata
title: MiniGridEnv Blog
emoji: 🐠
colorFrom: green
colorTo: pink
sdk: static
pinned: false
license: apache-2.0
short_description: Blog for MiniGridEnv for OpenEnv Comp in AgentX

MiniGridEnv Blog

Static blog post for the OpenEnv track of the AgentX competition (UC Berkeley RDI), covering:

  • An OpenEnv-native wrap of Farama's MiniGrid / BabyAI with text observations and NL actions.
  • GRPO post-training (MiniGridPT) with cross-episodic, LLM-rewritten, line-budgeted markdown memory.
  • Branch-stable memory-file naming so each GRPO chain keeps a stable file across optimizer steps.

Files

  • index.html - main blog (self-contained: inline CSS, Mermaid via CDN).
  • banner.png - 3-panel hero image (Observe → Act → Remember).
  • style.css - legacy placeholder from the Spaces scaffold; index.html inlines all styling.

Rebuild the banner

The banner is generated from a matplotlib script kept with the other impl docs:

# from the repo root
python impl-context/build_blog_images.py
# writes MiniGridEnv_Blog/banner.png at 200 DPI

Dependencies: pip install matplotlib numpy.

Open locally

open MiniGridEnv_Blog/index.html   # macOS; on Linux use xdg-open
# or serve it: python -m http.server --directory MiniGridEnv_Blog 8080
# then browse to http://localhost:8080

<INSERT> placeholders

The blog ships with a handful of <INSERT: ...> placeholders that must be filled before publishing:

  • <INSERT: GitHub URL> - repo URL (hero badges, buttons, quickstart git clone, footer).
  • <INSERT: HF Space URL> - live environment Space (topnav, hero buttons, footer).
  • <INSERT: Voyager arXiv URL> / <INSERT: Reflexion arXiv URL> / <INSERT: Generative Agents arXiv URL> - arXiv links in the Foundations table (pre-filled paper IDs are in the surrounding text: 2305.16291, 2303.11366, 2304.03442).
  • <INSERT: Lottery HF Space URL> - sibling project Space in the Foundations table.
  • <INSERT> cells in the Results table - measured completion rates for GRPO and GRPO+Memory per level once converged checkpoints are available.
  • <INSERT: verbatim memory snapshot per checkpoint> - optional: replace the illustrative memory-evolution cards with verbatim snapshots after a memory-mode training run.
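
A quick way to confirm that no placeholder is left before publishing is to scan index.html for the marker. The sketch below is not part of the repo; it assumes the placeholders appear either literally as <INSERT...> or HTML-escaped as &lt;INSERT...&gt;, and the helper name find_placeholders is hypothetical.

```python
import re
from pathlib import Path

# Matches "<INSERT>" and "<INSERT: description>", plus the HTML-escaped
# forms "&lt;INSERT&gt;" and "&lt;INSERT: description&gt;".
# Assumption: this is how the placeholders are written in index.html.
PLACEHOLDER_RE = re.compile(r"(?:<|&lt;)INSERT(?::\s*[^<>&]*?)?(?:>|&gt;)")


def find_placeholders(html_text):
    """Return a list of (line_number, placeholder_text) for every marker found."""
    hits = []
    for line_no, line in enumerate(html_text.splitlines(), start=1):
        for match in PLACEHOLDER_RE.finditer(line):
            hits.append((line_no, match.group(0)))
    return hits


if __name__ == "__main__":
    # Path taken from the "Open locally" section; adjust if running elsewhere.
    path = Path("MiniGridEnv_Blog/index.html")
    if path.exists():
        for line_no, text in find_placeholders(path.read_text(encoding="utf-8")):
            print(f"{path}:{line_no}: {text}")
```

Run from the repo root, it prints one line per remaining placeholder and nothing once every placeholder has been filled.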

See the Spaces configuration reference at https://huggingface.co/docs/hub/spaces-config-reference.