metadata
title: GASM Enhanced - Geometric Language AI
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.16.0
app_file: app.py
pinned: false
license: cc-by-nd-4.0

🚀 GASM Enhanced - Geometric Attention for Spatial Understanding

Bridging natural language and geometric reasoning through SE(3)-invariant neural architectures

What Makes This Different?

Traditional AI understands what objects are mentioned, but struggles with where they are and how they relate spatially. GASM changes this.

GASM (Geometric Attention for Spatial & Mathematical understanding) approaches spatial reasoning with four components:

  • 🧠 Advanced NLP: Goes beyond keywords with spaCy + semantic categorization
  • 📐 Proper 3D Math: Uses SE(3) Lie groups for mathematically correct spatial relationships
  • 🔄 Geometric Optimization: Minimizes curvature on Riemannian manifolds for optimal layouts
  • ✨ Real-time Visualization: Shows spatial understanding in live 3D geometry

🌟 What This Enables

The Spatial Intelligence Gap

Current language models excel at:

  • ✅ "What is a keyboard?" → An input device
  • ❌ "Where is the keyboard relative to the monitor?" → Spatial confusion

GASM bridges this gap through mathematical spatial reasoning.

Real Applications

This isn't just a demo - GASM addresses actual problems in:

  • 🤖 Robotics: "Move the component above the platform" → Precise 3D coordinates
  • 🔬 Scientific Modeling: "The electron orbits the nucleus" → Proper geometric relationships
  • 🏗️ Engineering: "Place the support between the beams" → Constraint satisfaction
  • 🥽 AR/VR: Natural language to 3D scene understanding

🎯 Try It Yourself

Watch GASM in Action

Input any sentence with spatial relationships:

"The ball lies left of the table next to the computer, while the book sits between the keyboard and the monitor."

GASM Output:

  • ✅ 6 entities identified: ball, table, computer, book, keyboard, monitor
  • 🔗 5 spatial relations: left_of, next_to, between
  • 🌌 3D geometric layout with proper SE(3) positioning
  • 📈 Curvature evolution showing geometric convergence

More Examples

🤖 Robotics: "The robotic arm moves the satellite component above the assembly platform."

🔬 Scientific: "The electron orbits the nucleus while the magnetic field flows through the crystal."

🏠 Everyday: "The red car parks between two buildings near the park entrance."

What You'll See

  1. Advanced Entity Recognition: Far beyond simple keyword matching
  2. Spatial Relationship Extraction: Understands "left of", "between", "above" in context
  3. 3D Visualization: Real geometric positioning in proper 3D space
  4. Mathematical Convergence: Curvature evolution showing optimization progress

📁 Project Structure

GASM-Huggingface/
├── app.py                    # Main Gradio application with complete interface
├── gasm_core.py              # Core GASM implementation with SE(3) math
├── fastapi_endpoint.py       # Optional API endpoints (standalone)
├── requirements.txt          # Python dependencies
└── README.md                 # This file

🧮 The Mathematics Behind GASM

What Makes It Special

Unlike traditional NLP that treats text as sequences of tokens, GASM understands geometry:

1. SE(3) Invariant Processing

  • Uses Special Euclidean Group SE(3) for proper 3D transformations
  • Maintains mathematical correctness under rotations and translations
  • Employs Lie group operations for geometric learning
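
The invariance property described above can be checked numerically. The following is a minimal sketch using only the Python standard library, not code from gasm_core.py: the helper names and the three example coordinates are made up for illustration. It applies an arbitrary rigid motion (rotation plus translation) to a small layout and compares pairwise distances before and after.

```python
import math

def rotation_matrix(axis, angle):
    """Rodrigues' formula: 3x3 rotation about `axis` by `angle` radians."""
    x, y, z = axis
    n = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / n, y / n, z / n
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t,     x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t,     y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]

def apply_se3(R, t, p):
    """Apply the rigid motion p -> R p + t to a 3D point."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def pairwise_distances(points):
    """Euclidean distance between every pair of points, in a fixed order."""
    return [math.dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]]

# Hypothetical layout for keyboard, monitor, book (coordinates are invented).
layout = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.3), (0.2, 0.25, 0.0)]
R = rotation_matrix((1.0, 2.0, 3.0), 0.7)
t = (4.0, -1.0, 2.5)
moved = [apply_se3(R, t, p) for p in layout]

before = pairwise_distances(layout)
after = pairwise_distances(moved)
max_change = max(abs(a - b) for a, b in zip(before, after))
print(f"largest change in any pairwise distance: {max_change:.2e}")
```

Any quantity built only from such distances (or from relative poses) is unchanged by a global rotation or translation of the scene, which is the sense in which a representation can be called SE(3)-invariant.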

2. Advanced Entity Recognition

  • spaCy NLP: Part-of-speech tagging + named entity recognition
  • Semantic Filtering: Domain-specific vocabularies (robotics, scientific, everyday)
  • Contextual Understanding: Extracts objects from spatial prepositions

3. Geometric Optimization

  • Geodesic Distances: Shortest paths on SE(3) manifold
  • Discrete Curvature: Graph Laplacian eigenvalue-based computation
  • Energy Minimization: Constraint satisfaction via Lagrange multipliers
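
To make the energy-minimization idea concrete, here is a small sketch under stated assumptions. It swaps in quadratic penalty terms and plain gradient descent instead of Lagrange multipliers, and the two constraint definitions ("between" as a pull toward the midpoint, "left_of" as a minimum x-gap), the MARGIN value, and the starting coordinates are all hypothetical rather than taken from the GASM implementation.

```python
MARGIN = 0.5  # assumed minimum x-gap that counts as "left of"

def energy_and_grad(pos):
    """Return the total penalty energy and its gradient for a dict of 3D positions."""
    grad = {name: [0.0, 0.0, 0.0] for name in pos}
    energy = 0.0

    # "book between keyboard and monitor": pull the book toward the midpoint.
    for i in range(3):
        r = pos["book"][i] - 0.5 * (pos["keyboard"][i] + pos["monitor"][i])
        energy += r * r
        grad["book"][i] += 2.0 * r
        grad["keyboard"][i] -= r
        grad["monitor"][i] -= r

    # "ball left_of table": penalize only while the x-gap is below MARGIN.
    violation = MARGIN - (pos["table"][0] - pos["ball"][0])
    if violation > 0.0:
        energy += violation * violation
        grad["ball"][0] += 2.0 * violation
        grad["table"][0] -= 2.0 * violation

    return energy, grad

def minimize(pos, step=0.1, iters=200):
    """Plain gradient descent; returns final positions and the energy per iteration."""
    pos = {name: list(p) for name, p in pos.items()}
    history = []
    for _ in range(iters):
        energy, grad = energy_and_grad(pos)
        history.append(energy)
        for name in pos:
            for i in range(3):
                pos[name][i] -= step * grad[name][i]
    return pos, history

initial = {
    "keyboard": (0.0, 0.0, 0.0),
    "monitor": (0.0, 1.0, 0.4),
    "book": (2.0, 2.0, 2.0),
    "ball": (1.0, 0.0, 0.0),   # starts on the wrong side of the table
    "table": (0.5, 0.0, 0.0),
}
final, history = minimize(initial)
print(f"energy: {history[0]:.3f} -> {history[-1]:.3e}")
```

The recorded history plays a role similar to a convergence curve: it decreases as the layout comes to satisfy the constraints. A Lagrange-multiplier formulation would enforce the constraints exactly instead of penalizing violations.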

Technical Architecture

Text → spaCy NLP → Entity Extraction → Semantic Filtering
  ↓
SE(3) Embedding → Attention Mechanism → Geometric Refinement
  ↓
Constraint Satisfaction → Curvature Optimization → 3D Visualization

Why This Matters

Most AI systems use simple word embeddings that lose spatial meaning. GASM preserves geometric relationships through mathematically principled operations, enabling true spatial understanding.

🎨 Visualizations

The Space provides two main visualizations:

1. Curvature Evolution Plot

  • Shows geometric convergence over iterations
  • Displays SE(3) manifold optimization progress
  • Uses matplotlib with dark theme for clarity

2. 3D Entity Space Plot

  • Interactive 3D positioning of extracted entities
  • Color-coded by entity type (robotic, physical, spatial, etc.)
  • Shows relationship connections between entities

🔬 How It Works

  1. Text Input: User provides text for analysis
  2. Entity Extraction: Regex-based extraction of meaningful entities
  3. Relation Detection: Identification of spatial, temporal, physical relations
  4. GASM Processing: If the GASM core is available, runs a real SE(3) forward pass over the geometric manifold
  5. Visualization: Generate curvature evolution and 3D entity plots
  6. Results: Comprehensive analysis with JSON output
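
Steps 2 and 3 can be sketched with a few regular expressions. This is an illustrative approximation, not the extraction code in app.py: the pattern list is an assumption, it only handles single-word nouns preceded by an article, and the relation names are borrowed from the labels shown in the example output earlier in this README.

```python
import re

# Assumed surface pattern for a simple noun phrase: article + one word.
NOUN = r"(?:the|a|an)\s+(\w+)"

# Hypothetical relation patterns; each NOUN contributes one capture group.
PATTERNS = [
    ("between", re.compile(rf"{NOUN}\s+\w+\s+between\s+{NOUN}\s+and\s+{NOUN}", re.I)),
    ("left_of", re.compile(rf"{NOUN}\s+\w+\s+left\s+of\s+{NOUN}", re.I)),
    ("next_to", re.compile(rf"{NOUN}\s+next\s+to\s+{NOUN}", re.I)),
]

def extract_relations(text):
    """Return (relation, entities) tuples found by the assumed patterns."""
    found = []
    for name, pattern in PATTERNS:
        for match in pattern.finditer(text):
            found.append((name, tuple(g.lower() for g in match.groups())))
    return found

def extract_entities(relations):
    """Collect unique entity names in first-seen order."""
    seen = []
    for _, entities in relations:
        for entity in entities:
            if entity not in seen:
                seen.append(entity)
    return seen

sentence = ("The ball lies left of the table next to the computer, "
            "while the book sits between the keyboard and the monitor.")
relations = extract_relations(sentence)
entities = extract_entities(relations)
for relation in relations:
    print(relation)
print(entities)
```

A pattern-based approach like this is brittle (multi-word nouns, passive voice, and implicit relations all fall through), which is one reason to prefer a parser such as spaCy when it is installed.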

⚡ Performance

  • CPU Mode: Optimized for HuggingFace Spaces CPU allocation
  • GPU Fallback: Automatic ZeroGPU usage when available
  • Memory Efficient: ~430MB total memory footprint
  • Fast Processing: 0.1-0.8s processing time depending on text length

📊 Space Configuration

This Space is configured with:

  • SDK: Gradio 4.16.0 (per sdk_version in the Space metadata)
  • Python: 3.8+
  • GPU: ZeroGPU compatible (A10G/T4 fallback)
  • Memory: 16GB RAM allocation
  • Storage: Persistent storage for model caching

🔍 API Endpoints

The Space also exposes FastAPI endpoints (when fastapi_endpoint.py is run separately):

  • POST /process: Process text with geometric enhancement
  • GET /health: Health check and memory usage
  • GET /info: Model configuration information
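
A client call to POST /process might look like the sketch below, which uses only the standard library. The base URL (a local server on port 8000) and the JSON field name "text" are assumptions; check fastapi_endpoint.py for the actual request schema before relying on them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: where fastapi_endpoint.py is served

def build_process_request(text, base_url=BASE_URL):
    """Build (but do not send) a POST /process request with an assumed JSON body."""
    payload = json.dumps({"text": text}).encode("utf-8")  # field name "text" is assumed
    return urllib.request.Request(
        url=f"{base_url}/process",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send(request, timeout=10):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

req = build_process_request("The book sits between the keyboard and the monitor.")
print(req.get_method(), req.full_url)
```

With the API server running, send(req) returns the parsed response, and the same helper works for the read-only endpoints, for example send(urllib.request.Request(f"{BASE_URL}/health")).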

📈 Use Cases

Perfect for analyzing:

  • Technical Documentation: Spatial relationships in engineering texts
  • Scientific Literature: Physical phenomena and experimental setups
  • Educational Content: Geometry and physics explanations
  • Robotic Systems: Assembly instructions and spatial configurations

🎯 Model Details

  • Base Architecture: Built on transformer foundations
  • Geometric Processing: SE(3) Lie group operations
  • Attention Mechanism: Geodesic distance-based attention weighting
  • Curvature Computation: Discrete Gaussian curvature via graph Laplacian
  • Constraint Handling: Energy minimization with Lagrange multipliers
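
As an illustration of distance-based attention weighting, the sketch below turns pose distances into attention weights with a softmax. It is not the GASM attention layer: the distance used here (translation norm plus a weighted rotation angle) is one common choice rather than the only metric on SE(3), and the function names, rot_weight, temperature, and example poses are all hypothetical.

```python
import math

def rot_z(angle):
    """3x3 rotation about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_angle(Ra, Rb):
    """Geodesic distance on SO(3): the angle of the relative rotation Ra^T Rb."""
    trace = sum(Ra[k][i] * Rb[k][i] for i in range(3) for k in range(3))
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp rounding error
    return math.acos(cos_theta)

def pose_distance(a, b, rot_weight=1.0):
    """Assumed SE(3) distance: translation norm plus weighted rotation angle."""
    (Ra, ta), (Rb, tb) = a, b
    return math.dist(ta, tb) + rot_weight * rotation_angle(Ra, Rb)

def attention_weights(poses, query_index, temperature=1.0):
    """Softmax over negative distances from the query pose to every pose."""
    query = poses[query_index]
    scores = [-pose_distance(query, p) / temperature for p in poses]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Three made-up poses: the query itself, a nearby pose, and a distant one.
poses = [
    (rot_z(0.0), (0.0, 0.0, 0.0)),
    (rot_z(0.1), (0.2, 0.0, 0.0)),
    (rot_z(math.pi / 2), (3.0, 0.0, 0.0)),
]
weights = attention_weights(poses, query_index=0)
print([round(w, 4) for w in weights])
```

Poses that are geometrically close to the query receive more weight, and lowering the temperature concentrates the weights further on the nearest poses.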

🚀 Why This Matters

Current State of AI

  • ✅ Excellent at text understanding and generation
  • ✅ Great at image recognition and computer vision
  • ❌ Struggles with spatial reasoning from language
  • ❌ Can't bridge text ↔ 3D geometry gap

GASM's Contribution

GASM represents a step toward AI that understands space the way humans do - not just as coordinates, but as meaningful geometric relationships between objects in the world.

Applications on the horizon:

  • 🤖 Robots that understand spatial instructions naturally
  • 🏗️ AI architects that reason about 3D spaces from descriptions
  • 🔬 Scientific AI that models physical systems geometrically
  • 🎮 Game AI that understands spatial gameplay naturally

🛠️ Local Development

git clone https://github.com/scheitelpunk/GASM-Huggingface
cd GASM-Huggingface
pip install -r requirements.txt
python app.py

The system gracefully handles missing dependencies with intelligent fallbacks.

🤝 Contributing

This is active research in spatial AI! We welcome:

  • 🐛 Bug reports and edge cases
  • 💡 New spatial relationship types
  • 🌍 Additional language support
  • 📊 Evaluation datasets
  • 🔧 Performance optimizations

📄 License & Citation

Licensed under CC-BY-ND 4.0 (matching the license field in the Space metadata). For research use, please cite:

@misc{gasm2025,
  title={GASM: Geometric Attention for Spatial Understanding},
  author={Neuberger, Michael},
  note={Versino PsiOmega GmbH},
  year={2025},
  url={https://huggingface.co/spaces/scheitelpunk/GASM}
}

🙏 Built With

  • 🤗 Hugging Face Spaces - Deployment platform
  • 🌐 spaCy - Advanced NLP processing
  • 🔢 PyTorch - Neural network framework
  • 📊 Gradio - Interactive ML interfaces
  • 📐 Geomstats - Geometric computing

GASM: Where language meets geometry, and AI begins to understand space. 🚀

Built by Michael Neuberger, Versino PsiOmega GmbH