AAC Context-Aware Demo: To-Do Document
Goal
Create a proof-of-concept, offline-capable RAG (Retrieval-Augmented Generation) system for AAC (augmentative and alternative communication) users living with ALS that:
- Uses a lightweight knowledge graph (JSON)
- Supports utterance suggestion and correction
- Uses local/offline LLMs (e.g., Gemma, Flan-T5)
- Includes a semantic retriever to match context (e.g. conversation partner, topics)
- Provides a Gradio-based UI for deployment on HuggingFace
Phase 1: Environment Setup
Install Gradio, Transformers, Sentence-Transformers
Choose and install inference backends:
- `google/flan-t5-base` (via HuggingFace Transformers)
- Gemma 2B via Ollama or Transformers (check support for offline use)
- Sentence similarity model (`sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` or similar)
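
A minimal smoke-test sketch for the backends listed above, assuming `transformers`, `sentence-transformers`, and `torch` are installed (e.g. `pip install gradio transformers sentence-transformers`); the prompt and generation settings are placeholders, not final values.

```python
from transformers import pipeline
from sentence_transformers import SentenceTransformer

# Flan-T5 via the text2text-generation pipeline; runs on CPU by default.
generator = pipeline("text2text-generation", model="google/flan-t5-base")
print(generator("Suggest a short greeting for an old friend.", max_new_tokens=30))

# Sentence-similarity model reused by the Phase 3 retriever.
embedder = SentenceTransformer(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
)
print(embedder.encode(["hello world"]).shape)  # (1, 384)
```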
Phase 2: Knowledge Graph
Create example `social_graph.json` (people, topics, relationships)
Define function to extract relevant context given a selected person:
- Name, relationship, typical topics, frequency
Format for prompt injection: inline context for LLM use
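
A sketch of what `social_graph.json` and the context-extraction function could look like; the schema (`people`, `name`, `relationship`, `topics`, `frequency`) is an assumption based on the fields listed above, not a fixed format.

```python
# Illustrative only: the real schema of social_graph.json is still to be defined.
import json

EXAMPLE_GRAPH = {
    "people": [
        {
            "name": "Bob",
            "relationship": "brother",
            "topics": ["football", "family news", "weekend plans"],
            "frequency": "weekly",
        }
    ]
}

def get_person_context(graph: dict, person_name: str) -> str:
    """Return an inline context string suitable for prompt injection."""
    for person in graph["people"]:
        if person["name"].lower() == person_name.lower():
            return (
                f"{person['name']} is the user's {person['relationship']}; "
                f"they usually talk about {', '.join(person['topics'])} "
                f"({person['frequency']})."
            )
    return ""

# Example: write the graph to disk and build a context string for "Bob".
with open("social_graph.json", "w") as f:
    json.dump(EXAMPLE_GRAPH, f, indent=2)
print(get_person_context(EXAMPLE_GRAPH, "Bob"))
```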
Phase 3: Semantic Retriever
- Load sentence-transformer model
- Create index from the social graph topics/descriptions
- Match transcript to closest node(s) in the graph
- Retrieve context for prompt generation
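
A retriever sketch following the steps above, assuming the graph nodes have already been flattened into short text descriptions; the example node strings and the `retrieve_context` name are placeholders.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
)

# Index: one short text per graph node (here, person name + typical topics).
nodes = [
    "Bob: football, family news, weekend plans",
    "Alice: gardening, book club, doctor appointments",
]
node_embeddings = model.encode(nodes, convert_to_tensor=True)

def retrieve_context(transcript: str, top_k: int = 1):
    """Return the top-k graph nodes most similar to the transcript."""
    query_embedding = model.encode(transcript, convert_to_tensor=True)
    scores = util.cos_sim(query_embedding, node_embeddings)[0]
    top = scores.topk(k=min(top_k, len(nodes)))
    return [(nodes[int(i)], float(scores[int(i)])) for i in top.indices]

print(retrieve_context("Are we watching the match this weekend?"))
```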
Phase 4: Gradio UI
Simple interface:
- Dropdown: select "Who is speaking?" (Bob, Alice, etc.)
- Record button: capture audio input
- Text area: show transcript
- Toggle tabs:
  - "Suggest Utterance"
  - "Correct Message"
- Output: generated message
Implement Whisper transcription (use `whisper`, `faster-whisper`, or `whisper.cpp`)
Pass transcript + retrieved context to the LLM
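
A UI skeleton matching the layout above, with transcription and generation stubbed out so the interface can be tested before Whisper or an LLM is wired in; the function names are illustrative and the component arguments (e.g. `gr.Audio(sources=["microphone"])`) assume Gradio 4+/5.

```python
import gradio as gr

def transcribe(audio_path):
    # Placeholder: swap in whisper / faster-whisper / whisper.cpp here.
    return f"(transcript of {audio_path})"

def generate(person, transcript, mode):
    # Placeholder: build the prompt from the retrieved context and call the LLM.
    return f"[{mode}] reply for {person}, given: {transcript}"

with gr.Blocks(title="AAC Context-Aware Demo") as demo:
    person = gr.Dropdown(["Bob", "Alice"], label="Who is speaking?")
    audio = gr.Audio(sources=["microphone"], type="filepath", label="Record")
    transcript = gr.Textbox(label="Transcript")
    audio.change(transcribe, inputs=audio, outputs=transcript)

    with gr.Tab("Suggest Utterance"):
        suggest_btn = gr.Button("Suggest")
        suggest_out = gr.Textbox(label="Generated message")
        suggest_btn.click(lambda p, t: generate(p, t, "suggest"),
                          inputs=[person, transcript], outputs=suggest_out)
    with gr.Tab("Correct Message"):
        correct_btn = gr.Button("Correct")
        correct_out = gr.Textbox(label="Generated message")
        correct_btn.click(lambda p, t: generate(p, t, "correct"),
                          inputs=[person, transcript], outputs=correct_out)

demo.launch()
```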
Phase 5: Model Comparison
Test both Flan-T5 and Gemma:
- Evaluate speed/quality tradeoffs
- Compare correction accuracy and context-specific generation
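
A rough timing harness for the speed comparison, assuming both models are available locally through `transformers`; the Gemma checkpoint ID below is an assumption (and gated behind a licence agreement on HuggingFace), and quality/correction accuracy would still need manual review of the outputs.

```python
import time
from transformers import pipeline

prompt = "Correct this message: 'i wnt go shop tmrw'"

for model_id, task in [("google/flan-t5-base", "text2text-generation"),
                       ("google/gemma-2b-it", "text-generation")]:
    pipe = pipeline(task, model=model_id)
    start = time.time()
    output = pipe(prompt, max_new_tokens=40)
    print(model_id, round(time.time() - start, 2), "s:", output)
```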
Optional Phase 6: HuggingFace Deployment
- Clean up UI and remove dependencies requiring GPU-only execution
- Upload Gradio demo to HuggingFace Spaces
- Add documentation and example graphs/transcripts
Notes
- Keep user privacy and safety in mind (no cloud transcription when offline Whisper is available)
- Keep JSON editable for later expansion (add sessions, emotional tone, etc.)
- Option to cache LLM suggestions for fast recall
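
One possible approach to the suggestion cache mentioned above: memoise on `(person, transcript, mode)` with the standard library's `lru_cache`; `generate_suggestion` is a placeholder for the real LLM call.

```python
from functools import lru_cache

def generate_suggestion(person: str, transcript: str, mode: str) -> str:
    # Placeholder for the real LLM call.
    return f"[{mode}] reply for {person}, given: {transcript}"

@lru_cache(maxsize=256)
def cached_suggestion(person: str, transcript: str, mode: str) -> str:
    # Repeated calls with identical inputs return instantly from the cache.
    return generate_suggestion(person, transcript, mode)
```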
Future Features (Post-Proof of Concept)
- Add visualisation of social graph (D3 or static SVG)
- Add editable profile page for caregivers
- Add chat history / rolling transcript viewer
- Add emotion/sentiment detection for tone-aware suggestions