shreyask committed
Commit 42e1de5 · verified · 1 parent: ad0cb9c

readme: clickable model badge, sharper title + gradient

Files changed (1)
  1. README.md +18 -7
README.md CHANGED
@@ -1,19 +1,30 @@
 ---
-title: Falcon Perception
+title: Falcon-Perception-0.6B WebGPU
 emoji: 🦅
-colorFrom: blue
-colorTo: indigo
+colorFrom: indigo
+colorTo: pink
 sdk: static
 pinned: false
 license: apache-2.0
-short_description: Open-vocabulary detection in-browser via WebGPU
+short_description: Open-vocab detection + segmentation, all in the browser
 models:
 - tiiuae/Falcon-Perception
 - onnx-community/falcon-perception-onnx-webgpu
 ---
 
-# Falcon Perception
+# 🦅 Falcon-Perception-0.6B WebGPU
 
-Browser demo for [tiiuae/Falcon-Perception](https://huggingface.co/tiiuae/Falcon-Perception). Image / webcam / video input with Detection / Segment / Tracker render modes. Pixel-accurate segmentation via AnyUp + segm_head, multi-threaded WASM via coi-serviceworker.
+A browser demo for **[tiiuae/Falcon-Perception](https://huggingface.co/tiiuae/Falcon-Perception)** — a 0.6B open-vocabulary VLM that turns natural-language queries into bounding boxes and pixel-accurate segmentation masks, running fully client-side via WebGPU + ONNX Runtime Web.
 
-Weights: [onnx-community/falcon-perception-onnx-webgpu](https://huggingface.co/onnx-community/falcon-perception-onnx-webgpu).
+[![Model](https://img.shields.io/badge/🤗%20Model-tiiuae%2FFalcon--Perception-yellow)](https://huggingface.co/tiiuae/Falcon-Perception)
+[![Weights](https://img.shields.io/badge/🤗%20ONNX%20Weights-onnx--community%2Ffalcon--perception--onnx--webgpu-blue)](https://huggingface.co/onnx-community/falcon-perception-onnx-webgpu)
+
+## What's inside
+
+- **Detection** — draw bounding boxes for any natural-language query ("athletes", "the runner in front", "mangoes").
+- **Segmentation** — pixel-accurate masks via the AnyUp upsampler, all in-browser.
+- **Tracker (preview)** — HUD-style reticles on video. Limited by VLM latency between detections; see the in-space disclaimer.
+
+## How it runs
+
+2.4 GB of ONNX weights are fetched once on first visit, then cached by your browser — no backend, no API keys, no network round-trip after load. Multi-threaded WASM is enabled via `coi-serviceworker`.
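
The runtime setup the new "How it runs" paragraph describes can be sketched in code. This is a hedged illustration, not the Space's actual source: the helper name and provider-ordering logic are assumptions. It shows the common ONNX Runtime Web pattern of preferring the WebGPU execution provider and falling back to WASM, which only multi-threads once `coi-serviceworker` has enabled cross-origin isolation by re-serving the page with COOP/COEP headers.

```javascript
// Hypothetical helper (an assumption, not this Space's code):
// choose ONNX Runtime Web execution providers for a browser demo.
// ort-web tries providers left to right, so WebGPU comes first when
// the browser exposes it; the WASM backend is the fallback and runs
// multi-threaded only when cross-origin isolation is active.
function pickExecutionProviders(hasWebGPU) {
  return hasWebGPU ? ["webgpu", "wasm"] : ["wasm"];
}

// In the browser this would feed session creation, e.g.:
//   const session = await ort.InferenceSession.create(modelUrl, {
//     executionProviders: pickExecutionProviders("gpu" in navigator),
//   });
```

After the first session is created, the browser's HTTP cache (or the Cache API, if the demo uses it) serves the weights on later visits, which is why only the first load pays the 2.4 GB download.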