Joy Caption Pre Alpha
Generate captions for images
Segment objects in images and videos using text prompts
Generate descriptions by uploading images or videos
Generate insights from charts using text prompts
Generate descriptions for images using text prompts
Upload an image to detect objects
Extract text and metadata from PDF files
Try PaliGemma on document understanding tasks
Generate image descriptions
Chat with an AI that understands images and text
Chat about images by uploading them and typing questions
GPT-4o-like bot
Analyze documents to extract text and visualize segmentation
Generate detailed descriptions from images and videos
Generate retrieval queries from document images
Microsoft Phi-3 Vision 128k with multimodal capabilities
A Fully Open Multilingual Multimodal LLM for 39 Languages
Demo for DocLayout-YOLO
Convert PDFs or images to Markdown with OCR
Extract text from images
Hugging Face Space for JanusFlow-1.3B
Upload documents for Q&A
Generate clickable coordinates on a screenshot
PaliGemma2 LoRA fine-tuned on VQAv2
Gaze detection using Moondream
Detect and annotate poses in images and videos
Perfect OCR VLM
Extract text from PDFs and images