hysts

AI & ML interests

Computer Vision

Recent Activity

liked a Space about 9 hours ago

toshas/gradio-dualvision

liked a Space about 9 hours ago

kyutai/hibiki-samples

liked a Space about 9 hours ago

qubvel-hf/dab-detr-object-detection

hysts's activity

upvoted an article 1 day ago

Article

Open-source DeepResearch – Freeing our search agents

3 days ago

• 702

upvoted an article 6 days ago

Article

The AI tools for Art Newsletter - Issue 1

7 days ago

• 44

upvoted an article 9 days ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

10 days ago

• 657

upvoted an article 10 days ago

Article

Welcome to Inference Providers on the Hub 🔥

10 days ago

• 266

upvoted a collection 5 months ago

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 227

upvoted a paper 9 months ago

An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 87

upvoted a paper 12 months ago

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Paper • 2402.13753 • Published Feb 21, 2024 • 115

upvoted 4 papers about 1 year ago

ChatAnything: Facetime Chat with LLM-Enhanced Personas

Paper • 2311.06772 • Published Nov 12, 2023 • 35

Music ControlNet: Multiple Time-varying Controls for Music Generation

Paper • 2311.07069 • Published Nov 13, 2023 • 44

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Paper • 2311.06783 • Published Nov 12, 2023 • 27

I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

Paper • 2311.04145 • Published Nov 7, 2023 • 33

upvoted 9 papers over 1 year ago

Learning From Mistakes Makes LLM Better Reasoner

Paper • 2310.20689 • Published Oct 31, 2023 • 29

CapsFusion: Rethinking Image-Text Data at Scale

Paper • 2310.20550 • Published Oct 31, 2023 • 26

Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks

Paper • 2310.19909 • Published Oct 30, 2023 • 21

VideoCrafter1: Open Diffusion Models for High-Quality Video Generation

Paper • 2310.19512 • Published Oct 30, 2023 • 16

MM-VID: Advancing Video Understanding with GPT-4V(ision)

Paper • 2310.19773 • Published Oct 30, 2023 • 20

CodeFusion: A Pre-trained Diffusion Model for Code Generation

Paper • 2310.17680 • Published Oct 26, 2023 • 69

Wonder3D: Single Image to 3D using Cross-Domain Diffusion

Paper • 2310.15008 • Published Oct 23, 2023 • 22

LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing

Paper • 2311.00571 • Published Nov 1, 2023 • 40

Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

Paper • 2311.00430 • Published Nov 1, 2023 • 58