Salman Khan

salmaneme

AI & ML interests

None yet

Recent Activity

liked a Space 9 days ago

inceptionai/Arabic-Leaderboards

liked a dataset about 1 month ago

MBZUAI/GeoPixelD

salmaneme's activity

upvoted a collection about 1 month ago

KITAB-Bench

A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding • 24 items • Updated Feb 24 • 11

upvoted a paper about 2 months ago

AIN: The Arabic INclusive Large Multimodal Model

Paper • 2502.00094 • Published Jan 31 • 17

upvoted a paper 3 months ago

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published Jan 10 • 65

upvoted a paper 4 months ago

BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities

Paper • 2412.07769 • Published Dec 10, 2024 • 27

upvoted 6 collections 11 months ago

Satmae++

Collection of ViT models trained using the SatMAE++ approach. • 4 items • Updated Jun 11, 2024 • 1

GeoChat

GeoChat is the first grounded Large Vision Language Model, specifically tailored to Remote Sensing (RS) scenarios. • 4 items • Updated Jun 11, 2024 • 5

MobiLlama

Collection of MobiLlama Language Models. • 6 items • Updated Jun 11, 2024 • 14

GLaMM

Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated. • 9 items • Updated Jun 11, 2024 • 4

Video-ChatGPT

"Video-ChatGPT" is a video conversation model capable of generating meaningful conversation about videos. • 2 items • Updated Jun 11, 2024 • 3

LLaVA++ (LLaMA-3 and Phi-3-Mini)

Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated Jun 11, 2024 • 23

upvoted a paper about 1 year ago

PALO: A Polyglot Large Multimodal Model for 5B People

Paper • 2402.14818 • Published Feb 22, 2024 • 24

upvoted a paper over 1 year ago

GLaMM: Pixel Grounding Large Multimodal Model

Paper • 2311.03356 • Published Nov 6, 2023 • 36