Ayaan Sharif

Ayaan-Sharif

AI & ML interests

NLP, LLM, TEXT, Languages

Recent Activity

liked a Space 4 days ago
microsoft/llmlingua-2
liked a model 4 days ago
THUDM/cogvlm2-llama3-caption
liked a model 4 days ago
Neurazum/Xbai-Epilepsy-1.0

Organizations

None yet

Ayaan-Sharif's activity

reacted to vladbogo's post with 👍 10 days ago
Panda-70M is a new large-scale video dataset comprising 70 million high-quality video clips, each paired with a textual caption, designed for pre-training models on video understanding tasks.

Key Points:
* Automatic Caption Generation: Utilizes an automatic pipeline with multiple cross-modality teacher models to generate captions for video clips.
* Fine-tuned Caption Selection: Employs a fine-tuned retrieval model to select the most appropriate caption from multiple candidates for each video clip.
* Improved Performance: Pre-training on Panda-70M shows significant performance gains in video captioning, text-video retrieval, and text-driven video generation.

Paper: Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers (2402.19479)
Project page: https://snap-research.github.io/Panda-70M/
Code: https://github.com/snap-research/Panda-70M

Congrats to the authors @tschen, @aliaksandr-siarohin et al. for their work!
New activity in tencent/HunyuanVideo 19 days ago

Multi-GPU setup when?

#5 opened 19 days ago by Ayaan-Sharif
reacted to merve's post with ❤️ about 2 months ago
Another great week in open ML!
Here's a small recap 🫰🏻

Model releases
⏯️ Video Language Models
AI at Meta released Vision-CAIR/LongVU_Qwen2_7B, a new state-of-the-art long-video language model based on DINOv2, SigLIP, Qwen2 and Llama 3.2

💬 Small language models
Hugging Face released HuggingFaceTB/SmolLM2-1.7B, a family of new smol language models under the Apache 2.0 license that come in sizes 135M, 360M and 1.7B, along with datasets.
Meta released facebook/MobileLLM-1B, a new family of on-device LLMs of sizes 125M, 350M and 600M.

🖼️ Image Generation
Stability AI released stabilityai/stable-diffusion-3.5-medium, a 2B model with a commercially permissive license

🖼️💬Any-to-Any
gpt-omni/mini-omni2, the closest reproduction of GPT-4o, has been released! It is a new LLM that takes image, text and audio input and outputs speech.

Dataset releases
🖼️ Spawning/PD12M, a new captioning dataset of 12.4 million examples generated using Florence-2
replied to singhsidhukuldeep's post 3 months ago
upvoted an article 7 months ago

Introducing the Open Arabic LLM Leaderboard

replied to akhaliq's post 10 months ago