TA

AIIAR

AI & ML interests

Deep Learning & LLM

Recent Activity

liked a Space 23 days ago
Qwen/QwQ-32B-preview
liked a Space about 2 months ago
jadechoghari/OmniParser

Organizations

Social Post Explorers, Binary

AIIAR's activity

reacted to nroggendorff's post with 👍 6 months ago
reacted to KingNish's post with 👍 7 months ago
Post
5058
Decoding GPT-4'o': Its Mechanisms and Creating Similar AI.

Read Full Article: https://huggingface.co/blog/KingNish/decoding-gpt-4o

Summary of Article:
# Mechanics of GPT-4'o': GPT-4'o' operates through three main components 🛠️

1. SuperChat: Integrates image generation and QnA (image, document and video) for diverse interactions.
2. Voice Chat: Merges TTS and STT for real-time, human-like audio responses, focusing on human interaction.
3. Video Chat: Utilizes Zero Shot Image Classification to enhance user interaction with visual information.

# Methods to Create Similar AI 🧠

1. MultiModalification: Combines multiple models for a powerful, multifunctional AI.
2. Duct Tape Method: Uses different models or APIs for specific tasks without additional training.

The article provides an in-depth exploration of GPT-4'o', its functionalities, and methods to create similar AI models. It emphasizes the model's language support and its innovative approach to human-AI interaction. 💡
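
The "Duct Tape Method" described above, sending each task to a separate existing model or API with no extra training, can be pictured as a simple dispatcher. The sketch below is only an illustration under stated assumptions: the handler functions (`speech_to_text`, `answer_text`, `classify_image`), the `ROUTES` table, and the task-type names are hypothetical placeholders, not code from the article or any real library.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for separately hosted models or APIs.
# In a real setup each would call an existing model; none is trained here.
def speech_to_text(payload: str) -> str:
    return f"[transcript of {payload}]"


def answer_text(payload: str) -> str:
    return f"[answer to {payload}]"


def classify_image(payload: str) -> str:
    return f"[label for {payload}]"


# Assumed routing table: task type -> handler for that task.
ROUTES: Dict[str, Callable[[str], str]] = {
    "audio": speech_to_text,
    "text": answer_text,
    "image": classify_image,
}


def route(task_type: str, payload: str) -> str:
    """Send the payload to the handler registered for its task type."""
    try:
        handler = ROUTES[task_type]
    except KeyError:
        raise ValueError(f"no model registered for task type {task_type!r}") from None
    return handler(payload)


def voice_chat(audio_clip: str) -> str:
    """Chain two handlers: transcribe the clip, then answer the transcript."""
    return route("text", route("audio", audio_clip))
```

Calling `route("image", "cat.png")` hands the input to the image handler, while `voice_chat("clip.wav")` chains the speech and text handlers, which is the "taping together" the summary refers to.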

(NOTE: Summary is AI generated) ✅
  • 2 replies
replied to Warlord-K's post 7 months ago
replied to alielfilali01's post 9 months ago

Very true. Not everything has been done yet; there are still innovations ahead.

reacted to alielfilali01's post with 👍 9 months ago
Post
2184
Honestly, I don't understand how we, as the open-source community, haven't surpassed GPT-4 yet. To me it looks like everything is out there and just needs to be exploited! Clearly, specialized small models outperform GPT-4 on downstream tasks! So why haven't we just trained a really strong 1B-2B general model and then continued pretraining and/or fine-tuned it on datasets for downstream tasks like math, code... well structured in textbook format or other dataset formats that have proven to be really efficient and good? Once you have 100 fine-tuned models, just wrap them all into a FrankenMoE and voilà ✨
And that's just what a NOOB like myself had in mind; I'm sure there are better, more efficient ways to do it! So the question again: why haven't we yet? I feel I'm missing something... Right?
reacted to akhaliq's post with ❤️😎 9 months ago
Post
2249
Mora

Enabling Generalist Video Generation via A Multi-Agent Framework

Mora: Enabling Generalist Video Generation via A Multi-Agent Framework (2403.13248)

Sora is the first large-scale generalist video generation model that garnered significant attention across society. Since its launch by OpenAI in February 2024, no other video generation models have paralleled Sora's performance or its capacity to support a broad spectrum of video generation tasks. Additionally, there are only a few fully published video generation models, with the majority being closed-source. To address this gap, this paper proposes a new multi-agent framework Mora, which incorporates several advanced visual AI agents to replicate generalist video generation demonstrated by Sora. In particular, Mora can utilize multiple visual agents and successfully mimic Sora's video generation capabilities in various tasks, such as (1) text-to-video generation, (2) text-conditional image-to-video generation, (3) extend generated videos, (4) video-to-video editing, (5) connect videos and (6) simulate digital worlds. Our extensive experimental results show that Mora achieves performance that is proximate to that of Sora in various tasks. However, there exists an obvious performance gap between our work and Sora when assessed holistically. In summary, we hope this project can guide the future trajectory of video generation through collaborative AI agents.
reacted to vikhyatk's post with ❤️ 9 months ago
Post
2239
Just released a dataset with 1.5M image question/answers! vikhyatk/lnqa
New activity in AIIAR/open-gpt-Image-Prompt-Generator 9 months ago

test

1
#3 opened 9 months ago by
rezalkretiva

reacted to DmitryRyumin's post with ❤️ 10 months ago