All HF Hub posts

akhaliq
posted an update about 3 hours ago
Chameleon

Mixed-Modal Early-Fusion Foundation Models

Chameleon: Mixed-Modal Early-Fusion Foundation Models (2405.09818)

We present Chameleon, a family of early-fusion token-based mixed-modal models capable of understanding and generating images and text in any arbitrary sequence. We outline a stable training approach from inception, an alignment recipe, and an architectural parameterization tailored for the early-fusion, token-based, mixed-modal setting. The models are evaluated on a comprehensive range of tasks, including visual question answering, image captioning, text generation, image generation, and long-form mixed modal generation. Chameleon demonstrates broad and general capabilities, including state-of-the-art performance in image captioning tasks, outperforms Llama-2 in text-only tasks while being competitive with models such as Mixtral 8x7B and Gemini-Pro, and performs non-trivial image generation, all in a single model. It also matches or exceeds the performance of much larger models, including Gemini Pro and GPT-4V, according to human judgments on a new long-form mixed-modal generation evaluation, where either the prompt or outputs contain mixed sequences of both images and text. Chameleon marks a significant step forward in a unified modeling of full multimodal documents.
merve
posted an update about 5 hours ago
I got asked about PaliGemma's document understanding capabilities, so I built a Space that has all the PaliGemma fine-tuned doc models 📄📊📖
merve/paligemma-doc
lamhieu
posted an update about 6 hours ago
🎉 Happy to announce the collection called "Blackhole". It is a black hole of high-quality data in many fields and languages for training LLMs with SFT and DPO methods.
📦 There are now over 30 high-quality datasets available, so you can start creating interesting models. The collection will be updated in the future; glad if it helps someone.

lamhieu/blackhole-66473b7feec034b4fb70818a
Ali-C137
posted an update about 7 hours ago
eienmojiki
posted an update about 7 hours ago
👀 Try the new anime generation model, StarryXL

🪄 Starry XL improves upon the Kohaku Epsilon model by targeting the specific styles of top Pixiv artists and expanding the character dataset to generate high-quality images.

✨ Starry is based on Epsilon, and its training captions closely follow Kohaku Epsilon's, so overall usage is the same. Go to the model's page below to see in detail how to use it!

🔎 Resources:
- StarryXL v5.2 on Huggingface: eienmojiki/Starry-XL-v5.2
- Official model page: https://civitai.com/models/448552?modelVersionId=499498
- Kohaku-XL Epsilon: https://civitai.com/models/399873?modelVersionId=445973

📃 Credits:
- Demo: @eienmojiki
- Model's author: kitarz
albertvillanova
posted an update about 9 hours ago
Easily convert your script-based datasets to Parquet and explore them in the dataset viewer. 🌟

🛠️ Use the @huggingface Datasets CLI:
$ datasets-cli convert_to_parquet USERNAME/DATASET_NAME

Learn more: https://huggingface.co/docs/datasets/main/en/cli#convert-to-parquet
#Data #AI
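If you have several script-based datasets to migrate, the same CLI call can be wrapped from Python. A minimal sketch, assuming the `datasets` package is installed and you are logged in to the Hub; `USERNAME/DATASET_NAME` is the placeholder from the command above:

```python
import subprocess


def convert_to_parquet(repo_id: str, dry_run: bool = True) -> list[str]:
    """Build (and optionally run) the datasets-cli conversion command.

    The real invocation opens a pull request on the Hub that converts the
    script-based dataset at `repo_id` to Parquet.
    """
    cmd = ["datasets-cli", "convert_to_parquet", repo_id]
    if not dry_run:
        # Requires `pip install datasets` and Hub authentication.
        subprocess.run(cmd, check=True)
    return cmd


# Dry run: only prints the command that would be executed.
print(convert_to_parquet("USERNAME/DATASET_NAME"))
```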
hakunamatata1997
posted an update about 11 hours ago
Why did Salesforce remove SFR-Iterative-DPO-LLaMA-3-8B-R? Any ideas?
  • 1 reply
SivilTaram
posted an update about 17 hours ago
Introducing the Sailor-14B Model and Sailor2 Project 🚢

We're thrilled to announce the release of the Sailor-14B models, including the Base and the Chat versions!

✅ Built upon the Qwen1.5-14B model, the Base version follows a similar procedure to our Sailor-7B model.
✅ The Chat version is optimized using DPO on our in-house human preference dataset, yielding a better experience than our previous Chat models.

🏠 Home: https://sailorllm.github.io
🤗 Model: sail/Sailor-14B-Chat
💻 Demo: sail/Sailor-14B-Chat

We're also excited to introduce the Sailor2 project, ✨ an open collaboration opportunity for the entire community! ✨

🌐 The Sailor2 project aims to build an LLM with ~30B parameters, optimized for multiple South-East Asian languages, including Cebuano, Indonesian, Khmer, Lao, Minangkabau, Malay, Burmese, Sundanese, Javanese, Thai, and Vietnamese.

🎯 The model will undergo continual pre-training from a base model proficient in both Chinese and English, using nearly 800B SEA tokens, with performance expected to be comparable to the most advanced commercial models for the above SEA languages.

🤝 Contribute your data, expertise, and ideas to shape the future of open-source LLMs for the SEA region.

🌍 Everyone passionate about the SEA region is welcome aboard! Join the party and get involved by scanning the QR code! 🔍

Let's sail together and enjoy the journey! ⚓
  • 2 replies
MonsterMMORPG
posted an update about 17 hours ago
Stable Cascade Full Tutorial for Windows, Massed Compute, RunPod & Kaggle - Predecessor of SD3 - 1-Click Install Amazing Gradio APP

Stable Cascade is another amazing model from Stability AI.

Weights are published.

Stable Cascade Full Tutorial for Windows - Predecessor of SD3 - 1-Click Install Amazing Gradio APP: https://youtu.be/q0cYhalUUsc

Stable Cascade Full Tutorial for Cloud - Predecessor of SD3 - Massed Compute, RunPod & Kaggle: https://youtu.be/PKDeMdEObNo

singhsidhukuldeep
posted an update about 19 hours ago
🎉 A new LLM is launched! 🚀
After checking whether it's open-source, 🤔
you rush to see the benchmarks... 🏃‍♂️💨

Which benchmark does everyone check first? 🔍

MMLU (Massive Multitask Language Understanding)? 📚

Benchmarks like MMLU are reaching saturation... and most of the time the performance does not translate to real-world use cases! 🌍❗

Meet MMLU-Pro, released by TIGER-Lab on @huggingface! 🐯🌍

🧪 12,217 questions across biology, business, chemistry, computer science, economics, engineering, health, history, law, mathematics, philosophy, physics, and psychology, carefully validated by humans 🔬

🔟 Uses 10 options per question instead of 4; this increase in options makes the evaluation more realistic and reduces random guessing 🎯

📊 56% of questions come from MMLU, 34% from STEM websites, and the rest from TheoremQA and SciBench 📈

🤖 LLMs with weak chain-of-thought reasoning tend to perform worse, indicating the benchmark is more challenging and representative of real-world expectations 🧠💡

Any guess who tops it and who bombs it? 🤔📉📈
GPT-4o drops by 17 points (from 0.887 to 0.7149) 📉
Llama-3-70B drops by 27 points (from 0.820 to 0.5541) 📉
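As a quick sanity check on those numbers (the quoted drops are absolute accuracy differences in percentage points, not relative percentages), here is a small illustration using the scores above; the helper is my own, not part of the MMLU-Pro tooling:

```python
def drop_points(mmlu: float, mmlu_pro: float) -> float:
    """Absolute MMLU -> MMLU-Pro accuracy drop, in percentage points."""
    return round((mmlu - mmlu_pro) * 100, 1)


# Scores quoted above: (MMLU, MMLU-Pro).
scores = {
    "GPT-4o": (0.887, 0.7149),
    "Llama-3-70B": (0.820, 0.5541),
}
for model, (mmlu, mmlu_pro) in scores.items():
    print(f"{model}: -{drop_points(mmlu, mmlu_pro)} points")
# GPT-4o: -17.2 points; Llama-3-70B: -26.6 points
```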

🔗 TIGER-Lab/MMLU-Pro
  • 2 replies