Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts Paper • 2405.11273 • Published 12 days ago • 15
Article CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models • 6 days ago • 16
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20 • 57
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents Paper • 2306.16527 • Published Jun 21, 2023 • 42
Article Train custom AI models with the trainer API and adapt them to 🤗 By not-lain • 4 days ago • 19
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 11 items • Updated 13 days ago • 102
Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch By AviSoori1x • 22 days ago • 24
Article SeeMoE: Implementing a MoE Vision Language Model from Scratch By AviSoori1x • 24 days ago • 24
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • 17 days ago • 42
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published about 1 month ago • 114
Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • Apr 26 • 55
Vision Language Models Papers 🖼️💬📝 Collection Papers about vision-language models; the most important ones are at the top of the list. • 27 items • Updated 30 days ago • 26
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 20 items • Updated 8 days ago • 293
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases. • 5 items • Updated Apr 18 • 545
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4 • 58
Article DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive By bpan • Apr 9 • 26
Article Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models • Mar 20 • 24
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2 • 102
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 567
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2 • 61
Leveraging Corpus Metadata to Detect Template-based Translation: An Exploratory Case Study of the Egyptian Arabic Wikipedia Edition Paper • 2404.00565 • Published Mar 31 • 6
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 89
Larimar: Large Language Models with Episodic Memory Control Paper • 2403.11901 • Published Mar 18 • 30
Common Corpus Collection The largest public domain dataset for training LLMs. • 26 items • Updated Mar 20 • 103
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 120
Language models scale reliably with over-training and on downstream tasks Paper • 2403.08540 • Published Mar 13 • 13
Simple and Scalable Strategies to Continually Pre-train Large Language Models Paper • 2403.08763 • Published Mar 13 • 48
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 175
EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs Paper • 2403.02775 • Published Mar 5 • 11
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation Paper • 2401.08417 • Published Jan 16 • 26
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models Paper • 2309.11674 • Published Sep 20, 2023 • 29
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method Paper • 2402.17193 • Published Feb 27 • 23