ajibawa-2023 (Feynman Innovations)

upvoted a collection about 9 hours ago

Granite Code Models

A series of code models trained by IBM and licensed under Apache 2.0. We release both the base pretrained and instruct models. • 18 items • Updated 2 days ago • 135

upvoted a paper about 1 month ago

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Paper • 2402.09844 • Published Feb 15 • 19

upvoted an article about 1 month ago

Article

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Apr 22

• 73

upvoted a paper about 1 month ago

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22 • 238

upvoted an article about 1 month ago

Article

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Apr 15

• 134

upvoted an article about 2 months ago

Article

DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive

Apr 9

• 26

upvoted a paper 2 months ago

Uni-SMART: Universal Science Multimodal Analysis and Research Transformer

Paper • 2403.10301 • Published Mar 15 • 50

upvoted 4 papers 3 months ago

Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset

Paper • 2403.09029 • Published Mar 14 • 52

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14 • 122

Stealing Part of a Production Language Model

Paper • 2403.06634 • Published Mar 11 • 85

DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows

Paper • 2402.10379 • Published Feb 16 • 27

upvoted 3 papers 4 months ago

Lumos : Empowering Multimodal LLMs with Scene Text Recognition

Paper • 2402.08017 • Published Feb 12 • 22

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5 • 61

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 135

upvoted a paper 5 months ago

Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math

Paper • 2312.17120 • Published Dec 28, 2023 • 25

upvoted 6 papers 6 months ago

upvoted a paper 8 months ago

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Paper • 2309.12284 • Published Sep 21, 2023 • 16

upvoted a paper 9 months ago

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 77

ajibawa-2023's activity

Large Language Models for Mathematicians

Beyond Surface: Probing LLaMA Across Scales and Layers

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

Pearl: A Production-ready Reinforcement Learning Agent

Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

SparQ Attention: Bandwidth-Efficient LLM Inference
