Striping (Striping)

emozilla

authored a paper 4 months ago

DeMo: Decoupled Momentum Optimization

Paper • 2411.19870 • Published Nov 29, 2024 • 6

tridao

authored a paper 5 months ago

RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19, 2024 • 56

pragaash

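authored 4 papers 5 months ago

Feedback-Based Self-Learning in Large-Scale Conversational AI Agents

A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning

Self-Aware Feedback-Based Self-Learning in Large-Scale Conversational AI

Training-Free Activation Sparsity in Large Language Models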

tridao

authored a paper 8 months ago

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27, 2024 • 42

avnermay

authored a paper 8 months ago

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27, 2024 • 42

emozilla

authored a paper 8 months ago

Hermes 3 Technical Report

Paper • 2408.11857 • Published Aug 15, 2024 • 51

zhangce

authored a paper 10 months ago

Mixture-of-Agents Enhances Large Language Model Capabilities

Paper • 2406.04692 • Published Jun 7, 2024 • 60

juewang

authored a paper 10 months ago

Mixture-of-Agents Enhances Large Language Model Capabilities

Paper • 2406.04692 • Published Jun 7, 2024 • 60

tridao

authored 4 papers about 1 year ago

Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling

Paper • 2403.03234 • Published Mar 5, 2024 • 14

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 142

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15, 2024 • 23

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19, 2024 • 56

Zymrael

authored 5 papers over 1 year ago

Hyena Hierarchy: Towards Larger Convolutional Language Models

Paper • 2302.10866 • Published Feb 21, 2023 • 7

Deep Latent State Space Models for Time-Series Generation

Paper • 2212.12749 • Published Dec 24, 2022 • 1

Neural Solvers for Fast and Accurate Numerical Optimal Control

Paper • 2203.08072 • Published Mar 13, 2022

Transform Once: Efficient Operator Learning in Frequency Domain

Paper • 2211.14453 • Published Nov 26, 2022

Effectively Modeling Time Series with Simple Discrete State Spaces

Paper • 2303.09489 • Published Mar 16, 2023 • 1

Team members 13
