berthold joseph

bone

bonejay

AI & ML interests

None yet

Organizations

None yet

bone's activity

upvoted 2 papers about 1 month ago

LIMO: Less is More for Reasoning

Paper • 2502.03387 • Published Feb 5 • 58

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30 • 56

upvoted 3 papers 4 months ago

Hymba: A Hybrid-head Architecture for Small Language Models

Paper • 2411.13676 • Published Nov 20, 2024 • 42

DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Paper • 2411.04928 • Published Nov 7, 2024 • 51

BitNet a4.8: 4-bit Activations for 1-bit LLMs

Paper • 2411.04965 • Published Nov 7, 2024 • 66

upvoted a paper 7 months ago

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

Paper • 2408.08152 • Published Aug 15, 2024 • 56

upvoted a paper 9 months ago

BitsFusion: 1.99 bits Weight Quantization of Diffusion Model

Paper • 2406.04333 • Published Jun 6, 2024 • 38

upvoted 2 papers 11 months ago

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

Paper • 2404.02905 • Published Apr 3, 2024 • 69

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 104

upvoted a paper 12 months ago

Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28, 2024 • 108

upvoted a paper about 1 year ago

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 180