- Gecko: Versatile Text Embeddings Distilled from Large Language Models — arXiv:2403.20327 (Mar 29, 2024)
- PERL: Parameter Efficient Reinforcement Learning from Human Feedback — arXiv:2403.10704 (Mar 15, 2024)
- Gemma: Open Models Based on Gemini Research and Technology — arXiv:2403.08295 (Mar 13, 2024)
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits — arXiv:2402.17764 (Feb 27, 2024)
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection — arXiv:2403.03507 (Mar 6, 2024)
- Instruction-tuned Language Models are Better Knowledge Learners — arXiv:2402.12847 (Feb 20, 2024)
- Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning — arXiv:2402.06619 (Feb 9, 2024)
- Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains — arXiv:2402.05140 (Feb 6, 2024)
- Repeat After Me: Transformers are Better than State Space Models at Copying — arXiv:2402.01032 (Feb 1, 2024)
- Specialized Language Models with Cheap Inference from Limited Domain Data — arXiv:2402.01093 (Feb 2, 2024)
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models — arXiv:2402.01739 (Jan 29, 2024)
- LongAlign: A Recipe for Long Context Alignment of Large Language Models — arXiv:2401.18058 (Jan 31, 2024)
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns — arXiv:2401.15024 (Jan 26, 2024)
- E^2-LLM: Efficient and Extreme Length Extension of Large Language Models — arXiv:2401.06951 (Jan 13, 2024)
- Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads — arXiv:2401.10774 (Jan 19, 2024)
- Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM — arXiv:2401.02994 (Jan 4, 2024)
- Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon — arXiv:2401.03462 (Jan 7, 2024)
- MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts — arXiv:2401.04081 (Jan 8, 2024)
- Pearl: A Production-ready Reinforcement Learning Agent — arXiv:2312.03814 (Dec 6, 2023)
- Gemini: A Family of Highly Capable Multimodal Models — arXiv:2312.11805 (Dec 19, 2023)
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory — arXiv:2312.11514 (Dec 12, 2023)
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer — arXiv:2401.01055 (Jan 2, 2024)
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling — arXiv:2312.15166 (Dec 23, 2023)
- MultiLoRA: Democratizing LoRA for Better Multi-Task Learning — arXiv:2311.11501 (Nov 20, 2023)
- Memory Augmented Language Models through Mixture of Word Experts — arXiv:2311.10768 (Nov 15, 2023)
- Orca 2: Teaching Small Language Models How to Reason — arXiv:2311.11045 (Nov 18, 2023)
- Understanding LLMs: A Comprehensive Overview from Training to Inference — arXiv:2401.02038 (Jan 4, 2024)
- Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers — arXiv:2311.10642 (Nov 17, 2023)
- Llama 2: Open Foundation and Fine-Tuned Chat Models — arXiv:2307.09288 (Jul 18, 2023)
- System 2 Attention (is something you might need too) — arXiv:2311.11829 (Nov 20, 2023)
- GPT4All: An Ecosystem of Open Source Compressed Language Models — arXiv:2311.04931 (Nov 6, 2023)
- Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs — arXiv:2311.05657 (Nov 9, 2023)
- The Generative AI Paradox: "What It Can Create, It May Not Understand" — arXiv:2311.00059 (Oct 31, 2023)
- LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing — arXiv:2311.00571 (Nov 1, 2023)
- Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models — arXiv:2310.13671 (Oct 20, 2023)
- LLM-FP4: 4-Bit Floating-Point Quantized Transformers — arXiv:2310.16836 (Oct 25, 2023)
- TinyStories: How Small Can Language Models Be and Still Speak Coherent English? — arXiv:2305.07759 (May 12, 2023)
- Small Models are Valuable Plug-ins for Large Language Models — arXiv:2305.08848 (May 15, 2023)
- Safe RLHF: Safe Reinforcement Learning from Human Feedback — arXiv:2310.12773 (Oct 19, 2023)
- An Emulator for Fine-Tuning Large Language Models using Small Language Models — arXiv:2310.12962 (Oct 19, 2023)
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection — arXiv:2310.11511 (Oct 17, 2023)
- Approximating Two-Layer Feedforward Networks for Efficient Transformers — arXiv:2310.10837 (Oct 16, 2023)