Papers-Fundamentals - a sugatoray Collection

sugatoray 's Collections

Reasoning Datasets

SmolAgents Tools (Spaces)

Bookmark::Models

LLMs

AV LLMs

LLM Training Datasets

Papers

Leaderboards 🔥

Papers-Fundamentals

TFM: TimeSeries Foundation Models

Papers-Benchmarks

LLMs-EmbeddingModels

LLM + Datasets : Finance

Papers-Fundamentals

updated about 9 hours ago

RoFormer: Enhanced Transformer with Rotary Position Embedding

Paper • 2104.09864 • Published Apr 20, 2021 • 11
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 49
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

Paper • 2404.03715 • Published Apr 4, 2024 • 61
Zero-Shot Tokenizer Transfer

Paper • 2405.07883 • Published May 13, 2024 • 5
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM

Paper • 2401.02994 • Published Jan 4, 2024 • 49
The Prompt Report: A Systematic Survey of Prompting Techniques

Paper • 2406.06608 • Published Jun 6, 2024 • 58
Extreme Compression of Large Language Models via Additive Quantization

Paper • 2401.06118 • Published Jan 11, 2024 • 12
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 85
HyperZcdotZcdotW Operator Connects Slow-Fast Networks for Full Context Interaction

Paper • 2401.17948 • Published Jan 31, 2024 • 2
Grokfast: Accelerated Grokking by Amplifying Slow Gradients

Paper • 2405.20233 • Published May 30, 2024 • 6
Stream of Search (SoS): Learning to Search in Language

Paper • 2404.03683 • Published Apr 1, 2024 • 30
Xmodel-2 Technical Report

Paper • 2412.19638 • Published Dec 27, 2024 • 25
Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published 26 days ago • 53
Foundations of Large Language Models

Paper • 2501.09223 • Published 19 days ago • 2
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 12 days ago • 284