Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2402.01761

Rethinking Interpretability in the Era of Large Language Models

Paper • 2402.01761 • Published Jan 30, 2024 • 22
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models

Paper • 2402.01739 • Published Jan 29, 2024 • 26
BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 96
Stealing Part of a Production Language Model

Paper • 2403.06634 • Published Mar 11, 2024 • 90

OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models

Paper • 2402.01739 • Published Jan 29, 2024 • 26
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 73
Rethinking Interpretability in the Era of Large Language Models

Paper • 2402.01761 • Published Jan 30, 2024 • 22

My (Denis Gordeev) collection of mostly NLP papers. You can message me at t.me/nlp_party

LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31, 2024 • 20
Efficient Tool Use with Chain-of-Abstraction Reasoning

Paper • 2401.17464 • Published Jan 30, 2024 • 17
Scavenging Hyena: Distilling Transformers into Long Convolution Models

Paper • 2401.17574 • Published Jan 31, 2024 • 15
Rethinking Interpretability in the Era of Large Language Models

Paper • 2402.01761 • Published Jan 30, 2024 • 22

🔍 Daily Picks in Interpretability & Analysis of LMs

Outstanding research in interpretability and evaluation of language models, summarized

about 11 hours ago

Inferring Functionality of Attention Heads from their Parameters

Paper • 2412.11965 • Published 17 days ago • 2
LatentQA: Teaching LLMs to Decode Activations Into Natural Language

Paper • 2412.08686 • Published 22 days ago • 1
Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published 24 days ago • 64
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

Paper • 2411.14257 • Published Nov 21, 2024 • 9

Patchscope: A Unifying Framework for Inspecting Hidden Representations of Language Models

Paper • 2401.06102 • Published Jan 11, 2024 • 20
Rethinking Interpretability in the Era of Large Language Models

Paper • 2402.01761 • Published Jan 30, 2024 • 22

Large-Scale Automatic Audiobook Creation

Paper • 2309.03926 • Published Sep 7, 2023 • 54
Agents: An Open-source Framework for Autonomous Language Agents

Paper • 2309.07870 • Published Sep 14, 2023 • 42
PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 53
StarCoder: may the source be with you!

Paper • 2305.06161 • Published May 9, 2023 • 29

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs