Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2402.10200

Papers - Decoders - CoT Decoding

Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 98

Papers - CoT - Chain of Thought

Contrastive Decoding Improves Reasoning in Large Language Models

Paper • 2309.09117 • Published Sep 17, 2023 • 37
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 98
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21 • 50
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

Paper • 2402.12875 • Published Feb 20 • 12

Papers - Observability and Interpretability

JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention

Paper • 2310.00535 • Published Oct 1, 2023 • 2
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small

Paper • 2211.00593 • Published Nov 1, 2022 • 2
Rethinking Interpretability in the Era of Large Language Models

Paper • 2402.01761 • Published Jan 30 • 21
Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

Paper • 2307.09458 • Published Jul 18, 2023 • 10

Papers - Pre-training

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

Paper • 2310.20587 • Published Oct 31, 2023 • 16
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 98
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Paper • 2403.15042 • Published Mar 22 • 24
LIMA: Less Is More for Alignment

Paper • 2305.11206 • Published May 18, 2023 • 21

Research Papers

Research papers related to NLP.

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41
Self-Attention with Relative Position Representations

Paper • 1803.02155 • Published Mar 6, 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding

Paper • 2401.12954 • Published Jan 23 • 28

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6 • 63
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks

Paper • 2402.09025 • Published Feb 14 • 6
Shortened LLaMA: A Simple Depth Pruning for Large Language Models

Paper • 2402.02834 • Published Feb 5 • 13
Algorithmic progress in language models

Paper • 2403.05812 • Published Mar 9 • 18

paper_collection

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 592
Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7 • 61
Simple and Scalable Strategies to Continually Pre-train Large Language Models

Paper • 2403.08763 • Published Mar 13 • 48
Stealing Part of a Production Language Model

Paper • 2403.06634 • Published Mar 11 • 90

Evaluating Very Long-Term Conversational Memory of LLM Agents

Paper • 2402.17753 • Published Feb 27 • 18
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

Paper • 2402.16671 • Published Feb 26 • 26
Do Large Language Models Latently Perform Multi-Hop Reasoning?

Paper • 2402.16837 • Published Feb 26 • 24
Divide-or-Conquer? Which Part Should You Distill Your LLM?

Paper • 2402.15000 • Published Feb 22 • 22

Local functions models for testing with Home Assistant chat and devices control

fireworks-ai/firefunction-v1

Text Generation • Updated Mar 6 • 252 • 123
OpenBuddy/openbuddy-mixtral-7bx8-v18.1-32k

Text Generation • Updated Mar 4 • 4.53k • 14
meetkai/functionary-medium-v2.2

Text Generation • Updated 11 days ago • 16 • 26
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 98

Foundation AI Papers

Curated List of Must-Reads on LLM reasoning at Temus AI team

Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

Paper • 2310.04406 • Published Oct 6, 2023 • 8
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 98
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization

Paper • 2402.09320 • Published Feb 14 • 6
Self-Discover: Large Language Models Self-Compose Reasoning Structures

Paper • 2402.03620 • Published Feb 6 • 109

Previous
1
2
3
4
...
6
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs