Collections
Collections including paper arxiv:2402.04792
- Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
  Paper • 2310.04406 • Published • 8
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 91
- ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
  Paper • 2402.09320 • Published • 6
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 104

- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 91
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 34
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 17
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
  Paper • 2402.09727 • Published • 35

- Suppressing Pink Elephants with Direct Principle Feedback
  Paper • 2402.07896 • Published • 7
- Policy Improvement using Language Feedback Models
  Paper • 2402.07876 • Published • 5
- Direct Language Model Alignment from Online AI Feedback
  Paper • 2402.04792 • Published • 25
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
  Paper • 2401.01335 • Published • 61

- Direct Language Model Alignment from Online AI Feedback
  Paper • 2402.04792 • Published • 25
- Suppressing Pink Elephants with Direct Principle Feedback
  Paper • 2402.07896 • Published • 7
- Reformatted Alignment
  Paper • 2402.12219 • Published • 15
- Self-Play Preference Optimization for Language Model Alignment
  Paper • 2405.00675 • Published • 19

- LongAlign: A Recipe for Long Context Alignment of Large Language Models
  Paper • 2401.18058 • Published • 21
- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 15
- Scavenging Hyena: Distilling Transformers into Long Convolution Models
  Paper • 2401.17574 • Published • 14
- Rethinking Interpretability in the Era of Large Language Models
  Paper • 2402.01761 • Published • 19

- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 50
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 18
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 18
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 5

- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
  Paper • 2309.12307 • Published • 83
- NEFTune: Noisy Embeddings Improve Instruction Finetuning
  Paper • 2310.05914 • Published • 13
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 55
- Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon
  Paper • 2401.03462 • Published • 25

- Contrastive Preference Learning: Learning from Human Feedback without RL
  Paper • 2310.13639 • Published • 21
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 45
- Diffusion Model Alignment Using Direct Preference Optimization
  Paper • 2311.12908 • Published • 47
- RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
  Paper • 2312.00849 • Published • 8