Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2402.06925

Papers - Decoders - Strategy - Beam Search - Report

A Thorough Examination of Decoding Methods in the Era of LLMs

Paper • 2402.06925 • Published Feb 10 • 1

Papers - Decoders - Report

A Thorough Examination of Decoding Methods in the Era of LLMs

Paper • 2402.06925 • Published Feb 10 • 1

Papers - University - The Chinese University of Hong Kong

Interactive3D: Create What You Want by Interactive 3D Generation

Paper • 2404.16510 • Published Apr 25 • 18
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension

Paper • 2404.16790 • Published Apr 25 • 7
A Thorough Examination of Decoding Methods in the Era of LLMs

Paper • 2402.06925 • Published Feb 10 • 1
LLaVA-OneVision: Easy Visual Task Transfer

Paper • 2408.03326 • Published Aug 6 • 59

Papers - Llama 2

Instruction Tuning with Human Curriculum

Paper • 2310.09518 • Published Oct 14, 2023 • 3
A Thorough Examination of Decoding Methods in the Era of LLMs

Paper • 2402.06925 • Published Feb 10 • 1
Distilling System 2 into System 1

Paper • 2407.06023 • Published Jul 8 • 3

Papers - Tencent

Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs

Paper • 2312.17080 • Published Dec 28, 2023 • 1
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18 • 53
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension

Paper • 2404.16790 • Published Apr 25 • 7
A Thorough Examination of Decoding Methods in the Era of LLMs

Paper • 2402.06925 • Published Feb 10 • 1

Papers - University - Tsinghua University

Condition-Aware Neural Network for Controlled Image Generation

Paper • 2404.01143 • Published Apr 1 • 11
FlexiDreamer: Single Image-to-3D Generation with FlexiCubes

Paper • 2404.00987 • Published Apr 1 • 21
Advancing LLM Reasoning Generalists with Preference Trees

Paper • 2404.02078 • Published Apr 2 • 43
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline

Paper • 2404.02893 • Published Apr 3 • 20

Papers - Text - Research

An Interdisciplinary Comparison of Sequence Modeling Methods for Next-Element Prediction

Paper • 1811.00062 • Published Oct 31, 2018 • 2
mT5: A massively multilingual pre-trained text-to-text transformer

Paper • 2010.11934 • Published Oct 22, 2020 • 4
Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance

Paper • 2310.10021 • Published Oct 16, 2023 • 2
Gemma: Open Models Based on Gemini Research and Technology

Paper • 2403.08295 • Published Mar 13 • 47

Papers - Text - Decoders

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
Transformers Can Achieve Length Generalization But Not Robustly

Paper • 2402.09371 • Published Feb 14 • 12
A Thorough Examination of Decoding Methods in the Era of LLMs

Paper • 2402.06925 • Published Feb 10 • 1

Papers - Math - GSM8K

Training Verifiers to Solve Math Word Problems

Paper • 2110.14168 • Published Oct 27, 2021 • 4
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Paper • 2309.12284 • Published Sep 21, 2023 • 18
LiteSearch: Efficacious Tree Search for LLM

Paper • 2407.00320 • Published Jun 29 • 37
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Paper • 2309.03883 • Published Sep 7, 2023 • 33

Papers - Decoders

Lossless Acceleration for Seq2seq Generation with Aggressive Decoding

Paper • 2205.10350 • Published May 20, 2022 • 2
Blockwise Parallel Decoding for Deep Autoregressive Models

Paper • 1811.03115 • Published Nov 7, 2018 • 2
Fast Transformer Decoding: One Write-Head is All You Need

Paper • 1911.02150 • Published Nov 6, 2019 • 6
Sequence-Level Knowledge Distillation

Paper • 1606.07947 • Published Jun 25, 2016 • 2

Previous
1
2
3
Next

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs