Collections including paper arxiv:2402.13064

- Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
  Paper • 2402.13064 • Published • 46
- Large Language Models for Data Annotation: A Survey
  Paper • 2402.13446 • Published
- AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators
  Paper • 2303.16854 • Published • 1
- Open-Source Large Language Models Outperform Crowd Workers and Approach ChatGPT in Text-Annotation Tasks
  Paper • 2307.02179 • Published • 7

- PALO: A Polyglot Large Multimodal Model for 5B People
  Paper • 2402.14818 • Published • 23
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 111
- User-LLM: Efficient LLM Contextualization with User Embeddings
  Paper • 2402.13598 • Published • 18
- Coercing LLMs to do and reveal (almost) anything
  Paper • 2402.14020 • Published • 12

- Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
  Paper • 2402.13064 • Published • 46
- DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
  Paper • 2402.10379 • Published • 29
- Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach
  Paper • 2405.15613 • Published • 13
- Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets
  Paper • 2405.18952 • Published • 10

- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
  Paper • 2402.13249 • Published • 10
- The FinBen: An Holistic Financial Benchmark for Large Language Models
  Paper • 2402.12659 • Published • 16
- Instruction-tuned Language Models are Better Knowledge Learners
  Paper • 2402.12847 • Published • 24
- Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
  Paper • 2402.13064 • Published • 46

- Large Language Model Alignment: A Survey
  Paper • 2309.15025 • Published • 2
- Aligning Large Language Models with Human: A Survey
  Paper • 2307.12966 • Published • 1
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 48
- SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF
  Paper • 2310.05344 • Published • 1

- Speculative Streaming: Fast LLM Inference without Auxiliary Models
  Paper • 2402.11131 • Published • 41
- Generative Representational Instruction Tuning
  Paper • 2402.09906 • Published • 51
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 99
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 17