arxiv:2307.02179

Open-Source Large Language Models Outperform Crowd Workers and Approach ChatGPT in Text-Annotation Tasks

Published on Jul 5, 2023
· Featured in Daily Papers on Jul 6, 2023

Abstract

This study examines the performance of open-source Large Language Models (LLMs) in text annotation tasks and compares it with proprietary models like ChatGPT and human-based services such as MTurk. While prior research has demonstrated the high performance of ChatGPT across numerous NLP tasks, open-source LLMs like HuggingChat and FLAN are gaining attention for their cost-effectiveness, transparency, reproducibility, and superior data protection. We assess these models using both zero-shot and few-shot approaches and different temperature parameters across a range of text annotation tasks. Our findings show that while ChatGPT achieves the best performance in most tasks, open-source LLMs not only outperform MTurk but also demonstrate competitive potential against ChatGPT in specific tasks.

Community

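Hello

This study examines the performance of open-source large language models (LLMs) in text annotation tasks and compares it with proprietary models such as ChatGPT and human-based services such as MTurk. While prior research has demonstrated the high performance of ChatGPT across numerous NLP tasks, open-source LLMs like HuggingChat and FLAN are gaining attention for their cost-effectiveness, transparency, reproducibility, and superior data protection. We assess these models using zero-shot and few-shot approaches and different temperature parameters across a range of text annotation tasks. Our findings show that while ChatGPT achieves the best performance in most tasks, open-source LLMs not only outperform MTurk but also show competitive potential against ChatGPT in specific tasks.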

