PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training Paper • 2309.10400 • Published Sep 19, 2023 • 25 • 1
WebArena: A Realistic Web Environment for Building Autonomous Agents Paper • 2307.13854 • Published Jul 25, 2023 • 23 • 4
IndicVoices: Towards building an Inclusive Multilingual Speech Dataset for Indian Languages Paper • 2403.01926 • Published Mar 4, 2024 • 1 • 2
Datasets for Large Language Models: A Comprehensive Survey Paper • 2402.18041 • Published Feb 28, 2024 • 2 • 1
Lost in the Middle: How Language Models Use Long Contexts Paper • 2307.03172 • Published Jul 6, 2023 • 35 • 3
DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models Paper • 2401.02132 • Published Jan 4, 2024 • 3 • 2
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity Paper • 2401.17072 • Published Jan 30, 2024 • 25 • 2
A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators Paper • 2312.15407 • Published Dec 24, 2023 • 1 • 2
MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models Paper • 2401.16745 • Published Jan 30, 2024 • 2
Benchmarking Large Language Models on Controllable Generation under Diversified Instructions Paper • 2401.00690 • Published Jan 1, 2024 • 1 • 2
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling Paper • 2304.01373 • Published Apr 3, 2023 • 8 • 1