SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity • Paper • 2401.17072 • Published Jan 30
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models • Paper • 2402.10524 • Published Feb 16