Two Giraffes in a Dirt Field: Using Game Play to Investigate Situation Modelling in Large Multimodal Models Paper • 2406.14035 • Published Jun 20, 2024 • 12
How Many Parameters Does it Take to Change a Light Bulb? Evaluating Performance in Self-Play of Conversational Games as a Function of Model Characteristics Paper • 2406.14051 • Published Jun 20, 2024 • 9
Clembench: Using Game Play to Evaluate Chat-Optimized Language Models as Conversational Agents Paper • 2305.13455 • Published May 22, 2023 • 3
Images in Language Space: Exploring the Suitability of Large Language Models for Vision & Language Tasks Paper • 2305.13782 • Published May 23, 2023