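"""Run a small local benchmark of sample questions against the GAIA agent."""

from agent.gaia_agent import create_langchain_agent

# Sample questions covering web lookup, video, string manipulation,
# image, spreadsheet, and summarization tasks.
questions = [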
    "How many studio albums were published by Mercedes Sosa between 2000 and 2009 (included)?",
    "In the video https://www.youtube.com/watch?v=u1xXCYZ4VYM, what is the highest number of bird species to be on screen at once?",
    "Reverse the string 'etisoppo eht'.",
    "What country had the least number of athletes at the 1928 Summer Olympics? Return the IOC country code.",
    "From the chessboard image at path 'chess_1.png', what is the best move?",
    "The attached Excel file contains food and drink sales. What are the total sales for food (excluding drinks)?",
    "Give me a comma-separated, alphabetized list of botanical vegetables from this: milk, eggs, flour, plums, lettuce, celery, broccoli, bell pepper, zucchini.",
    "Where were the Vietnamese specimens described by Kuznetzov in Nedoshivina’s 2010 paper eventually deposited? (City name only.)",
    "What is the name of the novel where a Martian child grows up on Earth and founds a church?",
    "Summarize the Wikipedia page on 'Taishō Tamai'."
]

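# Build the agent once and reuse it for every question.
agent = create_langchain_agent()

print("Running local benchmark...")

for idx, question in enumerate(questions, start=1):
    print(f"\nQUESTION {idx}: {question}")
    try:
        # The result is printed exactly as the agent returns it.
        answer = agent.invoke({"input": question})
        print("ANSWER:", answer)
    except Exception as e:
        # Report the failure and continue with the remaining questions.
        print("❌ Error:", e)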