Collections
Discover the best community collections!
Collections including paper arxiv:2311.12983
-
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
Paper • 2311.12022 • Published • 24 -
GAIA: a benchmark for General AI Assistants
Paper • 2311.12983 • Published • 176 -
gorilla-llm/APIBench
Updated • 26 • 60 -
Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models
Paper • 2312.04724 • Published • 19
-
Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models
Paper • 2311.06783 • Published • 26 -
The Impact of Large Language Models on Scientific Discovery: a Preliminary Study using GPT-4
Paper • 2311.07361 • Published • 11 -
GAIA: a benchmark for General AI Assistants
Paper • 2311.12983 • Published • 176 -
teknium/openhermes
Viewer • Updated • 243k • 289 • 188
-
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Paper • 2211.05100 • Published • 26 -
CsFEVER and CTKFacts: Acquiring Czech data for fact verification
Paper • 2201.11115 • Published -
Training language models to follow instructions with human feedback
Paper • 2203.02155 • Published • 13 -
FinGPT: Large Generative Models for a Small Language
Paper • 2311.05640 • Published • 26
-
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation
Paper • 2310.15123 • Published • 7 -
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search
Paper • 2310.13227 • Published • 12 -
LASER: LLM Agent with State-Space Exploration for Web Navigation
Paper • 2309.08172 • Published • 10 -
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper • 2310.04406 • Published • 8