6 SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models · 10 authors