added logical and numerical reasoning benchmarks 5ff8e8c verified davidhornshaw committed on Oct 27, 2024