codegen

community

Activity Feed

AI & ML interests

None defined yet.

codegenning's activity

cassanof

authored a paper 23 days ago

PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models

Paper • 2502.01584 • Published 24 days ago • 9

hugh-scale

authored a paper about 1 month ago

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24 • 63

mohit-raghavendra

authored 2 papers 4 months ago

Representation Learning in Continuous-Time Dynamic Signed Networks

Paper • 2207.03408 • Published Jul 7, 2022

Revisiting the Superficial Alignment Hypothesis

Paper • 2410.03717 • Published Sep 27, 2024

evanzwang

updated 4 datasets 5 months ago

hugh-scale

authored 4 papers 6 months ago

Chain-of-Thought Reasoning is a Policy Improvement Operator

Paper • 2309.08589 • Published Sep 15, 2023 • 1

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Paper • 2402.14688 • Published Feb 22, 2024

NATURAL PLAN: Benchmarking LLMs on Natural Language Planning

Paper • 2406.04520 • Published Jun 6, 2024 • 12

LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet

Paper • 2408.15221 • Published Aug 27, 2024

evanzwang

updated 7 datasets 6 months ago

codegenning/B_mbpp_plus_v2

Viewer • Updated Aug 22, 2024 • 756 • 54

codegenning/F_mbpp_plus

Viewer • Updated Aug 22, 2024 • 378 • 247

codegenning/B_human_eval_plus_v2

Viewer • Updated Aug 22, 2024 • 328 • 47

codegenning/B_human_eval_plus

Viewer • Updated Aug 22, 2024 • 328 • 54

codegenning/B_livecodebench_lite_v3_C

Viewer • Updated Aug 19, 2024 • 876 • 67

codegenning/B_livecodebench_lite_v3

Viewer • Updated Aug 19, 2024 • 348 • 67

codegenning/B_mbpp_plus

Viewer • Updated Aug 19, 2024 • 756 • 45

hugh-scale

updated a dataset 6 months ago

codegenning/B_livecodebench_C

Viewer • Updated Aug 16, 2024 • 174 • 49

Team members 8
