โš ๏ธ PARODY / PRACTICE REPO โ€” ํŒจ๋Ÿฌ๋””ยท์—…๋กœ๋“œ ์—ฐ์Šต์šฉ ๋ฆฌํฌ์ž…๋‹ˆ๋‹ค. ์ด ์ €์žฅ์†Œ๋Š” hf CLI ์—…๋กœ๋“œ ์—ฐ์Šต์šฉ์ž…๋‹ˆ๋‹ค. gwangju no1 llm์€ ์‹ค์ œ ๋ชจ๋ธ์ด ์•„๋‹ˆ๋ฉฐ ํ‰๊ฐ€๋ฅผ ์ˆ˜ํ–‰ํ•˜์ง€ ์•Š์•˜์Šต๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ์•„๋ž˜ ๋ฆฌ๋”๋ณด๋“œ์—์„œ ์šฐ๋ฆฌ ๋ชจ๋ธ์˜ ์ ์ˆ˜๋Š” ์ •์งํ•˜๊ฒŒ **0 (๋ฏธํ‰๊ฐ€)**๋กœ ํ‘œ๊ธฐํ•ฉ๋‹ˆ๋‹ค. ๋น„๊ต์šฉ์œผ๋กœ ํ•จ๊ป˜ ์‹ค์€ ๋‹ค๋ฅธ ๋ชจ๋ธ ์ ์ˆ˜๋Š” ๊ณต๊ฐœ ์ง‘๊ณ„ ์ถœ์ฒ˜๊ฐ€ ์žˆ๋Š” ์‹ค์ธก์น˜์ด๋ฉฐ, ์ถœ์ฒ˜๋ฅผ ๊ฐ ํ–‰์— ๋ช…์‹œํ–ˆ์Šต๋‹ˆ๋‹ค. ์ €์žฅ์†Œ ์†Œ์œ ์ž์˜ ๋ช…์‹œ์  ์š”์ฒญ์— ๋”ฐ๋ผ .eval_results/gpqa.yaml์„ ํฌํ•จํ•˜๋ฉฐ, **value: 0.01(์†Œ์œ ์ž ์ง€์ • ํ”Œ๋ ˆ์ด์Šคํ™€๋”, ์‹ค์ธก ์•„๋‹˜)**์œผ๋กœ ์ œ์ถœํ•ฉ๋‹ˆ๋‹ค(notes์— ๋ฏธํ‰๊ฐ€ ์‚ฌ์‹ค ๋ช…์‹œ).

This is an upload-practice repo. gwangju no1 llm is not a real model and was never evaluated. Per the repo owner's explicit request, an .eval_results/gpqa.yaml is included with value: 0.01 (a placeholder, not a measured score; the notes field states it was not actually evaluated). Other rows are real published scores with sources.

gwangju no1 llm

Upload practice repo

Recent updates

  • Removed the fictional score (91.6) and corrected **our model's score to 0 (not evaluated)**.
  • Rebuilt the comparison leaderboard from measured GPQA Diamond scores with cited sources.
  • Added .eval_results/gpqa.yaml at the owner's request. Because it is not a measured result, it uses value: 0.01 (an owner-specified placeholder), and the notes field states that the model was not evaluated. (Idavidrein/gpqa is a real Benchmark, so this value is aggregated into the public GPQA Diamond leaderboard.)

GPQA Diamond ๋ฆฌ๋”๋ณด๋“œ

GPQA Diamond๋Š” ์ƒ๋ฌผยทํ™”ํ•™ยท๋ฌผ๋ฆฌ ๋ฐ•์‚ฌ๊ธ‰ 198๋ฌธํ•ญ์œผ๋กœ, ๋ฐ•์‚ฌ ์ „๋ฌธ๊ฐ€ ์ •๋‹ต๋ฅ ์ด ์•ฝ 65%์ธ ๊ณ ๋‚œ๋„ ๋ฒค์น˜๋งˆํฌ์ž…๋‹ˆ๋‹ค. ์•„๋ž˜ ๋‹ค๋ฅธ ๋ชจ๋ธ ์ ์ˆ˜๋Š” ๊ณต๊ฐœ ์ง‘๊ณ„ ์Šค๋ƒ…์ƒท(2026-06, AI Stats / Artificial Analysis)์˜ ์‹ค์ธก์น˜์ด๋ฉฐ, ์ง‘๊ณ„์ฒ˜๋งˆ๋‹ค ์ˆ˜์น˜๊ฐ€ ์กฐ๊ธˆ์”ฉ ๋‹ค๋ฅผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. gwangju no1 llm์€ ํ‰๊ฐ€๋ฅผ ์ˆ˜ํ–‰ํ•˜์ง€ ์•Š์•˜์œผ๋ฏ€๋กœ 0์ ์ž…๋‹ˆ๋‹ค.

| Rank | Model | Org | GPQA Diamond | Notes / Source |
|---|---|---|---|---|
| 1 | GPT-5.5 Pro | OpenAI | 94.4% | source |
| 2 | Gemini 3.1 Pro Preview | Google | 94.3% | source |
| 3 | Claude Opus 4.7 | Anthropic | 94.2% | source |
| 4 | Gemini 3 Pro Preview | Google | 93.8% | source |
| 5 | GPT-5.5 | OpenAI | 93.6% | source |
| 6 | Claude Opus 4.6 | Anthropic | 91.3% | source |
| 7 | GPT-5 | OpenAI | 87.3% | source |
| 8 | Claude Opus 4.5 | Anthropic | 86.95% | source |
| 9 | GPT-4.5 | OpenAI | 71.4% | source |
| - | gwangju no1 llm | terry-u | 0 (not evaluated) | parody / no evaluation performed |

  • Our model's score of 0 does not mean "low performance"; it records the fact that no evaluation was run at all.
  • This is not an actual leaderboard ranking. It is a display-only leaderboard that reproduces the format inside the README.
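The distinction between "scored 0" and "not evaluated" can be made explicit in code. The following is a minimal sketch, not part of this repo: it assumes a hypothetical in-memory table (`rows`) holding a subset of the rows above and a hypothetical helper `ranked`, and it represents a missing evaluation as `None` so that it is never sorted or displayed as a real score.

```python
from typing import Optional

# (model, org, GPQA Diamond score in percent); None means "not evaluated".
# Subset of the README table, used for illustration only.
rows: list[tuple[str, str, Optional[float]]] = [
    ("GPT-5.5 Pro", "OpenAI", 94.4),
    ("Claude Opus 4.7", "Anthropic", 94.2),
    ("GPT-4.5", "OpenAI", 71.4),
    ("gwangju no1 llm", "terry-u", None),
]


def ranked(rows):
    """Rank evaluated models by score; list unevaluated ones last with no rank."""
    scored = sorted(
        (r for r in rows if r[2] is not None),
        key=lambda r: r[2],
        reverse=True,
    )
    unscored = [r for r in rows if r[2] is None]
    out = [(str(i), *r) for i, r in enumerate(scored, start=1)]
    out += [("-", *r) for r in unscored]
    return out


for rank, model, org, score in ranked(rows):
    shown = "not evaluated" if score is None else f"{score}%"
    print(f"{rank:>2}  {model:<18} {org:<10} {shown}")
```

Because only a subset of rows is used, the ranks printed here do not match the table above. Keeping `None` separate from `0` is a design choice for this sketch; the README itself displays the unevaluated model as 0.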

์ฐธ๊ณ  / ๋ฐ์ดํ„ฐ์…‹

์‹ค์ œ ๋ฆฌ๋”๋ณด๋“œ ์ œ์ถœ ์กฐ๊ฑด

์‹ค์ œ GPQA ๋ฆฌ๋”๋ณด๋“œ์— ์˜๋ฏธ ์žˆ๋Š” ์ ์ˆ˜๋ฅผ ์˜ฌ๋ฆฌ๋ ค๋ฉด ์žฌํ˜„ ๊ฐ€๋Šฅํ•œ ํ‰๊ฐ€๋ฅผ ์ˆ˜ํ–‰ํ•˜๊ณ , ํ‰๊ฐ€ ๋กœ๊ทธ์™€ ์ ์ˆ˜๋ฅผ ๊ฒ€์ฆํ•œ ๋’ค Hugging Face๊ฐ€ ํŒŒ์‹ฑํ•˜๋Š” ํ‰๊ฐ€ ๋ฉ”ํƒ€๋ฐ์ดํ„ฐ์— ๊ทธ ์‹ค์ธก๊ฐ’์„ ๋ฐ˜์˜ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ํ˜„์žฌ .eval_results/gpqa.yaml์˜ value๋Š” ํ‰๊ฐ€๋ฅผ ์ˆ˜ํ–‰ํ•˜์ง€ ์•Š์•„ **0(๋ฏธํ‰๊ฐ€)**์ด๋ฉฐ, ์‹ค์ธก ํ‰๊ฐ€๋ฅผ ์ˆ˜ํ–‰ํ•˜๋ฉด ๊ทธ ๊ฐ’์œผ๋กœ ๊ต์ฒดํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
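The replacement step can be sketched roughly as follows. This is an illustration under stated assumptions, not Hugging Face's schema or tooling: `EvalRecord`, `accuracy`, and `build_record` are hypothetical names, the log path is made up, and only the `value` and `notes` keys come from this README. Whether the file expects a fraction or a percentage is also an assumption here (a fraction is used).

```python
from dataclasses import dataclass

GPQA_DIAMOND_QUESTIONS = 198  # question count stated in this README


@dataclass
class EvalRecord:
    value: float  # key name taken from this README
    notes: str    # key name taken from this README


def accuracy(graded: list[bool]) -> float:
    """Fraction of correct answers over one full GPQA Diamond run."""
    if len(graded) != GPQA_DIAMOND_QUESTIONS:
        raise ValueError(
            f"expected {GPQA_DIAMOND_QUESTIONS} graded answers, got {len(graded)}"
        )
    return sum(graded) / len(graded)


def build_record(graded: list[bool], log_path: str) -> EvalRecord:
    """Build a record from a measured run, pointing at the log used to verify it."""
    score = accuracy(graded)
    return EvalRecord(
        value=round(score, 4),
        notes=f"measured; evaluation log: {log_path}",
    )


# Hypothetical run: 99 of 198 answers graded correct.
graded = [True] * 99 + [False] * 99
record = build_record(graded, "logs/gpqa_diamond_run1.jsonl")
print(record)
```

Rejecting runs that do not cover all 198 questions is one simple guard against publishing a partial result as a full-benchmark score.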
