justinxzhao committed on
Commit 33cc960 · verified · 1 Parent(s): b286409

Update README.md

Files changed (1)
  1. README.md +2 -5
README.md CHANGED
@@ -7,14 +7,10 @@ sdk: static
 pinned: false
 ---
 
-<p align="center">
-<img src="https://cdn-uploads.huggingface.co/production/uploads/6462ac71514ee1645bd1f7f7/6MkoY412i9IqvISWSS4qs.png">
-</p>
-
 The rapid advancement of Large Language Models (LLMs) necessitates robust
 and challenging benchmarks.
 
-To address the challenge of ranking LLMs on *highly subjective* tasks such as emotional intelligence, creative writing, or persuasiveness,
+To address the challenge of ranking LLMs on highly subjective tasks such as emotional intelligence, creative writing, or persuasiveness,
 the **Language Model Council (LMC)** operates through a democratic process to: 1) formulate a test set through
 equal participation, 2) administer the test among council members, and 3) evaluate
 responses as a collective jury.
@@ -24,5 +20,6 @@ and less biased than those from any individual LLM judge, and is more consistent
 
 Roadmap:
 
+- Use the Council to benchmark evaluative characteristics of LLM-as-a-Judge/Jury like bias, affinity, and agreement.
 - Expand to more domains, use cases, and sophisticated agentic interactions.
 - Produce a generalized user interface for Council-as-a-Service.