Commits · llm-council/sandbox

Parse judgments with structured output prompting, one response model, one judge model at a time.

eb4ec23

justinxzhao committed on Oct 3, 2024

Add token usage tracking for openai and fix token usage tracking for anthropic.

1afb9ca

justinxzhao committed on Oct 1, 2024

Factor out LLM chat rendering so that it persists even when the submit button isn't active.

a0dca54

justinxzhao committed on Oct 1, 2024

Factor out judge results code so that it persists when the submit button is inactivated.

279a804

justinxzhao committed on Oct 1, 2024

Added general rendering of chats so that they don't disappear during app saving.

6fae7e2

justinxzhao committed on Oct 1, 2024

Fix all warnings.

16d72cb

justinxzhao committed on Sep 30, 2024

Overall scores graph complete.

38e43b5

justinxzhao committed on Sep 30, 2024

Added per-response plots.

3e0f8f8

justinxzhao committed on Sep 30, 2024

Some refactoring, judging responses for direct assessment.

577870e

justinxzhao committed on Sep 29, 2024

Fixed aggregator prompt.

3703473

justinxzhao committed on Sep 25, 2024

Streaming working, with different providers.

c0a5a18

justinxzhao committed on Sep 24, 2024

Password protection?

cf367e2

justinxzhao committed on Sep 12, 2024

Add application file

663a6db

justinxzhao committed on Sep 12, 2024

initial commit

61721be

justinxzhao committed on Sep 12, 2024
