ThomasTheMaker committed on
Commit 6632c33 · verified · 1 Parent(s): 0a12ad8

Delete README.md

Files changed (1)
  1. README.md +0 -68
README.md DELETED
@@ -1,68 +0,0 @@
- # Mass Evaluations
-
- A simple benchmark tool that runs a set of predefined prompts through every checkpoint of a model.
-
- ## Usage
-
- ```bash
- python benchmark.py [model_name] [options]
- ```
-
- ## Examples
-
- ```bash
- # Benchmark all checkpoints of a model
- python benchmark.py pico-decoder-tiny-dolma5M-v1
-
- # Specify a custom output directory
- python benchmark.py pico-decoder-tiny-dolma5M-v1 --output my_results/
-
- # Use a custom prompts file
- python benchmark.py pico-decoder-tiny-dolma5M-v1 --prompts my_prompts.json
- ```
-
- ## Managing Prompts
-
- Prompts are stored in `prompts.json` as a simple array of strings:
-
- ```json
- [
-   "Hello, how are you?",
-   "Complete this story: Once upon a time",
-   "What is the capital of France?"
- ]
- ```
-
- ### Adding New Prompts
-
- Edit `prompts.json` and add new prompt strings to the array. Each entry must be a double-quoted string, separated from its neighbors by commas.
-
- ## Features
-
- - **Auto-discovery**: Finds all `step_*` checkpoints automatically
- - **JSON-based prompts**: Easily customizable prompts via JSON file
- - **Readable output**: Markdown reports with clear structure
- - **Error handling**: Continues on failures, logs errors
- - **Progress tracking**: Shows real-time progress
- - **Metadata logging**: Includes generation time and parameters
-
- ## Output
-
- Results are saved as Markdown files in the `results/` directory:
- ```
- results/
- ├── pico-decoder-tiny-dolma5M-v1_benchmark_20250101_120000.md
- ├── pico-decoder-tiny-dolma29k-v3_benchmark_20250101_130000.md
- └── ...
- ```
-
- ## Predefined Prompts
-
- 1. "Hello, how are you?" (conversational)
- 2. "Complete this story: Once upon a time" (creative)
- 3. "Explain quantum physics in simple terms" (explanatory)
- 4. "Write a haiku about coding" (creative + structured)
- 5. "What is the capital of France?" (factual)
- 6. "The meaning of life is" (philosophical)
- 7. "In the year 2050," (futuristic)
- 8. "Python programming is" (technical)
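
The deleted README describes two mechanisms without showing how they might work: loading prompts from a JSON array of strings, and auto-discovering `step_*` checkpoints. The following is a minimal sketch of both, not the code from `benchmark.py`. The function names `load_prompts` and `discover_checkpoints` are hypothetical, and the sketch assumes that checkpoints are subdirectories named `step_<number>` directly under a model directory, which the README does not confirm.

```python
import json
import re
from pathlib import Path


def load_prompts(path):
    """Load prompts from a JSON file holding a flat array of strings.

    Hypothetical helper: mirrors the prompts.json format shown in the README.
    """
    with open(path, encoding="utf-8") as f:
        prompts = json.load(f)
    if not isinstance(prompts, list) or not all(isinstance(p, str) for p in prompts):
        raise ValueError(f"{path} must contain a JSON array of strings")
    return prompts


def discover_checkpoints(model_dir):
    """Return the step_* subdirectories of model_dir, ordered by step number.

    Assumption: checkpoints are directories named step_<integer>.
    Sorting by the integer keeps step_2 ahead of step_10, which a plain
    string sort would not.
    """
    found = []
    for entry in Path(model_dir).iterdir():
        match = re.fullmatch(r"step_(\d+)", entry.name)
        if match and entry.is_dir():
            found.append((int(match.group(1)), entry))
    found.sort(key=lambda pair: pair[0])
    return [path for _, path in found]
```

A benchmark loop built on these helpers would iterate over `discover_checkpoints(model_dir)` and run every string from `load_prompts("prompts.json")` through each checkpoint in turn. Validating the JSON shape up front means a malformed prompts file fails before any checkpoint is loaded.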