Commit d32abe7 by doberst
1 parent: 2912f9d

Update README.md

Files changed (1)
  1. README.md +16 -1
README.md CHANGED
@@ -13,6 +13,21 @@ the objective of providing a high-quality Instruct model that is 'inference-read
  without using any advanced quantization optimizations.


+ ### Benchmark Tests
+
+ Evaluated against the benchmark test: [RAG-Instruct-Benchmark-Tester](https://www.huggingface.co/datasets/llmware/rag_instruct_benchmark_tester)
+ Average of 2 Test Runs with 1 point for correct answer, 0.5 point for partial correct or blank / NF, 0.0 points for incorrect, and -1 points for hallucinations.
+
+ --**Accuracy Score**: **92.0** correct out of 100
+ --Not Found Classification: 45.0%
+ --Boolean: 75.0%
+ --Math/Logic: 20.0%
+ --Complex Questions (1-5): 2 (Low-Medium)
+ --Summarization Quality (1-5): 3 (Coherent, extractive)
+ --Hallucinations: No hallucinations observed in test runs.
+
+ For test run results (and good indicator of target use cases), please see the files ("core_rag_test" and "answer_sheet" in this repo).
+
  ### Model Description

  <!-- Provide a longer summary of what this model is. -->
@@ -85,7 +100,7 @@ my_prompt = {{text_passage}} + "\n" + {{question/instruction}}

  Darren Oberst & llmware team

- Please reach out anytime if you are interested in this project and would like to participate and work with us!
+ Please reach out anytime if you are interested in this project!

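
The scoring rule in the added Benchmark Tests section (1 point for a correct answer, 0.5 for a partially correct or blank / NF answer, 0.0 for an incorrect answer, -1 for a hallucination, averaged over the test runs) can be sketched roughly as below. This is only an illustration under stated assumptions, not the scoring code behind the benchmark: the label names, the `POINTS` table, the function names, and the toy four-question runs are all made up for the example, and reading "NF" as "not found" is an interpretation.

```python
from statistics import mean

# Points per graded answer, following the rule quoted in the README diff.
# The label strings are hypothetical; "not_found" assumes NF means "not found".
POINTS = {
    "correct": 1.0,
    "partial": 0.5,
    "not_found": 0.5,
    "incorrect": 0.0,
    "hallucination": -1.0,
}


def score_run(labels):
    """Sum the points for one test run, given one label per question."""
    return sum(POINTS[label] for label in labels)


def benchmark_score(runs):
    """Average the per-run totals across all test runs."""
    return mean(score_run(run) for run in runs)


# Toy data only: two invented runs of four questions each.
run_a = ["correct", "partial", "incorrect", "hallucination"]  # 1 + 0.5 + 0 - 1
run_b = ["correct", "correct", "not_found", "incorrect"]      # 1 + 1 + 0.5 + 0

print(score_run(run_a))                 # 0.5
print(score_run(run_b))                 # 2.5
print(benchmark_score([run_a, run_b]))  # 1.5
```

Because a hallucination costs a full point while a blank answer still earns half a point, a model that declines to answer scores higher under this rule than one that guesses wrongly with invented content.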