Ba2han committed 0eeda98 (1 parent: 271b9c2): Update README.md

Files changed (1): README.md (+35 -0)

README.md CHANGED
---
license: cc-by-4.0
datasets:
- camel-ai/biology
---

Evaluation settings: limit: None, provide_description: False, num_fewshot: 5, batch_size: None

| Task                                | Version | Metric   | Value      |   | Stderr |
|-------------------------------------|--------:|----------|-----------:|---|-------:|
| hendrycksTest-college_chemistry     |       1 | acc      | 0.4600     | ± | 0.0501 |
|                                     |         | acc_norm | **0.4600** | ± | 0.0501 |
| hendrycksTest-high_school_chemistry |       1 | acc      | 0.5222     | ± | 0.0351 |
|                                     |         | acc_norm | **0.5222** | ± | 0.0351 |
| hendrycksTest-college_biology       |       1 | acc      | 0.7222     | ± | 0.0375 |
|                                     |         | acc_norm | **0.7222** | ± | 0.0375 |
| hendrycksTest-high_school_biology   |       1 | acc      | 0.7355     | ± | 0.0251 |
|                                     |         | acc_norm | **0.7355** | ± | 0.0251 |
| winogrande                          |       0 | acc      | **0.7758** | ± | 0.0117 |

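The settings line and the `hendrycksTest-*` task names above match the output format of EleutherAI's lm-evaluation-harness. Below is a minimal reproduction sketch, assuming the older (pre-0.4) harness; the repo id is a placeholder, not this model's actual name.

```python
# Sketch only: assumes the 0.3.x EleutherAI lm-evaluation-harness, which
# exposed the hendrycksTest-* task names shown in the table above.
# "Ba2han/this-model" is a placeholder; substitute the actual repo id.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",
    model_args="pretrained=Ba2han/this-model",
    tasks=[
        "hendrycksTest-college_chemistry",
        "hendrycksTest-high_school_chemistry",
        "hendrycksTest-college_biology",
        "hendrycksTest-high_school_biology",
        "winogrande",
    ],
    num_fewshot=5,  # matches the num_fewshot: 5 setting; limit and batch_size stay at their None defaults
)
print(results["results"])
```
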
This model was trained from base Mistral-7B-Instruct-v0.2 on 710 examples, 200 of which come from the camel-ai/biology set. The rest were scraped personally and consist of very long scientific articles and textbooks.

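For reference, the camel-ai/biology set can be inspected with the `datasets` library. This is only a sketch for browsing the source data; the card does not specify which 200 examples were used or how they were formatted for training.

```python
# Sketch: load the camel-ai/biology set referenced above for inspection.
# Which 200 examples went into training (and their formatting) is not
# specified in this card, so this only pulls the full source dataset.
from datasets import load_dataset

biology = load_dataset("camel-ai/biology", split="train")
print(len(biology))   # number of rows in the full source set
print(biology[0])     # peek at one record
```
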
It beats Mistral-7B-Instruct-v0.2 on the MMLU chemistry and biology tasks. It should be able to generate mostly factual, basic and lengthy scientific text. I guess it could be "we have cosmopedia at home" for people who want to create cheap pretraining datasets from scratch.

Template:

```
[Context]
You are a helpful assistant. Read the instruction and write a response accordingly.

[User]
{prompt}

[Assistant]
```

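A minimal usage sketch with `transformers`, filling the template above with an example instruction. The repo id is a placeholder and the generation settings are only illustrative.

```python
# Sketch: build the prompt with the template shown above and generate with
# transformers. "Ba2han/this-model" is a placeholder for the actual repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ba2han/this-model"  # placeholder, not the real repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = (
    "[Context]\n"
    "You are a helpful assistant. Read the instruction and write a response accordingly.\n\n"
    "[User]\n"
    "Explain how enzymes lower activation energy.\n\n"  # example instruction
    "[Assistant]\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
# Strip the prompt tokens so only the generated response is printed.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
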
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6324eabf05bd8a54c6eb1650/ywxKzcQra_1g8EWtMeZ8Q.png)