Update README.md
README.md
CHANGED
@@ -6,6 +6,9 @@ language:
 - de
 license: agpl-3.0
 pipeline_tag: text-generation
 ---
 # miniG
@@ -23,7 +26,7 @@ Regarding Formatting: We strongly recommend you double-check your input to ensur
 Regarding [Benchmark Scores](https://huggingface.co/spaces/JosephusCheung/Goodharts-Law-on-Benchmarks-a-Page-for-miniG): Generally, you shouldn't worry too much about them, as people can always train specifically to achieve good results. We mainly use them as a smoke test, a quick check to ensure no major regressions have occurred. In fact, if you actually read through the benchmark questions themselves, you'll often find yourself chuckling at how inane, low-quality, or even downright silly they are.

-Regarding training: The final released version was trained using a merge of multiple candidate models in an attempt to improve performance. However, we were unable to conclusively determine whether this was effective. Excluding candidate versions, an efficient naive fine-tuning should be achievable within one day on 16 nodes of 8*A100-80G.

 Disclaimer: Please note that the model was trained on unfiltered internet data. Since we do not have the capacity to vet all of it, there may be a substantial amount of objectionable content, pornography, violence, and offensive language present that we are unable to remove. Therefore, you will still need to complete your own checks on the model's safety and filter keywords in the output. Due to computational resource constraints, we are presently unable to implement RLHF for the model's ethics and safety, nor training on SFT samples that refuse to answer certain questions for restrictive fine-tuning.
@@ -45,7 +48,7 @@ Seeking Unconditional Sponsorship: We are actively training larger parameter mod
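 Regarding [Benchmark Scores](https://huggingface.co/spaces/JosephusCheung/Goodharts-Law-on-Benchmarks-a-Page-for-miniG): Generally speaking, you should not care too much about these scores, because people can always train specifically to get good results. We mainly use them as a smoke test, a quick check to ensure that no major regression has occurred. In fact, if you actually read the benchmark questions themselves, you will often find yourself laughing out loud at how dull, low-quality, or even absurd they are.

-Regarding training: The final released version used a merge of multiple candidate models in an attempt to improve performance. However, we could not determine whether this approach was actually effective. Excluding candidate versions and merge experiments, with 16 nodes, each equipped with 8 A100-80G

 Disclaimer: Please note that the model was trained on unfiltered internet data. Since we are unable to screen all of the data, a large amount of inappropriate content may remain, ranging from explicit material to violent and offensive language, which we cannot remove. Therefore, you must perform your own safety checks on the model and apply keyword filtering to its output. Due to computational resource constraints, we are currently unable to perform reinforcement learning from human feedback (RLHF) for ethics and safety, nor restrictive fine-tuning on SFT samples to limit the model's ability to answer certain questions.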
 - de
 license: agpl-3.0
 pipeline_tag: text-generation
+co2_eq_emissions:
+  emissions: 700
+  training_type: "fine-tuning"
 ---
 # miniG
 Regarding [Benchmark Scores](https://huggingface.co/spaces/JosephusCheung/Goodharts-Law-on-Benchmarks-a-Page-for-miniG): Generally, you shouldn't worry too much about them, as people can always train specifically to achieve good results. We mainly use them as a smoke test, a quick check to ensure no major regressions have occurred. In fact, if you actually read through the benchmark questions themselves, you'll often find yourself chuckling at how inane, low-quality, or even downright silly they are.

+Regarding training: The final released version was trained using a merge of multiple candidate models in an attempt to improve performance. However, we were unable to conclusively determine whether this was effective. Excluding candidate versions, an efficient naive fine-tuning should be achievable within one day on 16 nodes of 8*A100-80G. Based on this, we estimate the carbon emissions to be 700 kg CO2 eq.

 Disclaimer: Please note that the model was trained on unfiltered internet data. Since we do not have the capacity to vet all of it, there may be a substantial amount of objectionable content, pornography, violence, and offensive language present that we are unable to remove. Therefore, you will still need to complete your own checks on the model's safety and filter keywords in the output. Due to computational resource constraints, we are presently unable to implement RLHF for the model's ethics and safety, nor training on SFT samples that refuse to answer certain questions for restrictive fine-tuning.
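 Regarding [Benchmark Scores](https://huggingface.co/spaces/JosephusCheung/Goodharts-Law-on-Benchmarks-a-Page-for-miniG): Generally speaking, you should not care too much about these scores, because people can always train specifically to get good results. We mainly use them as a smoke test, a quick check to ensure that no major regression has occurred. In fact, if you actually read the benchmark questions themselves, you will often find yourself laughing out loud at how dull, low-quality, or even absurd they are.

+Regarding training: The final released version used a merge of multiple candidate models in an attempt to improve performance. However, we could not determine whether this approach was actually effective. Excluding candidate versions and merge experiments, an efficient naive fine-tuning should be achievable within one day on 16 nodes, each equipped with 8 A100-80G GPUs. Based on this, we estimate the carbon emissions at 700 kg CO2 eq.

 Disclaimer: Please note that the model was trained on unfiltered internet data. Since we are unable to screen all of the data, a large amount of inappropriate content may remain, ranging from explicit material to violent and offensive language, which we cannot remove. Therefore, you must perform your own safety checks on the model and apply keyword filtering to its output. Due to computational resource constraints, we are currently unable to perform reinforcement learning from human feedback (RLHF) for ethics and safety, nor restrictive fine-tuning on SFT samples to limit the model's ability to answer certain questions.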
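
The 700 kg CO2 eq figure added in this change can be sanity-checked with a few lines of arithmetic. The sketch below is not the authors' calculation. The node count, GPUs per node, one-day duration, and the 700 kg total are taken from the README; the 0.4 kW per-GPU draw and the PUE of 1.0 are assumptions chosen only for illustration, and the function names are hypothetical. It derives the grid carbon intensity that the stated figure would imply. It also converts the total to grams, because the Hub's model card documentation appears to specify `emissions` in grams of CO2 eq; that unit is worth verifying before relying on the metadata value.

```python
# Back-of-the-envelope check of the training-emissions estimate in the README.

# Values stated in the README.
NODES = 16                 # "16 nodes"
GPUS_PER_NODE = 8          # "8*A100-80G"
HOURS = 24                 # "within one day", taken here as a full 24 h
REPORTED_KG_CO2EQ = 700    # "700 kg CO2 eq"

# Assumptions for illustration only (NOT from the README).
GPU_POWER_KW = 0.4         # assumed average draw per A100-80G, in kW
PUE = 1.0                  # assumed data-center overhead factor (1.0 = none)


def gpu_hours(nodes: int, gpus_per_node: int, hours: float) -> float:
    """Total GPU-hours for the run."""
    return nodes * gpus_per_node * hours


def energy_kwh(total_gpu_hours: float, gpu_power_kw: float, pue: float) -> float:
    """Energy drawn by the GPUs, scaled by the facility overhead factor."""
    return total_gpu_hours * gpu_power_kw * pue


def implied_intensity(kg_co2eq: float, kwh: float) -> float:
    """Carbon intensity (kg CO2 eq per kWh) implied by a reported total."""
    return kg_co2eq / kwh


def kg_to_grams(kg: float) -> float:
    """Convert kilograms to grams."""
    return kg * 1000


if __name__ == "__main__":
    hours_total = gpu_hours(NODES, GPUS_PER_NODE, HOURS)
    energy = energy_kwh(hours_total, GPU_POWER_KW, PUE)
    intensity = implied_intensity(REPORTED_KG_CO2EQ, energy)

    print(f"GPU-hours:         {hours_total:.0f}")
    print(f"Energy (kWh):      {energy:.1f}")
    print(f"Implied intensity: {intensity:.2f} kg CO2 eq / kWh")
    print(f"Reported total:    {kg_to_grams(REPORTED_KG_CO2EQ):.0f} g CO2 eq")
```

Under these assumptions the run amounts to 3072 GPU-hours and about 1229 kWh, so 700 kg corresponds to roughly 0.57 kg CO2 eq per kWh. A different assumed power draw or PUE changes the implied intensity accordingly.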