Update README.md
Browse files
README.md
CHANGED
@@ -41,6 +41,8 @@ co2_eq_emissions:
|
|
41 |
|
42 |
[GGUF (Text-Only, not recommended)](https://huggingface.co/CausalLM/miniG/tree/gguf): There is a significant degradation, even with the F16.
|
43 |
|
|
|
|
|
44 |
A model trained on a synthesis dataset of over **120 million** entries, this dataset having been generated through the application of state-of-the-art language models utilizing large context windows, alongside methodologies akin to retrieval-augmented generation and knowledge graph integration, where the data synthesis is conducted within clusters derived from a curated pretraining corpus of 20 billion tokens, with subsequent validation performed by the model itself.
|
45 |
|
46 |
Despite the absence of thorough alignment with human preferences, the model is under no obligation to cater to poorly constructed prompts or the clichés often found in conventional benchmarks. Bonus: Included is an implementation of a **Vision Language Model** that has undergone Locked-Image Tuning.
|
@@ -73,6 +75,8 @@ Despite the absence of thorough alignment with human preferences, the model is u
|
|
73 |
|
74 |
[GGUF (纯文本,不推荐)](https://huggingface.co/CausalLM/miniG/tree/gguf): 即使使用F16,性能也有显著下降。
|
75 |
|
|
|
|
|
76 |
一个在超过**1.2亿**条数据合成数据集上训练的模型,这些数据集是通过应用具有大上下文窗口的最先进语言模型生成的,并结合了类似于检索增强生成和知识图谱集成的方法,数据合成是在一个由200亿个标记组成的预训练语料库中提取的聚类内进行的,随后由模型本身进行验证。
|
77 |
|
78 |
尽管该模型没有完全对齐人类偏好,但它没有义务迎合不良构建的提示或常见基准测试中的陈词滥调。额外内容:包含了经过锁定图像微调的**视觉语言模型**实现。
|
|
|
41 |
|
42 |
[GGUF (Text-Only, not recommended)](https://huggingface.co/CausalLM/miniG/tree/gguf): There is a significant degradation, even with the F16.
|
43 |
|
44 |
+
**Hint:** How can I check if my inference parameters and quantized inference are performing well? You can try having the model recite "The Gift of the Magi" by O. Henry (which is a public domain text). You should expect it to recite the entire text accurately, including the formatting.
|
45 |
+
|
46 |
A model trained on a synthesis dataset of over **120 million** entries, this dataset having been generated through the application of state-of-the-art language models utilizing large context windows, alongside methodologies akin to retrieval-augmented generation and knowledge graph integration, where the data synthesis is conducted within clusters derived from a curated pretraining corpus of 20 billion tokens, with subsequent validation performed by the model itself.
|
47 |
|
48 |
Despite the absence of thorough alignment with human preferences, the model is under no obligation to cater to poorly constructed prompts or the clichés often found in conventional benchmarks. Bonus: Included is an implementation of a **Vision Language Model** that has undergone Locked-Image Tuning.
|
|
|
75 |
|
76 |
[GGUF (纯文本,不推荐)](https://huggingface.co/CausalLM/miniG/tree/gguf): 即使使用F16,性能也有显著下降。
|
77 |
|
78 |
+
***提示:** 如何检查我的推理参数和量化推理是否表现良好?你可以尝试让模型背诵朱自清的《背影》(这是一个公共领域的文本)。你应该期待它能够准确地背诵整个文本,包括格式和换行。
|
79 |
+
|
80 |
一个在超过**1.2亿**条数据合成数据集上训练的模型,这些数据集是通过应用具有大上下文窗口的最先进语言模型生成的,并结合了类似于检索增强生成和知识图谱集成的方法,数据合成是在一个由200亿个标记组成的预训练语料库中提取的聚类内进行的,随后由模型本身进行验证。
|
81 |
|
82 |
尽管该模型没有完全对齐人类偏好,但它没有义务迎合不良构建的提示或常见基准测试中的陈词滥调。额外内容:包含了经过锁定图像微调的**视觉语言模型**实现。
|