@@ -41,7 +41,7 @@ co2_eq_emissions:
 [GGUF (Text-Only, not recommended)](https://huggingface.co/CausalLM/miniG/tree/gguf): There is a significant degradation, even with the F16.
-**Hint:** How can I check if my inference parameters and quantized inference are performing well? You can try having the model recite "The Gift of the Magi" by O. Henry (which is a public domain text). You should expect it to recite the entire text accurately, including the formatting.
 A model trained on a synthesis dataset of over **120 million** entries, this dataset having been generated through the application of state-of-the-art language models utilizing large context windows, alongside methodologies akin to retrieval-augmented generation and knowledge graph integration, where the data synthesis is conducted within clusters derived from a curated pretraining corpus of 20 billion tokens, with subsequent validation performed by the model itself.
@@ -75,7 +75,7 @@ Despite the absence of thorough alignment with human preferences, the model is u
 [GGUF (纯文本，不推荐)](https://huggingface.co/CausalLM/miniG/tree/gguf): 即使使用F16，性能也有显著下降。
-***提示：** 如何检查我的推理参数和量化推理是否表现良好？你可以尝试让模型背诵朱自清的《背影》（这是一个公共领域的文本）。你应该期待它能够准确地背诵整个文本，包括格式和换行。
 一个在超过**1.2亿**条数据合成数据集上训练的模型，这些数据集是通过应用具有大上下文窗口的最先进语言模型生成的，并结合了类似于检索增强生成和知识图谱集成的方法，数据合成是在一个由200亿个标记组成的预训练语料库中提取的聚类内进行的，随后由模型本身进行验证。

 [GGUF (Text-Only, not recommended)](https://huggingface.co/CausalLM/miniG/tree/gguf): There is a significant degradation, even with the F16.
+> **Hint:** How can I check if my inference parameters and quantized inference are performing well? You can try having the model recite "The Gift of the Magi" by O. Henry (which is a public domain text). You should expect it to recite the entire text accurately, including the formatting.
 A model trained on a synthesis dataset of over **120 million** entries, this dataset having been generated through the application of state-of-the-art language models utilizing large context windows, alongside methodologies akin to retrieval-augmented generation and knowledge graph integration, where the data synthesis is conducted within clusters derived from a curated pretraining corpus of 20 billion tokens, with subsequent validation performed by the model itself.
 [GGUF (纯文本，不推荐)](https://huggingface.co/CausalLM/miniG/tree/gguf): 即使使用F16，性能也有显著下降。
+> **提示：** 如何检查我的推理参数和量化推理是否表现良好？你可以尝试让模型背诵朱自清的《背影》（这是一个公共领域的文本）。你应该期待它能够准确地背诵整个文本，包括格式和换行。
 一个在超过**1.2亿**条数据合成数据集上训练的模型，这些数据集是通过应用具有大上下文窗口的最先进语言模型生成的，并结合了类似于检索增强生成和知识图谱集成的方法，数据合成是在一个由200亿个标记组成的预训练语料库中提取的聚类内进行的，随后由模型本身进行验证。
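
The recitation check described in the hint can be scored automatically rather than by eye. The following is a minimal sketch, not part of the model card: it assumes you already have the reference text and the model's output as plain strings, and the helper names `recitation_score` and `first_divergence` are hypothetical. It uses only Python's standard `difflib`; how you obtain the model output (inference backend, sampling parameters) is left to you.

```python
import difflib


def recitation_score(reference: str, output: str) -> float:
    """Return a similarity ratio in [0, 1] between reference and output.

    1.0 means the texts are character-for-character identical,
    including whitespace and line breaks.
    """
    # autojunk=False disables the popularity heuristic, which can
    # distort ratios on long texts with many repeated characters.
    matcher = difflib.SequenceMatcher(None, reference, output, autojunk=False)
    return matcher.ratio()


def first_divergence(reference: str, output: str) -> int:
    """Return the index of the first differing character, or -1 if identical."""
    for i, (ref_char, out_char) in enumerate(zip(reference, output)):
        if ref_char != out_char:
            return i
    if len(reference) != len(output):
        # One text is a strict prefix of the other.
        return min(len(reference), len(output))
    return -1


if __name__ == "__main__":
    # Placeholder strings; substitute the real reference text and model output.
    reference_text = "One dollar and eighty-seven cents.\nThat was all."
    model_output = "One dollar and eighty-seven cents.\nThat was all."
    print(recitation_score(reference_text, model_output))
    print(first_divergence(reference_text, model_output))
```

A score noticeably below 1.0, or an early divergence index, would suggest that the inference parameters or the quantization are degrading output. Character-level comparison is quadratic in the worst case, so for very long texts comparing line by line may be more practical.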