Text Generation
Transformers
Safetensors
chatglm
feature-extraction
custom_code
JosephusCheung committed (verified) · Commit cd62bef · 1 Parent(s): 949cc5d

Update README.md

Files changed (1): README.md (+8 -0)
README.md CHANGED
@@ -35,6 +35,10 @@ co2_eq_emissions:
 
 # miniG
 
+[GGUF (Text-Only)](https://huggingface.co/CausalLM/miniG/tree/gguf)
+
+[Text-Only Weight](https://huggingface.co/CausalLM/miniG/tree/text-only)
+
 A model trained on a synthesis dataset of over **120 million** entries, this dataset having been generated through the application of state-of-the-art language models utilizing large context windows, alongside methodologies akin to retrieval-augmented generation and knowledge graph integration, where the data synthesis is conducted within clusters derived from a curated pretraining corpus of 20 billion tokens, with subsequent validation performed by the model itself.
 
 Despite the absence of thorough alignment with human preferences, the model is under no obligation to cater to poorly constructed prompts or the clichés often found in conventional benchmarks. Bonus: Included is an implementation of a **Vision Language Model** that has undergone Locked-Image Tuning.
@@ -61,6 +65,10 @@ Despite the absence of thorough alignment with human preferences, the model is u
 
 # 迷你G
 
+[GGUF (纯文本)](https://huggingface.co/CausalLM/miniG/tree/gguf)
+
+[纯文本权重](https://huggingface.co/CausalLM/miniG/tree/text-only)
+
 一个在超过**1.2亿**条数据合成数据集上训练的模型,这些数据集是通过应用具有大上下文窗口的最先进语言模型生成的,并结合了类似于检索增强生成和知识图谱集成的方法,数据合成是在一个由200亿个标记组成的预训练语料库中提取的聚类内进行的,随后由模型本身进行验证。
 
 尽管该模型没有完全对齐人类偏好,但它没有义务迎合不良构建的提示或常见基准测试中的陈词滥调。额外内容:包含了经过锁定图像微调的**视觉语言模型**实现。
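
Both links added by this commit point at non-default branches of the same repository, so a plain download of `main` will not pick them up; the revision has to be requested explicitly. A minimal sketch using `huggingface_hub`, with the branch names taken directly from the URLs above:

```python
# Minimal sketch: fetch the GGUF conversion, which this commit links
# from the "gguf" branch of CausalLM/miniG rather than from main.
from huggingface_hub import snapshot_download

# revision="gguf" mirrors the /tree/gguf URL added in the diff;
# the safetensors text-only weights live on revision="text-only".
local_dir = snapshot_download(repo_id="CausalLM/miniG", revision="gguf")
print(local_dir)  # local path containing the downloaded GGUF file(s)
```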
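
For the default branch, the `transformers`, `chatglm`, and `custom_code` tags above indicate the checkpoint ships its own modeling code and must be loaded with `trust_remote_code=True`. A minimal sketch, assuming the repo's `auto_map` exposes a causal-LM head as GLM-family checkpoints typically do; the prompt is illustrative only:

```python
# Minimal sketch: loading the main-branch checkpoint for text
# generation via Transformers' remote-code path.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CausalLM/miniG"  # repo id taken from the links in the diff

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # required by the custom chatglm code
    device_map="auto",       # assumption: accelerate is installed
)

inputs = tokenizer("Hello, miniG.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```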