Mxode/NanoLM-0.3B-Instruct-v1

Mxode committed on Sep 3, 2024

Commit aa37ca0 (verified) · 1 Parent(s): c727437

Update README_zh-CN.md

Files changed (1)

README_zh-CN.md +2 -2

@@ -8,14 +8,14 @@
 To explore the potential of small models, I have tried building a series of them, which are collected in [NanoLM Collections](https://huggingface.co/collections/Mxode/nanolm-66d6d75b4a69536bca2705b2).
-This is NanoLM-0.3B-Instruct-v1, the first version of NanoLM-0.3B-Instruct. The model currently supports both Chinese and English.
+This is NanoLM-0.3B-Instruct-v1, the first version of NanoLM-0.3B-Instruct. The model currently supports **both Chinese and English**.
 ## Model Details
 NanoLM-0.3B-Instruct-v1 uses the same tokenizer and model architecture as [Qwen/Qwen2-0.5B](https://huggingface.co/Qwen/Qwen2-0.5B), except that the number of layers is reduced from 24 to 12. As a result, NanoLM-0.3B-Instruct-v1 has only 0.3B parameters, of which only about 180M are non-embedding parameters. Even so, NanoLM-0.3B-Instruct-v1 still follows instructions well.
-Below are some examples. For reproducibility, I set `do_sample` to `False`.
+Below are some examples. For reproducibility, I set `do_sample` to `False`. In practice, however, you should set appropriate sampling parameters.
 First, load the model as follows:
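
Reviewer note on the Model Details paragraph in this diff: the quoted sizes (0.3B total, about 180M non-embedding) can be sanity-checked with a little arithmetic. The sketch below is not part of the README or the commit. It assumes the publicly listed Qwen2-0.5B configuration values (vocabulary 151,936, hidden size 896, MLP size 4,864, 14 attention heads, 2 key/value heads, biases on the q/k/v projections only, tied input/output embeddings); the names `Qwen2LikeConfig` and `count_parameters` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Qwen2LikeConfig:
    # Assumed values, taken from the public Qwen2-0.5B config, not from this diff.
    vocab_size: int = 151_936
    hidden_size: int = 896
    intermediate_size: int = 4_864
    num_attention_heads: int = 14
    num_key_value_heads: int = 2
    num_hidden_layers: int = 24
    tie_word_embeddings: bool = True


def count_parameters(cfg: Qwen2LikeConfig) -> dict:
    """Count parameters of a Qwen2-style decoder from its config alone."""
    h = cfg.hidden_size
    head_dim = h // cfg.num_attention_heads
    kv_dim = cfg.num_key_value_heads * head_dim

    # Attention: q/k/v projections carry biases, the output projection does not.
    attention = (h * h + h) + 2 * (h * kv_dim + kv_dim) + h * h
    # Gated MLP: gate, up and down projections, no biases.
    mlp = 3 * h * cfg.intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * h

    per_layer = attention + mlp + norms
    # One extra RMSNorm sits after the last layer.
    non_embedding = cfg.num_hidden_layers * per_layer + h

    embedding = cfg.vocab_size * h
    if not cfg.tie_word_embeddings:
        embedding *= 2  # a separate output head doubles the embedding cost

    return {
        "non_embedding": non_embedding,
        "embedding": embedding,
        "total": non_embedding + embedding,
    }


base = count_parameters(Qwen2LikeConfig())
nano = count_parameters(replace(Qwen2LikeConfig(), num_hidden_layers=12))

print(f"24 layers: total={base['total']:,}")
print(f"12 layers: total={nano['total']:,} non-embedding={nano['non_embedding']:,}")
```

Under these assumptions the 12-layer variant comes out at roughly 179M non-embedding parameters and roughly 315M in total, which agrees with the "about 180M" and "0.3B" figures in the README text. Halving the layer count does not halve the model, because the embedding table (about 136M parameters) is unchanged.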