🤗 [Models](https://huggingface.co/SakanaAI) | 📚 [Paper](https://arxiv.org/abs/TODO) | 📝 [Blog](https://sakana.ai/taid/) | 🐦 [Twitter](https://twitter.com/SakanaAILabs)

**Smol-Swallow-1.5B** is a compact Japanese language model created through TAID (Temporally Adaptive Interpolated Distillation), our new knowledge distillation method. We used [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) as the teacher model and [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) as the student model. The model has been further pre-trained on Japanese text data to enhance its Japanese language capabilities.

## Usage
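The usage example for this section did not survive extraction. A minimal sketch with Hugging Face `transformers`, assuming the model is published under the repository id `SakanaAI/Smol-Swallow-1.5B` (inferred from this card's title, not confirmed by the source). Since this is a pre-trained model rather than an instruction-tuned one, the sketch uses plain text completion:

```python
# Minimal sketch: load the model and generate a Japanese completion.
# The repository id "SakanaAI/Smol-Swallow-1.5B" is an assumption from the card title.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "SakanaAI/Smol-Swallow-1.5B"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Japanese prompt: "Machine learning is" (left open for the model to continue)
prompt = "機械学習とは、"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Adjust `torch_dtype` and `device_map` to your hardware; on CPU-only machines, drop both arguments.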