Dhavana-Chat-150M
SFT'd from Serialtechlab/dhavana-base-150m on a mix of Dhivehi QA, bidirectional translation pairs, English instruction data, and OpenAssistant conversations.
Chat template
<|reserved_00|>system
{system message}<|reserved_01|>
<|reserved_00|>user
{user message}<|reserved_01|>
<|reserved_00|>assistant
{assistant response}<|reserved_01|>
<|reserved_00|> = token id 4 (chat-start)
<|reserved_01|> = token id 5 (chat-end)
Use
Generate with the chat template above. Stop generation when <|reserved_01|> is produced.
Always use repetition_penalty=1.2, no_repeat_ngram_size=3 for clean outputs
(inherited limitation from the base model).
| Field | Value |
|---|---|
| Parameters | 125,264,640 (~150M) |
| Final SFT step | 1,464 |
| Tokenizer | Serialtechlab/dhavana-tok-v0 |
- Downloads last month
- 14
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support