Uppercase model name, set Quark version to v0.12, add tokenizer files

#5

Three combined changes:

  1. Model card: capitalize the model name to Kimi-K2.5-Eagle3-FP8 (all occurrences).
  2. Model card: shorten the AMD Quark version to v0.12 wherever it appears (Model Optimizer line, quantization details, environment table).
  3. Add the tokenizer bundle so that the documented AutoTokenizer.from_pretrained(..., trust_remote_code=True) call works and the tokenizer matches the moonshotai/Kimi-K2.5 target tokenizer used for Eagle3 speculative decoding. Files added: tokenizer_config.json, tiktoken.model, tokenization_kimi.py, tool_declaration_ts.py (imported by tokenization_kimi.py), and chat_template.jinja. The special tokens, bos=[BOS] (163584) and eos=[EOS] (163585), match this model's config.json. Verified that the tokenizer loads as TikTokenTokenizer, encodes text, and applies the chat template correctly. The multimodal/MoE/vision modeling files from the target were intentionally not copied, since this draft is a text-only LlamaForCausalLMEagle3. The target's generation_config.json was also skipped because its eos_token_id (163586) conflicts with this model's config (163585).
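
The ID check described in item 3 could be reproduced with something like the sketch below. It is not the script used for this PR. It assumes that config.json stores bos_token_id and eos_token_id at the top level and that the tokenizer exposes the standard bos_token_id / eos_token_id attributes. The function names and the model_dir path are hypothetical.

```python
import json
from pathlib import Path


def special_token_mismatches(config: dict, bos_id: int, eos_id: int) -> list[str]:
    """Describe every special-token ID that disagrees with the model config."""
    mismatches = []
    for name, actual in (("bos_token_id", bos_id), ("eos_token_id", eos_id)):
        expected = config.get(name)
        if expected != actual:
            mismatches.append(f"{name}: config has {expected}, tokenizer has {actual}")
    return mismatches


def check_local_checkout(model_dir: str) -> list[str]:
    """Load the tokenizer from a local clone and compare it with config.json.

    model_dir is a hypothetical path to a local checkout of this repository.
    """
    # Imported lazily so the pure comparison above works without transformers.
    from transformers import AutoTokenizer

    config = json.loads((Path(model_dir) / "config.json").read_text())
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    return special_token_mismatches(
        config, tokenizer.bos_token_id, tokenizer.eos_token_id
    )
```

An empty list from check_local_checkout means the tokenizer and config.json agree. Passing the target's eos_token_id (163586) to special_token_mismatches against this model's config would report one mismatch, which is the conflict that led to skipping generation_config.json.
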
larryli2 changed pull request status to closed
