Two Critical Issues

#1 by makhoroz
  1. Incorrect import path in the model card example
    In the “Example Usage” section of the model card, the model loading path is wrong: the “Tech” part of the organization name is missing, so the call returns a 404 error and the model cannot be loaded.
    Incorrect: Cerebrum/cere-llama-3.1-8B-tr
    Should be: CerebrumTech/cere-llama-3.1-8B-tr
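For reference, a minimal sketch of the fix, assuming the standard `transformers` AutoModel API (the actual load calls are commented out because they download an 8B checkpoint):

```python
# The model card's example uses the wrong organization name in the repo id.
INCORRECT_REPO_ID = "Cerebrum/cere-llama-3.1-8B-tr"    # 404: repo does not exist
CORRECT_REPO_ID = "CerebrumTech/cere-llama-3.1-8B-tr"  # resolves correctly

# With the corrected id, loading proceeds as in the model card's example:
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained(CORRECT_REPO_ID)
# model = AutoModelForCausalLM.from_pretrained(CORRECT_REPO_ID)

print(CORRECT_REPO_ID)
```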

  2. Model collapses into random tokens after loading (nonsense output)
    After correcting the path and successfully loading the model, I observed that the model collapses during inference and generates random, unintelligible tokens instead of meaningful text. This happens even with very simple prompts.
    Although I understand this model may not be strictly categorized as a text-generation model, the model card explicitly provides a text-generation usage example. When that example is run, the output is nonsensical; see the attached screenshot.
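For completeness, a minimal sketch of the kind of call that reproduces the garbled output, assuming the standard `transformers` pipeline API. The repo id is the corrected one from this report; the prompt and generation settings are illustrative, and the 8B-parameter download is gated behind an environment variable so the sketch is safe to run as-is:

```python
import os

REPO_ID = "CerebrumTech/cere-llama-3.1-8B-tr"

# Gate the heavy model download; set RUN_CERE_REPRO=1 to actually reproduce.
if os.environ.get("RUN_CERE_REPRO") == "1":
    from transformers import pipeline

    pipe = pipeline("text-generation", model=REPO_ID)
    # Even a trivial prompt produces unintelligible tokens in our setup
    # (prompt text here is illustrative, not from the model card):
    result = pipe("Merhaba, nasılsın?", max_new_tokens=50)
    print(result[0]["generated_text"])
else:
    print("Set RUN_CERE_REPRO=1 to run the full reproduction.")
```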

Could you please clarify:

  • What is the cause of this collapse?
  • Is the model expected to produce valid text outputs?
  • If you have a concrete test prompt and its corresponding output (that works as intended), could you share it so we can verify our setup?

Thank you in advance; I look forward to your clarification.

[Screenshot “Error1”: model output degenerating into random tokens]
