MLC LLM?

#1
by BlueNipples - opened

I know it might be a lot to ask of a complete stranger, but it would be incredible to see models like this one (Airolima-Chronos-Grad-L2), or Stheno-L2-13B, zararp-1.1-l2-7b, or MythoMax 13B, compiled for mlc-llm (Vulkan) - those are just models I've tested, and they all have great prose/coherency for roleplay. Even just a handful of top models in this format would freshen up the currently tired selection, which is largely just instruct and coding models. Just one 13B and one 7B would be incredible!

Apparently MLC LLM can run inference very fast, but the catch is that the models have to be compiled first, and that compilation needs a whole bunch of RAM - usually more than the machines of people who can merely run the models have. Please ignore me if you wish, of course. Thanks for your great work!
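For reference, here is a minimal sketch of what the compile step roughly looks like, assuming the `mlc_llm.build` entry point and its `--hf-path`/`--target`/`--quantization` flags from the MLC LLM repo around that time; the exact module and flag names may differ between versions, so check the project's docs for your install:

```python
# Rough sketch: compiling a Hugging Face model for MLC LLM's Vulkan backend.
# Assumes mlc-llm is installed and that its build module accepts these flags;
# flag names are an assumption based on the 2023-era build script.
import subprocess

MODEL = "Gryphe/MythoMax-L2-13b"  # example repo id for one of the requested models

subprocess.run(
    [
        "python", "-m", "mlc_llm.build",
        "--hf-path", MODEL,            # download the weights from Hugging Face
        "--target", "vulkan",          # emit Vulkan kernels
        "--quantization", "q4f16_1",   # common 4-bit weight / fp16 activation scheme
    ],
    check=True,  # raise if the build fails
)
```

The quantization/compile pass is the part that needs the large amount of RAM, not running the resulting model, which is why the request is for someone with bigger hardware to publish the compiled artifacts.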
