Highly experimental model. Please go away.
Based on sunfall: the model understands tags, but it does not yet really understand The Diamond Law (slop: ~1k occurrences).
The model has not been trained on any benchmark-like data, only RP data. It did undergo a rather severe uncensoring phase, though. MMLU-Pro results, compared to Gemma2-9b-it (the model it was based on):
| model | overall | biology | business | chemistry | computer science | economics | engineering | health | history | law | math | philosophy | physics | psychology | other |
| ----- | ------- | ------- | -------- | --------- | ---------------- | --------- | ----------- | ------ | ------- | ----- | ----- | ---------- | ------- | ---------- | ----- |
| gemma2-9b-it | 51.99 | 69.57 | 44.00 | 58.33 | 61.54 | 66.67 | 25.81 | 52.17 | 75.00 | 33.33 | 58.14 | 68.75 | 41.46 | 44.00 | 62.07 |
| gemstone-9b | 51.57 | 60.87 | 44.00 | 61.11 | 61.54 | 55.56 | 51.61 | 50.00 | 50.00 | 42.86 | 48.84 | 62.50 | 41.46 | 52.00 | 55.17 |