
Highly experimental model. Please go away.

Based on sunfall; it understands tags, but it does not really understand The Diamond Law (yet). (slop: ~1k occurrences)

The model has not been trained on any benchmark-like data, only RP data. It did undergo a rather severe uncensoring phase, though. MMLU-Pro results, compared to Gemma2-9b-it (the model it was based on):

| model        | overall | biology | business | chemistry | computer science | economics | engineering | health | history |  law  | math  | philosophy | physics | psychology | other |
| ------------ | ------- | ------- | -------- | --------- | ---------------- | --------- | ----------- | ------ | ------- | ----- | ----- | ---------- | ------- | ---------- | ----- |
| gemma2-9b-it |   51.99 |   69.57 |    44.00 |     58.33 |            61.54 |     66.67 |       25.81 |  52.17 |   75.00 | 33.33 | 58.14 |      68.75 |   41.46 |      44.00 | 62.07 |
| gemstone-9b  |   51.57 |   60.87 |    44.00 |     61.11 |            61.54 |     55.56 |       51.61 |  50.00 |   50.00 | 42.86 | 48.84 |      62.50 |   41.46 |      52.00 | 55.17 |
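
For reference, a minimal loading sketch using the transformers library, assuming the standard Gemma-2 chat-template workflow; the prompt text and generation settings below are placeholders, not recommended settings.

```python
# Minimal sketch: load gemstone-9b in BF16 and run one chat turn.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "crestf411/gemstone-9b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are stored as BF16 safetensors
    device_map="auto",
)

# Placeholder prompt; in practice you would feed your RP system/character context here.
messages = [{"role": "user", "content": "Describe the tavern in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```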

Model size: 9.24B params (Safetensors, BF16)
