Vikhrmodels/Vikhr-Gemma-2B-instruct-GGUF

💨 Vikhr-Gemma-2B-instruct

A powerful instruction-tuned model based on Gemma 2 2B, trained on the Russian-language GrandMaster-PRO-MAX dataset.

HF model

Perplexity (lower is better)

Veles

Quantization	Perplexity
Q4_K	4.7254 +/- 0.03867
Q4_0	4.8067 +/- 0.03922
Q8_0	4.6042 +/- 0.03751
Q4_1	4.7798 +/- 0.03933
F32	4.6013 +/- 0.03749
Q6_K	4.6244 +/- 0.03760
BF16	4.6015 +/- 0.03749
Q2_K	5.6819 +/- 0.04737
Q5_0	4.6876 +/- 0.03855
Q5_K	4.6428 +/- 0.03789
Q3_K_S	5.1485 +/- 0.04257
Q2_K_S	6.3124 +/- 0.05359
F16	4.6013 +/- 0.03749
Q4_K_M	4.7254 +/- 0.03867
Q5_K_M	4.6428 +/- 0.03789
Q5_1	4.6518 +/- 0.03794
Q4_K_S	4.7631 +/- 0.03916
Q5_K_S	4.6509 +/- 0.03803
Q3_K	4.8339 +/- 0.03965
Q3_K_M	4.8339 +/- 0.03965
Q3_K_L	4.7981 +/- 0.03934
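
To make the table easier to read, the sketch below expresses a few of the Veles results as a percentage increase over the F16 value. The perplexity numbers are copied from the table above; the helper name `relative_increase` and the choice of F16 as the baseline are assumptions made for illustration, not part of the model card.

```python
# Veles perplexities copied from the table above (mean values only,
# the +/- uncertainty is ignored in this sketch).
veles_ppl = {
    "F16": 4.6013,
    "Q8_0": 4.6042,
    "Q6_K": 4.6244,
    "Q5_K_M": 4.6428,
    "Q4_K_M": 4.7254,
    "Q3_K_M": 4.8339,
    "Q2_K": 5.6819,
}


def relative_increase(ppl: float, baseline: float) -> float:
    """Return the percentage increase of `ppl` over `baseline`."""
    return (ppl - baseline) / baseline * 100.0


# Assumption: F16 is used as the reference point for "unquantized" quality.
baseline = veles_ppl["F16"]

# Print the formats from best (lowest perplexity) to worst.
for name, ppl in sorted(veles_ppl.items(), key=lambda kv: kv[1]):
    print(f"{name:7s} {ppl:.4f}  +{relative_increase(ppl, baseline):.2f}%")
```

Read this way, the 5-bit and higher formats stay within about one percent of F16 on this dataset, while the 2-bit formats degrade much more sharply.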

Wikitext-2

Quantization	Perplexity
Q4_K	10.4374 +/- 0.07339
Q4_0	10.6480 +/- 0.07452
Q8_0	10.1209 +/- 0.07105
Q4_1	10.5574 +/- 0.07476
F32	10.1191 +/- 0.07099
Q6_K	10.1503 +/- 0.07117
BF16	10.1189 +/- 0.07098
Q2_K	12.8851 +/- 0.09332
Q5_0	10.2551 +/- 0.07251
Q5_K	10.1975 +/- 0.07184
Q3_K_S	11.6028 +/- 0.08333
Q2_K_S	14.7951 +/- 0.10960
F16	10.1191 +/- 0.07099
Q4_K_M	10.4374 +/- 0.07339
Q5_K_M	10.1975 +/- 0.07184
Q5_1	10.2348 +/- 0.07208
Q4_K_S	10.4924 +/- 0.07386
Q5_K_S	10.2098 +/- 0.07198
Q3_K	10.7416 +/- 0.07606
Q3_K_M	10.7416 +/- 0.07606
Q3_K_L	10.6242 +/- 0.07506
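
For readers unfamiliar with the metric, perplexity is conventionally defined as the exponential of the mean negative log-likelihood per token, which is why lower values are better. The sketch below shows only that standard definition. The model card does not state which tool produced the tables, and the log-probabilities used here are hypothetical values chosen for illustration, not outputs of this model.

```python
import math


def perplexity(token_logprobs):
    """Exponential of the mean negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Hypothetical per-token probabilities, for illustration only.
example = [math.log(0.25), math.log(0.5), math.log(0.125)]
print(perplexity(example))
```

A model that assigned probability 1 to every correct token would score a perplexity of 1, so the Wikitext-2 values around 10 can be read loosely as the model being about as uncertain as a uniform choice among ten tokens at each step.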

Downloads last month: 1,873

GGUF

Model size: 2.61B params

Architecture: gemma2

Available bit widths: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit, 32-bit
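
As a rough guide to what the bit widths above mean for file size, the sketch below multiplies the listed parameter count by a nominal number of bits per weight. This is an assumption-laden lower bound, not a statement of the actual file sizes in this repository: real GGUF files are larger because quantized formats also store per-block scales and usually keep some tensors at higher precision. The function name `nominal_size_gb` is hypothetical.

```python
PARAMS = 2.61e9  # parameter count as listed on the model page


def nominal_size_gb(bits_per_weight: float, params: float = PARAMS) -> float:
    """Lower-bound size in GB (10^9 bytes), ignoring quantization metadata."""
    return params * bits_per_weight / 8 / 1e9


# Nominal bit widths only; actual K-quant formats use slightly more bits
# per weight than their names suggest.
for bits in (2, 3, 4, 5, 6, 8, 16, 32):
    print(f"{bits:>2}-bit: ~{nominal_size_gb(bits):.2f} GB")
```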

Model tree for Vikhrmodels/Vikhr-Gemma-2B-instruct-GGUF

Base model: google/gemma-2-2b
Finetuned: google/gemma-2-2b-it
Finetuned: Vikhrmodels/Vikhr-Gemma-2B-instruct
Quantized (5): this model

Dataset used to train Vikhrmodels/Vikhr-Gemma-2B-instruct-GGUF