Quantization made by Richard Erkhov.

[GitHub](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


sparse_mistral_7b_refined_web_50p_2024-04-13 - GGUF
- Model creator: https://huggingface.co/thrunlab/
- Original model: https://huggingface.co/thrunlab/sparse_mistral_7b_refined_web_50p_2024-04-13/


| Name | Quant method | Size |
| ---- | ---- | ---- |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q2_K.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q2_K.gguf) | Q2_K | 2.53GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.IQ3_XS.gguf) | IQ3_XS | 2.81GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.IQ3_S.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.IQ3_S.gguf) | IQ3_S | 2.96GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q3_K_S.gguf) | Q3_K_S | 2.95GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.IQ3_M.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.IQ3_M.gguf) | IQ3_M | 3.06GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q3_K.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q3_K.gguf) | Q3_K | 3.28GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q3_K_M.gguf) | Q3_K_M | 3.28GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q3_K_L.gguf) | Q3_K_L | 3.56GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.IQ4_XS.gguf) | IQ4_XS | 3.67GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q4_0.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q4_0.gguf) | Q4_0 | 3.83GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.IQ4_NL.gguf) | IQ4_NL | 3.87GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q4_K_S.gguf) | Q4_K_S | 3.86GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q4_K.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q4_K.gguf) | Q4_K | 4.07GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q4_K_M.gguf) | Q4_K_M | 4.07GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q4_1.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q4_1.gguf) | Q4_1 | 4.24GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q5_0.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q5_0.gguf) | Q5_0 | 4.65GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q5_K_S.gguf) | Q5_K_S | 4.65GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q5_K.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q5_K.gguf) | Q5_K | 4.78GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q5_K_M.gguf) | Q5_K_M | 4.78GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q5_1.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q5_1.gguf) | Q5_1 | 5.07GB |
| [sparse_mistral_7b_refined_web_50p_2024-04-13.Q6_K.gguf](https://huggingface.co/RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf/blob/main/sparse_mistral_7b_refined_web_50p_2024-04-13.Q6_K.gguf) | Q6_K | 5.53GB |
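
The files in the table can also be fetched programmatically. The sketch below assumes the `huggingface_hub` package is installed and takes the repository ID and file naming pattern from the links above. The helper names `gguf_filename` and `download_quant` are illustrative and not part of any library.

```python
# Repository ID and base name are taken from the download links in the table.
REPO_ID = "RichardErkhov/thrunlab_-_sparse_mistral_7b_refined_web_50p_2024-04-13-gguf"
MODEL_NAME = "sparse_mistral_7b_refined_web_50p_2024-04-13"


def gguf_filename(quant: str) -> str:
    """Build the file name for a quant method listed in the table, e.g. 'Q4_K_M'."""
    return f"{MODEL_NAME}.{quant}.gguf"


def download_quant(quant: str = "Q4_K_M") -> str:
    """Download one GGUF file into the local Hugging Face cache and return its path."""
    # Imported lazily so the helper above works without the dependency installed.
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub

    return hf_hub_download(repo_id=REPO_ID, filename=gguf_filename(quant))


# Example (performs a multi-gigabyte download):
# path = download_quant("Q4_K_M")
```

The returned path can then be passed to any GGUF-compatible runtime.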


Original model description:
---
license: apache-2.0
base_model: mistralai/Mistral-7B-v0.1
tags:
- generated_from_trainer
model-index:
- name: sparse_mistral_7b_refined_web_50p_2024-04-13
  results: []
---


# sparse_mistral_7b_refined_web_50p_2024-04-13

This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1). The training dataset is not specified (the Trainer recorded it as `None`).
It achieves the following results on the evaluation set:
- Loss: 2.1985

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 4
- seed: 0
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 2350
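
The total batch sizes above follow from the per-device sizes, the device count, and gradient accumulation. The sketch below reproduces that arithmetic and also shows a linear learning-rate decay. It assumes zero warmup steps, since the list reports none, and the function names are illustrative.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Number of samples contributing to one optimizer step (or one eval pass)."""
    return per_device * num_devices * grad_accum_steps


def linear_lr(step: int, base_lr: float = 1e-5, total_steps: int = 2350,
              warmup_steps: int = 0) -> float:
    """Linear schedule: optional linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the hyperparameter list does not mention warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# train: 1 per device * 4 devices * 8 accumulation steps
print(effective_batch_size(1, 4, 8))
# eval: 4 per device * 4 devices, no accumulation
print(effective_batch_size(4, 4))
# learning rate halfway through training under the no-warmup assumption
print(linear_lr(1175))
```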

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.3391        | 0.01  | 25   | 2.4196          |
| 2.2711        | 0.02  | 50   | 2.3577          |
| 2.3054        | 0.02  | 75   | 2.3158          |
| 2.2795        | 0.03  | 100  | 2.2966          |
| 2.3175        | 0.04  | 125  | 2.2846          |
| 2.2388        | 0.05  | 150  | 2.2766          |
| 2.1679        | 0.06  | 175  | 2.2705          |
| 2.2996        | 0.06  | 200  | 2.2678          |
| 2.2788        | 0.07  | 225  | 2.2647          |
| 2.2448        | 0.08  | 250  | 2.2637          |
| 2.1837        | 0.09  | 275  | 2.2624          |
| 2.2089        | 0.1   | 300  | 2.2621          |
| 2.2686        | 0.1   | 325  | 2.2601          |
| 2.2254        | 0.11  | 350  | 2.2593          |
| 2.162         | 0.12  | 375  | 2.2590          |
| 2.2687        | 0.13  | 400  | 2.2563          |
| 2.2595        | 0.14  | 425  | 2.2571          |
| 2.186         | 0.14  | 450  | 2.2564          |
| 2.2689        | 0.15  | 475  | 2.2580          |
| 2.2472        | 0.16  | 500  | 2.2554          |
| 2.2005        | 0.17  | 525  | 2.2553          |
| 2.1983        | 0.18  | 550  | 2.2552          |
| 2.2388        | 0.18  | 575  | 2.2547          |
| 2.1443        | 0.19  | 600  | 2.2555          |
| 2.2198        | 0.2   | 625  | 2.2534          |
| 2.3008        | 0.21  | 650  | 2.2536          |
| 2.179         | 0.22  | 675  | 2.2521          |
| 2.2069        | 0.22  | 700  | 2.2531          |
| 2.1819        | 0.23  | 725  | 2.2526          |
| 2.1218        | 0.24  | 750  | 2.2536          |
| 2.1845        | 0.25  | 775  | 2.2515          |
| 2.2167        | 0.26  | 800  | 2.2510          |
| 2.2252        | 0.26  | 825  | 2.2520          |
| 2.1664        | 0.27  | 850  | 2.2519          |
| 2.1853        | 0.28  | 875  | 2.2530          |
| 2.1499        | 0.29  | 900  | 2.2513          |
| 2.2763        | 0.3   | 925  | 2.2517          |
| 2.2528        | 0.3   | 950  | 2.2518          |
| 2.2505        | 0.31  | 975  | 2.2500          |
| 2.1683        | 0.32  | 1000 | 2.2502          |
| 2.2177        | 0.33  | 1025 | 2.2501          |
| 2.238         | 0.34  | 1050 | 2.2516          |
| 2.193         | 0.34  | 1075 | 2.2507          |
| 2.2025        | 0.35  | 1100 | 2.2502          |
| 2.0944        | 0.36  | 1125 | 2.2512          |
| 2.2272        | 0.37  | 1150 | 2.2508          |
| 2.2264        | 0.38  | 1175 | 2.2500          |
| 2.1837        | 0.38  | 1200 | 2.2507          |
| 2.1444        | 0.39  | 1225 | 2.2489          |
| 2.2464        | 0.4   | 1250 | 2.2499          |
| 2.1388        | 0.41  | 1275 | 2.2508          |
| 2.193         | 0.42  | 1300 | 2.2492          |
| 2.2376        | 0.42  | 1325 | 2.2506          |
| 2.2212        | 0.43  | 1350 | 2.2478          |
| 2.2002        | 0.44  | 1375 | 2.2488          |
| 2.2729        | 0.45  | 1400 | 2.2484          |
| 2.2329        | 0.46  | 1425 | 2.2473          |
| 2.1919        | 0.46  | 1450 | 2.2481          |
| 2.2102        | 0.47  | 1475 | 2.2475          |
| 2.1466        | 0.48  | 1500 | 2.2473          |
| 2.1819        | 0.49  | 1525 | 2.2478          |
| 2.2558        | 0.5   | 1550 | 2.2468          |
| 2.2137        | 0.5   | 1575 | 2.2463          |
| 2.2288        | 0.51  | 1600 | 2.2466          |
| 2.1479        | 0.52  | 1625 | 2.2468          |
| 2.1726        | 0.53  | 1650 | 2.2471          |
| 2.1805        | 0.54  | 1675 | 2.2454          |
| 2.1505        | 0.54  | 1700 | 2.2470          |
| 2.1337        | 0.55  | 1725 | 2.2465          |
| 2.2413        | 0.56  | 1750 | 2.2460          |
| 2.152         | 0.57  | 1775 | 2.2478          |
| 2.2669        | 0.58  | 1800 | 2.2471          |
| 2.2925        | 0.58  | 1825 | 2.2465          |
| 2.222         | 0.59  | 1850 | 2.2457          |
| 2.1308        | 0.6   | 1875 | 2.2466          |
| 2.201         | 0.61  | 1900 | 2.2456          |
| 2.2247        | 0.62  | 1925 | 2.2460          |
| 2.2426        | 0.62  | 1950 | 2.2463          |
| 2.2312        | 0.63  | 1975 | 2.2465          |
| 2.2679        | 0.64  | 2000 | 2.2464          |
| 2.1928        | 0.65  | 2025 | 2.2463          |
| 2.2087        | 0.66  | 2050 | 2.2455          |
| 2.1792        | 0.66  | 2075 | 2.2470          |
| 2.252         | 0.67  | 2100 | 2.2468          |
| 2.2018        | 0.68  | 2125 | 2.2456          |
| 2.2006        | 0.69  | 2150 | 2.2451          |
| 2.2076        | 0.7   | 2175 | 2.2449          |
| 2.2436        | 0.7   | 2200 | 2.2460          |
| 2.2156        | 0.71  | 2225 | 2.2477          |
| 2.1348        | 0.72  | 2250 | 2.2455          |
| 2.1338        | 0.73  | 2275 | 2.2450          |
| 2.2147        | 0.74  | 2300 | 2.2455          |
| 2.2766        | 0.74  | 2325 | 2.2444          |
| 2.204         | 0.75  | 2350 | 2.2458          |
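
Loss values like those in the table are often easier to compare as perplexities. The sketch below assumes the reported losses are mean per-token cross-entropy in nats. The card does not state this, so treat the conversion as illustrative.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


# Validation losses copied from the first and last rows of the table.
first_val_loss = 2.4196  # step 25
last_val_loss = 2.2458   # step 2350

print(f"step 25:   perplexity ~ {perplexity(first_val_loss):.2f}")
print(f"step 2350: perplexity ~ {perplexity(last_val_loss):.2f}")
```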


### Framework versions

- Transformers 4.36.2
- PyTorch 2.1.2+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0
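
To compare a local environment against the versions listed above, a small check along these lines may help. It uses the standard-library `importlib.metadata`. The PyPI distribution names in `EXPECTED` (for example `torch` for PyTorch) and the helper `version_mismatches` are assumptions made for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above; keys are the assumed PyPI distribution names.
EXPECTED = {
    "transformers": "4.36.2",
    "torch": "2.1.2+cu121",
    "datasets": "2.15.0",
    "tokenizers": "0.15.0",
}


def version_mismatches(expected=EXPECTED, lookup=version):
    """Return {package: (wanted, found)} for every package whose version differs.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            found = lookup(package)
        except PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


for name, (wanted, found) in version_mismatches().items():
    print(f"{name}: expected {wanted}, found {found}")
```

Differing versions are not necessarily a problem for inference with the GGUF files; the list describes the original training environment.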