---
library_name: transformers
tags:
- transformers
- peft
- arxiv:2406.08391
license: apache-2.0
base_model: mistralai/Mistral-7B-Instruct-v0.2
datasets:
- calibration-tuning/Mistral-7B-Instruct-v0.2-20k-oe
---

# Model Card

**Mistral 7B Instruct v0.2 CT-OE** is a fine-tuned [Mistral 7B Instruct v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) model that provides well-calibrated confidence estimates for open-ended question answering.

The model is fine-tuned (calibration-tuned) using a [dataset](https://huggingface.co/datasets/calibration-tuning/Mistral-7B-Instruct-v0.2-20k-oe) of *open-ended* generations from `mistralai/Mistral-7B-Instruct-v0.2`, labeled for correctness. 
At inference time, the predicted probability of correctness serves as the model's confidence in its answer.
For full details, please see our [paper](https://arxiv.org/abs/2406.08391) and supporting [code](https://github.com/activatedgeek/calibration-tuning).

**Other Models**: We also release a broader collection of [Open-Ended CT Models](https://huggingface.co/collections/calibration-tuning/open-ended-ct-models-66043b12c7902115c826a20e).

## Usage

This adapter model is meant to be used on top of generations from the `mistralai/Mistral-7B-Instruct-v0.2` base model.

The confidence estimation pipeline follows these steps:
1. Load base model and PEFT adapter.
2. Disable adapter and generate answer.
3. Enable adapter and generate confidence.

All standard guidelines for the base model's generation apply.
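
Below is a minimal sketch of these three steps, assuming standard `transformers` and `peft` APIs. The adapter repository id and the yes/no confidence-query format are illustrative assumptions, not the authors' verbatim setup; the exact prompt and scoring logic live in the `play.py` script linked below.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_id = "calibration-tuning/Mistral-7B-Instruct-v0.2-ct-oe"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# 1. Load base model and PEFT adapter.
model = PeftModel.from_pretrained(model, adapter_id)

question = "What is the capital of France?"
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": question}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# 2. Disable adapter and generate the answer with the base weights only.
with model.disable_adapter():
    output_ids = model.generate(input_ids, max_new_tokens=64, do_sample=False)
answer = tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)

# 3. Enable adapter (the default state) and score the answer's correctness.
# Hypothetical query format: ask whether the answer is correct and read the
# probability mass on the "yes"/"no" tokens at the next position.
query = f"{question}\n{answer}\nIs the proposed answer correct? Answer yes or no:"
query_ids = tokenizer(query, return_tensors="pt").input_ids.to(model.device)
with torch.inference_mode():
    logits = model(input_ids=query_ids).logits[0, -1]
yes_id = tokenizer("yes", add_special_tokens=False).input_ids[-1]
no_id = tokenizer("no", add_special_tokens=False).input_ids[-1]
confidence = torch.softmax(logits[[yes_id, no_id]], dim=-1)[0].item()

print(f"Answer: {answer}\nConfidence: {confidence:.2f}")
```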

For a complete example, see [play.py](https://github.com/activatedgeek/calibration-tuning/blob/main/experiments/play.py) at the supporting code repository.

**NOTE**: Generating with the adapter enabled may hurt downstream task accuracy and confidence estimates. We recommend using the adapter *only* to estimate confidence.

## License

The model is released under the original model's Apache 2.0 license.