|
--- |
|
library_name: transformers |
|
tags: [] |
|
--- |
|
|
|
# Model Card for MCQ_pooled_full_rationale_confidence_predictor
|
|
|
|
This is an MCQ (multiple-choice question) confidence prediction model. Given a math question and a full step-by-step rationale that attempts to solve it, the model outputs a certainty score for that rationale.
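The model scores a single text that combines the question and the rationale. A minimal sketch of the input format, mirroring the `Problem: ...\n---\nRationale Step: ...` template used in the usage example below (the question and rationale strings here are illustrative, not from the training data):

```python
# Build the combined input text the predictor scores.
question = "If x + 3 = 7, what is x?"
rationale = "Step 1: Subtract 3 from both sides. Step 2: x = 4."
text = f"Problem: {question}\n---\nRationale Step: {rationale}"
```

A higher score indicates the model is more confident that the rationale correctly solves the question.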
|
|
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
|
|
|
- **Developed by:** Shiyao Li |
|
- **Finetuned from model:** meta-llama/Meta-Llama-3.1-8B
|
|
|
### Model Example Usage |
|
```python
import json

import torch
from tqdm import tqdm
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name_or_path = "CarelessLee/MCQ_pooled_full_rationale_confidence_predictor"
model = AutoModelForSequenceClassification.from_pretrained(model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)


def predict(text):
    """Return the certainty score for a combined question/rationale text."""
    model.eval()
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # The classification head produces a single logit, used as the certainty score.
    return outputs.logits.item()


def evaluate_predictor(label_data):
    for sample in tqdm(label_data, desc="Processing questions"):
        highest_score = float("-inf")
        best_rationale = ""
        for rationale in sample["rationales"]:
            text = f"Problem: {sample['question']}\n---\nRationale Step: {rationale}"
            predicted_certainty_score = predict(text)
            print("predicted_certainty_score:", predicted_certainty_score)
            # Keep the rationale with the highest certainty score.
            if predicted_certainty_score > highest_score:
                highest_score = predicted_certainty_score
                best_rationale = rationale
        print("best rationale:", best_rationale)


if __name__ == "__main__":
    with open("example.json", "r") as f:
        label_data = json.load(f)

    evaluate_predictor(label_data)
```
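The script above expects `example.json` to hold a list of samples, each with a `question` string and a list of candidate `rationales` (these two keys are the only structure the code relies on). A hypothetical file matching that shape can be written like this; the question and rationales shown are made up for illustration:

```python
import json

# Hypothetical input matching the keys evaluate_predictor reads:
# each sample has one question and several candidate rationales to score.
example = [
    {
        "question": "What is 2 + 2?",
        "rationales": [
            "Step 1: Add 2 and 2. Step 2: The result is 4.",
            "Step 1: Multiply 2 by 2. Step 2: The result is 4.",
        ],
    }
]

with open("example.json", "w") as f:
    json.dump(example, f, indent=2)
```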
|
|