---
datasets:
- phatvo/hotpotqa-raft-dev-100
library_name: transformers
license: llama3.1
metrics:
- f1
- exact_match
pipeline_tag: text-generation
---

# Model Card for phatvo/Meta-Llama3.1-8B-Instruct-RAFT

LoRA adapters for `meta-llama/Meta-Llama-3.1-8B-Instruct`, fine-tuned with the RAFT method on HotpotQA for more accurate, context-grounded question answering.

## Model Details

### Model Description

LoRA adapters for `meta-llama/Meta-Llama-3.1-8B-Instruct`, trained on 100 context samples from the HotpotQA dataset with the RAFT (Retrieval Augmented Fine-Tuning) method. The adapters help the model reason over the provided context and return more accurate answers.
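
If the repository ships the adapters in PEFT format, they can be attached to the base model with `peft`; the sketch below assumes that layout (the adapter repo id and the optional merge step are illustrative, not confirmed by this card):

```python
# Sketch: attaching the LoRA adapters to the base model with PEFT.
# Assumption: this repo contains PEFT-format adapter weights; adjust ids/paths as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter_id = "phatvo/Meta-Llama3.1-8B-Instruct-RAFT"  # assumed adapter location

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto")

# Load the low-rank adapters on top of the frozen base weights.
model = PeftModel.from_pretrained(base, adapter_id)

# Optionally fold the adapters into the base weights for adapter-free inference.
model = model.merge_and_unload()
```

Merging folds the low-rank updates into the base weights, so inference runs without any adapter overhead.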

### Evaluation

Evaluated on the full validation set of HotpotQA.


| type       | exact_match | f1         | precision | recall |
|------------|-------------|------------|-----------|--------|
| pretrained | 0.2980      | 0.3979     | 0.4116    | 0.5263 |
| finetuned  | 0.3606      | **0.4857** | 0.4989    | 0.5318 |

The finetuned version improves F1 by **22%** relative to the pretrained baseline (0.4857 vs. 0.3979) and the average across all four metrics by **15%** (0.4693 vs. 0.4085).
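
For reference, here is a minimal sketch of the token-level exact-match and F1 definitions such numbers are typically based on (the card's actual evaluation script is not shown, so treat this as an assumption about the scoring convention):

```python
# Sketch of SQuAD/HotpotQA-style token-level exact match and F1.
from collections import Counter

def normalize(text: str) -> list[str]:
    # Lowercase and split on whitespace; full scorers also strip punctuation and articles.
    return text.lower().split()

def exact_match(prediction: str, reference: str) -> float:
    return float(normalize(prediction) == normalize(reference))

def f1_score(prediction: str, reference: str) -> float:
    pred_tokens, ref_tokens = normalize(prediction), normalize(reference)
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(f1_score("the Eiffel Tower", "Eiffel Tower"))  # 0.8
```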

### Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "phatvo/Meta-Llama3.1-8B-Instruct-RAFT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
  model_id, device_map="auto", revision="main", trust_remote_code=True)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

inst = "Given the question and context below, thinking in logical reasoning way for your answer.\
Please provide only your answer in this format: CoT Answer: {reason} <ANSWER>: {answer}."
context = ""
question = ""
prompt = f"{context}\n{question}"

chat = [
        {"role": "system", "content": inst},
        {"role": "user", "content": prompt},
    ]
# add_generation_prompt appends the assistant header so the model starts a new turn
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

output = pipe(prompt,
  temperature=0.001, 
  max_new_tokens=1024, # recommended: set above 800 to leave room for the CoT reasoning
  return_full_text=False,
  do_sample=True)

print(output[0]["generated_text"])
# CoT Answer: thoughts... <ANSWER>: final_answer...
```
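
Continuing from the example above, the final answer can be pulled out by splitting on the `<ANSWER>:` tag the instruction asks the model to emit (a small helper sketch; the exact formatting of the model's output may vary):

```python
# Split the generated text into reasoning and final answer.
generated = output[0]["generated_text"]
if "<ANSWER>:" in generated:
    reasoning, answer = generated.split("<ANSWER>:", 1)
    print("Reasoning:", reasoning.replace("CoT Answer:", "").strip())
    print("Answer:", answer.strip())
else:
    print(generated)
```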