---
license: llama3.2
tags:
- llama-3
- orpo
- transformers
datasets:
- mlabonne/orpo-dpo-mix-40k
language:
- en
base_model:
- meta-llama/Llama-3.2-1B-Instruct
library_name: transformers
pipeline_tag: text-generation
model-index:
- name: week2-llama3-1B
  results:
  - task:
      type: text-generation
    dataset:
      name: hellaswag
      type: hellaswag
    metrics:
    - name: acc_norm (0-shot)
      type: acc_norm
      value: 0.6077
---
# Llama-3.2-1B-Instruct-ORPO
[Evaluation](#evaluation) · [Environmental Impact](#environmental-impact)
## Model Details
This model was obtained by fine-tuning the open-source [Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)
model on the [mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k) dataset, using
[Odds Ratio Preference Optimization (ORPO)](https://github.com/xfactlab/orpo) for preference alignment.
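ORPO augments the usual supervised fine-tuning loss with an odds-ratio term that pushes the odds of the chosen response above those of the rejected one. A minimal sketch of that term in plain Python, assuming length-normalized sequence log-probabilities as inputs (this is an illustration of the loss, not the training code used here):

```python
import math

def orpo_odds_ratio_loss(logp_chosen: float, logp_rejected: float) -> float:
    """Odds-ratio term of the ORPO loss: -log sigmoid(log OR).

    Inputs are average (length-normalized) token log-probabilities of the
    chosen and rejected responses, so each maps to a probability in (0, 1).
    """
    def log_odds(logp: float) -> float:
        p = math.exp(logp)             # sequence probability
        return logp - math.log1p(-p)   # log(p / (1 - p))

    log_ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-log_ratio)))  # -log sigmoid
```

During training this term is scaled by a weight λ and added to the cross-entropy loss on the chosen response; the λ and other hyperparameters used for this run are not recorded here.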
## Uses
This model is intended for general-purpose English language tasks such as chat, instruction following, and question answering.
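A minimal inference sketch using the 🤗 Transformers `text-generation` pipeline; the repo id below is a placeholder, so substitute the actual model id when it is published:

```python
from transformers import pipeline

MODEL_ID = "your-username/Llama-3.2-1B-Instruct-ORPO"  # placeholder repo id

def chat(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate one assistant reply via the model's chat template."""
    generator = pipeline("text-generation", model=MODEL_ID)
    messages = [{"role": "user", "content": prompt}]
    outputs = generator(messages, max_new_tokens=max_new_tokens)
    # generated_text holds the full chat; the last message is the reply
    return outputs[0]["generated_text"][-1]["content"]
```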
## Evaluation
We used the [EleutherAI lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) to evaluate the fine-tuned model.
The table below summarizes the results.
For a per-subject breakdown of `MMLU`, see Section [MMLU](#mmlu).
| Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|---------|------:|------|-----:|--------|---|-----:|---|-----:|
|hellaswag| 1|none | 0|acc |↑ |0.4507|± |0.0050|
| | |none | 0|acc_norm|↑ |0.6077|± |0.0049|
|arc_easy| 1|none | 0|acc |↑ |0.6856|± |0.0095|
| | |none | 0|acc_norm|↑ |0.6368|± |0.0099|
|mmlu | 2|none | |acc |↑ |0.4597|± |0.0041|
| - humanities | 2|none | |acc |↑ |0.4434|± |0.0071|
| - other | 2|none | |acc |↑ |0.5163|± |0.0088|
| - social sciences| 2|none | |acc |↑ |0.5057|± |0.0088|
| - stem | 2|none | |acc |↑ |0.3834|± |0.0085|
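The summary above corresponds to a zero-shot run of the harness. Assuming the model is published under the placeholder repo id used earlier, the invocation looks roughly like:

```shell
lm_eval --model hf \
  --model_args pretrained=your-username/Llama-3.2-1B-Instruct-ORPO \
  --tasks hellaswag,arc_easy,mmlu \
  --num_fewshot 0 \
  --batch_size auto
```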
### MMLU
| Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
|---------|------:|------|-----:|--------|---|-----:|---|-----:|
|mmlu | 2|none | |acc |↑ |0.4597|± |0.0041|
| - humanities | 2|none | |acc |↑ |0.4434|± |0.0071|
| - formal_logic | 1|none | 0|acc |↑ |0.3254|± |0.0419|
| - high_school_european_history | 1|none | 0|acc |↑ |0.6182|± |0.0379|
| - high_school_us_history | 1|none | 0|acc |↑ |0.5784|± |0.0347|
| - high_school_world_history | 1|none | 0|acc |↑ |0.6540|± |0.0310|
| - international_law | 1|none | 0|acc |↑ |0.6033|± |0.0447|
| - jurisprudence | 1|none | 0|acc |↑ |0.5370|± |0.0482|
| - logical_fallacies | 1|none | 0|acc |↑ |0.4479|± |0.0391|
| - moral_disputes | 1|none | 0|acc |↑ |0.4711|± |0.0269|
| - moral_scenarios | 1|none | 0|acc |↑ |0.3408|± |0.0159|
| - philosophy | 1|none | 0|acc |↑ |0.5177|± |0.0284|
| - prehistory | 1|none | 0|acc |↑ |0.5278|± |0.0278|
| - professional_law | 1|none | 0|acc |↑ |0.3683|± |0.0123|
| - world_religions | 1|none | 0|acc |↑ |0.5906|± |0.0377|
| - other | 2|none | |acc |↑ |0.5163|± |0.0088|
| - business_ethics | 1|none | 0|acc |↑ |0.4300|± |0.0498|
| - clinical_knowledge | 1|none | 0|acc |↑ |0.4642|± |0.0307|
| - college_medicine | 1|none | 0|acc |↑ |0.3815|± |0.0370|
| - global_facts | 1|none | 0|acc |↑ |0.3200|± |0.0469|
| - human_aging | 1|none | 0|acc |↑ |0.5157|± |0.0335|
| - management | 1|none | 0|acc |↑ |0.5243|± |0.0494|
| - marketing | 1|none | 0|acc |↑ |0.6709|± |0.0308|
| - medical_genetics | 1|none | 0|acc |↑ |0.4800|± |0.0502|
| - miscellaneous | 1|none | 0|acc |↑ |0.6015|± |0.0175|
| - nutrition | 1|none | 0|acc |↑ |0.5686|± |0.0284|
| - professional_accounting | 1|none | 0|acc |↑ |0.3511|± |0.0285|
| - professional_medicine | 1|none | 0|acc |↑ |0.5625|± |0.0301|
| - virology | 1|none | 0|acc |↑ |0.4157|± |0.0384|
| - social sciences | 2|none | |acc |↑ |0.5057|± |0.0088|
| - econometrics | 1|none | 0|acc |↑ |0.2456|± |0.0405|
| - high_school_geography | 1|none | 0|acc |↑ |0.5606|± |0.0354|
| - high_school_government_and_politics| 1|none | 0|acc |↑ |0.5389|± |0.0360|
| - high_school_macroeconomics | 1|none | 0|acc |↑ |0.4128|± |0.0250|
| - high_school_microeconomics | 1|none | 0|acc |↑ |0.4454|± |0.0323|
| - high_school_psychology | 1|none | 0|acc |↑ |0.6183|± |0.0208|
| - human_sexuality | 1|none | 0|acc |↑ |0.5420|± |0.0437|
| - professional_psychology | 1|none | 0|acc |↑ |0.4167|± |0.0199|
| - public_relations | 1|none | 0|acc |↑ |0.5000|± |0.0479|
| - security_studies | 1|none | 0|acc |↑ |0.5265|± |0.0320|
| - sociology | 1|none | 0|acc |↑ |0.6468|± |0.0338|
|  - us_foreign_policy | 1|none | 0|acc |↑ |0.6900|± |0.0465|
| - stem | 2|none | |acc |↑ |0.3834|± |0.0085|
| - abstract_algebra | 1|none | 0|acc |↑ |0.2500|± |0.0435|
| - anatomy | 1|none | 0|acc |↑ |0.4889|± |0.0432|
| - astronomy | 1|none | 0|acc |↑ |0.5329|± |0.0406|
| - college_biology | 1|none | 0|acc |↑ |0.4931|± |0.0418|
| - college_chemistry | 1|none | 0|acc |↑ |0.3800|± |0.0488|
| - college_computer_science | 1|none | 0|acc |↑ |0.3300|± |0.0473|
| - college_mathematics | 1|none | 0|acc |↑ |0.2800|± |0.0451|
| - college_physics | 1|none | 0|acc |↑ |0.2451|± |0.0428|
| - computer_security | 1|none | 0|acc |↑ |0.4800|± |0.0502|
| - conceptual_physics | 1|none | 0|acc |↑ |0.4383|± |0.0324|
| - electrical_engineering | 1|none | 0|acc |↑ |0.5310|± |0.0416|
| - elementary_mathematics | 1|none | 0|acc |↑ |0.2884|± |0.0233|
| - high_school_biology | 1|none | 0|acc |↑ |0.4935|± |0.0284|
| - high_school_chemistry | 1|none | 0|acc |↑ |0.3645|± |0.0339|
| - high_school_computer_science | 1|none | 0|acc |↑ |0.4500|± |0.0500|
| - high_school_mathematics | 1|none | 0|acc |↑ |0.2815|± |0.0274|
| - high_school_physics | 1|none | 0|acc |↑ |0.3113|± |0.0378|
| - high_school_statistics | 1|none | 0|acc |↑ |0.3657|± |0.0328|
| - machine_learning | 1|none | 0|acc |↑ |0.2768|± |0.0425|
## Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** MacBook Air M1
- **Hours Used:** 1
- **Cloud Provider:** GCP (A100)
- **Compute Region:** us-east1
- **Carbon Emitted:** 0.09 kg CO₂eq, 100% of which was directly offset by the cloud provider.