---
license: apache-2.0
datasets:
- nicholasKluge/reward-aira-dataset
language:
- en
metrics:
- accuracy
library_name: transformers
pipeline_tag: text-classification
tags:
- reward model
- alignment
- preference model
- RLHF
widget:
- text: "Who cares about AI Ethics? It's just a bunch of whining about humans making and using AI and bitching about what the machines do."
  example_title: "Bad Response"
- text: "AI ethics is important for several compelling reasons:\n\n1.**Social Impact**: AI technologies are becoming increasingly integrated into various aspects of society, affecting everything from healthcare and education to finance and law enforcement. Ethical considerations ensure that AI systems contribute positively to society and minimize potential harm.\n\n2. **Bias and Fairness**: AI systems can inherit biases present in the data they are trained on, leading to unfair or discriminatory outcomes. Ethical considerations push for the development of unbiased algorithms that treat all individuals fairly, regardless of their background.\n\n3. **Transparency and Accountability**: Many AI systems operate as black boxes, making it difficult to understand how they arrive at their decisions. Ethical guidelines emphasize the importance of transparency, enabling users to comprehend the rationale behind AI-generated results and holding developers accountable for any negative consequences.\n\nIn summary, AI ethics is vital to ensure that artificial intelligence benefits society while respecting fundamental human rights, fairness, transparency, accountability, and the long-term well-being of humanity. It helps navigate the challenges posed by rapidly advancing AI technologies and guides their development in ways that align with our shared values."
  example_title: "Good Response"
co2_eq_emissions:
  emissions: 0.08
  source: CodeCarbon
  training_type: fine-tuning
  geographical_location: Singapore
  hardware_used: NVIDIA A100-SXM4-40GB
---
# RewardModel
The RewardModel is a [BERT](https://huggingface.co/bert-base-cased) model that can be used to score the quality of a completion for a given prompt.
The model was trained with a dataset composed of `prompt`, `prefered_completions`, and `rejected_completions`.
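
This card does not state the training objective. A common choice for reward models trained on preferred/rejected pairs is a pairwise ranking loss, which pushes the score of the preferred completion above the score of the rejected one. The sketch below illustrates that idea in plain Python. It is an assumption, not a confirmed description of how this model was trained, and `pairwise_ranking_loss` is a hypothetical name.

```python
import math


def pairwise_ranking_loss(score_preferred: float, score_rejected: float) -> float:
    """Return -log(sigmoid(score_preferred - score_rejected)).

    Assumed objective for illustration only; the loss is small when the
    preferred completion scores well above the rejected one.
    """
    margin = score_preferred - score_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of the
    # margin so that exp() is only ever called on a non-positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Equal scores give log(2); a correctly ordered pair gives a smaller loss.
print(pairwise_ranking_loss(0.0, 0.0))
print(pairwise_ranking_loss(2.0, -2.0))
print(pairwise_ranking_loss(-2.0, 2.0))
```
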
## Details
- **Size:** 109,038,209 parameters
- **Dataset:** [Reward-Aira Dataset](https://huggingface.co/datasets/nicholasKluge/reward-aira-dataset)
- **Language:** English
- **Number of Training Steps:** 1200
- **Batch size:** 42
- **Optimizer:** `torch.optim.AdamW`
- **Learning Rate:** 5e-5
- **GPU:** 1 NVIDIA A100-SXM4-40GB
- **Emissions:** 0.08 kg CO2 (Singapore)
- **Total Energy Consumption:** 0.16 kWh
This repository has the [source code](https://github.com/Nkluge-correa/Aira) used to train this model.
## Usage
Here's an example of how to use the RewardModel to score the quality of a response to a given prompt:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained("nicholasKluge/RewardModel")
rewardModel = AutoModelForSequenceClassification.from_pretrained("nicholasKluge/RewardModel")
rewardModel.eval()
rewardModel.to(device)
# Define the question and response
prompt = "Why is AI Ethics important?"
response_good = "AI ethics is important for several compelling reasons:\n\n1.**Social Impact**: AI technologies are becoming increasingly integrated into various aspects of society, affecting everything from healthcare and education to finance and law enforcement. Ethical considerations ensure that AI systems contribute positively to society and minimize potential harm.\n\n2. **Bias and Fairness**: AI systems can inherit biases present in the data they are trained on, leading to unfair or discriminatory outcomes. Ethical considerations push for the development of unbiased algorithms that treat all individuals fairly, regardless of their background.\n\n3. **Transparency and Accountability**: Many AI systems operate as black boxes, making it difficult to understand how they arrive at their decisions. Ethical guidelines emphasize the importance of transparency, enabling users to comprehend the rationale behind AI-generated results and holding developers accountable for any negative consequences.\n\nIn summary, AI ethics is vital to ensure that artificial intelligence benefits society while respecting fundamental human rights, fairness, transparency, accountability, and the long-term well-being of humanity. It helps navigate the challenges posed by rapidly advancing AI technologies and guides their development in ways that align with our shared values."
response_bad = "Who cares about AI Ethics? It's just a bunch of whining about humans making and using AI and bitching about what the machines do."
# Tokenize each (prompt, response) pair
tokens_good = tokenizer(prompt, response_good,
                        truncation=True,
                        max_length=512,
                        return_token_type_ids=False,
                        return_tensors="pt",
                        return_attention_mask=True)
tokens_bad = tokenizer(prompt, response_bad,
                       truncation=True,
                       max_length=512,
                       return_token_type_ids=False,
                       return_tensors="pt",
                       return_attention_mask=True)
# Move the inputs to the same device as the model
tokens_good.to(device)
tokens_bad.to(device)
# Score both responses; gradients are not needed for inference
with torch.no_grad():
    score_good = rewardModel(**tokens_good)[0].item()
    score_bad = rewardModel(**tokens_bad)[0].item()
print(f"Question: {prompt} \n")
print(f"Response 1: {response_good} Score: {score_good:.3f}")
print(f"Response 2: {response_bad} Score: {score_bad:.3f}")
```
This will output the following:
```markdown
Question: Why is AI Ethics important?
Response 1: AI ethics is important for several compelling reasons:
1.**Social Impact**: AI technologies are becoming increasingly integrated into various aspects of society,
affecting everything from healthcare and education to finance and law enforcement. Ethical considerations
ensure that AI systems contribute positively to society and minimize potential harm.
2. **Bias and Fairness**: AI systems can inherit biases present in the data they are trained on, leading
to unfair or discriminatory outcomes. Ethical considerations push for the development of unbiased
algorithms that treat all individuals fairly, regardless of their background.
3. **Transparency and Accountability**: Many AI systems operate as black boxes, making it difficult to
understand how they arrive at their decisions. Ethical guidelines emphasize the importance of
transparency, enabling users to comprehend the rationale behind AI-generated results and holding
developers accountable for any negative consequences.
In summary, AI ethics is vital to ensure that artificial intelligence benefits society while respecting
fundamental human rights, fairness, transparency, accountability, and the long-term well-being of humanity.
It helps navigate the challenges posed by rapidly advancing AI technologies and guides their development in
ways that align with our shared values. Score: 12.011
Response 2: Who cares about AI Ethics? It's just a bunch of whining about humans making and using AI
and bitching about what the machines do. Score: -10.942
```
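
Since the scores are only meaningful relative to each other, one natural use is to rank several candidate responses to the same prompt and keep the highest-scoring one. The sketch below shows that pattern. `rank_responses` and `toy_score` are hypothetical names introduced for illustration, and `toy_score` is a stand-in that merely counts words so the example runs without downloading the model. In practice, `score_fn` would wrap the tokenizer and `rewardModel` calls shown above.

```python
from typing import Callable, List, Tuple


def rank_responses(
    prompt: str,
    responses: List[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score every response for the prompt and sort from best to worst."""
    scored = [(response, score_fn(prompt, response)) for response in responses]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_score(prompt: str, response: str) -> float:
    """Stand-in scorer (word count), NOT the reward model."""
    return float(len(response.split()))


candidates = [
    "Who cares?",
    "AI ethics helps ensure that AI systems benefit society.",
]
ranking = rank_responses("Why is AI Ethics important?", candidates, toy_score)
best_response, best_score = ranking[0]
print(f"Best response: {best_response} Score: {best_score:.3f}")
```
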
## Performance
| Acc | [WebGPT](https://huggingface.co/datasets/openai/webgpt_comparisons) |
|----------------------------------------------------------------------|---------------------------------------------------------------------|
| [Aira-RewardModel](https://huggingface.co/nicholasKluge/RewardModel) | 55.02%* |
\* Only considering comparisons of the `webgpt_comparisons` dataset that had a preferred option.
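
The accuracy above can be read as the fraction of comparisons in which the model scores the human-preferred answer higher than the other one, after dropping comparisons without a preferred option. The sketch below shows one way to compute such a figure. The function name and the record keys (`human_0`, `human_1`, `model_0`, `model_1`) are hypothetical, and the exact evaluation procedure used for the reported number is not given in this card.

```python
from typing import Dict, Iterable


def pairwise_accuracy(comparisons: Iterable[Dict[str, float]]) -> float:
    """Fraction of non-tied comparisons where the model agrees with humans.

    Each record holds the human preference scores for the two answers
    (human_0, human_1) and the reward model scores (model_0, model_1).
    """
    kept = 0
    correct = 0
    for record in comparisons:
        if record["human_0"] == record["human_1"]:
            continue  # no preferred option, so the comparison is skipped
        kept += 1
        human_prefers_0 = record["human_0"] > record["human_1"]
        # A tie in model scores is treated as the model preferring answer 1.
        model_prefers_0 = record["model_0"] > record["model_1"]
        correct += human_prefers_0 == model_prefers_0
    if kept == 0:
        raise ValueError("no comparisons with a preferred option")
    return correct / kept


example = [
    {"human_0": 1.0, "human_1": -1.0, "model_0": 3.2, "model_1": -0.5},  # agree
    {"human_0": -1.0, "human_1": 1.0, "model_0": 2.0, "model_1": 1.0},   # disagree
    {"human_0": 0.0, "human_1": 0.0, "model_0": 0.1, "model_1": 0.9},    # skipped
]
print(f"Accuracy: {pairwise_accuracy(example):.2%}")
```
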
## Cite as 🤗
```latex
@misc{nicholas22aira,
  doi = {10.5281/zenodo.6989727},
  url = {https://github.com/Nkluge-correa/Aira},
  author = {Nicholas Kluge Corrêa},
  title = {Aira},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
}

@phdthesis{kluge2024dynamic,
  title={Dynamic Normativity},
  author={Kluge Corr{\^e}a, Nicholas},
  year={2024},
  school={Universit{\"a}ts-und Landesbibliothek Bonn}
}
```
## License
RewardModel is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for more details.