---
library_name: transformers
tags: []
---
# Model Card for Prob-Gen-70B
This model was fine-tuned with 4-bit QLoRA from [Llama-3-70B from Meta](https://huggingface.co/meta-llama/Meta-Llama-3-70B) on 3,644 GPT-4-generated grade school math word problems. Given a specified context and tested knowledge, it generates multiple-choice math word problems.
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
The model can be loaded with the Hugging Face Transformers library:
``` python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DukeNLP/Prob-Gen-70B"
# device_map="auto" spreads the weights across the available devices
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Replace the <...> placeholders with a context and a tested-knowledge unit
prompt = "Please generate a math problem and 2 to 4 options for 8th graders with the following requirements:\nProblem context: <specified-context>\nTested knowledge: <specified-knowledge>"
# Move the inputs to the model's device; ** passes input_ids and attention_mask together
model_input = tokenizer(prompt, return_tensors="pt").to(model.device)
model_output = model.generate(**model_input, max_new_tokens=256)
print(tokenizer.batch_decode(model_output, skip_special_tokens=True)[0])
```
<!-- ## Training Details -->
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
The model is fine-tuned on 3,644 GPT-4-generated 8th-grade problems, which were also annotated and evaluated by humans. An example data point is shown below:
``` json
{
  "options": [
    {
      "optionText": "Multiply 500 by 3/5 to get 300 tons.",
      "correct": true
    },
    {
      "optionText": "Divide 500 by 3 to get 166.67 tons.",
      "correct": false
    }
  ],
  "problemContext": "Environmental issues",
  "evaluated_problem": "A town's recycling plant recycles plastic and glass in a ratio of 3:2. If the plant processes 500 tons of recyclables, how much of it is plastic?",
  "unitTitle": "Solving Multi-Step Problems with Proportional Relationships"
}
```
### Prompting
The model can be prompted with the following template, replacing the two placeholders with a context and a tested-knowledge unit:
``` python
"""Please generate a math problem and 2 to 4 options for 8th graders with the following requirements:
Problem context: <specified-context>
Tested knowledge: <specified-knowledge>"""
```
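Filling the template by hand is error-prone, so a small helper can do it. The following is a minimal sketch; the names `PROMPT_TEMPLATE` and `build_prompt` are illustrative and not part of the released model or its code.

```python
# Prompt template from this model card, with the two placeholders
# turned into str.format fields.
PROMPT_TEMPLATE = (
    "Please generate a math problem and 2 to 4 options for 8th graders "
    "with the following requirements:\n"
    "Problem context: {context}\n"
    "Tested knowledge: {knowledge}"
)


def build_prompt(context: str, knowledge: str) -> str:
    """Fill the template with a problem context and a tested-knowledge unit."""
    return PROMPT_TEMPLATE.format(context=context, knowledge=knowledge)


prompt = build_prompt("Sports", "Ratios and Rates")
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the hard-coded `prompt` in the loading example.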
The contexts used in the dataset are:
```
"Video Games",
"Fashion",
"Influencers/YouTubers",
"Apps and Technology",
"Movies/TV shows",
"Sports",
"Music and Concerts",
"Social Media",
"Environmental issues"
```
The tested-knowledge units in the dataset are:
```
"Operations with Rational Numbers",
"Expressions and Equations",
"Surface Area and Volume",
"Arithmetic in Base Ten",
"Evaluating Numeric Expressions",
"Properties and Theorems of Angles",
"Data Sets",
"Rational Number Arithmetic",
"Functions and Volume",
"Linear Equations and Linear Systems",
"Representing Data and Distributions",
"Algebraic Expressions",
"Ratios and Rates",
"Solving Equations and Systems of Equations",
"Operations with Integers",
"Scatter Plots",
"Solving Percentage Problems with Proportional Relationships",
"Associations in Data",
"Expressions, Equations, and Inequalities",
"Linear Relationships",
"Representing Data",
"Solving Multi-Step Problems with Proportional Relationships",
"Dividing Fractions",
"Area, Surface Area, and Volume",
"Equivalent Algebraic Expressions",
"Key Features of Linear Equations",
"Proportional Relationships and Percentages",
"Transformations",
"Representing Proportional Relationships"
```
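To generate problems across several contexts and knowledge units, one prompt can be built per pair. The sketch below uses only a small subset of the lists above; the function name `build_prompt_grid` is hypothetical, and whether every pairing appears in the training data is not stated in this card.

```python
from itertools import product

# A small subset of the contexts and tested-knowledge units listed above.
contexts = ["Video Games", "Sports"]
knowledge_units = ["Ratios and Rates", "Scatter Plots", "Transformations"]

TEMPLATE = (
    "Please generate a math problem and 2 to 4 options for 8th graders "
    "with the following requirements:\n"
    "Problem context: {context}\n"
    "Tested knowledge: {knowledge}"
)


def build_prompt_grid(contexts, knowledge_units):
    """Return one prompt for every (context, knowledge unit) pair."""
    return [
        TEMPLATE.format(context=c, knowledge=k)
        for c, k in product(contexts, knowledge_units)
    ]


prompts = build_prompt_grid(contexts, knowledge_units)
print(len(prompts))  # 2 contexts x 3 units = 6 prompts
```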
### Sample Generation
Here is an example passage from the training data:
```
Please generate a math problem and options for 8th graders with the following requirements:
Problem context: Movies/TV shows
Tested knowledge: Representing Data and Distributions
Question: Lucas counted the number of episodes in 12 seasons of a TV show. He recorded: 48, 51, 52, 55, 56, 58, 59, 60, 61, 62, 65, 67. How should he create a frequency table for the number of episodes?
Option 1: Group the data into intervals of 10, then count the number of seasons in each interval.
Is correct: False
Option 2: Group the data into intervals of 5 starting from 45, then count the number of seasons in each interval.
Is correct: True
```
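A JSON record like the one under Training Data could be rendered into this passage format roughly as follows. This is a sketch under assumptions: the mapping of `unitTitle` to "Tested knowledge" and `evaluated_problem` to "Question" is inferred from the examples in this card, and `format_training_passage` is a hypothetical name, not the authors' preprocessing code.

```python
def format_training_passage(record: dict) -> str:
    """Render one data record as a training passage (assumed field mapping)."""
    lines = [
        "Please generate a math problem and options for 8th graders "
        "with the following requirements:",
        f"Problem context: {record['problemContext']}",
        f"Tested knowledge: {record['unitTitle']}",
        f"Question: {record['evaluated_problem']}",
    ]
    # Options are numbered from 1; booleans render as "True" / "False".
    for i, option in enumerate(record["options"], start=1):
        lines.append(f"Option {i}: {option['optionText']}")
        lines.append(f"Is correct: {option['correct']}")
    return "\n".join(lines)


record = {
    "options": [
        {"optionText": "Multiply 500 by 3/5 to get 300 tons.", "correct": True},
        {"optionText": "Divide 500 by 3 to get 166.67 tons.", "correct": False},
    ],
    "problemContext": "Environmental issues",
    "evaluated_problem": (
        "A town's recycling plant recycles plastic and glass in a ratio of 3:2. "
        "If the plant processes 500 tons of recyclables, how much of it is plastic?"
    ),
    "unitTitle": "Solving Multi-Step Problems with Proportional Relationships",
}

passage = format_training_passage(record)
print(passage)
```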
And here is an example passage generated by the fine-tuned model:
```
Please generate a math problem and 2 to 4 options for 8th graders with the following requirements:
Problem context: Video Games
Tested knowledge: Expressions and Equations
Question: In a video game, the power of a character's weapon is calculated by the formula \(a^b\). If the weapon's power is \(2^{4}\), what is the value of \(a\) and \(b\)?
Option 1: \(a = 2\) and \(b = 4\)
Is correct: True
Option 2: \(a = 4\) and \(b = 2\)
Is correct: False
Option 3: \(a = 2\) and \(b = 2\)
Is correct: False
Option 4: \(a = 2\) and \(b = 8\)
Is correct: False
```
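Because the output follows a fixed line-based layout, it can be parsed back into a structure similar to the training records. The following is a minimal sketch that assumes the model keeps the `Question:` / `Option N:` / `Is correct:` layout shown above; `parse_generation` is a hypothetical helper, and real generations may need more defensive handling.

```python
import re


def parse_generation(text: str) -> dict:
    """Split a generated passage into a question and its options."""
    question = None
    options = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Question:"):
            question = line[len("Question:"):].strip()
        elif re.match(r"Option \d+:", line):
            # Everything after the first colon is the option text.
            options.append({"optionText": line.split(":", 1)[1].strip(), "correct": None})
        elif line.startswith("Is correct:") and options:
            # The label applies to the most recently seen option.
            options[-1]["correct"] = line.split(":", 1)[1].strip() == "True"
    return {"question": question, "options": options}


# Shortened version of the generated example above (raw string keeps the backslashes).
sample = r"""Problem context: Video Games
Tested knowledge: Expressions and Equations
Question: If the weapon's power is \(2^{4}\), what is the value of \(a\) and \(b\)?
Option 1: \(a = 2\) and \(b = 4\)
Is correct: True
Option 2: \(a = 4\) and \(b = 2\)
Is correct: False"""

parsed = parse_generation(sample)
print(parsed["question"])
print([opt["correct"] for opt in parsed["options"]])
```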