---
library_name: transformers
tags: []
---

# Model Card for DukeNLP/Prob-Gen-8B

This model is a 4-bit QLoRA fine-tune of [Llama-3-8B from Meta](https://huggingface.co/meta-llama/Meta-Llama-3-8B), trained on 3,644 GPT-4-generated grade school math word problems. Given a specified problem context and tested knowledge, it generates multiple-choice math word problems.
<!-- 
## Model Details

### Model Description

- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed] -->

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

The model can be loaded with Hugging Face's Transformers library:
``` python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DukeNLP/Prob-Gen-8B"

# device_map="auto" places the weights on the available device(s).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", trust_remote_code=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Please generate a math problem and 2 to 4 options for 8th graders with the following requirements:\nProblem context: <specified-context>\nTested knowledge: <specified-knowledge>"

# Move the inputs to the model's device rather than hard-coding "cuda".
model_input = tokenizer(prompt, return_tensors="pt").to(model.device)

# Unpacking passes the attention mask together with the input IDs.
model_output = model.generate(**model_input, max_new_tokens=256)

# batch_decode returns a list with one string per sequence; print the first.
print(tokenizer.batch_decode(model_output, skip_special_tokens=True)[0])
```

<!-- ## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

<!-- [More Information Needed]

<!-- ### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations.

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 --> 

<!-- ## Training Details -->

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

The model is fine-tuned on 3,644 GPT-4-generated 8th-grade problems that were also annotated and evaluated by humans. An example data point is shown below:
``` json
{
	"options": [
		{
			"optionText": "Multiply 500 by 3/5 to get 300 tons.",
			"correct": true
		},
		{
			"optionText": "Divide 500 by 3 to get 166.67 tons.",
			"correct": false
		}
	],
	"problemContext": "Environmental issues",
	"evaluated_problem": "A town's recycling plant recycles plastic and glass in a ratio of 3:2. If the plant processes 500 tons of recyclables, how much of it is plastic?",
	"unitTitle": "Solving Multi-Step Problems with Proportional Relationships"
}
```
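
A record like this could be flattened into a training passage roughly as follows. This is only a sketch: the helper name `format_record` is hypothetical, and the line layout is an assumption inferred from the sample passage in the Sample Generation section, not the authors' actual preprocessing code.

``` python
def format_record(record: dict) -> str:
    """Flatten one data point into a prompt-plus-answer text passage.

    The template is an assumption modelled on the sample training passage
    in this card; the real preprocessing may differ.
    """
    lines = [
        "Please generate a math problem and options for 8th graders "
        "with the following requirements:",
        f"Problem context: {record['problemContext']}",
        f"Tested knowledge: {record['unitTitle']}",
        f"Question: {record['evaluated_problem']}",
    ]
    # Each option becomes two lines: its text and its correctness flag.
    for i, option in enumerate(record["options"], start=1):
        lines.append(f"Option {i}: {option['optionText']}")
        lines.append(f"Is correct: {option['correct']}")
    return "\n".join(lines)


record = {
    "options": [
        {"optionText": "Multiply 500 by 3/5 to get 300 tons.", "correct": True},
        {"optionText": "Divide 500 by 3 to get 166.67 tons.", "correct": False},
    ],
    "problemContext": "Environmental issues",
    "evaluated_problem": (
        "A town's recycling plant recycles plastic and glass in a ratio of 3:2. "
        "If the plant processes 500 tons of recyclables, how much of it is plastic?"
    ),
    "unitTitle": "Solving Multi-Step Problems with Proportional Relationships",
}

print(format_record(record))
```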

### Prompting
The model can be prompted with the following template:
``` python
"""Please generate a math problem and 2 to 4 options for 8th graders with the following requirements:
Problem context: <specified-context>
Tested knowledge: <specified-knowledge>"""
```
The contexts used in the dataset are:
```
"Video Games",
"Fashion",
"Influencers/YouTubers",
"Apps and Technology",
"Movies/TV shows",
"Sports",
"Music and Concerts",
"Social Media",
"Environmental issues"
```
The tested knowledge areas in the dataset are:
```
"Operations with Rational Numbers",
"Expressions and Equations",
"Surface Area and Volume",
"Arithmetic in Base Ten",
"Evaluating Numeric Expressions",
"Properties and Theorems of Angles",
"Data Sets",
"Rational Number Arithmetic",
"Functions and Volume",
"Linear Equations and Linear Systems",
"Representing Data and Distributions",
"Algebraic Expressions",
"Ratios and Rates",
"Solving Equations and Systems of Equations",
"Operations with Integers",
"Scatter Plots",
"Solving Percentage Problems with Proportional Relationships",
"Associations in Data",
"Expressions, Equations, and Inequalities",
"Linear Relationships",
"Representing Data",
"Solving Multi-Step Problems with Proportional Relationships",
"Dividing Fractions",
"Area, Surface Area, and Volume",
"Equivalent Algebraic Expressions",
"Key Features of Linear Equations",
"Proportional Relationships and Percentages",
"Transformations",
"Representing Proportional Relationships"
```
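
The template and the context list above can be combined in a small helper, sketched below. The names `build_prompt`, `PROMPT_TEMPLATE`, and `CONTEXTS` are hypothetical and not part of the model or of Transformers. The warning for unseen contexts is a design assumption: the model may still respond to contexts outside the training set, but its behavior there is not documented in this card.

``` python
import warnings

# Contexts listed in this card as appearing in the training data.
CONTEXTS = {
    "Video Games",
    "Fashion",
    "Influencers/YouTubers",
    "Apps and Technology",
    "Movies/TV shows",
    "Sports",
    "Music and Concerts",
    "Social Media",
    "Environmental issues",
}

PROMPT_TEMPLATE = (
    "Please generate a math problem and 2 to 4 options for 8th graders "
    "with the following requirements:\n"
    "Problem context: {context}\n"
    "Tested knowledge: {knowledge}"
)


def build_prompt(context: str, knowledge: str) -> str:
    """Fill the prompt template, warning if the context was not seen in training."""
    if context not in CONTEXTS:
        warnings.warn(f"Context {context!r} is not among the training contexts.")
    return PROMPT_TEMPLATE.format(context=context, knowledge=knowledge)


print(build_prompt("Sports", "Ratios and Rates"))
```

The returned string can be passed to the tokenizer in place of a hand-written prompt.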

### Sample Generation

Here is an example passage from the training data:
```
Please generate a math problem and options for 8th graders with the following requirements:
Problem context: Movies/TV shows
Tested knowledge: Representing Data and Distributions
Question: Lucas counted the number of episodes in 12 seasons of a TV show. He recorded: 48, 51, 52, 55, 56, 58, 59, 60, 61, 62, 65, 67. How should he create a frequency table for the number of episodes?
Option 1: Group the data into intervals of 10, then count the number of seasons in each interval.
Is correct: False
Option 2: Group the data into intervals of 5 starting from 45, then count the number of seasons in each interval.
Is correct: True
```
And here is an example passage generated from the fine-tuned model:
```
Please generate a math problem and 2 to 4 options for 8th graders with the following requirements:
Problem context: Video Games
Tested knowledge: Expressions and Equations
Question: In a video game, the power of a character's weapon is calculated by the formula \(a^b\). If the weapon's power is \(2^{4}\), what is the value of \(a\) and \(b\)?
Option 1: \(a = 2\) and \(b = 4\)
Is correct: True
Option 2: \(a = 4\) and \(b = 2\)
Is correct: False
Option 3: \(a = 2\) and \(b = 2\)
Is correct: False
Option 4: \(a = 2\) and \(b = 8\)
Is correct: False
```
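
Because the generated text follows a regular `Question:` / `Option N:` / `Is correct:` layout, it can be parsed back into a structured record. The sketch below assumes the output keeps exactly that line-based layout, which real generations may not always do. The function name `parse_generation` is hypothetical, and the output keys simply mirror the training data fields.

``` python
import re


def parse_generation(text: str) -> dict:
    """Parse a generated passage into a question and a list of options."""
    result = {"question": None, "options": []}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Question:"):
            result["question"] = line[len("Question:"):].strip()
        elif re.match(r"Option \d+:", line):
            # Split only on the first colon so colons in the option text survive.
            option_text = line.split(":", 1)[1].strip()
            result["options"].append({"optionText": option_text, "correct": None})
        elif line.startswith("Is correct:") and result["options"]:
            # Attach the flag to the most recently seen option.
            value = line[len("Is correct:"):].strip().lower()
            result["options"][-1]["correct"] = value == "true"
    return result


sample = """Please generate a math problem and options for 8th graders with the following requirements:
Problem context: Movies/TV shows
Tested knowledge: Representing Data and Distributions
Question: Lucas counted the number of episodes in 12 seasons of a TV show. He recorded: 48, 51, 52, 55, 56, 58, 59, 60, 61, 62, 65, 67. How should he create a frequency table for the number of episodes?
Option 1: Group the data into intervals of 10, then count the number of seasons in each interval.
Is correct: False
Option 2: Group the data into intervals of 5 starting from 45, then count the number of seasons in each interval.
Is correct: True"""

parsed = parse_generation(sample)
print(parsed["question"])
for option in parsed["options"]:
    print(option["correct"], option["optionText"])
```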