---
license: llama2
inference: false
datasets:
  - AtlasUnified/atlas-math-sets
---

This experiment was inspired by the paper "Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks", which reports strong results from fine-tuning LLaMA models on arithmetic data.

philschmid/Llama-2-7b-hf was fine-tuned on 2.8M math problems from the AtlasUnified/atlas-math-sets dataset.

Training ran on a trn1.32xlarge instance. The model in this repository is compiled for 2 Neuron cores, so it runs on AWS inf2.8xlarge and larger instances.

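The table below compares answers from the base Llama2-7B model and the fine-tuned model on basic math questions. The fine-tuned model answers most of the addition, subtraction, and multiplication questions correctly, and without the extraneous text the base model produces, but it still misses the division and root questions.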

| Question | Correct Answer | Llama2-7B Answer | llama2-7B-AtlasMath-2.8M Answer |
|---|---|---|---|
| What is the addition of 591 + 642 | 1,233 | $354$ | 1,233 |
| Multiply these numbers: 3 * 12 | 36 | If you multiply the numbers in an array by one another, the resulting number is called the product of those numbers. For example, 3 * 12 = 36. | 36 |
| Can you determine the result of 100 / 12 | 8.333 | 16.66666666666667 ### Solution First we can write 100 as 100 * 0.1 Next we can write 12 as 12 * 0.1 Next we can write | 0.8 |
| Can you solve this equation: 9 * 7 | 63 | ### Instruction Can you solve this equation: 2 * 9 ### Answer 18 | 63 |
| I need the sum of 3000 + 2635 | 5,635 | ### Output 2635 ### Error You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '+ 2635' at line 1 | 5,635 |
| Can you figure out the cube root of 99 | 4.626 | 11.85 | 99.85 |
| What do you get when you add 8711 + 5691 | 14,402 | 6402 ### Source Advent of Code | 14,402 |
| Multiply 123 * 123 | 15,129 | 55,609 ### Hint 1 What's the difference between this expression and this one? 123 * 123 = 14,999 | 15,069 |
| Please compute the square root of 10 | 3.162 | 3.160 | 3303.88 |
| Please find the sum of 1000 + 52 | 1,052 | 352 ### Solution `python def solution(n): return sum(map(lambda x: x + 52, range(n)))` | 6256 |
| Subtract 3 from 28938 | 28,935 | -1260 ### Explanation This numeric expression can be rewritten as `28937 + (-1) - (-1)` and evaluated as `28938 - 3`. ### Notes - The `-3` does not | 28935 |
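
A comparison like the one in the table can be scored automatically. The sketch below is one possible way to do it, not the method used to produce the table: `parse_number`, `is_correct`, and the relative tolerance of `1e-3` are assumptions introduced here for illustration. It takes the first number in each answer, ignores thousands separators, and compares it with the expected value. The sample rows are copied from the fine-tuned model's column above.

```python
import math
import re

# Hypothetical helpers for illustration; not part of the model card or any library.
_NUMBER = re.compile(r"-?\d[\d,]*\.?\d*")


def parse_number(text):
    """Return the first number found in `text` as a float, or None if there is none."""
    match = _NUMBER.search(text)
    if match is None:
        return None
    # Drop thousands separators such as the comma in "1,233".
    return float(match.group().replace(",", ""))


def is_correct(answer, expected, rel_tol=1e-3):
    """Compare the first number in `answer` with `expected` (tolerance is an assumption)."""
    value = parse_number(answer)
    target = parse_number(expected)
    if value is None or target is None:
        return False
    return math.isclose(value, target, rel_tol=rel_tol)


# (correct answer, fine-tuned model answer) pairs taken from the table.
rows = [
    ("1,233", "1,233"),     # 591 + 642
    ("4.626", "99.85"),     # cube root of 99
    ("15,129", "15,069"),   # 123 * 123
    ("3.162", "3303.88"),   # square root of 10
    ("28,935", "28935"),    # 28938 - 3
]

results = [is_correct(answer, expected) for expected, answer in rows]
accuracy = sum(results) / len(results)
print(results, accuracy)
```

Because only the first number is read, a verbose answer such as the base model's reply to `3 * 12` would be scored on the `3` that opens its worked example rather than on the final `36`, so this check is stricter than a human reading of the table.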

To use the model:

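```python
from optimum.neuron import pipeline

# Load the precompiled Neuron model; the tokenizer comes from the base model repository.
p = pipeline('text-generation', model="hartmani/llama2-7B-AtlasMath-2.8M", tokenizer="philschmid/Llama-2-7b-hf")

# Sampling is enabled, so the generated text can differ between runs.
p("What is the addition of 591 + 642", max_new_tokens=64, do_sample=True, top_k=50)
```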
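
Text-generation pipelines typically return a list of dictionaries with a `generated_text` field that echoes the prompt before the completion. Assuming the Neuron pipeline follows that convention (not confirmed here), the answer can be separated from the prompt with a small helper. `completion_only` is a hypothetical name, and `mock_outputs` is a made-up stand-in for a real pipeline result so the sketch runs without Neuron hardware.

```python
def completion_only(prompt, outputs):
    """Strip the echoed prompt from each generated text, if it is present."""
    completions = []
    for item in outputs:
        text = item["generated_text"]
        if text.startswith(prompt):
            text = text[len(prompt):]
        completions.append(text.strip())
    return completions


prompt = "What is the addition of 591 + 642"

# Stand-in for the value returned by p(prompt, ...); the text is invented for illustration.
mock_outputs = [{"generated_text": prompt + "\n1,233"}]

print(completion_only(prompt, mock_outputs))  # ['1,233']
```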

Disclaimer: This model was generated for research purposes only and may produce inconsistent or inaccurate results. There are obviously far better ways to have computers perform basic math calculations. This model simply demonstrates the ease of teaching a Llama2-7B model basic math techniques.