Wrong solution for 1+1=

#18 opened by yixliu1

I tried several ways (GPU with float16, GPU only, CPU only, different max token lengths) to generate an answer for 1+1. Here are some of the answers I got:

[screenshots of the model outputs]

Here is my code:

[screenshot of the generation code]
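Since the screenshot isn't reproduced here, the following is only a rough stand-in for what generation code along those lines typically looks like; the model name, dtype, and token limit are assumptions, not the original values:

```python
# Rough stand-in for the generation code in the screenshot (model name,
# dtype, and max_new_tokens are assumptions, not the original values).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-base-model"  # hypothetical; not the actual checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # one of the variants tried: GPU + float16
    device_map="auto",
)

inputs = tokenizer("1+1=", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```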

LLMs can't do math reliably without external assistance.

I don't know what you expect here.

Hi,
Thanks for your answer. I also tried a few other examples, for instance asking it to generate a prompt based on my needs, and asking it to answer some questions based on context I provide. Neither of them gave useful results.

You're asking a base model to solve problems.

What you want is the Instruct variant. Base isn't suitable for this.

So the base model is meant for further fine-tuning, while the instruct model is for solving problems?
Thanks for sharing!

The base model is a raw LLM; it ONLY does text completion.
Instruct has been tuned to respond to you instead.
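As a minimal sketch (assuming a Transformers setup; the model name below is a placeholder, not a specific checkpoint), an Instruct variant is normally prompted through its chat template rather than with raw text:

```python
# Minimal sketch of prompting an Instruct variant via its chat template
# (the model name is a placeholder, not a specific checkpoint).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Instruct models expect a chat-formatted prompt, not raw text to complete.
messages = [{"role": "user", "content": "What is 1+1?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```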

Understood, thanks!

"1+1=3" doesn't necessarily mean that the model was wrong. The '1+1=3' can mean many different things, such as irony, a metaphor for synergy, and it can even be the start of an equation like '1+1=3-1' which also is correct.

The issue is NOT that the model is incapable of such a simple operation! It's because it doesn't understand what you actually want from it. If you want to ensure the model knows what you mean, you either have to fine-tune it or give an example by prefacing the equation, for example, '5+3=8 1+1=', and now it's obvious that the expected answer is the sum of 1 and 1.

Several examples:

Input: '5+3=8 1+1=', the model outputs '2'
Input: 'Sum of: 1+1=', the model outputs '2'
Input: 'Sum of: 62+16=', the model outputs '78'
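As a sketch of how that prefacing trick can be driven from code (the model name is a placeholder, and the exact outputs depend on the checkpoint):

```python
# Few-shot prefacing on a base model: one worked example makes "complete the
# arithmetic" the most likely continuation. Model name is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-base-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "5+3=8 1+1="
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=3, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```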

You can certainly nudge an LLM in the right direction, but they are fundamentally incapable of arithmetic or real logical operations without external help.
Don't mistake getting simple calculations right for the ability to do maths.

"1+1=3" doesn't necessarily mean that the model was wrong. The '1+1=3' can mean many different things, such as irony, a metaphor for synergy, and it can even be the start of an equation like '1+1=3-1' which also is correct.

The issue is NOT that the model is incapable of such a simple operation! It's because it doesn't understand what you actually want from it. If you want to ensure the model knows what you mean, you either have to fine-tune it or give an example by prefacing the equation, for example, '5+3=8 1+1=', and now it's obvious that the expected answer is the sum of 1 and 1.

Several examples:

Input: '5+3=8 1+1=', the model outputs '2'
Input: 'Sum of: 1+1=', the model outputs '2'
Input: 'Sum of: 62+16=', the model outputs '78'

Hi Satoszi,
Thanks for sharing! I hadn't thought about it from that side; that's quite interesting. It's like the LLM has many "capabilities" for answering this question, but without fine-tuning it doesn't know which one it should give.

Of course you are right that LLMs are not good at arithmetic, and they are not built for that. We can never fully trust LLM output on arithmetic problems (or other domains, for that matter 🙂). It will give an approximation that looks legit, but for simple operations like sums of small numbers that approximation should usually be correct.

"1+1=3" doesn't necessarily mean that the model was wrong. The '1+1=3' can mean many different things, such as irony, a metaphor for synergy, and it can even be the start of an equation like '1+1=3-1' which also is correct.

The issue is NOT that the model is incapable of such a simple operation! It's because it doesn't understand what you actually want from it. If you want to ensure the model knows what you mean, you either have to fine-tune it or give an example by prefacing the equation, for example, '5+3=8 1+1=', and now it's obvious that the expected answer is the sum of 1 and 1.

Several examples:

Input: '5+3=8 1+1=', the model outputs '2'
Input: 'Sum of: 1+1=', the model outputs '2'
Input: 'Sum of: 62+16=', the model outputs '78'

Hi Satoszi,
Thx for your sharing! I haven't though from that side. I think that's quite interesting. It's like LLM has many "capabilities" to answer this question but without FT it doesn't know which one it should give.

Yeah vanilla LLMs without reinforcement or other fancy finetuning methods are pretty "stupid" 😁
