google/gemma-2b · Strange and limited response

Feb 22, 2024

If the mistake is in my usage, please point this out, but I am using the demo code with only a changed input prompt and the max_length change described below.

Gemma-7b runs really really slowly (on a too-small-for-this-use-case RTX3050) but produces no better answers - worse in some cases.

The output using the demo code seems to be truncated. I am only getting a single line, and that often looks incomplete. Changing the generate line to
outputs = model.generate(**input_ids, max_length=500)
means that I eventually get a longer answer, but not a good one.

Example 1: A completly wrong answer, but it least it is still about language use, although the final test question has no correct answer.
Example 2: The question seems to be misinterpreted, and even then answered wrongly.
Example 3: Asking for an EFFICIENT sort routine returns a overly complicated and broken version of a bubble sort. The code is returned as a single line, which is difficult to intrepret back into the correctly indented structure. It returns the original list without sorting it if the return statement is from the sort_list() function, or returns nothing if it is the return from the sort() function. Even if both functions are given a suitable return value - return list from sort() and return sort(list) from sort_list() - the sort is in reverse.

e.g. 1
input_text = "What is the format of a Japanese haiku poem?"
Output:
What is the format of a Japanese haiku poem?

In the following sentence, underline the correct modifier from the pair given in parentheses. Example 1. He is (real, $\underline{\text{really}}$) talented at fixing cars.

I’m (slow, slowly) to understand geometrical concepts.

What is the correct present tense form of the verb?
El cliente no ____ (pedir) nada de beber.

In the following sentences, underline the correct form of the pronoun In parentheses.

Example: Please introduce the new students (who, $\underline{\textit{whom}}$) you invited to the party to the rest of the guests.

To (we, its), the important thing is to get started and begin experimenting.

gemma-7b returns a similar style but differently worded grammar test, with three tests of the 'correct modifier' style, the first two being the same as each other.

e.g. 2
input_text = "Count from 1 to 10" Note that this gets changed by the time it reaches the model.
Output:
Count from 1 to 100 in 100 different ways.

This is a great way to practice counting and number recognition.

...

This is a great way to practice counting and number recognition.

[Only 37 repetitions, so that is presumably the limit of 500 tokens]

gemma-7b returns:
Count from 1 to 10 with this fun and engaging counting activity.

This activity is perfect for your preschoolers and kindergarteners.

This activity is great for your math centers, morning work, or early finishers.

What's Included:

10 pages of counting from 1 to 10
10 pages of counting from 11 to 20

How to Use:

Print and laminate the pages.
Cut out the numbers.
Students will count the number of objects on each page and then place the correct number on the page.

You May Also Like:

Number Sense Activities
Number Sense Worksheets
Number Sense Activities
Number Sense Worksheets

Follow me on:

Instagram
Facebook
Pinterest
Blog

Terms of Use:

This product is for personal use only.
You may not share, sell, or distribute the product.
You may not claim the product as your own.
You may not alter the product in any way.
You may not use the product for commercial purposes.
You may not post the product on any website or social media platform.
You may not use the product for any illegal or unethical purposes.
You may not use the product for any purpose that violates the rights of the copyright holder.
You may not use the product for any purpose that is not specifically permitted by the terms of use.
If you have any questions or concerns about the terms of use, please contact me at info@thelittlekindergarten.com.

By purchasing or downloading this resource, you are agreeing to the terms of use.

e.g. 3
input_text = "Give a python function to efficiently sort a list passed as an input parameter."
Output:
Give a python function to efficiently sort a list passed as an input parameter.

Answer:

Step 1/4

We need to create a function that takes a list as an input parameter.

Step 2/4
2. We need to create a function that compares two elements of the list and returns a negative number if the first element is less than the second element, a positive number if the first element is greater than the second element, and zero if the two elements are equal.

Step 3/4
3. We need to create a function that swaps two elements of the list if they are not equal.

Step 4/4
4. We need to create a function that iterates through the list and swaps two elements if they are not equal. Here's the Python code for the function: def sort_list(list): def compare(a, b): if a < b: return -1 if a > b: return 1 if
a == b: return 0 def swap(a, b): list[a], list[b] = list[b], list[a] def sort(list): for i in range(len(list)): for j in range(i+1, len(list)): if compare(list[i], list[j]) == -1: swap(i, j) return list # Example usage: list = [1, 2, 3, 4, 5] sorted_list = sort_list(list) print(sorted_list) # Output: [1, 2, 3, 4, 5]

gemma-7b gives an answer of:
Answer:
def sort(list):
for i in range(len(list)): if list[i] > list[i+1]: list[i] = list[i+1] elif list[i] < list[i+1]: list[i] = list[i+1] else: list[i] = list[i]
which is complete nonsense, overwrites list entries instead of swapping pairs, only performs a single pass so could not move late entries to the start even if it did swap them properly, does not need the final else clause, and goes out of bounds on the final loop counter.

Just in case python import libary versions make a difference, my venv contains:
Package Version

accelerate 0.26.1
certifi 2023.11.17
charset-normalizer 3.3.2
colorama 0.4.6
diffusers 0.25.0
filelock 3.13.1
fsspec 2023.12.2
huggingface-hub 0.20.3
idna 3.6
importlib-metadata 7.0.1
Jinja2 3.1.3
MarkupSafe 2.1.3
mpmath 1.3.0
networkx 3.2.1
numpy 1.26.3
packaging 23.2
pillow 10.2.0
pip 23.2.1
psutil 5.9.7
PyYAML 6.0.1
regex 2023.12.25
requests 2.31.0
safetensors 0.4.2
scipy 1.11.4
setuptools 68.2.0
sympy 1.12
tokenizers 0.15.0
torch 2.0.1+cu117
tqdm 4.66.1
transformers 4.38.1
typing_extensions 4.9.0
urllib3 2.1.0
wheel 0.41.2
zipp 3.17.0

suryabhupa

Google org Feb 24, 2024

Thanks for sharing the prompts and the examples!

Did you apply the proper chat formatting with etc.? If this also fails, do you mind sending more prompts that you've used in your workflows, and I can take a look?

Squeack

Feb 24, 2024

•

edited Feb 26, 2024

The initial code was originally taken exactly from the code for GPU usage on the model card, then with the input_text prompt changed to see if it gave longer answers than the request for a poem was giving, then max_length (or max_new_tokens) set to a higher level when it was not.
I am not applying any kind of chat template - according to the model card this only seems necessary for the -it models. Should I be using those instead?

Renu11

Google org Aug 2, 2024

This comment has been hidden