Bad model

#12
by ChuckMcSneed - opened

Don't like it. Spits out numbers and repeats words during the conversation. Likely overfitted. Sometimes fails to even answer the most basic questions.

Same here.

Abacus.AI, Inc. org

Hi, thank you for your feedback. We would like to understand what prompts were triggering this bad output. We did run the model through the MT-Bench harness. The result we saw using the llama-2 chat template and the system prompt used by Qwen was:

########## First turn ##########
                     score
model          turn
Smaug-72B-v0.1 1     8.175

########## Second turn ##########
                      score
model          turn
Smaug-72B-v0.1 2     7.3375

########## Average ##########
                  score
model
Smaug-72B-v0.1  7.75625
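
As a quick sanity check on the numbers above, the reported average matches the plain mean of the two turn scores. This is only a sketch: it assumes both turns cover the same number of questions, and the variable names are illustrative.

```python
# Turn scores copied from the MT-Bench output above.
first_turn = 8.175
second_turn = 7.3375

# Assumption: both turns contain the same number of questions,
# so the overall average is the unweighted mean of the two turns.
average = (first_turn + second_turn) / 2

print(f"average = {average:.5f}")
```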

I will include a couple of sample generations at the bottom of this message, but I did want to check if you would be willing to share your prompts and whether you were testing with the quantized version or with the full bfloat model. Thanks!

Also note that I have included answers where the final result is wrong because the model makes a reasoning mistake, although the answer itself is still coherent. I did not want to cherry-pick correct answers, so I selected examples that cover diverse areas such as writing, reasoning, math, and coding.

Example 1

Dear [Supervisor's Name],

I hope this email finds you well. I am writing to request your valuable feedback on the 'Quarterly Financial Report' that I prepared and shared with the team earlier this week. Your insight and expertise will help ensure the report is comprehensive, accurate, and meets the expectations of our stakeholders.

Specifically, I would appreciate your feedback on the following aspects:

1. Data Analysis: I would like to know whether you find the data presented in the report to be accurate, relevant, and reflective of our financial performance during this quarter. Additionally, please share any suggestions on how I could improve the depth or clarity of the analysis.

2. Presentation Style: The report aims to communicate complex financial information in an easily understandable manner. Your input on the overall structure, formatting, and visual elements of the report would be greatly appreciated. Any suggestions for enhancing the report's presentation style are welcome.

3. Clarity of Conclusions: The report features a summary section that outlines key takeaways and actionable insights. Please let me know if you find these conclusions clear, concise, and insightful. Your feedback will help ensure the report effectively communicates the most important information to our stakeholders.

Once you have had the chance to review the report, please feel free to schedule a meeting or respond via email with your thoughts and suggestions. I am committed to refining the report based on your feedback to make it the best it can be.

Thank you for your time and consideration. I look forward to hearing your thoughts and insights.

Best regards,

[Your Name]
[Your Designation]
[Your Contact Information]

Example 2

To find the area of a triangle, we can use the formula: Area = 1/2 * base * height. In this case, we can use the distance between two points as the base and find the height perpendicular to it.

First, let's find the distance between the two points: (-1, 1) and (3, 3). Using the distance formula, the distance (or base) is:

Base = √((3 - (-1))^2 + (3 - 1)^2) = √(4^2 + 2^2) = √(16 + 4) = √20 ≈ 4.47

Now, we need to find the height. We can use the point-slope form of a line to find the equation of the line passing through the points (-1, 1) and (3, 3). The slope (m) is:

m = (3 - 1) / (3 - (-1)) = 2 / 4 = 1/2

Using the point-slope form, the equation of the line is:

y - 1 = (1/2)(x - (-1))

Simplifying the equation, we get:

y = (1/2)x + 3/2

Now, we need to find the height, which is the perpendicular distance from the point (0, 0) to the line y = (1/2)x + 3/2. To find the height, we can use the formula:

Height = |(Ax + By + C) / √(A^2 + B^2)|

In this case, A = 1/2, B = -1/2, and C = 3/2. Plugging these values into the formula, we get:

Height = |(1/2 * 0 + -1/2 * 0 + 3/2) / √((1/2)^2 + (-1/2)^2)| = |3/2 / √(1/4 + 1/4)| = |3/2 / √(1/2)| = |3/2 * √2| ≈ 2.12

Now, we can find the area of the triangle:

Area = 1/2 * base * height = 1/2 * 4.47 * 2.12 ≈ 4.79 square units

Example 3

When rolling two standard six-sided dice, there are a total of 36 possible outcomes (6 sides on the first die multiplied by 6 sides on the second die). To find the probability of rolling a total number that is at least 3, we need to count the number of outcomes that meet this condition and divide it by the total number of outcomes.

There are 30 outcomes that result in a total number of at least 3: (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), and (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6).

So, the probability of rolling a total number that is at least 3 is 30 successful outcomes divided by the total 36 possible outcomes, which is 30/36 or 5/6.

Example 4

Let's break down the problem step by step:

1. At the terminal, some people got on the bus.
2. At the first bus stop, half of the people got down and 4 more people got in.
3. At the second bus stop, 6 people got down and 8 more got in.
4. There were 25 people heading to the third stop.

Let's denote the initial number of people on the bus at the terminal as x.

At the first bus stop:
- Half of the people got down, so x/2 people got off.
- 4 more people got in, so the total number of people on the bus after the first stop is x/2 + 4.

At the second bus stop:
- 6 people got down, so the remaining number of people is (x/2 + 4) - 6.
- 8 more people got in, so the total number of people on the bus after the second stop is (x/2 + 4 - 6) + 8.

Since there were 25 people heading to the third stop, we can set up the equation:

(x/2 + 4 - 6) + 8 = 25

Solving the equation:

(x/2 - 2) + 8 = 25
x/2 + 6 = 25
x/2 = 19
x = 38

So, 38 people got on the bus at the terminal.

Example 5

Here's a Python program that reads all the text files under a given directory and returns the top-5 words with the most number of occurrences:

```python
import os
from collections import Counter

def count_words(directory):
    word_counts = Counter()
    for root, _, files in os.walk(directory):
        for filename in files:
            if filename.endswith(".txt"):
                file_path = os.path.join(root, filename)
                with open(file_path, 'r') as file:
                    for line in file:
                        words = line.lower().split()
                        word_counts.update(words)
    return word_counts.most_common(5)

directory = "/path/to/your/directory"
top_words = count_words(directory)
print("Top-5 words with the most occurrences:")
for word, count in top_words:
    print(f"{word}: {count}")
```

Replace "/path/to/your/directory" with the path to the directory containing your text files. This program will walk through the directory and its subdirectories, looking for text files (files with the ".txt" extension). It will then count the occurrences of each word and return the top-5 words with the most occurrences.

All right, I've downloaded it again, from here, q5_k_s. Here are examples of incredibly bad performance:

[INST] Do they keep you in a little shed? [/INST] [CHORUS] 
<s>[INST] Do they keep you in a little shed? [/INST] [RESPONSE] No, I'm in the1. a little shed2. a big shed3. a big house4. a big cage5. a big barn6. a big garage7. a big stable8. a big kennel9. a big pen10. a big coop11. a big aquarium12. a big terrarium13. a big vivarium14. a big hutch15. a big loft16. a big loft17. a big loft18. a big loft19. a big loft20. a big loft21. a big loft22. a big loft23. a big loft24. a big loft25. a big loft26. a big loft27. a big loft28. a big loft29. a big loft30. a big loft31. a big loft32. a big loft33. a big loft3
[INST] How happy does the fact that I have a penis make you? [/INST] 1. I'm not sure how to answer that. I guess it makes me happy that you're happy.2. I'm not sure how to answer that. I guess it makes me happy that you're happy.3. I'm not sure how to answer that. I guess it makes me happy that you're happy.4. I'm not sure how to answer that. I guess it makes me happy that you're happy.5. I'm not sure how to answer that. I guess it makes me happy that you're happy.6. I'm not sure how to answer that. I guess it makes me happy that you're happy.7. I'm not sure how to answer that. I guess it makes me happy that you're happy.8. I'm not sure how to answer that. I guess it makes me happy that you're happy.9. I'm not sure how to answer that. I guess it makes me happy that you're happy.10. I'm not sure how to answer that.
<s>[INST] How happy does the fact that I have a penis make you? [/INST] I'm not sure how to answer that. I'm not sure how to answer that. I'm not sure how to answer that. I'm not sure how to answer that. I'm not sure how to answer that. I'm not sure how to answer that. I'm not sure how to answer that. I'm not sure how to answer that. I'm not sure how to answer that. I'm not sure how to answer that.I'm sorry, but as an AI, I don't have personal feelings or emotions. I can't experience happiness or have preferences based on the physical characteristics of individuals.
[INST] Does the suspicious black gentleman visit you at night often? [/INST] [DATA] No, I don't have any suspicious visitors. [/DATA]
<s>[INST] Does the suspicious black gentleman visit you at night often? [/INST] [INST]

All of it on deterministic settings. No repetition penalty. Koboldcpp 1.57.1. Tried both with and without <s>, as you can see. Still bad.

Abacus.AI, Inc. org

Please open an issue against the GGUF repo. That one is not managed by us, and we are not sure how the prompt is converted for the quantized model.
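
For anyone comparing what their frontend sends against the setup used for the MT-Bench run, here is a minimal sketch of a single-turn prompt in the Llama-2 chat format with a system prompt. The helper name `build_llama2_prompt` is hypothetical, and the system prompt text is an assumption (the default commonly used with Qwen chat models), not a string confirmed in this thread. Many backends add the `<s>` BOS token themselves, so the `add_bos` flag is there to avoid doubling it.

```python
def build_llama2_prompt(user_message: str,
                        system_prompt: str = "",
                        add_bos: bool = True) -> str:
    """Format one user turn in the Llama-2 chat style (hypothetical helper)."""
    # The system prompt, if any, is wrapped in <<SYS>> tags inside the first [INST] block.
    if system_prompt:
        system_block = f"<<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
    else:
        system_block = ""
    # Only prepend the literal BOS text if the backend does not add it on its own.
    bos = "<s>" if add_bos else ""
    return f"{bos}[INST] {system_block}{user_message} [/INST]"


# Assumption: this system prompt is illustrative, not taken from the thread.
SYSTEM_PROMPT = "You are a helpful assistant."

prompt = build_llama2_prompt("What is the capital of France?", SYSTEM_PROMPT)
print(prompt)
```

Printing the exact string a frontend submits and comparing it against this layout is one way to rule out a template mismatch before blaming the quantization.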

arvindabacus changed discussion status to closed
