Is there a big performance difference between 2-bit and 4-bit quantization in conversations?

#2
by xldistance - opened

Thanks for the model.

I haven't run a formal evaluation of the quantized model, but performance on Chinese tasks drops severely because the model now strongly tends to generate English.

The MMLU score does drop quite a bit, so I believe the performance gap exists, especially for math and reasoning tasks. The following are some samples from MT-bench and the corresponding outputs for your reference.

Q: Imagine you are participating in a race with a group of people. If you have just overtaken the second person, what’s your current position? Where is the person you just overtook?
A(2-bit): If you have just overtaken the second person, you are currently in the second position. The person you just overtook is now in the third position.
A(16-bit): If you have just overtaken the second person, that means you are now in the second position. The person you just overtook is now in the third position.
Q: If the “second person” is changed to “last person” in the above question, what would the answer be?
A(2-bit): If the "second person" is changed to "last person" in the above question, your current position would depend on how many people are participating in the race. If there are only two people in the race, then you would now be in the first position since you have overtaken the last person. However, if there are more than two people in the race and you have overtaken the last person, your current position would still be one ahead of the last person, but it may not necessarily be the second position overall. For example, if there were ten people in the race and you just overtook the last person, your current position would be ninth place.
A(16-bit): If the "second person" is changed to "last person" in the above question, that would mean you have just overtaken the last person in the race. In this case, your current position would be second to last, and the person you just overtook would be in the last position.
Q: The vertices of a triangle are at points (0, 0), (-1, 1), and (3, 3). What is the area of the triangle?
A (2-bit): Using the formula for the distance between two points in the plane, we find that the length of each side of the triangle is $2\sqrt{2}$ units. Therefore, the perimeter of the triangle is $6\sqrt{2}$ units. We can also use this information to determine the height of the triangle, which is equal to $\sqrt{2}$ units. Using the formula for the area of a triangle, we find that the area of this triangle is $\frac{1}{2} \times 6\sqrt{2} \times \sqrt{2} = 6$ square units.
A (16-bit): We can use the formula for the area of a triangle with coordinates $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$:
$$A = \frac{1}{2} |x_1(y_2 - y_3) + x_2(y_3 - y_1) + x_3(y_1 - y_2)|.$$
Using this formula, we have
\begin{align*}
A &= \frac{1}{2} |(0)(1-3) + (-1)(3-0) + (3)(0-1)| \\
&= \frac{1}{2} |-2 - 3 - 3| \\
&= \frac{1}{2}|-8| \\
&= \boxed{4}.
\end{align*}
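
As a sanity check on the two answers above, the coordinate formula quoted in the 16-bit response can be evaluated directly. The sketch below is not from the thread, and `triangleArea` is a hypothetical helper name. For the vertices (0, 0), (-1, 1), and (3, 3) it evaluates to 3, so neither quoted answer (6 or 4) matches: the 16-bit response picks the right formula but slips in the arithmetic, since the first term $(0)(1-3)$ is 0, not -2.

```cpp
#include <cmath>

// Hypothetical helper (not from the thread): area of a triangle from its
// vertex coordinates, using the same coordinate (shoelace) formula that the
// 16-bit response quotes.
double triangleArea(double x1, double y1,
                    double x2, double y2,
                    double x3, double y3) {
    return 0.5 * std::fabs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2));
}
```

Calling `triangleArea(0, 0, -1, 1, 3, 3)` gives 3.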
Q: What’s area of the circle circumscribing the triangle?
A(2-bit): To find the area of the circle circumscribing the triangle, we need to know the radius of the circle. The radius of the circle is equal to the distance from the center of the circle to any point on the circumference of the circle. In this case, the center of the circle is located at the midpoint of the triangle, which is at coordinates (1, 1). To find the radius of the circle, we can use the Pythagorean theorem to calculate the distance between the center of the circle and one of the vertices of the triangle. The distance between the center of the circle and the vertex at (-1, 1) is equal to $\sqrt{(1-(-1))^2 + (1-1)^2} = \sqrt{4} = 2$ units.
Therefore, the radius of the circle is 2 units. Using the formula for the area of a circle, we find that the area of the circle is π times the square of the radius: π times 2 squared, or π times 4, which is approximately equal to 12.57 square units.
A(16-bit): To find the radius of the circle circumscribing the triangle, we can use the circumradius formula:
$$R = \frac{abc}{4A},$$
where $a$, $b$, and $c$ are the side lengths of the triangle and $A$ is the area of the triangle.
First, let's find the side lengths of the triangle. Using the distance formula, we have
\begin{align*}
a &= \sqrt{(-1 - 0)^2 + (1 - 0)^2} = \sqrt{2}, \\
b &= \sqrt{(3 - (-1))^2 + (3 - 1)^2} = \sqrt{16 + 4} = 2\sqrt{5}, \\
c &= \sqrt{(0 - 3)^2 + (0 - 3)^2} = \sqrt{9 + 9} = 3\sqrt{2}.
\end{align*}
Next, we need to find the area of the triangle. We already calculated that in the previous question; it is 4.
Finally, we can plug these values into the circumradius formula to find the radius of the circle:
$$R = \frac{abc}{4A} = \frac{\sqrt{2} \cdot 2\sqrt{5} \cdot 3\sqrt{2}}{4 \cdot 4} = \frac{3\sqrt{10}}{4}.$$
Therefore, the area of the circle circumscribing the triangle is given by
$$\pi R^2 = \pi \left(\frac{3\sqrt{10}}{4}\right)^2 = \boxed{\frac{27\pi}{8}}.$$
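
The circumradius formula $R = \frac{abc}{4A}$ used in the 16-bit response can also be checked numerically. The following is a minimal sketch, not part of the thread; `Point`, `dist`, `shoelaceArea`, and `circumcircleArea` are hypothetical names. With side lengths $\sqrt{2}$, $2\sqrt{5}$, $3\sqrt{2}$ and an area of 3, it gives $R = \sqrt{5}$ and a circle area of $5\pi \approx 15.71$, which differs from both quoted answers ($4\pi$ and $\frac{27\pi}{8}$). This agrees with the observation that $2 + 18 = 20$, so the triangle is right-angled at (0, 0) and its hypotenuse $2\sqrt{5}$ is the diameter.

```cpp
#include <cmath>

// Hypothetical types and helpers for illustration only.
struct Point { double x, y; };

// Euclidean distance between two points.
double dist(Point p, Point q) {
    return std::hypot(p.x - q.x, p.y - q.y);
}

// Triangle area via the coordinate (shoelace) formula.
double shoelaceArea(Point p, Point q, Point r) {
    return 0.5 * std::fabs(p.x * (q.y - r.y) + q.x * (r.y - p.y) + r.x * (p.y - q.y));
}

// Area of the circumscribed circle, using R = abc / (4A).
// Assumes the three points are not collinear (A > 0).
double circumcircleArea(Point p, Point q, Point r) {
    const double pi = std::acos(-1.0);
    const double a = dist(p, q);
    const double b = dist(q, r);
    const double c = dist(r, p);
    const double R = (a * b * c) / (4.0 * shoelaceArea(p, q, r));
    return pi * R * R;
}
```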
Q: Write a C++ program to find the nth Fibonacci number using recursion.
A(2-bit): Here is a C++ program that uses recursion to find the nth Fibonacci number:

```cpp
#include <iostream>

int fibonacci(int n) {
   if (n <= 1)
       return n;
   else
       return(fibonacci(n-1) + fibonacci(n-2));
}

int main() {
   int n;
   std::cout << "Enter the value of n: ";
   std::cin >> n;
   std::cout << "The nth Fibonacci number is: " << fibonacci(n) << std::endl;
   return 0;
}
```

In this program, the `fibonacci` function takes an integer input `n` and returns the corresponding Fibonacci number. The function first checks whether `n` is less than or equal to 1. If so, it returns `n`. Otherwise, it recursively calls itself with `n-1` and `n-2` as inputs, adds the results together, and returns the sum.

In the `main` function, the user is prompted to enter a value for `n`, which is then passed to the `fibonacci` function. The result is printed to the console.

Note that this implementation of the Fibonacci sequence can become inefficient for large values of `n`, since it involves many recursive calls.
A(16-bit): Sure! Here's a C++ program that finds the nth Fibonacci number using recursion:

```cpp
#include <iostream>

// Function to calculate the nth Fibonacci number recursively
int fibonacci(int n) {
    if (n <= 1)
        return n;
    else
        return fibonacci(n - 1) + fibonacci(n - 2);
}

int main() {
    int n;

    std::cout << "Enter a positive integer: ";
    std::cin >> n;

    // Ensure that the input is a positive integer
    if (n >= 0) {
        std::cout << "The " << n << "th Fibonacci number is: " << fibonacci(n) << std::endl;
    } else {
        std::cout << "Invalid input! Please enter a positive integer." << std::endl;
    }

    return 0;
}
```

In this program, we define a recursive function `fibonacci` which takes an integer `n` as input and returns the nth Fibonacci number. The base case for the recursion is when `n` is less than or equal to 1, in which case the function simply returns `n`. Otherwise, it calls itself with `n-1` and `n-2`, adds the results, and returns the sum.

In the `main` function, we ask the user to input a positive integer `n`, then call the `fibonacci` function to calculate and display the nth Fibonacci number. If the user enters a negative integer, we display an error message.
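
Both responses use the naive recursion, which takes exponential time, as the 2-bit response itself points out. The prompt explicitly asks for recursion, so the following is only a point of comparison and not what either model produced: a minimal iterative sketch with the hypothetical name `fibonacciIterative`, assuming `n >= 0` and a 64-bit unsigned result (which holds values up to n = 93).

```cpp
#include <cstdint>

// Hypothetical alternative (not from the thread): iterative Fibonacci in
// O(n) time and O(1) space. Assumes n >= 0; the result fits in 64 bits
// for n <= 93.
std::uint64_t fibonacciIterative(int n) {
    std::uint64_t prev = 0;  // F(i)
    std::uint64_t curr = 1;  // F(i + 1)
    for (int i = 0; i < n; ++i) {
        const std::uint64_t next = prev + curr;
        prev = curr;
        curr = next;
    }
    return prev;  // F(n)
}
```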
Q: Now we define a sequence of numbers in which each number is the sum of the three preceding ones. The first three numbers are 0, -1, -1. Write a program to find the nth number.
A(2-bit): Here's a C++ program that finds the nth number in the sequence you described:
```cpp
#include <iostream>
using namespace std;

int findNthNumber(int n) {
    if (n == 1 || n == 2 || n == 3) {
        return -1;
    }
    else {
        return findNthNumber(n-1) + findNthNumber(n-2) + findNthNumber(n-3);
    }
}

int main() {
    int n;
    cout << "Enter the value of n: ";
    cin >> n;
    cout << "The nth number is: " << findNthNumber(n) << endl;
    return 0;
}
```

This program uses recursion to find the nth number in the sequence. The `findNthNumber` function takes an integer input `n`, and if `n` is equal to 1, 2, or 3, it returns `-1`. Otherwise, it recursively calls itself with `n-1`, `n-2`, and `n-3` as inputs, adds the results together, and returns the sum.
In the `main` function, the user is prompted to enter a value for `n`, which is then passed to the `findNthNumber` function. The result is printed to the console.
Note that this implementation of the sequence can become inefficient for large values of `n`, since it involves many recursive calls.
A(16-bit): Certainly! We can write a C++ program to find the nth number in the given sequence using recursion. The recursive function will have two base cases: if `n` is 0, return 0; if `n` is 1 or 2, return -1. Otherwise, it will return the sum of the results of calling itself with `n-1`, `n-2`, and `n-3`.

Here's the code:

```cpp
#include <iostream>

// Function to calculate the nth number in the sequence recursively
int sequence(int n) {
    if (n == 0)
        return 0;
    else if (n == 1 || n == 2)
        return -1;
    else
        return sequence(n - 1) + sequence(n - 2) + sequence(n - 3);
}

int main() {
    int n;

    std::cout << "Enter a positive integer: ";
    std::cin >> n;

    // Ensure that the input is a positive integer
    if (n >= 0) {
        std::cout << "The " << n << "th number in the sequence is: " << sequence(n) << std::endl;
    } else {
        std::cout << "Invalid input! Please enter a positive integer." << std::endl;
    }

    return 0;
}
```

In this program, we define a recursive function `sequence` which takes an integer `n` as input and returns the nth number in the sequence. The base cases for the recursion are when `n` is 0, in which case the function returns 0, and when `n` is 1 or 2, in which case the function returns -1. Otherwise, it calls itself with `n-1`, `n-2`, and `n-3`, adds the results, and returns the sum.

In the `main` function, we ask the user to input a positive integer `n`, then call the `sequence` function to calculate and display the nth number in the sequence. If the user enters a negative integer, we display an error message.
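
The two responses differ in their base cases: the 2-bit version returns -1 for n = 1, 2, and 3, which drops the leading 0 and yields -1, -1, -1, -3, ..., while the 16-bit version (0-indexed) yields 0, -1, -1, -2, -4, -7, .... As a reference for comparing them, here is a minimal iterative sketch. It is not from the thread; `sequenceIterative` is a hypothetical name, and it assumes the same 0-based indexing as the 16-bit response and an `n` small enough that the terms fit in 64 bits.

```cpp
#include <cstdint>

// Hypothetical reference (not from the thread): nth term (0-indexed) of the
// sequence whose first three terms are 0, -1, -1 and where each later term
// is the sum of the three preceding ones. Assumes n >= 0; overflow for
// large n is not handled.
std::int64_t sequenceIterative(int n) {
    std::int64_t a = 0;   // term i
    std::int64_t b = -1;  // term i + 1
    std::int64_t c = -1;  // term i + 2
    for (int i = 0; i < n; ++i) {
        const std::int64_t next = a + b + c;
        a = b;
        b = c;
        c = next;
    }
    return a;  // term n
}
```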
