This repository stores the parameters of RWKV6 models distilled from Qwen2.5.
To run a model, use the test script: https://github.com/yynil/RWKVinLLAMA/blob/main/rwkv_llama/test_custom_cache_rwkv.py
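When running many prompts, it can help to assemble the invocation programmatically. The sketch below is only an illustration: the helper name `build_command` and the checkpoint path are hypothetical, and the flags (`--config_file`, `--ckpt_file`, `--input_text`, `--device`) are taken from the example commands in this README rather than from the script's source.

```python
import shlex


def build_command(config_file, ckpt_file, input_text, device=None):
    """Assemble the test-script invocation as an argument list.

    Hypothetical helper: the flag names mirror the example commands
    in this README and are not verified against the script itself.
    """
    cmd = [
        "python", "rwkv_llama/test_custom_cache_rwkv.py",
        "--config_file", config_file,
        "--ckpt_file", ckpt_file,
        "--input_text", input_text,
    ]
    if device is not None:
        cmd += ["--device", device]
    return cmd


cmd = build_command(
    "configs/step_wise/test_hybrid_2_layer_qwenmlp_local.yaml",
    "/path/to/layer2.pth",  # placeholder path, replace with your checkpoint
    "Hello",
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` from the repository root would avoid shell quoting problems with prompts that contain quotes or newlines.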
Examples:
- 2 layers [0,2] distilled model: journal planning
python rwkv_llama/test_custom_cache_rwkv.py --config_file configs/step_wise/test_hybrid_2_layer_qwenmlp_local.yaml --ckpt_file /home/yueyulin/model/qwen/layer2.pth --input_text "I will stay in Beijing on Oct 5,6, 2024. Please help me to make a jounal plan for these two days."
...
...
Certainly! Here's a sample journal plan for your stay in Beijing on October 5th and 6th, 2024:
### Day 1: October 5, 2024
**Morning:**
- **8:00 AM - 9:00 AM:** Wake up and have breakfast at your hotel.
- **9:30 AM - 12:00 PM:** Visit the Forbidden City (Palace Museum). Explore the grand halls, courtyards, and exhibitions showcasing imperial history.
- **12:00 PM - 1:00 PM:** Have lunch at a nearby restaurant.
- **1:30 PM - 4:00 PM:** Walk around Tiananmen Square and take photos of the iconic Tiananmen Gate and the National Museum of China.
- **4:30 PM - 6:00 PM:** Stroll through the scenic Beihai Park, enjoying the beautiful gardens and lake.
**Evening:**
- **6:30 PM - 8:00 PM:** Enjoy dinner at a local restaurant, perhaps trying Peking duck or other traditional dishes.
- **8:30 PM - 10:00 PM:** Watch a performance of Beijing Opera or a traditional Chinese acrobatics show.
- **10:00 PM:** Return to your hotel and relax.
### Day 2: October 6, 2024
**Morning:**
- **8:00 AM - 9:00 AM:** Have breakfast at your hotel.
- **9:30 AM - 12:00 PM:** Visit the Summer Palace, one of the largest and most well-preserved imperial gardens in China. Explore the Hall of Happiness, the Long Corridor, and the Kunming Lake.
- **12:00 PM - 1:00 PM:** Have lunch at a nearby restaurant.
- **1:30 PM - 4:00 PM:** Explore the Temple of Heaven, known for its beautiful architecture and historical significance. Take a guided tour to learn about the religious practices and rituals that took place there.
- **4:30 PM - 6:00 PM:** Visit the National Library of China, which houses an extensive collection of books, manuscripts, and rare documents.
**Evening:**
- **6:30 PM - 8:00 PM:** Enjoy dinner at a local restaurant, perhaps trying some Sichuan cuisine.
- **8:30 PM - 10:00 PM:** Take a leisurely walk along Qianmen Street, a historic commercial area with many shops, restaurants, and cultural attractions.
- **10:00 PM:** Return to your hotel and prepare for departure or further exploration.
### Tips:
- **Transportation:** Consider using public transportation like the subway or buses to save time and money. The Beijing Subway is efficient and covers most major tourist areas.
- **Weather:** Check the weather forecast before you go out each day. Beijing can be quite cold in the fall, so dress accordingly.
- **Safety:** Keep your valuables secure and be aware of your surroundings, especially in crowded areas.
- **Language:** While many people speak English, it’s always helpful to know some basic Mandarin phrases.
Enjoy your trip to Beijing! If you need more specific recommendations or details, feel free to ask.
Math solver:
python rwkv_llama/test_custom_cache_rwkv.py --config_file configs/step_wise/test_hybrid_2_layer_qwenmlp_local.yaml --ckpt_file /home/yueyulin/model/qwen/layer2.pth --input_text "## Question: \n\nFarmer Brown has 20 animals on his farm, all either chickens or cows. They have a total of 70 legs, all together. How many of the animals are chickens? ## Instruction \n\nPlease answer this question by first reasoning and then providing your answer."
...
...
To solve this problem, we can use a system of equations based on the information given.
Let's define:
- \( C \) as the number of chickens.
- \( K \) as the number of cows.
From the problem, we know two things:
1. The total number of animals is 20.
2. The total number of legs is 70.
We can write these as equations:
1. \( C + K = 20 \)
2. Chickens have 2 legs each, and cows have 4 legs each, so \( 2C + 4K = 70 \).
First, let's simplify the second equation by dividing everything by 2:
\[ C + 2K = 35 \]
Now we have the system:
1. \( C + K = 20 \)
2. \( C + 2K = 35 \)
Next, we can subtract the first equation from the second to eliminate \( C \):
\[ (C + 2K) - (C + K) = 35 - 20 \]
\[ C + 2K - C - K = 15 \]
\[ K = 15 \]
So, there are 15 cows. Now, we can substitute \( K = 15 \) back into the first equation to find \( C \):
\[ C + 15 = 20 \]
\[ C = 20 - 15 \]
\[ C = 5 \]
Therefore, there are 5 chickens on Farmer Brown's farm.
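As a sanity check on the model output above (this is not part of the model's response), the answer can be confirmed by brute force. The function name `solve_legs` is made up for this sketch; it simply tries every possible chicken count.

```python
def solve_legs(total_animals, total_legs, legs_a=2, legs_b=4):
    """Return every (a, b) with a + b == total_animals and matching leg total.

    a counts the 2-legged animals (chickens), b the 4-legged ones (cows).
    """
    return [
        (a, total_animals - a)
        for a in range(total_animals + 1)
        if legs_a * a + legs_b * (total_animals - a) == total_legs
    ]


solutions = solve_legs(20, 70)
print(solutions)  # [(5, 15)]
```

The single solution, 5 chickens and 15 cows, agrees with the distilled model's reasoning.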
Solving the same problem in Chinese:
python rwkv_llama/test_custom_cache_rwkv.py --config_file configs/step_wise/test_hybrid_2_layer_qwenmlp_local.yaml --ckpt_file /home/yueyulin/model/qwen/layer2.pth --input_text "## Question: \n\nFarmer Brown has 20 animals on his farm, all either chickens or cows. They have a total of 70 legs, all together. How many of the animals are chickens? ## Instruction \n\nPlease answer this question by first reasoning and then providing your answer. Please answer it in Chinese."
...
...
user
## Question: \n\nFarmer Brown has 20 animals on his farm, all either chickens or cows. They have a total of 70 legs, all together. How many of the animals are chickens? ## Instruction \n\nPlease answer this question by first reasoning and then providing your answer. Please answer it in Chinese.
assistant
为了解决这个问题,我们可以先设定一些变量。假设鸡的数量为x,牛的数量为y。
根据题目信息,我们有两个方程:
1. 鸤和牛的总数是20,即 \(x + y = 20\)
2. 鸤有2条腿,牛有4条腿,总共有70条腿,即 \(2x + 4y = 70\)
接下来,我们可以通过解这个方程组来找到x和y的值。
首先简化第二个方程,除以2得到 \(x + 2y = 35\)。
然后我们用第一个方程减去这个新方程的一部分,即:
\[ (x + y) - (x + 2y) = 20 - 35 \]
\[ -y = -15 \]
\[ y = 15 \]
知道了y(牛的数量)是15,我们可以很容易地求出x(鸡的数量):
\[ x + 15 = 20 \]
\[ x = 5 \]
所以,农民布朗的农场上有5只鸡。
答案:5只鸡。
- 4 layers [0,2,4,6] distilled model (70%):
Calculus solver (the prompt is in Chinese: "f(x)=log_10(4x+1); how is its derivative f'(x) computed?"):
python rwkv_llama/test_custom_cache_rwkv.py --config_file configs/step_wise/test_hybrid_4_layer_qwenmlp_local.yaml --ckpt_file /home/yueyulin/model/qwen/layer4_70%.pth --input_text "f(x)=log_10(4x+1),则 它的导数 f'(x)如何计算" --device cuda
要计算函数 \(f(x) = \log_{10}(4x + 1)\) 的导数 \(f'(x)\),我们可以使用链式法则和对数函数的导数公式。具体步骤如下:
1. **对数函数的导数公式**:对于一般形式的对数函数 \(y = \log_a(u)\),其导数为 \(\frac{dy}{dx} = \frac{1}{u \ln(a)} \cdot \frac{du}{dx}\)。
2. **应用公式**:在这个问题中,\(a = 10\)(因为是常用对数),\(u = 4x + 1\)。所以,根据对数函数的导数公式,我们有:
\[
f'(x) = \frac{1}{(4x + 1) \ln(10)} \cdot \frac{d}{dx}(4x + 1)
\]
3. **计算内部函数的导数**:内部函数 \(4x + 1\) 的导数是 \(4\)。
4. **将结果代入**:将内部函数的导数代入上面的公式中,得到:
\[
f'(x) = \frac{1}{(4x + 1) \ln(10)} \cdot 4 = \frac{4}{(4x + 1) \ln(10)}
\]
因此,函数 \(f(x) = \log_{10}(4x + 1)\) 的导数 \(f'(x)\) 是 \(\frac{4}{(4x + 1) \ln(10)}\)。
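The derivative the model arrives at above, \(f'(x) = \frac{4}{(4x + 1)\ln(10)}\), can be cross-checked numerically. The following is a minimal sketch, not part of the model output; the helper `central_difference` and the sample points are choices made for this illustration.

```python
import math


def f(x):
    """The function from the prompt: log base 10 of (4x + 1)."""
    return math.log10(4 * x + 1)


def f_prime(x):
    """Closed-form derivative stated in the model output."""
    return 4 / ((4 * x + 1) * math.log(10))


def central_difference(func, x, h=1e-6):
    """Second-order finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


# Compare the closed form against the numerical estimate at a few points
# inside the domain 4x + 1 > 0.
for x in (0.0, 0.5, 2.0):
    assert math.isclose(f_prime(x), central_difference(f, x), rel_tol=1e-6)
print("closed-form derivative matches the numerical estimate")
```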
Common calculus knowledge (the prompt is in Chinese: "How are composite functions differentiated?"):
python rwkv_llama/test_custom_cache_rwkv.py --config_file configs/step_wise/test_hybrid_4_layer_qwenmlp_local.yaml --ckpt_file /home/yueyulin/model/qwen/layer4_70%.pth --input_text "复合函数是如何求导的" --device cuda
复合函数的求导是微积分中的一个重要概念,它涉及到链式法则。链式法则是用来计算复合函数的导数的一种方法。下面我将详细解释如何使用链式法则来求导复合函数。
假设我们有一个复合函数 \(f(g(x))\),其中 \(f(u)\) 是一个关于变量 \(u\) 的函数,而 \(g(x)\) 是一个关于变量 \(x\) 的函数。那么,复合函数 \(f(g(x))\) 的导数可以通过以下步骤来求得:
1. **先对内部函数求导**:首先,我们需要找到内部函数 \(g(x)\) 的导数 \(g'(x)\)。
2. **再对外部函数求导**:然后,我们需要找到外部函数 \(f(u)\) 关于其内部变量 \(u\) 的导数 \(f'(u)\),这里 \(u = g(x)\)。
3. **最后应用链式法则**:将上述两个结果相乘,即得到复合函数 \(f(g(x))\) 的导数为 \((f(g(x)))' = f'(g(x)) \cdot g'(x)\)。
举个具体的例子来说明这个过程:
假设我们有一个复合函数 \(h(x) = (3x^2 + 2)^4\)。我们可以将其分解为外部函数 \(f(u) = u^4\) 和内部函数 \(g(x) = 3x^2 + 2\)。
- 首先,求内部函数 \(g(x) = 3x^2 + 2\) 的导数:
\[
g'(x) = 6x
\]
- 然后,求外部函数 \(f(u) = u^4\) 关于 \(u\) 的导数(这里 \(u = g(x) = 3x^2 + 2\)):
\[
f'(u) = 4u^3
\]
- 最后,应用链式法则:
\[
h'(x) = f'(g(x)) \cdot g'(x) = 4(3x^2 + 2)^3 \cdot 6x = 24x(3x^2 + 2)^3
\]
通过以上步骤,我们得到了复合函数 \(h(x) = (3x^2 + 2)^4\) 的导数 \(h'(x) = 24x(3x^2 + 2)^3\)。
总结一下,复合函数的求导主要依赖于链式法则,即先对内部函数求导,再对外部函数求导,最后将这两个结果相乘。希望这个解释对你有所帮助!
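The worked example in the output above, \(h(x) = (3x^2 + 2)^4\) with \(h'(x) = 24x(3x^2 + 2)^3\), can be checked in the same spirit. This sketch (again not part of the model output) composes the outer and inner derivatives exactly as the chain rule prescribes and compares the result with both the stated closed form and a finite-difference estimate; all function names are illustrative.

```python
import math


def g(x):
    """Inner function g(x) = 3x^2 + 2."""
    return 3 * x**2 + 2


def g_prime(x):
    return 6 * x


def f_outer(u):
    """Outer function f(u) = u^4."""
    return u**4


def f_outer_prime(u):
    return 4 * u**3


def h(x):
    return f_outer(g(x))


def h_prime_chain(x):
    """Chain rule: h'(x) = f'(g(x)) * g'(x)."""
    return f_outer_prime(g(x)) * g_prime(x)


def h_prime_closed(x):
    """Closed form stated in the model output."""
    return 24 * x * (3 * x**2 + 2) ** 3


def central_difference(func, x, step=1e-6):
    return (func(x + step) - func(x - step)) / (2 * step)


for x in (-1.0, 0.0, 0.5, 2.0):
    assert math.isclose(h_prime_chain(x), h_prime_closed(x), rel_tol=1e-12, abs_tol=1e-12)
    assert math.isclose(h_prime_chain(x), central_difference(h, x), rel_tol=1e-6, abs_tol=1e-6)
print("chain-rule derivative agrees with the closed form and the numerical estimate")
```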