lucifertrj committed
Commit 735c151 • 1 Parent(s): f9149b6
Update README.md

README.md CHANGED
@@ -1,17 +1,20 @@
 ---
 license: apache-2.0
+pipeline_tag: text-generation
 ---
 
 <p align="center" style="font-size:34px;"><b>Buddhi 7B</b></p>
 
 # Buddhi-7B vLLM Inference: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/11_8W8FpKK-856QdRVJLyzbu9g-DMxNfg?usp=sharing)
 
-
+## Model Description
 
 <!-- Provide a quick summary of what the model is/does. -->
 
 Buddhi is a general-purpose chat model, meticulously fine-tuned from Mistral 7B Instruct and optimised to handle an extended context length of up to 128,000 tokens using the [YaRN (Yet another RoPE extensioN)](https://arxiv.org/abs/2309.00071) technique. This enhancement allows Buddhi to maintain a deeper understanding of context in long documents or conversations, making it particularly adept at tasks requiring extensive context retention, such as comprehensive document summarization, detailed narrative generation, and intricate question answering.
 
+## Dataset Creation
+
 ## Architecture
 
 ### Hardware requirements:
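The card's title points readers to vLLM for serving the model. As a minimal sketch of what that looks like, assuming the hub id `aiplanet/Buddhi-128K-Chat` (taken from the citation URL at the end of the card) and illustrative sampling settings of my own:

```python
# Minimal sketch (not from the card): serving Buddhi with vLLM.
# The hub id is assumed from the citation URL; sampling settings are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="aiplanet/Buddhi-128K-Chat", max_model_len=128000)
params = SamplingParams(temperature=0.7, max_tokens=256)

# Mistral-style instruction wrapping, per the prompt-template section below.
outputs = llm.generate(["[INST] Summarize this document: ... [/INST]"], params)
print(outputs[0].outputs[0].text)
```

A 128K-token window implies a correspondingly large KV cache; on smaller GPUs, `max_model_len` can be lowered without reloading weights differently.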
@@ -114,7 +117,15 @@ Why don't scientists trust atoms?
 Because they make up everything.
 ```
 
-##
+## Evaluation
+
+| Model                             | HellaSWAG | ARC-Challenge | MMLU  | TruthfulQA | Winogrande |
+|-----------------------------------|-----------|---------------|-------|------------|------------|
+| Buddhi-128K-Chat                  | 82.78     | 57.51         | 57.39 | 55.44      | 78.37      |
+| NousResearch/Yarn-Mistral-7b-128k | 80.58     | 58.87         | 60.64 | 42.46      | 72.85      |
+
+
+## Prompt Template for Buddhi-128K-Chat
 
 In order to leverage instruction fine-tuning, your prompt should be surrounded by [INST] and [/INST] tokens. The very first instruction should begin with a begin-of-sentence (BOS) token id; subsequent instructions should not. The assistant generation will be ended by the end-of-sentence (EOS) token id.
 
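The Evaluation table added above covers the standard Open LLM Leaderboard tasks. The card does not state how the numbers were produced; one common route is EleutherAI's lm-evaluation-harness (v0.4+), sketched here under that assumption, with the same assumed hub id:

```python
# Illustrative only: the card does not name the harness, version, or
# few-shot settings behind the table, so these defaults will not
# necessarily reproduce the reported numbers.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=aiplanet/Buddhi-128K-Chat",  # assumed hub id
    tasks=["hellaswag", "arc_challenge", "mmlu", "truthfulqa_mc2", "winogrande"],
)
print(results["results"])
```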
@@ -124,18 +135,8 @@ In order to leverage instruction fine-tuning, your prompt should be surrounded b
 "[INST] Do you have mayonnaise recipes? [/INST]"
 
 ```
-## Key Features:
-
-Precision and Efficiency: The model is tailored for accuracy, ensuring your code is not just functional but also efficient.
-
-Unleash Creativity: Whether you're a novice or an expert coder, Panda-Coder is here to support your coding journey, offering creative solutions to your programming challenges.
-
-Evol Instruct Code: It's built on the robust Evol Instruct Code 80k-v1 dataset, guaranteeing top-notch code generation.
-
-What's Next?: We believe in continuous improvement and are excited to announce that in our next release, Panda-Coder will be enhanced with a custom dataset. This dataset will not only expand the language support but also include hardware programming languages like MATLAB, Embedded C, and Verilog.
-
 
-
+## Get in Touch
 
 You can schedule a 1:1 meeting with our DevRel & Community Team to get started with AI Planet Open Source LLMs and GenAI Stack. Schedule the call here: [https://calendly.com/jaintarun](https://calendly.com/jaintarun)
 
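To make the prompt rules above concrete (BOS only before the first instruction, EOS closing each completed assistant turn), here is a small hypothetical helper; `<s>` and `</s>` stand in for the Mistral special tokens, which a tokenizer would normally add as token ids rather than literal strings:

```python
# Hypothetical helper (not from the card) applying the stated rules:
# BOS once at the start, [INST] ... [/INST] around every user turn,
# EOS after each completed assistant reply.
def build_prompt(turns, bos="<s>", eos="</s>"):
    """turns: list of (user_message, assistant_reply or None) pairs."""
    prompt = bos
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant}{eos}"
    return prompt

print(build_prompt([
    ("What is your favourite condiment?", "I'm quite partial to mayonnaise."),
    ("Do you have mayonnaise recipes?", None),
]))
```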
@@ -153,8 +154,8 @@ In order to leverage instruction fine-tuning, your prompt should be surrounded b
 ### Citation
 
 ```
 @misc {Chaitanya890,
-  author = { {Chaitanya Singhal} },
+  author = { {Chaitanya Singhal} and {Tarun Jain} },
   title = { Buddhi-128k-Chat by AI Planet},
   year = 2024,
   url = { https://huggingface.co/aiplanet/Buddhi-128K-Chat },
|