lucifertrj committed
Commit 735c151 • Parent: f9149b6

Update README.md

Files changed (1):
  1. README.md +16 -15
README.md CHANGED
@@ -1,17 +1,20 @@
  ---
  license: apache-2.0
+ pipeline_tag: text-generation
  ---

  <p align="center" style="font-size:34px;"><b>Buddhi 7B</b></p>

  # Buddhi-7B vLLM Inference: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/11_8W8FpKK-856QdRVJLyzbu9g-DMxNfg?usp=sharing)

- # Model Description
+ ## Model Description

  <!-- Provide a quick summary of what the model is/does. -->

  Buddhi is a general-purpose chat model, fine-tuned from Mistral 7B Instruct and optimised to handle an extended context length of up to 128,000 tokens using the YaRN [(Yet another RoPE extensioN)](https://arxiv.org/abs/2309.00071) technique. This allows Buddhi to maintain a deeper understanding of context in long documents or conversations, making it particularly adept at tasks requiring extensive context retention, such as comprehensive document summarisation, detailed narrative generation, and intricate question answering.

+ ## Dataset Creation
+
  ## Architecture

  ### Hardware requirements:

@@ -114,7 +117,15 @@ Why don't scientists trust atoms?
  Because they make up everything.
  ```

- ## Prompt Template for Panda Coder 13B
+ ## Evaluation
+
+ | Model                             | HellaSWAG | ARC-Challenge | MMLU  | TruthfulQA | Winogrande |
+ |-----------------------------------|-----------|---------------|-------|------------|------------|
+ | Buddhi-128K-Chat                  | 82.78     | 57.51         | 57.39 | 55.44      | 78.37      |
+ | NousResearch/Yarn-Mistral-7b-128k | 80.58     | 58.87         | 60.64 | 42.46      | 72.85      |
+
+ ## Prompt Template for Buddhi-128K-Chat

  In order to leverage instruction fine-tuning, your prompt should be surrounded by [INST] and [/INST] tokens. The very first instruction should begin with a begin-of-sentence token id; subsequent instructions should not. The assistant generation is ended by the end-of-sentence token id.

@@ -124,18 +135,8 @@ In order to leverage instruction fine-tuning, your prompt should be surrounded b
  "[INST] Do you have mayonnaise recipes? [/INST]"

  ```
- ## 🔗 Key Features:
-
- 🎯 Precision and Efficiency: The model is tailored for accuracy, ensuring your code is not just functional but also efficient.
-
- ✨ Unleash Creativity: Whether you're a novice or an expert coder, Panda-Coder is here to support your coding journey, offering creative solutions to your programming challenges.
-
- 📚 Evol Instruct Code: It's built on the robust Evol Instruct Code 80k-v1 dataset, guaranteeing top-notch code generation.
-
- 📢 What's Next?: We believe in continuous improvement and are excited to announce that in our next release, Panda-Coder will be enhanced with a custom dataset. This dataset will not only expand the language support but also include hardware programming languages like MATLAB, Embedded C, and Verilog. 🧰💡
-
- ## Get in Touch
+ ## Get in Touch

  You can schedule a 1:1 meeting with our DevRel & Community Team to get started with AI Planet Open Source LLMs and GenAI Stack. Schedule the call here: [https://calendly.com/jaintarun](https://calendly.com/jaintarun)

@@ -153,8 +154,8 @@ In order to leverage instruction fine-tuning, your prompt should be surrounded b
  ### Citation

  ```
- @misc {Chaitanya890,
- author = { {Chaitanya Singhal} },
+ @misc{Chaitanya890,
+ author = { Chaitanya Singhal and Tarun Jain },
  title = { Buddhi-128k-Chat by AI Planet},
  year = 2024,
  url = { https://huggingface.co/aiplanet//Buddhi-128K-Chat },
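
Two of the README sections touched by this commit are easier to follow with a concrete example. First, the vLLM inference entry point referenced by the Colab badge: a minimal sketch, where the repo id `aiplanet/buddhi-128k-chat-7b` and the sampling settings are assumptions not stated in the diff.

```python
# Minimal vLLM inference sketch for Buddhi.
from vllm import LLM, SamplingParams

llm = LLM(
    model="aiplanet/buddhi-128k-chat-7b",  # assumed repo id, not given in this commit
    max_model_len=128000,                  # the extended YaRN context window
)

sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)

# The prompt follows the [INST] template described in the README.
prompts = ["[INST] Why don't scientists trust atoms? [/INST]"]
outputs = llm.generate(prompts, sampling)
print(outputs[0].outputs[0].text)
```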
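Second, the prompt-template rules (begin-of-sentence id only before the first instruction, end-of-sentence id closing each assistant turn) can be encoded in a small helper. A sketch assuming Mistral-style `<s>`/`</s>` special tokens; the function name is illustrative, and in practice the tokenizer may add the BOS token for you.

```python
def build_buddhi_prompt(history, user_message):
    """Assemble a Mistral-style [INST] chat prompt.

    history: list of (user, assistant) string pairs from earlier turns.
    user_message: the new instruction to answer.
    """
    prompt = "<s>"  # begin-of-sentence id appears once, before the first instruction
    for user, assistant in history:
        # each completed assistant turn is ended by the end-of-sentence token
        prompt += f"[INST] {user} [/INST] {assistant}</s>"
    prompt += f"[INST] {user_message} [/INST]"  # later instructions carry no BOS
    return prompt

print(build_buddhi_prompt(
    [("Hello, how are you?", "I'm doing great. How can I help you today?")],
    "Do you have mayonnaise recipes?",
))
```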
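Finally, benchmark numbers like those in the new Evaluation table are commonly produced with EleutherAI's lm-evaluation-harness. A sketch of its Python API follows; the task names, harness version, and repo id are assumptions, since the commit does not say how the scores were obtained.

```python
# Hypothetical reproduction sketch using lm-evaluation-harness (pip install lm-eval).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=aiplanet/buddhi-128k-chat-7b",  # assumed repo id
    tasks=["hellaswag", "arc_challenge", "mmlu", "truthfulqa_mc2", "winogrande"],
)
print(results["results"])  # per-task accuracy metrics
```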