lucifertrj committed
Commit 151c18f
1 Parent(s): 6442a22

Update README.md

Files changed (1)
  1. README.md +13 -6
README.md CHANGED
@@ -9,8 +9,6 @@ pipeline_tag: text-generation
 
 ## Model Description
 
-<!-- Provide a quick summary of what the model is/does. -->
-
 Buddhi is a general-purpose chat model, meticulously fine-tuned on Mistral 7B Instruct and optimised to handle an extended context length of up to 128,000 tokens using the YaRN [(Yet another RoPE extensioN)](https://arxiv.org/abs/2309.00071) technique. This enhancement allows Buddhi to maintain a deeper understanding of context in long documents or conversations, making it particularly adept at tasks requiring extensive context retention, such as comprehensive document summarization, detailed narrative generation, and intricate question answering.
 
 ## Dataset Creation
@@ -36,13 +34,18 @@ Please check out [Flash Attention 2](https://github.com/Dao-AILab/flash-attentio
 
 **Implementation**:
 
+> Note: Running the model requires roughly 70 GB of VRAM. For experimentation, we limit the context length to 75K instead of 128K, which makes it possible to test the model with 30-35 GB of VRAM.
+
 ```python
 from vllm import LLM, SamplingParams
 
 llm = LLM(
-    model='aiplanet/Buddhi-128K-Chat',
-    gpu_memory_utilization=0.99,
-    max_model_len=131072
+    model='aiplanet/buddhi-128k-chat-7b',
+    trust_remote_code=True,
+    download_dir='aiplanet/buddhi-128k-chat-7b',
+    dtype='bfloat16',
+    gpu_memory_utilization=1,
+    max_model_len=75000
 )
 
 prompts = [
@@ -63,8 +66,12 @@ for output in outputs:
     generated_text = output.outputs[0].text
     print(generated_text)
     print("\n\n")
+
+# We have also attached a Colab notebook that contains two more experiments: a long essay and an entire book.
 ```
 
+For sample outputs, check out the Colab notebook: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/11_8W8FpKK-856QdRVJLyzbu9g-DMxNfg?usp=sharing)
+
 ### Transformers - Basic Implementation
 
 ```python
@@ -155,7 +162,7 @@ In order to leverage instruction fine-tuning, your prompt should be surrounded b
 
 ```
 @misc {Chaitanya890, lucifertrj ,
-  author = { {Chaitanya Singhal},{Tarun Jain} },
+  author = { Chaitanya Singhal, Tarun Jain },
   title = { Buddhi-128k-Chat by AI Planet},
   year = 2024,
   url = { https://huggingface.co/aiplanet//Buddhi-128K-Chat },
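
For reference, here is a minimal, self-contained sketch of the vLLM usage this commit documents. The model id, dtype, `gpu_memory_utilization`, and 75K context limit come from the diff above; the prompt text and sampling parameters are illustrative assumptions, since the README's full prompt list is not visible in this hunk.

```python
# Minimal sketch of the vLLM usage documented in this commit.
# Model id, dtype, gpu_memory_utilization, and the 75K context limit come from the diff;
# the prompt text and sampling parameters below are illustrative assumptions.
from vllm import LLM, SamplingParams

llm = LLM(
    model='aiplanet/buddhi-128k-chat-7b',
    trust_remote_code=True,
    dtype='bfloat16',
    gpu_memory_utilization=1,
    max_model_len=75000,
)

# Mistral-Instruct-style prompt wrapping ([INST] ... [/INST]); the README's exact template is not shown in this diff.
prompts = ["<s> [INST] Summarise the key idea of the YaRN context-extension technique. [/INST]"]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=256)

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
    print("\n\n")
```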
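
The model description's 128K-context claim via YaRN can also be sanity-checked from the checkpoint's configuration before committing a large amount of VRAM. This is a hypothetical sketch: the repo id comes from the diff, but whether the config exposes a `rope_scaling` entry (and under which field names) is an assumption, hence the defensive lookup.

```python
# Hypothetical sanity check of the extended-context configuration.
# The repo id comes from the diff; the presence and shape of a rope_scaling entry are assumptions.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("aiplanet/buddhi-128k-chat-7b", trust_remote_code=True)
print(config.max_position_embeddings)         # should reflect the extended (~128K) context window
print(getattr(config, "rope_scaling", None))  # YaRN scaling parameters, if exposed in the config
```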