ChangranHuuu committed
Commit 536b5be
1 Parent(s): 83e22f4

Changran did a pass of edits and comments; updated the model description, out-of-scope use, and training procedure.

Files changed (1): README.md +12 -16
README.md CHANGED
@@ -8,7 +8,7 @@ license: other
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-BLOOMChat is [BigScience Group BLOOM model](https://huggingface.co/bigscience/bloom) instruction-tuned on a subset of 100k datapoints per data source from the [OIG dataset](https://huggingface.co/datasets/laion/OIG) from the [OpenChatKit](https://www.together.xyz/blog/openchatkit). Then aligned using [Dolly 2.0](https://huggingface.co/datasets/databricks/databricks-dolly-15k) and [Oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1).
+BLOOMChat is a 176 billion parameter multilingual chat model. It is instruction tuned from [BLOOM (176B)](https://huggingface.co/bigscience/bloom) on assistant-style conversation datasets and supports conversation, question answering and generative answers in multiple languages.
 
 ## Model Details
 
@@ -58,7 +58,7 @@ BLOOMChat should NOT be used for:
 - Making highly important decisions
 - Important automated pipelines
 
-This model is still in early development and can be prone to mistakes and hallucinations, there is still room for improvement. This model is intended to provide the community with a good baseline.
+This model is still in early development and can be prone to mistakes and hallucinations, there is still room for improvement. This model is intended to provide the community with a multilingual chat LLM baseline.
 
 ### Recommendations
 
@@ -150,15 +150,16 @@ python -m inference_server.cli --model_name sambanovasystems/BLOOMChat-176B-v1 -
 ```
 
 ```
-<human>: give a python code to open a http server in 8080 port using python 3.7
+<human>: Create an itemized list of tasks to complete to start a clothing brand
 <bot>:
 ```
 
 ```
-<human>: Create an itemized list of tasks to complete to start a clothing brand
+<human>: 十七岁的风是什么颜色的? ("What color is the wind at seventeen?")
 <bot>:
 ```
 
+
 </details>
 
 ---
@@ -326,21 +327,16 @@ Estos son solo algunos ejemplos de juegos que podrían interesarte según tus cr
 
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 
-We trained BLOOMChat with SambaStudio, a platform built on SambaNova's in-house Reconfigurable Dataflow Unit (RDU). We started from [BLOOM](https://huggingface.co/bigscience/bloom), an OSS multilingual 176B GPT model pretrained by the [BigScience group](https://huggingface.co/bigscience). There was also some preprocessing done on the training datasets.
+We trained BLOOMChat with [SambaNova DataScale systems](https://sambanova.ai/products/datascale/) with SambaNova's in-house Reconfigurable Dataflow Unit (RDU). We started from [BLOOM (176B)](https://huggingface.co/bigscience/bloom), an open-source multilingual LLM pretrained by the [BigScience group](https://huggingface.co/bigscience). We instruction-tune BLOOM (176B) on OpenChatKit with each data source subsampled to 100k for one epoch, followed by three epochs over the combined OpenChatKit and Dolly 2.0.
+All of the code used to prepare the datasets and the scripts to run training and inference are open-sourced and freely available at [sambanova/bloomchat](https://github.com/sambanova/bloomchat/tree/main)
 
-### Prompting Style Used For Training
-```
-<human>: {input that the user wants from the bot}
-<bot>:
-```
-
+### Prompting Style Used For Training
 ```
-<human>: {fewshot1 input}
-<bot>: {fewshot1 response}
-<human>: {fewshot2 input}
-<bot>: {fewshot2 response}
-<human>: {input that the user wants from the bot}
-<bot>:
+<human>: {input1 that the user wants from the bot}
+<bot>: {response1}</s>
+<human>: {input2 that the user wants from the bot}
+<bot>: {response2}</s>
 ```
 
 ### Hyperparameters
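The updated training-procedure paragraph describes subsampling each OpenChatKit data source to 100k examples before instruction tuning. A minimal sketch of that capping step, assuming a simple per-source random sample — the function name and data layout are hypothetical, not taken from the sambanova/bloomchat scripts:

```python
import random

def subsample_sources(sources, cap=100_000, seed=0):
    """Cap each data source at `cap` examples; smaller sources are kept whole.

    `sources` maps a source name to its list of examples. A fixed seed keeps
    the subsample reproducible across runs.
    """
    rng = random.Random(seed)
    return {
        name: list(examples) if len(examples) <= cap else rng.sample(examples, cap)
        for name, examples in sources.items()
    }

# Tiny demonstration with a cap of 3 instead of 100k:
sampled = subsample_sources({"src_a": list(range(10)), "src_b": [1, 2]}, cap=3)
```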
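The prompting style introduced in this commit — multi-turn `<human>`/`<bot>` exchanges with `</s>` after each completed bot response, ending in an open `<bot>:` at inference time — can be sketched as a small formatter. This is only an illustration of the template shown in the diff; the helper names are assumptions, not code from the sambanova/bloomchat repository:

```python
# Sketch of the "<human>/<bot>" prompt template from the model card.
# format_training_example / format_inference_prompt are illustrative names.

EOS = "</s>"  # separator appended after each completed bot response

def format_training_example(turns):
    """Render (human, bot) exchanges into the BLOOMChat training prompt style."""
    return "\n".join(f"<human>: {h}\n<bot>: {b}{EOS}" for h, b in turns)

def format_inference_prompt(history, user_input):
    """End with an open '<bot>:' so the model completes as the assistant."""
    prefix = format_training_example(history)
    tail = f"<human>: {user_input}\n<bot>:"
    return f"{prefix}\n{tail}" if prefix else tail

print(format_inference_prompt(
    [], "Create an itemized list of tasks to complete to start a clothing brand"))
```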