ChangranHuuu committed
Commit 536b5be
1 Parent(s): 83e22f4

Changran did a pass of edits and comments; updated the model description, out-of-scope use, and training procedure.

Files changed (1): README.md +12 -16
README.md CHANGED
@@ -8,7 +8,7 @@ license: other
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-BLOOMChat is [BigScience Group BLOOM model](https://huggingface.co/bigscience/bloom) instruction-tuned on a subset of 100k datapoints per data source from the [OIG dataset](https://huggingface.co/datasets/laion/OIG) from the [OpenChatKit](https://www.together.xyz/blog/openchatkit). Then aligned using [Dolly 2.0](https://huggingface.co/datasets/databricks/databricks-dolly-15k) and [Oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1).
+BLOOMChat is a 176 billion parameter multilingual chat model. It is instruction tuned from [BLOOM (176B)](https://huggingface.co/bigscience/bloom) on assistant-style conversation datasets and supports conversation, question answering and generative answers in multiple languages.
 
 ## Model Details
 
@@ -58,7 +58,7 @@ BLOOMChat should NOT be used for:
 - Making highly important decisions
 - Important automated pipelines
 
-This model is still in early development and can be prone to mistakes and hallucinations, there is still room for improvement. This model is intended to provide the community with a good baseline.
+This model is still in early development and can be prone to mistakes and hallucinations, there is still room for improvement. This model is intended to provide the community with a multilingual chat LLM baseline.
 
 ### Recommendations
 
@@ -150,15 +150,16 @@ python -m inference_server.cli --model_name sambanovasystems/BLOOMChat-176B-v1 -
 ```
 
 ```
-<human>: give a python code to open a http server in 8080 port using python 3.7
+<human>: Create an itemized list of tasks to complete to start a clothing brand
 <bot>:
 ```
 
 ```
-<human>: Create an itemized list of tasks to complete to start a clothing brand
+<human>: 十七岁的风是什么颜色的? ("What color is the wind at seventeen?")
 <bot>:
 ```
 
+
 </details>
 
 ---
@@ -326,21 +327,16 @@ Estos son solo algunos ejemplos de juegos que podrían interesarte según tus cr
 
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 
-We trained BLOOMChat with SambaStudio, a platform built on SambaNova's in-house Reconfigurable Dataflow Unit (RDU). We started from [BLOOM](https://huggingface.co/bigscience/bloom), an OSS multilingual 176B GPT model pretrained by the [BigScience group](https://huggingface.co/bigscience). There was also some preprocessing done on the training datasets.
+We trained BLOOMChat with [SambaNova DataScale systems](https://sambanova.ai/products/datascale/) with SambaNova's in-house Reconfigurable Dataflow Unit (RDU). We started from [BLOOM (176B)](https://huggingface.co/bigscience/bloom), an open-source multilingual LLM pretrained by the [BigScience group](https://huggingface.co/bigscience). We instruction-tune BLOOM (176B) on OpenChatKit with each data source subsampled to 100k for one epoch, followed by three epochs over the combined OpenChatKit and Dolly 2.0.
+All of the code used to prepare the datasets and the scripts to run training and inference are open-sourced and freely available at [sambanova/bloomchat](https://github.com/sambanova/bloomchat/tree/main)
 
-### Prompting Style Used For Training
-```
-<human>: {input that the user wants from the bot}
-<bot>:
-```
-
+### Prompting Style Used For Training
 ```
-<human>: {fewshot1 input}
-<bot>: {fewshot1 response}
-<human>: {fewshot2 input}
-<bot>: {fewshot2 response}
-<human>: {input that the user wants from the bot}
-<bot>:
+<human>: {input1 that the user wants from the bot}
+<bot>: {response1}</s>
+<human>: {input2 that the user wants from the bot}
+<bot>: {response2}</s>
 ```
 
 ### Hyperparameters
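The updated training-procedure paragraph describes subsampling each OpenChatKit data source to 100k examples before instruction tuning. A minimal sketch of that capping step, assuming a simple per-source random sample — the function name and data layout are hypothetical, not taken from the sambanova/bloomchat scripts:

```python
import random

def subsample_sources(sources, cap=100_000, seed=0):
    """Cap each data source at `cap` examples; smaller sources are kept whole.

    `sources` maps a source name to its list of examples. A fixed seed keeps
    the subsample reproducible across runs.
    """
    rng = random.Random(seed)
    return {
        name: list(examples) if len(examples) <= cap else rng.sample(examples, cap)
        for name, examples in sources.items()
    }

# Tiny demonstration with a cap of 3 instead of 100k:
sampled = subsample_sources({"src_a": list(range(10)), "src_b": [1, 2]}, cap=3)
```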
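The prompting style introduced in this commit — multi-turn `<human>`/`<bot>` exchanges with `</s>` after each completed bot response, ending in an open `<bot>:` at inference time — can be sketched as a small formatter. This is only an illustration of the template shown in the diff; the helper names are assumptions, not code from the sambanova/bloomchat repository:

```python
# Sketch of the "<human>/<bot>" prompt template from the model card.
# format_training_example / format_inference_prompt are illustrative names.

EOS = "</s>"  # separator appended after each completed bot response

def format_training_example(turns):
    """Render (human, bot) exchanges into the BLOOMChat training prompt style."""
    return "\n".join(f"<human>: {h}\n<bot>: {b}{EOS}" for h, b in turns)

def format_inference_prompt(history, user_input):
    """End with an open '<bot>:' so the model completes as the assistant."""
    prefix = format_training_example(history)
    tail = f"<human>: {user_input}\n<bot>:"
    return f"{prefix}\n{tail}" if prefix else tail

print(format_inference_prompt(
    [], "Create an itemized list of tasks to complete to start a clothing brand"))
```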