natolambert
committed on
Commit 5058bd8 • 1 Parent(s): 972ac9f
Update README.md
README.md CHANGED
@@ -44,8 +44,17 @@ StarChat Alpha is intended for educational and/or research purposes and in that
 
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 
-StarChat Alpha has not been aligned to human preferences with techniques like RLHF, so the model can produce problematic outputs (especially when prompted to do so).
-
+StarChat Alpha has not been aligned to human preferences with techniques like RLHF or deployed with in-the-loop filtering of responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so).
+Models trained primarily on code data will also have a more skewed demographic bias commensurate with the demographics of the GitHub community, for more on this see the [StarCoder dataset](https://huggingface.co/datasets/bigcode/starcoderdata) which is derived from The Stack.
+
+
+Since the base model was pretrained on a large corpus of code, it may produce code snippets that are syntactically valid but semantically incorrect.
+For example, it may produce code that does not compile or that produces incorrect results.
+It may also produce code that is vulnerable to security exploits.
+We have observed the model also has a tendency to produce false URLs which should be carefully inspected before clicking.
+
+StarChat Alpha was fine-tuned from the base model [StarCoder Base](https://huggingface.co/bigcode/starcoderbase), please refer to its model card's [Limitations Section](https://huggingface.co/bigcode/starcoderbase#limitations) for relevant information.
+In particular, the model was evaluated on some categories of gender biases, propensity for toxicity, and risk of suggesting code completions with known security flaws; these evaluations are reported in its [technical report](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view).
 
 ## How to Get Started with the Model
 