Limitations & Biases: Model Card Update
#3
by Ezi - opened

README.md CHANGED
@@ -60,6 +60,19 @@ The supervised training tasks datasets can be downloaded on [Link](https://www.d
 
 The model could be used to generate lisp inspired DSL code given the human language description tasks.
 
+## Risks, Limitations and Biases
+
+As detailed in this model’s [publication](https://arxiv.org/pdf/2104.02443.pdf), this model makes use of the [One Billion Word Language Model Benchmark corpus](https://www.researchgate.net/publication/259239818_One_Billion_Word_Benchmark_for_Measuring_Progress_in_Statistical_Language_Modeling) to gather its self-supervised English data samples.
+
+Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)).
+
+It should therefore be noted that language models pretrained on text corpora such as the One Billion Word Language Model Benchmark have themselves been critically examined: for example, [Ngo, Araújo et al. (2021)](https://www.researchgate.net/publication/355582954_No_News_is_Good_News_A_Critique_of_the_One_Billion_Word_Benchmark) report that models trained on this corpus
+
+> “generate text in the linguistic style of news, without any grounding in the real world. In addition to potential harms from models which are inadvertently optimized for generating fake news.”
+
+The same publication further warns that the One Billion Word Language Model Benchmark corpus
+
+> contains sentences which contain words commonly found on blocklists. While these sentences may have plausibly been used in expository contexts within the article, the destructive sentence-level preprocessing and shuffling applied to lm1b [the One Billion Word Language Model Benchmark corpus] removes all long-range structure from the text and makes it infeasible to track the context and intent of individual examples.
+
+[Ngo, Araújo et al. (2021)](https://www.researchgate.net/publication/355582954_No_News_is_Good_News_A_Critique_of_the_One_Billion_Word_Benchmark)
 
 ## Training
 
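For context on the usage the diff's context line describes ("generate lisp inspired DSL code given the human language description tasks"), a minimal sketch of that generation flow is shown below. It assumes a seq2seq CodeTrans-style checkpoint on the Hugging Face Hub; the model id `SEBIS/code_trans_t5_base_program_synthese` and the example description are illustrative assumptions, not taken from this card.

```python
# Minimal sketch: natural-language task description in, lisp-inspired DSL code out.
# The checkpoint name below is an assumption for illustration; substitute the
# model id this card belongs to before running.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "SEBIS/code_trans_t5_base_program_synthese"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Human-language description of the program to synthesize (illustrative example).
description = (
    "you are given an array of numbers a and a number b , "
    "compute the difference of elements in a and b"
)

inputs = tokenizer(description, return_tensors="pt")
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```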