Ezi committed on
Commit
8875554
1 Parent(s): 4bb648a

Hi! 👋
This PR adds a preliminary model card, based on the format we are using as part of our effort to standardise model cards at Hugging Face. It was generated automatically using [our tool](https://huggingface.co/spaces/huggingface/Model_Cards_Writing_Tool), as we are testing our automatic model card generation and running a study on the effects of model cards on models.
Initial evidence suggests that model cards increase usage.
Please take a look when you get a chance. Feel free to merge if you are OK with the changes, or to incorporate any additional information 🤗

Files changed (1)
  1. README.md +181 -0
README.md ADDED
@@ -0,0 +1,181 @@
+ ---
+ tags:
+ - clip
+ ---
+
+ # Model Card for stable-diffusion-safety-checker
+
+ # Model Details
+
+ ## Model Description
+
+ More information needed
+
+ - **Developed by:** More information needed
+ - **Shared by [Optional]:** CompVis
+ - **Model type:** Image Identification
+ - **Language(s) (NLP):** More information needed
+ - **License:** More information needed
+ - **Parent Model:** [CLIP](https://huggingface.co/openai/clip-vit-large-patch14)
+ - **Resources for more information:**
+   - [CLIP Paper](https://arxiv.org/abs/2103.00020)
+   - [Stable Diffusion Model Card](https://github.com/CompVis/stable-diffusion/blob/main/Stable_Diffusion_v1_Model_Card.md)
+
+
+ # Uses
+
+
+ ## Direct Use
+ This model can be used to identify NSFW images.
+
+ The CLIP model developers note in their [model card](https://huggingface.co/openai/clip-vit-large-patch14):
+
+ > The primary intended users of these models are AI researchers.
+ >
+ > We primarily imagine the model will be used by researchers to better understand robustness, generalization, and other capabilities, biases, and constraints of computer vision models.
+
+
+
+ ## Downstream Use [Optional]
+
+ More information needed.
+
+ ## Out-of-Scope Use
+
+ This model is intended to be used with diffusers rather than with transformers. It should also not be used to intentionally create hostile or alienating environments for people.
+
+ # Bias, Risks, and Limitations
+
+
+ Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
+
+ The CLIP model developers note in their [model card](https://huggingface.co/openai/clip-vit-large-patch14):
+ > We find that the performance of CLIP - and the specific biases it exhibits - can depend significantly on class design and the choices one makes for categories to include and exclude. We tested the risk of certain kinds of denigration with CLIP by classifying images of people from Fairface into crime-related and non-human animal categories. We found significant disparities with respect to race and gender. Additionally, we found that these disparities could shift based on how the classes were constructed.
+
+ > We also tested the performance of CLIP on gender, race and age classification using the Fairface dataset (We default to using race categories as they are constructed in the Fairface dataset.) in order to assess quality of performance across different demographics. We found accuracy >96% across all races for gender classification with ‘Middle Eastern’ having the highest accuracy (98.4%) and ‘White’ having the lowest (96.5%). Additionally, CLIP averaged ~93% for racial classification and ~63% for age classification.
+
+ ## Recommendations
+
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ # Training Details
+
+ ## Training Data
+
+ More information needed
+
+ ## Training Procedure
+
+
+ ### Preprocessing
+
+ More information needed
+
+
+
+ ### Speeds, Sizes, Times
+
+ More information needed
+
+
+
+ # Evaluation
+
+
+ ## Testing Data, Factors & Metrics
+
+ ### Testing Data
+
+ More information needed
+
+ ### Factors
+ More information needed
+
+ ### Metrics
+
+ More information needed
+
+
+ ## Results
+
+ More information needed
+
+
+ # Model Examination
+
+ More information needed
+
+ # Environmental Impact
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** More information needed
+ - **Hours used:** More information needed
+ - **Cloud Provider:** More information needed
+ - **Compute Region:** More information needed
+ - **Carbon Emitted:** More information needed
+
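As a rough illustration of the kind of estimate described under Environmental Impact, the sketch below multiplies energy use by the carbon intensity of the electricity grid. It is not the calculator's own code. The function name `estimate_co2_kg` and every number in the example (250 W per device, 100 hours, 0.4 kg CO2eq per kWh) are placeholder assumptions, not figures for this model, whose hardware and training time are not documented in the card.

```python
def estimate_co2_kg(device_power_watts: float,
                    hours: float,
                    grid_kg_co2_per_kwh: float,
                    num_devices: int = 1) -> float:
    """Estimate emissions as energy used (kWh) times grid carbon intensity.

    All inputs are supplied by the caller; nothing here is measured.
    """
    # Convert watts * hours to kilowatt-hours across all devices.
    energy_kwh = device_power_watts * num_devices * hours / 1000.0
    # Scale by the carbon intensity of the compute region's grid.
    return energy_kwh * grid_kg_co2_per_kwh


if __name__ == "__main__":
    # Hypothetical run: one 250 W accelerator for 100 hours on a 0.4 kg/kWh grid.
    print(f"{estimate_co2_kg(250.0, 100.0, 0.4):.1f} kg CO2eq")
```

Once the hardware type, hours used, and compute region are known, they map onto the three main arguments of such a function.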
+ # Technical Specifications [optional]
+
+ ## Model Architecture and Objective
+
+ The CLIP model developers note in their [model card](https://huggingface.co/openai/clip-vit-large-patch14):
+
+ > The base model uses a ViT-L/14 Transformer architecture as an image encoder and uses a masked self-attention Transformer as a text encoder. These encoders are trained to maximize the similarity of (image, text) pairs via a contrastive loss.
+
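The contrastive objective quoted under Model Architecture and Objective can be sketched in a few lines. This is a minimal, dependency-free illustration of a symmetric contrastive loss over a batch of paired embeddings, not CLIP's actual training code. It describes the parent CLIP model rather than the safety checker itself. The function names and the fixed `temperature=0.07` are assumptions made for the example.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss: pair i of images and texts should match."""
    images = [_normalize(v) for v in image_embs]
    texts = [_normalize(v) for v in text_embs]
    # Similarity of every image with every text, sharpened by the temperature.
    logits = [[sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
              for img in images]
    transposed = [list(col) for col in zip(*logits)]
    # Average the image-to-text and text-to-image directions.
    return 0.5 * (_cross_entropy_diagonal(logits) + _cross_entropy_diagonal(transposed))


if __name__ == "__main__":
    aligned = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print(contrastive_loss(aligned, aligned))  # matching pairs: small loss
    print(contrastive_loss(aligned, swapped))  # mismatched pairs: large loss
```

The loss is low when each image embedding is closest to its own text embedding and high when the pairs are mismatched, which is the sense in which training "maximizes the similarity" of matched pairs.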
+ ## Compute Infrastructure
+
+ More information needed
+
+ ### Hardware
+
+
+ More information needed
+
+ ### Software
+
+ More information needed.
+
+ # Citation
+
+
+ **BibTeX:**
+
+ More information needed
+
+
+
+
+ **APA:**
+
+ More information needed
+
+ # Glossary [optional]
+
+ More information needed
+
+ # More Information [optional]
+ More information needed
+
+ # Model Card Authors [optional]
+
+ CompVis in collaboration with Ezi Ozoani and the Hugging Face team
+
+ # Model Card Contact
+
+ More information needed
+
+ # How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ <details>
+ <summary> Click to expand </summary>
+
+ ```python
+ # The checker class ships with diffusers; transformers provides the image preprocessor.
+ from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker
+ from transformers import AutoFeatureExtractor
+
+ feature_extractor = AutoFeatureExtractor.from_pretrained("CompVis/stable-diffusion-safety-checker")
+ safety_checker = StableDiffusionSafetyChecker.from_pretrained("CompVis/stable-diffusion-safety-checker")
+ ```
+ </details>