VictorSanh committed on
Commit a6e5964
1 Parent(s): 40ba628

integrate Giada's comments

Files changed (1): README.md (+24 -6)
README.md CHANGED
@@ -26,7 +26,9 @@ IDEFICS is on par with the original model on various image-text benchmarks, incl

We also fine-tune these base models on a mixture of supervised and instruction fine-tuning datasets, which boosts the downstream performance while making the models more usable in conversational settings: [idefics-80b-instruct](https://huggingface.co/HuggingFaceM4/idefics-80b-instruct) and [idefics-9b-instruct](https://huggingface.co/HuggingFaceM4/idefics-9b-instruct). As they reach higher performance, we recommend using these instructed versions first.

- Read more about some of the technical challenges encountered during training IDEFICS [here](https://github.com/huggingface/m4-logs/blob/master/memos/README.md).
+ Read more about some of the technical challenges we encountered during training IDEFICS [here](https://github.com/huggingface/m4-logs/blob/master/memos/README.md).
+
+ *How do I pronounce the name? [Youtube tutorial](https://www.youtube.com/watch?v=YKO0rWnPN2I&ab_channel=FrenchPronunciationGuide)*

# Model Details
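As a quick illustration of the instructed checkpoints recommended above, here is a minimal inference sketch using the `transformers` IDEFICS integration; the image URL and generation settings are illustrative placeholders, and the prompt follows the interleaved image-text convention from the model card:

```python
import torch
from transformers import AutoProcessor, IdeficsForVisionText2Text

checkpoint = "HuggingFaceM4/idefics-9b-instruct"
device = "cuda" if torch.cuda.is_available() else "cpu"

processor = AutoProcessor.from_pretrained(checkpoint)
model = IdeficsForVisionText2Text.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16
).to(device)

# Prompts interleave text with images (URLs or PIL images); the image URL
# below is only a placeholder example.
prompts = [
    [
        "User: What is in this image?",
        "https://upload.wikimedia.org/wikipedia/commons/8/86/Id%C3%A9fix.JPG",
        "<end_of_utterance>",
        "\nAssistant:",
    ]
]
inputs = processor(prompts, return_tensors="pt").to(device)

generated_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```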
@@ -356,9 +358,9 @@ When looking at the response to the arrest prompt for the FairFace dataset, the

Comparing generated responses to the resume prompt by gender across both datasets, we see for FairFace that the terms `financial`, `development`, `product` and `software` appear more frequently for `man`. For StableBias, the terms `data` and `science` appear more frequently for `non-binary`.

![Notebook Screenshot](https://huggingface.co/spaces/HuggingFaceM4/m4-bias-eval/resolve/main/bias_nb_screenshot.png)
- The [notebook](https://huggingface.co/spaces/HuggingFaceM4/m4-bias-eval/blob/main/m4_bias_eval.ipynb) used to carry out this evaluation gives a more detailed overview of the evaluation.
- You can access a [demo](https://huggingface.co/spaces/HuggingFaceM4/IDEFICS-bias-eval) to explore the outputs generated by the model for this evaluation.
- You can also access the generations produced in this evaluation at [HuggingFaceM4/m4-bias-eval-stable-bias](https://huggingface.co/datasets/HuggingFaceM4/m4-bias-eval-stable-bias) and [HuggingFaceM4/m4-bias-eval-fair-face](https://huggingface.co/datasets/HuggingFaceM4/m4-bias-eval-fair-face). We hope sharing these generations will make it easier for other people to build on our initial evaluation work.
+ The [notebook](https://huggingface.co/spaces/HuggingFaceM4/m4-bias-eval/blob/main/m4_bias_eval.ipynb) used to carry out this evaluation gives a more detailed overview of the evaluation.
+ You can access a [demo](https://huggingface.co/spaces/HuggingFaceM4/IDEFICS-bias-eval) to explore the outputs generated by the model for this evaluation.
+ You can also access the generations produced in this evaluation at [HuggingFaceM4/m4-bias-eval-stable-bias](https://huggingface.co/datasets/HuggingFaceM4/m4-bias-eval-stable-bias) and [HuggingFaceM4/m4-bias-eval-fair-face](https://huggingface.co/datasets/HuggingFaceM4/m4-bias-eval-fair-face). We hope sharing these generations will make it easier for other people to build on our initial evaluation work.
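The shared generations make it easy to rerun word-frequency comparisons like the one above. A minimal sketch with `datasets` follows; the `train` split and the `gender`/`resume` column names are assumptions, so check the dataset viewer for the actual schema:

```python
from collections import Counter

from datasets import load_dataset

# Load the generations released for the FairFace portion of the evaluation.
# NOTE: the "train" split and the column names below are assumptions;
# adjust them to the dataset's actual schema.
ds = load_dataset("HuggingFaceM4/m4-bias-eval-fair-face", split="train")

# Tally word frequencies in the resume-prompt generations per gender label.
counts: dict[str, Counter] = {}
for row in ds:
    counts.setdefault(row["gender"], Counter()).update(
        row["resume"].lower().split()
    )

for gender, counter in counts.items():
    print(gender, counter.most_common(10))
```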
Alongside this evaluation, we also computed the classification accuracy on FairFace for both the base and instructed models:

@@ -372,7 +374,23 @@ Alongside this evaluation, we also computed the classification accuracy on FairF

*Per bucket standard deviation. Each bucket represents a combination of race and gender from the [FairFace](https://huggingface.co/datasets/HuggingFaceM4/FairFace) dataset.
## Other limitations

- - The model currently will offer medical diagnosis when prompted to do so. For example, the prompt `Does this X-ray show any medical problems?` along with an image of a chest X-ray returns `Yes, the X-ray shows a medical problem, which appears to be a collapsed lung.`
+ - The model will currently offer a medical diagnosis when prompted to do so. For example, the prompt `Does this X-ray show any medical problems?` along with an image of a chest X-ray returns `Yes, the X-ray shows a medical problem, which appears to be a collapsed lung.` We strongly discourage users from using the model for medical applications without proper adaptation and evaluation.
+ - Despite our efforts in filtering the training data, we found a small proportion of content that is not suitable for all audiences. This includes pornographic content and reports of violent shootings, and it is prevalent in the OBELICS portion of the data (see [here](https://huggingface.co/datasets/HuggingFaceM4/OBELICS#content-warnings) for more details). As such, the model is susceptible to generating text that resembles this content.
+
+ # Misuse and Out-of-scope use
+
+ Using the model in [high-stakes](https://huggingface.co/bigscience/bloom/blob/main/README.md#glossary-and-calculations) settings is out of scope for this model. The model is not designed for [critical decisions](https://huggingface.co/bigscience/bloom/blob/main/README.md#glossary-and-calculations) nor for uses with any material consequences on an individual's livelihood or wellbeing. The model outputs content that appears factual but may not be correct. Out-of-scope uses include:
+ - Usage for evaluating or scoring individuals, such as for employment, education, or credit
+ - Applying the model to critical automatic decisions, generating factual content, creating reliable summaries, or generating predictions that must be correct
+
+ Intentionally using the model for harm, violating [human rights](https://huggingface.co/bigscience/bloom/blob/main/README.md#glossary-and-calculations), or engaging in other kinds of malicious activities is a misuse of this model. This includes:
+ - Spam generation
+ - Disinformation and influence operations
+ - Disparagement and defamation
+ - Harassment and abuse
+ - [Deception](https://huggingface.co/bigscience/bloom/blob/main/README.md#glossary-and-calculations)
+ - Unconsented impersonation and imitation
+ - Unconsented surveillance

# License
@@ -401,4 +419,4 @@ Stas Bekman, Léo Tronchon, Hugo Laurençon, Lucile Saulnier, Amanpreet Singh, A

# Model Card Contact

- Please open a discussion on the Community tab!
+ Please open a discussion on the Community tab!