somosnlp-hackathon-2022/es_text_neutralizer


fermaat committed on Apr 1, 2022

Commit fccbfaa (1 parent: 7fde946)

Update README.md

Files changed (1)

README.md +10 -9

@@ -7,21 +7,22 @@ tags:
 - Inclusive Language
 - Text Neutralization
 - pytorch
-# datasets:
-#- {Pending}  # Example: common_voice. Use dataset id from https://hf.co/datasets
+datasets:
+- hackathon-pln-es/neutral-es
 metrics:
 - sacrebleu
 model-index:
-- name: es_nlp_text_neutralizer
+- name: es_text_neutralizer
   results:
   - task:
       type: Text2Text Generation
       name: Neutralization of texts in Spanish
-#     dataset:
-#       type: {Pending}  # Required. Example: common_voice. Use dataset id from https://hf.co/datasets
-#      name: {handcrafted dataset}  # Optional. Example: Common Voice zh-CN
-#       args: {es}         # Optional. Example: zh-CN
+    dataset:
+      type: hackathon-pln-es/neutral-es
+      name: neutral-es
     metrics:
       - type: sacrebleu    # Required. Example: wer
         value: 93.8347  # Required. Example: 20.90
@@ -50,8 +51,8 @@ By using gender inclusive models we can help reducing gender bias in a language
 ## Training and evaluation data
-The data used for the model training has been manually created form a compilation of sources, obtained from a series of guidelines and manuals issued by Spanish Ministry of Health, Social Services and Equality in the matter of the usage of non-sexist language, stipulated in this linked [document](https://www.inmujeres.gob.es/servRecursos/formacion/GuiasLengNoSexista/docs/Guiaslenguajenosexista_.pdf):
+One of the major challenges was to obtain a valuable dataset that would suit our purpose, therefore, the team opted to dedicate a considerable amount of time to build it from a scratch.
+The data used for the model training has been created form a compilation of sources, obtained from a series of guidelines and manuals issued by Spanish Ministry of Health, Social Services and Equality in the matter of the usage of non-sexist language, stipulated in this linked [document:](https://www.inmujeres.gob.es/servRecursos/formacion/GuiasLengNoSexista/docs/Guiaslenguajenosexista_.pdf):
 ### Compiled sources