mmarimon committed on
Commit ebf005e
1 Parent(s): 1f88a3a

Update README.md

Files changed (1)
  1. README.md +43 -23
README.md CHANGED
@@ -51,6 +51,9 @@ widget:
  # Catalan BERTa (RoBERTa-base) finetuned for Named Entity Recognition.

  ## Table of Contents
  - [Model Description](#model-description)
  - [Intended Uses and Limitations](#intended-uses-and-limitations)
  - [How to Use](#how-to-use)
@@ -60,20 +63,33 @@ widget:
  - [Evaluation](#evaluation)
  - [Variable and Metrics](#variable-and-metrics)
  - [Evaluation Results](#evaluation-results)
- - [Licensing Information](#licensing-information)
- - [Citation Information](#citation-information)
- - [Funding](#funding)
- - [Contributions](#contributions)
- - [Disclaimer](#disclaimer)

  ## Model description

  The **roberta-base-ca-cased-ner** is a Named Entity Recognition (NER) model for the Catalan language fine-tuned from the [BERTa](https://huggingface.co/PlanTL-GOB-ES/roberta-base-ca) model, a [RoBERTa](https://arxiv.org/abs/1907.11692) base model pre-trained on a medium-size corpus collected from publicly available corpora and crawlers (check the BERTa model card for more details).

- ## Datasets
  We used the NER dataset in Catalan called [Ancora-ca-ner](https://huggingface.co/datasets/projecte-aina/ancora-ca-ner) for training and evaluation.

- ## Evaluation and results
  We evaluated the _roberta-base-ca-cased-ner_ on the Ancora-ca-ner test set against standard multilingual and monolingual baselines:

  | Model | Ancora-ca-ner (F1)|
@@ -85,7 +101,25 @@ We evaluated the _roberta-base-ca-cased-ner_ on the Ancora-ca-ner test set again

  For more details, check the fine-tuning and evaluation scripts in the official [GitHub repository](https://github.com/projecte-aina/club).

- ## Citing
  If you use any of these resources (datasets or models) in your work, please cite our latest paper:
  ```bibtex
  @inproceedings{armengol-estape-etal-2021-multilingual,
@@ -109,21 +143,7 @@ If you use any of these resources (datasets or models) in your work, please cite
  }
  ```

- ## Licensing Information
-
- [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
-
- ## Citation Information
-
-
- ### Funding
- This work was funded by the [Departament de la Vicepresidència i de Polítiques Digitals i Territori de la Generalitat de Catalunya](https://politiquesdigitals.gencat.cat/ca/inici/index.html#googtrans(ca|en) within the framework of [Projecte AINA](https://politiquesdigitals.gencat.cat/ca/economia/catalonia-ai/aina).
-
- ## Contributions
-
- [N/A]
-
- ## Disclaimer

  <details>
  <summary>Click to expand</summary>
 
  # Catalan BERTa (RoBERTa-base) finetuned for Named Entity Recognition.

  ## Table of Contents
+ <details>
+ <summary>Click to expand</summary>
+
  - [Model Description](#model-description)
  - [Intended Uses and Limitations](#intended-uses-and-limitations)
  - [How to Use](#how-to-use)

  - [Evaluation](#evaluation)
  - [Variable and Metrics](#variable-and-metrics)
  - [Evaluation Results](#evaluation-results)
+ - [Author](#author)
+ - [Contact information](#contact-information)
+ - [Copyright](#copyright)
+ - [Licensing information](#licensing-information)
+ - [Funding](#funding)
+ - [Citation information](#citation-information)
+ - [Disclaimer](#disclaimer)
+ </details>
+

  ## Model description

  The **roberta-base-ca-cased-ner** is a Named Entity Recognition (NER) model for the Catalan language fine-tuned from the [BERTa](https://huggingface.co/PlanTL-GOB-ES/roberta-base-ca) model, a [RoBERTa](https://arxiv.org/abs/1907.11692) base model pre-trained on a medium-size corpus collected from publicly available corpora and crawlers (check the BERTa model card for more details).

+ ## Intended uses and limitations
+
+
+ ## How to use
+
+
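The "How to use" section added by this commit is still empty. A minimal sketch of what it could contain, using the Hugging Face `transformers` token-classification pipeline: note that the hub id `projecte-aina/roberta-base-ca-cased-ner` used below is an assumption inferred from the model name in this card, so verify it against the actual model page.

```python
def build_ner_pipeline(model_id: str = "projecte-aina/roberta-base-ca-cased-ner"):
    """Create a token-classification pipeline for the Catalan NER model.

    The default ``model_id`` is an assumption; check the Hugging Face Hub
    for the model's actual identifier before relying on it.
    """
    # Imported lazily so the helper can be defined without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word pieces into whole entities.
    return pipeline("ner", model=model_id, aggregation_strategy="simple")


if __name__ == "__main__":
    ner = build_ner_pipeline()
    # Each result carries the grouped entity text, its type, and a confidence score.
    for entity in ner("Em dic Lluís i visc a Barcelona."):
        print(entity["word"], entity["entity_group"], round(float(entity["score"]), 3))
```

Running the script downloads the model on first use; the pipeline returns one dict per detected entity with `word`, `entity_group`, `score`, `start`, and `end` keys.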
+ ## Limitations and bias
+ At the time of submission, no measures have been taken to estimate the bias embedded in the model. However, we are well aware that our models may be biased since the corpora have been collected using crawling techniques on multiple web sources. We intend to conduct research in these areas in the future, and if completed, this model card will be updated.
+
+ ## Training
  We used the NER dataset in Catalan called [Ancora-ca-ner](https://huggingface.co/datasets/projecte-aina/ancora-ca-ner) for training and evaluation.

+ ## Evaluation
  We evaluated the _roberta-base-ca-cased-ner_ on the Ancora-ca-ner test set against standard multilingual and monolingual baselines:

  | Model | Ancora-ca-ner (F1)|

  For more details, check the fine-tuning and evaluation scripts in the official [GitHub repository](https://github.com/projecte-aina/club).
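The F1 scores in the table above are entity-level: a predicted entity counts as correct only when both its span and its type exactly match the gold annotation. As an illustration of that metric (not the project's actual evaluation code, which lives in the GitHub repository linked above), a strict span-level F1 over BIO tag sequences can be computed like this:

```python
def bio_spans(tags):
    """Extract (start, end, type) entity spans from a BIO tag sequence."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-") or (tag.startswith("I-") and etype != tag[2:]):
            # A B- tag, or an I- tag that does not continue the open entity,
            # closes any open span and opens a new one.
            if start is not None:
                spans.append((start, i, etype))
            start, etype = i, tag[2:]
        elif tag == "O":
            if start is not None:
                spans.append((start, i, etype))
            start, etype = None, None
    if start is not None:
        spans.append((start, len(tags), etype))
    return set(spans)


def entity_f1(gold, pred):
    """Strict entity-level F1: exact span and type match required."""
    g, p = bio_spans(gold), bio_spans(pred)
    tp = len(g & p)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(p), tp / len(g)
    return 2 * precision * recall / (precision + recall)
```

For example, if the gold tags mark a PER and a LOC entity but the prediction recovers only the PER span, precision is 1.0, recall is 0.5, and F1 is 2/3. Libraries such as seqeval implement the same scheme with more corner-case handling.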

+ ## Additional information
+
+ ### Author
+ Text Mining Unit (TeMU) at the Barcelona Supercomputing Center (bsc-temu@bsc.es)
+
+ ### Contact information
+ For further information, send an email to aina@bsc.es
+
+ ### Copyright
+ Copyright (c) 2021 Text Mining Unit at Barcelona Supercomputing Center
+
+ ### Licensing Information
+ [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+
+ ### Funding
+ This work was funded by the [Departament de la Vicepresidència i de Polítiques Digitals i Territori de la Generalitat de Catalunya](https://politiquesdigitals.gencat.cat/ca/inici/index.html#googtrans(ca|en)) within the framework of [Projecte AINA](https://politiquesdigitals.gencat.cat/ca/economia/catalonia-ai/aina).
+
+ ### Citation information
+
  If you use any of these resources (datasets or models) in your work, please cite our latest paper:
  ```bibtex
  @inproceedings{armengol-estape-etal-2021-multilingual,
  }
  ```

+ ### Disclaimer

  <details>
  <summary>Click to expand</summary>