crodri committed on
Commit 2583146
1 Parent(s): c4a56de

added accuracies

Files changed (1)
  1. README.md +77 -4
README.md CHANGED
@@ -66,12 +66,85 @@ It has been trained with a dataset that contains 9 main types and 52 subtypes on
  At the time of submission, no measures have been taken to estimate the bias embedded in the model. However, we are well aware that our models may be biased since the corpora have been collected using crawling techniques on multiple web sources. We intend to conduct research in these areas in the future, and if completed, this model card will be updated.
 
  ## Training
- We used the NER dataset in Catalan called [Catalan Entity Identification and Linking](https://huggingface.co/datasets/crodri/ceil) for training and evaluation.
+ We used the NERC dataset in Catalan called [Catalan Entity Identification and Linking](https://huggingface.co/datasets/crodri/ceil) for training and evaluation.
 
  ## Evaluation
- |
 
- For more details, check the fine-tuning and evaluation scripts in the official [GitHub repository](https://github.com/projecte-aina/club).
+ Accuracy was calculated using the development set, and reflects the non-balanced nature of the dataset.
+
+ ### Major types
+
+ | Type | Accuracy | num. Instances in dev set |
+ | ------ | ------ | ------ |
+ | CW | 0.842 | 4551 |
+ | GPE | 0.914 | 19751 |
+ | Other | 0.69 | 2824 |
+ | building | 0.736 | 2188 |
+ | event | 0.739 | 3000 |
+ | location | 0.819 | 3408 |
+ | organization | 0.895 | 17285 |
+ | person | 0.903 | 21689 |
+ | product | 0.64 | 1038 |
+
+
+ ### Subtypes
+
+ | Type | Accuracy | num. Instances in dev set |
+ | ------ | ------ | ------ |
+ | CW-broadcastprogram | 0.854 | 765 |
+ | CW-film | 0.809 | 549 |
+ | CW-music | 0.862 | 1027 |
+ | CW-other | 0.495 | 555 |
+ | CW-painting | 0.654 | 205 |
+ | CW-writtenart | 0.814 | 1450 |
+ | GPE | 0.914 | 19751 |
+ | Other | 0.69 | 2824 |
+ | building-airport | 0.733 | 176 |
+ | building-governmentfacility | 0.514 | 72 |
+ | building-hospital | 0.805 | 113 |
+ | building-hotel | 0.688 | 32 |
+ | building-other | 0.726 | 1585 |
+ | building-religious | 0.0 | 1 |
+ | building-restaurant | 0.458 | 48 |
+ | building-shops | 0.206 | 34 |
+ | building-sportsfacility | 0.74 | 127 |
+ | event-attack/terrorism/militaryconflict | 0.866 | 411 |
+ | event-disaster | 0.261 | 23 |
+ | event-other | 0.695 | 1069 |
+ | event-political | 0.527 | 444 |
+ | event-protest | 0.207 | 29 |
+ | event-sportsevent | 0.822 | 1024 |
+ | location-bodiesofwater | 0.865 | 673 |
+ | location-island | 0.457 | 140 |
+ | location-mountain | 0.781 | 515 |
+ | location-other | 0.757 | 1602 |
+ | location-park | 0.581 | 93 |
+ | location-road/railway/highway/transit | 0.805 | 385 |
+ | organization-education | 0.868 | 2097 |
+ | organization-government | 0.905 | 2939 |
+ | organization-media | 0.888 | 1963 |
+ | organization-onlinebusiness | 0.538 | 197 |
+ | organization-other | 0.788 | 4733 |
+ | organization-politicalparty | 0.956 | 2272 |
+ | organization-privatecompany | 0.849 | 1809 |
+ | organization-religious | 0.638 | 210 |
+ | organization-sportsteam | 0.946 | 1065 |
+ | person-actor/director | 0.797 | 1480 |
+ | person-artist/author | 0.853 | 5812 |
+ | person-athlete | 0.871 | 1306 |
+ | person-group | 0.485 | 699 |
+ | person-influencer | 0.0 | 17 |
+ | person-other | 0.811 | 8444 |
+ | person-politician | 0.863 | 3259 |
+ | person-scholar/scientist | 0.728 | 672 |
+ | product-E-device | 0.51 | 102 |
+ | product-clothing | 0.222 | 27 |
+ | product-consumer_good | 0.0 | 20 |
+ | product-food | 0.673 | 324 |
+ | product-other | 0.0 | 69 |
+ | product-software | 0.67 | 382 |
+ | product-vehicle | 0.825 | 114 |
+
 
  ## Additional information
 
@@ -103,4 +176,4 @@ The models published in this repository are intended for a generalist purpose an
 
  When third parties, deploy or provide systems and/or services to other parties using any of these models (or using systems based on these models) or become users of the models, they should note that it is their responsibility to mitigate the risks arising from their use and, in any event, to comply with applicable regulations, including regulations regarding the use of Artificial Intelligence.
 
- In no event shall the owner and creator of the models (BSC – Barcelona Supercomputing Center) be liable for any results arising from the use made by third parties of these models.
+ In no event shall the owner and creator of the models (BSC – Barcelona Supercomputing Center) be liable for any results arising from the use made by third parties of these models.
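
Note on the added evaluation text: the new README line says the accuracies reflect "the non-balanced nature of the dataset". One way to see what that could mean is to compare an unweighted (macro) mean of the per-type accuracies with a mean weighted by dev-set instance counts. The sketch below is only an illustration built from the "Major types" table in this commit; the names `MAJOR_TYPES`, `macro_average`, and `weighted_average` are hypothetical, and nothing here is taken from the authors' evaluation scripts.

```python
# Values copied from the "Major types" table added in this commit:
# type -> (accuracy, number of instances in the dev set).
MAJOR_TYPES = {
    "CW": (0.842, 4551),
    "GPE": (0.914, 19751),
    "Other": (0.69, 2824),
    "building": (0.736, 2188),
    "event": (0.739, 3000),
    "location": (0.819, 3408),
    "organization": (0.895, 17285),
    "person": (0.903, 21689),
    "product": (0.64, 1038),
}


def macro_average(table):
    """Unweighted mean of per-type accuracies: every type counts equally."""
    return sum(acc for acc, _ in table.values()) / len(table)


def weighted_average(table):
    """Mean of per-type accuracies weighted by dev-set instance counts."""
    total = sum(n for _, n in table.values())
    return sum(acc * n for acc, n in table.values()) / total


if __name__ == "__main__":
    print(f"macro average:    {macro_average(MAJOR_TYPES):.3f}")
    print(f"weighted average: {weighted_average(MAJOR_TYPES):.3f}")
```

The weighted figure comes out higher than the macro figure because the three largest types (person, GPE, organization) are also the most accurate, so frequent types dominate any instance-level summary. Under this reading, the per-type rows say more about rare types such as product than a single overall number would.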