scampion committed on
Commit
2c8ad5c
1 Parent(s): 376bb59

Update README.md

Files changed (1)
  1. README.md +41 -10
README.md CHANGED
@@ -34,24 +34,55 @@ language:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-# EUBERT
-
-This model is a pretrained BERT uncased model trained on the last 30 years of documents registered by the [European Publications Office](https://op.europa.eu/)
-
-![EUBERT](https://huggingface.co/EuropeanParliament/EUBERT/resolve/main/EUBERT_small.png)
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
 
 ## Training procedure
@@ -87,7 +118,7 @@ Coming soon
 - **Compute Region:** Meluxina
 
 
-# Model Card Authors [optional]
 
 Sebastien Campion
 
+## Model Card: EUBERT
+
+### Overview
+
+- **Model Name**: EUBERT
+- **Model Version**: 1.0
+- **Date of Release**: 02 October 2023
+- **Model Architecture**: BERT (Bidirectional Encoder Representations from Transformers)
+- **Training Data**: Documents registered by the European Publications Office
+- **Model Use Cases**: Text Classification, Question Answering, Language Understanding
+
+### Model Description
+
+EUBERT is a pretrained, uncased BERT model trained on documents registered by the [European Publications Office](https://op.europa.eu/).
+These documents span the last 30 years and cover a wide range of topics and domains.
+EUBERT is a versatile base model that can be fine-tuned for a variety of natural language processing tasks.
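As a quick sanity check, the pretrained encoder can be loaded with the `transformers` library. A minimal sketch, assuming the hub id `EuropeanParliament/EUBERT` (the id used elsewhere in this card); the example sentence is illustrative only:

```python
# Minimal sketch: load EUBERT and extract a sentence embedding.
# Assumes the hub id "EuropeanParliament/EUBERT"; adjust if it differs.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("EuropeanParliament/EUBERT")
model = AutoModel.from_pretrained("EuropeanParliament/EUBERT")

# Encode a sentence and take the [CLS] vector as a crude sentence embedding.
inputs = tokenizer("The European Parliament adopted the directive.",
                   return_tensors="pt")
outputs = model(**inputs)
cls_embedding = outputs.last_hidden_state[:, 0, :]  # shape: (1, hidden_size)
```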
+
+### Intended Use
+
+EUBERT is a starting point for building more specific natural language understanding models.
+It is suitable for a wide range of tasks, including but not limited to:
+
+1. **Text Classification**: EUBERT can be fine-tuned to classify documents into categories, supporting applications such as sentiment analysis, topic categorization, and spam detection.
+
+2. **Question Answering**: Fine-tuned on question-answering datasets, EUBERT can extract answers from text, supporting information retrieval and document summarization.
+
+3. **Language Understanding**: EUBERT can be used for token-level tasks such as named entity recognition and part-of-speech tagging.
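The text-classification use case above can be sketched as follows. This is a hypothetical example, not a tested recipe: it assumes the hub id `EuropeanParliament/EUBERT`, and the texts and labels are placeholders standing in for a real labelled dataset:

```python
# Hypothetical sketch: attach a binary classification head to EUBERT.
# Assumes the hub id "EuropeanParliament/EUBERT"; labels are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "EuropeanParliament/EUBERT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id,
                                                           num_labels=2)

# One toy batch; in practice, fine-tune on a labelled dataset with a
# training loop of your choice (e.g. the transformers Trainer API).
batch = tokenizer(["The proposal was adopted.", "The motion failed."],
                  padding=True, truncation=True, return_tensors="pt")
labels = torch.tensor([1, 0])
outputs = model(**batch, labels=labels)
loss = outputs.loss      # cross-entropy against the toy labels
logits = outputs.logits  # shape: (2, num_labels)
loss.backward()          # gradients flow through the whole encoder
```

The classification head is randomly initialized, so predictions are meaningless until the model is actually fine-tuned.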
+
+### Performance
+
+Performance depends on the downstream task and on the quality and quantity of the fine-tuning data.
+Users should fine-tune the model on their specific task and evaluate it accordingly.
+
+### Considerations
+
+- **Data Privacy and Compliance**: Ensure that use of EUBERT complies with applicable data privacy regulations, especially when working with sensitive or personally identifiable information.
+
+- **Fine-Tuning**: Results depend on the quality and quantity of the training data and on the fine-tuning process itself; careful experimentation and evaluation are essential.
+
+- **Bias and Fairness**: Be aware of potential biases in the training data and take steps to mitigate them when fine-tuning for specific tasks.
+
+### Conclusion
+
+EUBERT leverages a substantial corpus of documents from the European Publications Office and provides a versatile foundation for text classification, question answering, and language understanding. Fine-tune and evaluate it carefully for your use case, with attention to data privacy and fairness.
+
+---
 
 ## Training procedure
 
 - **Compute Region:** Meluxina
 
 
+# Model Card Authors
 
 Sebastien Campion