usmiva committed
Commit c771548
1 Parent(s): 9b014c6

Update README.md

Files changed (1):
1. README.md +16 -5

README.md CHANGED
@@ -5,11 +5,11 @@ language:
 pipeline_tag: text-generation
 ---
 
-# Model Card for Model ID
+# Model Card for GPT-WEB-BG
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-This model is pre-trained with the causal language modelling objective on a private web-scraped dataset created at the Bulgarian Academy of Sciences under the [ClaDa-BG Project](https://clada-bg.eu/en/).
+This model is pre-trained with the causal language modelling objective on a private dataset of web-scraped content created at the Bulgarian Academy of Sciences under the [ClaDa-BG Project](https://clada-bg.eu/en/).
 
 The dataset is cleaned and balanced with a specialized procedure to avoid cultural, political, racial and other biases. The procedure is described in the paper dedicated to this model - coming soon!
@@ -18,7 +18,7 @@ The dataset is cleaned and balanced with a specialized procedure to avoid cultur
 
 ### Model Description
 
-<!-- Provide a longer summary of what this model is. -->
+The model is the first in a series of Large Language Models for Bulgarian.
 
 
@@ -39,7 +39,8 @@ The dataset is cleaned and balanced with a specialized procedure to avoid cultur
 
 ## Uses
 
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+The model is trained on the causal language modeling objective and can be used to generate content from textual input. It can be further fine-tuned for specific NLP tasks in the online media domain, such as Event Extraction, Relation Extraction and Named Entity Recognition.
+This model is intended for use by researchers and practitioners in the NLP field.
 
 ### Direct Use
@@ -75,7 +76,17 @@ Users (both direct and downstream) should be made aware of the risks, biases and
 
 Use the code below to get started with the model.
 
-[More Information Needed]
+```python
+from transformers import pipeline, set_seed
+gpt_web_bg = pipeline('text-generation', model='usmiva/gpt_web_bg', max_length=50, num_beams=3, temperature=0.8)
+set_seed(42)
+```
+
+```python
+gpt_web_bg("По професия той е ")
+```
+[{'generated_text': 'По професия той е строителен работник, който е �'}]
 
 ## Training Details