Update README.md
README.md (changed)
````diff
@@ -7,20 +7,20 @@ license: apache-2.0
 tags:
 - "text generation"
 ---
-
+### Model description
 GPT-2 model from Lithuania using Wikipedia corpus dataset based on GPT-2 small model.
 
 This is only the first version of the model, over time model will be improved using a bigger dataset and better data preparation.
 
-
+### Training data
 This model was pre-trained with 180MB of Lithuanian Wikipedia. The texts are tokenized using a byte-level version of Byte Pair Encoding (BPE).
 
-
+### Training
 The model was trained on wiki-corpus for 40 hours using NVIDIA Tesla P100 GPU.
 
-
+### How to use
 
-
+## Load model
 
 ``` from transformers import AutoTokenizer, TFAutoModelWithLMHead
 import tensorflow as tf
@@ -33,7 +33,7 @@ tokenizer.model_max_length=1024
 
 model.eval()
 ```
-
+## Generate text
 
 ``` text = "tekstas"
 inputs = tokenizer.encode(text, return_tensors="tf")
````
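For readers following the "Load model" snippet in the hunks above: the diff hides the `from_pretrained` lines, so a complete load step can only be sketched. Below is a minimal version that assumes a placeholder Hub repo id (the real id is in README lines not shown here); it keeps the `AutoTokenizer`, `TFAutoModelWithLMHead`, and `tokenizer.model_max_length = 1024` calls the diff does show, and drops `model.eval()`, which is a PyTorch idiom with no TensorFlow counterpart. The final `tokenize` call is only there to illustrate the byte-level BPE splitting mentioned under "Training data".

```
from transformers import AutoTokenizer, TFAutoModelWithLMHead

# Placeholder id: the actual Hub repo id appears in README lines the diff does not show.
model_id = "<namespace>/<lithuanian-gpt2-small>"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelWithLMHead.from_pretrained(model_id)

# Cap inputs at GPT-2 small's 1024-token context window, as the README does.
tokenizer.model_max_length = 1024

# Byte-level BPE splits Lithuanian text into subword pieces; purely illustrative.
print(tokenizer.tokenize("tekstas"))
```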
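The "Generate text" hunk stops right after `tokenizer.encode`. A plausible continuation, reusing the `tokenizer` and `model` from the sketch above, is shown below; the sampling settings are illustrative defaults, not values taken from the README.

```
text = "tekstas"
inputs = tokenizer.encode(text, return_tensors="tf")

# Illustrative sampling settings; the README does not specify generation parameters.
outputs = model.generate(
    inputs,
    max_length=100,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    no_repeat_ngram_size=2,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```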