---
language:
- sr
tags:
- Srpski
- Serbian
- GPT2
- generisanje
- generation
name:
- Serbian-GPT-2
---
# The Best Generative GPT-2 Model For The Serbian Language
**NOTE**: This model is locked with an encryption key. If you need a decryption key, feel free to contact us at info@edukom.rs
![flag.png](https://cdn-uploads.huggingface.co/production/uploads/64fc6ba4e0dc35986bc3b6ee/gCUs3UIix41opzOu1mkD7.png)
By sharing this model, we aim to foster further research and applications in Serbian language processing.
### Introduction:
This GPT-2 model has been fine-tuned on an extensive Serbian corpus of 750 million tokens. It is designed to generate high-quality text in Serbian, capturing the nuances and intricacies of the language.
### Dataset Details:
The dataset encompasses a diverse range of topics, representing various aspects of the Serbian language and culture. Size: 750 million tokens.
### Model Usage:
This model can be utilized for various NLP tasks such as text generation, summarization, translation, and more. Thanks to its training on a vast corpus, it produces accurate and contextually relevant outputs for Serbian-language tasks.
### Downloading & Decrypting the Model:
```python
import os
import requests
import shutil
import threading
import time

from transformers import GPT2LMHeadModel
from cryptography.fernet import Fernet

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

# Download the Serbian-GPT-2 model files
print("\nDownloading the Serbian-GPT-2 model...")

model_name = 'edukom/Serbian-GPT-2'
base_url = f'https://huggingface.co/{model_name}/resolve/main/'
files_to_download = ['added_tokens.json', 'config.json', 'generation_config.json', 'merges.txt', 'pytorch_model.bin', 'special_tokens_map.json', 'tokenizer.json', 'tokenizer_config.json', 'vocab.json']
cache_dir = 'path/to/where/you/want/to/store/the/model'

os.makedirs(cache_dir, exist_ok=True)

for file in files_to_download:
    response = requests.get(base_url + file)
    response.raise_for_status()
    with open(os.path.join(cache_dir, file), 'wb') as f:
        f.write(response.content)

# Decrypt pytorch_model.bin
key = input("\nEnter the decryption key: ").encode()
cipher_suite = Fernet(key)
decryption_data = os.path.join(cache_dir, 'pytorch_model.bin')

try:
    with open(decryption_data, 'rb') as file:
        encrypted_data = file.read()

    decrypted_data = cipher_suite.decrypt(encrypted_data)

    with open(decryption_data, 'wb') as file:
        file.write(decrypted_data)

    def find_and_copy():
        # Wait for from_pretrained() to create the snapshot directory,
        # then overwrite the encrypted checkpoint with the decrypted one
        base_snapshot_dir = os.path.join(cache_dir, 'models--edukom--Serbian-GPT-2', 'snapshots')
        while not os.path.exists(base_snapshot_dir):
            time.sleep(0.1)
        while True:
            existing_dirs = [d for d in os.listdir(base_snapshot_dir) if os.path.isdir(os.path.join(base_snapshot_dir, d))]
            if existing_dirs:
                destination_path = os.path.join(base_snapshot_dir, existing_dirs[0], 'pytorch_model.bin')
                shutil.copyfile(decryption_data, destination_path)
                break
            time.sleep(0.1)

    # Start the copy process in parallel
    copy_thread = threading.Thread(target=find_and_copy, name="find_and_copy")
    copy_thread.start()

    # Load the Serbian-GPT-2 model
    model = GPT2LMHeadModel.from_pretrained(model_name, cache_dir=cache_dir)

    # Ensure the copying finishes
    copy_thread.join()

    print("\nCongratulations, the Serbian-GPT-2 model is ready for use ヅ\n")

except Exception as e:
    print(f"\nError during decryption: {e}")
    print("\nYou can decrypt the model by contacting the author of this model, who will provide the key: info@edukom.rs")

# Now you can use the Serbian-GPT-2 model for further operations...
```
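Once the model is loaded, generation follows the standard `transformers` API. A minimal sketch (the helper name, prompt, and sampling parameters below are illustrative, not part of the official release):

```python
def generate_text(model, tokenizer, prompt, max_length=100):
    """Generate Serbian text from a prompt using top-p sampling."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_length=max_length,
        do_sample=True,
        top_p=0.95,
        temperature=0.8,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage (requires the decrypted model and tokenizer from the steps above):
# from transformers import GPT2TokenizerFast
# tokenizer = GPT2TokenizerFast.from_pretrained(model_name, cache_dir=cache_dir)
# print(generate_text(model, tokenizer, "Beograd je"))
```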
### Model Usage License:
The author of this model is the company **Edukom AI**. The model is protected by encryption and its use requires a decryption key.
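For reference, the script above relies on Fernet symmetric encryption from the `cryptography` package. A self-contained sketch of the round trip, using a throwaway key and dummy bytes rather than the actual model weights:

```python
from cryptography.fernet import Fernet

# Generate a throwaway key; the real model requires a key issued by Edukom AI
key = Fernet.generate_key()
cipher_suite = Fernet(key)

# Encrypt and decrypt dummy bytes to illustrate the round trip
original = b"dummy model weights"
encrypted = cipher_suite.encrypt(original)
decrypted = cipher_suite.decrypt(encrypted)

assert decrypted == original
```

Decrypting with a wrong key raises `cryptography.fernet.InvalidToken`, which is why the script above wraps the decryption step in a `try`/`except` block.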
This model is available under the following license:
**For private and non-public use**: This model is freely available for use without any additional obligations. You can use it in your internal projects and experiments without any restrictions.
**For commercial use**: For commercial use of this model, users are required to contact Edukom AI to obtain the appropriate license and agreement.
Please adhere to the license terms when using this model. For any questions or if you need decryption keys, feel free to contact us at **info@edukom.rs**
Thank you for using our model! ヅ
![Screenshot.png](https://cdn-uploads.huggingface.co/production/uploads/64fc6ba4e0dc35986bc3b6ee/UoIvwAez4ZoiEsHyx-vn6.png)