Kuduxaaa committed
Commit 85d3e38 • 1 Parent(s): d374759

Update README.md

Files changed (1): README.md (+74, -0)
README.md CHANGED

---
license: mit
datasets:
- wikimedia/wikipedia
language:
- ka
- en
pipeline_tag: text-generation
---

# GPT-2Geo: Georgian Language Model

## Overview

GPT-2Geo is a language model for Georgian, built on OpenAI's GPT-2 architecture. It is designed for natural language processing tasks such as text generation and understanding. The source code is available on [GitHub](https://github.com/Kuduxaaa/gpt2-geo).

## Features

- **Georgian Language Model:** trained specifically to understand and generate Georgian text.
- **GPT-2 Architecture:** built on OpenAI's GPT-2, a versatile and efficient language model.
- **Easy Integration:** works directly with the Hugging Face Transformers library, as the short sketch below and the full example under Usage show.
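
For instance, the checkpoint and its tokenizer load with two `from_pretrained` calls. This is only a minimal sketch using the same identifiers as the Usage example further down:

```python
from transformers import GPT2LMHeadModel, ElectraTokenizerFast

# Load GPT-2Geo and its Georgian tokenizer from the Hugging Face Hub
model = GPT2LMHeadModel.from_pretrained('Kuduxaaa/gpt2-geo')
tokenizer = ElectraTokenizerFast.from_pretrained('Kuduxaaa/gpt2-geo')
```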

## Training Information

### Environment
- **GPU:** NVIDIA T4 (15 GB)
- **Model Memory Requirement:** minimum 13.5 GB

### Training Configuration
- **Number of Epochs:** 20
- **Training Time:** approximately 49 minutes

### Training Progress
GPT-2Geo was trained on an NVIDIA T4 GPU with 15 GB of memory, which covers the model's minimum requirement of 13.5 GB.

Training ran for 20 epochs and finished in roughly 49 minutes.

For details on how the model behaved during training, consult the training logs, which record metrics such as validation loss per epoch.

Before training, make sure your GPU environment is configured correctly so that the available hardware is fully used; a small check is sketched below.
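
As a quick sanity check, something along these lines (an illustrative snippet, not part of the original repository) confirms that PyTorch sees a GPU with enough memory for the roughly 13.5 GB the model needs:

```python
import torch

# Verify that a CUDA device is visible and report its total memory
assert torch.cuda.is_available(), 'No CUDA-capable GPU detected'
props = torch.cuda.get_device_properties(0)
total_gb = props.total_memory / 1024**3
print(f'GPU: {props.name}, total memory: {total_gb:.1f} GB')

# The README states the model needs at least ~13.5 GB during training
if total_gb < 13.5:
    print('Warning: this GPU may not have enough memory to train GPT-2Geo')
```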

## Usage

### Example
```python
import torch
from transformers import GPT2LMHeadModel, ElectraTokenizerFast

# Load the model and its Georgian tokenizer from the Hugging Face Hub
model_name = 'Kuduxaaa/gpt2-geo'
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = ElectraTokenizerFast.from_pretrained(model_name)

# Move the model to GPU if one is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)

prompt = 'ქართულ მითოლოგიაში '  # "In Georgian mythology "

# Encode the prompt and generate a continuation with beam search
input_ids = tokenizer.encode(prompt, return_tensors='pt').to(device)
output = model.generate(
    input_ids,
    max_length = 100,
    num_beams = 5,
    no_repeat_ngram_size = 2,
    top_k = 50,
    top_p = 0.95,
    temperature = 0.7
)

result = tokenizer.decode(output[0], skip_special_tokens=True)
print(result)
# ქართულ მითოლოგიაში, მითების პერსონაჟები და. მითები დაკავშირებული მითური წარმოშობას, რომელიც წარმოიშვა მითი გარემოც, რომ ამ პერიოდში და სხვა სხვა. აგრეთვე მითიდან წარმოადგენს მითებთან ერთად, როგორც საშუალებები, საფუძვლად წარმოების წარსულში. ლიტერატურა წარმომავლობებს მით
```
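
Note that the example above decodes with beam search; in the Transformers `generate` API, `top_k`, `top_p`, and `temperature` only affect the output once sampling is enabled. A sampling-based variant, reusing `model`, `tokenizer`, and `input_ids` from the example above (an illustrative alternative, not part of the original README), would look like this:

```python
# Nucleus sampling instead of beam search
# (do_sample=True is what activates top_k / top_p / temperature)
output = model.generate(
    input_ids,
    do_sample = True,
    max_length = 100,
    no_repeat_ngram_size = 2,
    top_k = 50,
    top_p = 0.95,
    temperature = 0.7,
)
result = tokenizer.decode(output[0], skip_special_tokens=True)
print(result)
```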

## Acknowledgments

This project is made possible by the contributions of Nika Kudukashvili and the open-source community. Special thanks to OpenAI for the GPT-2 architecture and to `jnz/electra-ka` for the Georgian tokenizer.