Commit 505f6f7 by Kirill Gelvan
1 parent: a0f2756

add some descriptions

Files changed (1)
  1. README.md +6 -2
README.md CHANGED
@@ -5,11 +5,15 @@ tags:
 ---
 ### Description
 
+DialoGPT trained on Russian and fine-tuned on my Telegram chat.
 
-### Inference
 
-```python
+This model was created by [sberbank-ai](https://hf.co/sberbank-ai) and trained on Russian forums (see [Grossmend's model](https://hf.co/Grossmend/rudialogpt3_medium_based_on_gpt2)). Details on how it was trained are available on [Habr](https://habr.com/ru/company/icl_services/blog/548244/) (in Russian). I built a **simple pipeline** and **fine-tuned** that model on my own **exported Telegram chat** (~30 MB of JSON). Exporting the data from Telegram and fine-tuning a model is in fact very easy, so I made a **Colab tutorial** for it: link
+
 
+### How to use
+
+```python
 def get_length_param(text: str, tokenizer) -> str:
     tokens_count = len(tokenizer.encode(text))
     if tokens_count <= 15: