---
license: apache-2.0
tags:
- multilingual
- PyTorch
- Transformers
- gpt3
- gpt2
- Deepspeed
- Megatron
datasets:
- mc4
- Wikipedia
pipeline_tag: text-generation
widget:
- text: 'I know you''re tired, but can we go for another walk this evening? peter szemraj: '
  example_title: walk
- text: 'What do you call an alligator who''s just had surgery to remove his left arm? peter szemraj: '
  example_title: alligator
- text: 'If you could live anywhere, where would it be? peter szemraj: '
  example_title: dream living place
- text: 'What really makes you angry? peter szemraj: '
  example_title: pet peeve
- text: 'My friend says that she knows every language, but she doesn''t speak any of them.. what''s wrong with her? peter szemraj: '
  example_title: language
- text: 'What would you change about yourself if you could? peter szemraj: '
  example_title: change
- text: 'My first is in Asia, my second is in Europe, my third is in North America, and my fourth is in South America. What am I? peter szemraj: '
  example_title: continent
- text: 'Can you take me for dinner somewhere nice this time? peter szemraj: '
  example_title: dinner
- text: 'Honey, I have clogged the toilet for the third time this month.. sorry.. peter szemraj: '
  example_title: overflow
- text: 'A man pushes his car to a hotel and tells the owner he''s bankrupt. Why? peter szemraj: '
  example_title: brain teaser
inference:
  parameters:
    min_length: 2
    max_length: 64
    length_penalty: 0.4
    no_repeat_ngram_size: 3
    do_sample: true
    top_p: 0.95
    top_k: 30
    temperature: 0.65
    repetition_penalty: 3.5
base_model: sberbank-ai/mGPT
---

# mGPT: fine-tune on message data MWE

This model is a fine-tuned version of [sberbank-ai/mGPT](https://huggingface.co/sberbank-ai/mGPT) on 80k messages. It was trained for one epoch; an updated version will be published in a separate model repo later.

## Model description

- testing whether fine-tuned personality data bleeds over into other languages the model was not explicitly trained on

### Usage in Python

Install the transformers library if you don't have it:

```
pip install -U transformers
```

Load the model into a pipeline object:

```
from transformers import pipeline
import torch

# run on GPU if one is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
my_chatbot = pipeline('text-generation',
                      'pszemraj/mGPT-Peter-mwe',
                      device=0 if device == 'cuda' else -1,
                      )
```

A full end-to-end generation example (prompt format plus the decoding parameters from the card header) is given at the end of this card.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1

A hedged sketch of how these values map onto `TrainingArguments` is given after the framework versions below.

### Framework versions

- Transformers 4.18.0
- Pytorch 1.11.0+cu113
- Datasets 2.1.0
- Tokenizers 0.12.1
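
### Hyperparameter sketch

The values above are reported settings only; the training script itself is not part of this card, and the tags suggest DeepSpeed/Megatron tooling may have been involved. As a rough, hedged sketch, here is how the reported hyperparameters would map onto `transformers.TrainingArguments` if one were to reproduce the run with the Trainer API (an assumption, not the documented setup):

```
from transformers import TrainingArguments

# Hedged sketch only: maps the reported hyperparameters onto TrainingArguments.
# The original training stack is not documented here, so treat this as illustrative.
training_args = TrainingArguments(
    output_dir="mGPT-Peter-mwe",       # hypothetical output directory
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=8,     # 4 per device x 8 accumulation -> effective batch size 32
    lr_scheduler_type="cosine_with_restarts",
    warmup_ratio=0.05,
    num_train_epochs=1,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the TrainingArguments defaults
)
```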
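
## Inference example

A minimal end-to-end sketch tying the usage section together: it loads the checkpoint named above, formats a prompt the way the widget examples do (question text followed by `peter szemraj: `), and passes the decoding parameters declared in the card header. Treat the decoding values as a starting point rather than a fixed recipe.

```
from transformers import pipeline
import torch

# load the fine-tuned checkpoint, on GPU if one is available
my_chatbot = pipeline(
    'text-generation',
    'pszemraj/mGPT-Peter-mwe',
    device=0 if torch.cuda.is_available() else -1,
)

# prompt format mirrors the widget examples: question + ' peter szemraj: '
prompt = "If you could live anywhere, where would it be? peter szemraj: "

# decoding parameters mirror the inference settings in the card header
result = my_chatbot(
    prompt,
    min_length=2,
    max_length=64,
    length_penalty=0.4,
    no_repeat_ngram_size=3,
    do_sample=True,
    top_p=0.95,
    top_k=30,
    temperature=0.65,
    repetition_penalty=3.5,
)
print(result[0]['generated_text'])
```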