---
license: gpl
model_name: GPT2
model_type: GPT2
language: en
pipeline_tag: text-generation
tags:
- pytorch
- gpt
- gpt2
---


# Fine-tuning GPT-2 with energy and medical datasets

Fine-tuning pre-trained language models for text generation.

A GPT-2 model pretrained on Chinese text, using a language modeling head (causal language modeling objective).

## Model description

This model applies transfer learning from DavidLanz/uuu_fine_tune_taipower, which is then fine-tuned on a medical dataset. It uses the GPT-2 architecture.

### How to use

You can use this model directly with a pipeline for text generation. Since the generation relies on some randomness, we
set a seed for reproducibility:

```python
>>> from transformers import GPT2LMHeadModel, BertTokenizer, TextGenerationPipeline, set_seed

>>> set_seed(42)  # any fixed value works; 42 is an arbitrary choice
>>> model_path = "DavidLanz/uuu_fine_tune_gpt2"
>>> model = GPT2LMHeadModel.from_pretrained(model_path)
>>> tokenizer = BertTokenizer.from_pretrained(model_path)

>>> max_length = 200
>>> prompt = "歐洲能源政策"  # "European energy policy"
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> text_generated = text_generator(prompt, max_length=max_length, do_sample=True)
>>> # The BERT tokenizer separates tokens with spaces; strip them for Chinese output
>>> print(text_generated[0]["generated_text"].replace(" ",""))
```

```python
>>> from transformers import GPT2LMHeadModel, BertTokenizer, TextGenerationPipeline, set_seed

>>> set_seed(42)  # any fixed value works; 42 is an arbitrary choice
>>> model_path = "DavidLanz/uuu_fine_tune_gpt2"
>>> model = GPT2LMHeadModel.from_pretrained(model_path)
>>> tokenizer = BertTokenizer.from_pretrained(model_path)

>>> max_length = 200
>>> prompt = "蕁麻疹過敏"  # "urticaria (hives) allergy", a medical-domain prompt
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> text_generated = text_generator(prompt, max_length=max_length, do_sample=True)
>>> # The BERT tokenizer separates tokens with spaces; strip them for Chinese output
>>> print(text_generated[0]["generated_text"].replace(" ",""))
```
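
The `.replace(" ","")` call in the examples above strips every space, so any Latin-script words in the generated text get merged as well. If that matters for your prompts, a gentler cleanup is possible. The following is a minimal sketch, assuming you only want to drop spaces that touch a CJK character; `strip_cjk_spaces` is a hypothetical helper written for illustration and is not part of `transformers`, and the character ranges are an assumption covering common CJK ideographs and punctuation.

```python
import re

# Hypothetical helper (not part of transformers).
# Assumed character ranges: CJK Unified Ideographs, CJK punctuation,
# and full-width forms. Extend these if your text needs other blocks.
_CJK = r"\u4e00-\u9fff\u3000-\u303f\uff00-\uffef"

# Match whitespace that directly follows or directly precedes a CJK character.
_SPACE_NEAR_CJK = re.compile(rf"(?<=[{_CJK}])\s+|\s+(?=[{_CJK}])")


def strip_cjk_spaces(text: str) -> str:
    """Remove spaces adjacent to CJK characters; keep spaces between other words."""
    return _SPACE_NEAR_CJK.sub("", text)


if __name__ == "__main__":
    print(strip_cjk_spaces("歐 洲 能 源 政 策"))       # 歐洲能源政策
    print(strip_cjk_spaces("歐 洲 energy policy"))  # 歐洲energy policy
```

To use it with the pipeline output, pass `text_generated[0]["generated_text"]` to `strip_cjk_spaces` instead of calling `.replace(" ","")`.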