Commit 2afc1bb by umarzein (1 parent: 0677bc7)

Update README.md

Files changed (1)
  1. README.md +7 -3
README.md CHANGED
@@ -7,7 +7,7 @@ language:
 
 This is [indonesian-nlp/gpt2-medium-indonesian](https://huggingface.co/indonesian-nlp/gpt2-medium-indonesian) finetuned
 on [databrick's dolly 15k dataset translated using m2m100_418](https://huggingface.co/datasets/umarzein/databricks-dolly-15k-en)
-over 1024 iterations, 3 epochs
+over 1024 steps, 3 epochs
 
 template: `<|konteks|>{konteks}<|instruksi|>{instruksi}<|jawaban|>{jawaban}`
 
@@ -29,7 +29,7 @@ tokenizer = GPT2Tokenizer.from_pretrained(config.base_model_name_or_path)
 
 model = PeftModel.from_pretrained(model, peft_model_path)
 
-batch = tokenizer(f"<|konteks|><|instruksi|>Apa itu internet?<|jawaban|>", return_tensors='pt')
+batch = tokenizer("<|konteks|><|instruksi|>Apa itu internet?<|jawaban|>", return_tensors='pt')
 
 output_tokens = model.generate(**batch, max_new_tokens=50, repetition_penalty=1.17)
 
@@ -60,12 +60,16 @@ tokenizer = GPT2Tokenizer.from_pretrained(config.base_model_name_or_path)
 
 model = PeftModel.from_pretrained(model, peft_model_path)
 
-batch = tokenizer(f"<|konteks|><|instruksi|>Apa itu internet?<|jawaban|>", return_tensors='pt')
+batch = tokenizer("<|konteks|><|instruksi|>Apa itu internet?<|jawaban|>", return_tensors='pt')
 
 with torch.cuda.amp.autocast():
     output_tokens = model.generate(**batch, max_new_tokens=50, repetition_penalty=1.17)
 
 print(tokenizer.decode(output_tokens[0], skip_special_tokens=True))
+
+# <|konteks|><|instruksi|>Apa itu internet?<|jawaban|>Internet adalah jaringan global yang menghubungkan komputer
+# di seluruh dunia. Internet terdiri dari jutaan server dan ribuan perangkat lunak, termasuk sistem operasi, aplikasi
+# web, browser, email, dll.<|jawaban|>Internet adalah jaringan global yang menghubungkan komputer di seluruh dunia. Internet
 ```
 
 ### Some Results
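
Review note: the sample output added in this commit shows the decoded text still containing the template markers, and generation running on past the first answer into a second `<|jawaban|>` segment. A minimal sketch of how a caller might build the prompt from the README template and keep only the first answer is given here. The helper names `build_prompt` and `first_answer` are hypothetical and not part of the README; only the marker strings and the example question come from the diff, and the shortened `decoded` string stands in for real model output.

```python
# Marker strings are taken from the README template.
# The helper names below are hypothetical, chosen for illustration only.
KONTEKS = "<|konteks|>"
INSTRUKSI = "<|instruksi|>"
JAWABAN = "<|jawaban|>"


def build_prompt(instruksi: str, konteks: str = "") -> str:
    """Fill the template up to the answer marker, leaving the answer to the model."""
    return f"{KONTEKS}{konteks}{INSTRUKSI}{instruksi}{JAWABAN}"


def first_answer(decoded: str) -> str:
    """Return the text between the first answer marker and the next one, if any."""
    _, _, after_marker = decoded.partition(JAWABAN)
    answer, _, _ = after_marker.partition(JAWABAN)
    return answer.strip()


prompt = build_prompt("Apa itu internet?")

# Stand-in for tokenizer.decode(...) output: the prompt followed by an answer
# and the start of a repeated answer segment, as in the sample output.
decoded = prompt + "Internet adalah jaringan global.<|jawaban|>Internet adalah"
print(first_answer(decoded))
```

Splitting on the marker string works here only because the markers appear as plain text in the decoded output; if they were registered as special tokens, `skip_special_tokens=True` would strip them and a different stopping rule would be needed.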