rhysjones committed · Commit dab5ba3 · 1 Parent(s): e9442f9

Update README.md

Files changed (1)
  1. README.md +27 -1
README.md CHANGED
@@ -15,7 +15,7 @@ _This is a Welsh version of the [ALMA](https://github.com/fe1ixxu/ALMA) LLM-base
  The LLM is based on Llama-2-13B, with continued training on Welsh data from [OSCAR-2301](https://huggingface.co/datasets/oscar-corpus/OSCAR-2301) for 3 epochs,
  followed by further fine-tuning on the Cofnod y Cynulliad (Assembly Record) data provided by [TechIaith](https://huggingface.co/datasets/techiaith/cofnodycynulliad_en-cy).
 
- A version compressed to 4.0bpw, so that it loads in 10GB of GPU memory, is available [here](https://huggingface.co/BangorAI/ALMA-Cymraeg-13B-0.1-4.0bpw-exl2).
 
  ### Chat Format
 
@@ -34,6 +34,32 @@ Cyfieithwch y testun Saesneg canlynol i'r Gymraeg.
  #### Example
 
  ```python
  ```
 
  ## Copyright
 
  The LLM is based on Llama-2-13B, with continued training on Welsh data from [OSCAR-2301](https://huggingface.co/datasets/oscar-corpus/OSCAR-2301) for 3 epochs,
  followed by further fine-tuning on the Cofnod y Cynulliad (Assembly Record) data provided by [TechIaith](https://huggingface.co/datasets/techiaith/cofnodycynulliad_en-cy).
 
+ A faster version compressed to 4.0bpw, so that it loads in 10GB of GPU memory, is available [here](https://huggingface.co/BangorAI/ALMA-Cymraeg-13B-0.1-4.0bpw-exl2).
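
As a rough check on the 10GB figure, the weight storage of a model can be estimated from its parameter count and bits per weight. The sketch below is not part of the README; it assumes a nominal 13e9 parameters (taken from the "13B" in the model name), and `weight_memory_gb` is a hypothetical helper. It ignores the KV cache, activations, and any quantisation metadata, so real usage will be higher.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: a nominal 13 billion parameters, as suggested by the "13B" name.
N_PARAMS = 13e9

for label, bpw in [("fp16", 16.0), ("8-bit", 8.0), ("4.0bpw", 4.0)]:
    print(f"{label}: {weight_memory_gb(N_PARAMS, bpw):.1f} GB")
# fp16: 26.0 GB
# 8-bit: 13.0 GB
# 4.0bpw: 6.5 GB
```

Under these assumptions the 4.0bpw weights take about 6.5 GB, which leaves some headroom on a 10GB card for the cache and activations.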
 
  ### Chat Format
 
  #### Example
 
  ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ device = "cuda"
+
+ # Load the local checkpoint with 8-bit weights (this path needs the bitsandbytes package).
+ model = AutoModelForCausalLM.from_pretrained("./trosi13b", torch_dtype=torch.float16, load_in_8bit=True)
+ tokenizer = AutoTokenizer.from_pretrained("./trosi13b")
+
+ # The first line is Welsh for "Translate the following English text into Welsh."
+ # "### Saesneg:" introduces the English source; "### Cymraeg:" is where the Welsh output begins.
+ prompt = """Cyfieithwch y testun Saesneg canlynol i'r Gymraeg.
+ ### Saesneg:
+ For the first time, GPs no longer have to physically print, sign and hand a green paper prescription form to the patient or wait for it to be taken to the pharmacy. Instead, the prescription is sent electronically from the surgery via the IT system to the patient’s chosen pharmacy - even without the patient needing to visit the surgery to pick up a repeat prescription form.
+
+ ### Cymraeg:
+ """
+
+ model_inputs = tokenizer([prompt], return_tensors="pt").to(device)
+
+ # Sample with a low temperature and a repetition penalty; stop at EOS or after 500 new tokens.
+ generated_ids = model.generate(**model_inputs,
+                                eos_token_id=tokenizer.eos_token_id,
+                                top_k=90,
+                                top_p=1.0,
+                                temperature=0.3,
+                                repetition_penalty=1.2,
+                                max_new_tokens=500,
+                                do_sample=True)
+ # The decoded string contains the prompt followed by the generated Welsh text.
+ print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])
+
  ```
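
Because `generate` on a causal LM returns the prompt tokens followed by the new tokens, the decoded string from the example includes the prompt itself. A minimal sketch of building the prompt and keeping only the translation follows. `PROMPT_TEMPLATE`, `build_prompt` and `extract_translation` are hypothetical helpers, not part of the model or of `transformers`, and the template layout is copied from the example above.

```python
# Prompt layout copied from the example; the helper names are illustrative only.
PROMPT_TEMPLATE = (
    "Cyfieithwch y testun Saesneg canlynol i'r Gymraeg.\n"
    "### Saesneg:\n"
    "{text}\n"
    "\n"
    "### Cymraeg:\n"
)
WELSH_MARKER = "### Cymraeg:"


def build_prompt(text: str) -> str:
    """Wrap an English passage in the translation prompt format."""
    return PROMPT_TEMPLATE.format(text=text.strip())


def extract_translation(decoded: str) -> str:
    """Return the text after the first Welsh marker, or the whole string if absent."""
    if WELSH_MARKER not in decoded:
        return decoded.strip()
    _, _, tail = decoded.partition(WELSH_MARKER)
    return tail.strip()
```

With the example above, this would be used as `extract_translation(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])`.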
 
  ## Copyright