zpn lucianosb committed
Commit 34f9f4a
1 Parent(s): 40d37fb

Add instructions for inference (#1)


- Add instructions for inference (f90087607f1617a077af3f2d5ada2f8e7839be99)


Co-authored-by: Luciano Santa Brígida <lucianosb@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +22 -1
README.md CHANGED
@@ -31,11 +31,32 @@ To download a model with a specific revision run
  ```python
  from transformers import AutoModelForCausalLM

- model = AutoModelForCausalLM.from_pretrained("nomic-ai/gpt4all-falcon")
+ model = AutoModelForCausalLM.from_pretrained("nomic-ai/gpt4all-falcon", trust_remote_code=True)
  ```

  Downloading without specifying `revision` defaults to `main`/`v1.0`.

+ To use it for inference with CUDA, run
+
+ ```python
+ from transformers import AutoTokenizer, pipeline
+ import transformers
+ import torch
+
+ tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
+ model.to("cuda:0")
+
+ prompt = "Describe a painting of a falcon in a very detailed way." # Change this to your prompt
+ prompt_template = f"### Instruction: {prompt}\n### Response:"
+
+ tokens = tokenizer(prompt_template, return_tensors="pt").input_ids.to("cuda:0")
+ output = model.generate(input_ids=tokens, max_new_tokens=256, do_sample=True, temperature=0.8)
+
+ # Print the generated text
+ print(tokenizer.decode(output[0]))
+ ```
+
+
  ### Model Sources [optional]

  <!-- Provide the basic links for the model. -->
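For context on the hunk header's "To download a model with a specific revision run": `revision` is a standard `from_pretrained` keyword, so pinning a specific revision would look roughly like the sketch below. The `"v1.0"` tag name is an assumption taken from the README's `main`/`v1.0` note.

```python
from transformers import AutoModelForCausalLM

# Pin a specific revision (branch, tag, or commit hash).
# "v1.0" is the tag the README says `main` defaults to (assumption).
model = AutoModelForCausalLM.from_pretrained(
    "nomic-ai/gpt4all-falcon",
    revision="v1.0",
    trust_remote_code=True,
)
```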
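Note that, as committed, the new inference snippet is not self-contained: `model_path` is never defined, `model` comes from the earlier download cell, and the `pipeline`/`transformers` imports go unused. A minimal runnable version, assuming the same `nomic-ai/gpt4all-falcon` checkpoint and a single CUDA device, might look like:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "nomic-ai/gpt4all-falcon"  # assumption: the checkpoint from the download step
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load tokenizer and model; trust_remote_code is needed for this repo's custom model code.
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
model.to(device)

prompt = "Describe a painting of a falcon in a very detailed way."  # Change this to your prompt
prompt_template = f"### Instruction: {prompt}\n### Response:"

# Tokenize the prompt and sample up to 256 new tokens.
tokens = tokenizer(prompt_template, return_tensors="pt").input_ids.to(device)
output = model.generate(input_ids=tokens, max_new_tokens=256, do_sample=True, temperature=0.8)

# Print the generated text
print(tokenizer.decode(output[0]))
```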