Update README.md
README.md CHANGED
````diff
@@ -4,14 +4,30 @@ tags:
 - custom_generate
 ---
 
+## Description
+Test repo to experiment with calling `generate` from the hub. It is a simplified implementation of greedy decoding.
+
 ## Base model:
 `Qwen/Qwen2.5-0.5B-Instruct`
 
-## 
-
+## Model compatibility
+Most models. More specifically, any `transformers` LLM/VLM trained for causal language modeling.
 
 ## Additional Arguments
 `left_padding` (`int`, *optional*): number of padding tokens to add before the provided input
 
 ## Output Type changes
 (none)
+
+## Example usage
+
+```py
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
+model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct", device_map="auto")
+
+inputs = tokenizer(["The quick brown"], return_tensors="pt").to(model.device)
+gen_out = model.generate(**inputs, left_padding=5, custom_generate="transformers-community/custom_generate_example", trust_remote_code=True)
+print(tokenizer.batch_decode(gen_out, skip_special_tokens=True))
+```
````
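For reference, repos loaded via `custom_generate` expose their decoding loop as a `generate` function in `custom_generate/generate.py`, which `transformers` fetches and runs when `trust_remote_code=True`. Below is a minimal sketch of what this repo's "simplified greedy decoding" with `left_padding` support might look like; the function body, its argument handling, and the `max_new_tokens` default are assumptions based on the README's description, not the repo's actual code.

```py
# custom_generate/generate.py (hypothetical sketch, not this repo's actual code)
import torch


def generate(model, input_ids, attention_mask=None, left_padding=None, max_new_tokens=20, **kwargs):
    """Simplified greedy decoding with optional left padding (assumed signature)."""
    if left_padding is not None:
        # Prepend `left_padding` pad tokens before the prompt, as described
        # under "Additional Arguments" in the README.
        pad_id = model.config.pad_token_id
        if pad_id is None:
            pad_id = model.config.eos_token_id
        batch_size = input_ids.shape[0]
        pad = torch.full(
            (batch_size, left_padding), pad_id,
            dtype=input_ids.dtype, device=input_ids.device,
        )
        input_ids = torch.cat([pad, input_ids], dim=1)
        if attention_mask is not None:
            # Mask out the padding positions so attention ignores them.
            mask_pad = torch.zeros(
                (batch_size, left_padding),
                dtype=attention_mask.dtype, device=attention_mask.device,
            )
            attention_mask = torch.cat([mask_pad, attention_mask], dim=1)

    # Greedy loop: append the argmax token at each step.
    for _ in range(max_new_tokens):
        logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_token], dim=1)
        if attention_mask is not None:
            attention_mask = torch.cat([attention_mask, torch.ones_like(next_token)], dim=1)
    return input_ids
```

Because `model.generate(..., left_padding=5, custom_generate=..., trust_remote_code=True)` forwards unknown keyword arguments to the hub implementation, a call like the README's example would reach this function with `left_padding=5`.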