---
language:
- en
- fr
- ro
- de
datasets:
- c4
tags:
- text2text-generation
- endpoints-template
license: apache-2.0
---
# Fork of [t5-11b](https://huggingface.co/t5-11b)
> This is a fork of [t5-11b](https://huggingface.co/t5-11b) that implements a custom `handler.py` as an example of how to use `t5-11b` with [inference-endpoints](https://hf.co/inference-endpoints) on a single NVIDIA T4.
---
# Model Card for T5 11B - fp16
![model image](https://camo.githubusercontent.com/623b4dea0b653f2ad3f36c71ebfe749a677ac0a1/68747470733a2f2f6d69726f2e6d656469756d2e636f6d2f6d61782f343030362f312a44304a31674e51663876727255704b657944387750412e706e67)
# Use with Inference Endpoints
Hugging Face Inference Endpoints can be used with an HTTP client in any language. We will use Python and the `requests` library to send our requests (make sure you have it installed: `pip install requests`).
![result](inference.png)
## Send requests with Python
```python
import requests as r

ENDPOINT_URL = ""  # URL of your endpoint
HF_TOKEN = ""  # your Hugging Face access token

# payload samples
regular_payload = {"inputs": "translate English to German: The weather is nice today."}
parameter_payload = {
    "inputs": "translate English to German: Hello my name is Philipp and I am a Technical Leader at Hugging Face",
    "parameters": {
        "max_length": 40,
    },
}

# HTTP headers for authorization
headers = {
    "Authorization": f"Bearer {HF_TOKEN}",
    "Content-Type": "application/json",
}

# send request with the payload that also sets generation parameters
response = r.post(ENDPOINT_URL, headers=headers, json=parameter_payload)
generated_text = response.json()

print(generated_text)
```
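
If you send more than a handful of requests, it can help to wrap the call in a small helper. The following is a minimal sketch, not part of this repository: `build_payload` and `query_endpoint` are hypothetical names, the 60-second timeout is an arbitrary assumption, and the `post` argument exists only so the helper can be exercised offline with a stub. It reuses the same `inputs` / `parameters` payload shape and `Bearer` header shown above, and raises on HTTP errors instead of trying to parse an error body as a result.

```python
def build_payload(text, **parameters):
    """Build the request body: 'inputs' plus an optional 'parameters' dict."""
    payload = {"inputs": text}
    if parameters:
        payload["parameters"] = parameters
    return payload


def query_endpoint(url, token, payload, timeout=60, post=None):
    """POST a payload to an endpoint URL and return the decoded JSON body.

    `post` is a hypothetical injection point; by default `requests.post` is used.
    """
    if post is None:
        # Imported lazily so the helper can be tested with a stub and no network.
        import requests
        post = requests.post

    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    response = post(url, headers=headers, json=payload, timeout=timeout)
    # Raise an exception on 4xx/5xx responses rather than returning an error body.
    response.raise_for_status()
    return response.json()
```

With `ENDPOINT_URL` and `HF_TOKEN` set as above, a call would look like `query_endpoint(ENDPOINT_URL, HF_TOKEN, build_payload("translate English to German: The weather is nice today.", max_length=40))`.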