---
license: mit
datasets:
- databricks/databricks-dolly-15k
language:
- en
pipeline_tag: text-generation
tags:
- dolly
- dolly-v2
- instruct
- sharded
---

# dolly-v2-7b: sharded checkpoint

This is a sharded checkpoint (with ~2GB shards) of the `databricks/dolly-v2-7b` model. Refer to the [original model](https://huggingface.co/databricks/dolly-v2-7b) for all details.

## Basic Usage

Install `transformers`, `accelerate`, and `bitsandbytes`:

```bash
pip install -U -q transformers bitsandbytes accelerate
```

Load the model in 8-bit, then [run inference](https://huggingface.co/docs/transformers/generation_strategies#contrastive-search):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "ethzanalytics/dolly-v2-7b-sharded"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# 8-bit loading requires bitsandbytes; device_map="auto" dispatches the
# shards onto the available GPU(s) as they are loaded.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    device_map="auto",
)
```
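
For inference, the prompt should follow the instruction-following template Dolly v2 was trained on (the intro / `### Instruction:` / `### Response:` format from the original Databricks model card). Below is a minimal sketch; `build_prompt` and `generate_response` are illustrative helper names, not part of this repository, and the generation parameters are example values for contrastive search:

```python
def build_prompt(instruction: str) -> str:
    """Wrap a plain instruction in Dolly v2's instruction-following template."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )


def generate_response(model, tokenizer, instruction: str, max_new_tokens: int = 128) -> str:
    """Generate a reply with contrastive search (penalty_alpha + top_k)."""
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        penalty_alpha=0.6,   # degeneration penalty enabling contrastive search
        top_k=4,             # candidate pool size for contrastive search
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

After the 8-bit load above, call it as `print(generate_response(model, tokenizer, "Explain what a sharded checkpoint is."))`.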