---
license: apache-2.0
language:
- ru
- en
- de
- es
- it
- ja
- vi
- zh
- fr
- pt
- id
- ko
pipeline_tag: text-generation
---
# 🌍 Vulture-180B
***Vulture-180B*** is a further fine-tuned causal decoder-only LLM built by Virtual Interactive (VILM) on top of the famous **Falcon-180B** by [TII](https://www.tii.ae). We collected a new dataset from news articles and Wikipedia pages in **12 languages** (total: **80GB**) and continued the pretraining of Falcon-180B on it. Finally, we constructed a multilingual instruction dataset following **Alpaca**'s techniques.

While ***Vulture-180B*** is an adapter freely usable under **APACHE-2.0**, **Falcon-180B** itself remains available only under the **[Falcon-180B TII License](https://huggingface.co/spaces/tiiuae/falcon-180b-license/blob/main/LICENSE.txt) and [Acceptable Use Policy](https://huggingface.co/spaces/tiiuae/falcon-180b-license/blob/main/ACCEPTABLE_USE_POLICY.txt)**. Users should ensure any commercial applications based on ***Vulture-180B*** comply with the restrictions on **Falcon-180B**'s use.

*Technical Report coming soon* 🤗

## Prompt Format

The recommended prompt format is:

```
A chat between a curious user and an artificial intelligence assistant.

USER:{user's question}<|endoftext|>ASSISTANT:
```
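
The template above can be filled in programmatically. The following is a minimal sketch, not part of the released code: the helper name `build_prompt` and the constants are hypothetical, and it only assumes that the prompt is the system line, a blank line, and then the `USER:`/`ASSISTANT:` markers exactly as shown in the template.

```python
# Hypothetical helper: assembles a prompt in the format shown in the template.
SYSTEM_LINE = "A chat between a curious user and an artificial intelligence assistant."
END_OF_TURN = "<|endoftext|>"  # token that closes the user turn in the template


def build_prompt(question: str) -> str:
    """Return a single-turn prompt that ends right where the model should answer."""
    return f"{SYSTEM_LINE}\n\nUSER:{question}{END_OF_TURN}ASSISTANT:"


if __name__ == "__main__":
    print(build_prompt("Where is Ho Chi Minh City located?"))
```

Keeping the template in one function avoids small formatting drifts, such as a stray space after `USER:`, between calls.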

# Model Details
## Model Description
- **Developed by:** [https://www.tii.ae](https://www.tii.ae)
- **Finetuned by:** [Virtual Interactive](https://vilm.org)
- **Language(s) (NLP):** English, German, Spanish, French, Portuguese, Russian, Italian, Vietnamese, Indonesian, Chinese, Japanese and Korean
- **Training Time:** 3,000 A100 Hours

## Acknowledgement
- Thanks to **TII** for the amazing **Falcon** as the foundation model.
- Big thanks to **Google** for their generous Cloud credits.

## Out-of-Scope Use

Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful. 

## Bias, Risks, and Limitations

Vulture-180B is trained on large-scale corpora representative of the web, so it will carry the stereotypes and biases commonly encountered online.

## Recommendations

We recommend that users of Vulture-180B consider finetuning it for their specific tasks of interest, and that guardrails and appropriate precautions be put in place for any production use.

## How to Get Started with the Model

To run inference with the model in full `bfloat16` precision you need approximately 8xA100 80GB or equivalent.
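
As a rough sanity check on that hardware figure, the weights-only memory footprint can be estimated with simple arithmetic. This is a back-of-the-envelope sketch: the parameter count is an assumption taken from the "180B" in the model name, and the estimate ignores activations, the KV cache, and the adapter weights.

```python
# Assumption: ~180 billion parameters, read off the "180B" in the model name.
PARAMS = 180e9
BYTES_PER_PARAM = 2   # bfloat16 stores each value in 16 bits
GPU_COUNT = 8
GPU_MEMORY_GB = 80    # A100 80GB

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9   # memory for the weights alone
total_gpu_gb = GPU_COUNT * GPU_MEMORY_GB      # combined memory of the GPUs
headroom_gb = total_gpu_gb - weights_gb       # left for activations, KV cache, etc.

print(f"weights: {weights_gb:.0f} GB, available: {total_gpu_gb} GB, headroom: {headroom_gb:.0f} GB")
```

Under these assumptions the weights alone do not fit on four 80GB cards, which is consistent with the eight-GPU recommendation.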

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_name = "tiiuae/falcon-180b"
adapters_name = "vilm/vulture-180b"

# Load the Falcon-180B base model in bfloat16, sharded across the available GPUs.
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
m = AutoModelForCausalLM.from_pretrained(base_model_name, torch_dtype=torch.bfloat16, device_map="auto")
# Attach the Vulture-180B adapter on top of the base model.
m = PeftModel.from_pretrained(m, adapters_name)

prompt = "A chat between a curious user and an artificial intelligence assistant.\n\nUSER:Thành phố Hồ Chí Minh nằm ở đâu?<|endoftext|>ASSISTANT:"

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Sample up to 50 new tokens with nucleus sampling.
output = m.generate(input_ids=inputs["input_ids"],
                    attention_mask=inputs["attention_mask"],
                    do_sample=True,
                    temperature=0.6,
                    top_p=0.9,
                    max_new_tokens=50)
output = output[0].to("cpu")
# The decoded text contains the prompt followed by the generated reply.
print(tokenizer.decode(output))
```
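
The decoded output of the snippet above contains the prompt as well as the reply. One possible way to keep only the assistant's answer is sketched below. The helper `extract_reply` is hypothetical and works purely on strings: it assumes the decoded text still contains the `ASSISTANT:` marker and that the reply, if the model finished it, ends with `<|endoftext|>`.

```python
# Hypothetical post-processing helper; relies only on the prompt format markers.
ASSISTANT_TAG = "ASSISTANT:"
END_OF_TURN = "<|endoftext|>"


def extract_reply(decoded: str) -> str:
    """Return the text after the last ASSISTANT: marker, cut at the end-of-text token."""
    # Keep only what follows the last "ASSISTANT:" marker.
    _, _, tail = decoded.rpartition(ASSISTANT_TAG)
    # Drop the end-of-text token and anything generated after it.
    reply, _, _ = tail.partition(END_OF_TURN)
    return reply.strip()


if __name__ == "__main__":
    sample = (
        "A chat between a curious user and an artificial intelligence assistant.\n\n"
        "USER:Hi<|endoftext|>ASSISTANT: Hello there<|endoftext|>"
    )
    print(extract_reply(sample))
```

If the reply was cut off by `max_new_tokens` before an end-of-text token appeared, the helper simply returns everything after the marker.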