---
license: other
---

![Aquila_logo](./log.jpeg)

<h4 align="center">
    <p>
        <b>English</b> |
        <a href="https://huggingface.co/BAAI/Aquila2-34B/blob/main/README_zh.md">简体中文</a>
    </p>
</h4>


We are open-sourcing the **Aquila2** series, which now includes the base language models **Aquila2-7B** and **Aquila2-34B**, the chat models **AquilaChat2-7B** and **AquilaChat2-34B**, and the long-text chat models **AquilaChat2-7B-16k** and **AquilaChat2-34B-16k**.

Additional details on the Aquila2 models will be presented in the official technical report. Please stay tuned to the official channels for updates.

## Updates 2024.6.6

We have updated the base language model **Aquila2-34B**. Compared with the previous version, it has the following advantages:

* The tokenizer has been replaced with one that achieves a higher compression ratio:

| Tokenizer        | Vocab size | Zh       | En       | Code     | Math     | Average  |
|------------------|------------|----------|----------|----------|----------|----------|
| Aquila2-original | 100k       | **4.70** | 4.42     | 3.20     | 3.77     | 4.02     |
| Qwen1.5          | 151k       | 4.27     | 4.51     | 3.62     | 3.35     | 3.94     |
| Llama3           | 128k       | 3.45     | **4.61** | 3.77     | **3.88** | 3.93     |
| Aquila2-new      | 143k       | 4.60     | **4.61** | **3.78** | **3.88** | **4.22** |

* The maximum sequence length supported by the model has increased from 2048 to 8192 tokens.
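
The Average column in the tokenizer table above is consistent with an unweighted mean of the four per-domain ratios. The following is a minimal sketch that reproduces it from the table's values. The names `ratios` and `average_ratio` and the rounding to two decimals are assumptions made for illustration; they are not part of the Aquila2 tooling.

```python
# Per-domain compression ratios copied from the tokenizer table above
# (order: Zh, En, Code, Math).
ratios = {
    "Aquila2-original": [4.70, 4.42, 3.20, 3.77],
    "Qwen1.5": [4.27, 4.51, 3.62, 3.35],
    "Llama3": [3.45, 4.61, 3.77, 3.88],
    "Aquila2-new": [4.60, 4.61, 3.78, 3.88],
}


def average_ratio(values):
    """Unweighted mean of the per-domain ratios, rounded to two decimals."""
    return round(sum(values) / len(values), 2)


averages = {name: average_ratio(vals) for name, vals in ratios.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```

Under this assumption, the new tokenizer's lead in the Average column comes mainly from matching the best En, Code, and Math ratios while staying close to the original tokenizer on Zh.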



## Quick Start: Aquila2-34B

### 1. Inference
Aquila2-34B is a base model, so it is best suited to text continuation rather than chat.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import BitsAndBytesConfig

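device = "cuda:0"

# Model name on the Hugging Face Hub
model_name = 'BAAI/Aquila2-34B'

# Optional 4-bit (NF4) quantization config; it only takes effect if passed to from_pretrained below
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    # quantization_config=quantization_config,  # Uncomment this line for 4-bit quantization
)

# Load the tokenizer from the same repository as the model
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

model.eval()

# Move the model to the GPU (with 4-bit quantization enabled, this call may be unnecessary)
model.to(device)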

# Example
text = "The meaning of life is"
tokens = tokenizer.encode_plus(text)['input_ids']
tokens = torch.tensor(tokens)[None,].to(device)

with torch.no_grad():
    # Greedy decoding; max_length counts the prompt tokens plus the generated tokens
    out = model.generate(tokens, do_sample=False, max_length=128, eos_token_id=tokenizer.eos_token_id)[0]
    out = tokenizer.decode(out.cpu().numpy().tolist())
    print(out)
```
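
In the call above, `max_length=128` bounds the total sequence length, so the prompt tokens count toward the limit. The sketch below shows one way to work out how many new tokens a prompt leaves room for within the 8192 maximum length mentioned in the updates. The helper name `new_token_budget` and its defaults are hypothetical; they are not part of `transformers` or the Aquila2 code.

```python
def new_token_budget(prompt_len: int, max_length: int = 128, context_window: int = 8192) -> int:
    """Return how many tokens generation may add to a prompt of prompt_len tokens.

    max_length mirrors the generate() argument of the same name (prompt plus
    new tokens); context_window is the model's maximum sequence length.
    """
    if prompt_len < 0:
        raise ValueError("prompt_len must be non-negative")
    # The effective cap is whichever limit is tighter.
    limit = min(max_length, context_window)
    # A prompt longer than the cap leaves no room for new tokens.
    return max(0, limit - prompt_len)


print(new_token_budget(5))                      # 123
print(new_token_budget(200))                    # 0
print(new_token_budget(8000, max_length=8192))  # 192
```

With the tensors from the inference example, `prompt_len` would be `tokens.shape[1]`.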


## License

The Aquila2 series of open-source models is licensed under the [BAAI Aquila Model License Agreement](https://huggingface.co/BAAI/Aquila2-34B/blob/main/BAAI-Aquila-Model-License%20-Agreement.pdf).

## Citation
If you find Aquila2 useful, feel free to cite the repo.

```bibtex
@misc{zhang2024aquila2technicalreport,
      title={Aquila2 Technical Report}, 
      author={Bo-Wen Zhang and Liangdong Wang and Jijie Li and Shuhao Gu and Xinya Wu and Zhengduo Zhang and Boyan Gao and Yulong Ao and Guang Liu},
      year={2024},
      eprint={2408.07410},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.07410}, 
}
```