# aiXcoder-7B Code Large Language Model
<p align="center">
🏠 <a href="https://www.aixcoder.com/" target="_blank">Official website</a>|🛠 <a href="https://marketplace.visualstudio.com/items?itemName=aixcoder-plugin.aixcoder" target="_blank">VS Code Plugin</a>|🛠 <a href="https://plugins.jetbrains.com/plugin/13574-aixcoder-code-completer" target="_blank">JetBrains Plugin</a>|<a href="https://github.com/aixcoder-plugin/aiXcoder-7B" target="_blank">GitHub Project</a>
</p>
Welcome to the official repository of the aiXcoder-7B Code Large Language Model. The model is designed to understand and generate code across multiple programming languages, offering state-of-the-art performance in code completion, comprehension, generation, and other programming-related tasks.
Table of Contents
1. [Model Introduction](#model-introduction)
2. [Quickstart](#quickstart)
    - [Environment Requirements](#environment-requirements)
    - [Model Weights](#model-weights)
    - [Inference Example](#inference-example)
3. [License](#license)
4. [Acknowledgments](#acknowledgments)
## Model Introduction
As the capabilities of large code models are gradually uncovered, aiXcoder has kept exploring how to make these models more useful in real development scenarios. To this end, we have open-sourced aiXcoder 7B Base, which was trained on 1.2T unique tokens; its pre-training tasks and contextual information were designed specifically for real-world code generation.
aiXcoder 7B Base is the most effective model in code completion scenarios among models of similar parameter size, and it also surpasses mainstream models such as CodeLlama 34B and StarCoder2 15B in average performance on the multilingual nl2code benchmark.
In our ongoing exploration to apply large code models, the release of aiXcoder 7B Base represents a significant milestone. The current version of aiXcoder 7B Base is a foundational model that focuses on improving the efficiency and accuracy of code completion and code generation tasks, aiming to provide robust support for developers in these scenarios. It is important to note that this version has not undergone specific instruct-tuning, which means it might not yet offer optimal performance for specialized higher-level tasks such as test case generation and code debugging.
However, we have plans for further development of the aiXcoder model series already in motion. In the near future, we aim to release new versions of the model that have been meticulously instruct-tuned for a wider range of programming tasks, including but not limited to test case generation and code debugging. Through these instruct-tuned models, we anticipate offering developers more comprehensive and deeper programming support, helping them to maximize efficiency at every stage of software development.
## Quickstart
### Environment Requirements
#### Option 1: Build Env
To run the model inference code, you'll need the following environment setup:
- Python 3.8 or higher
- PyTorch 2.1.0 or higher
- sentencepiece 0.2.0 or higher
- transformers 4.34.1 or higher (only if you run inference with the transformers library)
Please ensure all dependencies are installed using the following commands:
```bash
conda create -n aixcoder-7b python=3.11
conda activate aixcoder-7b
git clone git@github.com:aixcoder-plugin/aiXcoder-7b.git
cd aiXcoder-7b
pip install -r requirements.txt
```
`requirements.txt` lists all necessary libraries and their versions.
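As a quick sanity check after installation, the minimum versions listed above can be compared against what is actually installed. The following is a minimal sketch and not part of the repository: the helper names `parse_version` and `check_environment` are hypothetical, and it assumes the packages are installed under the distribution names `torch`, `sentencepiece`, and `transformers`.

```python
import re
import sys
from importlib.metadata import PackageNotFoundError, version

# Minimum versions taken from the requirements list above.
MINIMUMS = {
    "torch": (2, 1, 0),
    "sentencepiece": (0, 2, 0),
    "transformers": (4, 34, 1),
}


def parse_version(text):
    """Reduce a version string such as '2.1.0+cu118' to a tuple like (2, 1, 0)."""
    numbers = []
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def check_environment(minimums=MINIMUMS):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8 or higher is required")
    for name, minimum in minimums.items():
        try:
            installed = parse_version(version(name))
        except PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if installed < minimum:
            wanted = ".".join(map(str, minimum))
            problems.append(f"{name} is older than {wanted}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

If the script prints nothing, the installed packages meet the stated minimums. It does not verify that CUDA or a GPU is available.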
To achieve faster inference speeds, especially for large models, we recommend installing `flash attention`. `Flash attention` is an optimized attention mechanism that significantly reduces computation time for transformer-based models without sacrificing accuracy.
Before proceeding, ensure your environment meets the CUDA requirements as `flash attention` leverages GPU acceleration. Follow these steps to install `flash attention`:
```bash
git clone git@github.com:Dao-AILab/flash-attention.git
cd flash-attention
MAX_JOBS=8 python setup.py install
```
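After the build finishes, it can be useful to confirm that the package is importable and that PyTorch can see a GPU. The sketch below is an illustration under assumptions: the helper names are hypothetical, and it relies on the package being importable as `flash_attn`. It only reports whether the pieces are present; it does not confirm that the inference scripts actually use flash attention.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


def flash_attention_status():
    """Describe whether flash attention looks usable in this environment."""
    if not has_module("torch"):
        return "torch is not installed"
    import torch  # imported lazily so the check itself runs without PyTorch

    if not torch.cuda.is_available():
        return "no CUDA device is visible to PyTorch"
    if not has_module("flash_attn"):
        return "flash_attn is not installed"
    return "flash_attn is installed and a CUDA device is visible"


if __name__ == "__main__":
    print(flash_attention_status())
```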
#### Option 2: Docker
For a consistent and isolated environment, we recommend running the model inference code using Docker. Here's how to set up and use Docker for our model:
1. Install Docker: If you haven't already, install Docker on your machine.
2. Pull the Docker Image: Pull the Docker image from Docker Hub.
```bash
docker pull pytorch/pytorch:2.1.0-cuda11.8-cudnn8-devel
```
3. Run the Container: Once the image is pulled, you can run the model inside a Docker container.
```bash
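# On the host: start an interactive container with GPU access
docker run --gpus all -it -v /dev/shm:/dev/shm --name aix_instance pytorch/pytorch:2.1.0-cuda11.8-cudnn8-devel /bin/bash
# Inside the container: install the tokenizer dependency and fetch the code
pip install sentencepiece
git clone git@github.com:aixcoder-plugin/aiXcoder-7b.git
cd aiXcoder-7b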
```
The `docker run` command starts a container named `aix_instance` from the PyTorch image; the remaining commands run inside that container. You can interact with the model inside this container.
To achieve faster inference speeds, especially for large models, we recommend installing `flash attention`.
```bash
git clone git@github.com:Dao-AILab/flash-attention.git
cd flash-attention
MAX_JOBS=8 python setup.py install
```
4. Model Inference: Within the Docker container, you can run the model inference code as described in the Inference Example section.
Using Docker provides a clean, controlled environment that minimizes issues related to software versions and dependencies.
### Model Weights
You can download the model weights from the following links:
- [aiXcoder Base Download](https://huggingface.co/aiXcoder/aixcoder-7b-base)
- aiXcoder Instruct Download (Coming soon...)
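The base weights can also be fetched from a script. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function; the helper names and the target directory convention are hypothetical. Whether the downloaded directory layout is what `sess_megatron.py` expects for `--model_dir` is not confirmed here.

```python
from pathlib import Path

REPO_ID = "aiXcoder/aixcoder-7b-base"  # repository id from the download link above


def default_target_dir(root="."):
    """Hypothetical convention: store the weights under <root>/aixcoder-7b-base."""
    return Path(root) / REPO_ID.split("/")[-1]


def download_weights(root="."):
    """Download the repository snapshot and return the local directory path."""
    # Imported lazily; requires `pip install huggingface_hub`.
    from huggingface_hub import snapshot_download

    target = default_target_dir(root)
    return snapshot_download(repo_id=REPO_ID, local_dir=str(target))
```

Calling `download_weights("models")` would place the files under `models/aixcoder-7b-base`. The download is large, so the function is not invoked automatically.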
### Inference Example
#### Command Line Execution
For a quick start, you can run the model inference directly from the command line:
```bash
torchrun --nproc_per_node 1 sess_megatron.py --model_dir "path/to/model_weights_dir"
```
Replace `"path/to/model_weights_dir"` with the actual path to your downloaded model weights.
Alternatively, run inference with Hugging Face's transformers library:
```bash
python sess_huggingface.py
```
#### Python Script Execution
You can also invoke the model programmatically within your Python scripts. This method provides more flexibility for integrating the model into your applications or workflows. Here's a simple example of how to do it:
```python
from sess_megatron import TestInference

infer = TestInference()
res = infer.run_infer(
    # for FIM style input, code_string stands for prefix context
    # (the Chinese comment used as the prompt means "quicksort algorithm")
    code_string="""# 快速排序算法""",
    # for FIM style input, later_code stands for suffix context
    later_code="\n",
    # file_path should be a path from project to file
    file_path="test.py",
    # max num for generated tokens
    max_new_tokens=256,
)
print(res)

"""output:
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    less = [i for i in arr[1:] if i <= pivot]
    greater = [i for i in arr[1:] if i > pivot]
    return quick_sort(less) + [pivot] + quick_sort(greater)

# 测试
arr = [3, 2, 1, 4, 5]
print(quick_sort(arr))  # [1, 2, 3, 4, 5]
"""
```
The model can also be invoked through the Hugging Face transformers library:
```python
import sys

import torch
from hf_mini.utils import input_wrapper
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

tokenizer = AutoTokenizer.from_pretrained("aiXcoder/aixcoder-7b-base")
model = AutoModelForCausalLM.from_pretrained("aiXcoder/aixcoder-7b-base", torch_dtype=torch.bfloat16)

# The Chinese prompt means "quicksort algorithm"; "测试" in later_code means "test".
text = input_wrapper(
    # for FIM style input, code_string stands for prefix context
    code_string="# 快速排序算法",
    # for FIM style input, later_code stands for suffix context
    later_code="\n# 测试\narr = [3, 2, 1, 4, 5]\nprint(quick_sort(arr))  # [1, 2, 3, 4, 5]",
    # file_path should be a path from project to file
    path="test.py"
)

# Stop early if the wrapper produced no prompt text
if len(text) == 0:
    sys.exit()

inputs = tokenizer(text, return_tensors="pt", return_token_type_ids=False)
inputs = inputs.to(device)
model.to(device)

outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))

"""output:
def quick_sort(arr):
    # 如果数组长度小于等于1,直接返回
    if len(arr) <= 1:
        return arr
    # 选择数组的第一个元素作为基准
    pivot = arr[0]
    # 初始化左右指针
    left, right = 1, len(arr) - 1
    # 循环直到左指针小于右指针
    while left < right:
        # 从右到左找到第一个小于基准的元素,与左指针元素交换
        if arr[right] < pivot:
            arr[left], arr[right] = arr[right], arr[left]
            left += 1
        # 从左到右找到第一个大于等于基准的元素,与右指针元素交换
        if arr[left] >= pivot:
            right -= 1
    # 将基准元素与左指针元素交换
    arr[left], arr[0] = arr[0], arr[left]
    # 对左半部分进行递归排序
    quick_sort(arr[:left])
    # 对右半部分进行递归排序
    quick_sort(arr[left + 1:])
    return arr</s>
"""
```
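Both examples pass the code before the cursor as the prefix (`code_string`) and the code after it as the suffix (`later_code`). The sketch below shows one way an editor integration might derive those two strings from a file and a cursor offset, and then place a completion back between them. The helper names are hypothetical and not part of the repository, and it assumes the returned text contains only the completion, optionally followed by the `</s>` end marker seen in the output above.

```python
def split_at_cursor(source, cursor):
    """Split file text into (prefix, suffix) around a character offset."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor must lie within the source text")
    return source[:cursor], source[cursor:]


def stitch(prefix, completion, suffix, eos="</s>"):
    """Insert a completion between prefix and suffix, dropping a trailing end marker."""
    if eos and completion.endswith(eos):
        completion = completion[: -len(eos)]
    return prefix + completion + suffix


if __name__ == "__main__":
    source = "# quicksort\n\n# test\nprint(quick_sort([3, 2, 1]))\n"
    cursor = len("# quicksort\n")
    code_string, later_code = split_at_cursor(source, cursor)
    # A stand-in completion; a real one would come from the model.
    completion = "def quick_sort(arr):\n    return sorted(arr)\n</s>"
    print(stitch(code_string, completion, later_code))
```

Here `code_string` and `later_code` are the values that would be handed to `run_infer` or `input_wrapper`, and `stitch` rebuilds the file once a completion is available.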
## License
The model weights are licensed under the [Model License](./MODEL_LICENSE) for academic research use; for commercial use, please apply by sending an email to support@aiXcoder.com.
## Acknowledgments
We would like to thank all contributors to the open-source projects and datasets that made this work possible.
Thank you for your interest in our Code Large Language Model. We look forward to your contributions and feedback!