---
license: apache-2.0
language:
- zh
- en
---
# Chinese-CodeLlama-7B-SFT-V2

This model was obtained by further supervised fine-tuning (SFT) of our [Chinese-CodeLlama-7B-SFT-V1](https://huggingface.co/frankminors123/Chinese-CodeLlama-7B-SFT-V1) on [7k+ Python code instructions](https://huggingface.co/datasets/frankminors123/Python-Code-Instructions-7k). Following [Code Llama](https://ai.meta.com/research/publications/code-llama-open-foundation-models-for-code/), we increased the base period of rotary positional embeddings (RoPE) from `10000` to `1000000`.
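
To illustrate what the larger base period changes, the sketch below computes the standard RoPE rotation frequencies `base^(-2i/d)` for both values and prints the longest wavelength each one yields. The head dimension of 128 is an assumption (typical for Llama-style 7B models) and the helper names are hypothetical; this is not the training code.

```python
import math


def rope_inv_freq(head_dim: int, base: float) -> list[float]:
    """Rotation frequency for each dimension pair: base^(-2i / head_dim)."""
    return [base ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def longest_wavelength(head_dim: int, base: float) -> float:
    """Positions needed for the slowest-rotating pair to complete one full turn."""
    return 2 * math.pi / rope_inv_freq(head_dim, base)[-1]


HEAD_DIM = 128  # assumption: 4096 hidden size / 32 attention heads

for base in (10_000.0, 1_000_000.0):
    wavelength = longest_wavelength(HEAD_DIM, base)
    print(f"base={base:>11,.0f}  longest wavelength ~ {wavelength:,.0f} positions")
```

A larger base slows the rotation of the low-frequency pairs, so distant positions remain distinguishable over a longer range.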

The Chinese prompt template used is as follows:
```python
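# English gloss (the model expects the Chinese text verbatim):
# "Below is an instruction that describes a task, paired with an input that
# provides further context. Give a response that satisfies the request as well
# as possible." Section headers: 指令 = instruction, 输入 = input, 回答 = response.
PROMPT_TEMPLATE = (
  "下面是描述一项任务的指令,并且与一则输入配对用来提供更多的上下文。请给出尽可能满足请求的回答.\n"
  "### 指令:\n{instruction}\n### 输入:\n{input}\n### 回答:\n"
)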
```
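
As a minimal sketch of how the template might be filled in before tokenization, the snippet below formats one instruction/input pair. The `build_prompt` helper and the sample strings are hypothetical, and passing an empty string when no input is available is an assumption, not something this card specifies.

```python
PROMPT_TEMPLATE = (
    "下面是描述一项任务的指令,并且与一则输入配对用来提供更多的上下文。请给出尽可能满足请求的回答.\n"
    "### 指令:\n{instruction}\n### 输入:\n{input}\n### 回答:\n"
)


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Fill the template; the model's answer is generated after the final header."""
    return PROMPT_TEMPLATE.format(instruction=instruction, input=input_text)


# Hypothetical request: "Write a function that returns the largest value in a list."
prompt = build_prompt("写一个函数,返回列表中的最大值。", "[3, 1, 4, 1, 5]")
print(prompt)
```

The resulting string ends with the `### 回答:` header, so the text the model generates is the answer itself.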

We used a sequence length of 1k for pre-training and kept the same length during fine-tuning. Thanks to the larger RoPE base period, the model can extrapolate to context lengths of up to 15k at inference time.