cyente commited on
Commit
bc87297
·
verified ·
1 Parent(s): 7b148ce

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +5 -4
README.md CHANGED
@@ -20,11 +20,12 @@ tags:
20
 
21
  ## Introduction
22
 
23
- Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). For Qwen2.5-Coder, we release three base language models and instruction-tuned language models, 1.5, 7 and 32 (coming soon) billion parameters. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:
24
 
25
- - Significantly improvements in **code generation**, **code reasoning** and **code fixing**. Base on the strong Qwen2.5, we scale up the training tokens into 5.5 trillion including source code, text-code grounding, Synthetic data, etc.
26
  - A more comprehensive foundation for real-world applications such as **Code Agents**. Not only enhancing coding capabilities but also maintaining its strengths in mathematics and general competencies.
27
  - **Long-context Support** up to 128K tokens.
 
28
 
29
  **This repo contains the instruction-tuned 7B Qwen2.5-Coder model**, which has the following features:
30
  - Type: Causal Language Models
@@ -37,7 +38,7 @@ Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (
37
  - Context Length: Full 131,072 tokens
38
  - Please refer to [this section](#processing-long-texts) for detailed instructions on how to deploy Qwen2.5 for handling long texts.
39
 
40
- For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5-coder/), [GitHub](https://github.com/QwenLM/Qwen2.5-Coder), [Documentation](https://qwen.readthedocs.io/en/latest/), [Arxiv](https://arxiv.org/abs/2409.12186).
41
 
42
  ## Requirements
43
 
@@ -111,7 +112,7 @@ We advise adding the `rope_scaling` configuration only when processing long cont
111
 
112
  ## Evaluation & Performance
113
 
114
- Detailed evaluation results are reported in this [📑 blog](https://qwenlm.github.io/blog/qwen2.5-coder/).
115
 
116
  For requirements on GPU memory and the respective throughput, see results [here](https://qwen.readthedocs.io/en/latest/benchmark/speed_benchmark.html).
117
 
 
20
 
21
  ## Introduction
22
 
23
+ Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. All of these models follows the Apache License (except for the 3B); Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:
24
 
25
+ - Significantly improvements in **code generation**, **code reasoning** and **code fixing**. Base on the strong Qwen2.5, we scale up the training tokens into 5.5 trillion including source code, text-code grounding, Synthetic data, etc. Qwen2.5-Coder-32B has become the current state-of-the-art open-source coderLLM, with its coding abilities matching those of GPT-4o.
26
  - A more comprehensive foundation for real-world applications such as **Code Agents**. Not only enhancing coding capabilities but also maintaining its strengths in mathematics and general competencies.
27
  - **Long-context Support** up to 128K tokens.
28
+ -
29
 
30
  **This repo contains the instruction-tuned 7B Qwen2.5-Coder model**, which has the following features:
31
  - Type: Causal Language Models
 
38
  - Context Length: Full 131,072 tokens
39
  - Please refer to [this section](#processing-long-texts) for detailed instructions on how to deploy Qwen2.5 for handling long texts.
40
 
41
+ For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5-coder-family/), [GitHub](https://github.com/QwenLM/Qwen2.5-Coder), [Documentation](https://qwen.readthedocs.io/en/latest/), [Arxiv](https://arxiv.org/abs/2409.12186).
42
 
43
  ## Requirements
44
 
 
112
 
113
  ## Evaluation & Performance
114
 
115
+ Detailed evaluation results are reported in this [📑 blog](https://qwenlm.github.io/blog/qwen2.5-coder-family/).
116
 
117
  For requirements on GPU memory and the respective throughput, see results [here](https://qwen.readthedocs.io/en/latest/benchmark/speed_benchmark.html).
118