---
license: cc-by-4.0
datasets:
- zirui3/TSSB-3M-instructions
- conceptofmind/FLAN_2022
- zirui3/zhihu_qa
- zirui3/cMedQA2-instructions
tags:
- code
---


# summary

This model is bigcode/starcoder fine-tuned on a code-generation dataset and on natural-language instruction datasets (Chinese and English).

# dataset
* codegen-instruct
* [zirui3/TSSB-3M-instructions](https://huggingface.co/datasets/zirui3/TSSB-3M-instructions) (Python code bug fixes)
* FLAN (English)
* [OIG](https://huggingface.co/datasets/laion/OIG) (Open-Assistant, English)
* [zirui3/zhihu_qa](https://huggingface.co/datasets/zirui3/zhihu_qa) (Chinese)
* [COIG](https://huggingface.co/datasets/BAAI/COIG) (Chinese)
* pCLUE (Chinese)
* [zirui3/cMedQA2-instructions](https://huggingface.co/datasets/zirui3/cMedQA2-instructions) (Chinese medical domain)
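
To inspect the training data, the datasets named in this card's metadata can be streamed from the Hugging Face Hub. The snippet below is a minimal sketch, not the author's training pipeline. The helper names `hub_url` and `peek` are hypothetical, and the `"train"` split is an assumption, since this card does not document splits or column layouts. `peek` needs the `datasets` package and network access.

```python
# Hub IDs copied from the `datasets:` metadata of this model card.
DATASET_IDS = [
    "zirui3/TSSB-3M-instructions",
    "conceptofmind/FLAN_2022",
    "zirui3/zhihu_qa",
    "zirui3/cMedQA2-instructions",
]


def hub_url(dataset_id: str) -> str:
    """Return the Hugging Face Hub page for a dataset ID (hypothetical helper)."""
    return f"https://huggingface.co/datasets/{dataset_id}"


def peek(dataset_id: str, n: int = 3):
    """Stream the first n records without downloading the whole dataset.

    Assumption: the dataset exposes a "train" split; the card does not say so.
    """
    # Imported lazily so the rest of the module works without `datasets` installed.
    from datasets import load_dataset

    stream = load_dataset(dataset_id, split="train", streaming=True)
    return list(stream.take(n))


if __name__ == "__main__":
    for dataset_id in DATASET_IDS:
        print(hub_url(dataset_id))
```

Calling `peek("zirui3/zhihu_qa")` would return the first few records as dictionaries if the split assumption holds. Check each dataset's page for its actual splits and fields before relying on it.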