---
license: llama2
metrics:
- code_eval
library_name: transformers
tags:
- code
---


# Introducing Code Millenials 34B

Welcome to our Code Model repository! Our model is specifically fine-tuned for code generation tasks, aiming to revolutionize how systems understand and translate natural language instructions into code. Built on CodeLLaMa Python 34B, it has been meticulously fine-tuned on a curated set of code generation instructions, ensuring quality and precision.

### News 🔥🔥🔥

- [2024/01/03] We released **Code Millenials 34B**, which achieves **80.48 pass@1** on the [HumanEval Benchmark](https://github.com/openai/human-eval).
- [2024/01/02] We released **Code Millenials 13B**, which achieves **76.21 pass@1** on the [HumanEval Benchmark](https://github.com/openai/human-eval).


### HumanEval

<p align="center" width="100%">
<a ><img src="result.png" alt="CodeMillenials" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
</p>

For the Millenials models, the eval script in the GitHub repo was used to produce the results above.

Note: The HumanEval scores of the other models are taken from the official repos of [WizardCoder](https://github.com/nlpxucan/WizardLM), [DeepseekCoder](https://github.com/deepseek-ai/deepseek-coder), [Gemini](https://deepmind.google/technologies/gemini/#capabilities), etc.
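
pass@1 is the standard HumanEval metric: the fraction of problems for which a generated sample passes the unit tests. As a sketch, the unbiased pass@k estimator used by the human-eval repo (for n samples per task, c of them correct, and a budget of k) can be written as:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples generated, c correct, budget k."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)
```

With a single sample per task (n = 1, k = 1), this reduces to the plain pass/fail rate over the benchmark's problems.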


### Models

| Model | Checkpoint | HumanEval |
|---------|-------------|-----------|
| Code Millenials 34B | <a href="https://huggingface.co/budecosystem/code-millenials-34b" target="_blank">HF Link</a> | 80.48 |
| Code Millenials 13B | <a href="https://huggingface.co/budecosystem/code-millenials-13b" target="_blank">HF Link</a> | 76.21 |


### 🚀 Quick Start

Inference code using the pre-trained model from the Hugging Face model hub:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("budecosystem/code-millenials-34b")
model = AutoModelForCausalLM.from_pretrained("budecosystem/code-millenials-34b")

template = """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
### Instruction: {instruction} ### Response:"""

instruction = "<Your code instruction here>"

prompt = template.format(instruction=instruction)

inputs = tokenizer(prompt, return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
```
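
The decoded text echoes the entire prompt before the model's answer. A small helper (hypothetical, not part of this repo) can keep only the part after the `### Response:` marker:

```python
def extract_response(decoded: str, marker: str = "### Response:") -> str:
    """Drop the echoed prompt and return only the model's answer."""
    _, sep, tail = decoded.partition(marker)
    return tail.strip() if sep else decoded.strip()
```

For example, `extract_response(tokenizer.decode(sample[0]))` would return just the generated code.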


## Training details

The model was trained on 16 A100 80GB GPUs for approximately 50 hours.

| Hyperparameters | Value |
| :----------------------------| :-----: |
| per_device_train_batch_size | 16 |
| gradient_accumulation_steps | 1 |
| epoch | 3 |
| steps | 2157 |
| learning_rate | 2e-5 |
| lr scheduler type | cosine |
| warmup ratio | 0.1 |
| optimizer | adamw |
| fp16 | True |
| GPU | 16 A100 80GB |

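For context, the figures in the table imply a global batch size of per-device batch size × gradient accumulation steps × number of GPUs:

```python
per_device_train_batch_size = 16
gradient_accumulation_steps = 1
num_gpus = 16  # 16 x A100 80GB, per the table above

# Effective global batch size seen by the optimizer on each step.
global_batch_size = (per_device_train_batch_size
                     * gradient_accumulation_steps
                     * num_gpus)
print(global_batch_size)  # 256
```
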
### Important Note

- **Bias, Risks, and Limitations:** The model may sometimes make errors, produce misleading content, or struggle with tasks that are not related to coding.