---
license: apache-2.0
datasets:
- jtatman/python-code-dataset-500k
- jtatman/python-github-code-instruct-filtered-5k
- jtatman/pile_python_instruct_format
library_name: transformers
tags:
- code
---
# Model Card for tinymistral-v2-pycoder-instruct-248m

This model card is for tinymistral-v2-pycoder-instruct, a Python-specific code generation model built on top of [Locutusque/TinyMistral-248M-v2-Instruct](https://huggingface.co/Locutusque/TinyMistral-248M-v2-Instruct).

## Model Details

This instruct model follows the original in using the ChatML format.

An empty prompt will return various output from the base model, but using the instruct format will produce Python code of varying quality.

### Model Description

The model is in active development, as is its base model; both should be treated with caution.

- **Developed by:** [Locutusque and M4ai]
- **Funded by:** [Lint from a corner pocket]
- **Shared by:** [jtatman](https://huggingface.co/jtatman)
- **Model type:** [MistralForCausalLM](https://huggingface.co/Locutusque/TinyMistral-248M-v2)
- **License:** [Apache 2.0]
- **Finetuned from model:** [Locutusque/TinyMistral-248M-v2](https://huggingface.co/Locutusque/TinyMistral-248M-v2-Instruct)

## Uses

Generating Python code from instruct-formatted prompts.

### Direct Use

The model could probably be fine-tuned further with a more comprehensive dataset; experiments are in progress.

## How to Get Started with the Model

Use the ChatML prompt format below to get started with the model:

```
<|im_start|>user
Write a function for multiplying two numbers, from variables 'a' and 'b'.<|im_end|>
<|im_start|>assistant
```

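As a minimal sketch, this prompt can be sent to the model with `transformers`. The repository id `jtatman/tinymistral-v2-pycoder-instruct-248m` is assumed from the card title, and the sampling settings are purely illustrative:

```python
# Minimal generation sketch; the repo id below is assumed from the card
# title and may differ from the actual repository name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jtatman/tinymistral-v2-pycoder-instruct-248m"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Build the ChatML-style prompt shown above.
prompt = (
    "<|im_start|>user\n"
    "Write a function for multiplying two numbers, from variables 'a' and 'b'.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
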
## Training Details

### Training Data

Custom-formatted existing Python data from:
- [jtatman/python-code-dataset-500k](https://huggingface.co/datasets/jtatman/python-code-dataset-500k)
- [jtatman/python-github-code-instruct-filtered-5k](https://huggingface.co/datasets/jtatman/python-github-code-instruct-filtered-5k)
- [jtatman/pile_python_instruct_format](https://huggingface.co/datasets/jtatman/pile_python_instruct_format)

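For reference, a minimal sketch of pulling one of these datasets with the `datasets` library; the default `train` split is an assumption:

```python
# Minimal loading sketch; the "train" split name is an assumption.
from datasets import load_dataset

ds = load_dataset("jtatman/python-code-dataset-500k", split="train")
print(ds[0])  # inspect one record's fields before any preprocessing
```
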
### Training Procedure

Training is repeated depending on the available compute budget.

#### Preprocessing

Conversion of the source data to Alpaca/instruct format, as sketched below.

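As a rough illustration of that conversion, an Alpaca-style instruction record can be mapped onto the ChatML layout this model uses; the `instruction` and `output` field names are assumptions, not the datasets' documented schema:

```python
# Hypothetical preprocessing sketch; field names are assumed and may
# differ from the actual dataset columns.
def to_chatml(record: dict) -> str:
    """Map an Alpaca-style instruction record to ChatML training text."""
    return (
        "<|im_start|>user\n"
        f"{record['instruction']}<|im_end|>\n"
        "<|im_start|>assistant\n"
        f"{record['output']}<|im_end|>\n"
    )

example = {
    "instruction": "Write a function for multiplying two numbers, from variables 'a' and 'b'.",
    "output": "def multiply(a, b):\n    return a * b",
}
print(to_chatml(example))
```
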
#### Training Hyperparameters

- **Training regime:** fp16, with parameter-efficient fine-tuning adapters merged back into the base weights when necessary and helpful (a merge sketch follows).

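A hypothetical sketch of such an adapter merge using `peft`; the adapter path is a placeholder, not a published artifact:

```python
# Hypothetical adapter-merge sketch; "path/to/adapter" is a placeholder.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Locutusque/TinyMistral-248M-v2")
model = PeftModel.from_pretrained(base, "path/to/adapter")  # placeholder path
merged = model.merge_and_unload()  # fold adapter weights into the base model
merged.save_pretrained("tinymistral-v2-pycoder-merged")  # illustrative output dir
```
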
## Evaluation

### Metrics

Latest metrics:

- epoch: 4.87
- global_step: 220
- learning_rate: 0.00006713780918727916
- loss: 2.3736