elucidator8918 commited on
Commit
0e4d72d
1 Parent(s): dc952e5

Create README.md

Browse files
Files changed (1) hide show
  1. README.md +64 -0
README.md ADDED
@@ -0,0 +1,64 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: mit
3
+ datasets:
4
+ - emre/llama-2-instruct-121k-code
5
+ language:
6
+ - en
7
+ ---
8
+ # Model Card for Model ID: [Your Model ID Here]
9
+
10
+ ## Overview
11
+
12
+ This model, elucidator8918/apigen-prototype-0.1, is tailored for API generation, based on the Mistral-7B-Instruct-v0.1-sharded architecture fine-tuned on the LLAMA-2 Instruct 121k Code dataset.
13
+
14
+ ## Key Information
15
+
16
+ - **Model Name**: Mistral-7B-Instruct-v0.1-sharded
17
+ - **Fine-tuned Model Name**: elucidator8918/apigen-prototype-0.1
18
+ - **Dataset**: emre/llama-2-instruct-121k-code
19
+ - **Language**: English (en)
20
+
21
+ ## Model Details
22
+
23
+ - **LoRA Parameters (QLoRA):**
24
+ - LoRA attention dimension: 64
25
+ - Alpha parameter for LoRA scaling: 16
26
+ - Dropout probability for LoRA layers: 0.1
27
+
28
+ - **bitsandbytes Parameters:**
29
+ - Activate 4-bit precision base model loading
30
+ - Compute dtype for 4-bit base models: float16
31
+ - Quantization type: nf4
32
+ - Activate nested quantization for 4-bit base models: No
33
+
34
+ - **TrainingArguments Parameters:**
35
+ - Number of training epochs: 1
36
+ - Batch size per GPU for training: 4
37
+ - Batch size per GPU for evaluation: 4
38
+ - Gradient accumulation steps: 1
39
+ - Enable gradient checkpointing: Yes
40
+ - Maximum gradient norm: 0.3
41
+ - Initial learning rate: 2e-4
42
+ - Weight decay: 0.001
43
+ - Optimizer: paged_adamw_32bit
44
+ - Learning rate scheduler type: cosine
45
+ - Warm-up ratio: 0.03
46
+ - Group sequences into batches with the same length: Yes
47
+
48
+ ## Usage
49
+
50
+
51
+ - **Example Code (API Generation):**
52
+
53
+ ```python
54
+ from transformers import pipeline
55
+
56
+ api_gen_pipeline = pipeline("text-generation", model="elucidator8918/apigen-prototype-0.1")
57
+
58
+ generated_api_code = api_gen_pipeline("Your prompt or input text here", max_length=150, num_return_sequences=1)
59
+ print(generated_api_code)
60
+ ```
61
+
62
+ ## License
63
+
64
+ This model is released under the MIT License.