Zangs3011 committed on
Commit 2790e4b
1 Parent(s): b30c1c0

Update README.md

Files changed (1): README.md (+13 -11)
README.md CHANGED
@@ -1,28 +1,30 @@
 ---
 datasets:
- - b-mc2/sql-create-context
 library_name: peft
 tags:
- - meta-llama/Llama-2-7b
 - code
 - instruct
 - instruct-code
- - sql-create-context
- - text-to-sql
- - LLM
 ---

- We finetuned Meta-Llama-2-7B on the SQL Create Context dataset (b-mc2/sql-create-context) for 3 epochs using the [MonsterAPI](https://monsterapi.ai) no-code [LLM finetuner](https://docs.monsterapi.ai/fine-tune-a-large-language-model-llm).

- This dataset is an enhanced version of WikiSQL and Spider, focused on providing natural-language queries alongside their corresponding SQL CREATE TABLE statements. It contains 78,577 examples and aims to improve the model's grounding in text-to-SQL tasks. The CREATE TABLE statements are particularly useful for limiting token usage and avoiding exposure to sensitive data.

- The finetuning session took 6 hrs 17 mins and cost us a total of `$18.56`.

 #### Hyperparameters & Run details:
 - Model Path: meta-llama/Llama-2-7b
- - Dataset: b-mc2/sql-create-context
 - Learning rate: 0.0003
- - Number of epochs: 3
 - Data split: Training: 90% / Validation: 10%
 - Gradient accumulation steps: 1

@@ -31,4 +33,4 @@ Loss metrics:

 ---
 license: apache-2.0
- ---
 
 ---
 datasets:
+ - sahil2801/CodeAlpaca-20k
 library_name: peft
 tags:
+ - llama2-7b
 - code
 - instruct
 - instruct-code
+ - code-alpaca
+ - alpaca-instruct
+ - alpaca
+ - llama7b
+ - gpt2
 ---

+ We finetuned Llama2-7B on the CodeAlpaca-Instruct dataset (sahil2801/CodeAlpaca-20k) for 5 epochs, roughly 25,000 steps, using the [MonsterAPI](https://monsterapi.ai) no-code [LLM finetuner](https://docs.monsterapi.ai/fine-tune-a-large-language-model-llm).

+ This dataset is an unfiltered version of HuggingFaceH4/CodeAlpaca_20K, with 36 instances of blatant alignment removed.

+ The finetuning session completed in 4 hours and cost us only `$16` for the entire run!

 #### Hyperparameters & Run details:
 - Model Path: meta-llama/Llama-2-7b
+ - Dataset: sahil2801/CodeAlpaca-20k
 - Learning rate: 0.0003
+ - Number of epochs: 5
 - Data split: Training: 90% / Validation: 10%
 - Gradient accumulation steps: 1

 ---
 license: apache-2.0
+ ---
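
Since the updated card declares `library_name: peft`, the artifact of this run is a PEFT adapter rather than full model weights. Below is a minimal sketch of loading such an adapter on top of the base model with `transformers` and `peft`; it is an assumption-laden example, not part of the commit. The `adapter_id` is a placeholder for this model repo, and the prompt and generation settings are illustrative.

```python
# Minimal sketch (not from the card): load the PEFT adapter on top of the
# Llama-2-7b base named in the card and run a quick code-instruction prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b"   # base model path from the card
adapter_id = "<this-model-repo>"    # placeholder: the repo this README describes

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the training data is Alpaca-style instruction data, wrapping the prompt in an Alpaca instruction template may work better; that template is an assumption, not something the card specifies.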