SwastikM committed on
Commit
11d61c0
·
verified ·
1 Parent(s): 2923e41

Update README.md

  1. README.md +15 -49
README.md CHANGED
@@ -41,7 +41,11 @@ Addressing the efficacy of Quantization and PEFT. Implemented as a personal Proje
41
 
42
  ### How to use
43
 
44
- The quantized model is finetuned as PEFT. We have the trained Adapter. <br>The trained adapter needs to be merged with the Base Model on which it was trained.
 
 
 
 
45
 
46
  ```python
47
  instruction = 'model_input = "Help me set up my daily to-do list!"'
@@ -73,20 +77,15 @@ print(code)
73
 
74
  HuggingFace Accelerate with Training Loop.
75
 
76
- #### Preprocessing
77
-
78
- - ***Encoder Input:*** "sql_prompt: " + data['sql_prompt']+" sql_context: "+data['sql_context']
79
- - ***Decoder Input:*** data['sql']
80
-
81
 
82
  #### Training Hyperparameters
83
 
84
  - **Optimizer:** AdamW
85
  - **lr:** 2e-5
86
  - **decay:** linear
87
- - **num_warmup_steps:** 0
88
- - **batch_size:** 8
89
- - **num_training_steps:** 12500
90
 
91
 
92
  #### Hardware
@@ -94,51 +93,18 @@ HuggingFace Accelerate with Training Loop.
94
  - **GPU:** P100
95
 
96
 
97
- ### Citing Dataset and BaseModel
98
-
99
- ```
100
- @software{gretel-synthetic-text-to-sql-2024,
101
- author = {Meyer, Yev and Emadi, Marjan and Nathawani, Dhruv and Ramaswamy, Lipika and Boyd, Kendrick and Van Segbroeck, Maarten and Grossman, Matthew and Mlocek, Piotr and Newberry, Drew},
102
- title = {{Synthetic-Text-To-SQL}: A synthetic dataset for training language models to generate SQL queries from natural language prompts},
103
- month = {April},
104
- year = {2024},
105
- url = {https://huggingface.co/datasets/gretelai/synthetic-text-to-sql}
106
- }
107
- ```
108
-
109
- ```
110
- @article{DBLP:journals/corr/abs-1910-13461,
111
- author = {Mike Lewis and
112
- Yinhan Liu and
113
- Naman Goyal and
114
- Marjan Ghazvininejad and
115
- Abdelrahman Mohamed and
116
- Omer Levy and
117
- Veselin Stoyanov and
118
- Luke Zettlemoyer},
119
- title = {{BART:} Denoising Sequence-to-Sequence Pre-training for Natural Language
120
- Generation, Translation, and Comprehension},
121
- journal = {CoRR},
122
- volume = {abs/1910.13461},
123
- year = {2019},
124
- url = {http://arxiv.org/abs/1910.13461},
125
- eprinttype = {arXiv},
126
- eprint = {1910.13461},
127
- timestamp = {Thu, 31 Oct 2019 14:02:26 +0100},
128
- biburl = {https://dblp.org/rec/journals/corr/abs-1910-13461.bib},
129
- bibsource = {dblp computer science bibliography, https://dblp.org}
130
- }
131
-
132
- ```
133
-
134
  ## Additional Information
135
 
136
- - ***Github:*** [Repository](https://github.com/swastikmaiti/SwastikM-bart-large-nl2sql.git)
 
 
 
 
137
 
138
  ## Acknowledgment
139
 
140
- Thanks to [@AI at Meta](https://huggingface.co/facebook) for adding the Pre Trained Model.
141
- Thanks to [@Gretel.ai](https://huggingface.co/gretelai) for adding the dataset.
142
 
143
 
144
  ## Model Card Authors
 
41
 
42
  ### How to use
43
 
44
+ ```
45
+ The quantized model is fine-tuned with PEFT, so what we have is the trained adapter.
46
+ Merging a LoRA adapter into a GPTQ-quantized model is not yet supported.
47
+ So instead of loading a single fine-tuned model, we need to load the base model and attach the fine-tuned adapter on top.
48
+ ```
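
The loading pattern described above could look roughly like the sketch below. It is an assumption-laden illustration, not the author's code: it assumes the base checkpoint is a GPTQ-quantized causal language model, that `transformers`, `peft`, and a GPTQ backend are installed, and the function name and both repository-id parameters are hypothetical placeholders.

```python
def load_base_with_adapter(base_model_id: str, adapter_id: str):
    """Load a quantized base model and attach a LoRA adapter without merging.

    Both ids are placeholders supplied by the caller; neither is taken
    from the README.
    """
    # Imported lazily so that defining the function does not require
    # the heavy dependencies to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # Assumption: the checkpoint ships its own quantization config,
    # so no extra quantization arguments are passed here.
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_id, device_map="auto"
    )
    # The adapter weights stay separate and are applied on top of the
    # frozen quantized base model at inference time.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, tokenizer


# Hypothetical usage (ids are placeholders):
# model, tokenizer = load_base_with_adapter("<base-model-id>", "<adapter-id>")
```

Keeping the adapter separate avoids the unsupported merge step, at the cost of loading two artifacts instead of one.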
49
 
50
  ```python
51
  instruction = 'model_input = "Help me set up my daily to-do list!"'
 
77
 
78
  HuggingFace Accelerate with Training Loop.
79
 
 
 
 
 
 
80
 
81
  #### Training Hyperparameters
82
 
83
  - **Optimizer:** AdamW
84
  - **lr:** 2e-5
85
  - **decay:** linear
86
+ - **batch_size:** 4
87
+ - **gradient_accumulation_steps:** 8
88
+ - **global_step:** 625
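
As a rough check on the updated hyperparameters, the sketch below computes the effective batch size and a linear learning-rate decay. It assumes a single GPU, that `global_step` counts optimizer updates, and zero warmup steps (the updated list does not state a warmup value). The function names are illustrative and not taken from the training script.

```python
def effective_batch_size(batch_size: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Number of samples contributing to one optimizer update."""
    return batch_size * grad_accum_steps * num_devices


def linear_decay_lr(step: int, total_steps: int, base_lr: float = 2e-5,
                    warmup_steps: int = 0) -> float:
    """Linear warmup (optional) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Values from the hyperparameter list above.
per_update = effective_batch_size(batch_size=4, grad_accum_steps=8)
updates = 625  # assumption: global_step counts optimizer updates

print("samples per optimizer update:", per_update)
print("samples seen over training:", per_update * updates)
print("learning rate at update 125:", linear_decay_lr(125, updates))
```

Under these assumptions each update sees 32 samples, so 625 updates cover 20,000 samples in total.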
89
 
90
 
91
  #### Hardware
 
93
  - **GPU:** P100
94
 
95
 
96
  ## Additional Information
97
 
98
+ - ***Github:*** [Repository]()
99
+ - ***Intro to quantization:*** [Blog](https://huggingface.co/blog/merve/quantization)
100
+ - ***Emergent Feature:*** [Academic](https://timdettmers.com/2022/08/17/llm-int8-and-emergent-features)
101
+ - ***GPTQ Paper:*** [GPTQ](https://arxiv.org/pdf/2210.17323)
102
+ - ***bitsandbytes and further:*** [LLM.int8()](https://arxiv.org/pdf/2208.07339)
103
 
104
  ## Acknowledgment
105
 
106
+ Thanks to [@Merve Noyan](https://huggingface.co/blog/merve/quantization) for the precise intro.
107
+ Thanks to [@HuggingFace Team](https://colab.research.google.com/drive/1_TIrmuKOFhuRRiTWN94iLKUFu6ZX4ceb?usp=sharing#scrollTo=vT0XjNc2jYKy) for the coding guide on GPTQ.
108
 
109
 
110
  ## Model Card Authors