band2001 committed · verified
Commit 873cbb2 · 1 Parent(s): f73bdd4

Update README.md

Files changed (1):
  1. README.md +8 -8
README.md CHANGED
@@ -4,15 +4,15 @@ datasets:
   - band2001/stolaf-angora
 ---
 
-# Model Card for Angora-4000
+# Model Card for Angora-2400
 
 <!-- Provide a quick summary of what the model is/does. -->
 
 This model has been created to help computer science students at St. Olaf College (Northfield, MN) answer questions about fundamental CS principles as well as questions about the specific technical stacks and procedures St. Olaf Computer Science uses.
 
-## Angora-4000 Details
+## Angora-2400 Details
 
-This model is built on [Google's Gemma 7b-it](https://huggingface.co/google/gemma-7b-it). It was fine-tuned on a dataset created to address St. Olaf-specific computer science questions; some of these reference the institution's specific Git instance or the steps to declare the computer science major. The model was fine-tuned with MLX on an Apple M3 Max chip, for 4000 iterations, using LoRA.
+This model is built on [Google's Gemma 7b-it](https://huggingface.co/google/gemma-7b-it). It was fine-tuned on a dataset created to address St. Olaf-specific computer science questions; some of these reference the institution's specific Git instance or the steps to declare the computer science major. The model was fine-tuned with MLX on an Apple M3 Max chip, for 2400 iterations, using LoRA.
 
 - **Developed by:** Ben Anderson & Keegan Murray
 - **Funded by:** St. Olaf College MSCS Department
@@ -41,15 +41,15 @@ Use the code below to get started with the model.
 ```python
 from transformers import pipeline
 
-pipe = pipeline("text-generation", model="band2001/stolaf-angora-4000")
+pipe = pipeline("text-generation", model="band2001/stolaf-angora-2400")
 ```
 
 #### Load model directly
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 
-tokenizer = AutoTokenizer.from_pretrained("band2001/stolaf-angora-4000")
-model = AutoModelForCausalLM.from_pretrained("band2001/stolaf-angora-4000", device_map="auto")
+tokenizer = AutoTokenizer.from_pretrained("band2001/stolaf-angora-2400")
+model = AutoModelForCausalLM.from_pretrained("band2001/stolaf-angora-2400", device_map="auto")
 
 input_ids = tokenizer("YOUR PROMPT HERE", return_tensors="pt").to("YOUR DEVICE IF USING GPU ACCELERATION")
 
@@ -74,7 +74,7 @@ def format_prompt(prompt, system_prompt = "YOUR SYSTEM PROMPT"):
 <start_of_turn>model
 """.format(system_prompt, prompt)
 
-model, tokenizer = load("band2001/stolaf-angora-4000")
+model, tokenizer = load("band2001/stolaf-angora-2400")
 
 prompt = format_prompt("YOUR PROMPT HERE")
 
@@ -171,7 +171,7 @@ def format_prompt(prompt, system_prompt = SYSTEM_PROMPT):
 
 #### Training Process
 
-The MLX LoRA fine-tuning approach was used; you can learn more about [MLX LoRA here](https://github.com/ml-explore/mlx-examples/blob/main/lora/README.md). Gemma-7b-it was loaded without any conversion. The default `batch_size = 16` was kept, and the 4000-iteration model was produced by running five successive 800-iteration tuning passes. Once the fine-tuned weights were created, the model was fused using MLX's fuse functionality; you can learn more about [fusing with MLX here](https://github.com/ml-explore/mlx-examples/blob/main/lora/README.md#Fuse-and-Upload). One important change when fusing was editing the MLX package code to write `"format":"pt"` into the safetensors metadata so this model can be used with the transformers library: in `<path_to_your_site-packages>/mlx_lm/utils.py`, replace `mx.save_safetensors(str(shard_path), shard, metadata={"format":"mlx"})` with `mx.save_safetensors(str(shard_path), shard, metadata={"format":"pt"})` so the fused weights are written with the expected metadata attribute. Special thanks to [Alexweberk's guide on GitHub](https://gist.github.com/alexweberk/635431b5c5773efd6d1755801020429f) for helping solve this issue. Finally, the fused model was uploaded to this Hugging Face repo.
+The MLX LoRA fine-tuning approach was used; you can learn more about [MLX LoRA here](https://github.com/ml-explore/mlx-examples/blob/main/lora/README.md). Gemma-7b-it was loaded without any conversion. The default `batch_size = 16` was kept, and the 2400-iteration model was produced by running three successive 800-iteration tuning passes. Once the fine-tuned weights were created, the model was fused using MLX's fuse functionality; you can learn more about [fusing with MLX here](https://github.com/ml-explore/mlx-examples/blob/main/lora/README.md#Fuse-and-Upload). One important change when fusing was editing the MLX package code to write `"format":"pt"` into the safetensors metadata so this model can be used with the transformers library: in `<path_to_your_site-packages>/mlx_lm/utils.py`, replace `mx.save_safetensors(str(shard_path), shard, metadata={"format":"mlx"})` with `mx.save_safetensors(str(shard_path), shard, metadata={"format":"pt"})` so the fused weights are written with the expected metadata attribute. Special thanks to [Alexweberk's guide on GitHub](https://gist.github.com/alexweberk/635431b5c5773efd6d1755801020429f) for helping solve this issue. Finally, the fused model was uploaded to this Hugging Face repo.
 
 If you look at the GitHub repo for this project, `mlx_lora.sh` contains the command used for LoRA fine-tuning, `mlx_fuse.sh` the model-fusing command, and `mlx_upload.sh` the upload command. There is also an optional `mlx_convert.sh` for converting the Google Gemma 7b-it model before fine-tuning, if desired.
 
 
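For quick reference, the pipeline quickstart from the hunks above can be exercised end to end. This is a minimal sketch assuming a machine that can hold the 7B weights; the question text and `max_new_tokens` value are illustrative assumptions, not from the model card:

```python
from transformers import pipeline

# Repo id as it appears after this commit.
pipe = pipeline("text-generation", model="band2001/stolaf-angora-2400")

# Illustrative prompt and generation length (assumptions, not card defaults).
result = pipe(
    "How do I declare the computer science major at St. Olaf?",
    max_new_tokens=256,
)
print(result[0]["generated_text"])
```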
 
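The direct-load hunk stops at tokenization. Here is a hedged sketch of how that example typically continues with `generate` and `decode`; the device handling via `model.device` and the generation settings are assumptions, not the card's exact code:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("band2001/stolaf-angora-2400")
model = AutoModelForCausalLM.from_pretrained("band2001/stolaf-angora-2400", device_map="auto")

# Tokenize the prompt and move the tensors to the model's device.
inputs = tokenizer("How do I declare the computer science major?", return_tensors="pt").to(model.device)

# max_new_tokens is an illustrative value, not taken from the model card.
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```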
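On the MLX side, the hunk changes only the `load(...)` line and abridges `format_prompt`. A minimal sketch of how the pieces fit together with `mlx_lm`, assuming a Gemma-style turn template; the template body and `max_tokens` are reconstructions, not the card's exact code:

```python
from mlx_lm import load, generate

def format_prompt(prompt, system_prompt="YOUR SYSTEM PROMPT"):
    # Gemma instruction-tuned models wrap turns in <start_of_turn>/<end_of_turn>;
    # this template is an assumption based on the abridged hunk above.
    return """<start_of_turn>user
{}
{}<end_of_turn>
<start_of_turn>model
""".format(system_prompt, prompt)

model, tokenizer = load("band2001/stolaf-angora-2400")
prompt = format_prompt("How do I declare the computer science major?")

# max_tokens is an illustrative value.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)
```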
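Finally, since the `utils.py` metadata tweak in the training notes is the easiest step to get wrong, here is the call being patched shown in isolation. This is a standalone sketch assuming `mlx` is installed; the file name and dummy shard are placeholders, not the real fused weights:

```python
import mlx.core as mx

# A placeholder shard; in the real workflow these are the fused model weights.
shard = {"model.layers.0.weight": mx.zeros((8, 8))}

# The one-word change described above: writing "pt" instead of "mlx" into the
# safetensors header metadata so the transformers loader accepts the shards.
mx.save_safetensors("model-00001-of-00001.safetensors", shard, metadata={"format": "pt"})
```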