Naphula committed on
Commit
0caf794
·
verified ·
1 Parent(s): f43fd2b

Update README.md

Files changed (1)
  1. README.md +6 -1
README.md CHANGED
@@ -8,7 +8,7 @@ pinned: false
8
  ---
9
 
10
  # Model Tools by Naphula
11
- Tools to enhance LLM quantizations and merging. Merge and audit large language models with low VRAM.
12
 
13
  # [graph_v18.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/graph_v18.py)
14
  - Merge models in minutes instead of hours on low VRAM. For a 3060/3060 Ti user: This script enables functionality that is otherwise impossible (merging 70B models or large 7B merges with `--cuda`) without OOM. [More details here](https://huggingface.co/spaces/Naphula/model_tools/blob/main/mergekit_low-VRAM-graph_patch.md)
@@ -17,6 +17,10 @@ Tools to enhance LLM quantizations and merging. Merge and audit large language m
17
  # config.py
18
  - Simply replace line 13 | BEFORE `ScalarOrGradient: TypeAlias = Union[float, List[float]]` → AFTER `ScalarOrGradient: TypeAlias = Union[float, List[float], str, bool]` | to allow for custom filepath strings within parameter settings.
19
 
 
 
 
 
20
  # [enable_fix_mistral_regex_true.md](https://huggingface.co/spaces/Naphula/model_tools/blob/main/enable_fix_mistral_regex_true.md)
21
  - Merge models with extreme tokenizer incompatibility. Requires modifying the `mergekit.yaml` `tokenizer` section and adding `--fix-mistral-regex` to your merge commands. (Note: Do not use `token_surgeon.py`, `gen_id_patcher.py`, or `vocab_id_patcher.py` with this, they are obsolete now.) Configured for MN 12B by default. Follow the steps in this guide to modify these scripts:
22
  - `mergekit/merge.py`
@@ -55,6 +59,7 @@ Tools to enhance LLM quantizations and merging. Merge and audit large language m
55
 
56
  # [arcee_fusion_salience_scanner.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/arcee_fusion_salience_scanner.py)
57
  - Scan the salience % of your arcee_fusion merges. The default `tukey_fence` value is 1.5 which results in 12.5% salience, but [this can be adjusted (see guide here)](modify_arcee_fusion_tukey_fence_parameter.md).
 
58
 
59
  # [eos_scanner.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner.py)
60
  - Updated! This tool scans the tokenizer jsons to detect any mismatches with EOS tokens, which cause early termination bugs. You can then use the [gen_id_patcher.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/gen_id_patcher.py) and [vocab_id_patcher.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/vocab_id_patcher.py), or the [chatml_to_mistral.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/chatml_to_mistral.py) to patch missing `generation_config.json` files for EOS token. See [this post](https://huggingface.co/Naphula/Q0_Bench/discussions/1?not-for-all-audiences=true#6987717c762f0a45f672e250) as well as the [EOS Scanner ReadMe](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner_readme.md) for more info.
 
8
  ---
9
 
10
  # Model Tools by Naphula
11
+ Tools to enhance LLM quantizations and merging. Merge and audit large language models on low-VRAM GPUs.
12
 
13
  # [graph_v18.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/graph_v18.py)
14
  - Merge models in minutes instead of hours on low VRAM. For a 3060/3060 Ti user: This script enables functionality that is otherwise impossible (merging 70B models or large 7B merges with `--cuda`) without OOM. [More details here](https://huggingface.co/spaces/Naphula/model_tools/blob/main/mergekit_low-VRAM-graph_patch.md)
 
17
  # config.py
18
  - Simply replace line 13 | BEFORE `ScalarOrGradient: TypeAlias = Union[float, List[float]]` → AFTER `ScalarOrGradient: TypeAlias = Union[float, List[float], str, bool]` | to allow for custom filepath strings within parameter settings.
19
 
20
+ # [embed_12B.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/embed_12B.py) and [embed_24B.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/embed_24B.py)
21
+ - An alternative for cases where `--fix-mistral-regex` and `tokensurgeon` fail, such as `della` or `passthrough` merges between models with mismatched `vocab_size`. Read [the guide](https://huggingface.co/spaces/Naphula/model_tools/blob/main/Mergekit-Robustness-Patch-embed_v2.md), then download either file and save it as `mergekit-main\mergekit\tokenizer\embed.py`. One version targets Mistral Nemo 12B (v2d), the other Mistral Small 24B (v2a).
22
+ - The default `embed.py` sometimes works best, so keep a copy of it too. If the default fails, try the 12B or 24B version.
23
+
24
  # [enable_fix_mistral_regex_true.md](https://huggingface.co/spaces/Naphula/model_tools/blob/main/enable_fix_mistral_regex_true.md)
25
  - Merge models with extreme tokenizer incompatibility. Requires modifying the `mergekit.yaml` `tokenizer` section and adding `--fix-mistral-regex` to your merge commands. (Note: Do not use `token_surgeon.py`, `gen_id_patcher.py`, or `vocab_id_patcher.py` with this, they are obsolete now.) Configured for MN 12B by default. Follow the steps in this guide to modify these scripts:
26
  - `mergekit/merge.py`
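
The `config.py` change listed earlier in this README widens the `ScalarOrGradient` alias so that strings and booleans are accepted as parameter values. The following is a minimal sketch of what that widening means at the type level. It is not mergekit's own validation logic, which this README does not show: `accepts` is a hypothetical helper written only for illustration, the example path is made up, and the `TypeAlias` annotation is omitted so the snippet runs on older Python versions.

```python
from typing import List, Union, get_args, get_origin

# The two versions of the alias quoted in the README (line 13 of config.py).
ScalarOrGradientBefore = Union[float, List[float]]
ScalarOrGradientAfter = Union[float, List[float], str, bool]


def accepts(alias, value) -> bool:
    """Hypothetical strict check: does `value` match one member of the Union?"""
    for member in get_args(alias):
        if get_origin(member) is list:
            # List[float]: every element must be an instance of the item type.
            (item_type,) = get_args(member)
            if isinstance(value, list) and all(isinstance(v, item_type) for v in value):
                return True
        elif isinstance(value, member):
            return True
    return False


example_path = "models/donor.safetensors"  # made-up path for illustration
print(accepts(ScalarOrGradientBefore, example_path))  # False
print(accepts(ScalarOrGradientAfter, example_path))   # True
```

Under the original alias a filepath string matches no member of the union, which is why the README asks for `str` (and `bool`) to be added.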
 
59
 
60
  # [arcee_fusion_salience_scanner.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/arcee_fusion_salience_scanner.py)
61
  - Scan the salience % of your arcee_fusion merges. The default `tukey_fence` value is 1.5 which results in 12.5% salience, but [this can be adjusted (see guide here)](modify_arcee_fusion_tukey_fence_parameter.md).
62
+ - Updated version: [arcee_fusion_salience_scanner_v3.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/arcee_fusion_salience_scanner_v3.py)
63
 
64
  # [eos_scanner.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner.py)
65
  - Updated! This tool scans the tokenizer jsons to detect any mismatches with EOS tokens, which cause early termination bugs. You can then use the [gen_id_patcher.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/gen_id_patcher.py) and [vocab_id_patcher.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/vocab_id_patcher.py), or the [chatml_to_mistral.py](https://huggingface.co/spaces/Naphula/model_tools/blob/main/chatml_to_mistral.py) to patch missing `generation_config.json` files for EOS token. See [this post](https://huggingface.co/Naphula/Q0_Bench/discussions/1?not-for-all-audiences=true#6987717c762f0a45f672e250) as well as the [EOS Scanner ReadMe](https://huggingface.co/spaces/Naphula/model_tools/blob/main/eos_scanner_readme.md) for more info.
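
The kind of check the EOS scanner entry describes can be sketched roughly as follows. This is not the code of `eos_scanner.py`, whose internals are not shown here. It assumes the usual Hugging Face file layout (`config.json`, `generation_config.json`, `tokenizer_config.json`, `tokenizer.json`) and field names (`eos_token_id`, `eos_token`, `added_tokens`). The function names `load_json`, `as_id_set`, and `eos_report` are hypothetical.

```python
import json
from pathlib import Path


def load_json(path):
    """Return the parsed JSON file, or None when it does not exist."""
    return json.loads(path.read_text(encoding="utf-8")) if path.is_file() else None


def as_id_set(value):
    """Normalize an EOS id field (None, int, or list of ints) to a set."""
    if value is None:
        return set()
    return set(value) if isinstance(value, list) else {value}


def eos_report(model_dir):
    """Collect the EOS id implied by each file and flag disagreement."""
    root = Path(model_dir)
    config = load_json(root / "config.json") or {}
    generation = load_json(root / "generation_config.json")
    tok_config = load_json(root / "tokenizer_config.json") or {}
    tokenizer = load_json(root / "tokenizer.json") or {}

    # tokenizer_config.json stores the EOS token as text, or as a dict with "content".
    eos_token = tok_config.get("eos_token")
    if isinstance(eos_token, dict):
        eos_token = eos_token.get("content")

    # Resolve the token text to an id via tokenizer.json (dict-style vocab only).
    vocab = tokenizer.get("model", {}).get("vocab", {})
    token_to_id = dict(vocab) if isinstance(vocab, dict) else {}
    for added in tokenizer.get("added_tokens", []):
        token_to_id[added["content"]] = added["id"]

    ids = {
        "config.json": as_id_set(config.get("eos_token_id")),
        "generation_config.json": as_id_set((generation or {}).get("eos_token_id")),
        "tokenizer": as_id_set(token_to_id.get(eos_token)),
    }
    present = [s for s in ids.values() if s]
    return {
        "ids": ids,
        "missing_generation_config": generation is None,
        # Mismatch: at least two sources report an EOS id and they share none.
        "mismatch": len(present) > 1 and not set.intersection(*present),
    }
```

Calling `eos_report("path/to/model")` returns the ids found per source, so a report with `mismatch` set to `True` or `missing_generation_config` set to `True` points at a model that may need patching before merging.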