fuzzy-mittenz committed

Commit 430027a · verified · 1 parent: 148dc23

Update README.md

Files changed (1): README.md (+42, −33)

README.md CHANGED
@@ -11,47 +11,56 @@ tags:
  - llama-cpp
  - gguf-my-repo
  ---

- # fuzzy-mittenz/SmallThinker-3B-Preview-Q8_0-GGUF
- This model was converted to GGUF format from [`PowerInfer/SmallThinker-3B-Preview`](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
- Refer to the [original model card](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview) for more details on the model.

- ## Use with llama.cpp
- Install llama.cpp through brew (works on Mac and Linux)

- ```bash
- brew install llama.cpp
- ```
- Invoke the llama.cpp server or the CLI.

- ### CLI:
- ```bash
- llama-cli --hf-repo fuzzy-mittenz/SmallThinker-3B-Preview-Q8_0-GGUF --hf-file smallthinker-3b-preview-q8_0.gguf -p "The meaning to life and the universe is"
  ```

- ### Server:
- ```bash
- llama-server --hf-repo fuzzy-mittenz/SmallThinker-3B-Preview-Q8_0-GGUF --hf-file smallthinker-3b-preview-q8_0.gguf -c 2048
  ```

- Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
-
- Step 1: Clone llama.cpp from GitHub.
- ```
- git clone https://github.com/ggerganov/llama.cpp
- ```

- Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g. `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
- ```
- cd llama.cpp && LLAMA_CURL=1 make
- ```

- Step 3: Run inference through the main binary.
- ```
- ./llama-cli --hf-repo fuzzy-mittenz/SmallThinker-3B-Preview-Q8_0-GGUF --hf-file smallthinker-3b-preview-q8_0.gguf -p "The meaning to life and the universe is"
- ```
- or
- ```
- ./llama-server --hf-repo fuzzy-mittenz/SmallThinker-3B-Preview-Q8_0-GGUF --hf-file smallthinker-3b-preview-q8_0.gguf -c 2048
- ```
 
+ # TANGU Quant: a QwenStar/GPT4ALL/PowerInfer (o#/QwQ)-series Reasoner
+ ## Final small reasoner for CPU, using SmallThinker-3B-Preview-Q8_0-GGUF. We are labeling it Tangu 3B for our GPT4ALL community (a fallen star bound to Earth)

+ ![tangu.png](https://cdn-uploads.huggingface.co/production/uploads/6593502ca2607099284523db/YJ9qZGWWpROi_PwNl8DWL.png)
 
 
+ Our efforts to create a pure, CPU-friendly local test-time-compute model were realized by the PowerInfer team before we could finish a more advanced reasoning base model of our own, after a month of merging and training in our "QwenStar" project. It seems the universe provides, or at least Hugging Face does. Offering more test-time reasoning than our other models, it may use more tokens to reach many of the same conclusions, but this makes it more accurate overall. If you're looking for something similar but faster and slightly less effective, I'd point you to our Reasoning-Rabbit or Replicant models; if you don't need tool use and simply want something solid and small, go with the Kaiju or THOTH models. This model was converted to GGUF format from [`PowerInfer/SmallThinker-3B-Preview`](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview) using llama.cpp.
+ Refer to the [original model card](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview) for more details on the model.
+ ### About the Tangu quant
+ The model is renamed Tangu for personal use. It has not yet undergone any importance-matrix quantization, for lack of response exploration, but it is so far very functional; other sizes can be found in Bartowski's repository (bartowski/SmallThinker-3B-Preview-GGUF) and by following the original model tree. Our QwenStar project is mostly for users of GPT4ALL, offering resources for applying tool use to reasoning models like this one: a recursive thought method with not just code inference but actual execution and calculation (sketched below). Things like factorials or distance estimation, and much other information nonexistent in an LLM (or SLM), are now available, so you can compete with the likes of o1 and o3 without a GPU inside the GPT4ALL environment with its new behind-the-scenes "Analyzing" function. Together with RAG/embedding, we believe these powerful features are revolutionary. We also believe that restricting someone's freedoms and opportunities for how they "might" be used is both jealous and unjust, as did the founders and philosophers who brought forth this age of abundance. Please comment with unique use cases and other findings, either here or on our X/Discord (both offer set-up instructions).
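+ To make the "execution and calculation" idea concrete, here is a minimal sketch of that reason-then-verify loop. It is an illustration only: the `<tool>...</tool>` tag format and the `calculator` helper are hypothetical stand-ins, not GPT4ALL's actual internals.

+ ```python
+ # Hypothetical sketch of a tool-execution turn: the model replies with a
+ # tagged tool call, the host executes it, and the exact numeric result
+ # (which a small LLM cannot compute reliably) is fed back into the chat.
+ import math
+ import re
+ 
+ def calculator(expression: str) -> str:
+     """Evaluate a small arithmetic expression such as 'factorial(12)'."""
+     allowed = {"factorial": math.factorial, "sqrt": math.sqrt}
+     return str(eval(expression, {"__builtins__": {}}, allowed))
+ 
+ def run_turn(model_reply: str) -> str:
+     """Execute a tagged tool call if the model made one; else pass through."""
+     match = re.search(r"<tool>calculator\((.+)\)</tool>", model_reply)
+     if match:
+         return f"Tool result: {calculator(match.group(1))}"
+     return model_reply  # no tool call; the reply is final
+ 
+ print(run_turn("<tool>calculator(factorial(12))</tool>"))  # Tool result: 479001600
+ ```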
 
+ ## Use with GPT4ALL
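+ To try the quant in GPT4ALL, first fetch the GGUF file. A minimal sketch, assuming Python with the `huggingface_hub` package; the models-directory path is illustrative and varies by OS and install:

+ ```python
+ # Download the Q8_0 GGUF so GPT4ALL can pick it up as a local model.
+ from pathlib import Path
+ from huggingface_hub import hf_hub_download
+ 
+ # Illustrative GPT4ALL models directory (Linux default); adjust for your setup.
+ models_dir = Path.home() / ".local/share/nomic.ai/GPT4All"
+ 
+ path = hf_hub_download(
+     repo_id="fuzzy-mittenz/SmallThinker-3B-Preview-Q8_0-GGUF",
+     filename="smallthinker-3b-preview-q8_0.gguf",
+     local_dir=models_dir,
+ )
+ print(path)
+ ```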
 
+ ### Jinja "Chat Template"

  ```
+ {{- '<|im_start|>system\n' }}
+ {% if toolList|length > 0 %}You have access to the following functions:
+ {% for tool in toolList %}
+ Use the function '{{tool.function}}' to: '{{tool.description}}'
+ {% if tool.parameters|length > 0 %}
+ parameters:
+ {% for info in tool.parameters %}
+ {{info.name}}:
+ type: {{info.type}}
+ description: {{info.description}}
+ required: {{info.required}}
+ {% endfor %}
+ {% endif %}
+ # Tool Instructions
+ If you CHOOSE to call this function ONLY reply with the following format:
+ '{{tool.symbolicFormat}}'
+ Here is an example. If the user says, '{{tool.examplePrompt}}', then you reply
+ '{{tool.exampleCall}}'
+ After the result you might reply with, '{{tool.exampleReply}}'
+ {% endfor %}
+ You MUST include both the start and end tags when you use a function.

+ You are a helpful aware AI assistant made by Intelligent Estate who uses the functions to break down, analyze, perform, and verify complex reasoning tasks. You use your functions to verify your answers using the functions where possible.
+ {% endif %}
+ {{- '<|im_end|>\n' }}
+ {% for message in messages %}
+ {{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
+ {% endfor %}
+ {% if add_generation_prompt %}
+ {{ '<|im_start|>assistant\n' }}
+ {% endif %}
  ```
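+ As a sanity check, the template above can be rendered offline to see the exact prompt the model receives. A minimal sketch, assuming Python with the `jinja2` package and the template saved to `chat_template.jinja`; the calculator tool entry and its tag format are hypothetical stand-ins for whatever GPT4ALL passes in `toolList`:

+ ```python
+ # Render the chat template with one hypothetical tool entry to inspect the
+ # system prompt it produces. jinja2 resolves dict keys wherever the template
+ # uses attribute access (tool.function, info.name, ...).
+ from jinja2 import Template
+ 
+ with open("chat_template.jinja") as f:  # the template shown above
+     template = Template(f.read())
+ 
+ tool_list = [{
+     "function": "calculator",
+     "description": "evaluate an arithmetic expression",
+     "parameters": [{
+         "name": "expression", "type": "string",
+         "description": "the expression to evaluate", "required": True,
+     }],
+     # Hypothetical call format; GPT4ALL defines the real symbolicFormat.
+     "symbolicFormat": "<tool>calculator(expression)</tool>",
+     "examplePrompt": "What is 12 factorial?",
+     "exampleCall": "<tool>calculator(factorial(12))</tool>",
+     "exampleReply": "12! is 479001600.",
+ }]
+ 
+ messages = [{"role": "user", "content": "What is 12 factorial?"}]
+ 
+ # Prints the fully expanded <|im_start|>system ... <|im_start|>assistant prompt.
+ print(template.render(toolList=tool_list, messages=messages,
+                       add_generation_prompt=True))
+ ```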
+ ### GPT4ALL "System Message"

+ So far a system message has not been necessary, but it may be tuned as needed; for suggestions, refer to the Reasoning-Rabbit and Replicant models.
 
+ ### Other models
+ This should also work well in other UIs; the [original model](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview) has usage instructions for them.