limcheekin committed on
Commit
4c9ba1e
1 Parent(s): ea29fa4

feat: updated code and doc to serve the orca_mini_v3_7B-GGUF (Q4_K_M) model

Files changed (4)
  1. Dockerfile +1 -1
  2. README.md +9 -10
  3. index.html +7 -7
  4. main.py +2 -2
Dockerfile CHANGED
@@ -15,7 +15,7 @@ RUN pip install -U pip setuptools wheel && \
 
 # Download model
 RUN mkdir model && \
-    curl -L https://huggingface.co/TheBloke/orca_mini_v3_7B-GGML/resolve/main/orca_mini_v3_7b.ggmlv3.q4_1.bin -o model/ggmlv3-model.bin
+    curl -L https://huggingface.co/TheBloke/orca_mini_v3_7B-GGUF/resolve/main/orca_mini_v3_7b.Q4_K_M.gguf -o model/gguf-model.bin
 
 COPY ./start_server.sh ./
 COPY ./main.py ./
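The Dockerfile now downloads a GGUF file instead of a GGML one. A GGUF file starts with the four-byte magic `GGUF` followed by a little-endian uint32 format version, so a quick post-download sanity check can read just the header. A minimal sketch (the helper names are hypothetical, not part of this repo):

```python
import struct

GGUF_MAGIC = b"GGUF"  # GGUF files begin with these four bytes


def is_gguf(header: bytes) -> bool:
    """Return True if the first four bytes match the GGUF magic."""
    return header[:4] == GGUF_MAGIC


def gguf_version(header: bytes) -> int:
    """Read the little-endian uint32 format version following the magic."""
    return struct.unpack_from("<I", header, 4)[0]


# Synthetic header for illustration: magic + version 3
sample = GGUF_MAGIC + struct.pack("<I", 3)
```

In the container, reading the first 8 bytes of `model/gguf-model.bin` and passing them through these helpers would catch a truncated or HTML-error-page download before the server tries to load it.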
README.md CHANGED
@@ -1,21 +1,20 @@
 ---
-title: orca_mini_v3_7B-GGML (q4_1)
+title: orca_mini_v3_7B-GGUF (Q4_K_M)
 colorFrom: purple
 colorTo: blue
 sdk: docker
 models:
-- psmathur/orca_mini_v3_7b
-- TheBloke/orca_mini_v3_7B-GGML
+- psmathur/orca_mini_v3_7b
+- TheBloke/orca_mini_v3_7B-GGUF
 tags:
-- inference api
-- openai-api compatible
-- llama-cpp-python
-- orca_mini_v3_7B
-- ggml
+- inference api
+- openai-api compatible
+- llama-cpp-python
+- orca_mini_v3_7B
+- gguf
 pinned: false
-duplicated_from: limcheekin/orca_mini_v3_7B-GGML
 ---
 
-# orca_mini_v3_7B-GGML (q4_1)
+# orca_mini_v3_7B-GGUF (Q4_K_M)
 
 Please refer to the [index.html](index.html) for more information.
index.html CHANGED
@@ -1,14 +1,14 @@
 <!DOCTYPE html>
 <html>
 <head>
-  <title>orca_mini_v3_7B-GGML (q4_1)</title>
+  <title>orca_mini_v3_7B-GGUF (Q4_K_M)</title>
 </head>
 <body>
-  <h1>orca_mini_v3_7B-GGML (q4_1)</h1>
+  <h1>orca_mini_v3_7B-GGUF (Q4_K_M)</h1>
 <p>
 With the utilization of the
 <a href="https://github.com/abetlen/llama-cpp-python">llama-cpp-python</a>
-package, we are excited to introduce the GGML model hosted in the Hugging
+package, we are excited to introduce the GGUF model hosted in the Hugging
 Face Docker Spaces, made accessible through an OpenAI-compatible API. This
 space includes comprehensive API documentation to facilitate seamless
 integration.
@@ -16,14 +16,14 @@
 <ul>
 <li>
 The API endpoint:
-<a href="https://limcheekin-orca-mini-v3-7b-ggml.hf.space/v1"
-  >https://limcheekin-orca-mini-v3-7b-ggml.hf.space/v1</a
+<a href="https://limcheekin-orca-mini-v3-7b-gguf.hf.space/v1"
+  >https://limcheekin-orca-mini-v3-7b-gguf.hf.space/v1</a
 >
 </li>
 <li>
 The API doc:
-<a href="https://limcheekin-orca-mini-v3-7b-ggml.hf.space/docs"
-  >https://limcheekin-orca-mini-v3-7b-ggml.hf.space/docs</a
+<a href="https://limcheekin-orca-mini-v3-7b-gguf.hf.space/docs"
+  >https://limcheekin-orca-mini-v3-7b-gguf.hf.space/docs</a
 >
 </li>
 </ul>
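The `/v1` endpoint advertised in index.html speaks the OpenAI chat-completions wire format, so any OpenAI-style client can talk to it. A minimal sketch of the JSON body such a client would POST to `/v1/chat/completions` (the model name and prompt here are illustrative, not fixed by the server):

```python
import json

# Illustrative request body for the OpenAI-compatible chat endpoint
# served by llama-cpp-python; field values are examples only.
payload = {
    "model": "gguf-model",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello."},
    ],
    "max_tokens": 32,
}

# Serialize exactly as an HTTP client would before sending the request.
body = json.dumps(payload)
```

The response follows the same OpenAI schema, with the generated text under `choices[0].message.content`.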
main.py CHANGED
@@ -4,8 +4,8 @@ import os
 
 app = create_app(
     Settings(
-        n_threads=2,  # set to number of cpu cores
-        model="model/ggmlv3-model.bin",
+        n_threads=2,  # set to number of cpu cores
+        model="model/gguf-model.bin",
         embedding=False
     )
 )
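Putting the hunk together, main.py after this commit reads roughly as follows. This is a sketch assuming the `create_app`/`Settings` entry points from llama-cpp-python's server module of that era; it cannot run without the model file the Dockerfile downloads:

```python
import os

# Assumed import path for llama-cpp-python's OpenAI-compatible server;
# newer releases may organize these entry points differently.
from llama_cpp.server.app import create_app, Settings

app = create_app(
    Settings(
        n_threads=2,  # set to number of cpu cores
        model="model/gguf-model.bin",  # GGUF file fetched in the Dockerfile
        embedding=False,  # skip the embeddings endpoint
    )
)
```

`start_server.sh` would then hand this `app` object to an ASGI server such as uvicorn.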