limcheekin committed on
Commit
4c9ba1e
1 Parent(s): ea29fa4

feat: updated code and doc to serve the orca_mini_v3_7B-GGUF (Q4_K_M) model

Files changed (4)
  1. Dockerfile +1 -1
  2. README.md +9 -10
  3. index.html +7 -7
  4. main.py +2 -2
Dockerfile CHANGED
@@ -15,7 +15,7 @@ RUN pip install -U pip setuptools wheel && \
 
 # Download model
 RUN mkdir model && \
-    curl -L https://huggingface.co/TheBloke/orca_mini_v3_7B-GGML/resolve/main/orca_mini_v3_7b.ggmlv3.q4_1.bin -o model/ggmlv3-model.bin
+    curl -L https://huggingface.co/TheBloke/orca_mini_v3_7B-GGUF/resolve/main/orca_mini_v3_7b.Q4_K_M.gguf -o model/gguf-model.bin
 
 COPY ./start_server.sh ./
 COPY ./main.py ./
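The Dockerfile now downloads a GGUF file instead of a GGML one. A GGUF file starts with the four-byte magic `GGUF` followed by a little-endian uint32 format version, so a quick post-download sanity check can read just the header. A minimal sketch (the helper names are hypothetical, not part of this repo):

```python
import struct

GGUF_MAGIC = b"GGUF"  # GGUF files begin with these four bytes


def is_gguf(header: bytes) -> bool:
    """Return True if the first four bytes match the GGUF magic."""
    return header[:4] == GGUF_MAGIC


def gguf_version(header: bytes) -> int:
    """Read the little-endian uint32 format version following the magic."""
    return struct.unpack_from("<I", header, 4)[0]


# Synthetic header for illustration: magic + version 3
sample = GGUF_MAGIC + struct.pack("<I", 3)
```

In the container, reading the first 8 bytes of `model/gguf-model.bin` and passing them through these helpers would catch a truncated or HTML-error-page download before the server tries to load it.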
README.md CHANGED
@@ -1,21 +1,20 @@
 ---
-title: orca_mini_v3_7B-GGML (q4_1)
+title: orca_mini_v3_7B-GGUF (Q4_K_M)
 colorFrom: purple
 colorTo: blue
 sdk: docker
 models:
-- psmathur/orca_mini_v3_7b
-- TheBloke/orca_mini_v3_7B-GGML
+- psmathur/orca_mini_v3_7b
+- TheBloke/orca_mini_v3_7B-GGUF
 tags:
-- inference api
-- openai-api compatible
-- llama-cpp-python
-- orca_mini_v3_7B
-- ggml
+- inference api
+- openai-api compatible
+- llama-cpp-python
+- orca_mini_v3_7B
+- gguf
 pinned: false
-duplicated_from: limcheekin/orca_mini_v3_7B-GGML
 ---
 
-# orca_mini_v3_7B-GGML (q4_1)
+# orca_mini_v3_7B-GGUF (Q4_K_M)
 
 Please refer to the [index.html](index.html) for more information.
index.html CHANGED
@@ -1,14 +1,14 @@
 <!DOCTYPE html>
 <html>
 <head>
-  <title>orca_mini_v3_7B-GGML (q4_1)</title>
+  <title>orca_mini_v3_7B-GGUF (Q4_K_M)</title>
 </head>
 <body>
-  <h1>orca_mini_v3_7B-GGML (q4_1)</h1>
+  <h1>orca_mini_v3_7B-GGUF (Q4_K_M)</h1>
 <p>
 With the utilization of the
 <a href="https://github.com/abetlen/llama-cpp-python">llama-cpp-python</a>
-package, we are excited to introduce the GGML model hosted in the Hugging
+package, we are excited to introduce the GGUF model hosted in the Hugging
 Face Docker Spaces, made accessible through an OpenAI-compatible API. This
 space includes comprehensive API documentation to facilitate seamless
 integration.
@@ -16,14 +16,14 @@
 <ul>
 <li>
 The API endpoint:
-<a href="https://limcheekin-orca-mini-v3-7b-ggml.hf.space/v1"
-  >https://limcheekin-orca-mini-v3-7b-ggml.hf.space/v1</a
+<a href="https://limcheekin-orca-mini-v3-7b-gguf.hf.space/v1"
+  >https://limcheekin-orca-mini-v3-7b-gguf.hf.space/v1</a
 >
 </li>
 <li>
 The API doc:
-<a href="https://limcheekin-orca-mini-v3-7b-ggml.hf.space/docs"
-  >https://limcheekin-orca-mini-v3-7b-ggml.hf.space/docs</a
+<a href="https://limcheekin-orca-mini-v3-7b-gguf.hf.space/docs"
+  >https://limcheekin-orca-mini-v3-7b-gguf.hf.space/docs</a
 >
 </li>
 </ul>
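The `/v1` endpoint advertised in index.html speaks the OpenAI chat-completions wire format, so any OpenAI-style client can talk to it. A minimal sketch of the JSON body such a client would POST to `/v1/chat/completions` (the model name and prompt here are illustrative, not fixed by the server):

```python
import json

# Illustrative request body for the OpenAI-compatible chat endpoint
# served by llama-cpp-python; field values are examples only.
payload = {
    "model": "gguf-model",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello."},
    ],
    "max_tokens": 32,
}

# Serialize exactly as an HTTP client would before sending the request.
body = json.dumps(payload)
```

The response follows the same OpenAI schema, with the generated text under `choices[0].message.content`.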
main.py CHANGED
@@ -4,8 +4,8 @@ import os
 
 app = create_app(
     Settings(
-        n_threads=2,  # set to number of cpu cores
-        model="model/ggmlv3-model.bin",
+        n_threads=2,  # set to number of cpu cores
+        model="model/gguf-model.bin",
         embedding=False
     )
 )
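Putting the hunk together, main.py after this commit reads roughly as follows. This is a sketch assuming the `create_app`/`Settings` entry points from llama-cpp-python's server module of that era; it cannot run without the model file the Dockerfile downloads:

```python
import os

# Assumed import path for llama-cpp-python's OpenAI-compatible server;
# newer releases may organize these entry points differently.
from llama_cpp.server.app import create_app, Settings

app = create_app(
    Settings(
        n_threads=2,  # set to number of cpu cores
        model="model/gguf-model.bin",  # GGUF file fetched in the Dockerfile
        embedding=False,  # skip the embeddings endpoint
    )
)
```

`start_server.sh` would then hand this `app` object to an ASGI server such as uvicorn.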