limcheekin committed
Commit a5c9525
1 Parent(s): 58a8533

feat: added chat_format and update Q6_K model

Files changed (4):
  1. Dockerfile +1 -1
  2. README.md +2 -2
  3. index.html +2 -2
  4. main.py +2 -1
Dockerfile CHANGED
@@ -15,7 +15,7 @@ RUN pip install -U pip setuptools wheel && \
 
 # Download model
 RUN mkdir model && \
-    curl -L https://huggingface.co/TheBloke/rocket-3B-GGUF/resolve/main/rocket-3b.Q8_0.gguf -o model/gguf-model.bin
+    curl -L https://huggingface.co/TheBloke/rocket-3B-GGUF/resolve/main/rocket-3b.Q6_K.gguf -o model/gguf-model.bin
 
 COPY ./start_server.sh ./
 COPY ./main.py ./
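Swapping the baked-in model from Q8_0 to Q6_K mainly trades a small amount of quantization accuracy for a smaller image. A rough back-of-envelope sketch of the expected download sizes, assuming ~2.8B parameters for Rocket-3B and the approximate effective bits-per-weight of the two llama.cpp quant types (these figures are assumptions, not measured file sizes):

```python
# Rough size estimate for the two quantization schemes.
# PARAMS and the bits-per-weight values are approximations, not exact numbers.
PARAMS = 2.8e9  # Rocket-3B parameter count (approximate)
BITS = {"Q8_0": 8.5, "Q6_K": 6.56}  # approximate effective bits per weight

for name, bpw in BITS.items():
    gib = PARAMS * bpw / 8 / 2**30  # bytes -> GiB
    print(f"{name}: ~{gib:.2f} GiB")
```

On this estimate Q6_K shaves well over half a GiB off the model download while staying close to Q8_0 in quality, which is presumably the motivation for the swap.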
README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-title: rocket-3B-GGUF (Q8_0)
+title: rocket-3B-GGUF (Q6_K)
 colorFrom: purple
 colorTo: blue
 sdk: docker
@@ -15,6 +15,6 @@ tags:
 pinned: false
 ---
 
-# rocket-3B-GGUF (Q8_0)
+# rocket-3B-GGUF (Q6_K)
 
 Please refer to the [index.html](index.html) for more information.
index.html CHANGED
@@ -1,10 +1,10 @@
 <!DOCTYPE html>
 <html>
 <head>
-  <title>rocket-3B-GGUF (Q8_0)</title>
+  <title>rocket-3B-GGUF (Q6_K)</title>
 </head>
 <body>
-  <h1>rocket-3B-GGUF (Q8_0)</h1>
+  <h1>rocket-3B-GGUF (Q6_K)</h1>
 <p>
 With the utilization of the
 <a href="https://github.com/abetlen/llama-cpp-python">llama-cpp-python</a>
main.py CHANGED
@@ -6,7 +6,8 @@ app = create_app(
     Settings(
         n_threads=2,  # set to number of cpu cores
         model="model/gguf-model.bin",
-        embedding=True
+        embedding=True,
+        chat_format="chatml"
     )
 )
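The new `chat_format="chatml"` setting tells llama-cpp-python to wrap incoming chat messages in ChatML delimiters before prompting the model. A minimal sketch of what that layout looks like (this is an illustrative helper, not the library's actual implementation):

```python
# Illustrative sketch of the ChatML prompt layout that chat_format="chatml"
# produces; the real formatting lives inside llama-cpp-python.
def to_chatml(messages):
    """Render OpenAI-style messages into the ChatML prompt layout."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

Without a matching `chat_format`, the server would fall back to a default prompt template, which can noticeably degrade output from a model fine-tuned on ChatML-style conversations.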