Better clone-ability
Files changed:
- Dockerfile +2 -1
- README.md +23 -4
Dockerfile
CHANGED
@@ -1,10 +1,11 @@
 FROM nvidia/cuda:11.8.0-devel-ubuntu22.04
 ARG MODEL
+ARG MODEL_NAME
 RUN mkdir /opt/koboldcpp
 RUN apt update && apt install git build-essential libopenblas-dev wget python3-pip -y
 RUN git clone https://github.com/lostruins/koboldcpp /opt/koboldcpp
 WORKDIR /opt/koboldcpp
 RUN make LLAMA_OPENBLAS=1 LLAMA_CUBLAS=1 LLAMA_PORTABLE=1
 RUN wget -O model.ggml $MODEL
-CMD ["/bin/python3", "./koboldcpp.py", "--model", "model.ggml", "--usecublas", "mmq", "--gpulayers", "99", "--multiuser", "--contextsize", "4096", "--port", "7860", "--hordeconfig", "
+CMD ["/bin/python3", "./koboldcpp.py", "--model", "model.ggml", "--usecublas", "mmq", "--gpulayers", "99", "--multiuser", "--contextsize", "4096", "--port", "7860", "--hordeconfig", "HF_SPACE_$MODEL_NAME", "1", "1"]
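For anyone testing this image outside of a Space, a minimal local build-and-run sketch could look like the following; the model URL and image tag are placeholders, and --gpus all assumes the NVIDIA container toolkit is installed:

$ docker build \
    --build-arg MODEL="https://example.com/your-model.gguf" \
    --build-arg MODEL_NAME="YourModel" \
    -t koboldcpp-space .
$ docker run --gpus all -p 7860:7860 koboldcpp-space

One caveat worth noting: Docker's exec-form CMD is not run through a shell, and ARG values are not available at container runtime, so the $MODEL_NAME in the CMD above is passed to koboldcpp literally unless it is persisted with ENV and expanded via a shell-form CMD.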
README.md
CHANGED
@@ -1,11 +1,30 @@
 ---
 title: Koboldcpp Tiefighter
-emoji:
-colorFrom:
-colorTo:
+emoji: 🦎
+colorFrom: yellow
+colorTo: orange
 sdk: docker
 pinned: false
 license: agpl-3.0
 ---
 
-
+# Koboldcpp in a Space!
+Welcome to the Koboldcpp space! Koboldcpp allows you to easily make your own demonstration space for a GGUF model.
+
+### For the users
+
+In this space:
+- You can use the KoboldAI Lite UI for Instruct, Writing, Chat and Adventure use.
+- You can use the model with a KoboldAI compatible API (use the instance link that it shows + /api) or as an OpenAI compatible API (use the instance link that it shows, optionally with /v1 if your solution requires this).
+- In the UI all your data is stored locally, without a sign-in.
+- View the API documentation by accessing the frame link + /api in your browser (for example https://koboldai-koboldcpp-tiefighter.hf.space/api).
+
+### For model / space developers
+This space was designed to be easy to clone. First, make sure you convert your model to the GGUF format and quantize it to something that fits on the GPU you allocated to your space.
+
+If you have a GPU available for your space, clone this space and point the MODEL variable to your model's download location, then force a rebuild so it can use your own custom model. You can customize the model name that is displayed by setting MODEL_NAME.
+
+Want to run on the CPU tier? The following arguments in the CMD enable multiuser GPU usage:
+, "--usecublas", "mmq", "--gpulayers", "99", "--multiuser", "--contextsize", "4096"
+If you remove these from the CMD in the Dockerfile, your instance will be compatible with CPU-only usage.
+
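As a hedged illustration of the two API styles the README describes, the requests below target the example Space URL from the README; the prompt and length parameters are arbitrary, and /api/v1/generate and /v1/completions are the KoboldAI-compatible and OpenAI-compatible routes koboldcpp serves:

$ curl https://koboldai-koboldcpp-tiefighter.hf.space/api/v1/generate \
    -H "Content-Type: application/json" \
    -d '{"prompt": "Once upon a time", "max_length": 80}'

$ curl https://koboldai-koboldcpp-tiefighter.hf.space/v1/completions \
    -H "Content-Type: application/json" \
    -d '{"prompt": "Once upon a time", "max_tokens": 80}'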
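And for the CPU-only variant the README mentions, removing the listed arguments from the CMD would leave something like this sketch (derived from the Dockerfile in this commit, not a separately tested configuration):

CMD ["/bin/python3", "./koboldcpp.py", "--model", "model.ggml", "--port", "7860", "--hordeconfig", "HF_SPACE_$MODEL_NAME", "1", "1"]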