:pencil: [Doc] New model, and prettify formats
README.md
CHANGED
@@ -10,15 +10,17 @@ app_port: 23333
 ## HF-LLM-API
 Huggingface LLM Inference API in OpenAI message format.
 
+Project link: https://github.com/Hansimov/hf-llm-api
+
 ## Features
 
-- Available Models (2024/01/
-- `mixtral-8x7b`, `
-- Adaptive prompt templates for different models
+- Available Models (2024/01/22): [#5](https://github.com/Hansimov/hf-llm-api/issues/5)
+- `mistral-7b`, `mixtral-8x7b`, `nous-mixtral-8x7b`
+- Adaptive prompt templates for different models
 - Support OpenAI API format
 - Enable API endpoint via official `openai-python` package
 - Support both stream and no-stream response
-- Support API Key via both HTTP auth header and env variable (https://github.com/Hansimov/hf-llm-api/issues/4)
+- Support API Key via both HTTP auth header and env variable [#4](https://github.com/Hansimov/hf-llm-api/issues/4)
 - Docker deployment
 
 ## Run API service
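The features hunk above mentions API-key support via both an HTTP auth header and an environment variable ([#4](https://github.com/Hansimov/hf-llm-api/issues/4)). As a minimal sketch of the two client-side routes, assuming the OpenAI-style `Bearer` header scheme and using `HF_TOKEN` as a hypothetical variable name (the issue thread, not this diff, fixes the real one):

```py
import os

# Route 1: pass the key per request via the HTTP auth header
# (assumed OpenAI-style Bearer scheme)
auth_headers = {"Authorization": "Bearer hf_xxxxxxxxxxxxxxxx"}

# Route 2: export it as an environment variable for the service to read;
# "HF_TOKEN" is a hypothetical name used here for illustration
os.environ["HF_TOKEN"] = "hf_xxxxxxxxxxxxxxxx"
```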
@@ -60,7 +62,7 @@ sudo docker run -p 23333:23333 --env http_proxy="http://<server>:<port>" hf-llm-
 
 ### Using `openai-python`
 
-See: [examples/chat_with_openai.py](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_openai.py)
+See: [`examples/chat_with_openai.py`](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_openai.py)
 
 ```py
 from openai import OpenAI
@@ -69,6 +71,8 @@ from openai import OpenAI
 base_url = "http://127.0.0.1:23333"
 # Your own HF_TOKEN
 api_key = "hf_xxxxxxxxxxxxxxxx"
+# use below as non-auth user
+# api_key = "sk-xxx"
 
 client = OpenAI(base_url=base_url, api_key=api_key)
 response = client.chat.completions.create(
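The hunk above cuts the `openai-python` example off mid-call at `client.chat.completions.create(`. For orientation, a minimal sketch of how such a streaming call typically completes; the message content is an illustrative assumption, while `mixtral-8x7b` and the `for chunk in response:` loop are taken from elsewhere in this diff:

```py
from openai import OpenAI

base_url = "http://127.0.0.1:23333"
# Your own HF_TOKEN
api_key = "hf_xxxxxxxxxxxxxxxx"

client = OpenAI(base_url=base_url, api_key=api_key)

# Hypothetical continuation of the truncated call: stream a chat completion
response = client.chat.completions.create(
    model="mixtral-8x7b",
    messages=[{"role": "user", "content": "Hello, who are you?"}],
    stream=True,
)
# Matches the `for chunk in response:` context line in the next hunk header
for chunk in response:
    delta = chunk.choices[0].delta.content
    if delta is not None:
        print(delta, end="", flush=True)
```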
@@ -93,7 +97,7 @@ for chunk in response:
 
 ### Using post requests
 
-See: [examples/chat_with_post.py](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_post.py)
+See: [`examples/chat_with_post.py`](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_post.py)
 
 
 ```py
@@ -104,7 +108,11 @@ import re
 
 # If running this service with a proxy, you might need to unset `http(s)_proxy`.
 chat_api = "http://127.0.0.1:23333"
-
+# Your own HF_TOKEN
+api_key = "hf_xxxxxxxxxxxxxxxx"
+# use below as non-auth user
+# api_key = "sk-xxx"
+
 requests_headers = {}
 requests_payload = {
     "model": "mixtral-8x7b",
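This last hunk likewise truncates at the start of `requests_payload`. A minimal sketch of how the raw-POST variant might continue, assuming the OpenAI-convention `/chat/completions` endpoint path, a `Bearer` auth header, and `data: ...` server-sent-event framing for the streamed response (none of which this diff confirms):

```py
import json
import requests

chat_api = "http://127.0.0.1:23333"
# Your own HF_TOKEN
api_key = "hf_xxxxxxxxxxxxxxxx"

requests_headers = {"Authorization": f"Bearer {api_key}"}  # assumed auth scheme
requests_payload = {
    "model": "mixtral-8x7b",
    "messages": [{"role": "user", "content": "Hello, who are you?"}],
    "stream": True,
}

response = requests.post(
    chat_api + "/chat/completions",  # assumed endpoint path
    headers=requests_headers,
    json=requests_payload,
    stream=True,
)
# Assumed SSE framing: each event line is `data: {...}`, ending with `data: [DONE]`
for line in response.iter_lines():
    if not line:
        continue
    data = line.decode("utf-8").removeprefix("data: ")
    if data == "[DONE]":
        break
    delta = json.loads(data)["choices"][0]["delta"].get("content")
    if delta:
        print(delta, end="", flush=True)
```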