nnnn / docs /my-website /docs /proxy /virtual_keys.md
nonhuman's picture
Upload 225 files
4ec8dba

Key Management

Track Spend and create virtual keys for the proxy

Grant other's temporary access to your proxy, with keys that expire after a set duration.

Requirements:

You can then generate temporary keys by hitting the /key/generate endpoint.

See code

Step 1: Save postgres db url

model_list:
  - model_name: gpt-4
    litellm_params:
        model: ollama/llama2
  - model_name: gpt-3.5-turbo
    litellm_params:
        model: ollama/llama2

general_settings: 
  master_key: sk-1234 # [OPTIONAL] if set all calls to proxy will require either this key or a valid generated token
  database_url: "postgresql://<user>:<password>@<host>:<port>/<dbname>"

Step 2: Start litellm

litellm --config /path/to/config.yaml

Step 3: Generate temporary keys

curl 'http://0.0.0.0:8000/key/generate' \
--h 'Authorization: Bearer sk-1234' \
--d '{"models": ["gpt-3.5-turbo", "gpt-4", "claude-2"], "duration": "20m"}'
  • models: list or null (optional) - Specify the models a token has access too. If null, then token has access to all models on server.

  • duration: str or null (optional) Specify the length of time the token is valid for. If null, default is set to 1 hour. You can set duration as seconds ("30s"), minutes ("30m"), hours ("30h"), days ("30d").

Expected response:

{
    "key": "sk-kdEXbIqZRwEeEiHwdg7sFA", # Bearer token
    "expires": "2023-11-19T01:38:25.838000+00:00" # datetime object
}

Managing Auth - Upgrade/Downgrade Models

If a user is expected to use a given model (i.e. gpt3-5), and you want to:

  • try to upgrade the request (i.e. GPT4)
  • or downgrade it (i.e. Mistral)
  • OR rotate the API KEY (i.e. open AI)
  • OR access the same model through different end points (i.e. openAI vs openrouter vs Azure)

Here's how you can do that:

Step 1: Create a model group in config.yaml (save model name, api keys, etc.)

model_list:
  - model_name: my-free-tier
    litellm_params:
        model: huggingface/HuggingFaceH4/zephyr-7b-beta
        api_base: http://0.0.0.0:8001
  - model_name: my-free-tier
    litellm_params:
        model: huggingface/HuggingFaceH4/zephyr-7b-beta
        api_base: http://0.0.0.0:8002
  - model_name: my-free-tier
    litellm_params:
        model: huggingface/HuggingFaceH4/zephyr-7b-beta
        api_base: http://0.0.0.0:8003
    - model_name: my-paid-tier
    litellm_params:
        model: gpt-4
        api_key: my-api-key

Step 2: Generate a user key - enabling them access to specific models, custom model aliases, etc.

curl -X POST "https://0.0.0.0:8000/key/generate" \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
    "models": ["my-free-tier"], 
    "aliases": {"gpt-3.5-turbo": "my-free-tier"}, 
    "duration": "30min"
}'
  • How to upgrade / downgrade request? Change the alias mapping
  • How are routing between diff keys/api bases done? litellm handles this by shuffling between different models in the model list with the same model_name. See Code

Managing Auth - Tracking Spend

You can get spend for a key by using the /key/info endpoint.

curl 'http://0.0.0.0:8000/key/info?key=<user-key>' \
     -X GET \
     -H 'Authorization: Bearer <your-master-key>'

This is automatically updated (in USD) when calls are made to /completions, /chat/completions, /embeddings using litellm's completion_cost() function. See Code.

Sample response

{
    "key": "sk-tXL0wt5-lOOVK9sfY2UacA",
    "info": {
        "token": "sk-tXL0wt5-lOOVK9sfY2UacA",
        "spend": 0.0001065,
        "expires": "2023-11-24T23:19:11.131000Z",
        "models": [
            "gpt-3.5-turbo",
            "gpt-4",
            "claude-2"
        ],
        "aliases": {
            "mistral-7b": "gpt-3.5-turbo"
        },
        "config": {}
    }
}