YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

Chill Watcher

consider deploy on:

  • huggingface inference point
  • replicate api
  • lightning.ai

platform comparison

all support autoscaling

platform prediction speed charges deploy handiness
huggingface fast:20s high:$0.6/hr (without autoscaling) easy:git push
replicate fast if used frequently: 30s, slow if needs initialization: 5min low: $0.02 per generation difficult: build image and upload
lightning.ai fast with app running: 20s, slow if idle: XXs low: free $30 per month, $0.18 per init, $0.02 per run easy: one command

platform deploy options

huggingface

docs

  • requirements: use pip packages in requirements.txt
  • init() and predict() function: use handler.py, implement the EndpointHandler class
  • more: modify handler.py for requests and inference and explore more highly-customized features
  • deploy: git (lfs) push to huggingface repository(the whole directory including models and weights, etc.), and use inference endpoints to deploy. Click and deploy automaticly, very simple.
  • call api: use the url provide by inference endpoints after endpoint is ready(build, initialize and in a "running" state), make a post request to the url using request schema definied in the handler.py

replicate

docs

  • requirements: specify all requirements(pip packages, system packages, python version, cuda, etc.) in cog.yaml
  • init() and predict() function: use predict.py, implement the Predictor class
  • more: modify predict.py
  • deploy:
    1. get a linux GPU machine with 60GB disk space;
    2. install cog and docker
    3. git pull the current repository from huggingface, including large model files
    4. after predict.py and cog.yaml is correctly coded, run cog login, cog push, then cog will build a docker image locally and push the image to replicate. As the image could take 30GB or so disk space, it would cost a lot network bandwidth.
  • call api: if everything runs successfully and the docker image is pushed to replicate, you will see a web-ui and an API example directly in your replicate repository

lightning.ai

docs: code, deploy

  • requirements:
    • pip packages are listed in requirements.txt, note that some requirements are different from those in huggingface, and you need to modify some lines in requirements.txt according to the comment in the requirements.txt
    • other pip packages, system packages and some big model weight files download commands, can be listed using a custom build config. Checkout class CustomBuildConfig(BuildConfig) in app.py. In a custom build config you can use many linux commands such as wget and sudo apt-get update. The custom build config will be executed on the __init__() of the PythonServer class
  • init() and predict() function: use app.py, implement the PythonServer class. Note:
    • some packages haven't been installed when the file is called(these packages may be installed when __init__() is called), so some import code should be in the function, not at the top of the file, or you may get import errors.
    • you can't save your own value to PythonServer.self unless it's predifined in the variables, so don't assign any self-defined variables to self
    • if you use the custom build config, you should implement PythonServer's __init()__ yourself, so don't forget to use the correct function signature
  • more: ...
  • deploy:
    • pip install lightning
    • prepare the directory on your local computer(no need to have a GPU)
    • list big files in the .lightningignore file to avoid big file upload and save deploy time cost
    • run lightning run app app.py --cloud in the local terminal, and it will upload the files in the directory to lightning cloud, and start deploying on the cloud
    • check error logs on the web-ui, use all logs
  • call api: only if the app starts successfully, you can see a valid url in the settings page of the web-ui. Open that url, and you can see the api

some stackoverflow:

install docker:

install git-lfs:

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash

sudo apt-get install git-lfs

license: apache-2.0

Downloads last month
4
Inference API
Unable to determine this model’s pipeline type. Check the docs .