# Chill Watcher
Deployment options under consideration:
- Hugging Face Inference Endpoints
- Replicate API
- Lightning.ai

# platform comparison
> all three platforms support autoscaling

|platform|prediction speed|cost|ease of deployment|
|-|-|-|-|
|huggingface|fast: 20s|high: $0.6/hr (without autoscaling)|easy: `git push`|
|replicate|fast when used frequently: 30s; slow when it needs initialization: 5min|low: $0.02 per generation|difficult: build a docker image and upload it|
|lightning.ai|fast while the app is running: 20s; slow when idle: XXs|low: $30 free per month, $0.18 per init, $0.02 per run|easy: one command|
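
To make the cost column easier to compare, here is a rough sketch that turns the prices in the table into monthly figures. It assumes a 30-day month, a Hugging Face endpoint that stays up all month (no autoscaling), and that the free $30 on lightning.ai works as a credit subtracted from usage. None of these assumptions come from the platforms' pricing pages, and all function names are made up for illustration. Prices are kept in integer cents to avoid floating-point rounding.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, endpoint never scaled down


def huggingface_monthly_cents(cents_per_hour: int = 60) -> int:
    """Always-on endpoint billed per hour, independent of traffic."""
    return cents_per_hour * HOURS_PER_MONTH


def replicate_monthly_cents(generations: int, cents_per_generation: int = 2) -> int:
    """Pay-per-generation pricing; cold-start time is not billed in this sketch."""
    return generations * cents_per_generation


def lightning_monthly_cents(
    runs: int,
    inits: int,
    cents_per_run: int = 2,
    cents_per_init: int = 18,
    free_credit_cents: int = 3000,
) -> int:
    """Per-run plus per-init pricing, with the free credit subtracted (assumption)."""
    usage = runs * cents_per_run + inits * cents_per_init
    return max(0, usage - free_credit_cents)


def break_even_generations(cents_per_hour: int = 60, cents_per_generation: int = 2) -> int:
    """Generations per month at which pay-per-generation matches an always-on endpoint."""
    monthly = huggingface_monthly_cents(cents_per_hour)
    return -(-monthly // cents_per_generation)  # integer ceiling division


if __name__ == "__main__":
    print("huggingface, always on:", huggingface_monthly_cents() / 100, "USD/month")
    print("replicate, 1000 generations:", replicate_monthly_cents(1000) / 100, "USD/month")
    print("lightning, 1000 runs / 100 inits:", lightning_monthly_cents(1000, 100) / 100, "USD/month")
    print("break-even generations:", break_even_generations())
```

Under these assumptions, the always-on endpoint only pays off at a fairly high request volume; below that, the pay-per-run platforms are cheaper.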

# platform deploy options
## huggingface
> [docs](https://huggingface.co/docs/inference-endpoints/guides/custom_handler)

- requirements: list pip packages in `requirements.txt`
- `init()` and `predict()` functions: implement the `EndpointHandler` class in `handler.py`
- more: modify `handler.py` to customize request handling and inference, and to explore more highly customized features
- deploy: `git push` (with git-lfs) the whole directory, including models and weights, to a huggingface repository, then use inference endpoints to deploy it. Deployment is a few clicks and runs automatically, so it is very simple.
- call api: once the endpoint is ready (built, initialized and in a "running" state), send a POST request to the url provided by inference endpoints, using the request schema defined in `handler.py`
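
A minimal sketch of what `handler.py` and a matching client request could look like. The `EndpointHandler` class with `__init__(path)` and `__call__(data)` follows the custom handler docs linked above; the "model" is a hypothetical stand-in, and the `inputs` / `parameters` request schema, the `build_request` helper, the url and the token are assumptions for illustration, not this repository's actual code.

```python
import json
import urllib.request
from typing import Any, Dict, List


class EndpointHandler:
    def __init__(self, path: str = ""):
        # Called once when the endpoint starts; `path` points at the repository files.
        # A real handler would load the model weights from `path` here.
        self.model = lambda text: text.upper()  # hypothetical stand-in for a model

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # Called for every request; `data` is the decoded JSON body.
        inputs = data.get("inputs", "")
        parameters = data.get("parameters", {})
        return [{"output": self.model(inputs), "parameters": parameters}]


def build_request(url: str, token: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request matching the schema assumed above."""
    body = json.dumps({"inputs": prompt}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Local smoke test of the handler, no endpoint needed.
    handler = EndpointHandler()
    print(handler({"inputs": "hello"}))

    # Placeholder url and token; send with urllib.request.urlopen(req) once the
    # endpoint is in the "running" state.
    req = build_request("https://example.invalid/my-endpoint", "hf_xxx", "hello")
    print(req.get_method(), req.full_url)
```

Calling the handler locally like this is a cheap way to check the request schema before pushing the repository.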

## replicate
> [docs](https://replicate.com/docs/guides/push-a-model)

- requirements: specify all requirements (pip packages, system packages, python version, cuda, etc.) in `cog.yaml`
- `init()` and `predict()` functions: implement the `Predictor` class in `predict.py`
- more: modify `predict.py`
- deploy: 
    1. get a linux GPU machine with 60GB of disk space;
    2. install [cog](https://replicate.com/docs/guides/push-a-model) and [docker](https://docs.docker.com/engine/install/ubuntu/#set-up-the-repository)
    3. `git pull` the current repository from huggingface, including the large model files
    4. once `predict.py` and `cog.yaml` are correct, run `cog login` and then `cog push`; cog builds a docker image locally and pushes it to replicate. The image can take around 30GB of disk space, so the push uses a lot of network bandwidth.
- call api: if everything runs successfully and the docker image is pushed to replicate, you will see a web-ui and an API example directly in your replicate repository
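
A minimal sketch of the shape `predict.py` might take. `BasePredictor`, `Input`, `setup()` and `predict()` are cog's names; the model, the `prompt` and `steps` inputs and their defaults are hypothetical. The `except ImportError` branch is only a stand-in so the sketch can be run on a machine without cog installed, and it should not go into a real `predict.py`.

```python
try:
    from cog import BasePredictor, Input
except ImportError:
    # Stand-ins for running this sketch without cog; NOT part of cog itself.
    class BasePredictor:
        pass

    def Input(default=None, **_kwargs):
        return default


class Predictor(BasePredictor):
    def setup(self) -> None:
        # Runs once when the container starts: load model weights here.
        # Hypothetical stand-in for a real model.
        self.model = lambda prompt, steps: f"{prompt} ({steps} steps)"

    def predict(
        self,
        prompt: str = Input(description="Text prompt"),
        steps: int = Input(description="Number of steps", default=20, ge=1),
    ) -> str:
        # Runs for every prediction request.
        return self.model(prompt, steps)


if __name__ == "__main__":
    # Local smoke test without building the docker image.
    predictor = Predictor()
    predictor.setup()
    print(predictor.predict(prompt="a cat", steps=5))
```

Running the class directly like this only checks the Python logic; the system packages and cuda setup in `cog.yaml` still need a real cog build to verify.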

## lightning.ai
> docs: [code](https://lightning.ai/docs/app/stable/levels/basic/real_lightning_component_implementations.html), [deploy](https://lightning.ai/docs/app/stable/workflows/run_app_on_cloud/)

- requirements: 
    - pip packages are listed in `requirements.txt`. Note that some requirements differ from those used for huggingface, so you need to modify some lines in `requirements.txt` as described in the comments in that file
    - other pip packages, system packages and download commands for big model weight files can be listed in a custom build config. Check out `class CustomBuildConfig(BuildConfig)` in `app.py`. In a custom build config you can use many linux commands such as `wget` and `sudo apt-get update`. The custom build config is executed in the `__init__()` of the `PythonServer` class
- `init()` and `predict()` functions: implement the `PythonServer` class in `app.py`. Note: 
    - some packages are not yet installed when the file is first loaded (they may only be installed when `__init__()` is called), so some imports should go inside the function, not at the top of the file, or you may get import errors.
    - you can't save your own values on `self` in `PythonServer` unless they are predefined in its variables, so don't assign any self-defined variables to `self`
    - if you use the custom build config, you have to implement `PythonServer`'s `__init__()` yourself, so don't forget to use the correct function signature
- more: ...
- deploy:
    - `pip install lightning`
    - prepare the directory on your local computer (no GPU needed)
    - list big files in the `.lightningignore` file so they are not uploaded, which saves deploy time
    - run `lightning run app app.py --cloud` in the local terminal; it uploads the files in the directory to lightning cloud and starts deploying there
    - check error logs on the web-ui, use `all logs`
- call api: a valid url only appears in the `settings` page of the web-ui once the app has started successfully. Open that url to see the api
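
The two notes above about imports and `self` can be illustrated with a plain-Python sketch. This is not the Lightning API: `Server` and `load_model` are hypothetical names, and `json` stands in for a heavy package that is only installed by the build config. The import is deferred into a function, and the loaded object is cached at module level rather than stored on `self`.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def load_model():
    # Deferred import: runs on first call, after the build step has installed
    # the package. `json` stands in for a heavy dependency here.
    import json

    return json.dumps


class Server:
    """Plain stand-in for a server component; no attributes are set on self."""

    def predict(self, payload: dict) -> str:
        model = load_model()  # cached after the first call
        return model(payload, sort_keys=True)


if __name__ == "__main__":
    server = Server()
    print(server.predict({"b": 1, "a": 2}))
```

Because the cache lives on the function, repeated `predict()` calls reuse the same loaded object without touching `self`.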

### some useful links:
install docker:
- https://docs.docker.com/engine/install/ubuntu/#set-up-the-repository

install git-lfs:
- https://github.com/git-lfs/git-lfs/blob/main/INSTALLING.md

on linux:
```
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash

sudo apt-get install git-lfs
```

---
license: apache-2.0
---