# Deployment on Azure Machine Learning
## Prerequisites
Run the commands in this guide from the `inference/triton_server` directory:
```
cd inference/triton_server
```
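Set the environment variables for Azure Machine Learning (AML), replacing the values with your own resource group, workspace, and registry names if they differ: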
```
export RESOURCE_GROUP=Dhruva-prod
export WORKSPACE_NAME=dhruva--central-india
export DOCKER_REGISTRY=dhruvaprod
```
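Every later command depends on these three variables, so it can help to fail early when one is missing. The following is a minimal sketch; the function name `check_aml_env` is hypothetical and not part of the repository.

```shell
# Hypothetical helper: report any unset or empty variable used by later commands.
check_aml_env() {
  missing=0
  for name in RESOURCE_GROUP WORKSPACE_NAME DOCKER_REGISTRY; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  # Returns 0 when all variables are set, 1 otherwise.
  return "$missing"
}
```

Calling `check_aml_env` after the `export` lines returns a non-zero status and names the missing variable if any of them is empty.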
Also edit the `yml` files in `azure_ml/` (`model.yml`, `environment.yml`, `endpoint.yml`, and `deployment.yml`) so that they match these values.
## Registering the model
```
az ml model create --file azure_ml/model.yml --resource-group $RESOURCE_GROUP --workspace-name $WORKSPACE_NAME
```
## Pushing the docker image to Container Registry
```
az acr login --name $DOCKER_REGISTRY
docker tag indictrans2_triton $DOCKER_REGISTRY.azurecr.io/nmt/triton-indictrans-v2:latest
docker push $DOCKER_REGISTRY.azurecr.io/nmt/triton-indictrans-v2:latest
```
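To confirm that the push succeeded before creating the environment, the registry can be queried for the tag. This is a sketch that assumes `DOCKER_REGISTRY` is set as above; the function name `image_tag_exists` is hypothetical.

```shell
# Hypothetical helper: succeed only if the given tag exists in the given repository.
image_tag_exists() {
  repository="$1"
  tag="$2"
  # show-tags with tsv output prints one tag per line; grep -x matches a whole line.
  az acr repository show-tags --name "$DOCKER_REGISTRY" \
    --repository "$repository" -o tsv | grep -qx "$tag"
}
```

For the image pushed above, the call would be `image_tag_exists nmt/triton-indictrans-v2 latest`.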
## Creating the execution environment
```
az ml environment create -f azure_ml/environment.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
## Publishing the endpoint for online inference
```
az ml online-endpoint create -f azure_ml/endpoint.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
Next, in the Azure Portal, open the Container Registry and grant the `AcrPull` role to the endpoint created above, so that it is allowed to pull the Docker image.
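The pull permission described above can probably also be granted from the command line instead of the Portal. The sketch below assumes the endpoint uses a system-assigned managed identity and that the variables from the prerequisites are set; the function name `grant_acr_pull` is hypothetical, and the endpoint name must be the one defined in `azure_ml/endpoint.yml`.

```shell
# Hypothetical helper: give an endpoint's managed identity pull access to the registry.
# Assumption: the endpoint has a system-assigned identity.
grant_acr_pull() {
  endpoint_name="$1"
  # Look up the identity of the endpoint.
  principal_id=$(az ml online-endpoint show --name "$endpoint_name" \
    -g "$RESOURCE_GROUP" -w "$WORKSPACE_NAME" \
    --query identity.principal_id -o tsv) || return 1
  # Look up the resource ID of the container registry.
  registry_id=$(az acr show --name "$DOCKER_REGISTRY" --query id -o tsv) || return 1
  # Assign the built-in AcrPull role, scoped to that registry only.
  az role assignment create --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role AcrPull --scope "$registry_id"
}
```

Scoping the assignment to the registry's resource ID keeps the endpoint's access limited to pulling images from that one registry.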
## Attaching a deployment
```
az ml online-deployment create -f azure_ml/deployment.yml --all-traffic -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
## Testing if inference works
1. In Azure ML Studio, open the endpoint's "Consume" tab and copy the endpoint domain (without `https://` or a trailing `/`) and an authentication key.
2. In `client.py`, set `ENABLE_SSL = True`, then set the `ENDPOINT_URL` variable and the `Authorization` value inside `HTTP_HEADERS`.
3. Run `python3 client.py`.
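If `client.py` fails, a quick readiness probe can help separate authentication or routing problems from model problems. The sketch below assumes the endpoint forwards Triton's standard `/v2/health/ready` route and accepts the key as a bearer token; the function name `triton_ready` is hypothetical.

```shell
# Hypothetical smoke test: ask the Triton server behind the endpoint whether it is ready.
# Assumption: the endpoint exposes Triton's /v2/health/ready route.
triton_ready() {
  endpoint_domain="$1"   # domain from the "Consume" tab, without https:// or trailing /
  auth_key="$2"          # authentication key from the same tab
  # Discard the body and capture only the HTTP status code.
  status=$(curl -sS -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $auth_key" \
    "https://$endpoint_domain/v2/health/ready") || return 1
  [ "$status" = "200" ]
}
```

The function returns success only on an HTTP 200 response, so it can be used in a condition such as `triton_ready "$DOMAIN" "$KEY" && python3 client.py`, where `DOMAIN` and `KEY` are placeholder variables holding the values from step 1.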