# Deployment on Azure Machine Learning

## Prerequisites

Switch to the Triton server directory:

```
cd inference/triton_server
```

Set the environment variables for AML:
```
export RESOURCE_GROUP=Dhruva-prod
export WORKSPACE_NAME=dhruva--central-india
export DOCKER_REGISTRY=dhruvaprod
```
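
Optionally, the resource group and workspace can also be registered as Azure CLI defaults, so that the `-g`/`-w` flags can be dropped from later commands (a convenience only; the commands below still pass them explicitly):
```
az configure --defaults group=$RESOURCE_GROUP workspace=$WORKSPACE_NAME
```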

Also remember to edit the `yml` files under `azure_ml/` accordingly.

## Registering the model

```
az ml model create --file azure_ml/model.yml --resource-group $RESOURCE_GROUP --workspace-name $WORKSPACE_NAME
```
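
To verify that the model was registered, you can list the models in the workspace:
```
az ml model list -g $RESOURCE_GROUP -w $WORKSPACE_NAME -o table
```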

## Pushing the Docker image to the Container Registry

```
az acr login --name $DOCKER_REGISTRY
docker tag indictrans2_triton $DOCKER_REGISTRY.azurecr.io/nmt/triton-indictrans-v2:latest
docker push $DOCKER_REGISTRY.azurecr.io/nmt/triton-indictrans-v2:latest
```
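
To confirm that the push succeeded, list the tags in the repository (the repository name matches the tag used above):
```
az acr repository show-tags --name $DOCKER_REGISTRY --repository nmt/triton-indictrans-v2 -o table
```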

## Creating the execution environment

```
az ml environment create -f azure_ml/environment.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```

## Publishing the endpoint for online inference

```
az ml online-endpoint create -f azure_ml/endpoint.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
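
Endpoint creation may take a few minutes. Its provisioning state can be polled as shown below, where `<endpoint-name>` is a placeholder for the name defined in `azure_ml/endpoint.yml`; it should eventually print `Succeeded`:
```
az ml online-endpoint show --name <endpoint-name> -g $RESOURCE_GROUP -w $WORKSPACE_NAME --query provisioning_state -o tsv
```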

Now, from the Azure Portal, open the Container Registry and grant the `AcrPull` permission to the above endpoint, so that it is allowed to pull the Docker image.
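
The same permission can alternatively be granted from the CLI. The sketch below assumes the endpoint uses a system-assigned managed identity, with `<endpoint-name>` again a placeholder:
```
# Object ID of the endpoint's system-assigned managed identity
ENDPOINT_PRINCIPAL_ID=$(az ml online-endpoint show --name <endpoint-name> \
  -g $RESOURCE_GROUP -w $WORKSPACE_NAME --query identity.principal_id -o tsv)

# Resource ID of the container registry
ACR_ID=$(az acr show --name $DOCKER_REGISTRY --query id -o tsv)

# Allow the endpoint identity to pull images from the registry
az role assignment create --assignee $ENDPOINT_PRINCIPAL_ID --role AcrPull --scope $ACR_ID
```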

## Attaching a deployment



```
az ml online-deployment create -f azure_ml/deployment.yml --all-traffic -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
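
If the deployment gets stuck or fails to start, the container logs usually say why; `<deployment-name>` and `<endpoint-name>` are placeholders for the names defined in the `yml` files:
```
az ml online-deployment get-logs --name <deployment-name> --endpoint-name <endpoint-name> -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```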

## Testing if inference works

1. From Azure ML Studio, go to the "Consume" tab, and get the endpoint domain (without `https://` or trailing `/`) and an authentication key.

2. In `client.py`, set `ENABLE_SSL = True`, and then set the `ENDPOINT_URL` variable as well as the `Authorization` value inside `HTTP_HEADERS`.

3. Run `python3 client.py`. A quick reachability check with `curl` is also sketched below.
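
As a quick reachability check before running the client, you can query Triton's standard health route with `curl`. This is a sketch; it assumes the deployment exposes Triton's HTTP API at the endpoint root, with `<endpoint-domain>` and `<key>` taken from the "Consume" tab:
```
# Should return HTTP 200 once the Triton server is ready
curl -i -H "Authorization: Bearer <key>" https://<endpoint-domain>/v2/health/ready
```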