Crash when testing the model API with curl

#162
by yashhirulkar - opened

The output below shows that my models are running:
+------------------+---------+--------+
| Model            | Version | Status |
+------------------+---------+--------+
| ensemble         | 1       | READY  |
| postprocessing   | 1       | READY  |
| preprocessing    | 1       | READY  |
| tensorrt_llm     | 1       | READY  |
| tensorrt_llm_bls | 1       | READY  |
+------------------+---------+--------+

I0718 10:43:44.606801 2070 metrics.cc:877] Collecting metrics for GPU 0: NVIDIA A100 80GB PCIe
I0718 10:43:44.609893 2070 metrics.cc:770] Collecting CPU metrics

I0718 10:43:44.619372 2070 grpc_server.cc:2466] Started GRPCInferenceService at 0.0.0.0:8001
I0718 10:43:44.619558 2070 http_server.cc:4636] Started HTTPService at 0.0.0.0:8000
I0718 10:43:44.669812 2070 http_server.cc:320] Started Metrics Service at 0.0.0.0:8002

Error:

root@deployment:/opt/tritonserver# curl -X POST localhost:8000/v2/models/ensemble/generate -d '{"text_input": "What is machine learning?", "max_tokens": 1000, "bad_words": "", "stop_words": ""}'
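For reference, here is the same request sent from Python instead of curl. This is a minimal sketch using only the standard library; the helper names (`build_generate_request`, `send_generate`) are mine, and the endpoint and payload simply mirror the curl call above (host/port are the Triton HTTP defaults shown in the startup log).

```python
import json
from urllib import request

# Triton HTTP service endpoint from the startup log (0.0.0.0:8000),
# targeting the ensemble model's generate route.
GENERATE_URL = "http://localhost:8000/v2/models/ensemble/generate"

def build_generate_request(text_input: str, max_tokens: int = 1000) -> dict:
    """Build the JSON payload used in the curl call above."""
    return {
        "text_input": text_input,
        "max_tokens": max_tokens,
        "bad_words": "",
        "stop_words": "",
    }

def send_generate(text_input: str, max_tokens: int = 1000) -> str:
    """POST the payload to the generate endpoint and return the raw body.

    Note: urlopen raises urllib.error.HTTPError on an error status, so the
    failure reported in this thread would surface here as an exception.
    """
    payload = json.dumps(build_generate_request(text_input, max_tokens)).encode()
    req = request.Request(
        GENERATE_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return resp.read().decode()

if __name__ == "__main__":
    # Print the payload only; sending requires a running Triton server.
    print(json.dumps(build_generate_request("What is machine learning?")))
```

This reproduces the same request body, which can make it easier to vary individual fields (e.g. `max_tokens`) while debugging the error below.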

{"error":"in ensemble 'ensemble', [TensorRT-LLM][ERROR] Assertion failed: Invalid tensor name: decoder_input_ids (/tmp/tritonbuild/tensorrtllm/inflight_batcher_llm/../tensorrt_llm/cpp/include/tensorrt_llm/batch_manager/inferenceRequest.h:239)\n1 0x7f52f0a297b5 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x77b5) [0x7f52f0a297b5]\n2 0x7f52f0a296bf /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x76bf) [0x7f52f0a296bf]\n3 0x7f52f0a3210c /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x1010c) [0x7f52f0a3210c]\n4 0x7f52f0a323f1 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x103f1) [0x7f52f0a323f1]\n5 0x7f52f0a33857 /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x11857) [0x7f52f0a33857]\n6 0x7f52f0a36ded /opt/tritonserver/backends/tensorrtllm/libtriton_tensorrtllm.so(+0x14ded) [0x7f52f0a36ded]\n7 0x7f52f0a2e0f5 TRITONBACKEND_ModelInstanceExecute + 101\n8 0x7f530691dd74 /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1a8d74) [0x7f530691dd74]\n9 0x7f530691e0db /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1a90db) [0x7f530691e0db]\n10 0x7f5306a329bd /opt/tritonserver/bin/../lib/libtritonserver.so(+0x2bd9bd) [0x7f5306a329bd]\n11 0x7f5306921d64 /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1acd64) [0x7f5306921d64]\n12 0x7f53061e2253 /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc253) [0x7f53061e2253]\n13 0x7f5305f71ac3 /usr/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3) [0x7f5305f71ac3]\n14 0x7f5306002a04 clone + 68"}root@deployment-pipeline-research-team1:/opt/tritonserver#
