Error while running inference

#27
by sumitsoman - opened

I get the following error when sending a generation request. Any suggestions on how to resolve it?

2024-04-23T11:39:19.958232Z ERROR batch{batch_size=1}:prefill:prefill{id=2 size=1}:prefill{id=2 size=1}: text_generation_client: router/client/src/lib.rs:33: Server error: max(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.
2024-04-23T11:39:19.959108Z ERROR HTTP request{otel.name=POST /generate http.client_ip= http.flavor=1.1 http.host=129.192.82.77:8080 http.method=POST http.route=/generate http.scheme=HTTP http.target=/generate http.user_agent=python-requests/2.31.0 otel.kind=server trace_id=7a57d270afb6fb24c60310a98d54c704}:generate{parameters=GenerateParameters { best_of: Some(1), temperature: Some(1e-6), repetition_penalty: Some(1.2), top_k: Some(50), top_p: Some(0.95), typical_p: Some(0.95), do_sample: true, max_new_tokens: 2000, return_full_text: Some(true), stop: ["<|endoftext|>"], truncate: Some(1023), watermark: false, details: false, decoder_input_details: false, seed: Some(42), top_n_tokens: Some(1) }}:generate{request=GenerateRequest { inputs: "what is 1+1", parameters: GenerateParameters { best_of: Some(1), temperature: Some(1e-6), repetition_penalty: Some(1.2), top_k: Some(50), top_p: Some(0.95), typical_p: Some(0.95), do_sample: true, max_new_tokens: 2000, return_full_text: Some(true), stop: ["<|endoftext|>"], truncate: Some(1023), watermark: false, details: false, decoder_input_details: false, seed: Some(42), top_n_tokens: Some(1) } }}:generate_stream{request=GenerateRequest { inputs: "what is 1+1", parameters: GenerateParameters { best_of: Some(1), temperature: Some(1e-6), repetition_penalty: Some(1.2), top_k: Some(50), top_p: Some(0.95), typical_p: Some(0.95), do_sample: true, max_new_tokens: 2000, return_full_text: Some(true), stop: ["<|endoftext|>"], truncate: Some(1023), watermark: false, details: false, decoder_input_details: false, seed: Some(42), top_n_tokens: Some(1) } }}:infer:send_error: text_generation_router::infer: router/src/infer.rs:588: Request failed during generation: Server error: max(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.
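
For anyone trying to reproduce this, the request body can be reconstructed from the `GenerateParameters` dump in the log above. The following is only a sketch: the `build_payload` helper name is made up for illustration, the URL is pieced together from the `http.host` and `http.route` fields, and the body layout assumes the usual `inputs` plus `parameters` JSON shape of the `/generate` route. It does not establish which parameter (if any) triggers the `max()` error.

```python
import json

# Endpoint assembled from http.host and http.route in the log; adjust for your own deployment.
URL = "http://129.192.82.77:8080/generate"


def build_payload(prompt: str) -> dict:
    """Rebuild the JSON body implied by the logged GenerateParameters (hypothetical helper)."""
    return {
        "inputs": prompt,
        "parameters": {
            "best_of": 1,
            "temperature": 1e-6,
            "repetition_penalty": 1.2,
            "top_k": 50,
            "top_p": 0.95,
            "typical_p": 0.95,
            "do_sample": True,
            "max_new_tokens": 2000,
            "return_full_text": True,
            "stop": ["<|endoftext|>"],
            "truncate": 1023,
            "watermark": False,
            "details": False,
            "decoder_input_details": False,
            "seed": 42,
            "top_n_tokens": 1,
        },
    }


if __name__ == "__main__":
    payload = build_payload("what is 1+1")
    # Print the body instead of sending it, so the sketch runs without a server.
    # The log shows python-requests as the client, so sending it would look like:
    #   requests.post(URL, json=payload)
    print(json.dumps(payload, indent=2))
```

Removing parameters from this body one at a time and resending is one way to narrow down which setting the server is failing on.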
